Key Points for Decision Makers

The newly developed and validated Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ) is the first patient-reported outcome tool to evaluate daytime symptoms in people with insomnia disorder.

With a daily recall period, short completion time, and subject feedback on ease of completion, the IDSIQ is patient friendly and has the potential to be used in confirmatory studies of new drugs for insomnia.

1 Introduction

The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) defines insomnia disorder as a predominant complaint of dissatisfaction with sleep quantity or quality that causes distress or impairs functioning in social, occupational, educational, academic, or other settings [1]. Insomnia disorder has a negative effect on daytime functioning, which manifests as fatigue, sleepiness, comorbid mood disorders, and cognitive impairments [1,2,3,4,5]. In spite of this, current therapies for insomnia have mostly been examined for their effect on sleep, and few data are available to establish improvements in daytime functioning following insomnia treatment [6]. Importantly, there is a dearth of information on the symptom burden and its impact on patients’ lives, which are best assessed by patients themselves.

Regulatory authorities now require that efficacy in clinical trials of medical products for insomnia is captured not only by the treating physicians but also by patients themselves using validated patient-reported outcome (PRO) instruments [7, 8]. PROs also have utility in other forms of clinical research and in the routine assessment of patients with insomnia. However, when we reviewed the available literature during the development of a new medical product for insomnia, it became apparent that existing PRO instruments for assessing daytime functioning had not been developed using Food and Drug Administration (FDA) guidance for industry on PRO measures [7]. The most recognized self-report instrument for assessing the severity of insomnia is the Insomnia Severity Index (ISI). The ISI measures both the night-time and daytime components of insomnia, but only includes one item explicitly assessing daytime functioning [9]. Moreover, its 2-week recall period means that it does not capture day-to-day variability in symptoms.

Our review of the literature identified the Daytime Insomnia Symptom Scale (DISS) [10], the Daytime Consequences of Sleep Questionnaire (DCSQ) [11], the Functional Outcomes of Sleep Questionnaire (FOSQ) [12], the Pittsburgh Insomnia Rating Scale (PIRS) [13], and the Profile of Mood States (POMS) [14, 15] as existing PRO instruments that could be used to assess daytime functioning in insomnia. The review of the instruments included the following aspects: content of the questionnaire, evidence of face and content validity, meaning of the items based on their original wordings, relevance of the recall period, level of patient input in questionnaire development, and whether psychometric evaluation was undertaken (including the population, type of analysis, and determination of clinical significance of changes or differences). It was judged that all five instruments had limitations and that a new or modified PRO instrument was needed. Four of the five instruments were assessed to be unsuitable for further development because: they were not publicly available and their development was not documented (DCSQ); they were judged to have a limited number of relevant items (DCSQ, FOSQ, PIRS) or a surfeit of irrelevant items (POMS); or they were developed for an indication other than insomnia (FOSQ) or to assess mood states not linked to a particular indication (POMS). Another instrument for assessing daytime functioning in insomnia, the Sleep Functional Impact Scale (SFIS), was published after our review of the literature [16].

The DISS is a 20-item instrument, developed at the University of Pittsburgh, that assesses daytime functioning using items grouped into Alert/Cognition, Positive Mood, Negative Mood, and Sleepiness/Fatigue domains [10]. It combines nine items for measuring global vigor and affect [17] with 11 items derived from the Hopkins Symptom Checklist-90 [18], Thayer’s Activation-Deactivation Adjective Check List [19], and a chart review of 94 patients with chronic insomnia [20]. The factorial structure of the DISS was established in a prospective study in good sleepers and subjects with primary insomnia, which also included psychometric assessment of the DISS items and their ability to capture treatment benefits relating to improved daytime functioning [10]. However, the DISS was not developed using FDA guidance and had limited patient input during item selection and development of the final questionnaire. Moreover, psychometric evaluation of the DISS was limited: test-retest reliability, known-groups validity, and responsiveness were not assessed, and meaningful change thresholds were not identified.

It was decided to use the DISS as the starting point for creating and validating a new PRO instrument for capturing patients’ experiences of daytime impairment during chronic insomnia. Psychometric properties of the new instrument were explored using data from a single-arm interventional study in subjects with insomnia and an observational study with good sleepers.

2 Early Development of the Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ)

2.1 Methods

Three sets of interviews (Fig. 1) were conducted by independent outcomes research teams in accordance with national laws and regulations. Institutional review board/independent ethics committee approval was obtained, and all subjects provided informed consent prior to study enrollment.

Fig. 1 Development of the Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ). DISS Daytime Insomnia Symptom Scale, mDISS modified Daytime Insomnia Symptom Scale, NRS numeric rating scale, PRO patient-reported outcome, VAS visual analog scale

The qualitative interviews were conducted with subjects diagnosed with insomnia according to the fourth edition of the DSM (DSM-IV) or DSM-5. Subjects were recruited by a combination of clinical sites and recruitment agencies. All interviews were conducted using semi-structured interview guides by researchers trained in qualitative interview techniques; audio recorded and transcribed; and coded and analyzed using ATLAS.ti (Scientific Software Development GmbH). The qualitative data collected were analyzed using grounded theory [21,22,23] and by content analysis [24, 25].

One-hour face-to-face interviews were held with subjects with insomnia to elicit concepts relating to their insomnia; to subsequently assess the comprehensiveness of the DISS items and their relevance to the subjects’ experience of insomnia; and to test subjects’ understanding of the DISS’s item wordings, recall period, and response scale. This first set of interviews continued until saturation of concepts was demonstrated. Further 1-hour face-to-face interviews were subsequently conducted with subjects with insomnia to explore the comprehensiveness, relevance, and understandability of a modified version of the DISS comprising 23 items.

The 23-item modified DISS was further revised based on this qualitative research, as well as on discussion with experts in insomnia research. The number of items was ultimately reduced to 18, and the modified instrument was renamed the Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ)-18. Following the change from a visual analog scale to a numeric rating scale (NRS), qualitative assessment was conducted via telephone interviews with subjects with insomnia to assess their understanding of the IDSIQ-18 response options.

Demographic characteristics of subjects in the qualitative interviews were summarized using descriptive statistics. Semi-structured interview guides are available upon request.

2.2 Results

Early development of the IDSIQ is summarized in Fig. 1. Table 1 of the Electronic Supplementary Material (ESM) shows COREQ-required information for the qualitative interviews that contributed to early development of the IDSIQ [26]. Demographic characteristics of the subjects in the qualitative interviews are shown in Table 1.

Table 1 Demographic characteristics of subjects in the qualitative interviews (full analysis population)

In the first set of qualitative face-to-face interviews, conducted in 25 subjects diagnosed with insomnia (Table 1), participants generally demonstrated a good understanding of the DISS item wordings. Moreover, there was strong alignment between the item wordings and the language used in the concept elicitation exercise. Based on the interviews, the DISS’s response scale was modified to include an absolute zero (i.e., “not at all”) point and the assessment time was specified as “today” to capture the variability of symptoms throughout the day (in the original DISS, the recall period was not specified). In addition, four items that were poorly understood or non-specific were removed. Table 2 of the ESM summarizes all item changes during development of the IDSIQ. Further analysis of the qualitative data led to an additional five items being removed and ten items being added, giving a 21-item modified DISS (mDISS). The iterative methodology used during the concept elicitation supported the view that saturation had been achieved and that the instrument was reflective of the experience of patients with insomnia.

Subsequent revisions, which included the splitting of two items (tired = mentally tired and physically tired; alert = mentally alert and physically alert), yielded a 23-item version of the mDISS. These modifications required additional patient input, and 1-hour face-to-face interviews were conducted with 15 subjects with insomnia (Table 1). It was generally reported that the modified questionnaire was easy to understand. Subjects also demonstrated good understanding of the mDISS items, with the number of subjects who understood each item ranging from 10 to 15. Based on these interviews, five items were removed and one item was reworded. These modifications yielded the IDSIQ-18, which comprised three hypothesized domains: ‘Alert/Cognition’ (3 items), ‘Negative Mood’ (8 items), and ‘Sleepiness’ (7 items). Items 1, 2, 3, 14, and 18 are positively scored (i.e., 0 = worst status; 10 = best status) and require reverse scoring prior to analysis. In addition, based on evidence that an NRS can have better test-retest reliability than a visual analog scale [27], the 10-cm unipolar visual analog scale [10] used in the mDISS was replaced with an 11-point NRS ranging from 0 (not at all) to 10 (very). Subsequent telephone interviews with 14 subjects with insomnia (Table 1) indicated good understanding of the NRS: subjects were able to define the significance of the numbers on the scale in relation to symptom severity.
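The reverse-scoring rule for the positively scored items can be illustrated with a minimal Python sketch. It assumes that reverse scoring on the 0–10 NRS means subtracting the response from 10, so that higher scores indicate worse status for every item; the function and variable names are hypothetical and are not part of the published scoring manual.

```python
# Items described in the text as positively scored on the IDSIQ-18.
POSITIVE_ITEMS = {1, 2, 3, 14, 18}
NRS_MAX = 10  # 11-point numeric rating scale, 0 to 10


def reverse_score(responses: dict) -> dict:
    """Return a copy of {item number: response} in which positively scored
    items are flipped (assumed rule: 10 - response), so higher = worse."""
    scored = {}
    for item, value in responses.items():
        if not 0 <= value <= NRS_MAX:
            raise ValueError(f"item {item}: response {value} is outside 0-{NRS_MAX}")
        scored[item] = NRS_MAX - value if item in POSITIVE_ITEMS else value
    return scored


# Illustrative responses only (not study data).
example = reverse_score({1: 10, 4: 3, 14: 0})
```

In this example, item 1 (best possible response of 10) becomes 0 and item 14 (response of 0) becomes 10, while item 4 is left unchanged.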

3 Content and Psychometric Evaluation of the IDSIQ

3.1 Methods

Clinical studies (Fig. 1) were conducted in accordance with International Council for Harmonisation E6 guidelines for Good Clinical Practice, the Declaration of Helsinki, and national laws and regulations. Institutional review board/independent ethics committee approval was obtained for all sites conducting the studies and informed consent was obtained from all subjects prior to study enrollment.

3.1.1 Study Design and Participants

To further evaluate the content validity of the IDSIQ-18 and assess its psychometric properties, data from two studies were analyzed: an interventional study in subjects with insomnia and a non-interventional observational study in good sleepers. The interventional study was conducted by Idorsia, the sponsor of this research. The observational study and psychometric analyses based on the interventional study were conducted by independent outcomes research teams. In both studies, the IDSIQ-18 was completed as part of an electronic daily diary with mandatory responses for all items.

The interventional study was a single-arm, open-label clinical trial (NCT03056053, EudraCT 2016-004259-59) conducted at nine sites in Germany and the USA from February to June 2017. Prior to initiation of the interventional study, the IDSIQ-18 was translated into German and reviewed by native German speakers to ensure that the items and instructions were well understood. A total of 114 subjects aged ≥ 18 years who met DSM-5 criteria for insomnia, with self-reported sleep disturbance (≥ 3 nights/week for ≥ 3 months) and an ISI score of ≥ 15 at screening, were enrolled. The design of the interventional study comprised a screening period (14 days) followed by a treatment period (14 days), during which subjects took zolpidem (5 or 10 mg according to the label) once daily in the evening. All subjects completed a daily eDiary in the morning and evening during the screening and treatment periods, including the morning of day 15. The IDSIQ-18 was included in the evening diary; the end-of-treatment IDSIQ-18 assessment was completed in the evening on day 14 (the eDiary was returned during the day on day 15, meaning that no evening IDSIQ-18 assessment was possible on that day). Additional PROs included in the morning and evening diaries were the Patient-Reported Outcomes Measurement Information System (PROMIS) Short Form Sleep Related Impairment (PROMIS Sleep) [28] and Patient Global Assessment of Disease Severity (PGA-S), assessed on days 1, 8, and 15; the ISI [9], assessed on days 1 and 15; and the Patient Global Impression of Change (PGI-C), assessed on days 8 and 15.

The observational study was a multi-site study conducted in the USA to support the validation of the IDSIQ. Its planned duration was February to May 2017. One hundred and three “good sleepers” aged ≥ 18 years without diagnosed insomnia, sleep apnea, or other diagnoses or drug treatments that can lead to sleep problems or fatigue completed the IDSIQ-18 daily for 15 days. All subjects provided written informed consent.

3.1.2 Safety

In the interventional study, adverse events (AEs) were recorded from the signing of informed consent to 30 days after the end of treatment. Treatment-emergent AEs were defined as AEs occurring from the first day of study treatment to the day after the last day of study treatment. Treatment-emergent AEs were assessed for severity, seriousness, relationship to study treatment, and action taken.

3.1.3 Exit Interviews

At the end of the interventional study, 45-minute exit interviews were conducted by telephone with a subset of subjects to generate a conceptual model of insomnia and the impact of insomnia on daytime functioning and health-related quality of life. Subjects’ understanding and interpretation of the IDSIQ-18 response options was also explored. Subjects were asked which three IDSIQ-18 items were most relevant to them, and for these items, were asked to indicate the minimum score change they would consider meaningful. They were also asked how long it took them to complete the IDSIQ-18 and how easy it was to fit into their daily routine. All interviews were held within 2 weeks of completing study treatment.

Interviews were conducted by experienced outcomes researchers using a semi-structured interview guide (available upon request) and were audio recorded and transcribed verbatim. The transcripts were anonymized and were analyzed thematically by grounded theory methods of constant comparison [21,22,23] and content analysis [24, 25] using ATLAS.ti version 7.0. The conceptual coding dictionary initially comprised codes that were defined a priori but was updated iteratively during the coding process.

3.1.4 Statistical Analyses

For both the interventional and observational studies, the following analysis populations were defined: the full analysis population, which comprised all subjects enrolled in the study; the cross-sectional population (CSP), which comprised subjects with non-missing data for the time point being analyzed; and the longitudinal analysis population, which comprised subjects with non-missing data for day 1 and for the subsequent time point being analyzed. The populations used to assess test-retest reliability comprised subjects in the CSP with data for days 1 and 8 and with a ≤ 1 point change on the PGA-S between day 1 and day 8 or no change on the PGA-S between day 1 and day 8. The safety set (interventional study only) comprised all subjects who received at least one dose of zolpidem.

Demographics and baseline characteristics for subjects with insomnia and good sleepers were summarized using descriptive statistics. The percentage of subjects selecting each response option on days 1, 8, and 15 was calculated for each item using the CSP. Floor and ceiling effects were defined as > 15% of subjects selecting the lowest and highest response options, respectively [29]. Scores for individual IDSIQ-18 items on day 1 were compared between subjects with insomnia and good sleepers by independent-groups t-test of means for continuous variables. The percentage of subjects who did not complete the IDSIQ-18 was calculated for each day during the treatment period (subjects with insomnia) or study period (good sleepers).
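The floor and ceiling criterion described above can be sketched in Python as follows. The function name and the example responses are hypothetical; only the 15% threshold and the 0–10 response range come from the text.

```python
def floor_ceiling(responses, lowest=0, highest=10, threshold=0.15):
    """Share of subjects choosing the lowest and highest response options
    for one item, and whether each share exceeds the threshold."""
    n = len(responses)
    if n == 0:
        raise ValueError("no responses supplied")
    floor_share = sum(r == lowest for r in responses) / n
    ceiling_share = sum(r == highest for r in responses) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Illustrative data only: 4 of 20 subjects choose 0 and 1 of 20 chooses 10.
summary = floor_ceiling([0] * 4 + [5] * 15 + [10])
```

Here 20% of subjects select the lowest option, so a floor effect is flagged, whereas 5% select the highest option and no ceiling effect is flagged.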

Inter-item correlations were calculated for item scores on day 1 using the CSP. Item pairs with highly correlated responses (> 0.80) were evaluated for redundancy [30,31,32].
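As an illustration of this redundancy screen, the following Python sketch computes Pearson correlations for all item pairs and lists those above the 0.80 cut-off. The item labels and scores are invented for the example, and the helper names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient for two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def redundant_pairs(items, cutoff=0.80):
    """Return (item, item, r) for every pair whose correlation exceeds cutoff."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(items.items(), 2):
        r = pearson(a, b)
        if r > cutoff:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Illustrative day 1 scores for three hypothetical items (not study data).
scores = {"A": [1, 2, 3, 4, 5], "B": [2, 4, 6, 8, 10], "C": [5, 1, 4, 2, 3]}
pairs = redundant_pairs(scores)
```

In this toy data set only the pair A and B is flagged; flagged pairs would then be reviewed for redundancy rather than removed automatically.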

3.1.5 Exploratory Factor Analysis

Using data from day 1, an exploratory factor analysis was performed to evaluate a hypothesized conceptual framework for the IDSIQ-18 comprising three domains: Alert/Cognition (items 1–3), Negative Mood (items 4–11), and Sleepiness (items 12–18) [33]. Scree plots and corresponding eigenvalues were examined to empirically approximate the number of factors for the IDSIQ-18 items. Factor solutions with eigenvalues near to or greater than 1 were examined. Root mean square error of approximation (RMSEA) and standardized root mean square residual (SRMR) were calculated to evaluate goodness of fit [34]. Models were estimated using weighted least-square mean and variance adjustment and solutions with an oblique rotation (promax) were examined. Items with factor loadings < 0.40 on all factors or that loaded on two or more factors were flagged for further review [35].
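The item-flagging rule applied to the rotated factor loadings can be sketched as below. The sketch assumes that an item "loads" on a factor when the absolute loading is at least 0.40, so that a cross-loading item is one with two or more such loadings; this reading of the second criterion is an assumption, and the loadings shown are invented.

```python
def flag_items(loadings, cutoff=0.40):
    """Flag items with no loading >= cutoff on any factor, or with loadings
    >= cutoff on two or more factors (assumed definition of cross-loading)."""
    flags = {}
    for item, row in loadings.items():
        salient = sum(abs(value) >= cutoff for value in row)
        if salient == 0:
            flags[item] = "no salient loading"
        elif salient >= 2:
            flags[item] = "cross-loading"
    return flags


# Hypothetical three-factor loadings for three items (not study results).
example_loadings = {
    "item_x": [0.80, 0.10, 0.05],
    "item_y": [0.30, 0.20, 0.10],
    "item_z": [0.50, 0.45, 0.00],
}
flags = flag_items(example_loadings)
```

Items that load cleanly on a single factor, such as the hypothetical item_x, are not returned.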

3.1.6 Rasch Analysis

The Andrich Rating Scale model [36], which assesses dimensionality, item and response scale fit, and residual correlations between the items (indicating redundancy or overlapping items), was used to confirm the unidimensional factor structure of the IDSIQ items. Criteria for unidimensionality were > 60% of the variance explained by the Rasch model and < 5% by the first residual factor. Local dependence (redundancy) was assumed when paired residual correlations were ≥ 0.40. Acceptable item fit was defined by infit and outfit statistics within the range of 0.5–1.5. To demonstrate the degree to which the items reliably differentiate subjects along the insomnia severity continuum, person and item reliability values from the Rasch model were assessed with a prespecified criterion of ≥ 0.80 (high discrimination).

Four of the IDSIQ-18 items were flagged for removal because of their high inter-item correlations with other items. The deletion of these four items was supported by the results of the exploratory factor and Rasch analyses and a review of the qualitative data specific to these items. The resulting 14-item version of the IDSIQ comprised three domains: Alert/Cognition, Mood, and Sleepiness. Psychometric evaluation of the IDSIQ-14 was performed using data for subjects with insomnia and good sleepers.

3.1.7 Confirmatory Factor Analysis

Confirmatory factor analysis models were used to confirm the hypothesized three-domain structure of the IDSIQ-14 [37, 38], which was similar to that of the DISS [10]. A full hierarchical model reflecting the relationships of the IDSIQ-14 domains and total score was estimated. The domains were analyzed as first-order factors and IDSIQ-14 total score as a second-order factor. Each domain was also evaluated separately. The confirmatory factor analysis models were estimated by weighted least-square mean and variance-adjusted estimation at day 1 for subjects with insomnia. Model fit was assessed by calculating comparative fit index (CFI: a score ≥ 0.95 indicates good fit), RMSEA (a score < 0.08 indicates acceptable fit), and SRMR (a score < 0.08 indicates good fit) [34, 37, 39]. Adequacy of item fit was evaluated by examining item factor loadings.
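The three fit criteria can be expressed as a simple check. The sketch below only encodes the thresholds stated above; it does not estimate a model, and the function name is hypothetical. The example call uses the full-model values reported in Section 3.2.7.

```python
def assess_fit(cfi, rmsea, srmr):
    """Compare fit indices with the prespecified criteria:
    CFI >= 0.95, RMSEA < 0.08, SRMR < 0.08."""
    return {
        "CFI": cfi >= 0.95,
        "RMSEA": rmsea < 0.08,
        "SRMR": srmr < 0.08,
    }


# Full hierarchical model: CFI 0.950, RMSEA 0.101, SRMR 0.050.
full_model = assess_fit(cfi=0.950, rmsea=0.101, srmr=0.050)
```

For the full model this returns a pass for CFI and SRMR and a fail for RMSEA, matching the description in the results.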

3.1.8 Item Total Correlations

Corrected item total Pearson correlations were calculated for the IDSIQ-14, and items for which the correlation with their own domain was < 0.40 were flagged for further review.
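A corrected item total correlation correlates each item with the sum of the remaining items in its domain, so that the item does not contribute to its own total. A minimal sketch, using invented scores and hypothetical names, is:

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient for two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(domain):
    """Correlate each item with the sum of the other items in the domain."""
    names = list(domain)
    n_subjects = len(domain[names[0]])
    result = {}
    for name in names:
        rest = [
            sum(domain[other][i] for other in names if other != name)
            for i in range(n_subjects)
        ]
        result[name] = pearson(domain[name], rest)
    return result


# Illustrative scores for a hypothetical three-item domain (not study data).
domain_scores = {"a": [1, 2, 3, 4, 5], "b": [2, 2, 4, 4, 6], "c": [1, 3, 2, 5, 4]}
correlations = corrected_item_total(domain_scores)
flagged = [name for name, r in correlations.items() if r < 0.40]
```

With these example scores all three correlations exceed 0.40, so no item is flagged.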

3.1.9 Internal Consistency

CSP data from day −6 to day 1 for subjects with insomnia and from day 1 for good sleepers were used to assess the internal consistency of the IDSIQ-14 domains. Cronbach’s alpha was calculated for IDSIQ-14 total score and for each domain to assess the homogeneity of the item scores. Cronbach’s alpha was recalculated for the domains after removal of individual items. A value > 0.70 indicates acceptable internal consistency [40, 41].
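For reference, Cronbach's alpha for k items is k/(k − 1) × (1 − sum of item variances/variance of the total score). The sketch below implements this standard formula with sample variances; the data and function name are illustrative only.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, where each item is a list of
    scores with one entry per subject (same subject order for every item)."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(subject_scores) for subject_scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Illustrative scores for three hypothetical items and five subjects.
alpha = cronbach_alpha([[1, 2, 3, 4, 5], [2, 2, 4, 4, 6], [1, 3, 2, 5, 4]])
acceptable = alpha > 0.70
```

Recalculating alpha after dropping each item in turn, as described above, amounts to calling the function on the list with that item removed.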

3.1.10 Test-Retest Reliability

Test-retest reliability was assessed by calculating the Shrout and Fleiss intraclass correlation coefficient (ICC) for IDSIQ-14 scores on days 1 and 8: ICC(2,1) = between-subject mean square/(between-subject mean square + within-subject mean square). Weekly average IDSIQ-14 scores, calculated as the mean score over days −6 to 1 (day 1) or days 2–8 (day 8), were used in the analyses, where possible (for good sleepers, no IDSIQ-14 data were collected prior to day 1). Subjects with IDSIQ-14 data for fewer than 2 days in a given week were excluded from analyses based on weekly average scores. An ICC value of ≥ 0.70 provides evidence of acceptable test-retest reliability for a scale to be used in detecting group mean differences [42, 43].
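The weekly averaging rule used as input to the ICC can be sketched as follows. Missing diary days are represented by None, which is an assumption about data layout; the exclusion of weeks with fewer than 2 days of data follows the text. The ICC computation itself is not reproduced here.

```python
def weekly_average(daily_scores, min_days=2):
    """Mean of the available daily scores in one week, or None when fewer
    than min_days scores are available (subject excluded for that week)."""
    available = [score for score in daily_scores if score is not None]
    if len(available) < min_days:
        return None
    return sum(available) / len(available)


# Illustrative diary weeks (not study data); None marks a missed entry.
week_with_two_days = weekly_average([4, None, 6, None, None, None, None])
week_with_one_day = weekly_average([4, None, None, None, None, None, None])
```

The first week yields an average of 5.0 from two completed days, whereas the second week has only one completed day and is therefore excluded.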

3.1.11 Concurrent Validity

Concurrent validity was evaluated at day 1 by calculating Pearson correlation coefficients for IDSIQ-14 total score and domain scores vs ISI total score and PROMIS Sleep T-score. For subjects with insomnia, IDSIQ-14 scores were calculated as the mean score over days −6 to 1. PROMIS Sleep T-score was calculated using a mapping algorithm [28]. A correlation coefficient of > 0.70 indicated a strong correlation; a coefficient of 0.40–0.70 a moderate correlation; and a coefficient of < 0.40 a weak correlation.

3.1.12 Known-Groups Validity

Mean weekly average IDSIQ-14 total and domain scores at day 1 were compared between subgroups of subjects with insomnia, grouped according to the ISI total score and PGA-S. Analysis of covariance models were used to assess the significance of differences in IDSIQ-14 total and domain scores, controlling for age and sex. Group means were compared between different ISI and PGA-S severity categories by post hoc pairwise testing with Scheffe’s test, to adjust for multiple comparisons. Mean weekly average IDSIQ-14 scores were also compared between subjects with insomnia and good sleepers by independent-groups t test.

3.1.13 Responsiveness

Responsiveness was calculated using data from the longitudinal analysis population for subjects with insomnia. The final assessment was on day 14 for the IDSIQ-14 and on day 15 for other PROs. For the IDSIQ-14, weekly average scores were calculated as the mean score over days −6 to 1 (day 1), days 2–8 (day 8), and days 9–14 (day 14/15). Differences in mean IDSIQ-14 total score and domain scores between day 1 and days 8 and 14 were calculated for different subgroups of subjects, grouped according to PGI-C, change in PGA-S score, and change in ISI score. The standardized response mean (Cohen’s d) was calculated by dividing the mean change in the IDSIQ-14 score between day 1 and day 8 or day 14 by the standard deviation of IDSIQ-14 score changes. The magnitude of responsiveness based on the standardized response mean was assessed using Cohen’s criteria of 0.20 = small, 0.5 = moderate, and 0.80 = large [44]. Mean differences in IDSIQ-14 scores between the different PGI-C, PGA-S, and ISI response categories were analyzed using analysis of covariance models adjusted for age, sex, and baseline IDSIQ-14 score. Change categories were compared with the “No change” (PGI-C) or “0” (PGA-S and ISI) category in pairwise fashion by Scheffe’s test to adjust for multiple comparisons.
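The standardized response mean calculation can be illustrated with the short sketch below. It assumes that the sample standard deviation of the change scores is used and that Cohen's values act as lower bounds for each magnitude category; both are assumptions, and the scores are invented.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the standard deviation of the change scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def magnitude(srm):
    """Label the absolute SRM using Cohen's criteria as lower bounds."""
    size = abs(srm)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "negligible"


# Illustrative weekly average total scores for four subjects (not study data).
srm = standardized_response_mean([60, 70, 80, 65], [50, 65, 70, 60])
label = magnitude(srm)
```

Because higher IDSIQ scores are worse, improvement appears as a negative change and therefore a negative standardized response mean; the label is based on its absolute value.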

3.1.14 Cumulative Distribution Function and Probability Density Function Curves

Cumulative distribution function and probability density function curves for change in weekly average IDSIQ-14 total score and domain scores according to PGI-C and PGA-S were plotted for subjects with insomnia using the same PGI-C and PGA-S categories as in the analysis of responsiveness.
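An empirical cumulative distribution function of change scores for one anchor category can be computed as in the following sketch. The function name and the change scores are hypothetical; the points returned are what would be plotted as a step curve for each PGI-C or PGA-S category.

```python
def ecdf(changes):
    """Return (value, cumulative proportion) pairs: for each distinct change
    score, the proportion of subjects with a change at or below that value."""
    n = len(changes)
    if n == 0:
        raise ValueError("no change scores supplied")
    return [
        (value, sum(change <= value for change in changes) / n)
        for value in sorted(set(changes))
    ]


# Illustrative change scores for one hypothetical anchor category.
points = ecdf([-10, -5, -5, 0])
```

Plotting one such curve per anchor category on the same axes shows how far the distributions of change are separated between categories.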

3.1.15 Discriminant Validity

A multiple linear regression model adjusted for sex and age was used to predict outcomes on the IDSIQ using insomnia status (good sleepers vs subjects with insomnia) as the independent variable. Beta coefficients, standard error, 95% confidence intervals, and p values were calculated.

3.1.16 Interpretation of Scores

Thresholds for score changes which subjects might perceive as meaningful were derived by an anchor-based approach using the responsiveness data [45]. PGI-C and PGA-S were chosen as the anchors based on Pearson correlation coefficients >0.40 for correlations with weekly average IDSIQ-14 scores. Differences in mean weekly average IDSIQ-14 scores between day 1 and day 8 or day 14/15 were calculated for each PGI-C and PGA-S change category. Consistent with FDA recommendations [7], multiple estimates based on the anchor-based approach were compared to identify where they converged around a similar set of values that, together with the standardized response mean values and 95% confidence intervals for IDSIQ-14 score changes, could be used to identify meaningful changes.
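The first step of the anchor-based approach, calculating the mean change in weekly average score within each anchor category, can be sketched as below. The category labels and change scores are invented for illustration, and the helper name is hypothetical; the comparison and convergence of estimates across anchors is a judgment step that is not automated here.

```python
from collections import defaultdict
from statistics import mean


def mean_change_by_anchor(records):
    """Group (anchor category, change score) pairs and return the mean
    change in each category."""
    groups = defaultdict(list)
    for category, change in records:
        groups[category].append(change)
    return {category: mean(values) for category, values in groups.items()}


# Illustrative records only; labels are hypothetical PGI-C categories.
example_records = [
    ("No change", -1.0),
    ("No change", 1.0),
    ("A little better", -6.0),
    ("A little better", -8.0),
]
estimates = mean_change_by_anchor(example_records)
```

The mean change in a category reflecting small but noticeable improvement, compared with the "No change" category, provides one candidate estimate of a meaningful change.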

3.2 Results

Development of the final 14-item IDSIQ is summarized in Fig. 1. Demographic characteristics of the subjects in the clinical studies are shown in Table 2. Subjects in both studies were predominantly female (65% for subjects with insomnia and 60% for good sleepers), with a mean (standard deviation) age of 51 (12) years for subjects with insomnia and 45 (17) years for good sleepers. A majority of subjects were white and non-Hispanic or Latino. Daytime sleepiness based on PGA-S was mild, very mild, or absent in 75% of good sleepers, compared with 8% of subjects with insomnia. Most subjects with insomnia (75%) had an ISI total score of 15–21, indicating insomnia of moderate severity. Current sleep-related health was good to excellent in 97% of good sleepers and fair in 3%.

Table 2 Demographic characteristics of subjects in the clinical studies (full analysis population)

In the interventional study, the AE data were in accordance with the known safety profile of zolpidem (Table 3 of the ESM). One unrelated AE led to discontinuation of zolpidem; there were no serious or severe AEs.

3.2.1 Exit Interviews

Exit interviews were conducted with 41 US subjects exiting the interventional study (Table 2). Concepts that emerged during the interviews are summarized in Table 4 of the ESM. Table 1 of the ESM shows COREQ-required information for the exit interviews [26].

The most frequently elicited daytime effects of insomnia were tiredness (n = 38) and concentration issues (n = 17). Twelve of the 15 physical, cognitive, and emotional impacts that emerged during the concept elicitation exercise are captured by the IDSIQ-18 (the concepts that are not captured are pain, numbness, and sadness/depression).

Item 12 (“Energetic”: n = 18) and item 1 (“Mentally Alert”: n = 14) were the IDSIQ-18 items that resonated most strongly with subjects. Subjects indicated a good understanding of the response scale and were able to provide a clear interpretation of the meaning of a score of 0, 5, and 10 on the response scale.

The mean minimum score change considered meaningful was 3.2 points for the Alert/Cognition domain (n = 15), 3.4 points for the Mood domain (n = 11), and 3.5 points for the Sleepiness domain (n = 18). At the item level, the mean minimum score change that subjects considered meaningful ranged from 2.0 for item 13 (“Effort”) to 4.5 for items 8 (“Irritable”) and 9 (“Impatient”).

Subjects reported that the IDSIQ-18 was easy to complete and took 5–10 minutes. They also reported that it was easy to fit the questionnaire into their daily schedule.

3.2.2 Item Distribution

Missing data were minimal in the interventional study, with 0–7.0% of subjects with insomnia not completing the IDSIQ-18 on any given day in the treatment period. The completion rate was also high for the good sleepers in the observational study, with 9.7–21.4% of subjects not completing the IDSIQ on any given day in the study period.

Analysis of data from day 1 of zolpidem treatment showed that the subjects with insomnia collectively selected all 11 response options (0–10), with the exception of items 14 (“Refreshed”) and 15 (“Mentally Tired”) (Fig. 1a of the ESM). No floor or ceiling effects were observed. By contrast, good sleepers showed a large floor effect: for each item, over 15% of subjects selected a response option of 0 on day 1 and over 40% of subjects selected a response option of 0 to 2 (Fig. 1b of the ESM). However, good sleepers selected a response option of at least “8” for each of the 18 items and the “10” response option for 13 of the 18 items. Mean scores for each of the 18 items on day 1 were higher (worse) in subjects with insomnia compared with good sleepers (Table 3).

Table 3 Mean day 1 item scores in subjects with insomnia and good sleepers (full analysis population)

3.2.3 Inter-Item Correlations

The number of item pairs with correlations > 0.80 on day 1 (which is indicative of redundancy) was 15 for subjects with insomnia and 2 for good sleepers (Table 4). For subjects with insomnia, item 1 (“Mentally Alert”) was strongly correlated with the two other items in the Alert/Cognition domain: item 2 (“Clear Headed”) and item 3 (“Concentrate”). Moreover, item 6 (“Nervous”) was strongly correlated with another item in the Mood domain: item 11 (“Overly Sensitive”). Items 9 (“Impatient”) and 11 were strongly correlated with item 8 (“Irritable”); item 9 was also strongly correlated with item 10 (“Stressed”). Finally, item 12 (“Energetic”) was strongly correlated with three other items: 1, 14 (“Refreshed”), and 18 (“Awake”).

Table 4 Inter-item correlations for the Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ)-18 on day 1 in subjects with insomnia and good sleepers (cross-sectional population)

3.2.4 Exploratory Factor Analysis

Calculated eigenvalues suggested the presence of two to three separate factors, with good evidence for the first factor (eigenvalue 12.093) and second factor (1.441), and some evidence for a third factor (0.954). In the three-factor model, four items did not conform to the hypothesized domain structure of the IDSIQ-18: item 7 (“Frustrated”) loaded on the Sleepiness domain rather than on the Negative Mood domain and items 12 (“Energetic”), 14 (“Refreshed”), and 18 (“Awake”) loaded on the Alert/Cognition domain rather than the Sleepiness domain (Table 5).

Table 5 Exploratory factor analysis of the hypothesized Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ-18) domain structure using day 1 data from subjects with insomnia (full analysis population, N = 114)

3.2.5 Rasch Analysis

Rasch principal component analysis revealed that 66.6% of the variance on day 1 was explained by the Rasch single-factor dimension. The first residual factor explained 8.8% of the variance, indicating that the IDSIQ is multidimensional (rather than unidimensional). Two item pairs, item 1-item 2 (“Mentally Alert”-“Clear Headed”: 0.74) and item 1-item 3 (“Mentally Alert”- “Concentrate”: 0.65), displayed high residual correlations, suggesting redundancy or local dependence. The outfit statistic of 1.7 for item 4 (“Forgetful”) was slightly above the acceptable range of 0.5–1.5, indicating variability in the item responses at the tails of the severity distribution; however, the infit statistic of 1.5 was within the acceptable range. Person and item reliability values (person = 0.96; item = 0.92) were above the threshold of 0.8.

3.2.6 Generation of the Final IDSIQ Instrument

Based on the results of the inter-item correlation, exploratory factor, and Rasch analyses, as well as the qualitative interviews, four of the IDSIQ-18 items were removed: 1 (“Mentally Alert”), 6 (“Nervous”), 9 (“Impatient”), and 11 (“Overly Sensitive”). The remaining 14 items were renumbered and loaded onto three domains: Alert/Cognition (six items), Negative Mood (four items), and Sleepiness (four items) (Table 6). Although these domains resembled the domains specified in the theoretical framework for the 18-item IDSIQ, the 14 IDSIQ items loaded differently to the hypothesized structure. The “Negative Mood” domain was renamed the “Mood” domain. The final IDSIQ instrument is shown in Fig. 2 of the ESM.

Table 6 Internal consistency of the Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ)-14 at day 1 (cross-sectional population)

3.2.7 Confirmatory Factor Analysis

Model fit for the full model was acceptable based on CFI (0.950) and SRMR (0.050), but not RMSEA (0.101) (Table 5 of the ESM). The Mood and Sleepiness domains satisfied the prespecified criteria for CFI, RMSEA, and SRMR, indicating that the unidimensional models for these domains had good fit and explained the data well. For Alert/Cognition, although CFI (0.935) and RMSEA (0.201) did not satisfy the prespecified acceptability criteria, good model fit was indicated by SRMR (0.034) and strong factor loadings (range 0.680–0.926).
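The pattern of fit results above can be reproduced with a simple threshold check. The sketch below uses the reported index values, but the cutoffs (CFI ≥ 0.95, RMSEA ≤ 0.08, SRMR ≤ 0.08) are assumed conventional values; the study's prespecified criteria are not restated in this section and may differ.

```python
# Assumed conventional cutoffs (an assumption, not the study's stated criteria).
ASSUMED_CUTOFFS = {
    "CFI": lambda value: value >= 0.95,
    "RMSEA": lambda value: value <= 0.08,
    "SRMR": lambda value: value <= 0.08,
}


def assess_fit(indices):
    """Return True/False per fit index according to the assumed cutoffs."""
    return {name: ASSUMED_CUTOFFS[name](value) for name, value in indices.items()}


# Index values as reported in the text.
full_model = assess_fit({"CFI": 0.950, "RMSEA": 0.101, "SRMR": 0.050})
alert_cognition = assess_fit({"CFI": 0.935, "RMSEA": 0.201, "SRMR": 0.034})
```

Under these assumed cutoffs, the full model passes on CFI and SRMR but not RMSEA, and the Alert/Cognition domain passes on SRMR only, which matches the description in the text.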

3.2.8 Item Total Correlation Analysis

For subjects with insomnia, corrected item total correlations for individual IDSIQ-14 items with their own domain were all higher than the threshold of 0.40 on day 1 (range 0.669–0.886). Some items had a weaker correlation with their own domain than with other domains, although in most cases the differences were negligible. Only the item “Energetic” showed a markedly stronger correlation with another domain than with its own domain: 0.846 for Alert/Cognition vs 0.697 for Sleepiness. The same item also had a stronger correlation with Alert/Cognition than with its own domain among good sleepers (0.614 vs 0.524). The item “Forgetful” had a correlation of 0.342 with its own domain (Alert/Cognition) among good sleepers. All other item total correlations for individual items with their own domain were > 0.40 in good sleepers.
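A corrected item total correlation relates each item to the sum of the other items in its domain, so that the item does not inflate its own correlation. A minimal sketch follows, assuming Pearson correlations; the item labels and scores are hypothetical and are not study data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def corrected_item_total(domain_items, target):
    """Correlate one item with the summed score of the remaining domain items."""
    rest = [scores for name, scores in domain_items.items() if name != target]
    rest_totals = [sum(values) for values in zip(*rest)]
    return pearson(domain_items[target], rest_totals)


# Invented scores for four hypothetical subjects on a three-item domain.
toy_domain = {
    "item_a": [1, 2, 3, 4],
    "item_b": [2, 4, 6, 8],
    "item_c": [1, 3, 2, 5],
}
r_item_a = corrected_item_total(toy_domain, "item_a")
meets_threshold = r_item_a > 0.40  # threshold used in the text
```

Repeating the calculation against the summed score of a different domain gives the cross-domain comparison described for the item “Energetic”.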

3.2.9 Internal Consistency

Cronbach’s alpha coefficients for IDSIQ-14 total score and the three IDSIQ-14 domains were all above 0.70 (Table 6), indicating acceptable internal consistency.
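For reference, Cronbach’s alpha for a set of k items is k/(k − 1) × (1 − sum of item variances / variance of the total score). The sketch below implements this standard formula with sample variances; the function name and toy scores are illustrative and are not study data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-item score lists (one score per subject)."""
    k = len(item_scores)
    # Total score per subject across all items.
    totals = [sum(values) for values in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented scores for four hypothetical subjects on three items.
toy_scores = [
    [1, 2, 3, 4],
    [2, 4, 6, 8],
    [1, 3, 2, 5],
]
alpha = cronbach_alpha(toy_scores)
acceptable = alpha > 0.70  # threshold used in the text
```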

3.2.10 Test-Retest Reliability

For subjects with no change between day 1 and day 8 on the PGA-S, ICC values for IDSIQ-14 total score and the three IDSIQ-14 domains were all above 0.70 in subjects with insomnia (Table 7), indicating acceptable test-retest reliability. When the analysis was repeated with inclusion of subjects with a 1-point change on the PGA-S, ICC values for IDSIQ-14 total score and the Alert/Cognition and Mood domains remained above 0.70.
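The ICC compares between-subject variance with the variance between the two administrations. The sketch below computes a two-way random-effects, absolute-agreement, single-measure ICC, often written ICC(2,1). The ICC form is an assumption, as it is not specified in this section, and the scores are invented.

```python
def icc_2_1(ratings):
    """ICC(2,1) for rows of [day 1 score, day 8 score], one row per subject."""
    n = len(ratings)        # subjects
    k = len(ratings[0])     # administrations (2 for test-retest)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented day 1 and day 8 total scores for four hypothetical stable subjects.
toy_ratings = [[10, 12], [20, 18], [35, 36], [50, 47]]
icc = icc_2_1(toy_ratings)
acceptable = icc > 0.70  # threshold used in the text
```

Restricting the input rows to subjects with no change on an anchor such as the PGA-S corresponds to the stable-subject analysis described above.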

Table 7 Test–retest reliability of the Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ)-14 in subjects with insomnia and good sleepers (test–retest populations)

3.2.11 Concurrent Validity

IDSIQ-14 total score and domain scores showed moderate to strong correlations with PROMIS Sleep T score in subjects with insomnia (range 0.666–0.778) and good sleepers (0.493–0.755) (Table 8). Correlations with ISI total score were weak in subjects with insomnia (range 0.283–0.315) and weak to moderate in good sleepers (0.321–0.586).

Table 8 Concurrent validity of the Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ)-14 at day 1 in subjects with insomnia and good sleepers (full analysis population)

3.2.12 Known-Groups Validity

With the exception of the comparison of the subthreshold and moderate ISI severity categories for the IDSIQ-14 Alert/Cognition domain (p = 0.1432), all pairwise comparisons of different ISI and PGA-S severity categories showed significantly higher mean weekly average IDSIQ-14 total scores and domain scores for the more severe category (p < 0.05) (Table 6 of the ESM). Mean weekly average IDSIQ-14 total scores and domain scores were also higher for subjects with insomnia than for good sleepers.

3.2.13 Responsiveness

In an analysis based on data from subjects with insomnia in the interventional study, the IDSIQ-14 demonstrated change over time, with significant decreases in mean weekly average IDSIQ-14 total and domain scores between day 1 and days 8 and 14/15 in subjects who reported a decrease in disease severity based on PGI-C, PGA-S, or ISI (Table 7 of the ESM). However, significant decreases in mean weekly average IDSIQ-14 total and domain scores were also observed in subjects with no change in disease severity based on PGI-C, PGA-S, or ISI.

3.2.14 Cumulative Distribution Function and Probability Density Function Curves

Cumulative distribution function curves showing changes in IDSIQ-14 total score and domain scores according to changes in PGA-S score and PGI-C category indicate greater decreases (improvements) in IDSIQ-14 scores in subjects with insomnia who reported the greatest improvements in symptom severity (Fig. 2). The corresponding probability density function curves are shown in Fig. 3 of the ESM.
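An empirical cumulative distribution function of change scores, stratified by anchor category, can be computed as sketched below. The group labels, change scores, and the −20 cutoff used for illustration are assumptions for the example and are not study data.

```python
def empirical_cdf(changes):
    """Return (change score, cumulative proportion) points in ascending order."""
    ordered = sorted(changes)
    n = len(ordered)
    return [(value, (rank + 1) / n) for rank, value in enumerate(ordered)]


def proportion_at_or_below(changes, cutoff):
    """Proportion of subjects whose change score is at or below the cutoff."""
    return sum(1 for change in changes if change <= cutoff) / len(changes)


# Invented changes in total score (negative values indicate improvement),
# grouped by a hypothetical anchor category.
toy_groups = {
    "improved": [-30, -22, -15, -8],
    "no change": [-10, -4, 0, 3],
}
curves = {label: empirical_cdf(values) for label, values in toy_groups.items()}
improved_share = proportion_at_or_below(toy_groups["improved"], -20)
unchanged_share = proportion_at_or_below(toy_groups["no change"], -20)
```

Plotting each group's points as a step curve shows how far the curves separate: a curve shifted toward more negative change scores indicates greater improvement in that anchor category.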

Fig. 2

Cumulative distribution function (CDF) curves for changes in weekly average Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ)-14 total score and domain scores according to changes in Patient Global Assessment of Disease Severity (PGA-S) score and Patient Global Impression of Change (PGI-C) category in subjects with insomnia (full analysis population). a Changes in IDSIQ-14 total score and domain scores stratified by change in PGA-S score. b Changes in IDSIQ-14 total score and domain scores stratified by PGI-C category. Changes in IDSIQ-14 and PGA-S scores and PGI-C categories are for day 1 to day 14/15

3.2.15 Discriminant Validity

In a sex- and age-adjusted analysis, having insomnia was associated with an IDSIQ-14 total score on day 1 that was 46.08 points higher (p < 0.0001) compared with good sleepers (Table 9). For individual IDSIQ-14 domains, the corresponding differences were 19.25 points for Alert/Cognition (p < 0.0001), 13.26 points for Mood (p < 0.0001), and 13.58 points for Sleepiness (p < 0.0001).

Table 9 Association between insomnia status and Insomnia Daytime Symptoms and Impacts Questionnaire (IDSIQ)-14 scores on day 1 (cross-sectional population)

3.2.16 Interpretation of Scores

Based on anchor-based analyses (Table 7 of the ESM), a 20-point change in the IDSIQ-14 total score was considered meaningful. The corresponding meaningful score changes for the IDSIQ-14 domains were 9 points for the Alert/Cognition domain, 4 points for the Mood domain, and 4 points for the Sleepiness domain.

4 Discussion

To the best of our knowledge, this study presents the first self-report PRO instrument developed and validated according to FDA guidelines [7] that can be used to measure the impact of insomnia on daytime functioning. The IDSIQ comprises 14 items with a recall period of “today”. The items are grouped into three domains reflecting daytime effects of insomnia that are commonly encountered in clinical practice [2, 3]: Alert/Cognition, Mood, and Sleepiness. The IDSIQ can be used in clinical trials of insomnia therapies and in observational studies. Following a thorough selection process from among existing instruments, the IDSIQ was rigorously developed using the DISS [10] as the starting point. Its content validity was established in qualitative interviews with 54 subjects, its psychometric properties were then evaluated, and thresholds for clinically meaningful improvement were identified. Given the rigor of its development and adherence to FDA guidelines for PRO development and validation, the IDSIQ can be used in registration studies of insomnia therapies and in observational studies to characterize the deleterious impact of insomnia on daytime functioning.

Confirmatory factor analysis generally supported the domain structure of the final 14-item version of the IDSIQ, although RMSEA indicated suboptimal model fit. However, poor fit for RMSEA is common in smaller samples [46]. Test-retest reliability and known-groups validity of the final IDSIQ were good and concurrent validity based on PROMIS Sleep was acceptable. The poor concurrent validity based on ISI in subjects with insomnia may reflect the fact that the ISI has a 2-week recall period [9], has only one item that directly measures the daytime impact of insomnia, and was developed without patient input. Data from a clinical trial of a sleep medication (zolpidem) were used to assess responsiveness of the IDSIQ to change in subjects with verified DSM-5 insomnia diagnoses. The multiple anchors, anchor threshold criteria, and time points included in the responsiveness analysis made it possible to evaluate whether similar results were obtained across different anchors and different time points. Through this robust approach, responsiveness was found to be acceptable based on decreases in mean IDSIQ-14 scores in subjects who reported a decrease in disease severity based on PGI-C, PGA-S, or ISI. The cumulative distribution function and probability density function curves show that a consistent relationship was observed across the continuum of change scores, suggesting that the IDSIQ is sensitive across a range of change values. Finally, despite the limitations of the small sample and dependence on anchor-based methods only, the meaningful change thresholds give a good indication of the levels of improvement that are relevant for this sample of patients with insomnia.

Another instrument, the SFIS, was developed by Bell and colleagues to address the lack of PROs for daytime functioning in insomnia [16]. The SFIS captures the patient experience of how insomnia affects their psychological and cognitive functioning, social activities, and work productivity. Its psychometric properties were validated in a study of 171 subjects with self-reported insomnia and 261 subjects with no history of insomnia. One of the limitations we evaluated during our review of available PRO instruments was whether the recall period was appropriate. The utility of the SFIS [16] and the DSCQ [11] is limited by a 7-day recall period, which does not capture day-to-day variability in symptoms. In comparison, the IDSIQ offers several advantages such as daily reporting of symptoms, use of an NRS (the SFIS uses a 5-point Likert scale), and calculation of different domain scores (the SFIS offers only a global score), which is advantageous because insomnia is a multi-faceted condition.

Strengths of the presented studies include the capture of insights from a total of 95 subjects with insomnia in interviews during instrument development and validation. Moreover, the subsequent psychometric evaluation was performed in accordance with published FDA guidance for assessing the measurement properties of PROs and for obtaining patient input during clinical development [7, 47]. One limitation is that subjects in the qualitative interviews were naïve to the questionnaire they were asked about, whereas subjects in the exit interviews had completed the IDSIQ-18 every night for 2 weeks. The exit interviews therefore offer a perspective that is not directly comparable with the feedback from the qualitative interviews. Another limitation is the fact that no dedicated instrument for assessing daytime functioning was included in the analysis of concurrent validity. Additionally, the validity of the IDSIQ in people with milder insomnia remains to be established, as subjects participating in the interventional study were required to have an ISI score of ≥15. Data from a larger sample could provide further support for the meaningful change thresholds. Moreover, the IDSIQ has so far only been tested with short-term zolpidem use. Its ability to capture longer term treatment effects remains to be established.

5 Conclusions

Through a rigorous process, the IDSIQ was developed and validated as the first self-report PRO instrument with a daily recall period for evaluating daytime symptoms in people with insomnia disorder. The short completion time and subject feedback on ease of completion suggest that the IDSIQ is patient friendly. The good test-retest reliability and acceptable concurrent validity and responsiveness of the IDSIQ indicate that it can be used in registration studies of new insomnia drugs.