Introduction

Attention-deficit/hyperactivity disorder (ADHD) is a disorder characterized by hyperactivity, impulsivity, and inattention that affects between 3 and 7% of school-age children (APA 2000). A worldwide pooled prevalence of 5.29% has been reported (Polanczyk et al. 2007). Impairment of ADHD affects cognitive and psychosocial functioning (Barkley 2002; Biederman and Faraone 2005; Nijmeijer et al. 2008; Escobar et al. 2008) as well as the quality of life (QoL) in patients and their families (Johnston and Mash 2001; Sawyer et al. 2002; Klassen et al. 2004; Matza et al. 2004; Escobar et al. 2005; Riley et al. 2006b).

Treatment options for ADHD include psychostimulants, especially in combination with behavioral therapy (MTA study) (Jensen et al. 2001) or atomoxetine, which is a non-stimulant treatment option for ADHD (Cheng et al. 2007). In most of the studies evaluating the efficacy of these medications, questionnaires such as the ADHD Rating Scale (ADHD-RS) (DuPaul et al. 1998a; Faries et al. 2001) or the clinical global impression (CGI) (Guy 1976; NIMH 1985) have been used as outcome measures for the core symptoms of ADHD.

Health-related QoL has received increasing attention both from clinicians and from investigators in children and adolescents with ADHD (Harpin 2005; Hakkart-van Roijen et al. 2007; Yang et al. 2007; Bastiaens 2008). Health-related QoL is a multidimensional concept that reflects the subjective physical, social, and psychological aspects of health and is distinct from symptoms of the disorder and objective functional outcomes (Wallander et al. 2001). It strongly depends on the subjectively perceived impact of the disorder (and of the respective treatment) on the level of physical, psychological, and social functioning (Leidy et al. 1999; Revicki et al. 2000). Some psychometric instruments are available to assess the health-related QoL, including the Child Health and Illness Profile, Child Edition (CHIP-CE) (Riley et al. 2001; Riley et al. 2006b) and the Child Health Questionnaire (CHQ) (Landgraf et al. 1996). These questionnaires are generic scales that assess QoL aspects that go beyond the core symptoms of the disorder and reflect various dimensions of QoL. CHIP-CE has child-, adolescent- and parent-rated versions, allowing the assessment of the patient’s QoL both from the parent’s and from the patient’s perspective. The possibility to assess QoL from different perspectives is a promising characteristic of this instrument for assessing QoL in children and adolescents (Schmidt et al. 2001).

A number of studies have shown improvement in health-related QoL in children and adolescents treated with atomoxetine (Michelson et al. 2001; Buitelaar et al. 2004; Perwien et al. 2004; Matza et al. 2006; Brown et al. 2006; Perwien et al. 2006; Prasad et al. 2007; Wehmeier et al. 2007, 2008). These studies have used the CHQ, the CHIP-CE, or other QoL instruments.

Up to now, the psychometric properties of the CHIP-CE have mostly been studied in non-ADHD populations using cross-sectional data only. Only Riley et al. (2006b) discuss some psychometric properties of this generic scale in an ADHD population. They found that internal consistency reliability was good-to-excellent (Cronbach’s α > 0.70) for all CHIP-CE domains and sub-domains and that almost no ceiling and floor effects were observed. A factor analysis of the sub-domains yielded a 12-factor solution. The domain-level factor analysis identified six factors: the four domains of Satisfaction, Comfort, Resilience, and Risk avoidance and, in addition, the two sub-domains of the Achievement domain. Moderate to high correlations between the CHIP-CE scales and measures of ADHD and family factors were found. The HRQoL of children in this sample was considerably lower than that of community youth. However, this analysis has some limitations. First, patients were not required to have a formal diagnosis of ADHD; inclusion required only the investigator’s clinical judgment that the patient had hyperactive/inattentive/impulsive symptoms or problems, together with no previous formal diagnosis of ADHD or of a hyperactive/inattentive/impulsive syndrome. Another analysis of the study data showed that 11.5% of patients did not fulfill strict ADHD criteria (Döpfner et al. 2006). Second, only cross-sectional data were analyzed, so no statements about the sensitivity of the scores to changes over time could be made.

The objectives of the present combined analysis were to evaluate the psychometric properties of the CHIP-CE at baseline and over time and to assess the correlation between parameters related to QoL and those related to ADHD core symptoms using the individual patient data of five clinical trials studying atomoxetine in children and adolescents with ADHD.

Methods

Study design and procedures

Individual patient-level data from five clinical trials (four European and one Canadian, all of which were studies of atomoxetine using the CHIP-CE) with similar inclusion and exclusion criteria and similar duration (8–12 weeks’ follow-up) were included in the combined analysis. These five trials comprise all clinical trials in the Lilly database that studied atomoxetine and used the CHIP-CE. More details about the trials are reported elsewhere (Escobar et al. 2010). The total number of patients included in the combined analysis was 794. Three of these studies were randomized, double-blind trials comparing atomoxetine with placebo: Study 1 (n = 99) (Svanborg et al. 2009), Study 2 (n = 149) (Escobar et al. 2007; Montoya et al. 2007), and Study 3 (n = 139) (Curatolo et al. 2007). The fourth study was a randomized, open-label study of atomoxetine versus standard of care (Study 4, n = 201) (Prasad et al. 2007), and the last one was an open-label atomoxetine study (Study 5, n = 206) (Dickson et al. 2007), where all patients received atomoxetine.

All patients met the DSM-IV diagnostic criteria for ADHD and had a symptom severity of at least 1.5 standard deviations (SD) above norm values for the ADHD-RS (ADHD subscale of the SNAP in Study 3). The diagnosis was confirmed using the Kiddie-Schedule for Affective Disorders and Schizophrenia for School-Aged Children-Present and Lifetime Version (K-SADS-PL) in all studies except Study 5. In Studies 2 and 3, baseline CGI-S scores for ADHD were at least 4. The double-blind treatment period was between 8 and 12 weeks in the placebo-controlled studies (8 weeks for Study 3, 10 weeks for Study 1, and 12 weeks for Study 2). Studies 2 and 4 included only medication-naïve patients. Study 3, which was carried out in Italy, did not explicitly require medication-naïve patients, but at the time of recruitment, there were no ADHD drugs available in that country.

The primary scale on which this combined analysis was based is the Child Health and Illness Profile-Child Edition-Parent Form (CHIP-CE-Parent Form) (Riley et al. 2001), a 76-item generic health-related quality of life (HR-QoL) questionnaire, covering a total of five domains (Satisfaction, Comfort, Risk avoidance, Resilience, and Achievement) and twelve sub-domains (satisfaction with health (SH), satisfaction with self (SS), physical comfort (PC), emotional comfort (EC), restricted activity (RA), individual risk avoidance (IRA), threats to achievement (TA), family involvement (FI), physical activity (PA), social problem solving (SPS), academic performance (AP), and peer relations (PR)) that were developed in non-ADHD samples. The CHIP-CE scores are standardized to t-scores, i.e., to a mean (±SD) of 50 (±10), based on the norm values, which were derived from a sample of 1,049 school children from the United States, with higher scores indicating better health. Riley et al. (2004a) found that its domains (Satisfaction, Comfort, Risk Avoidance, Resilience, and Achievement) measure structurally distinct, interrelated aspects of health. Furthermore, they summarized that the domain reliability was high with an internal consistency between 0.79 and 0.88 and a retest reliability between 0.71 and 0.85 as measured by the intra-class correlation ICC.
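The t-score standardization described above can be illustrated with a short Python sketch. It is not taken from the CHIP-CE scoring documentation: the function names are hypothetical, and the norm mean and SD in the usage lines are placeholder values, not the published norms from the US reference sample.

```python
def to_t_score(raw_score, norm_mean, norm_sd):
    """Rescale a raw score so that the norm sample has mean 50 and SD 10."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw_score - norm_mean) / norm_sd


def sd_units_from_norm(t_score):
    """Express a t-score as its distance from the norm mean, in norm SD units."""
    return (t_score - 50.0) / 10.0


# Hypothetical norm values (NOT the published CHIP-CE norms).
example_t = to_t_score(raw_score=3.0, norm_mean=4.0, norm_sd=0.5)
print(example_t)                      # 30.0
print(sd_units_from_norm(example_t))  # -2.0
```

On this scale, a t-score of 30 lies two norm SDs below the norm mean, and higher values indicate better health.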

Efficacy on core ADHD symptoms was assessed using the Attention Deficit/Hyperactivity Disorder Rating Scale-IV, Parent Version (ADHD-RS), which evaluates all 18 symptoms of ADHD according to the DSM-IV diagnostic criteria (Guy 1976; DuPaul et al. 1998b). Improvement is indicated by a decrease in the score. The ADHD-RS comprises a total score, a hyperactive/impulsive sub-score, and an inattentive sub-score.

Statistical analysis

The demographic data were analyzed using descriptive statistics. The number of missing items per evaluation was computed and also analyzed descriptively as a continuous variable. The proportion of evaluations without missing items was presented for the CHIP-CE as a whole and for the domains and sub-domains. All visits and all five studies were pooled for this analysis. Inclusion of patients receiving active treatment and placebo in the analysis over time will increase the range of the changes and will thus lead to a wider basis for the evaluation. The item-total correlations (Spearman’s and Pearson’s correlation coefficients) were calculated for the total scores as well as for the domains and sub-domains. Furthermore, the sub-domains were correlated with the domains and the total score, and the domains were correlated with the total score. The items/sub-domains/domains were sorted by their Spearman’s correlation coefficient with the respective summary score. Only the Spearman’s correlation coefficient is reported here because it is similar to the Pearson’s correlation coefficient for these data. Cronbach’s alpha was computed for the items that were grouped into a sub-score and for all subsets of items that can be created by deleting one item within a sub-domain. The relative frequencies of floor effects (lowest possible value observed) and ceiling effects (highest possible value observed) for the sub-domains, domains, and total scores are provided. Correlations between domains of the CHIP-CE at baseline and at endpoint are shown. The same was done for the sub-domains. A factor analysis based on the sub-domains was performed additionally in order to explore the relationships between the sub-domains. Factor analyses using the varimax rotation on the 76 items with solutions allowing 5 or 12 factors were performed because the CHIP-CE has 5 domains and 12 sub-domains, as the goal was to replicate the factor structure seen in the normative sample. 
Only loadings >0.30 are presented. All analyses were done using the SAS statistical program.
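The analyses reported here were run in SAS. Purely as an illustration of the internal-consistency step, the following Python sketch computes Cronbach’s alpha for a group of items and the alpha obtained after deleting each item in turn. The function names and the toy data are hypothetical, complete data are assumed (missing items would need to be handled first), and the sketch uses the raw, variance-based form of alpha, which can differ from a standardized alpha.

```python
from statistics import variance


def cronbach_alpha(items):
    """Raw Cronbach's alpha.

    items: list of item columns; each column holds the scores of the same
    respondents, in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items need the same number of respondents")
    # Sum score per respondent across all items.
    totals = [sum(col[i] for col in items) for i in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("sum score has zero variance")
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


def alpha_if_item_deleted(items):
    """Alpha for every subset created by deleting exactly one item."""
    if len(items) < 3:
        raise ValueError("need at least three items to delete one")
    return [cronbach_alpha(items[:j] + items[j + 1:]) for j in range(len(items))]


# Toy data: three items answered by five respondents (hypothetical values).
toy_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(toy_items)
lowest_alpha = min(alpha_if_item_deleted(toy_items))
```

Comparing `alpha` with `lowest_alpha` mirrors the robustness check described above: a large drop after deleting one item suggests that the score depends heavily on that item.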

Results

Patient population and disposition

A total of 794 patients were included in the analysis. The age range was 6–15 years, and the mean age was 9.7 years (SD 2.30 years). Most of the patients were children (<12 years; n = 611, 77.0%), and most were male (n = 658, 82.9%). Mean ADHD-RS total score at baseline was 41.8 (SD 8.04), the inattentive sub-score was 22.2 (SD 3.83), and the hyperactive/impulsive sub-score was 19.6 (SD 6.03). At baseline, mean CGI-S ADHD was 4.8 points (SD 0.89). The mean baseline CHIP-CE total t-score was 28.9 (SD 11.76), compared with the norm of 50 (SD 10); for details, see Table 1. A more detailed discussion of the impact of ADHD on QoL as measured by the CHIP-CE can be found elsewhere (Escobar et al. 2005, 2010).

Table 1 Descriptive analysis (mean and SD) of CHIP-CE total score, domains, and sub-domains at baseline based on all five studies

Internal psychometric properties of the CHIP-CE

Missing values

The proportion of CHIP-CE evaluations with at least one missing value was 19.4%. On average, 0.7 (SD 2.23) items were missing for the whole scale. The proportion of CHIP-CE evaluations with at least one missing value in one of the domains ranged between 4.1% (Resilience domain) and 9.5% (Comfort domain). The sub-domain with the lowest proportion of missing values was the PA sub-domain (0.7%), whereas the sub-domain TA had the highest number of missing values (6.2%). On average, 0.2 (or less) items (SDs 0.19–0.96) were missing for the various domains and sub-domains.

Item-total correlations

To give a clearer impression of item to total score correlation, not all 76 correlations between the individual items and the total score are shown here. Instead, the quartiles of the 76 Spearman’s correlation coefficients are reported. At baseline, the highest correlation with the total score was r = 0.581; 25% of the items had a higher correlation than r = 0.455. The median correlation was r = 0.374; 75% of the items had a higher correlation than r = 0.245. The lowest correlation was r = 0.055. Item 45 (“How often did your child play hard enough to start sweating and breathing hard?”) had the lowest correlation (r = 0.055; 95% CI −0.016 to 0.127) and was the only item where zero was included in the 95% CI (i.e., where the correlation was not significantly higher than 0). A similar pattern of correlations was found at the end of the double-blind phase for the placebo-controlled studies. Overall, smaller correlations were observed when correlating the changes from baseline. The highest correlation was r = 0.502, the 25% quartile was r = 0.337, the median was r = 0.274, the 75% quartile was r = 0.211, and the lowest correlation was r = 0.063.
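The item-total correlations summarized above can be sketched in Python as follows. This is an illustration under stated assumptions, not the SAS code used in the study: the function names are hypothetical, and the article does not state how the 95% CIs were derived, so the Fisher z interval shown here is only one common large-sample approximation.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the (tie-averaged) ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI via the Fisher z transform (an assumption here)."""
    if n <= 3 or not -1 < r < 1:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)
```

A correlation is then judged not significantly different from zero when the interval returned by `fisher_ci` contains 0, which is the criterion applied to item 45 above.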

Item-domain correlations

Within the various CHIP-CE domains, the highest and the lowest Spearman’s correlations between the individual items and the respective domain are reported below. At baseline, correlations in the Satisfaction domain ranged from r = 0.512 to r = 0.743, in the Comfort domain from r = 0.305 to r = 0.602, in the Resilience domain from r = 0.265 to r = 0.643, and in the Achievement domain from r = 0.468 to r = 0.624. For the Risk avoidance domain, correlations ranged from r = 0.268 (item 76 “How often did he/she have trouble paying attention in school?”) to a maximum of r = 0.747. However, the second lowest correlation within the Risk avoidance domain was r = 0.501. Such a large gap between the lowest and the second lowest item-domain correlation was not seen for the other domains, where the single item-domain correlations were more evenly distributed between the minimum and maximum values.

Correlations were similar at the end of the double-blind phase for the placebo-controlled studies. However, the correlation for item 76 (“How often did he/she have trouble paying attention in school?”) was not as distinct from other item to domain correlations as for the baseline assessment in the Risk avoidance domain.

Overall, lower correlations were seen for changes from baseline. Here, correlations were between r = 0.386 and r = 0.664 for the Satisfaction domain, between r = 0.184 and r = 0.526 for the Comfort domain, between r = 0.215 and r = 0.527 for the Risk avoidance domain, between r = 0.139 and r = 0.524 for the Resilience domain, and between r = 0.329 and r = 0.694 for the Achievement domain.

Item-sub-domain correlations

Within the CHIP-CE sub-domains, the highest and lowest Spearman’s correlations between the individual items and the respective sub-domain were also analyzed. At baseline (endpoint values are provided in brackets), the highest correlation for the SH sub-domain was r = 0.682 (0.759) and the smallest was r = 0.590 (0.601). For the SS sub-domain, the correlations were between r = 0.703 (0.709) and r = 0.876 (0.868), for the PC sub-domain between r = 0.437 (0.314) and r = 0.620 (0.666), for the EC sub-domain between r = 0.527 (0.528) and r = 0.684 (0.758), for the RA sub-domain between r = 0.556 (0.608) and r = 0.863 (0.869), for the IRA sub-domain between r = 0.670 (0.626) and r = 0.889 (0.853), for the FI sub-domain between r = 0.419 (0.432) and r = 0.656 (0.690), for the SPS sub-domain between r = 0.721 (0.655) and r = 0.825 (0.807), for the AP sub-domain between r = 0.641 (0.615) and r = 0.784 (0.818), and for the PR sub-domain between r = 0.618 (0.573) and r = 0.832 (0.858). For the TA sub-domain, the minimal and maximal correlations were r = 0.286 (item 76) (0.361) and r = 0.712 (0.678), respectively. However, the item with the second lowest correlation within this sub-domain had a correlation of r = 0.563 (0.490), showing that item 76 had a particularly low correlation within this sub-domain. The items for the PA sub-domain were separated into two groups based on the correlations. Items 44–46 had correlations between r = 0.778 (0.730) and r = 0.830 (0.832), whereas items 31–33 had correlations between r = 0.323 (0.345) and r = 0.408 (0.377). A similar pattern, but with generally smaller correlations, was observed for the changes from baseline.

Table 2 shows the Spearman’s correlation coefficients between the sub-domains and the domains and between the domains and the total score.

Table 2 Spearman’s correlation coefficients with 95% CIs between the sub-domains and the domains and between the domains and the total score at baseline, at endpoint after the placebo-controlled period, and for the change from baseline to that endpoint

Internal consistency (Cronbach’s alpha)

Internal consistency of the CHIP-CE was assessed using Cronbach’s alpha. The results are shown in Table 3. The internal consistency was generally good for the sub-domains at baseline and at endpoint; only the EC and FI sub-domains fell short of an alpha of 0.7, which can be used as a helpful cut-off (DeVellis 1991). However, no such cut-off was previously discussed for changes over time. The internal consistency for changes from baseline to endpoint was fair, except for AP, which had better internal consistency for changes over time. The internal consistency of all sub-domains at baseline and endpoint was robust against single missing items, as the alpha values did not decrease by any meaningful degree when one item was deleted. The TA and AP sub-domains were sensitive to certain items in terms of change: when one item was deleted, alpha based on the changes from baseline to endpoint fell below 0.4 for these sub-domains.

Table 3 Cronbach’s alpha (standardized) for the sub-domains and the lowest alpha that was reached by deleting an item in that sub-domain with 95% CIs

Floor and ceiling effects

Floor and ceiling effects were evaluated using the baseline visits and all subsequent visits to increase the basis of information, as these effects should not occur at any time. The floor and ceiling effects of the total score were less than 0.1% at baseline and across all visits. The same holds for the floor effects of all domains. The largest ceiling effect of the domains was seen for the Satisfaction domain when all visits were pooled (1.3%). Floor effects of the sub-domains were mostly below 1%. The AP sub-domain had the largest floor effect based on baseline values (3.5%). Ceiling effects varied across the different sub-domains and were generally lower if only the baseline visit was taken into account. At baseline, the ceiling effect was below 1% for the sub-domains SH, TA, AP, and PR. The ceiling effect increased to values between 1 and 2% if all visits were taken into account. The sub-domains SS (baseline), EC (baseline and for all visits), IRA (baseline), FI (baseline and for all visits), and SPS (baseline and for all visits) had values between 1 and 5%. Higher ceiling effects were discovered for the sub-domains SS (all visits: 6.9%), PC (baseline: 5.9%, all visits: 9.1%), RA (baseline: 54.6%, all visits: 58.7%), IRA (all visits: 8.2%), and PA (baseline: 7.3%, all visits: 8.9%).
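The floor and ceiling rates reported above are simple proportions. A minimal Python sketch, assuming that each score has a known lowest and highest possible value, is given below; the function name and the 1 to 5 score range in the usage lines are hypothetical and are not taken from the CHIP-CE scoring rules.

```python
def floor_ceiling_rates(scores, lowest_possible, highest_possible):
    """Return (floor_rate, ceiling_rate) as proportions of all evaluations.

    A floor effect is counted when the lowest possible value is observed,
    a ceiling effect when the highest possible value is observed.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("no scores supplied")
    floor_rate = sum(1 for s in scores if s == lowest_possible) / n
    ceiling_rate = sum(1 for s in scores if s == highest_possible) / n
    return floor_rate, ceiling_rate


# Hypothetical sub-domain scores on an assumed 1-5 range.
floor_rate, ceiling_rate = floor_ceiling_rates([1, 5, 5, 3], 1, 5)
print(floor_rate, ceiling_rate)  # 0.25 0.5
```

Pooling all visits, as was done here, simply means passing the scores from every visit to the same function.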

Factor analyses based on individual items

Factor analyses with solutions allowing 5 or 12 factors were performed because the CHIP-CE has 5 domains and 12 sub-domains (see Tables 4, 5 for the loadings). The factor analysis was based on baseline data only. The first factor of the 12-factor solution mainly consists of items from the sub-domains IRA and TA, which together form the Risk avoidance domain. High loadings onto the second factor came almost exclusively from the EC sub-domain. The third factor had high loadings not only from all four SS items, but also from two items from the SH sub-domain (item 1: “How often does your child have a lot of fun?” and item 4: “How often does your child feel happy?”). The 5 items of the SPS sub-domain composed the fourth factor. These items did not load onto other factors and no other item loaded to any relevant degree onto factor four. The three of the six PA items that were related to running and walking loaded high onto the fifth factor, together with smaller loadings from item 34 (“Feel too sick to play at home?”), item 10 (“My child is physically fit”), and item 11 (“My child is well coordinated”). All AP items loaded high onto the sixth factor, together with smaller loadings from two TA items (item 74: “How often did he/she get along with his/her teacher?” and item 76: “How often did he/she have trouble paying attention in school?”). The AP items loaded nearly exclusively onto this factor. Only the five PR items loaded onto factor seven, and only two of these items had smaller loadings onto the first factor. The PR items did not load to any relevant degree onto any other factor. The four items composing the RA sub-domain made up factor eight almost exclusively. Again, only one of these items had a smaller loading onto another factor. Factor nine contained all nine PC items, which loaded only onto this factor (except for item 5). All FI items made up factor ten. Loadings of these items onto other factors were minor. The group of PA items that relate to games and sports loaded high onto factor eleven. Factor twelve received loadings from four of the six items of the SH sub-domain, three of which did not load onto other factors. Also, an EC item (item 21: “How often did your child have trouble falling asleep?”) and a PC item (item 5: “How often is your child sick?”) loaded onto this factor.

Table 4 Factor analysis with 12 factors (varimax rotation) for the CHIP-CE (only loadings >0.30 are presented)
Table 5 Factor analysis with five factors (varimax rotation) for the CHIP-CE (only loadings >0.30 are presented)

The result of the factor analysis based on 5 factors is shown in Table 5. All Risk avoidance items except one (item 76) loaded onto the first factor, displayed in the first column. Additionally, two items from the Comfort domain, four items from the Achievement domain, and four items from the Resilience domain loaded onto this factor. These loadings were generally smaller than the loadings from the Risk avoidance items. All of the Comfort domain items related to RA loaded onto the second factor, displayed in the second column. Furthermore, seven of the nine Comfort domain items that belong to the PC sub-domain loaded onto the second factor. The other two PC items did not have loadings of more than 0.3 onto any factor. Only one of the other Comfort items (i.e., those related to EC) had a small loading onto this factor. The three of the six PA items from the Resilience domain that were related to running and walking also loaded high onto this factor. Furthermore, three SH items had medium loadings onto this factor. All the SS items loaded onto the third factor together with four SH items. This factor also received high loadings from the four Achievement domain items of which the PR sub-domain consists. Smaller loadings were also seen for Resilience items, which were mostly related to PA (i.e., games and sports). The fourth factor consisted mainly of items related to EC and received almost no loadings from the other two Comfort sub-domains. Smaller loadings also came from a few Satisfaction items. The fifth and last factor received loadings mainly from the FI sub-domain, which belongs to the Resilience domain, and the AP sub-domain, which belongs to the Achievement domain.

Correlations between domains of the CHIP-CE

Table 6 shows the correlations between the domains at baseline and at endpoint. Most correlations were higher at endpoint than at baseline. The pattern of correlations was similar in both analyses. The Risk avoidance domain had the lowest correlations compared with other domains, both at baseline and at endpoint. However, this was not the case for changes from baseline to endpoint. The highest correlation for change was seen between the Achievement and Risk avoidance domains (r = 0.462), followed by the domains Comfort versus Satisfaction (r = 0.360), Resilience versus Satisfaction (r = 0.323), Risk avoidance versus Comfort (r = 0.309), Achievement versus Satisfaction (r = 0.290), Achievement versus Resilience (r = 0.270), Resilience versus Risk avoidance (r = 0.261), Achievement versus Comfort (r = 0.221), Resilience versus Comfort (r = 0.212), and Risk avoidance versus Satisfaction (r = 0.198).

Table 6 Spearman’s correlation coefficients between domains of the CHIP-CE at baseline (above diagonal) and at endpoint (below diagonal)

Correlations between sub-domains of the CHIP-CE

Table 7 shows the correlations between the sub-domains at baseline and at endpoint. Six sub-domains (SH, SS, EC, TA, SPS, and PR) correlate with three or more other sub-domains with r > 0.3, both at baseline and at endpoint. Three further sub-domains (PC, RA, and IRA) correlate with three or more other sub-domains with r > 0.3 at baseline. The highest correlations found were r = 0.603 at baseline and r = 0.559 at endpoint. Three sub-domains appear to be correlated with other sub-domains to a lower degree. For FI, all correlations were less than 0.3 at baseline; at endpoint, only its correlations with SS (r = 0.412) and with SPS (r = 0.319) were higher than 0.3. PA is correlated (r > 0.3) only with SH, both at baseline (r = 0.368) and at endpoint (r = 0.393). AP is not correlated (r > 0.3) with any other sub-domain at baseline and only with TA at endpoint (r = 0.356). For correlations between changes from baseline to endpoint, only four correlations were stronger than 0.3: SS versus SH (r = 0.441), AP versus TA (r = 0.380), TA versus IRA (r = 0.336), and PC versus SH (r = 0.307).

Table 7 Spearman’s correlation coefficients (>0.3) between sub-domains of the CHIP-CE at baseline (above diagonal) and at endpoint (below diagonal)

Factor analyses based on original sub-domains of CHIP-CE

A factor analysis based on the sub-domains is another approach to exploring relationships between sub-domains (see Table 8). This approach takes all correlations into account simultaneously. The pattern of correlations described above is confirmed with this method. The sub-domains IRA, TA, SPS, and PR load strongly onto the first factor. The second factor consists mainly of the three Comfort sub-domains. Each of the other three factors (3, 4, and 5) received high loading from one of the individual sub-domains mentioned above. The second highest loading for the third factor after PA is SH. The second highest loading for the fourth factor after FI is SS. TA and AP load onto factor 5.

Table 8 Factor analysis loadings (>0.3) based on sub-domains of the CHIP-CE at baseline (varimax rotation)

Correlations between CHIP-CE and ADHD-RS

At baseline, correlations between the total score, the domains, and the sub-domains of the CHIP-CE versus ADHD-RS total score were low (<0.4) (e.g., CHIP-CE total score: r = −0.345) except for the Risk avoidance domain (r = −0.517) and its sub-domains (individual risk avoidance r = −0.481, threats to achievement r = −0.463). More detailed information about these correlations between CHIP-CE and ADHD-RS as well as the treatment effect of atomoxetine in terms of these scales can be found elsewhere (Escobar et al. 2010). A more detailed profile over time of the CHIP-CE was evaluated in the SUNBEAM study by Prasad et al. (2007).

Discussion

The objective of this combined analysis was to evaluate the psychometric properties of the CHIP-CE in a sample of children and adolescents with ADHD from clinical studies. The analyses were based on the data from five clinical trials of atomoxetine. The descriptive CHIP-CE baseline data of these studies confirmed the impairment in terms of QoL in this clinical trial population with moderate core symptoms severity. The psychometric evaluation of the CHIP-CE showed a low number of missing items, confirming that the questionnaire comprising 76 items is relatively easy to apply (Riley et al. 2004a, 2006b). The correlations between the items and the total score were stable over time as the item-total correlations showed a similar pattern at baseline and after the double-blind phase for the placebo-controlled studies. Smaller correlations were observed between changes from baseline values. The similarity of the correlations at baseline and at endpoint indicates that the total score was sensitive to the same items at both points in time, a result that could not be shown by the cross-sectional analysis by Riley et al. (2004a, 2006b). The same holds true for the various domains. Interestingly, the item-total correlations varied widely for the Risk avoidance domain. Such a gap was not seen for any of the other domains. The item with the weakest correlation to the domain score “trouble paying attention at school” is closely related to the core symptoms of ADHD. Therefore, the low correlation with the Risk avoidance domain suggests that in the ADHD population, this item belongs to a different dimension than other items in this domain. Correlation patterns were similar at the end of the double-blind phase for the placebo-controlled studies. However, the weak correlation for item “trouble paying attention at school” was not as distinct as for the baseline assessment in the Risk avoidance domain. Weaker correlations were seen for the changes from baseline analyses.

The assessment of the item-sub-domain correlations yielded a similar pattern for the TA sub-domain, which is part of the Risk avoidance domain, for baseline and endpoint. The items for the PA sub-domain could be separated into two groups based on the correlations with three items that had a much higher correlation with the sub-domain than the other items. Items 44 (“How often did your child play active games or sports?”), 45 (“How often did your child play hard enough to start sweating and breathing hard?”), and 46 (“How often did your child run hard when he/she played or did sports?”) had much higher correlations compared with the items 31 (“How often did your child have trouble walking one block?”), 32 (“How often did your child have trouble walking up one flight of stairs?”), and 33 (“How often did your child have trouble running?”). A similar pattern, but with overall weaker correlations, was observed for the changes from baseline.

Correlations between sub-domains and domains and between domains and the total score were similar at baseline and endpoint. The correlations for change from baseline were usually slightly smaller. The RA and the PA sub-domains had lower correlations with their domains than most of the other sub-domains at baseline, at endpoint, and also for the change from baseline. The same was found to be true for the Comfort domain regarding the correlation of the domain with the total score. The Achievement domain, the Satisfaction domain, and the Risk avoidance domain seem to be especially important components of the CHIP-CE scale in children and adolescents with ADHD, based on their strong correlation with the total score. The low correlation of the other two domains, Resilience and Comfort, might be caused by the fact that these contain sub-domains that are not affected by ADHD at baseline (PC, RA, and PA). This was observed not only in the present population of patients with ADHD, but also in a cross-sectional sample from the United States on which Riley et al. (2007) based their analysis.

The internal consistency as measured by Cronbach’s alpha for all sub-domains was good at baseline and at endpoint, which confirms the findings from an observational study with ADHD patients (Riley et al. 2006b) as well as the results based on a community sample (Riley et al. 2004a). The internal consistency for changes from baseline to endpoint as measured by Cronbach’s alpha was moderate, except for AP where it was low. Therefore, the CHIP-CE is generally useful to track changes in QoL over time. The internal consistency of domains and sub-domains was robust against single missing items, except for changes in the TA sub-domain and the AP sub-domain. Results from those sub-domains should only be used if all items are available. Considerable ceiling effects were only observed for the RA sub-domain, which is not surprising in a sample selected based on a psychiatric and not a physical condition. A similar profile of floor and ceiling effects was seen in an observational study in ADHD patients (Riley et al. 2006b). The RA sub-domain also had the largest ceiling effect (6.3%) in a community sample (Riley et al. 2004a). The factor analysis allowing for 12 factors showed that the sub-domains generally load onto different factors; in particular, the sub-domains that are impaired in ADHD patients can be distinguished. However, this is not the case for the 5-factor solution based on the number of CHIP-CE domains, where the items from sub-domains that do not belong to the same domain often load together on one factor. It is therefore advisable to use the sub-domains rather than the domains of the CHIP-CE when evaluating ADHD patients. This is supported by the factor analysis based on the sub-domains and the correlation analysis of the sub-domains, which showed that those sub-domains that belong to the same domain do not necessarily have a high correlation. Riley et al. (2006a) also found a 12-factor solution in a cross-sectional naturalistic ADHD sample. This is an important difference from the results for the CHIP-CE domains previously reported in a community sample (Riley et al. 2004a, b; Rajmil et al. 2004). The correlation between the domains over time was stable in our analysis. The same holds true for the sub-domains. A cluster of between-sub-domain correlations was observed for nine sub-domains, which showed correlations of >0.3 with three or more sub-domains at baseline and/or at endpoint. In contrast, the three sub-domains FI, PA, and AP appeared to be less correlated with the others.

Possible limitations of this evaluation are the different designs of the studies on which this combined analysis was based, including different patient populations with respect to pre-treatment and comorbidities. Therefore, these results may not be directly transferable to epidemiological samples. Furthermore, it is difficult to assess how the proxy evaluation by the parents may have influenced the relationship between QoL and the core symptoms. The influence of the QoL of the parents or the parents’ diseases (such as ADHD) could not be assessed because these data were not obtained.

Conclusions

The strength of this analysis is the large sample of patient data from outside of the United States. This large sample size together with the longitudinal assessment of the questionnaire makes this analysis unique. Previous evaluations of the CHIP-CE used only cross-sectional samples and thus could not assess its performance in measuring changes over time. Our findings suggest that the application of the CHIP-CE provides useful and psychometrically robust insights into the QoL in terms of internal consistency and structure, especially when evaluating the sub-domains. Based on this combined analysis, the CHIP-CE can also be recommended to track changes in QoL over time.