The Value of Measuring Impact Alongside Symptoms in Children and Adolescents: A Longitudinal Assessment in a Community Sample

The impact that psychiatric symptoms have on the lives of young people is central to clinical practice and classification. However, there is relatively little research on impact and its association with symptoms. This paper examines how well impact can be measured and how it relates to psychiatric outcomes. On four separate occasions over 3 years, symptoms and impact were assessed in a UK epidemiological sample (n = 4,479; 51.5 % boys) using the Strengths and Difficulties Questionnaire (SDQ) as reported by parents, youths and teachers. Disorders were ascertained using the Development and Well-Being Assessment. An impact scale made of items about distress and impairment demonstrated considerable internal consistency, cross-informant correlations, and longitudinal stability by all reporting sources. Impact at baseline was a strong predictor of psychiatric disorder 3 years later after accounting for psychiatric disorders and symptoms measured at baseline: odds ratio (OR) = 2.10, 95 % confidence interval (CI) [1.50, 2.94] according to parent-rated impact and OR = 1.71, CI [1.08, 2.72] according to teacher-rated impact. Changes in impact over time were predicted, but not fully accounted for, by symptoms measured at baseline. Impact can be reliably and easily measured across time, and it may be clinically useful as an independent predictor of future symptoms and psychiatric disorders. More studies are needed to understand inter-individual variation in the impact caused by equivalent symptoms.


Introduction
Psychiatric symptoms result in distress and impairment. This impact due to psychiatric symptoms is central to clinical decision-making in both the DSM-IV and the upcoming DSM-5. However, there is relatively little research on how well impact can be measured and on how much information it adds (Rapee et al. 2012). In this paper we examine this issue in a large, multi-informant, longitudinal study of a general population sample.
Clinicians inquire about their patients' symptoms and also ask about the impact caused by these symptoms. DSM-IV defines the clinical significance of symptoms according to whether they lead to distress or functional impairment (American Psychiatric Association 2000; Ustun and Kennedy 2009). Distress is defined in terms of the worry and upset caused by the symptoms. Functional impairment is not strictly defined in DSM-IV (Ustun and Kennedy 2009), but is meant to capture a reduced level of adaptive functioning in social and educational (in adults, occupational) domains of life.
When distress or impairment criteria are required to make a diagnosis, the prevalence of psychiatric disorders is often reduced (Bird et al. 1992). For example, separation anxiety and simple phobias are about five times less common if criteria for impairment, and not only symptoms, are required for diagnosis (Simonoff et al. 1997). However, this may not be the case for other disorders, such as depression (Simonoff et al. 1997), where symptoms (Pickles et al. 2001) are the best predictors of later diagnosis.
The decision to require impact as a criterion for psychiatric disorders has attracted criticisms (Rutter 2011). By analogy to other branches of medicine, symptoms of a psychiatric disorder may merit medical attention even in the absence of current impact. Thus, symptoms of diabetes, hypertension or a transient ischemic attack require diagnosis and intervention even if they have not yet caused impact (Rutter 2011). This and related considerations support the ICD-10 approach that does not generally require impairment before a diagnosis can be made, though impairment can optionally be coded separately.
However, there are important counter-arguments in favor of the DSM's focus on impact. For example, information about impact may help identify children who are suffering (and may be in need of intervention) even if they do not meet the full symptom criteria for an operationalized psychiatric disorder.
More fundamentally, both DSM-IV's major focus on impact and ICD-10's optional coding of impact implicitly assume that impact is sufficiently separable from symptoms to be worth considering separately. Thus, distress and impairment resulting from symptoms ought to be adequately measurable in their own right and add predictive value to that of symptom assessment alone. This assumption, which has received little empirical scrutiny, is tested in this paper. We examine two sets of questions within a large 3-year longitudinal epidemiologic study using multiple informants (parents, youth and teachers reporting on both symptoms and impact).
The first set of questions relates to whether a short scale can measure impact with satisfactory psychometric properties. Do questions about impact form a scale with acceptable factorial structure and internal consistency? What is the stability of such questions across time? How well do informants agree with each other when they rate impact? Also, what is the relationship between measures of impact and relevant outcomes of psychosocial adjustment, such as contact with psychiatric services, self harm, truancy and contact with police (a measure of concurrent validity)?
The second set of questions concerns the relationship between impact and symptoms across time. Does impact add to the prediction of future psychiatric symptoms even after accounting for psychiatric symptoms at baseline? Also, does knowledge about symptoms at baseline contribute to the prediction of impact 3 years later (after adjusting for baseline impact)? Is it possible to distinguish impact from symptoms across time, or can the changes in impact be fully accounted for by changes in symptom levels?

Sample
The 2004 British Child and Adolescent Mental Health Survey (B-CAMHS04) involved a sample of 5-16 year olds (n=7,977) representative of the general British population; it has previously been described in detail (Green et al. 2005). The study used "child benefit" (a universal state benefit payable in Great Britain for each child in a family) to develop a sampling frame of 5-16 year olds in different postal sectors in England, Wales, and Scotland. After excluding families with no recorded postal code, it was estimated that this represented 90 % of all British children. Out of the 12,294 contacted, there were n=1,085 who opted out and n=713 who were not eligible or had moved without trace, leaving 10,496 who were approached in person. Of those, n=7,977 participated (65 % of those selected; 76 % of those approached). After 12 and again after 24 months (i.e., in 2005 and 2006), parents who had agreed to be followed up again were mailed a Strengths and Difficulties Questionnaire (SDQ) (Goodman 1997). At 36 months after the baseline survey, i.e., in 2007 (Parry-Langdon 2008), families were approached once more unless they had previously opted out or the child was known to have died. Of the original n=7,977 participants, n=5,326 (67 %) participated in the detailed follow-up (Parry-Langdon 2008). In this paper we present the data on those children whose families participated at all 4 study time points. This yielded a sample of n=4,479 (56 % of original participants; 52 % boys). Attrition analyses show that participants with data at all 4 time points did not differ by gender (52 % vs. 52 % were boys; χ²(1, n=7,977)=0.00; p=0.95), were more likely to be younger (M = 10.38, SD = 3.31 vs. M = 10.75, SD = 3.50 years; t(7,975)=4.84, p<0.001), less likely to suffer from a psychiatric disorder (7 % vs. 13 %; χ²(1, n=7,977)=66.03, p<0.001), and less likely to come from a family that owned rather than rented its home (19 % vs. 42 %; χ²(1, n=7,972)=492.57; p<0.001).
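The sample flow and the age comparison in the attrition analysis can be re-derived from the summary figures above. The following is a minimal sketch, not the authors' code: it assumes a pooled-variance two-sample t-test (which is consistent with the reported 7,975 degrees of freedom), and all function and variable names are hypothetical. Because it starts from rounded means and standard deviations, it reproduces the reported t statistic only approximately.

```python
import math

# Sample flow as reported in the text
contacted = 12294
opted_out = 1085
ineligible_or_moved = 713
approached = contacted - opted_out - ineligible_or_moved
participated = 7977
complete_cases = 4479                      # data at all 4 time points
incomplete_cases = participated - complete_cases


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Age of those without complete data versus those with complete data
# (assumption: the first mean/SD pair in the text refers to complete cases)
t_age, df_age = pooled_t(10.75, 3.50, incomplete_cases,
                         10.38, 3.31, complete_cases)

print(approached, pct(participated, contacted), pct(participated, approached))
print(pct(complete_cases, participated), round(t_age, 2), df_age)
```

The small gap between the recomputed and the published t value is what one would expect from working with two-decimal summaries rather than the raw data.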

Assessment
Strengths and Difficulties Questionnaire (SDQ) The SDQ has robust psychometric properties (Bourdon et al. 2005; Goodman 1997, 2001) and separately inquires about symptoms and impact.
Symptoms The SDQ asks 5 questions each about the following domains: hyperactivity/inattention; behavior problems; emotional symptoms; and peer problems. Summing the items generates scores ranging from 0 to 10 for each scale. A total difficulties score created by the addition of these scales ranges from 0 to 40.
Impact The SDQ impact score is generated for the parent and self report by the sum of 5 items: one item about distress; plus 4 items on social impairment in a) family life, b) friendships, c) learning and d) leisure activities. Teachers are only asked about distress and impairment in learning and friendships (Goodman and Scott 1999).
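The scoring described above can be sketched in a few lines. This is an illustrative sketch rather than an official SDQ scoring routine: it assumes that every symptom item is coded 0, 1 or 2 (which is what a 0 to 10 range for five items implies) and that the impact items have already been recoded to the same 0 to 2 range. The function names and the example responses are hypothetical.

```python
SYMPTOM_SCALES = ("hyperactivity", "behavior", "emotional", "peer")


def scale_score(items):
    """Sum five items, each assumed to be coded 0, 1 or 2 (range 0-10)."""
    if len(items) != 5 or any(i not in (0, 1, 2) for i in items):
        raise ValueError("expected five items coded 0, 1 or 2")
    return sum(items)


def total_difficulties(responses):
    """Add the four problem-scale scores (range 0-40)."""
    return sum(scale_score(responses[name]) for name in SYMPTOM_SCALES)


def impact_score(distress, impairments):
    """Sum one distress item and the impairment items.

    Assumes each item is already coded 0-2. Parents and youths rate four
    impairment domains (five items in all); teachers rate two (three items).
    """
    items = [distress, *impairments]
    if len(items) not in (3, 5) or any(i not in (0, 1, 2) for i in items):
        raise ValueError("expected 3 or 5 items coded 0, 1 or 2")
    return sum(items)


# Hypothetical responses for one child
example = {
    "hyperactivity": [2, 1, 1, 0, 2],
    "behavior": [0, 1, 0, 0, 1],
    "emotional": [1, 1, 2, 0, 0],
    "peer": [0, 0, 1, 0, 0],
}
print(total_difficulties(example))       # total difficulties score
print(impact_score(1, [0, 1, 2, 0]))     # parent or self report (5 items)
print(impact_score(2, [1, 1]))           # teacher report (3 items)
```

Under these assumptions the teacher impact score has a narrower range than the parent and self-report scores, which is one reason the analyses standardize impact before comparing informants.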
Parents (95 % were mothers, 4 % fathers, 1 % other sources) were asked to fill in the SDQ at baseline and at each of the study's follow-up points, namely 12, 24, and 36 months. The completion rates of the SDQ were very high in those parents who agreed to participate at each time point: n=4,474 of 4,479 (99.9 %) at baseline, 100 % at 12 and 24 months, and n=4,449 of 4,479 (99 %) at 36 months.
Children's teachers were asked to fill in the SDQ at baseline and at 36-month follow-up. In Great Britain, it would be unusual for the rating to be completed by the same teacher on both occasions. The completion rates of the teacher SDQ were moderate in those families who had participated at baseline and all follow-ups: n=3,507 of 4,479 (78 %) at baseline and n=2,644 of 4,479 (59 %) at 36 months. Children aged 11 and above were asked to fill in the SDQ at baseline and at 36 months. At baseline there were n=2,207 children who were eligible to provide SDQ data and whose families had participated at baseline and all follow-ups, of whom n=1,995 (90 %) completed the SDQ. At 36 months there were n=3,419 children who were eligible to provide SDQ data and whose families had participated at baseline and follow-up, of whom n=2,926 (86 %) completed the SDQ.
Development and Well-Being Assessment (DAWBA) The DAWBA (Ford et al. 2003; Goodman et al. 2000; Messer et al. 2006) is a structured interview administered by lay interviewers who also record verbatim accounts of problems. The questions are closely related to DSM-IV and ICD-10 (APA 2000; World Health Organization 1994) diagnostic criteria and focus on current problems. The κ statistic for chance-corrected agreement between two raters was 0.86 for any disorder (SE 0.04), 0.57 for internalizing disorders (SE 0.11), and 0.98 for externalizing disorders (SE 0.02) (Ford et al. 2003). Values of κ<0 indicate no agreement, 0-0.20 slight agreement, 0.21-0.40 fair agreement, 0.41-0.60 moderate agreement, 0.61-0.80 substantial agreement, and 0.81-1 almost perfect agreement (Landis and Koch 1977). Children were assigned a diagnosis only if their symptoms were causing significant distress or social impairment. The DAWBA interview was administered to all parents and to all children aged 11 or over; a shortened version of the DAWBA was mailed to the child's teacher. Further information on the DAWBA is available from http://www.dawba.info. The DAWBA was completed at baseline and at 36 months.
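For readers less familiar with the κ statistic, the sketch below shows how chance-corrected agreement between two raters is computed and how a value maps onto the Landis and Koch (1977) bands quoted above. It is an illustration only: the function names are hypothetical, and the toy ratings are not DAWBA data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal frequencies
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for kappa using the Landis and Koch (1977) bands."""
    if kappa < 0:
        return "no agreement"
    for upper, label in ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")):
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Toy example: two raters, four cases (1 = disorder, 0 = no disorder)
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))

# Labels for the agreement values reported for the DAWBA
for value in (0.86, 0.57, 0.98):
    print(value, landis_koch(value))
```

Applied to the values reported by Ford et al. (2003), the bands place agreement for any disorder and for externalizing disorders in the highest category and agreement for internalizing disorders in the moderate category.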
This paper focuses on the overall presence of disorder (i.e., any DSM-IV disorder), externalizing disorders (the combination of conduct, oppositional defiant and attention deficit/hyperactivity disorders), and internalizing disorders (the combination of depressive and anxiety disorders).
Other Outcomes To assess the concurrent validity of the impact ratings, the following information available from the baseline (2004) part of the survey was used: contact with psychiatric services (n=113, 3 %), self harm (n=151, 3 %), truancy (n=98, 2 %) and contact with police (n=122, 3 %). A participant was coded as having experienced one of these outcomes if so rated by youth, teacher or parent report. For self harm, information was only available from the youth and parents. To assess predictive validity, we also examined whether impact ratings in 2004 were associated with new onset (i.e., reported for the first time) of the following in the 36-month follow-up of the 2007 survey: contact with psychiatric services (n=127, 3 %), self harm (n=199, 4 %), truancy (n=47, 1 %) and contact with police (n=201, 5 %).

Analysis
Internal Consistency This was estimated using Cronbach's alpha for each informant at each time point.
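As a reminder of what this statistic measures, Cronbach's alpha compares the sum of the item variances with the variance of the total score. The sketch below is a minimal complete-case implementation using only the standard library; it is not the routine used in the study, the function name is hypothetical, and the two toy data sets are invented to show the extreme cases.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: one list of item scores per respondent, all of the same length.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2 or any(len(r) != k for r in rows):
        raise ValueError("need >=2 respondents and >=2 items per respondent")
    # Sample variance of each item (column) and of the summed score
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Items that rise and fall together give the maximum value
consistent = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
# Items that are unrelated to each other give a value of zero
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]
print(cronbach_alpha(consistent), cronbach_alpha(unrelated))
```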
Factorial Structure The single-factor structure of the impact score was tested in a confirmatory factor analysis using the 5 SDQ impact items for the parent and self report (but not the 3-item teacher report). Fit was assessed by the following indices: the Comparative Fit Index (CFI; 0.95 and above indicates good fit), the Tucker Lewis Index (TLI; values close to 1 indicate good fit), the Root Mean Square Error of Approximation (RMSEA; values smaller than 0.06 indicate good fit), and the Standardized Root Mean Square Residual (SRMR; values smaller than 0.07 indicate good fit) (Yu 2002).
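Three of these indices can be written as simple functions of the model and baseline chi-square statistics. The sketch below uses the standard textbook definitions and should be read with several caveats: it ignores the scaling corrections that robust maximum likelihood applies, programs differ on whether RMSEA divides by n or n - 1, the 0.95 threshold for the TLI is the cut-off referred to in the Results rather than in the definition above, and the chi-square values in the example are hypothetical (only the five degrees of freedom and the sample size come from this paper).

```python
import math


def rmsea(chi2, df, n):
    """RMSEA from the model chi-square; some programs use n - 1 instead of n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative Fit Index relative to the baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker Lewis Index relative to the baseline model."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


def meets_cutoffs(cfi_value, tli_value, rmsea_value, srmr_value):
    """Compare each index with the cut-offs used in this paper."""
    return {
        "CFI": cfi_value >= 0.95,
        "TLI": tli_value >= 0.95,     # cut-off referred to in the Results
        "RMSEA": rmsea_value < 0.06,
        "SRMR": srmr_value < 0.07,
    }


# Hypothetical chi-square values for a one-factor model of five items
# (model df = 5; independence model df = 10 for five indicators)
fit = (cfi(25.0, 5, 5000.0, 10), tli(25.0, 5, 5000.0, 10),
       rmsea(25.0, 5, 4479))
print(fit)
print(meets_cutoffs(fit[0], fit[1], fit[2], srmr_value=0.02))
```

SRMR is not included as a formula because it depends on the full residual correlation matrix rather than on the chi-square statistic; in the example it is simply passed in as a hypothetical value.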
Concurrent Validity This is presented in a figure as the standardized mean of impact by one of the following binary categories from the baseline 2004 survey: contact with psychiatric services, self harm, truancy and contact with police. A table presents the results from logistic regression models in which each of these categories (from the baseline 2004 survey) was the outcome and baseline impact the independent variable. The models are presented unadjusted or adjusted for baseline total SDQ score.
Predictive Validity A table presents the results from logistic regression models in which the independent variable was impact at baseline, and the dependent variables were new onset of each of the following binary categories in the 36-month follow-up of the 2007 survey: contact with psychiatric services, self harm, truancy and contact with police. The models are presented unadjusted or adjusted for baseline total SDQ score.
Longitudinal Stability This was tested in regression models within informants (e.g., parent report at baseline predicting parent report at 36 months) and across informants (e.g., parent report at baseline predicting teacher report at 36 months).
Impact Ratings Across Informants Concurrent association of ratings between informants was assessed using correlation models, whereas the longitudinal associations were assessed using regression models.

Association Between Symptoms and Other Outcomes
Association Between Symptoms and Impact The prediction of symptoms by impact was estimated in regression models with the standardized total SDQ symptom score as the outcome (e.g., at 36 months) and standardized baseline impact score as the predictor. Adjusted models covaried for standardized baseline total SDQ symptom score. This was reversed for the prediction of impact score from the total SDQ symptom score.
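The logic of the adjusted models can be illustrated with ordinary least squares algebra. With a standardized outcome and a single standardized predictor, the regression coefficient equals the Pearson correlation; once a second, correlated predictor is added, each coefficient shrinks according to how much the two predictors overlap. The sketch below is a simplified illustration, not the robust maximum likelihood estimation used in the study, and the function name and the example correlations are hypothetical.

```python
def adjusted_betas(r_y1, r_y2, r_12):
    """Standardized coefficients for y regressed on two standardized predictors.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    if denom <= 0:
        raise ValueError("predictors must not be perfectly correlated")
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Hypothetical correlations: outcome = symptoms at 36 months,
# predictor 1 = baseline impact, predictor 2 = baseline symptoms
beta_impact, beta_symptoms = adjusted_betas(r_y1=0.34, r_y2=0.60, r_12=0.55)
print(round(beta_impact, 3), round(beta_symptoms, 3))
```

With these made-up inputs, a predictor that correlates moderately with the outcome on its own retains only a small coefficient after adjustment, because most of its predictive information is shared with the covariate.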
Association Between Psychiatric Disorders and Impact The prediction of psychiatric diagnoses at 36-month follow-up (dependent variable) by baseline impact was estimated in logistic regression models where the standardized impact score was used as an independent variable. In adjusted models, diagnoses at baseline were used as covariates.
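Odds ratios from such models are obtained by exponentiating the logistic regression coefficient, and the confidence limits by exponentiating the coefficient plus or minus a multiple of its standard error. The sketch below shows the conversion in both directions. It assumes a Wald-type 95 % interval that is symmetric on the log-odds scale; the function names are hypothetical, and the inputs are the odds ratios and intervals quoted in the abstract.

```python
import math

Z_95 = 1.959963984540054  # 97.5th percentile of the standard normal


def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and Wald confidence interval from a logit coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_ci(lower, upper, z=Z_95):
    """Recover the log-odds coefficient and its SE from a reported Wald CI."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Parent-rated impact predicting disorder at 36 months (CI from the abstract)
beta_parent, se_parent = coef_from_ci(1.50, 2.94)
or_parent, lo_parent, hi_parent = odds_ratio_ci(beta_parent, se_parent)

# Teacher-rated impact (CI from the abstract)
beta_teacher, se_teacher = coef_from_ci(1.08, 2.72)
or_teacher = math.exp(beta_teacher)

print(round(or_parent, 2), round(lo_parent, 2), round(hi_parent, 2))
print(round(or_teacher, 2))
```

Because the predictor is standardized, each odds ratio refers to a difference of one standard deviation in the impact score. The point estimates recovered from the intervals agree with the published odds ratios to two decimal places, which is what a symmetric interval on the log-odds scale would produce.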
Trajectories of Symptoms and Impact Across Time A path analysis model was estimated to test whether total SDQ symptom score and impact followed distinguishable trajectories across time. The model is schematically represented in Fig. 2.
Analyses were conducted in Mplus, Version 5 (Muthen and Muthen 2007). The STDY option was used to standardize scores and all analyses were estimated with robust maximum likelihood, which uses sandwich estimators for the standard errors according to Huber and White (Huber 1967; White 1980).

Ethical Approval
All study procedures received multicentre research ethics committee approval; informed consent was obtained from parents and assent from child participants.

Results
Internal Consistency The internal consistencies for the impact scales were high: by parent report, 0.88 at baseline, 0.89, 0.82, and 0.88 at 12, 24, and 36 months respectively. For youth-report, they were 0.82 at baseline and 0.88 at outcome; for teacher report 0.86 at baseline and 0.87 at outcome.
Factorial Structure As shown in Table 1, a single-factor structure was acceptable at most time points for the parent- and youth-reported impact scales. In particular, the CFI and the SRMR showed that the one-factor solution was a good fit to the data. The RMSEA indicated acceptable fit at all time points, with the exception of 12 months, while the TLI was below the 0.95 cut-off at 12 and 36 months by parent report. All factor loadings were excellent (Table 1).

Inter-Rater Correlation and Longitudinal Stability
There was a moderately strong correlation at baseline between impact reported by all three informant sources (Table 2). The 12-, 24-, and 36-month stability of the impact scores within informants was considerable (Table 2). Parent- and teacher-reported impact at baseline were highly significant in predicting impact 3 years later, within as well as across informants.

Concurrent Validity
As shown in Fig. 1, impact, as rated by any informant, was higher in those who had experienced contact with psychiatric services, self harm, truancy and contact with police compared with those who had not experienced such outcomes. Table 3 presents the odds ratios of the association between each of these outcomes and impact. As can be seen, impact rated by any of the three informants was a significant predictor of psychiatric service use, even after adjustment for baseline total SDQ score. However, neither teacher- nor self-report was a significant predictor of self harm after adjustment for baseline symptoms. Similarly, neither parent- nor teacher-report was significantly associated with police contact once adjusted for baseline symptoms.

Predictive Validity
Table 4 presents the odds ratios of the association between each of the validating outcomes in year 2007 and baseline impact. As can be seen, parent- and teacher-rated impact was a significant predictor of first-time psychiatric service use and new-onset self harm in 2007, even after adjustment for baseline total SDQ score. However, neither parent-rated nor teacher-rated impact was predictive of new-onset truancy or first-time police contact in 2007 after adjustment for baseline SDQ scores. Self-reported impact was not a significant predictor of any of these 2007 outcomes once baseline SDQ score had been adjusted for.
Impact Predicting Future Symptoms Impact measured at baseline was significantly predictive of the SDQ total symptom score at 36-month follow-up in unadjusted models (Table 5). This was true both within and across informants. In models adjusted for SDQ total symptom score at baseline, parent- or teacher-rated impact measured at baseline remained a significant predictor of SDQ total symptom score measured by parent, teacher, or youth report at 36-month follow-up. However, youth-reported impact at baseline was not predictive of youth-reported symptoms at 36-month follow-up. The effect sizes of the associations between impact at baseline and symptoms at outcome varied from moderate in unadjusted models to small in adjusted models (Table 5).

Impact Predicting Future Psychiatric Disorders
We assessed how impact predicted psychiatric disorders at 36-month outcome. Parent-rated impact was a significant predictor of psychiatric disorders, although the strength of its association diminished progressively with adjustment for baseline disorder (Table 6). The results were similar for teacher-reported impact (Table 6), while youth-reported impact was a significant predictor of each disorder domain except disruptive behavior disorders, for which the prediction was non-significant after adjustment.
Symptoms Predicting Future Impact SDQ total symptom score at baseline was highly predictive of impact score at 36-month follow-up in unadjusted models (Table 7). This was true both within and across informants. In models adjusted for impact measured at baseline, total SDQ symptom score measured by any informant at baseline remained a significant predictor of impact score measured at 36-month follow-up. However, the effect sizes of the associations were small, as indicated by the standardized coefficients (Table 7).

Predictions Between Impact and Symptoms Across Time
Focusing just on data from parent reports (since these were available at baseline, 12, 24, and 36 months), we studied the association between impact and symptoms across time in a path analytic model (Fig. 2). The within-domain stability is stronger than the across-domain stability: impact is always a better predictor of impact (the coefficients are significant even for the 36-month predictions), whereas the SDQ total symptom score is always a better predictor of SDQ symptom score. This makes it unlikely that one is merely secondary to the other. Figure 2 also demonstrates that impact is directly predictive of future symptoms only in the short term: for example, impact at 12 months is predictive of SDQ symptom score at 24 months (adjusting for the stability of total SDQ symptom score and the prediction from 24-month impact), but it is not predictive of SDQ symptom score at 36 months with comparable adjustments. Similarly, the SDQ symptom score predicts impact in the short term (12 months). However, Fig. 2 also shows that impact is predictive of SDQ total symptom score in the longer run, but indirectly. For example, 12-month impact score predicts 36-month SDQ symptom score indirectly via the 12- and 24-month SDQ total symptom scores (which strongly predict the 36-month SDQ symptom score), as well as via its significant associations with impact scores at 12 and 24 months (which predict SDQ symptom score at 36 months). The same pattern of relationships applies in the prediction from SDQ total symptom score to impact.

Discussion
This paper examined the value of measuring impact alongside psychiatric symptoms. Impact was defined as the presence of distress or impairment in different settings and analyses were conducted in a large epidemiologic sample using a longitudinal and multi-informant (parent, youth, and teacher) design. Overall, we found that even a brief measure of impact had adequate psychometric properties and that it added predictive value.
Table 1 Factor loadings and fit indices from the confirmatory factor analysis (CFA) of impact items by parent and self report. Each column represents a separate CFA (each CFA had five degrees of freedom)
The first set of questions related to whether impact could be measured appropriately using a short scale. The impact scale had excellent internal consistency by parent, youth, and teacher report. In addition, there was support for a single-factor structure for the parent- and youth-reported scales, although one of the indicators (TLI) was just below the generally acceptable threshold on two measurement occasions by parent report. We also found that the impact scales showed substantial stability across time, both within and across informants. The stability is particularly notable in teacher report, where children were rated by different teachers at baseline and follow-up. Also, the agreement (ranging from 0.34 to 0.47) between informants compares favorably with cross-informant correlations for symptoms in previous meta-analyses (Achenbach et al. 1987). Finally, impact showed a strong concurrent association with relevant outcomes of psychosocial adjustment, namely contact with psychiatric services, self harm, truancy and contact with police, although some of these associations were diminished to non-significance after adjustment for psychiatric symptoms.
In addition, parent- and teacher-rated impact showed good predictive validity for new onset of contact with psychiatric services and new-onset self harm; however, self-rated impact was not predictive of any of these outcomes and parent- and teacher-rated impact was not predictive of either truancy or police contact after adjustment for baseline symptoms. It is notable that impact predicted close to an 80 % increase in the probability of contact with psychiatric services concurrently and 20 % for new onset of service contact in 2007, even after psychiatric symptoms were taken into account. This highlights the importance of impact (particularly as reported by the parents of individuals) in decisions about use of psychiatric services. This raises the possibility that changes in impact may be a good indicator of the perceived effectiveness of interventions offered by psychiatric services. If so, this would have the major advantage of universal applicability since the same measure of impact is potentially relevant to all disorders (or combination of disorders), whereas a separate set of symptoms is potentially needed for every single disorder or combination of disorders. On the other hand, our findings also suggest that some negative outcomes are better predicted by psychiatric symptoms than by impact, at least by some reporting sources. Given this uncertainty about the relative benefits of monitoring impact, symptoms or both, future studies could profitably compare the utility of impact- and symptom-based measures when used in session-by-session monitoring or longer-term follow-up.
The second set of questions concerned the relationship between symptoms and impact across time. The number of significant findings (at the pre-set level of p<0.05) was far greater than would be expected by chance, as shown in the paper's tables. In dimensional analyses, impact at baseline was a significant predictor of future symptoms both within and across informants.
The predictions were still significant even after accounting for baseline symptoms, with the exception of youth-reported impact adjusted for youth-reported symptoms. The unadjusted effect sizes of these associations were moderate (e.g., 0.34 for parent-reported impact to symptoms 3 years later), but were in many of the models substantially attenuated after adjusting for baseline symptoms (e.g., 0.04 for parent-reported impact to symptoms 3 years later). We also show that standardized impact scores at baseline added considerably to the prediction of psychiatric disorders, even years later. We show that every increase in parent-rated impact score increased the probability of future psychiatric disorder over 3 years by between 67 % (parent report) and 22 % (self report). As with the previous analyses, youth-reported impact was the weakest predictor of future outcomes, similar to other findings about the relative weakness of youth-reported psychopathology.
Table note: OR = odds ratio; CI = 95 % confidence interval. Odds ratios with confidence intervals are presented from logistic regression models with each of the outcomes as dependent variables and impact as an independent variable, either unadjusted or adjusted for baseline total SDQ score. All findings in bold are significant (p<0.05)
Fig. 1 The association between mean impact and adverse outcomes by reporting source. Standardized (Cohen's d) mean scores for impact are presented on the y axis
Our results agree in part with those of Pickles et al. (2001). Their results suggest that baseline impact adds to the prediction of future CD/ODD diagnosis and future impact, independently of baseline symptoms. However, these authors found that when predicting future depression symptoms and impairment, baseline symptoms are a better predictor than baseline impact. We found that impact independently predicts emotional disorders (which include depressive disorder) to about the same degree as it predicts externalizing disorders. Pickles et al. (2001) measured domain-specific impairment, rather than global impact as we did, a difference that conceivably accounts for the discrepant findings.
The results of this paper also suggest that impact was sensitive to changes in symptoms across time. Thus, an increase in psychiatric symptoms led to more subsequent impact (adjusted for impact levels present at baseline), even 3 years later. Using a path analytic model, we showed that impact was best predicted by impact and symptoms were best predicted by symptoms.
The Strengths and Difficulties Questionnaire (SDQ) has been shown to be a dimensional measure of psychopathology, useful for measuring mental health problems in young people around the world (Achenbach 2012). However, little previous research has been done to validate its impact supplement. A previous study has shown that including the score of the SDQ impact supplement as part of an added value score is helpful in assessing the effectiveness of clinical interventions. To our knowledge this is the first study that uses longitudinal epidemiological data to examine whether the impact supplement adds value to the prediction of psychiatric disorder.
Our findings suggest two reasons why impact should be measured in addition to symptoms. The first reason relates to measurement utility: impact added to the prediction of new-onset future symptoms, a finding consistent with previous results. It could be argued that this prediction is merely because symptoms at baseline have been measured imperfectly. However, any measurement contains noise. This may be especially true of clinical assessment. Finding that the addition of a concise measure of impact improved the prediction of future symptoms is therefore important.
The second reason for measuring impact in addition to symptoms concerns decisions about service provision. Impact may guide whether a child warrants referral to specialist services and treatment. Particularly when it comes to [...] (Simonoff et al. 1997). It has not yet been shown that children who display psychiatric symptoms but are not impaired would come to significant future harm unless they were treated. Therefore, many clinicians might continue to use a low measured impact to justify refraining from diagnosing or treating non-impaired children.
Table note: β = standardized regression coefficient; CI = 95 % confidence interval; R² = proportion of variance. All findings in bold are significant (p<0.05); otherwise non-significant (ns); standardized robust regression coefficients and confidence intervals from robust maximum likelihood models are presented in each cell with R² as the estimate of the variance predicted
Fig. 2 Path analysis of the relationship between impact and SDQ total symptom score across time (all parent reported). Significant (p<0.05) paths or correlations with standard errors in brackets are presented as solid straight lines or solid curved arrows respectively; dashed lines illustrate non-significant associations. B/L = baseline, R² = proportion of variance explained
Our study has a number of strengths including the use of a large epidemiologic study, a longitudinal design, assessment with multiple informants, and the use of dimensional as well as categorical measures of psychopathology. However, the results should also be interpreted in the light of important limitations. Firstly, as is often the case in longitudinal studies, there was considerable attrition. While this would be most likely to bias prevalence estimates rather than the associations between disorders and other factors (Wolke et al. 2009), it has meant that those with high levels of psychopathology at baseline were less likely to participate in the study. Secondly, our measure of impact was designed to be concise and simple to use. Unlike other instruments, such as the Eyberg Child Behavior Inventory (Eyberg 1992), the SDQ does not ask about the impact associated with each reported symptom. Knowing which of the reported symptoms are contributing most to perceived impairment can be clinically important. Further studies, possibly also using qualitative approaches, may be required to assess the full range of ways in which the impact of psychiatric symptoms is experienced. Thirdly, our measures of concurrent validity are gathered from the same informants (parents, teachers, and youth) as our measure of impact. Ideally, these outcomes should also be gathered from external informants (such as police records). Including outcomes that are reported by sources external to the predictors avoids the problems associated with shared method variance, that is, the inflation of estimates of association due to the use of the same informants (Campbell and Fiske 1959).
Finally, there is a certain circularity when predicting DSM psychiatric disorders as an outcome, since they contain impairment as a pre-requisite. However, we obtained a similar pattern of results when predicting SDQ scores.
In conclusion, our findings suggest that impact can be reliably measured with a brief scale and that doing so may benefit clinical assessment. Having reliable measures of symptoms and impact may contribute in the future to investigations of why individuals with similar sets of symptoms can experience different levels of impact.