Introduction

Anxiety disorders are among the most common mental health problems experienced by children and adolescents [1, 2]. Accordingly, the past decade has brought forth a multitude of studies examining the etiology, phenomenology, and prognoses of these anxiety disorders [3]. Despite this empirical attention, however, research on the validity of our current classification system for childhood anxiety disorders continues to yield conflicting findings [4–11]. On the one hand, arguing for the accuracy of our current system, a large body of literature has shown that children with anxiety disorders can be reliably distinguished from those without anxiety disorders [12–14] and that specific anxiety disorders can be disentangled from each other into discrete yet correlated entities [10, 11]. On the other hand, arguing against the accuracy of our current system, existing childhood anxiety research and clinical work are plagued by diagnostic difficulties and inter-rater disagreement [15, 16], drawing into question the system delineated in the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) [17] and the utility of retaining this system for future modifications of the manual. Approaches to understanding and addressing this conflict have been varied, yet a potentially pivotal question has remained unaddressed: Do the diagnostic difficulties and issues of inter-rater disagreement act in such a way that they refute the current classification system? Or are these diagnostic difficulties a complicating, yet distinct, issue in an otherwise accurate system?

Supporting evidence for the current classification system of childhood anxiety disorders may fall into two of the components of construct validity [18–20]: (a) discriminant validity (i.e., the extent to which anxiety disorders in children can be distinguished from each other and from disorders unrelated to anxiety) and (b) convergent validity (i.e., the extent to which different means of assessing the same anxiety construct converge with one another). If the current diagnostic system, which distinguishes among major anxiety syndromes of childhood, accurately categorizes anxiety disorders in children, there should be evidence of good discriminant and convergent validity for measures of these syndromes. For example, measures of separation anxiety should aggregate together but be psychometrically distinct from measures of social anxiety. Of course, discriminant validity does not require the complete independence of separate constructs, so long as each disorder has a significant degree of unique variance. Furthermore, the measurement of each anxiety disorder should reflect the trait in question more than the general reporting pattern of each informant [19, 20]. The present study uses a clinical sample of children and adolescents to examine the construct validity of the DSM-IV child anxiety classification system, with a focus on social phobia (SoP), separation anxiety disorder (SAD), generalized anxiety disorder (GAD), and panic disorder (PD).

Previous efforts to demonstrate the construct validity of specific anxiety disorders in children have found support in the area of discriminant validity. Though promising, these studies have often been limited by their focus on a restricted range of anxiety disorders (e.g., only SoP versus non-disordered children) or employment of a narrow range of methodology (e.g., only child-report). Some studies have demonstrated that children with any anxiety disorder (not a specific diagnosis) differed from their normative peers on various measures of anxiety [12, 14]. Studies that have aimed to distinguish among specific anxiety disorders in childhood have produced mixed results. Large questionnaire studies have used factor analytic techniques to identify classes of child psychopathology [21], yet the limited number of items assessing specific anxiety symptoms has often precluded detecting particular anxiety syndromes that may cohere together. Spence, using a more detailed pool of anxiety self-report items, found support for PD, SAD, SoP, GAD, obsessive–compulsive disorder, and physical fears (analogous to specific phobia) as distinct entities that are related by a higher-order factor [10]. Spence’s findings are encouraging, yet their generalizability to the construct validity of child anxiety disorders is limited by the use of only child-report data and a community sample. Ferdinand and colleagues [11] examined referred and general population samples of children and adolescents and reported that SAD and SoP were distinct constructs only in the referred sample. The present paper expands upon Ferdinand’s work by including multiple informants and employing a multitrait–multimethod matrix, described below, to assess the convergent and discriminant validity of these taxa.

Evidence for the convergent validity of these four major child anxiety disorders is scant. Agreement between diagnosticians, parents, and children has been found to be moderate in clinical samples, but varies by the specific anxiety disorder in question [9]. Most studies show moderate convergence between children’s self-report measures and diagnoses on the clinical/non-clinical dichotomy but show poorer agreement at the level of specific anxiety disorders [4]. Convergence between parent-report questionnaires and diagnostician-assigned diagnoses has also been fairly weak in some studies. For example, Boyle and colleagues [22] found modest “case” agreement when mother-report questionnaire scores were dichotomized to clinical/non-clinical and compared to diagnoses derived from a structured interview (κ’s = .31 to .37). Moderate to poor concordance has also been shown in several studies comparing parent and child self-report measures of anxiety [23, 24]. A recent study by Comer and Kendall reported stronger parent–child agreement at the symptom level than at the diagnosis level, particularly for observable symptoms, yet even this relatively stronger agreement was weak overall (κ < .35) [8]. In short, convergent validity of the child anxiety disorder syndromes is not yet well established.
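The agreement statistics cited above (κ) are chance-corrected agreement coefficients. As a minimal sketch of how Cohen's kappa is obtained for two raters, the following Python function may be helpful; the function name and the example ratings are hypothetical and are not data from any of the cited studies.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters' categorical labels on the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: proportion of cases given identical labels.
    p_obs = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over labels.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_chance) / (1.0 - p_chance)


# Hypothetical illustration: 1 = clinical, 0 = non-clinical.
parent_questionnaire = [1, 1, 0, 0, 1, 0, 0, 0]
interview_diagnosis = [1, 0, 0, 0, 1, 1, 0, 0]
kappa = cohens_kappa(parent_questionnaire, interview_diagnosis)
```

Because kappa subtracts the agreement expected from the marginal base rates alone, two informants can agree on most cases and still obtain a modest coefficient when one category dominates.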

Campbell and Fiske’s multitrait–multimethod (MTMM) matrix design [18] permits the simultaneous evaluation of discriminant and convergent validity and could therefore be used to address unresolved questions about the construct validity of the current conceptualization of child anxiety disorders. The MTMM design has proven to be influential in the past half century of psychological research [19]. However, the original analytic strategy proposed by Campbell and Fiske for conducting an MTMM analysis has been shown to have multiple limitations, including ambiguity about what constitutes satisfactory results and how to extract underlying trait and method factors from the correlation matrix [25]. Of the several proposed corrections to the original simple correlational approach to MTMM analyses, covariance structure modeling has gained the most prominence [19], with the general confirmatory factor analysis model (CFA; a type of structural equation modeling) being the method of choice [25, 26]. The present study uses CFA to evaluate and compare a series of nested models, following the guidelines of Widaman [27] and Byrne [19, 28], to test an MTMM model of child anxiety. This analytic approach offers clear criteria for hypothesis testing and allows us to probe the construct validity of child anxiety disorders in a manner distinct from previous, less conclusive studies.

Method

Participants

Participants were drawn from a consecutive series of children, ages 6 to 17 years, undergoing diagnostic evaluation at a university hospital-based clinic specializing in the diagnosis and treatment of childhood anxiety and related disorders. The final sample consisted of 174 children (94 boys and 80 girls; mean age = 11.61 years, SD = 2.64). The racial/ethnic composition of the sample was: White (77%), Asian American (5%), Hispanic (4%), African American (2%), and “other” (12%). Hollingshead’s socioeconomic status index ratings indicated a primarily middle-class sample (1 = low, 9 = high; M = 7.37, SD = 1.53) [29]. The most frequent diagnostician-assigned child anxiety diagnosis under study in the present sample was GAD (28.7%), followed by SoP (17.8%), SAD (14.4%), and PD (4.6%). Whereas the majority of children in this sample (53.4%) did not meet criteria for GAD, SoP, SAD, or PD, 46.6% of children had at least one of these diagnoses. Diagnostic comorbidity was ample, with 13.2% of children meeting criteria for two of these diagnoses, and 2.9% of children meeting criteria for three diagnoses. Rates of other diagnoses in the sample were as follows: obsessive–compulsive disorder, 54.6%; attention deficit hyperactivity disorder, 19.5%; dysthymia or major depressive disorder, 15.5%; Tourette’s disorder or other tic disorders, 15.5%; oppositional defiant disorder or conduct disorder, 8.6%; specific phobia, 6.9%; selective mutism, 1.1%; and posttraumatic stress disorder, 0.6%. Additional details about the sample are provided in a study of the psychometric properties of the Anxiety Disorders Interview Schedule for DSM-IV: Child and Parent Versions (ADIS-C/P) [9].

Procedure

At clinic intake, the ADIS-C/P [30] was administered to each child and his or her parent(s) by a doctoral student in clinical psychology or a doctoral-level psychologist, trained by the director or associate director of the clinic. Training involved attending a presentation of the administration of the interview, observing and coding a videotaped interview, co-rating multiple live interviews conducted by a trained diagnostician, and, finally, assuming satisfactory completion of the earlier steps, conducting at least one interview using the ADIS-C/P while under the supervision of a trained diagnostician. A single diagnostician administered the ADIS-C/P first to the parents and then to the child. While the parents were being interviewed, the child completed the self-report measures under the supervision of a trained research assistant. Following this, the diagnostician interviewed the child while the parent(s) completed questionnaires. A licensed clinical child psychologist supervised each intake evaluation. Prior to the start of the clinical evaluation, parents provided informed consent and youngsters provided assent for the use of their intake data for research purposes.

Measures

ADIS for DSM-IV: C/P

The ADIS-C/P [30] is a semistructured interview protocol with favorable psychometric properties and strong inter-rater reliability [12, 31]. Lyneham and colleagues reported that inter-rater reliability for individual anxiety disorders based on parent and child interviews was excellent (κ = .82–.96) according to the guidelines set forth by Mannuzza and colleagues [31, 32]. Silverman and colleagues reported strong test–retest reliability statistics for the ADIS-C/P for combined diagnoses (κ = .80–.92) and individual diagnoses (κ = .62–.88), with intraclass correlation coefficients ranging from .81 to .96 for the test–retest reliability of ADIS symptom scales for individual reporters [12]. Diagnosticians made ratings on the ADIS-C/P Clinical Rating Scale (CRS; 0 = not at all, 4 = some, 8 = very, very much) for each assigned diagnosis. Ratings of 4 or above are considered to be of a clinical level. In order to maintain the independence of reporters, diagnosticians were blind to all parent and child responses on self-report measures until their diagnostic impressions and CRS ratings were obtained. Sixty-six percent of cases were also rated by a diagnostic review team (blind to the diagnostician’s ratings) to estimate reliability. Intraclass correlation coefficients calculated between the diagnostician and diagnostic review team for the continuous CRS data were at or above .73 for each specific disorder.

Multidimensional Anxiety Scale for Children (MASC)

The MASC is a standardized 39-item self-report measure of anxiety yielding four factor scores [33]. Each item is rated on a 4-point Likert-type response scale ranging from 0 (never true about me) to 3 (often true about me). The four factor scales were empirically derived through principal components analysis and include Social Anxiety (9 items), Separation Anxiety (9 items), Harm Avoidance (9 items), and Physical Symptoms (12 items). Cronbach’s αs for these four scales in this sample were .82, .70, .64, and .79, respectively. These αs are comparable to those reported by March and colleagues, which ranged from .74 to .85 [33].

A parent report version of the MASC (MASC-P) was also administered [9]. MASC-P items are identical to the MASC items but with nouns and pronouns altered to match the parent’s perspective (i.e., “My child …” instead of “I …”). Baldwin and Dadds [34] report strong psychometric properties for the MASC-P, as well as data that show the MASC factor structure holds for the parent version. Cronbach’s αs for the MASC-P Social Anxiety, Separation Anxiety, Harm Avoidance, and Physical Symptoms scales in this sample were .85, .72, .68, and .81, respectively.
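The internal consistency coefficients reported for the MASC and MASC-P scales are Cronbach's αs. As an illustrative sketch only, the coefficient can be computed from item-level responses as shown below; the function name and the data layout (one list of respondent scores per item) are assumptions made for this example, not a description of the software used in the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores: one list per item, each holding the scores of the same
    respondents in the same order (hypothetical layout for this sketch).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha requires at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("every item needs the same >= 2 respondents")
    # Scale total for each respondent (sum across items).
    totals = [sum(responses) for responses in zip(*item_scores)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why a scale with heterogeneous content, such as Harm Avoidance in this sample, can yield a lower coefficient than the other scales.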

Data Analysis

The main model under study is displayed in Fig. 1. As noted previously, child and parent reports on each specific disorder were operationalized as MASC subscales. The child- and parent-MASC Separation Anxiety and Social Anxiety subscales and corresponding ADIS-C/P CRS scores match well to DSM-IV SAD and SoP, respectively. The child- and parent-MASC Harm Avoidance subscale and ADIS-C/P GAD CRS scores were used as indicators of GAD (the Harm Avoidance subscale focuses, in part, on perfectionism, a GAD feature that is also measured in the ADIS-C/P GAD section) [35]. Lastly, the child and parent Physical Symptoms subscale of the MASC, as well as PD ADIS-C/P CRS scores, were operationalized as indicators of PD (high scores on the Physical Symptoms subscale predicted PD diagnoses in a psychometric study of the ADIS-C/P) [9].

Fig. 1 Hypothesized MTMM general CFA model (Model 1: Correlated traits/correlated methods). Note: Soc = Social phobia; GAD = Generalized anxiety disorder; SAD = Separation anxiety disorder; PD = Panic disorder

The CFA-based approach for testing the convergent and discriminant validity of the four putative anxiety disorder syndromes with an MTMM analysis is an application of structural equation modeling (SEM). Hypothesized models of trait- and method-influence on anxiety scores are tested for overall model fit and compared to each other, providing a systematic, model-based method to analyze data in an MTMM variance–covariance matrix [19, 36]. Trait factors (i.e., specific disorders) and method factors (i.e., reporter) are modeled as latent variables; that is, they are not measured directly but are estimated using the observed scores (e.g., child-report social anxiety) included in the model. The models presented in the current paper were tested with EQS version 6.1 [37]. The main CFA model (model 1; see Fig. 1) to which others are compared is the least restrictive: Trait factors are allowed to be freely correlated with each other, as are method factors. Model 2 includes correlated methods, but does not include trait factors. Model 3 consists of perfectly correlated traits while letting methods correlate freely (in other words, allowing the correlation between different reporters to be estimated). Model 4, the final model, allows the trait factors to correlate freely while the method factors are constrained to be perfectly correlated with each other.
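To make the nesting of the four models concrete, the sketch below counts free parameters and degrees of freedom for a 4-trait by 3-method design. It rests on several assumptions that the text does not confirm: factor variances fixed at 1, trait and method factors uncorrelated with each other, one trait loading, one method loading, and one uniqueness per observed measure, and no additional identification constraints. The degrees of freedom reported for the fitted models may therefore differ from these textbook counts.

```python
def mtmm_degrees_of_freedom(n_traits=4, n_methods=3):
    """Textbook df for the four nested MTMM CFA models (see assumptions above)."""
    n_observed = n_traits * n_methods
    # Unique variances and covariances among the observed measures.
    n_moments = n_observed * (n_observed + 1) // 2
    trait_loadings = n_observed
    method_loadings = n_observed
    uniquenesses = n_observed
    trait_corrs = n_traits * (n_traits - 1) // 2
    method_corrs = n_methods * (n_methods - 1) // 2
    shared = method_loadings + uniquenesses
    free_parameters = {
        # Model 1: freely correlated traits, freely correlated methods.
        "model_1": shared + trait_loadings + trait_corrs + method_corrs,
        # Model 2: no trait factors, freely correlated methods.
        "model_2": shared + method_corrs,
        # Model 3: trait correlations fixed at 1, methods correlate freely.
        "model_3": shared + trait_loadings + method_corrs,
        # Model 4: traits correlate freely, method correlations fixed at 1.
        "model_4": shared + trait_loadings + trait_corrs,
    }
    return {name: n_moments - p for name, p in free_parameters.items()}


model_df = mtmm_degrees_of_freedom()
```

Each of Models 2 through 4 is obtained from Model 1 by fixing or removing parameters, which is what makes chi-square difference tests between them meaningful.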

A comparison of model fit statistics between models addresses the degree to which discriminant and convergent validity are supported at the matrix level by the present conceptualization of child anxiety disorders. Convergent validity is tested by comparing the first two models: model 1, with freely correlated methods and freely correlated traits, and model 2, with freely correlated methods and no traits. There is evidence of convergent validity (i.e., that independent reports of the same trait converge) if the model that includes trait factors fits better than the second model without traits. Discriminant validity is ultimately exemplified by the inter-correlation of independent measures of different traits being negligible as well as the inter-correlation of independent methods (irrespective of traits) being negligible [19, 36]. Discriminant validity in terms of trait effects is supported by a significant difference in model fit between the third model, in which traits are perfectly correlated, and the first model, in which traits correlate freely. Discriminant validity in terms of method effects is supported by a significant difference in model fit between model 4, in which the correlation among methods is unity, and model 1, in which methods correlate freely.
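The nested comparisons described above are conventionally evaluated with a chi-square difference test. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, the numeric values in the usage line are invented for illustration and are not the study's results, and the tail probability is computed with a standard recurrence for integer degrees of freedom rather than with the software used by the authors.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Start from a closed form, then apply
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


def chi2_difference_test(chi2_restricted, df_restricted, chi2_full, df_full):
    """Compare a restricted model with the less restricted model it is nested in."""
    delta_df = df_restricted - df_full
    if delta_df <= 0:
        raise ValueError("the restricted model must have more df")
    delta_chi2 = chi2_restricted - chi2_full
    p_value = chi2_sf(max(delta_chi2, 0.0), delta_df)
    return delta_chi2, delta_df, p_value


# Hypothetical values only: a restricted model versus a less restricted one.
delta_chi2, delta_df, p_value = chi2_difference_test(100.0, 39, 60.0, 33)
```

A small p-value indicates that the restriction (for example, forcing the traits to correlate perfectly) significantly worsens fit, which is the pattern taken as support for the corresponding form of validity.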

Results

A correlation matrix of all measures included in the study is displayed in Table 1, with descriptive statistics for each measure provided at the bottom of the table. The reporters’ ratings of SAD, SoP, and PD are related to each other in a pattern that supports their construct validity; reporter ratings are significantly correlated within each disorder across informants, evidencing convergent validity by traditional MTMM criteria [18]. In addition, ratings of SAD, SoP, and PD are typically not significantly correlated with one another across informants; significant cross-disorder correlations arise mainly within the child and parent informants, evidencing discriminant validity by traditional criteria alongside significant informant bias. The sole exception to this finding was a small but significant relationship between diagnosticians’ ratings of PD and parents’ ratings of SAD. The general reporting bias shown by the parents and children in the sample was not evident in the diagnostician ratings, where ADIS-CRS scores for specific disorders did not significantly correlate with each other, with the exception of SoP and GAD. Overall, the obtained pattern of correlations provided moderate evidence of convergent and discriminant validity for SAD, SoP, and PD, but less so for GAD.
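The traditional Campbell and Fiske criteria applied above amount to sorting each correlation by whether the two measures share a trait, a method, or neither. The sketch below illustrates that sorting; the function names and the correlation values are hypothetical and are not taken from Table 1.

```python
from statistics import mean


def classify_pair(measure_a, measure_b):
    """Classify a pair of (trait, method) measures in Campbell-Fiske terms."""
    trait_a, method_a = measure_a
    trait_b, method_b = measure_b
    if trait_a == trait_b and method_a != method_b:
        return "monotrait-heteromethod"    # convergent validity values
    if trait_a != trait_b and method_a == method_b:
        return "heterotrait-monomethod"    # inflated by shared informant
    if trait_a != trait_b and method_a != method_b:
        return "heterotrait-heteromethod"  # should be the smallest
    raise ValueError("a measure cannot be paired with itself")


def summarize_mtmm(correlations):
    """Mean correlation per class; keys are pairs of (trait, method) tuples."""
    groups = {}
    for (a, b), r in correlations.items():
        groups.setdefault(classify_pair(a, b), []).append(r)
    return {kind: mean(values) for kind, values in groups.items()}


# Hypothetical 2-trait by 2-method example.
toy_correlations = {
    (("SAD", "child"), ("SAD", "parent")): 0.40,
    (("SoP", "child"), ("SoP", "parent")): 0.50,
    (("SAD", "child"), ("SoP", "child")): 0.30,
    (("SAD", "parent"), ("SoP", "parent")): 0.20,
    (("SAD", "child"), ("SoP", "parent")): 0.10,
    (("SoP", "child"), ("SAD", "parent")): 0.05,
}
summary = summarize_mtmm(toy_correlations)
```

Under the traditional criteria, convergent validity is suggested when the monotrait-heteromethod values are substantial, and discriminant validity when they exceed both heterotrait classes; sizeable heterotrait-monomethod values point to informant (method) effects.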

Table 1 Correlation matrix of study measures

Goodness-of-fit indices for the models are displayed in Table 2. The chi-square test of model fit measures the degree to which the data depart from the specified model. The larger the chi-square relative to the degrees of freedom (the number of observed variances and covariances minus the number of freely estimated parameters), the poorer the model fit. The comparative fit index (CFI) is a measure of fit that accounts for the complexity of the model in its calculation. The CFI ranges from 0 to 1, with a CFI above .90 indicating an acceptable fit. Lastly, the normed fit index (NFI) is an additional measure of fit that ranges from 0 to 1, with values above .90 indicating acceptable fit.
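For readers less familiar with these indices, the sketch below shows the standard definitions of the NFI and CFI, both of which compare the target model with a null (independence) model in which the observed measures are uncorrelated. The function names and the example values are hypothetical; they are not the values reported in Table 2.

```python
def normed_fit_index(chi2_model, chi2_null):
    """NFI: proportional reduction in chi-square relative to the null model."""
    if chi2_null <= 0:
        raise ValueError("null-model chi-square must be positive")
    return (chi2_null - chi2_model) / chi2_null


def comparative_fit_index(chi2_model, df_model, chi2_null, df_null):
    """CFI: based on noncentrality (chi-square minus df), floored at zero."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, 0.0)
    denominator = max(d_model, d_null)
    if denominator == 0:
        return 1.0  # neither model shows misfit beyond its df
    return 1.0 - d_model / denominator


# Hypothetical values for illustration only.
nfi = normed_fit_index(60.0, 1000.0)
cfi = comparative_fit_index(60.0, 33, 1000.0, 66)
```

Subtracting the degrees of freedom is what lets the CFI take model complexity into account, whereas the NFI uses the raw chi-square values.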

Table 2 Summary of goodness-of-fit indexes for MTMM models

More important for the present purposes is the comparison of goodness-of-fit indices between models. Table 3 presents differences in chi-square values, degrees of freedom, the CFI, and the NFI. In terms of the test of the convergent validity of the proposed structure of anxiety syndromes as illustrated in Fig. 1, the statistically significant difference between the fit of models 1 and 2 is supportive of the convergent validity of this conceptualization of child anxiety. The existence of traits in model 1 significantly improves the fit. The statistically significant difference between the fit of models 1 and 3 provides support for the discriminant validity of the traits; allowing the four putative anxiety syndromes to correlate freely significantly improved model fit (the alternative was a correlation of 1.0 among all the traits), showing that the four anxiety traits are not perfectly correlated with one another and thus diverge meaningfully. Finally, discriminant validity of method (reporters) was also supported by the significant difference between the fit of models 1 and 4; when the model distinguished between methods, model fit improved significantly compared to the alternative (analogous to a lack of independence among methods), showing that each of the three methods employed (child report, parent report, and diagnostician rating) provides nonredundant information about the traits in question.

Table 3 Differential goodness-of-fit indexes for MTMM nested model comparisons

Discussion

The aim of this study was to evaluate the construct validity of four child anxiety disorders using MTMM. In line with previous research, the present study’s observed correlation matrix displayed modest interrater agreement [8, 34, 38] and significant intrarater correlations for parent and child informants [7, 10, 34, 39]. Nonetheless, most of the traditional MTMM criteria for convergent and discriminant validity were met for SAD, SoP, and PD based on the simple correlation matrix. When confirmatory factor analyses (CFAs) were used to test hypotheses about construct validity using a model fitting approach applied to the same matrix, results were supportive of both convergent and discriminant validity, even with the addition of GAD in the model (which had not shown strong evidence of validity by MTMM criteria in the simple correlation matrix). The advantages of distinguishing among these four child anxiety syndromes outweighed interrater disagreement and other sources of error, such that the standard model of child anxiety disorders (as distinguishable but related entities) was a significantly better fit to the data than any alternative model (e.g., models excluding the specific disorders or constraining the specific disorders to be perfectly correlated with one another).

If specific anxiety disorders are phenomenologically distinct from one another, the more accurate model (i.e., the model that fits the data better) should be one in which specific anxiety syndromes can be distinguished from one another and do not provide entirely overlapping information, even if they are partially interrelated. In the present study, the specific anxiety disorder syndromes were related to each other, yet the model assuming perfect concordance between anxiety syndromes fit the data significantly worse than the model that allowed the magnitude of each syndrome to vary independently from the others. Thus, the present study supports the current conceptualization of these anxiety disorders as related, yet distinct entities. In line with previous research [4, 22–24], these findings also suggest that each of the methods used to assess the child anxiety syndromes offers a related, yet unique perspective. This supports discriminant validity for the methods; each method (i.e., reporter) is significantly independent from the others.

If specific anxiety disorders are specified correctly in the current DSM-IV nosology, the model in which specific anxiety disorders exist and are adequately defined by converging reports of different informants will fit the data better than the model in which specific anxiety disorders are excluded and variation in the data is explained exclusively by individual differences among children and by method variance, not by meaningful patterns or subtypes of anxiety. In the present sample, the inclusion of specific anxiety disorders in the model significantly improved model fit and child-, parent-, and diagnostician-ratings converged on the specific anxiety disorders under evaluation. Although parents, children, and diagnosticians at times painted a different picture of the children’s symptom severity (illustrated by modest inter-rater convergence at the bivariate correlation level), they tended to be more in line with one another when rating a specific anxiety syndrome than when rating different syndromes. This pattern of findings offers support for the convergent validity of the four anxiety syndromes according to the CFA-based MTMM criteria employed in this study.

The present research extends previous findings by Spence [10], Chorpita and colleagues [40, 41], and Ferdinand and colleagues [11] by using a multi-method approach with a clinical sample, affording greater symptom variability than could typically be found in community settings and allowing greater understanding of the aspects of construct validity of these anxiety syndromes among children who were clinically referred. Similar to the present study, Chorpita’s work, which found that the variability observed in specific anxiety disorders could be substantially explained by a set of higher-order factors (negative affectivity, physiological hyperarousal, etc.), highlights the way in which specific anxiety disorders are related yet distinct entities that share different combinations of the same underlying dimensions. Although the scope of this paper is limited to an examination of the construct validity of specific child anxiety disorders, the findings do not imply that higher-order factors would not partially account for the relationships between disorders. A logical next step in this line of research will be to integrate the present MTMM methodology, offering support for the current categorization of child anxiety disorders, with models investigating the dimensions that underlie these categories.

Limitations

Several methodological limitations of this study warrant discussion and attention in future research. First, the children and parents participating in this study presented to a clinic specializing in childhood anxiety treatments. Although not all children received a diagnosis, overall anxiety levels of children in the current sample are greater than those in the general population. Such a sample of children grants the variability needed to explore interrelationships among the constructs of interest, yet it is possible that estimates of these interrelationships may change in a population with a significant proportion of children without anxiety disorders [39]. This is an important future direction for study.

Second, the size of the current sample, though large enough to conduct confirmatory factor analyses, was not large enough to examine the relations between the child’s developmental status, gender, and the construct validity of the anxiety syndromes. Testing for model differences between groups requires multiple sample analysis, which in the case of gender, for example, would reduce the subsamples to n = 94 and n = 80, providing inadequate power for the number of free parameters in some of the models. Some research indicates that the structure of anxiety is stable in multiple age groups [42], but most researchers agree that a child’s developmental status is a pivotal component in the assessment of child psychopathology (e.g., the declining incidence of SAD as children transition to adolescence) [40, 43, 44]. Gender has been shown to relate strongly to the prevalence rates of anxiety syndromes, with females demonstrating almost twice the risk of males in some studies [45], warranting additional research on the construct validity of anxiety syndromes within each gender.

The present analytic procedure goes beyond standard approaches to studying childhood anxiety disorders by evaluating the construct validity of these disorders at a model level, assessing discriminant and convergent validity with state-of-the-art structural equation procedures that provide clear-cut methods for hypothesis testing. In response to the question posed at the beginning of the present paper regarding whether modest inter-rater agreement and other diagnostic difficulties refute the current classification system of child anxiety syndromes, the present study suggests that the answer is no. The current nosology appears to reflect the data reasonably well using a comparative CFA approach.

Despite their differences, each informant provided a unique perspective that converged into four distinct anxiety disorders. This finding echoes that of Phillips and colleagues, who found that the perspectives offered by multiple informants contribute to the diagnostic picture beyond what can be captured by combining informants’ reports using the ‘or’ rule [46]. The modest interrater agreement suggests not only that diagnoses may be most accurate when information from multiple informants is considered (a common truism in clinical assessment [47]) but also that the nature of the anxiety construct may change depending on the informant(s) on which the assessment is based. This theoretical approach, the emergent variable model [48], holds that the construct under study, in this case child anxiety, is a composite of different measures rather than a latent variable that exists independently of the methods used to estimate it. Similar to standard diagnostic practice for assessing attention deficit/hyperactivity disorder [17], a complete assessment of anxiety may necessitate multiple informants. Future psychometric research efforts in this area may wish to consider whether empirically based approaches to combining data from multiple informants can be derived to yield more accurate diagnostic decisions in research and clinical practice.

Summary

This study found moderate support for the convergent and discriminant validity of four main anxiety disorders: SoP, SAD, GAD, and PD. Results were particularly strong for SoP, SAD, and PD. Strengths of this study include its use of CFA, a model-based approach, which showed that the standard model of child anxiety disorders (as distinguishable but related entities) was a significantly better fit to the data than any alternative model. These results suggest that specific anxiety disorders should be distinguished from each other, and that a complete assessment of anxiety may necessitate multiple informants.