Introduction

Asthma and chronic obstructive pulmonary disease (COPD) can substantially impact patient health status [1, 2]. Capturing patient-reported outcomes (PROs) is a key method for assessing patients in routine clinical practice and understanding the effects of treatments in clinical trials; regulatory authorities around the world have issued guidance on the collection of such patient-reported data [3,4,5].

Several PRO instruments assessing symptoms, impacts and health status have been specifically developed for asthma or COPD [6, 7] and use disease-specific wording. Only two respiratory health status instruments are currently available for use in both asthma and COPD: the St George’s Respiratory Questionnaire (SGRQ) [8] and the Airways Questionnaire 20 (AQ20) [2, 9]. Both of these have features that limit their use in the routine clinical setting. The SGRQ takes around 10–15 min to complete [7], making it impractical for routine use. The AQ20, by contrast, only takes 2–3 min to complete, but it focuses primarily on impairment and not overall health status [2, 9].

Given the overlap in symptoms between asthma and COPD [10,11,12], there is the potential to create a standardised health status measure for use in both asthma and COPD that is practical for administration in routine clinical practice and includes items relevant to both conditions. Such a measure may also enable research into the impact of obstructive lung disease in populations where the specific diagnosis may be unclear.

The widely used COPD Assessment Test (CAT) [13,14,15], a health status measure for COPD, was modified to replace disease-specific terms with generic ‘pulmonary disease’ language. This modified version is called the Chronic Airways Assessment Test (CAAT).

The goal of this analysis was to examine the psychometric properties of the CAAT in patients with asthma and/or COPD using cross-sectional data from the NOVELTY study (a NOVEL observational longiTudinal studY; NCT02760329) [12, 16].

Materials and methods

Development of the CAAT

The CAT was modified, with permission, to replace the term ‘COPD’ with ‘chronic airways’ in the title and replace ‘COPD’ with ‘pulmonary disease’ in the introduction. All other features of the CAAT are identical to the CAT, including the items, response options and scoring algorithm [13]. The CAT is the copyright of GlaxoSmithKline; the CAAT will similarly be placed under copyright with the same permissions for personal use, for clinical practice and for clinical research as the CAT.

The CAAT takes about 2–3 min to complete, and comprises eight items relating to respiratory symptoms (items 1–3: relating to cough, chest phlegm and chest tightness) and functional impacts on wellbeing and daily life (items 4–8: relating to breathlessness, activity limitation at home, confidence leaving home, ability to sleep soundly and energy level) (Additional file 1: Fig S1).

As with the CAT [13], the CAAT total score (range: 0–40) is calculated as the sum of the eight individual items, with higher scores indicating a worse health status. To calculate the CAAT total score, patients must provide responses to at least six items; if one or two responses are missing, the scores for the missing items are set to the average of the individual’s non-missing item scores at the time of administration.
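The scoring rule described above can be illustrated with a minimal sketch. The function name and the use of None to mark a missing response are assumptions made for illustration only and are not part of the published scoring algorithm; the 0–5 item range follows the item scoring described for the CAAT.

```python
from typing import Optional, Sequence

N_ITEMS = 8          # the CAAT comprises eight items
MIN_ANSWERED = 6     # at least six responses are needed for a total score


def caat_total_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return the CAAT total score (0-40), or None if it cannot be calculated.

    `responses` holds one entry per item; None marks a missing response
    (a convention assumed here for illustration).
    """
    if len(responses) != N_ITEMS:
        raise ValueError("The CAAT has exactly eight items")

    answered = [r for r in responses if r is not None]
    if any(not 0 <= r <= 5 for r in answered):
        raise ValueError("Each item is scored from 0 to 5")

    # More than two missing responses: no total score is calculated.
    if len(answered) < MIN_ANSWERED:
        return None

    # Missing items are set to the mean of the patient's non-missing items.
    mean_item = sum(answered) / len(answered)
    n_missing = N_ITEMS - len(answered)
    return sum(answered) + n_missing * mean_item
```

For example, a patient who answers 3 on six items and skips two would receive a total of 24, whereas a patient with three missing responses would receive no total score.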

Psychometric validation sample

The goal of this study was to evaluate the cross-sectional psychometric properties of the CAAT from baseline data in patients with physician-assigned asthma and/or COPD in the NOVELTY study using item response theory (IRT) modelling and differential item functioning (DIF).

The total sample was selected from NOVELTY patients who completed the CAAT, and comprised three randomly selected sub-samples of patients with physician-assigned asthma, asthma + COPD or COPD (Additional file 1: Fig S2). Simple random sampling provided a balanced representation of patient demographics and severity categories reflective of the NOVELTY study. A second analytic sample of NOVELTY patients comprised those with asthma + COPD or COPD who completed both the CAAT and CAT (Additional file 1: Fig S2).

Sample size was based on observations needed to adequately power key sub-group analyses. A conservative approach was taken; diagnostic group sample sizes (N = 510) were double those previously reported to be required for accurate assessment [17].

For IRT-based DIF analysis, a sample size of 100–200 for 10 items is appropriate [18]. To obtain severity-balanced samples of this size, three 10% random samples were taken from the asthma and COPD groups and then combined, with duplicates removed (Additional file 1: Fig S2). A fourth COPD sample was taken to ensure adequate representation of patients assessed as having very severe COPD. No patients with asthma were assessed as having very severe disease. Patients with asthma + COPD were not eligible for DIF analysis due to the need for discretely characterised individuals (i.e. asthma only or COPD only).
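The repeated-sampling step can be sketched as follows. This is an illustrative outline only: the function name, the seed and the use of integer patient identifiers are assumptions, and the sketch does not attempt to reproduce the severity balancing or the additional COPD sample described above.

```python
import random
from typing import Sequence, Set


def combined_random_sample(
    patient_ids: Sequence[int],
    fraction: float = 0.10,
    n_draws: int = 3,
    seed: int = 0,
) -> Set[int]:
    """Draw several simple random samples and combine them without duplicates."""
    rng = random.Random(seed)                    # fixed seed for reproducibility
    draw_size = round(len(patient_ids) * fraction)

    selected: Set[int] = set()
    for _ in range(n_draws):
        # Each draw is made from the full group, so a patient can be picked
        # in more than one draw; the set keeps each patient only once.
        selected.update(rng.sample(list(patient_ids), draw_size))
    return selected
```

With a diagnostic group of 510 patients, three 10% draws yield between 51 and 153 unique patients, which is in line with the target of 100–200 observations.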

Psychometric validation objectives and analysis

The broad psychometric objectives of this work were to evaluate the items and scales of the CAAT for: (1) internal consistency and structural validity; (2) item response characteristics and conceptual framework, using IRT modelling and DIF analysis; (3) convergent and discriminant validity; and (4) agreement between the CAAT and CAT in the same patients.

Objective 1: The internal consistency of CAAT items is a necessary characteristic of overall construct validity (i.e. respiratory health status). Cronbach’s alpha was used to indicate the level of consistency between CAAT items (an alpha > 0.7 represents adequate consistency and > 0.9 suggests redundant items) [19]. Exploratory factor analysis (EFA) was then used to assess whether the CAAT items measured the same concept, or factor. Unidimensionality and structural validity were evaluated, with these findings then informing subsequent confirmatory factor analysis (using Mplus v8.2) and evaluation of the invariance of measurement and structural characteristics across diagnostic groupings.
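As an illustration of the internal consistency statistic, Cronbach's alpha can be computed from a respondent-by-item score matrix using the standard formula. The sketch below is not the software used in this analysis; the function names and the wording of the interpretation labels are assumptions, and only the 0.7 and 0.9 thresholds come from the text.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a matrix with one row per respondent
    and one column per item."""
    k = len(scores[0])                            # number of items
    item_columns = list(zip(*scores))             # transpose to item-wise columns
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


def interpret_alpha(alpha: float) -> str:
    """Apply the thresholds used in this analysis (labels are illustrative)."""
    if alpha > 0.9:
        return "adequate, possible item redundancy"
    if alpha > 0.7:
        return "adequate"
    return "below threshold"
```

Under these thresholds, an alpha of 0.86 would be labelled as adequate without suggesting redundant items.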

Objective 2: The boundary locations, discrimination and information functions of CAAT items were examined using IRT analysis. A two-parameter logistic graded response model [20] was fitted using Stata v16.1 to examine item response characteristics. DIF analysis was performed using ordinal logistic regression (DifDetect in Stata [21]) to determine whether CAAT items performed in the same way in patients with asthma and patients with COPD. Differences in response were explored by testing each item for uniform DIF (the presence of a mean difference between groups) and non-uniform DIF (differences between individuals changing across the response severity range).
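To show how a graded response model links an item's discrimination and boundary locations to the probability of each response, the following sketch evaluates the standard logistic form of the model for a single item. It is an illustration rather than the fitted model: the function name and the parameter values in the example are hypothetical and are not estimates from this analysis.

```python
import math
from typing import List, Sequence


def grm_category_probabilities(
    theta: float,
    discrimination: float,
    boundaries: Sequence[float],
) -> List[float]:
    """Response-category probabilities for one item under a graded response model.

    `boundaries` are the ordered boundary locations between adjacent response
    options; five boundaries give six categories (scores 0-5).
    """
    # P(response >= k) for k = 0 .. number of categories; by definition the
    # first value is 1 and the last is 0.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-discrimination * (theta - b))) for b in boundaries]
        + [0.0]
    )
    # P(response == k) is the difference between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(boundaries) + 1)]


# Hypothetical item: discrimination 1.5, boundaries spread evenly around zero.
probabilities = grm_category_probabilities(0.0, 1.5, [-2.0, -1.0, 0.0, 1.0, 2.0])
```

A higher discrimination value makes the cumulative curves steeper, so the item separates adjacent response options more sharply around each boundary.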

Differences in DIF magnitude between groups were evaluated in two ways: effect sizes calculated as Cohen’s d, which give a dimensionless measure of the size of differences, and mean boundary difference scores across all five response options to each item, which provide an estimate of differences expressed in CAAT units. Based on prior findings for the CAT [22], a mean difference of two units was assumed to be the minimum clinically important difference (MCID) for the CAAT total score. For the purposes of this analysis, the MCID was assumed to be distributed equally across items, resulting in a CAAT item-level MCID of 0.25.
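The two magnitude measures can be illustrated with a short sketch. The item-level MCID follows directly from the assumption stated above; the effect size uses the common pooled-standard-deviation form of Cohen's d, which is an assumption here because the exact variant is not specified in the text. All names are illustrative.

```python
import math
from statistics import mean, variance
from typing import Sequence

TOTAL_SCORE_MCID = 2.0   # assumed MCID for the CAAT total score, from the CAT [22]
N_ITEMS = 8

# MCID assumed to be distributed equally across the eight items.
ITEM_LEVEL_MCID = TOTAL_SCORE_MCID / N_ITEMS   # 0.25


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d using the pooled standard deviation of the two groups.

    A negative value means group_a has the lower mean.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)


def exceeds_item_mcid(mean_boundary_difference: float) -> bool:
    """True if an item-level difference is larger in magnitude than 0.25."""
    return abs(mean_boundary_difference) > ITEM_LEVEL_MCID
```

Because Cohen's d is dimensionless, it allows items to be compared on a common scale, whereas the boundary difference remains interpretable in CAAT units.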

Objective 3: To evaluate convergent and discriminant validity (ability of a tool to relate to similar measures, and not relate well to measures that reflect a different aspect of disease), Pearson’s correlations between the CAAT and other measures were examined. Correlation coefficients > 0.70 were regarded as strong; 0.4–0.7 moderate; and < 0.4 weak [23]. Analysis of covariance was used to examine the relationship between CAAT and SGRQ total scores.
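The correlation thresholds can be expressed as a simple classification rule. In the sketch below, the function names are illustrative, and the use of the absolute value of the coefficient (so that inverse relationships are classified by their magnitude) is an assumption that is not stated in the text.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's product-moment correlation coefficient."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_sq_x = sum((a - mean_x) ** 2 for a in x)
    sum_sq_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / math.sqrt(sum_sq_x * sum_sq_y)


def correlation_strength(r: float) -> str:
    """Classify a coefficient as strong (> 0.70), moderate (0.4-0.7) or weak (< 0.4)."""
    magnitude = abs(r)   # assumption: sign is ignored when grading strength
    if magnitude > 0.70:
        return "strong"
    if magnitude >= 0.40:
        return "moderate"
    return "weak"
```

Convergent validity would be supported by strong or moderate coefficients with other health status measures, and discriminant validity by weak coefficients with measures of a different aspect of disease.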

Objective 4: To evaluate the agreement between the CAAT and CAT in all patients from NOVELTY who completed both instruments (N = 277), intraclass correlation coefficients were used. The CAAT and CAT were further compared using descriptive statistics and Bland–Altman plots.
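The quantities displayed in a Bland–Altman plot can be derived as in the following sketch, which assumes paired total scores from the same patients and takes the difference as CAT score minus CAAT score. The function and field names are illustrative, and the intraclass correlation is not reproduced here.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence, Union


def bland_altman(
    cat_scores: Sequence[float],
    caat_scores: Sequence[float],
) -> Dict[str, Union[float, List[float]]]:
    """Mean difference and 95% limits of agreement for paired total scores."""
    differences = [cat - caat for cat, caat in zip(cat_scores, caat_scores)]
    averages = [(cat + caat) / 2 for cat, caat in zip(cat_scores, caat_scores)]

    bias = mean(differences)          # central line of the plot
    spread = 1.96 * stdev(differences)
    return {
        "bias": bias,
        "lower_limit": bias - spread,
        "upper_limit": bias + spread,
        "differences": differences,   # y-axis values
        "averages": averages,         # x-axis values
    }
```

A mean difference close to zero, with no trend in the differences across the range of averages, would indicate no consistent difference between the two instruments.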

NOVELTY study population

Observations were obtained from NOVELTY, a global, prospective, 3-year observational study of patients with a physician-assigned or suspected diagnosis of asthma and/or COPD [12, 16]. The study design, patient population, and ethical committee and institutional review board compliance have been reported previously [12, 16]. To avoid the selection bias observed in regulatory studies [24], NOVELTY enrolment was stratified by physician-assigned diagnosis (asthma, both asthma and COPD [hereafter referred to as asthma + COPD] or COPD) and physician-assessed severity (mild, moderate or severe). No diagnostic or severity criteria were pre-specified when determining eligibility. For patients with asthma + COPD, physician-assessed severity was the higher of the separate severity classifications for asthma and COPD.

Data collection

Patients completed questionnaires via the web or by telephone interviews. Consistency in mode of PRO administration (i.e. web or telephone) was encouraged. The PROs were administered in the same order each time up until 2 July 2019. Thereafter, PROs could be completed in any order.

At baseline, patients were administered the CAAT, SGRQ and EuroQol 5-dimensions 5-level visual analogue scale (EQ-5D-5L VAS). A subset of patients with asthma + COPD or COPD completed both the CAAT and CAT. Additionally, data for spirometry measures (post-bronchodilator forced expiratory volume in 1 s [FEV1], forced vital capacity [FVC] and FEV1/FVC ratio) were collected at the baseline visit.

Results

Patient demographics and clinical characteristics

The total sample consisted of 1530 observations (510 in each diagnostic group). In the total sample, the mean ± standard deviation was 62.4 ± 13.3 years for age, 15.6 ± 16.9 for years since diagnosis, 70.6 ± 24.1 for post-bronchodilator FEV1% predicted, and 15.9 ± 8.5 for CAAT score (Table 1). Compared with the asthma + COPD and COPD groups, the asthma group was on average younger, had a higher proportion of females, and had a lower mean CAAT score and a higher mean post-bronchodilator FEV1% predicted (Table 1). Mean scores for the CAAT and CAT were similar among patients with asthma + COPD or COPD who completed both questionnaires (Table 1 and Additional file 1: Table S1).

Table 1 Patient demographics and clinical assessments by physician-assigned diagnosis

Internal consistency and structural validity

Internal consistency was adequate (Cronbach’s alpha for asthma: 0.87; asthma + COPD: 0.86; COPD: 0.84; total sample: 0.86), indicating that the CAAT items assess the same general construct in each diagnostic group. Initial exploratory factor analyses indicated that items clustered into two correlated groups: items 1–3 pertaining to symptoms, and items 4–8 pertaining to functional impact. However, confirmatory factor analysis demonstrated very good fit of a single hierarchical factor for total CAAT score across diagnostic groups (Additional file 1: Table S2).

Item response characteristics and conceptual framework

Item response theory

In the total sample, CAAT items had a good overall IRT model fit, except for item 6 (confidence leaving home). Item response boundary locations were monotonic and in the expected order. Discrimination between response options for individual items ranged from 1.2 to 2.9 across the health status continuum (theta). Symptom-related items (items 1–3) had broad coverage but were less informative (i.e. lacked precision) than the other items (Additional file 1: Fig S3). By comparison, functional impact-related items (items 4–8) provided more information but had a narrower range (Additional file 1: Fig S3). Across the total sample, test information coverage lay between theta values of −2.0 and 3.1.

Differential item functioning

Sampling resulted in 127 patients with asthma and 161 patients with COPD once duplicates were removed. Four items showed uniform DIF (p < 0.005; Table 2); none showed non-uniform DIF (p = 0.18–0.98). There was no consistent mean boundary difference in CAAT units between asthma and COPD (Table 2); patients with asthma scored lower (indicated by the negative sign) in five items and patients with COPD scored lower in three items. Assuming an item-level MCID of 0.25 (see Methods), this threshold was exceeded in five items. On average, the asthma group scored slightly lower, largely due to items 4 and 5; the mean difference was −0.19 (standard deviation 0.47; p > 0.1). This translates into 1.54 CAAT units, which is 3.9% of the scaling range of 0–40 and below the assumed two-unit CAAT MCID.

Table 2 Mean boundary difference and p values for uniform DIF between patients with asthma and COPD

When measured by effect size, three items (4, 5 and 6) showed a significantly lower response in patients with asthma (Fig. 1). A meta-analysis of all items demonstrated a significantly lower response overall in the asthma group (p = 0.013), but the difference was small (Cohen’s d = −0.23).

Fig. 1

Mean boundary threshold difference between asthma and COPD for CAAT items by effect size units. Each CAAT item was scored between 0 and 5. Effect size units were calculated using Cohen’s d. A negative value indicates a lower response in patients with asthma. The overall mean was calculated as the standardised mean difference from a meta-analysis using a random effects model. Error bars represent 95% confidence intervals. CAAT Chronic Airways Assessment Test, COPD chronic obstructive pulmonary disease

Convergent and divergent validity

Results showed a consistently high correlation between the SGRQ and CAAT across diagnostic groups, as reflected by R² > 0.86 (Fig. 2; individual patient scores shown in Additional file 1: Fig S4). Analysis of covariance showed no significant difference in the regression slopes between asthma and COPD (p = 0.46). There was a significant intercept in all groups (i.e. when the SGRQ score was 0, the CAAT score was > 0 [≈ 5 units, p < 0.0001]), but there was no significant difference in intercept between asthma and COPD (p = 0.078).

Fig. 2

Linearity between CAAT and SGRQ total scores for the total sample and each diagnostic group. Error bars represent standard errors; data points with no error bars are representative of one patient with that SGRQ score. CAAT Chronic Airways Assessment Test, COPD chronic obstructive pulmonary disease, SGRQ St George’s Respiratory Questionnaire

The CAAT also correlated strongly with the CAT and moderately with the EQ-5D-5L VAS; weaker correlations were observed for spirometry measures (Table 3; Additional file 1: Table S3).

Table 3 Pearson’s correlations between CAAT score and patient-reported outcomes or clinical assessments

Comparison between the CAAT and CAT

Intraclass correlation indicated strong reliability between the CAAT and CAT (Additional file 1: Table S4). Bland–Altman plots showed no consistent difference between CAAT and CAT scores (Fig. 3).

Fig. 3

Bland–Altman plots of CAAT and CAT total scores in asthma + COPD, COPD, and both. Each small circle represents one patient, with the difference between CAAT and CAT total scores (CAT score – CAAT score) plotted against the average of the two scores. Jittering has been added to panel A for clarity where there were multiple superimposed circles. The central line shows the mean difference between the two measures, while the upper and lower lines show the limits of agreement (± 1.96 SD). Data for six patients who did not meet inclusion criteria have been excluded. CAAT Chronic Airways Assessment Test, CAT COPD Assessment Test, COPD chronic obstructive pulmonary disease, N total number of patients in the sample, SD standard deviation

Discussion

These results show that the CAAT has strong psychometric properties and may be a suitable PRO for assessing health status in patients with asthma and/or COPD. While patients with asthma scored some items lower than patients with COPD in the DIF analysis, the overall differences in CAAT total score were small and their summed effect was below the CAT MCID and therefore unlikely to be of clinical importance. This suggests that CAAT scores in asthma and COPD are likely to reflect similar degrees of health impairment; however, this may not apply at an individual item level, where statistically and clinically significant differences were found, particularly for items 4 and 5 (breathlessness and limited home activity).

The observed emergence of CAAT item grouping into symptom-related and functional impact-related items suggests that a two-factor model would need to be explored further if CAAT domains were to be considered. The DIF analysis showed that this would only result in small improvements to precision, however. Like the CAT [13], the CAAT is designed to provide a single and easy-to-calculate measure of health status impairment, whereas a two-factor model would require a more complex scoring algorithm, introducing a barrier to its use. For these reasons, we have opted for a single total CAAT score as being suitable for the majority of CAAT applications.

The CAAT correlated strongly with the CAT, with no consistent difference seen in the Bland–Altman plots. As expected from previous studies of the CAT in COPD [13, 25], the CAAT consistently and strongly correlated with the SGRQ in all three diagnostic groups. It also correlated moderately with the EQ-5D-5L VAS, a generic measure of health status. However, as generally reported for other health status PROs [2, 8, 9], correlations between the CAAT and spirometry measures were weaker than those between the CAAT and other PROs, although patients with lower lung function tended to have worse CAAT scores as expected.

The 2022 GINA and GOLD reports emphasise the need for regular assessment of symptoms and their impact on patients with asthma and/or COPD [10, 14]; a need therefore exists for a clinically applicable PRO to use across asthma and COPD and in patients with both conditions. Recently, the Respiratory Symptoms Questionnaire was developed as a respiratory symptom tool for patients with asthma and/or COPD [26]; however, it is not designed to address the broader concept of health status. Our results suggest that the CAAT can assess health status in everyday clinical settings without adding undue patient burden. For instance, the CAAT was recently used in an investigation of the utility of patient-reported questionnaires in patients with or at risk of COPD [27].

Although the CAT was designed for use in patients with COPD, the CAAT demonstrated good performance in patients with physician-assigned asthma, asthma + COPD and COPD. Unlike some asthma and COPD questionnaires, the CAAT captures aspects of health status relevant to both asthma and COPD, including the impact on activity, sleep and energy level. Of the items included in the CAAT, only items 6 (confidence leaving home) and 8 (energy) are not already part of routine asthma assessment [10]. Although item 2 (phlegm) is not currently included in validated asthma symptom control tools, phlegm is common in patients with asthma [28, 29]. Furthermore, previous qualitative patient interviews of asthma symptoms support the relevance of several CAAT items in patients with asthma [30]. The CAAT provides a single tool for standardised assessment of disease-specific health status in routine clinical practice across a range of obstructive lung diseases.

A key strength of this analysis is that it was performed using a range of measures within a large, real-world population of patients across primary and non-primary care settings.

Limitations of this analysis include the stratification of NOVELTY enrolment by physician-assessed severity (to ensure adequate and approximately equal sample sizes for subgroup analyses [12, 16]). Consequently, the patients in this analysis sample did not reflect a truly random sample of asthma and COPD populations in the community. Patients with asthma + COPD were relatively overrepresented in this analysis compared with the overall NOVELTY population (33% vs. 12%, respectively) [12], but this can give us some confidence in the reliability of our findings in these patients since it provided a large sample size. Finally, the sample size may have been overly conservative for some of the analyses, particularly the IRT, resulting in some analyses being slightly overpowered and detecting small but not clinically important differences.

Beyond the scope of this paper, future analyses should look in detail at the relationship between CAAT scores and a range of measures of severity relevant to asthma, asthma + COPD and COPD, and a longitudinal analysis should assess the performance of the CAAT over time. Further research is required to determine the CAAT score MCID, to establish whether this differs from the CAT MCID [22], and to investigate whether it applies across diagnostic and severity groups. Current asthma control instruments are poorly responsive in patients with severe asthma [31], so it will be important to determine the responsiveness of the CAAT in this patient group.

Conclusion

This cross-sectional analysis is the first step in psychometrically evaluating the CAAT as a measure of health status in patients with asthma and/or COPD. The CAAT demonstrated good cross-sectional psychometric properties and moderate-to-strong correlations with other health status measures, making it a suitable PRO instrument to assess the impact of obstructive lung disease in broad populations of patients with airways disease. Due to its brevity, the CAAT may be particularly relevant for routine clinical practice and ‘real-world’ effectiveness studies performed in patients in a routine care setting.