Key Summary Points

Population-based administrative databases sourced from actual real-world clinical settings provide the opportunity to perform large-scale epidemiological studies; novel use and application of these real-world data in health care research have increased substantially

However, administrative databases were not designed to be used in research, and the validation of case-finding algorithms to accurately identify patients with atopic dermatitis (AD) is needed

The objective of this study was to validate an existing AD algorithm published by Henriksen et al. and a Modified algorithm. Both algorithms used dispensed prescriptions and diagnoses of skin conditions to identify patients with AD

The sensitivity and positive predictive value of the Modified algorithm were shown to be acceptable in the pediatric patient population when primary and secondary care data were used to validate this algorithm; the Modified algorithm can therefore be used to identify pediatric patients with AD using administrative data

A similar assessment in adult patients indicated that further modifications to this algorithm would be needed to be able to use it to accurately identify adult patients with AD in these administrative databases

Introduction

Atopic dermatitis (AD) is a chronic inflammatory skin disease characterized by dry skin, pruritus and eczematous lesions [1]. In roughly 60% of cases, the disease manifests during the first year of life, but may start at any age [2, 3]. Most AD patients have mild-to-moderate disease, whereas approximately 10% of patients suffer from severe disease [4,5,6].

The use of real-world data to evaluate drug safety and effectiveness has received substantial attention from medical researchers and regulators [7], and large-scale epidemiological studies are important for identifying and studying risk factors, disease rates, resource utilization and outcomes of interventions. For this purpose, the Nordic countries, which have a long history of maintaining high-quality national health registers, offer an excellent setting for performing epidemiological studies. In Sweden, it is mandatory to report medical information to the National Patient Register (NPR), the Prescribed Drug Register (PDR) and the Cause of Death Register (CDR), and these registers provide almost universal coverage. Data can be linked between registers on an individual level by using a unique personal identifier, provided to all Swedish citizens.

Historically, epidemiological studies of AD have used questionnaires to identify AD [8] or have been limited to a specific geographical region [9]. While questionnaires can provide detailed information on several important domains of the disease that are not available in administrative databases, they can be susceptible to language-related and cultural issues of interpretation [10, 11] as well as to loss of patients through dropout.

Moreover, parental answers to questionnaires that are obtained retrospectively may be prone to both selection and recall bias, which can affect reported outcomes [12, 13]. Studies which are restricted to selected age groups or geographical areas may have limited generalizability. Observational studies have been proposed as a method to complement the results from studies using questionnaires [13] while at the same time capturing relevant subpopulations of patients with AD, including those managed exclusively in primary care, which is oftentimes the case for patients with mild AD. Diagnoses in primary care are not recorded on a national level in Sweden, but dispensed medications originating from primary care are captured in the PDR which is nationwide. An approach for accurate AD patient identification thus depends on a reliable algorithm to identify patients with AD through their dispensed medications. While several studies have validated, e.g., asthma medication as a proxy for asthmatic disease [14,15,16,17,18,19,20], only a few studies have validated data on dispensed prescriptions for treatment of AD as a proxy for AD [13, 21,22,23].

Henriksen et al. [24] developed an algorithm to identify children with AD from filled prescriptions of topical treatments. This algorithm uses diagnoses from hospital visits to exclude patients with other types of dermatitis or medical conditions known to lead to the use of topical treatments [24]. A Danish study validated this algorithm by telephone interviews with the caretakers of the children identified by the algorithm [22]. They found that the sensitivity and specificity of the algorithm were 74.1% and 73.0%, respectively. However, AD is a relapsing-remitting condition, which may lead to recall bias, and research suggests that questionnaire validation performs poorly [25], even for the United Kingdom Working Party criteria, a set of diagnostic criteria for atopic dermatitis [26]. To the best of our knowledge, the algorithm developed by Henriksen et al. has not been validated by using primary care databases or been validated in a Swedish setting.

Objectives and Importance

The primary objective of this study was to validate the algorithm developed by Henriksen et al. [24] in pediatric patients using two primary care registers in Sweden. We used primary care register data as a gold standard to validate the algorithm since many patients with AD are managed in primary care. As a secondary objective, this study aimed to validate a Modified AD algorithm, originating from Henriksen et al. [24] with modifications made by the clinical authors of this paper. Each algorithm was also evaluated in an adult population. In a sensitivity analysis, we also included diagnoses from secondary care to verify the presence of an AD diagnosis.

This study informs the suitability of identifying patients with AD through data on prescriptions in real-world administrative databases, which in turn would allow for studying subsets of patient groups with varying degrees of severity and disease control.

Methods

Data Sources and Ethics

The study extracted primary care data from two of the three largest regions in Sweden, Västra Götaland and Skåne, covering approximately one-third of the Swedish population. The two databases (VEGA and RSVD, respectively) include International Classification of Diseases, 10th Revision (ICD-10) diagnosis codes and dates of visits. The study also extracted data from the NPR, PDR and CDR. The NPR contains demographic and medical information for all in- and outpatient specialist visits (i.e., secondary care) from 1 January 2001, including ICD-10 codes with corresponding dates. The PDR includes data from 1 July 2005 for all pharmacy-dispensed medications originating from both primary and secondary care, including medications by Anatomical Therapeutic Chemical (ATC) codes and dispensation dates. The CDR contains information on cause and date of death. These three databases are managed by the Swedish National Board of Health and Welfare and have nationwide coverage.

Data linkage and subsequent pseudonymization were performed by Statistics Sweden. Ethical approval (reference number 2019-03840) was obtained in July 2019 from the Ethical Review Board in Sweden. Individual consent was not collected from the study population since this is not required for retrospective registry studies of Swedish secondary data.

Study Population and Study Design

This study included patients with at least one registered primary care visit in VEGA or RSVD between 2007 and 2018 (inclusive). No other inclusion criteria were used in this study; hence, the study cohort consisted of pediatric and adult patients with at least one primary care visit independent of diagnosis. The study cohort was linked to the PDR, and patients who had at least one dispensation of a topical calcineurin inhibitor (TCI) or at least two dispensations of a topical corticosteroid (TCS) within 12 months of each other (see “inclusion criteria” Table 1) were identified. Moreover, prescribed treatments (ATC code and dispensation date from the PDR) and healthcare visits (ICD-10 code and date from the NPR) that merited exclusion according to the Henriksen AD algorithm and the Modified AD algorithm (See “Drug exclusion” and “Diagnosis exclusion” in Table 1) were also collected.

Table 1 Defining predicted AD through the Henriksen and Modified AD algorithms

The time period for evaluating the exclusion criteria was not explicitly stated in Henriksen et al. [24]; we therefore used the entire identification period to evaluate both the diagnosis and the drug exclusion criteria of the Henriksen AD algorithm. The Modified AD algorithm also used the entire identification period for evaluating the diagnosis exclusion criteria. To evaluate the drug exclusion criteria in the Modified AD algorithm, the period from the start of the study period until the day before the treatment index date was used. The rationale for this was that the PDR includes dispensations prescribed in both primary and secondary care, and any dispensation occurring after the treatment index date more likely indicates a condition comorbid with AD than an alternative diagnosis. The study design is presented in Fig. 1.
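The inclusion rule and the two exclusion windows described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function and argument names are hypothetical, "within 12 months" is approximated as 365 days, and the inclusion index is assumed to be the earlier of the first TCI dispensation and the first TCS dispensation of a qualifying pair. The actual ATC and ICD-10 code lists are those given in Table 1 and are not reproduced here, so the exclusion inputs are passed in already resolved.

```python
from datetime import date, timedelta

# Assumption: "within 12 months of each other" is approximated as 365 days.
WINDOW = timedelta(days=365)


def inclusion_index(dispensations):
    """Return the inclusion index date, or None if the inclusion rule is not met.

    dispensations: list of (dispensation_date, drug_class) tuples, where
    drug_class is "TCI" or "TCS" (hypothetical labels for illustration).
    """
    tci_dates = sorted(d for d, c in dispensations if c == "TCI")
    tcs_dates = sorted(d for d, c in dispensations if c == "TCS")
    candidates = []
    if tci_dates:
        # At least one TCI dispensation is sufficient for inclusion.
        candidates.append(tci_dates[0])
    # At least two TCS dispensations within the window are required.
    for first, second in zip(tcs_dates, tcs_dates[1:]):
        if second - first <= WINDOW:
            candidates.append(first)
            break
    return min(candidates) if candidates else None


def predicted_positive(dispensations, has_exclusion_diagnosis,
                       exclusion_drug_dates, modified):
    """Apply inclusion and exclusion rules and return the predicted AD status."""
    index = inclusion_index(dispensations)
    # Diagnosis exclusions are evaluated over the entire identification period.
    if index is None or has_exclusion_diagnosis:
        return False
    if modified:
        # Modified algorithm: only exclusion drugs dispensed up to the day
        # before the index date lead to exclusion.
        drug_excluded = any(d < index for d in exclusion_drug_dates)
    else:
        # Henriksen algorithm (as applied here): the entire period counts.
        drug_excluded = bool(exclusion_drug_dates)
    return not drug_excluded
```

With this sketch, a patient with two TCS dispensations five months apart and an exclusion drug dispensed only after the index date would be predicted positive under the Modified rule but not under the Henriksen rule.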

Fig. 1 Study design

All patients in the study population were classified according to the presence of an AD diagnosis in primary care (patients with an AD diagnosis were classified as “positive patients” while patients without an AD diagnosis were classified as “negative patients”). In the next step, all patients were classified according to the predictive status (“positive predicted” or “negative predicted”) by the Henriksen AD algorithm and the Modified AD algorithm. Given the disease status from primary care (positive or negative) of each patient and the corresponding predicted status (positive predicted or negative predicted), all patients were classified as either true positive (TP), false positive (FP), false negative (FN) or true negative (TN) as outlined in Fig. 2.

Fig. 2 Binary test classification. *Diagnosis of AD from secondary care was used to confirm disease status in a sensitivity analysis

True positive was defined as a patient with an AD diagnosis in primary care and for whom the algorithm predicted a positive result (i.e., identified disease). False negative was defined as a patient with an AD diagnosis in primary care and for whom the algorithm predicted a negative result (i.e., did not identify disease). False positive was defined as a patient with no AD diagnosis in primary care and for whom the algorithm predicted a positive result (i.e., identified disease).

True negative was defined as a patient with no AD diagnosis in primary care and for whom the algorithm predicted a negative result (i.e., did not identify disease). An AD diagnosis in secondary care without an AD diagnosis in primary care may indicate that (1) the patient was diagnosed with AD in a primary care region for which this study has no data but then moved to one of the two regions available in this study and continued treatment there (and was then included by the algorithm) or (2) physicians in primary care were unsure about the correct diagnosis and thus referred the patient to secondary care, and/or used a general category term (L30+) of dermatitis. These patients were therefore excluded from the study cohort in the base case but included and classified as positive patients in the sensitivity analysis, in which an AD diagnosis in secondary care was used to validate the algorithm. In the sensitivity analysis, the level of overlap between an AD diagnosis and any of the exclusion criteria was also evaluated.
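The classification rules above, including the handling of patients with an AD diagnosis in secondary care only, can be summarized in a short sketch. The function and argument names are hypothetical and chosen only for illustration. The sketch covers reference-status assignment only; any change to the exclusion criteria in the sensitivity analysis would be reflected in the predicted status passed in.

```python
def classify(primary_care_ad, secondary_care_ad, predicted_positive,
             sensitivity_analysis=False):
    """Return "TP", "FP", "FN" or "TN", or None if the patient is excluded.

    primary_care_ad / secondary_care_ad: whether an AD diagnosis was recorded
    in primary / secondary care. predicted_positive: the algorithm's result.
    """
    if secondary_care_ad and not primary_care_ad:
        if not sensitivity_analysis:
            # Base case: these patients are excluded from the study cohort.
            return None
        # Sensitivity analysis: a secondary care diagnosis confirms disease.
        has_ad = True
    else:
        has_ad = primary_care_ad

    if has_ad:
        return "TP" if predicted_positive else "FN"
    return "FP" if predicted_positive else "TN"
```

Returning None for the excluded group keeps the base-case denominators restricted to patients whose disease status is observable in the two primary care registers.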

Three different index dates, diagnosis index, inclusion index and primary care index, were used in this study. Patients with a positive status from primary care were assigned a diagnosis index date corresponding to the date of the first observed visit to primary care with an AD diagnosis. An inclusion index date was assigned to those patients classified as “positive predicted” and corresponded to the first pharmacy prescription of either a TCI or a TCS.

Additionally, those patients without a diagnosis index date or an inclusion index date (this scenario is only applicable to patients who were classified as TN) were assigned a primary care index date corresponding to the median date of their healthcare visits to primary care during the study period. The primary care index date was designed to represent the median date of exposure in primary care during the study period and consequently allows for calculation of the age of the patient at that time.

For patients having several index dates, a hierarchy was employed to determine a single study index date. The diagnosis index was preferred as the index date, followed by the inclusion index and then the primary care index. The patient's age at the study index date then determined the age cohort to which the patient belonged (pediatric or adult). Many epidemiological studies rely on an index date, which represents the date when exposure starts and is oftentimes defined as disease onset or date of diagnosis. The proportion of true-positive patients with an inclusion index date within 6 months of the diagnosis index date was therefore calculated. In a sensitivity analysis, 3 and 12 months were also used.
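The index-date hierarchy and the timing check can be sketched as follows. This is an illustration under stated assumptions rather than the study code: all names are hypothetical, the median visit date is taken as the lower median when the number of visits is even, the adult threshold of 18 years is an assumption not stated in the text, and a month is approximated as 365.25/12 days.

```python
from datetime import date
from statistics import median_low


def study_index(diagnosis_index, inclusion_index, primary_care_visits):
    """Pick one study index date: diagnosis > inclusion > primary care index."""
    if diagnosis_index is not None:
        return diagnosis_index
    if inclusion_index is not None:
        return inclusion_index
    # Primary care index: median date of primary care visits. median_low
    # returns an actual visit date (assumption for even numbers of visits).
    return median_low(primary_care_visits)


def is_pediatric(birth_date, index_date, adult_age=18):
    """Age cohort at the study index date; adult_age=18 is an assumption."""
    had_birthday = (index_date.month, index_date.day) >= (birth_date.month, birth_date.day)
    age = index_date.year - birth_date.year - (0 if had_birthday else 1)
    return age < adult_age


def within_months(diagnosis_index, inclusion_index, months=6):
    """True if the two index dates lie within the given number of months."""
    max_days = round(months * 365.25 / 12)  # approximate month length
    return abs((inclusion_index - diagnosis_index).days) <= max_days
```

The proportion reported in the text would then be the share of true-positive patients for whom `within_months` is true, evaluated with `months` set to 6 in the main analysis and to 3 or 12 in the sensitivity analysis.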

Statistical Analyses

Descriptive statistics for patient characteristics were presented as mean and standard deviation (SD) for age, and number and percentage were used for sex, inclusion and exclusion criteria, presence of diagnosis in secondary care and use of emollients.

The four groups (TP, FN, FP and TN) were used to calculate the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and proportion correctly predicted using the primary care index date (for each algorithm, by pediatric and adult patients) [27]. Furthermore, the proportionate reductions in uncertainty scores were calculated as measures of the proportion by which a positive or negative test result reduces the diagnostic uncertainty [28, 29]. A 95% binomial confidence interval (CI) using the Agresti-Coull interval [30, 31] was reported for each outcome.
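As an illustration of these calculations, the following sketch computes the accuracy measures from the four cell counts and attaches an Agresti-Coull interval to each. The function names are hypothetical, the interval is clipped to [0, 1] as a presentational choice, and the proportionate reduction in uncertainty scores are omitted. The counts in the usage lines are invented for demonstration and are not study data.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull(successes, total, confidence=0.95):
    """Agresti-Coull binomial confidence interval, clipped to [0, 1]."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = total + z ** 2                       # adjusted sample size
    p_adj = (successes + z ** 2 / 2) / n_adj     # adjusted proportion
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def diagnostic_metrics(tp, fp, fn, tn):
    """Return {measure: (estimate, (ci_low, ci_high))} from the 2x2 counts."""
    cells = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
        "correctly_predicted": (tp + tn, tp + fp + fn + tn),
    }
    return {
        name: (num / den, agresti_coull(num, den))
        for name, (num, den) in cells.items()
    }


# Hypothetical counts, for demonstration only.
metrics = diagnostic_metrics(tp=30, fp=40, fn=70, tn=860)
for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

The Agresti-Coull interval adds roughly two pseudo-successes and two pseudo-failures before applying the normal approximation, which is one reason it behaves better than the plain Wald interval when proportions are close to 0 or 1.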

Results

Pediatric Population

A total of 487,176 children with a primary care visit were included in this study. The characteristics of patients classified as TP, FN and FP by both algorithms are presented in Table 2. Age at index date in the pediatric population was similar across predicted status and algorithms (approximately 5.8 years) with the exception of the group of false positives, which on average were 7.4 years old. Most patients were included by the algorithm through two TCS dispensations within 12 months of each other. In the FN cohort, most patients (~ 76%) were not prescribed a TCS or TCI and were hence not captured by the algorithm while the remaining patients were dispensed a TCS or TCI but were then excluded because of a medical condition (by either diagnosis or dispensation).

Table 2 Summary of patient characteristics by AD algorithms of pediatric and adult populations

Further analysis of these conditions showed frequent overlap (see Tables 5 and 6 in the supplementary material) between the diagnosis of AD and the diagnosis of pruritus (ICD-10 L29+) and other dermatitis (ICD-10 L30+), both of which formed part of the exclusion criteria. The remaining 75% of the false-negative cohorts had hence been given an AD diagnosis in primary care but never filled a dispensation to qualify for inclusion. Dispensation of emollients (which was not part of the algorithms) was most common in patients classified as TP but about 40% of patients classified as FN were also dispensed an emollient any time during the identification period.

Adult Population

A total of 2,166,776 adult patients with a primary care visit were included in this study. Characteristics of patients classified as TP, FN and FP by both algorithms are presented in Table 2. Age at index date in the adult population varied across predicted status but was similar between algorithms. True-positive patients were generally younger (40 years) and average age increased to 45 years and 55 years in false-negative patients and false-positive patients, respectively. Most patients were included by the algorithm through dispensation of two TCSs within 12 months of each other. In the FN cohort, the majority of patients (51%) were not prescribed a TCS or TCI and were hence not captured by the algorithm.

Further analysis of these conditions showed frequent overlap (see Tables 7 and 8 in the supplementary material) between the diagnosis of AD and the diagnosis of pruritus (ICD-10 L29+) and other dermatitis (ICD-10 L30+), both of which formed part of the exclusion criteria. The remaining 55% of the FN cohort had hence been given an AD diagnosis in primary care but never filled a dispensation to qualify for inclusion. Dispensation of emollients (which was not part of the algorithms) was most used by patients classified as TP in the Henriksen AD algorithm (36.5%) while it was used by 29.3% and 21.3% of patients classified as FN and FP according to the Henriksen AD algorithm, respectively.

Table 3 shows the predictive ability of each algorithm in the pediatric and the adult cohorts. The sensitivity and PPV of the Henriksen AD algorithm in the pediatric population were 30.0% and 40.7%, respectively. The specificity was 95.1%. The proportion of true-positive patients with a diagnosis index date within 6 months of the inclusion index date was 52.5%. The sensitivity of the Modified AD algorithm was slightly higher (30.4%) than that of the original Henriksen AD algorithm, while its PPV was slightly lower (40.2%). In the adult cohort, the sensitivity and PPV of the Henriksen AD algorithm were 20.4% and 8.7%, respectively. The sensitivity and PPV of the Modified AD algorithm in the adult cohort were 21.2% and 8.3%, respectively. The specificity was 94.4%.

Table 3 Predictive ability of the AD algorithms, by pediatric and adult patients

Results from Sensitivity Analysis

Table 4 shows the predictive ability of the Modified AD algorithm in the sensitivity analysis. The sensitivity and PPV in the pediatric population were 62.1% and 66.3%, respectively. The specificity was 94.1%. The proportion of true-positive pediatric patients with a diagnosis index date within 6 months of the inclusion index date was 86.4%. The sensitivity and PPV in the adult population were 48.3% and 16.9%, respectively. The specificity was 92.5%. The proportion of true-positive adult patients with a diagnosis index date within 6 months of the inclusion index date was 67.3%. We also found that the diagnosis index date was in close proximity (< 1 year) to the primary care index date in the TP and FN groups in the pediatric and the adult cohorts, implying that the median date of exposure in primary care was a good proxy for an index date for the TN patients.

Table 4 Predictive ability of the Modified AD algorithm using secondary data as validation, by pediatric and adult patients

Discussion

Interpretation and Comparison with Other Studies

The results from this study show that, when validated solely against diagnoses from primary care data, the positive predictive values of the Henriksen and Modified AD algorithms in the pediatric population were 40.7% and 40.2%, respectively, and the sensitivities of the two algorithms were 30.0% and 30.4%, respectively. The specificity was 95.1% and 95.0% for the Henriksen and Modified AD algorithms, respectively. In the adult population, the corresponding PPVs were 8.7% and 8.3%, respectively, and the sensitivities were 20.4% and 21.2%. The specificity of the Henriksen and Modified AD algorithms was 94.8% and 94.4%, respectively, in the adult population. The relatively low predictive ability is consistent with another Swedish study, which validated filled prescriptions for AD using medical records from primary care and showed a PPV of 45% in children [13], and indicates that neither of the two algorithms should be used without access to administrative data from a secondary care registry.

Several factors, including limitations in the prescribed drug register, organization of primary healthcare in Sweden and study design, may have contributed to the low sensitivity and PPV of the two algorithms. First, some group I TCSs (based on the European I–IV TCS classification, where: I = mild, II = medium strong, III = strong, IV = extra strong) are available as over-the-counter (OTC) drugs in Sweden and are consequently not included in the PDR. It is therefore possible that some patients (presumably with mild symptoms) were classified as false negative (e.g., patients with an AD diagnosis from primary care but not predicted as such by the algorithms because they purchased their treatment as OTC rather than being dispensed a prescription). Similarly, emollients, which constitute the basic treatment for AD but are not part of the algorithms, are also available OTC, and the observed use of emollients (see Table 2) may then be underestimated compared to the true use of emollients.

These results suggest that patients with very mild and/or transient symptoms are not accurately identified by these algorithms. This interpretation is supported by the results from a Dutch validation study, which found that medication proxies resembling long-term treatment (≥ 4 dispensed prescriptions of TCS) had the highest predictive power to identify AD patients [21]. The significance of not capturing patients with mild symptoms depends on the objective of the study. In prevalence studies, this may lead to an underestimate of the prevalence, whereas not capturing the mildest patients may be of less importance when studying the association between AD and risk of comorbidities. Another possible explanation for the low sensitivity and PPV could be that general practitioners may be more inclined to record an AD diagnosis in pediatric populations without a comprehensive clinical assessment and therefore overreport AD.

Furthermore, provision of primary care in Sweden is not centralized but rather organized in regions, and this study only had access to primary care data from two Swedish regions. It is therefore possible that patients were given their AD diagnosis in a primary care center located in another region but started/continued treatment while visiting primary care in any of the two regions included in this study. Such patients would have been classified as false positive only because of the unobservability of the true status of AD for the patient. The sensitivity analysis was in part designed to reduce the risk of this situation by also including patients who had an AD diagnosis from secondary care but none in primary care.

The results from the Modified AD algorithm also revealed that there was significant overlap in patients classified as false positives between the AD diagnosis and pruritus (L29+) and other dermatitis (L30+). Pruritus is a common symptom of AD [1], and “other dermatitis,” which includes unspecified dermatitis (L30.9+), may be recorded by physicians who are unsure about the exact underlying condition. These two diagnosis codes (L29+ and L30+) were therefore removed from the medical conditions in the diagnosis exclusion criteria as a sensitivity analysis. This adjustment was applied to the Modified AD algorithm and improved the accuracy in the pediatric population, where sensitivity increased from 30.4% to 62.1% and PPV increased from 40.2% to 66.3%. The results from the sensitivity analysis are also consistent with the results from the validation of Henriksen’s AD algorithm in a Danish setting. Stensballe et al. [32] used a telephone interview with the family to confirm physician diagnosis of the children identified by this algorithm. The authors showed that this algorithm had a PPV of 60.0%.

Finally, the Henriksen AD algorithm was designed to identify children with AD and not adults with AD, and it may not be surprising that the accuracy of the algorithm and the Modified AD algorithm was lower in the adult patient population compared to the pediatric patient population. To our knowledge, no other study with a similar study design to ours has validated the use of data on dispensed prescriptions for treatment of AD as a proxy for AD disease identification in an adult population.

Strengths and Limitations

The major strength of this study is the use of a comprehensive and detailed database including data from two primary care databases, which included over 80% of the population residing in the two regions. It is therefore plausible that the results from this study are generalizable to similar healthcare settings. Also, the level of detail including dates and diagnosis codes of healthcare visits together with prescription dates and types of medications enabled us to identify which diagnoses or medications led to exclusion and validate the interpretations.

We also recognize that our study has some limitations. In addition to the limitations mentioned in the interpretation of the results, we did not have access to patient records, and it was therefore not possible to verify (through a chart review) the diagnoses provided by the physicians, which were used to infer the true disease status of the patients included in this study. Also, we observed that a large share (60.2%) of healthcare visits to primary care lacked diagnosis codes. Since AD is a chronic disease, physicians may not record a diagnosis at each consultation, which in turn may affect the sensitivity and positive predictive value of the algorithms as well as the analysis of the timing of the prescription (i.e., inclusion index date) in relation to the primary care diagnosis (i.e., diagnosis index date). However, the long follow-up time of this study reduces the impact of this on the sensitivity and positive predictive value of the evaluated algorithms.

Conclusions and Implications for Future Research

This study showed acceptable predictive power and sensitivity for the Modified AD algorithm in a pediatric population when we also used diagnosis codes from secondary care data to identify patients with AD. We also showed that additional adjustments to the Modified algorithm are needed to accurately identify adults with AD. This work is important for exploring the unique possibilities that real-world data offer for case identification. Yet, identification of patients through a medication prescription proxy is complicated, and any algorithm should be used with caution and potentially be complemented with sensitivity analyses. Future research using the two algorithms evaluated in this study must also consider the relative importance of sensitivity and specificity to the specific research question.

When both primary and secondary care data were included and the specified exclusion criteria were further refined, the Modified algorithm yielded acceptable levels of sensitivity, specificity and positive and negative predictive value in the pediatric population. However, the sensitivity and positive predictive value were poor when using pediatric primary care data alone and in the adult AD population. In conclusion, our results indicate that the Modified AD algorithm can be used to identify pediatric patients with AD but that further modifications of this algorithm or the Henriksen AD algorithm, originally established for pediatric AD, are needed, together with accompanying validation studies, to accurately identify an adult AD population using administrative data.