Key Points

Based on screening of thousands of diagnoses from nationwide Danish health registers, we identified no signals of previously unknown adverse events of TNF-α inhibitors in pediatric patients.

Surveillance of adverse events from routinely collected real-world data can complement other analyses in generating pediatric-specific drug-safety evidence.

1 Introduction

Tumor necrosis factor-alpha (TNF-α) inhibitors have revolutionized the treatment of chronic inflammatory diseases and have become increasingly common in children [1,2,3]. Previous studies in adults have found associations between TNF-α inhibitors and increased risk of adverse events, including serious infections and malignancies [4, 5]. However, extrapolation of adult data to children is not necessarily appropriate, as has been shown for infections [6]. Pediatric-specific safety evidence for TNF-α inhibitors is generally scarce.

Detection of potential adverse events after market approval is key to ensuring safe use of drugs. Signals of previously unknown adverse events can be detected when new drugs are used at a larger scale and by a wider range of patients in clinical practice. Adverse event screening can play a particularly important role in pediatrics, where output of both clinical and observational studies is low [7, 8]. To support optimal prescribing in children, there is a need for pediatric-specific safety data [9, 10].

Spontaneous reporting systems have traditionally been the leading source of timely safety data [11]. However, due to increasing availability of large amounts of secondary data, including healthcare registers, new opportunities for signal generation have emerged [12]. The use of detailed patient data that are routinely collected over time enables detection of rare adverse events and decreases the risk of reporting bias and confounding.

The aim of this data-mining study was to screen for new signals of adverse events of TNF-α inhibitors in pediatric patients with inflammatory bowel disease (IBD) or juvenile idiopathic arthritis (JIA), applying newly developed methods for adverse events data mining on nationwide Danish health registers.

2 Methods

2.1 Study Population

The study was performed based on Danish population-based registers, linked via unique personal identity numbers. The source population was defined as all individuals living in Denmark aged < 18 years at some time during the study period, 2004–2016. From the source population, we identified individuals with confirmed pediatric IBD or JIA, which was defined as at least two contacts with specialist care (inpatient or outpatient) with a physician-assigned IBD or JIA diagnosis during the study period or previously (1986–2016). These made up the study cohort of eligible individuals. See details in Supplementary Table 1 (Online Supplementary Material, OSM).

2.2 Exposure Episodes

From the study cohort, we identified episodes of follow-up of new TNF-α inhibitor use and episodes of no use of TNF-α inhibitors. New use of TNF-α inhibitors was defined as initiation of these biologics with no use within 2 years before. The TNF-α inhibitor episodes continued as long as the patient was on treatment. Treatment discontinuation was identified based on assumed duration of each drug administration (Supplementary Table 1, OSM) and an allowed gap in coverage (grace period) of a maximum of 90 days. Maximum length of follow-up was 3 years (see examples of the identification of episodes in Supplementary Fig. 1, OSM). Use of TNF-α inhibitors was defined based on procedure codes from the Danish National Patient Register (anatomical therapeutic chemical classification system [ATC] code L04AB). Biologic therapy is only administered in specialist care in Denmark and without incurring any cost for the patient [13].
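
The episode construction described above can be illustrated with a minimal Python sketch. It assumes that an episode ends when drug coverage runs out (rather than at the end of the grace period), that a year has 365 days, and that each administration covers a fixed number of days; the function name `treatment_episode` and the 56-day duration are hypothetical and not taken from the study.

```python
from datetime import date, timedelta

GRACE_DAYS = 90                 # allowed gap in coverage (from the text)
MAX_FOLLOW_UP_DAYS = 3 * 365    # 3-year cap; 365-day years are an assumption


def treatment_episode(admin_dates, duration_days):
    """Return (start, end) of the first continuous treatment episode.

    admin_dates: dates of drug administrations for one patient.
    duration_days: assumed coverage per administration (hypothetical value).
    """
    dates = sorted(admin_dates)
    start = dates[0]
    covered_until = start + timedelta(days=duration_days)
    for d in dates[1:]:
        # A later administration extends the episode only if it falls within
        # the current coverage plus the grace period.
        if d > covered_until + timedelta(days=GRACE_DAYS):
            break
        covered_until = max(covered_until, d + timedelta(days=duration_days))
    # Assumption: the episode ends when coverage runs out, capped at 3 years.
    end = min(covered_until, start + timedelta(days=MAX_FOLLOW_UP_DAYS))
    return start, end


# Third administration comes after a gap longer than 90 days, so it is ignored.
doses = [date(2010, 1, 1), date(2010, 2, 26), date(2010, 12, 1)]
start, end = treatment_episode(doses, duration_days=56)
print(start, end)
```

In this toy example the first two administrations form one episode, and the third would start a separate episode only if the new-use criterion were met again.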

Follow-up time with no exposure to TNF-α inhibitors in the last 2 years was considered no-use time. The no-use time was divided into episodes of a maximum of 3 years, which served as comparator episodes. No-use episodes were censored at initiation of TNF-α inhibitors. The episode design allowed individuals to be included in the study multiple times, as both TNF-α inhibitor and no-use episodes. All episodes were mutually exclusive; neither follow-up time nor outcome events were counted more than once.

We performed two analyses: first, a propensity score-matched analysis where TNF-α inhibitor episodes were compared with no-use episodes; second, a self-controlled analysis where temporal risk windows during follow-up were compared within TNF-α inhibitor initiators.

2.3 Propensity-Score Matching

In the propensity score-matched analysis, TNF-α inhibitor and no-use episodes were matched on underlying disease (JIA, Crohn’s disease [CD], or ulcerative colitis [UC; including unclassified IBD]) and on propensity scores. One patient could contribute multiple episodes to the analysis, but episodes in each matched pair had to come from different individuals. General potential confounders were included in the propensity-score model: demographics (age, sex), socioeconomic factors (family income and education level of parents), disease duration, drug use in the last year (oral corticosteroids and immunomodulators [thiopurines or methotrexate]), and general healthcare and drug use (number of prescription drugs, outpatient contacts, and inpatient admissions). We used a nearest-neighbor greedy matching algorithm (1:1 matching) with a caliper corresponding to two standard deviations of the log-odds of the propensity score [14]. The caliper was chosen to ensure that all TNF-α inhibitor episodes were matched and included in the analysis.
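
A minimal sketch of greedy 1:1 nearest-neighbor matching with a caliper on the log-odds of the propensity score is given below. It is an illustration under stated assumptions, not the study's SAS implementation: the dictionary keys (`id`, `person`, `disease`, `ps`), the processing order, and the choice to compute the standard deviation over the pooled log-odds are all hypothetical.

```python
import math
from statistics import stdev


def logit(p):
    """Log-odds of a propensity score strictly between 0 and 1."""
    return math.log(p / (1 - p))


def greedy_match(treated, controls, caliper_sd=2.0):
    """Greedy 1:1 nearest-neighbor matching within disease strata.

    Each episode is a dict with hypothetical keys: id, person, disease, ps.
    """
    pooled = [logit(e["ps"]) for e in treated + controls]
    caliper = caliper_sd * stdev(pooled)  # assumption: SD of pooled log-odds
    used, pairs = set(), []
    for t in treated:
        best, best_dist = None, None
        for c in controls:
            # Skip used controls, other diseases, and the same individual.
            if c["id"] in used or c["disease"] != t["disease"]:
                continue
            if c["person"] == t["person"]:
                continue
            dist = abs(logit(t["ps"]) - logit(c["ps"]))
            if dist <= caliper and (best is None or dist < best_dist):
                best, best_dist = c, dist
        if best is not None:
            used.add(best["id"])
            pairs.append((t["id"], best["id"]))
    return pairs


treated = [
    {"id": "T1", "person": "A", "disease": "JIA", "ps": 0.60},
    {"id": "T2", "person": "B", "disease": "CD", "ps": 0.40},
]
controls = [
    {"id": "C1", "person": "A", "disease": "JIA", "ps": 0.60},  # same person as T1
    {"id": "C2", "person": "C", "disease": "JIA", "ps": 0.55},
    {"id": "C3", "person": "D", "disease": "CD", "ps": 0.35},
    {"id": "C4", "person": "E", "disease": "JIA", "ps": 0.20},
]
pairs = greedy_match(treated, controls)
print(pairs)
```

Here the no-use episode C1 has the closest propensity score to T1 but is skipped because it comes from the same individual, which mirrors the restriction that matched pairs must come from different individuals.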

2.4 Eligibility and Censoring

Episodes were excluded if any of the following criteria were met at index: age ≥ 18 years, residence outside of Denmark in the last 5 years, no specialist-care contact with an IBD or JIA diagnosis in the last 3 years, or use of any biologic in the last year (see Supplementary Table 1, OSM). All patients were censored at maximum follow-up (3 years), end of study period (31 December 2016), emigration, or death. TNF-α inhibitor users were also censored at treatment discontinuation and no-use episodes were censored at initiation of TNF-α inhibitors, if any. Additionally, within the matched pairs of the propensity score-matched analysis, the episode with longer follow-up was censored at the end of follow-up for its match to make follow-up equal within matched pairs.

2.5 Adverse Events Data Mining

We screened for adverse events based on physician-assigned diagnosis codes (10th revision of the International Statistical Classification of Diseases and Related Health Problems [ICD-10]) from outpatient and inpatient visits in specialist care. All ICD-10 codes as well as groups of related codes at three higher levels were evaluated as potential adverse events: disease chapters (e.g., I00-99 Diseases of the circulatory system), disease blocks (e.g., I10-15 Hypertensive diseases), and three- to four-position codes. As such, the ICD-10 codes define a structured tree of diagnoses and each grouping is defined by a cut on that tree. Diagnoses obtained from the register were recorded at the three- and four-position levels, which also represented individual cuts (see a detailed example in Supplementary Fig. 2, OSM). Codes that were not considered relevant as potential adverse events were excluded from the analysis, for example, congenital conditions, pregnancies and other codings unlikely to be caused by drugs (see Supplementary Table 2, OSM).
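
To make the notion of cuts on the ICD-10 tree concrete, the following Python sketch lists every cut that contains a given diagnosis code. The lookup tables hold only a small illustrative subset of chapters and blocks, written in the range notation used in this article; the function names are hypothetical, and chapters that span several letters in the full classification are not handled.

```python
# Illustrative subset of ICD-10 chapters and blocks; a real analysis would
# load the full classification instead of these hand-written lists.
CHAPTERS = ["F00-99", "I00-99", "L00-99"]
BLOCKS = ["F40-48", "I10-15", "L20-30", "L40-45", "L60-75", "L80-99"]


def in_range(code, rng):
    """True if a code such as 'L209' falls in a range such as 'L20-30'."""
    letter, bounds = rng[0], rng[1:]
    low, high = bounds.split("-")
    return code[0] == letter and int(low) <= int(code[1:3]) <= int(high)


def cuts_for(code):
    """List every cut (chapter, block, 3- and 4-position code) containing code."""
    cuts = [r for r in CHAPTERS if in_range(code, r)]
    cuts += [r for r in BLOCKS if in_range(code, r)]
    cuts.append(code[:3])          # three-position level
    if len(code) > 3:
        cuts.append(code)          # four-position level, if recorded
    return cuts


print(cuts_for("F432"))  # ['F00-99', 'F40-48', 'F43', 'F432']
```

Each recorded diagnosis therefore contributes an event to several nested cuts at once, which is why all levels of granularity can be evaluated in a single analysis.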

In the propensity score-matched analysis, we screened the data for cuts with a higher incidence in the TNF-α inhibitor episodes in comparison with the no-use episodes. In the self-controlled analysis, we screened for temporal clustering of potential adverse events following initiation of TNF-α inhibitors, that is, events with higher risk during certain time windows. Hence only the TNF-α inhibitor episodes were included in the self-controlled analysis.

Only incident events were considered for the analysis. In the propensity score-matched analysis, a code was incident if it was not preceded by the same code at the three-position level (e.g., I11.0 not preceded by any code starting with I11) at any time point before index to avoid inclusion of repeated events within individuals. In the self-controlled analysis, a look-back of 3 years in relation to the date of each event was used to determine if it was incident. Hence, the look-back was constant over follow-up. All events were analyzed but signals based on fewer than three exposed events could not be presented due to Danish data protection legislation.
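
The two incidence definitions can be sketched with one helper function. This is an illustration under assumptions: the function name `is_incident`, the record layout, and the use of 365-day years for the look-back are hypothetical. For the propensity score-matched analysis the reference date would be the index date with no look-back limit; for the self-controlled analysis it would be the event date with a 3-year look-back.

```python
from datetime import date


def is_incident(code, reference_date, prior_records, lookback_days=None):
    """Return True if no earlier record shares the three-position prefix.

    prior_records: iterable of (icd10_code, date) for one individual.
    lookback_days: None means any time before reference_date.
    """
    prefix = code[:3]
    for prior_code, prior_date in prior_records:
        if prior_code[:3] != prefix or prior_date >= reference_date:
            continue
        age_days = (reference_date - prior_date).days
        if lookback_days is None or age_days <= lookback_days:
            return False
    return True


history = [("I119", date(2005, 3, 1))]

# Unlimited look-back: I11.0 is preceded by another I11 code, so not incident.
print(is_incident("I110", date(2012, 6, 1), history))
# 3-year look-back: the earlier I11 record is too old to count.
print(is_incident("I110", date(2012, 6, 1), history, lookback_days=3 * 365))
```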

2.6 Tree-Based Scan Statistics

To identify cuts with a higher incidence we used tree-based scan statistics, which are disproportionality statistics that adjust for multiple testing and that allow for simultaneous testing of diagnosis codes at all levels of granularity, that is, all cuts on the ICD-10 tree [15]. We screened for potential adverse events in the propensity score-matched analysis using the unconditional Bernoulli model [16, 17]. Exposure was assumed to follow a Bernoulli probability distribution. Under the null hypothesis and given the 1:1 matching ratio, events in all cuts were equally probable (probability = 0.5) to occur during TNF-α inhibitor episodes as during no-use episodes. The alternative hypothesis was that events in at least one cut had a higher risk (probability > 0.5) of occurring during TNF-α inhibitor episodes.
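
The comparison within a single cut can be made concrete with a log likelihood ratio. One form commonly given for a Bernoulli scan statistic with a fixed null probability is sketched below, where c is the number of events in the cut that occurred during TNF-α inhibitor episodes, n is the total number of events in the cut, and p is the null probability. The symbols c, n, and p are notation introduced here for illustration, and the exact expression implemented in the software is not stated in this article.

```latex
\mathrm{LLR}(c, n) =
\begin{cases}
  c \ln\dfrac{c}{n} + (n - c)\ln\dfrac{n - c}{n}
    - c \ln p - (n - c)\ln(1 - p), & \text{if } \dfrac{c}{n} > p,\\[1.5ex]
  0, & \text{otherwise,}
\end{cases}
\qquad p = 0.5 .
```

Under this form, a cut with 34 events during TNF-α inhibitor episodes and 8 during no-use episodes (c = 34, n = 42) would give LLR = 34 ln(34/42) + 8 ln(8/42) + 42 ln 2 ≈ 8.66, whereas any cut with c/n ≤ 0.5 contributes 0 because only excess risk in the exposed episodes is of interest.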

In the self-controlled analysis, we used the conditional tree-temporal scan statistic [18, 19]. The analysis was conditioned on the total number of events over follow-up in each cut. Under the null hypothesis, events were uniformly distributed over follow-up. The alternative hypothesis was that there was at least one cut where the risk was higher in at least one of the analyzed risk windows. An advantage of this method is that no predefined risk windows are needed; temporal screening is performed over the entire follow-up period. We analyzed all unique temporal risk windows of 2 days to 1.5 years that fit within the maximum follow-up of 3 years (maximum window length was half of maximum follow-up). No window was shorter than 20% of the follow-up day on which it ended (e.g., a window that ended on day 100 was 20 days or longer) to avoid analyzing short risk windows a long time after drug initiation.
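
The window restrictions can be written as a short check, sketched below in Python. The day numbering (day 1 as the first day of follow-up, both ends inclusive), the 365-day year, and the function names are assumptions made for illustration.

```python
MAX_FOLLOW_UP = 3 * 365  # days; 365-day years are an assumption


def is_candidate_window(start, end, max_follow_up=MAX_FOLLOW_UP):
    """Check one risk window covering follow-up days start..end (inclusive)."""
    length = end - start + 1
    return (
        1 <= start <= end <= max_follow_up
        and 2 <= length <= max_follow_up // 2  # 2 days up to half of follow-up
        and 5 * length >= end                  # length at least 20% of end day
    )


def candidate_windows(max_follow_up=MAX_FOLLOW_UP):
    """Yield every (start, end) pair that passes the checks above."""
    for end in range(1, max_follow_up + 1):
        for start in range(1, end + 1):
            if is_candidate_window(start, end, max_follow_up):
                yield start, end


print(is_candidate_window(81, 100))  # 20-day window ending on day 100
print(is_candidate_window(82, 100))  # 19-day window ending on day 100
```

The 20% rule is expressed with integer arithmetic (`5 * length >= end`) to avoid floating-point rounding at the boundary.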

Log likelihood ratios (LLRs) were calculated for each cut in the propensity score-matched analysis and for each cut-risk window in the self-controlled analysis. Inference was based on Monte Carlo simulation because there is no simple expression for the sampling distribution of the LLRs [20]. p values were obtained for each analysis by ranking the LLRs of the most likely cuts in relation to maximum LLRs simulated under the null hypotheses. Cuts with a p value below 0.05 were considered significant. The analysis was performed with the free TreeScan v1.4 software (https://www.treescan.org) and SAS v9.4 (SAS Institute Inc.).
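
The Monte Carlo ranking step can be sketched for the matched analysis as follows. This is a simplified illustration, not the TreeScan implementation: it assumes a Bernoulli log likelihood ratio with null probability 0.5 as the test statistic, and the function names, the toy data, and the choice of 999 replicates are hypothetical. Because the observed statistic is compared with the maximum over all cuts in each simulated data set, the resulting p value accounts for the many cuts tested.

```python
import math
import random


def bernoulli_llr(c, n, p=0.5):
    """LLR for one cut: c exposed events out of n; zero unless c/n exceeds p."""
    if n == 0 or c / n <= p:
        return 0.0
    llr = c * math.log(c / n) - c * math.log(p) - (n - c) * math.log(1 - p)
    if c < n:  # skip the term that would need log(0) when all events are exposed
        llr += (n - c) * math.log((n - c) / n)
    return llr


def max_llr(events):
    """events: list of (cuts, exposed); return the largest LLR over all cuts."""
    total, exposed_total = {}, {}
    for cuts, exposed in events:
        for cut in cuts:
            total[cut] = total.get(cut, 0) + 1
            exposed_total[cut] = exposed_total.get(cut, 0) + int(exposed)
    return max(bernoulli_llr(exposed_total[k], total[k]) for k in total)


def monte_carlo_p(events, n_sim=999, seed=0):
    """Rank the observed maximum LLR among maxima simulated under the null."""
    rng = random.Random(seed)
    observed = max_llr(events)
    at_least = 0
    for _ in range(n_sim):
        # Null hypothesis: each event is exposed with probability 0.5.
        simulated = [(cuts, rng.random() < 0.5) for cuts, _ in events]
        if max_llr(simulated) >= observed:
            at_least += 1
    return (at_least + 1) / (n_sim + 1)


# Toy data: 10 exposed and 10 unexposed events in the same two nested cuts.
balanced = [(("L00-99", "L20-30"), i < 10) for i in range(20)]
print(monte_carlo_p(balanced))  # 1.0, since the observed maximum LLR is 0
```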

3 Results

3.1 Episode Characteristics

During the study period, 1310 new users of TNF-α inhibitors were identified. Following 1:1 propensity-score matching, a cohort of 1310 pairs of TNF-α inhibitor episodes and no-use episodes was included. Episodes were well balanced on all variables, despite the large caliper used in the matching algorithm (Table 1). Of the TNF-α inhibitor episodes, 59% were female and mean (SD) age was 13.4 (4.0) years. The indication for TNF-α inhibitor use was JIA in 51% of the episodes, CD in 35%, and UC in 14%. Episodes were censored at the shortest length of follow-up within the matched pairs. Mean (SD) length of follow-up was 1.0 (0.9) years. For the self-controlled tree-temporal analysis, 1310 episodes of new TNF-α inhibitor use were included. The mean (SD) length of follow-up for these episodes was 1.2 (0.9) years.

Table 1 Characteristics of episodes of tumor necrosis factor alpha (TNF-α) inhibitor use and no use included in unmatched and propensity score-matched cohorts

3.2 Propensity Score-Matched Analysis

In the propensity score-matched cohort, 1284 incident, unique cuts of the ICD-10 tree were recorded during follow-up among all episodes. There were five cuts with a significantly higher number of events in the TNF-α inhibitor episodes in comparison with the no-use episodes (Table 2). Two of the cuts were dermatologic: ICD-10 chapter Diseases of the skin and subcutaneous tissue (L00-99; 87 vs. 44 events; risk difference [RD] 3.3%; p value 0.017) and the related sub-branch, Dermatitis and eczema (L20-30; 34 vs. 8 events; RD 2.0%; p value 0.004). For context, the excess number of events in the chapter L00-99 was also driven by disorders of skin appendages (L60-75; 28 vs. 10 events; p value 0.39), papulosquamous disorders (L40-45; 10 vs. < 3 events; p value 0.43), and other disorders of the skin and subcutaneous tissue (L80-99; 13 vs. 10 events; p value 1.00) (Supplementary Table 3, OSM). The other three significant cuts were ICD-10 block Anxiety, dissociative, stress-related, somatoform, and other nonpsychotic mental disorders (F40-48; 39 vs. 11 events; RD 2.1%; p value 0.007), Reaction to severe stress and adjustment disorders (F43; 35 vs. 9 events; RD 2.0%; p value 0.008), and Adjustment disorders (F432; 33 vs. 7 events; RD 2.0%; p value 0.002) (Table 2 and Supplementary Table 3, OSM).
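
The reported risk differences can be reproduced from the event counts with simple arithmetic, as sketched below. The sketch assumes that risk is the number of events divided by the 1310 episodes in each arm, with each event belonging to a distinct episode; the denominator is not stated explicitly in the text, and the function name is hypothetical.

```python
N_EPISODES = 1310  # episodes per arm after 1:1 matching (from the text)

# (events in TNF-alpha inhibitor episodes, events in no-use episodes)
cuts = {
    "L00-99": (87, 44),
    "L20-30": (34, 8),
    "F40-48": (39, 11),
    "F43": (35, 9),
    "F432": (33, 7),
}


def risk_difference_pct(exposed_events, comparator_events, n=N_EPISODES):
    """Risk difference in percentage points, one event per episode assumed."""
    return 100 * (exposed_events - comparator_events) / n


for cut, (a, b) in cuts.items():
    print(cut, round(risk_difference_pct(a, b), 1))
```

Under this assumption the rounded values are 3.3, 2.0, 2.1, 2.0, and 2.0 percentage points, which agree with the risk differences reported for the five significant cuts.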

Table 2 Cuts on the ICD-10 tree with a significantly high risk in tumor necrosis factor-alpha (TNF-α)-inhibitor episodes as compared with no-use episodes from the propensity score-matched analysis

3.3 Self-Controlled Analysis

The self-controlled analysis was performed on the TNF-α inhibitor episodes. In total, 1036 unique cuts with incident events during these episodes were identified. No combinations of cuts and risk windows with significantly high incidence were identified. Hence, there were no signals of events with temporal clustering during follow-up.

4 Discussion

In this data-mining study of adverse events of TNF-α inhibitors in pediatric patients based on the nationwide Danish population, we found no signals of previously unknown adverse events. A signal of dermatologic complications that has been previously described in adults and children was detected, including excess cases of diseases of the skin and subcutaneous tissue, and dermatitis and eczema [21,22,23,24,25,26]. A detected signal of psychiatric diagnoses of anxiety, dissociative, stress-related, somatoform, and other nonpsychotic mental disorders, including reaction to severe stress and adjustment disorders, was likely associated with the underlying disease and its severity, rather than with the treatment. The study shows the utility and advantages of newly developed methods for adverse event data mining to generate safety information that is specific to children based on Scandinavian health registers.

Previous studies have described dermatologic adverse events of TNF-α inhibitor use. In particular, studies have described that new-onset psoriasis is a paradoxical adverse event of TNF-α inhibitors in patients with rheumatic disease and IBD. In adult IBD, dermatologic events have been recorded in 21–29% (sample size n = 583–732) of patients initiating TNF-α inhibitors, where median follow-up was 3–4.4 years [21, 22]. Psoriasis and cutaneous infections were the most common manifestations. In pediatric patients, one study found the risk of dermatologic events to be 11% (n = 409), with psoriasis, infections, and eczema being the most common diagnoses [23]. A small pediatric case series estimated the risk at 48% (n = 84), where half of the patients with events had lesions that were considered severe [26]. The risk of new-onset psoriasis among pediatric TNF-α inhibitor users has been estimated at 8–14% (n = 73–409) [23,24,25]. In our analysis, 6.6% of TNF-α inhibitor episodes had at least one incident event in the chapter Diseases of the skin and subcutaneous tissue (L00-99) and the risk difference in comparison with no use was 3.3%.

The previous pediatric studies are one-arm case series or cohort studies that do not estimate the risk in relation to non-treated patients, that is, the relative risk or risk difference. To inform clinical practice about the potential dermatologic risks in pediatric patients, pharmacoepidemiologic studies in large, unselected populations with suitable comparators are needed, since clinical trials with sufficient power are unlikely to be conducted.

Our analysis also generated a signal of adjustment disorders (F432), which is part of reaction to severe stress and adjustment disorders (F43), and the ICD-10 block Anxiety, dissociative, stress-related, somatoform and other nonpsychotic mental disorders (F40-48). A plausible interpretation is that the signal reflects an association with the burden of underlying severe disease in general, rather than the pharmacologic effect of TNF-α inhibitors. A recent study showed that the risk of related conditions is higher in pediatric IBD patients in comparison with the general population: hazard ratios were 1.6 for mood disorders (427 events) and 1.5 for anxiety disorders (673 events), although the study did not investigate whether disease severity is a risk factor [27].

Key strengths of this study were the generalizability and large sample of pediatric patients analyzed through routinely collected healthcare data on non-selected TNF-α inhibitor initiators from the national Danish population. Patients were identified during a study period of 13 years and followed for 1 year on average. The data sources were also granular and comprehensive enough to allow for robust confounding control and to identify a large range of potential adverse events. However, null findings in this type of hypothesis-generating study cannot be interpreted as an absence of adverse events. Insufficient power or too scattered recording of certain diagnoses can lead to non-significant clusters.

Our use of the recently developed tree-based scan statistics enabled scanning for clusters of events at multiple levels of diagnosis granularity and for temporal clustering in relation to drug initiation, while simultaneously adjusting for multiple testing to generate valid p values. The self-controlled and propensity score-matched analyses complemented each other. By performing both, we were able to capture signals of potential adverse events based on both temporally increased incidence and generally increased incidence in comparison with matched, no-use episodes. The absence of a priori defined potential adverse events and risk windows was a strength of the analysis.

A potential limitation was residual confounding. The propensity score-matched analysis was susceptible to confounding by indication and the self-controlled analysis to time-dependent confounding within TNF-α inhibitor users. In the propensity score-matched analysis, we adjusted for the general potential risk factors age, sex, underlying disease, disease duration, treatment history, and general healthcare and drug use. Given that disease severity, which is generally higher in TNF-α inhibitor users, is positively associated with the risk of many potential adverse events, it was unlikely that residual confounding by indication resulted in false negatives.

We chose not to use alternative study designs that might have further decreased confounding, including active comparator new user and prevalent new user designs [28, 29], because these designs would have required excluding a large number of TNF-α inhibitor users and would have reduced power. Given the hypothesis-generating aim of the study, we prioritized analyzing all TNF-α inhibitor initiators during the study period. As for all adverse event data-mining studies, we analyzed a large set of potential outcomes simultaneously and we did not adjust for specific risk factors in relation to each outcome. The included factors represent key confounders in relation to most types of outcomes, either as themselves or as proxies for other factors. The aim of the analysis was to detect signals of potential adverse events, rather than to infer causality between drug and outcomes. Nonetheless, robust confounding control increases the validity of the results.

Another potential limitation of the study was misclassification of exposure or outcomes, which can lead to biased results. All TNF-α inhibitor treatment is administered in specialist care in Denmark, and coverage in the national patient register is considered to be high [13, 30]. A previous study has validated the use of diagnosis codes from the Danish national patient register to detect outcomes [31]. We did not have access to general practice records, which meant that adverse events only diagnosed in the primary-care setting could not be detected in the analysis. However, children with serious and chronic disease, such as IBD and JIA, are cared for almost exclusively in specialist care.

5 Conclusions

This adverse event-screening study identified no previously unknown adverse events of TNF-α inhibitors in pediatric patients. The study also showed how newly developed methods for health-register screening can provide comprehensive and relevant adverse event signal detection. In pediatric patient groups where data are scarce, this approach can complement other types of studies in generating drug-safety evidence.