Key Points

In vaccine safety, background incidence rates are key to proper monitoring and assessment.

Between 2003 and 2014, 148,947 new cases of nine autoimmune diseases were identified in seven European healthcare databases from four countries.

Incidence rates were highest for Bell’s palsy and lowest for Kawasaki disease. Specific patterns were observed by sex, age, calendar time, and data sources.

1 Introduction

The Accelerated Development of VAccine beNefit-risk Collaboration in Europe (ADVANCE) was a public–private consortium launched by the Innovative Medicines Initiative in 2013 to bring together stakeholders (i.e. regulators, academics, and vaccine manufacturers) actively involved in the postmarketing monitoring of benefits and risks (B/R) of vaccines [1]. The aim of the ADVANCE project was to build an efficient system to generate robust evidence on background rates and vaccine coverage and, ultimately, to rapidly assess the B/R of vaccines using existing healthcare databases in Europe. ADVANCE has transitioned to the Vaccine Monitoring Collaboration for Europe, which will implement the ecosystem [2]. In that context, several tools and methods have been developed to standardize ways of working among selected European healthcare databases. A description of the system and the methods/workflows can be found in the article by Sturkenboom et al. [1, 3].

With the entry of new vaccines to the market and their use on a large scale, rare adverse events not detected during clinical development phases may occur. Large sample sizes are required to rapidly evaluate suspected causal associations between rare adverse events such as autoimmune diseases and vaccines in a real-world setting. Preparedness to investigate safety signals and safety concerns is a requirement of vaccination programs stipulated in the Vaccine Safety Blueprint [4]. Based on a stakeholder analysis in Europe, background rates are important from a regulatory, manufacturer, and public health perspective [1]. Because of the mode of action of vaccines and the fact that adjuvants, which stimulate immune response, may be used, autoimmune diseases are often events of interest to monitor and investigate. This is especially relevant considering that they have age-related patterns of onset that may coincide with age at vaccination. Moreover, autoimmune diseases are rare and, because of the possible impact of environmental factors on their occurrence [5, 6], there is a constant need to generate up-to-date background incidence rates (IRs). As part of being prepared to respond to signals, background rates are a crucial source of information in the assessment of suspected cases, especially during mass vaccination campaigns [7] or for continuous safety monitoring of vaccines in a growing recipient population [8].

As part of the database characterization efforts of the ADVANCE project, we estimated background IRs of nine autoimmune diseases. We described and sought to explain heterogeneity among data sources (e.g. hospital-based and/or primary care-based outcomes), and compared our estimates with external published data [9].

2 Methods

2.1 Setting

The ADVANCE project had access to 20 different data sources, seven of which could be used in this assessment, representing four countries—Denmark, Spain, Italy, and the UK (Table 1). Detailed descriptions of these databases can be found in the electronic supplementary file.

Table 1 Database characteristics

All participating data sources extracted study data into a common data model (CDM). As described by Sturkenboom et al. [10], the CDM comprises three data files—population, events and vaccinations.

2.2 Population

The source population comprised all persons registered with at least 1 year of data prior to the start of the study period or follow-up from birth. Data for all individuals recorded in each database from the start of follow-up (defined as birth or first data availability, whichever was latest) until the end of follow-up (defined as the date at last data retrieval, leaving the database, the date of first event, or death, whichever date was earliest), were used to define the follow-up for database characterization. The only eligibility criteria were that the date of birth, start and end follow-up dates, and sex needed to be available. The study start date varied between databases, depending on when the database collection started, and ended in 2017 for all databases. Data access providers (DAPs) created a population file in the format of the CDM including patient identifier, start of follow-up date, end of follow-up date, birth date, and sex.
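
The follow-up rule above can be sketched in a few lines of Python. This is a minimal illustration, not the Jerboa implementation: the record layout, the field names (`first_data_date`, `last_retrieval_date`, `exit_date`, and so on) and the 365.25-day year are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class PersonRecord:
    # Hypothetical field names; the CDM population file is only described in prose.
    patient_id: str
    birth_date: date
    first_data_date: date                    # first data availability in the database
    last_retrieval_date: date                # date of last data retrieval
    exit_date: Optional[date] = None         # date the person left the database
    first_event_date: Optional[date] = None  # first event of interest
    death_date: Optional[date] = None


def follow_up_window(p: PersonRecord) -> Tuple[date, date]:
    """Return (start, end) of follow-up for one person."""
    # Start: birth or first data availability, whichever is latest.
    start = max(p.birth_date, p.first_data_date)
    # End: earliest of the end-of-follow-up dates that are actually recorded.
    candidates = [p.last_retrieval_date, p.exit_date, p.first_event_date, p.death_date]
    end = min(d for d in candidates if d is not None)
    return start, end


def person_years(start: date, end: date) -> float:
    """Person-time in years, assuming 365.25 days per year."""
    return max((end - start).days, 0) / 365.25
```

Because the first event is one of the end dates, each person contributes person-time only until their first event for the outcome being analyzed, so the window has to be computed separately for each outcome.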

2.3 Events

The autoimmune diseases of interest were acute disseminated encephalomyelitis (ADEM), Bell’s palsy, Guillain–Barré syndrome (GBS), immune thrombocytopenic purpura (ITP), Kawasaki disease, optic neuritis, narcolepsy, systemic lupus erythematosus (SLE), and transverse myelitis. The outcomes were defined using definitions from the Brighton Collaboration and learned societies, the World Health Organization, or the European Centre for Disease Prevention and Control. The case definitions were mapped to an initial list of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), ICD Tenth Revision (ICD-10), Read, and International Classification of Primary Care (ICPC) codes using the ADVANCE Code mapper tool [11]. DAPs for each database were asked to modify and verify the proposed codes based on local coding habits and prior experience. Each DAP extracted the final list of codes for the specific events in their local terminology and transformed the data into the event file of the CDM containing the following fields: patient identifier, event type, date, and original code (ICD-9/10, Read, ICPC, or text). The event file was linked to the population file to calculate event IRs and to assess whether these rates were as expected by benchmarking rates within the data source, between data sources, and against published data. This assessment allowed us to demonstrate the appropriateness of the data processing steps used. The code list for each outcome of interest is available in electronic supplementary Table S1. The ITP condition was defined according to narrow and broad concepts. Details on the harmonization process for data extraction are described elsewhere [10].
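
As an illustration of linking the event file to the population file, the sketch below keeps, for each person and event type, the earliest event that falls inside that person's follow-up. The `Event` tuple mirrors the four event-file fields listed above, but its attribute names, and the representation of follow-up as a dictionary of (start, end) dates, are assumptions; run-in handling is deliberately left out.

```python
from datetime import date
from typing import Dict, List, NamedTuple, Tuple


class Event(NamedTuple):
    # Attribute names are illustrative, not the CDM column names.
    patient_id: str
    event_type: str
    event_date: date
    original_code: str


def first_events_in_follow_up(
    events: List[Event],
    follow_up: Dict[str, Tuple[date, date]],
) -> Dict[Tuple[str, str], Event]:
    """Keep, per person and event type, the earliest event inside follow-up."""
    first: Dict[Tuple[str, str], Event] = {}
    for ev in events:
        window = follow_up.get(ev.patient_id)
        if window is None:
            continue  # event for a person absent from the population file
        start, end = window
        if not (start <= ev.event_date <= end):
            continue  # event outside the person's follow-up
        key = (ev.patient_id, ev.event_type)
        if key not in first or ev.event_date < first[key].event_date:
            first[key] = ev
    return first
```

Counting the returned events by event type gives the numerators, and the follow-up windows give the person-time denominators.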

2.4 Data Management and Analyses

The DAPs extracted data from their database using the local data format and software, and transformed the data into the ADVANCE CDM (CSV format). We used Jerboa, a Java-based data processing software, for event code counting and incidence calculations. The Jerboa software has been used for multiple studies and is freely available. The script and instructions were sent to the DAPs, who ran the script against their input files, and the outputs were sent through a secure file transfer protocol (FileZilla or Hightail) to a private remote research environment (PRRE) [10].

The event characterization included code counts by type of event and database, and event IRs in the population by calendar year, sex, and age. Age was categorized per year until 17 years, from 18–24 years, and then in 5-year categories. We subsequently categorized age in 0–1, 2–4, 5–14, 15–24, 25–64, and ≥ 65 years for description, as this coincides with age of routine vaccination in general and because this categorization was compatible with the Post-licensure Rapid Immunization Safety Monitoring programme (PRISM) [9] US database age categories, allowing for age-specific comparisons of IRs between the US and European networks. For the incidence estimates calculated with Jerboa, there was a 1-year run-in period for individuals aged 6 months onward; individuals with an entry date within 6 months of birth started their follow-up at birth. Events recorded in the 1-year run-in prior to the start of follow-up were not considered and only first events recorded after the run-in period were considered to be incident. To have a comparable period of calendar time across databases, IRs were limited to the calendar years 2003–2014. Healthcare databases were classified according to the type of data sources: general practitioner databases, including Base de Datos para la Investigación Farmacoepidemiológica en Atención Primaria (BIFAP), The Health Improvement Network (THIN), Royal College of General Practitioners (RCGP) Research and Surveillance Centre (RSC), and Pedianet; and hospitalization record linkage databases, including Aarhus University Hospital (AUH)/Statens Serum Institut (SSI), Agenzia regionale di sanità (ARS), and Val Padana. We calculated crude IRs as the number of incident events within the follow-up period divided by the total person-time at risk and 95% confidence intervals (CIs) using the exact method for each event. IRs were expressed per 100,000 person-years (PYs).
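
The crude IR and its exact CI can be sketched as follows. The text does not say which exact method was used; the sketch assumes the usual exact Poisson (Garwood) interval and finds the limits by bisection on the Poisson distribution function, using only the standard library. All function names are illustrative.

```python
import math
from typing import Callable, Tuple


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu), with each term computed in log space."""
    if k < 0:
        return 0.0
    if mu == 0.0:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        total += math.exp(-mu + i * math.log(mu) - math.lgamma(i + 1))
    return min(total, 1.0)


def _bisect(f: Callable[[float], float], lo: float, hi: float, iters: int = 80) -> float:
    """Root of a monotone function that changes sign on [lo, hi]."""
    lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if (f(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_poisson_ci(events: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact (Garwood) confidence limits for a Poisson count."""
    # Lower limit: the mean at which P(X >= events) equals alpha / 2.
    if events == 0:
        lower = 0.0
    else:
        lower = _bisect(
            lambda mu: (1.0 - poisson_cdf(events - 1, mu)) - alpha / 2.0,
            0.0, float(events),
        )
    # Upper limit: the mean at which P(X <= events) equals alpha / 2.
    search_max = events + 10.0 * math.sqrt(events + 1.0) + 20.0
    upper = _bisect(
        lambda mu: poisson_cdf(events, mu) - alpha / 2.0,
        float(events), search_max,
    )
    return lower, upper


def incidence_rate(events: int, person_years: float,
                   per: float = 100_000.0) -> Tuple[float, float, float]:
    """Crude rate and exact 95% limits, expressed per `per` person-years."""
    lower, upper = exact_poisson_ci(events)
    scale = per / person_years
    return events * scale, lower * scale, upper * scale
```

For example, `incidence_rate(10, 1_000_000)` corresponds to 10 events in one million PYs. For counts in the tens of thousands the summation becomes slow, and a chi-square quantile function from a statistics library gives the same limits more efficiently.
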
We also computed yearly pooled IRs for each autoimmune disease to compare the type of data sources (general practitioners vs. hospitalization record linkage) by using a random-effects model (DerSimonian–Laird method). Higgins I² statistics were calculated to quantify heterogeneity between the types of data sources. After observing higher rates of narcolepsy in AUH/SSI, we conducted a post hoc analysis to estimate age-stratified IRs of narcolepsy in Denmark over the study period. Data handling and computation of rates were performed in SAS 9.4 (SAS Institute Inc., Cary, NC, USA) and meta-analyses were conducted in Stata v14.0 (StataCorp LLC, College Station, TX, USA).
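
A minimal sketch of DerSimonian–Laird pooling with the I² statistic is given below. The analysis itself was run in Stata; this Python version only illustrates the calculation. It assumes that rates are pooled on the log scale and that the variance of a log rate is approximated by 1/events, neither of which is stated in the text.

```python
import math
from typing import List, Tuple


def dersimonian_laird(
    strata: List[Tuple[int, float]], per: float = 100_000.0
) -> Tuple[float, float, float, float]:
    """Pool incidence rates from several data sources on the log scale.

    strata: (events, person_years) per data source; both must be positive.
    Returns (pooled rate, lower 95% limit, upper 95% limit, I2 in percent).
    """
    if len(strata) < 2:
        raise ValueError("need at least two data sources")
    if any(e <= 0 or py <= 0 for e, py in strata):
        raise ValueError("events and person-years must be positive")

    y = [math.log(e / py) for e, py in strata]  # log rates
    v = [1.0 / e for e, _ in strata]            # assumed variance of each log rate
    w = [1.0 / vi for vi in v]                  # fixed-effect weights

    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(strata) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)               # between-source variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    w_re = [1.0 / (vi + tau2) for vi in v]      # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    z = 1.959963984540054                       # 97.5th percentile of the standard normal
    return (
        math.exp(pooled) * per,
        math.exp(pooled - z * se) * per,
        math.exp(pooled + z * se) * per,
        i2,
    )
```

One call per disease, calendar year, and type of data source yields the yearly pooled IRs that are then compared between general practitioner and record linkage databases.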

3 Results

Over the period 2003 to 2014, the total person-time of follow-up was more than 233 million PYs for the seven European healthcare databases. The largest contributions in follow-up were from AUH/SSI databases (30.9%), THIN (27.0%), and ARS (20.0%) (Table 2). The population aged between 15 and 64 years accounted for most of the person-time in each database, except for Pedianet, which only captures the pediatric population. Between 2003 and 2014, there were 148,947 incident cases of nine predefined autoimmune diseases. Of the nine individual autoimmune diseases, the crude IR of Bell’s palsy was the highest (23.8/100,000 PYs, 95% CI 23.6–24.1), followed by ITP broad definition (21.7/100,000 PYs, 95% CI 21.6–22.0), SLE (5.3/100,000 PYs, 95% CI 5.2–5.4), ADEM (5.3/100,000 PYs, 95% CI 5.2–5.3), ITP narrow definition (3.8/100,000 PYs, 95% CI 3.7–3.9), optic neuritis (3.4/100,000 PYs, 95% CI 3.3–3.5), GBS (2.1/100,000 PYs, 95% CI 2.0–2.1), narcolepsy (1.1/100,000 PYs, 95% CI 1.0–1.1), transverse myelitis (1.0/100,000 PYs, 95% CI 0.9–1.0), and Kawasaki disease (0.7/100,000 PYs, 95% CI 0.6–0.7). The sex-specific crude IRs of several autoimmune diseases were higher in females than in males (Table 3), and the most pronounced difference was for SLE, with an IR of 8.5/100,000 PYs in females and 2.1/100,000 PYs in males. For each database, age- and sex-specific crude IRs are presented in electronic supplementary Table S2.

Table 2 Follow-up duration and number of autoimmune events for each database over the period 2003–2014
Table 3 Crude incidence rates (/100,000 PYs) per sex for each autoimmune disease

3.1 Age-Stratified Incidence Rates Per Database

Overall and age-stratified IRs are presented in Table 4. We observed that the age patterns differed across autoimmune diseases: IRs increased with increasing age for Bell’s palsy, GBS, and SLE. The narrow definition of ITP showed the highest rates in the 0–4 years age group. This rate decreased in children aged between 5 and 24 years, and increased by age from the age of 25 years. A similar pattern with a higher magnitude of rates was observed using the ITP broad definition. In the elderly (≥ 65 years), IRs ranged between 22 and 64/100,000 PYs, except in BIFAP, where IRs peaked at 130/100,000 PYs. IRs for narcolepsy were low (≤ 1/100,000 PYs), but slightly higher rates were observed in the Danish database. In Denmark, the IR for narcolepsy was as high as 3.1/100,000 PYs in the 15–24 age group. A specific analysis of this age group per calendar year in the AUH/SSI database showed that IRs increased at the beginning of the study period and tended to level out during the period 2008–2012, potentially followed by a slight increase towards the end of the study period (Fig. 1). The pattern of IRs for optic neuritis was similar across databases, increasing by age and peaking in the 25–44 years age group, except in the BIFAP database, where a constant increase by age was observed. Although no clear pattern was observed for ADEM, IRs peaked in the 25–44 years age group in both the record linkage Italian databases (ARS and ASCLR). The pattern of IRs for Kawasaki disease was similar across databases, with most of the events occurring before the age of 14 years. IRs for transverse myelitis varied from 0.0 to 2.2/100,000 PYs; no events were reported in the BIFAP and Pedianet databases.

Table 4 Crude incidence rates (/100,000 PYs) for each autoimmune disease per age groups and databases over the period 2003–2014
Fig. 1 Incidence rates for narcolepsy in the AUH/SSI database, per age group and calendar year. IR incidence rate, PY person-years, CI confidence interval, AUH/SSI Aarhus University Hospital/Statens Serum Institut

3.2 Incidence Rates Over Calendar Years According to the Type of Data Sources

Yearly pooled IRs of autoimmune diseases were stable over time but differed by type of data source for some diseases (electronic supplementary Fig. S1). IRs of ADEM and GBS were higher in hospital-based record linkage databases than in primary care databases. On the contrary, IRs of Bell’s palsy, ITP narrow, Kawasaki, SLE, and transverse myelitis were higher in primary care databases.

4 Discussion

In this study, we estimated age-, sex-, and calendar time-specific background rates of nine autoimmune diseases of interest for vaccine safety assessment from seven European electronic healthcare databases. We demonstrated that the ADVANCE system could detect age-specific patterns and differences in IRs by the origin of information (e.g. hospital or general practitioners) as well as sex. IRs were fairly stable over time for each disease, suggesting that identification or recording was not modified during the study period. The age-dependent patterns are important to know for the calculation of observed versus expected cases, as some of the age categories in which rates increase coincide with the age of vaccination. The ADVANCE tools allowed for rapid estimation of the rates by age, calendar time, and sex. Overall, IRs from the ADVANCE system were of a lower magnitude than rates generated through the US PRISM system, which covers claims-based diagnoses from outpatients, emergency units, and hospitalization. Age-specific patterns were similar for most of the autoimmune diseases, i.e. ADEM, Bell’s palsy, GBS, narcolepsy, optic neuritis, SLE, and transverse myelitis. IRs for ITP narrow definition matched rates from the US PRISM system more closely than those for the ITP broad definition. For both systems, PRISM and ADVANCE, we observed the highest rates for Kawasaki disease in children < 4 years of age. The female predominance in SLE is also consistent with recent published literature [12], with the female:male ratio for SLE ranging from 4:1 to 9:1, which is aligned with our observation (4:1). In all databases, IRs for optic neuritis peaked between the ages of 25 and 44 years, decreasing thereafter, except in BIFAP, where we observed a constant increase by age. Estimates of the incidence of optic neuritis have been published from Barcelona [13], another region in Spain for which data are not captured in BIFAP.
The data from Barcelona also confirmed the peak of IRs for optic neuritis in the 20–40 years age group over the period from 2008 to 2012. The reason for this variation in rates for optic neuritis between BIFAP and the other databases in ADVANCE is unknown. The ICPC code that was used is specific for optic neuritis, but this code may be used in clinical practice to code suspected conditions as a reason for referral to specialists, allowing for testing, diagnosis and confirmation. IRs for narcolepsy were low and stable over time (≤ 1/100,000 PYs), except in Denmark, where the rate of narcolepsy diagnosis was slightly elevated and showed periods with increases in persons between 15 and 24 years of age. However, an increase in the incidence of narcolepsy in Denmark was previously observed, and happened prior to the administration of the influenza A(H1N1)pdm09 pandemic vaccine, which has been associated with increases in the IR of narcolepsy in Finland, Norway, Ireland, and Sweden [14, 15], but not in countries with low vaccine coverage [16].

Comparisons of our data with the US PRISM system showed similar age patterns in IRs [10]. Rates from PRISM, which is based on US claims data, were generally higher than the rates we observed in Europe. This may have several causes: coverage of outpatient specialist diagnoses, inclusion of prevalent cases, generally higher disease rates, or care-seeking behavior. With regard to European published data, strong similarities in rate patterns have been observed for most of the diseases, such as Bell’s palsy or GBS [7], Kawasaki disease [17, 18] or narcolepsy [16]. Nevertheless, no direct comparison could be made for several reasons: non-overlapping age strata, different ascertainment methods, diverse sources of data, and different geographical locations. Overall, this benchmark provides reassurance about external validity.

We demonstrated that all the participating databases provide crude rates consistent with expectations. However, our pooled crude rates should be interpreted with caution because they were not adjusted for any relevant covariates, nor were they weighted by the data sources with the largest person-time contribution, and should only be used in the context of each individual DAP’s results. Misclassification of incidence as prevalence may occur due to differences in health care provision, as some diagnoses are made in primary care whereas others may lead to hospitalization, and most of the databases do not capture all health care sites. Our analysis by type of data source highlights the specific process of diagnosis of autoimmune diseases. Quantifying these differences is important when designing a specific study, and may profit from the component strategy introduced in the ADVANCE project for this purpose [19]. Background rates of adverse events of special interest following immunization are always needed to conduct observed/expected analyses [7, 20], to understand burden of disease of adverse events [21], or in cost-evaluation of vaccine implementation [22].
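
To show how a background rate feeds an observed/expected analysis, a minimal sketch follows. The function names, the cohort size, and the 42-day risk window are hypothetical; only the GBS crude IR of 2.1/100,000 PYs is taken from the Results. A real analysis would normally use age- and sex-specific rates rather than a single crude rate.

```python
def expected_cases(background_ir_per_100k: float, person_years_at_risk: float) -> float:
    """Expected number of cases if only the background rate were operating."""
    return background_ir_per_100k / 100_000.0 * person_years_at_risk


def observed_expected_ratio(observed: int, background_ir_per_100k: float,
                            person_years_at_risk: float) -> float:
    """Ratio of observed to expected cases; values above 1 suggest an excess."""
    expected = expected_cases(background_ir_per_100k, person_years_at_risk)
    if expected <= 0:
        raise ValueError("expected number of cases must be positive")
    return observed / expected


# Hypothetical scenario: 1,000,000 vaccinees, each followed for a 42-day risk window.
vaccinees = 1_000_000
risk_window_days = 42
py_at_risk = vaccinees * risk_window_days / 365.25

gbs_background_ir = 2.1  # crude GBS IR per 100,000 PYs reported in the Results
print(round(expected_cases(gbs_background_ir, py_at_risk), 2))
```

Under these assumed inputs, between two and three GBS cases would be expected by chance alone in the risk window, which is the benchmark against which reported cases are compared.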

5 Conclusion

This study demonstrated that the European ADVANCE system can identify specific autoimmune events, that age-, sex- and time-specific rates can be generated based on available tools, and that the IRs are mostly consistent across selected European healthcare databases. Some variations were observed according to the type of care that is captured in the data sources.