Key Points

Use of standardised and pregnancy-specific data elements by different stakeholders enhances the possibilities for comparative assessments of data from studies within the same therapeutic area.

The study revealed a significant alignment between data variables from various Data Access Providers (DAPs) and the Core Data Elements (CDEs) recommended for primary data studies on drug safety during pregnancy.

The findings suggest that harmonization of data analysis for pregnancy pharmacovigilance, involving diverse stakeholders and data sources, could be feasible.

1 Introduction

More than 5 million women become pregnant in the European Union every year and the majority take at least one medication during pregnancy [1]. However, few medications have been adequately monitored for safety and labelled for use in pregnant women and it takes an estimated mean time of 27 years after commercialisation to determine the reproductive risk profile of a medication [2].

Clinical evidence on the efficacy and safety of medications for the general population is generally provided by randomised clinical trials. As pregnant women are usually not included in clinical trials, these rarely provide information on the benefit/risk of medication use during pregnancy [3]. As such, data from post-marketing observational studies are generally required to fill the evidence gap. Primary source data collection methods are commonly used where information about medication exposure and pregnancy outcome is collected directly from pregnant women and/or their healthcare providers. Whilst numerous long-standing datasets from both public and private partners exist, these activities have operated in silos, with considerable heterogeneity in data collection methods. Combining and/or comparing research results on medication safety in pregnancy, whether comparing between studies of the same medication or between medications for the same disease, is also complicated by heterogeneity in the identification and the definitions of key data elements [4]. This heterogeneity impedes the ability to rapidly combine raw data and/or to assimilate the evidence generated from different studies in order to decrease the time taken to provide reliable conclusions about the safety of medication in pregnancy [5]. These challenges have been identified since the 1990s and remain unresolved [6].

The ConcePTION project aims to challenge and improve the way drug use during pregnancy is studied. This includes exploring the possibility of developing a distributed data processing and analysis infrastructure using a common data model, which could form a foundational platform for future surveillance and research. A prerequisite would be that data from the various data access providers (DAPs) can be harmonised according to an agreed set of standard rules concerning the structure and content of the data. A reference framework of core data elements (CDEs) recommended for collection of primary data in pregnancy pharmacovigilance or studies investigating foetal safety following maternal medication use during pregnancy was recently developed in the ConcePTION project as a first step in this process [5]. The aim of the CDE framework is to help optimise and standardise data collection procedures in primary source pregnancy pharmacovigilance studies to improve data harmonisation and evidence synthesis capabilities.

Use of standardised data elements, optimised specifically for pregnancy drug safety studies, by different stakeholders would allow for standardisation of data collection in future studies, which may greatly enhance the possibilities for combining crude data, pooling datasets and/or undertaking comparative assessments of data from studies within the same therapeutic area. This is particularly relevant in pregnancy pharmacovigilance, where both the exposures and the outcomes being studied are often rare. Effective use of a CDE is well established in global drug safety and pharmacovigilance systems (FAERS, EudraVigilance, VigiBase), which are based on the electronic transmission of adverse event reports (referred to as individual case safety reports or ICSRs) and use the International Conference on Harmonisation (ICH) E2B standard as the CDE. E2B(R3) is the current version for electronic transmission of ICSRs [7]. This standard defines and standardises data elements for transmission of ICSRs on adverse events and adverse drug reactions in pre- and post-approval periods and allows for the exchange of ICSRs between various parties, including marketing authorisation holders, regulatory authorities, pharmacovigilance centres and medical ethics committees. It is, however, recognised that the data fields used in this system are not specifically designed for pregnancy safety studies and therefore lack some essential variables [5] that are necessary, in particular, to permit a quantitative analysis and estimation of the prevalence of certain foetal outcomes in relation to medication use. Thus, the use of E2B(R3) for pregnancy safety studies calls for additional efforts through enhanced programmes before the data can be considered fit for purpose as pregnancy safety evidence. To address this, adoption of the ConcePTION primary data source CDE by existing and new DAPs would be required.

The aim of this study was to assess the ability to align the current data collection variables and definitions used by different public and private DAPs for pregnancy registries or enhanced pregnancy pharmacovigilance systems with the ConcePTION primary data CDE recommendations framework. This analysis was conducted using several different data sources, each focusing on or including medications used in the treatment of multiple sclerosis.

2 Methods

2.1 Study Design

This methodological study explored the degree to which data collected from various DAPs align with the CDE variables and definitions established in the ConcePTION primary data CDE recommendations framework.

2.2 Data Source

2.2.1 Data Access Providers (DAPs)

Data access providers were public institutions and pharmaceutical companies collecting various types of data, including clinical data, exposure data, outcome data, and other relevant health information, from pregnant patients and/or healthcare providers regarding disease-modifying therapies for multiple sclerosis during pregnancy, using one of the three following main types of data collection methods:

  A. Pregnancy exposure registries. These registries collect health information on pregnancy and foetal outcomes following exposure to medicinal products during pregnancy. Pregnancy registries typically involve the use of specifically designed data collection forms by various stakeholders, including healthcare professionals (HCPs) and patients who willingly participate and give formal consent, to gather comprehensive health information on pregnancy and foetal outcomes. There are national and international pregnancy exposure registries that may be initiated by pharmaceutical companies, academic groups, research groups or professional scientific societies like the Organisation of Teratology Information Specialists (OTIS), which specialises in providing evidence-based information on the risks of exposures during pregnancy and breastfeeding. Pregnancy registries may focus on a single drug, a drug class or a disease.

  B. Enhanced pharmacovigilance programmes. These programmes collect and process pharmacovigilance data via existing variable fields in the safety database, data collected through sets of targeted checklists or questionnaires, and in some programmes, free-text data from the narratives. In addition, a structured follow-up, a rigorous process of data entry, data quality control and a programmed aggregate analysis are performed. Data are collected initially from ICSRs, used for general adverse event reporting, but are then supplemented by targeted checklists or questionnaires with dedicated pregnancy-related fields. Initial reporting can be by the HCP or directly by the patient. The data are entered in the respective pharmaceutical company safety database.

  C. Teratology Information Services (TIS) from the European Network of Teratology Information Services (ENTIS). ENTIS is a collaborative network of services offering expertise on possible risks related to exposure to medications, and other environmental exposures, during pregnancy and breastfeeding at an individual level. TIS collect patient data both during initial contact and after a follow-up period covering pregnancy outcome using a similar methodology based on structured telephone interviews and/or mailed questionnaires.

The exhaustive list of participating DAPs is presented below:

  A. Pregnancy registries: Gilenya (Novartis), Aubagio (Sanofi), Aubagio (Sanofi, OTIS), The Dutch Pregnancy Drug Register (Lareb)

  B. Enhanced pharmacovigilance programmes: Gilenya PRIM (Novartis), MAPLE-MS (Merck Healthcare KGaA)

  C. TIS: members of ENTIS (Swiss TIS (STIS), UK TIS (UKTIS), Zerifin TIS, Jerusalem TIS)

2.2.2 Data Collection

Between May and November 2022, DAPs were requested to answer a questionnaire concerning their general characteristics and method of data collection, including the following items: name, short name, institution/marketing authorisation holder (MAH), governance, website, initial role, geographical localisation, beginning and end date of data collection, primary reporter, notification, transmission and collection of data, and follow-up approach.

In a second questionnaire, each DAP was requested to answer the following questions for each CDE item (questionnaire presented in Table 1 of the Electronic Supplementary Material [ESM]):

Table 1 Description of data sources

- Can this item be taken directly from an existing field in the DAP database? (yes/no)

For yes responses, these items were already available in the DAP database and met the definition of the CDE.

- Can this item be derived by combining data from fields in the DAP database? (yes/no)

For yes responses, these items could be derived, using other variables in order to meet the definition of the CDE (e.g. the pre-pregnancy maternal body mass index [BMI] was not directly available in the DAP database, but could be derived using the maternal pre-pregnancy weight and height that were available in the database).

- Does the DAP collect data, which is similar to this item, but the CDE definition is different from that used in the DAP database? (yes/no)

For yes responses, these items were considered divergent as they were not directly available and could not be derived, but a similar variable was available (e.g. the pre-pregnancy maternal BMI was not directly available and could not be derived, but the maternal BMI at inclusion/entry to the registry was available).

- Is the item missing from the DAP database? (yes/no)

For yes responses, these items referred to variables that were not available.

Following the above answers each CDE item was classified into one of the four following categories: (1) directly matched; (2) derived; (3) divergent; or (4) not available.
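The classification rule described above can be illustrated with a minimal sketch. The names `Alignment`, `classify_item` and `bmi` are hypothetical and are not part of the study materials. The sketch assumes the answers are evaluated in questionnaire order, so that the first "yes" determines the category, and it treats "not available" as the case where the first three answers are all "no". The BMI helper mirrors the example of a derived item given above.

```python
from enum import Enum


class Alignment(Enum):
    DIRECT = "directly matched"
    DERIVED = "derived"
    DIVERGENT = "divergent"
    NOT_AVAILABLE = "not available"


def classify_item(direct: bool, derivable: bool, similar: bool) -> Alignment:
    """Map one DAP's yes/no answers for one CDE item to a category.

    Answers are checked in questionnaire order, so the first 'yes' wins.
    This precedence is an assumption of the sketch, not a stated study rule.
    """
    if direct:
        return Alignment.DIRECT
    if derivable:
        return Alignment.DERIVED
    if similar:
        return Alignment.DIVERGENT
    return Alignment.NOT_AVAILABLE


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2, derived from weight and height."""
    return weight_kg / height_m ** 2


# Pre-pregnancy BMI is not stored, but pre-pregnancy weight and height are.
category = classify_item(direct=False, derivable=True, similar=False)
print(category.value, round(bmi(65.0, 1.70), 1))
```

Under this reading, a DAP that records only BMI at registry entry would answer "no", "no", "yes" and the item would be classed as divergent.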

It is important to note that the DAPs answered the questionnaire based on their current primary data collection form. This study only focused on the intended data collection step. Data quality (i.e. data accuracy, data completeness) and data processing issues (i.e. data storage, data formatting, other technical issues) were not considered.

Given the fundamental role of E2B(R3) in the data exchange of global pharmacovigilance data, the degree to which ICH E2B(R3) fields align with the ConcePTION primary data pregnancy exposure CDE was also investigated.

2.3 Statistical Analysis

A descriptive analysis of the CDE variables collected, classified in four categories, was performed for each DAP and overall. Results were presented as absolute numbers (n) and proportions (%).
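As an illustration only, a descriptive tabulation of this kind could be sketched as follows. The function `summarise`, the DAP names and the answers are all hypothetical toy values, not study data. The sketch simply counts category labels per DAP and overall and expresses each count as a percentage of the items assessed.

```python
from collections import Counter

CATEGORIES = ("directly matched", "derived", "divergent", "not available")


def summarise(labels):
    """Return {category: (n, percent)} for a non-empty list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {c: (counts[c], 100 * counts[c] / total) for c in CATEGORIES}


# Hypothetical answers for two made-up DAPs, four CDE items each.
answers = {
    "DAP_X": ["directly matched", "directly matched", "derived", "divergent"],
    "DAP_Y": ["directly matched", "derived", "derived", "not available"],
}

per_dap = {dap: summarise(labels) for dap, labels in answers.items()}
overall = summarise([label for labels in answers.values() for label in labels])

for category, (n, pct) in overall.items():
    print(f"{category}: n = {n} ({pct:.0f}%)")
```

The pooled denominator is the number of item-by-DAP pairs, so each DAP contributes equally when all DAPs are assessed on the same set of items.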

3 Results

Four pregnancy registries (Gilenya Novartis, Aubagio Sanofi/OTIS, Aubagio Sanofi, The Dutch Pregnancy Drug Register Lareb), two enhanced pharmacovigilance programmes (Gilenya PRIM, MAPLE-MS Merck) and one ENTIS consortium (comprising STIS, UKTIS, Zerifin TIS and Jerusalem TIS) participated in the study. The description of all DAPs is presented in Table 1.

Data collection by MAHs was initiated in response to requirements from regulatory authorities, except for the Gilenya PRIM, which was initiated by the sponsor to complement the corresponding Gilenya pregnancy registry. ENTIS is a non-profit organisation and Lareb is a public institution. The primary role of ENTIS member organisations is to counsel pregnant women and/or HCPs on medication use during pregnancy. Data are collected primarily to provide case-specific risk assessments and advice but are used collectively for surveillance and research purposes. The Dutch Pregnancy Drug Register is based on data reported by pregnant women and is used for pharmacovigilance and research. For the private pregnancy registries, case enrolment required that written informed consent was obtained from pregnant women after the woman herself or her HCP spontaneously contacted the registry. MAPLE-MS and Gilenya PRIM included cases from ICSRs reported by pregnant women and HCPs that were recorded in the MAH’s safety database. In these systems, data collection is enhanced through a targeted questionnaire directed to primary reporters. For ENTIS, pregnancy and infant follow-up data were collected around the delivery due date and, for some TIS, until 3 years of age for live-born infants. The other DAPs performed follow-up until 1 year of life of the infant. For MAPLE-MS, this follow-up was performed only for infants with congenital anomalies.

3.1 Alignment with the CDE: DAP Pregnancy-Specific Data Collection Systems

This study assessed 51 specific items from the CDE framework recommendations (Table 2). The majority of the DAP data variables aligned with the CDE items and definitions; 85% (n = 305/357, range 73–94% between DAPs) were directly taken from existing fields and 12% (n = 42/357, range 0–24% between DAPs) were derived by combining different variables.

Table 2 Pooled results of alignment of the data access providers with the core data elements

For very few of the DAP variables, alignment with the CDE items was not possible, either because the definitions were different from the CDE definition (1%, n = 3/357, range 0–2% between DAPs) or because the variables were not collected by the DAPs (2%, n = 7/357, range 0–4% between DAPs). No discrepancies were reported between DAPs, regarding divergent and not available variables (Table S2 of the ESM).

Alignment with the CDE items was similar across types of data collection method, with variables directly taken or derived for 96% (n = 196/204) of items for the pregnancy registries (Gilenya Novartis, Aubagio Sanofi/OTIS, Aubagio Sanofi, The Dutch Pregnancy Drug Register Lareb), 99% (n = 101/102) for the enhanced pharmacovigilance programmes (Gilenya PRIM Novartis, MAPLE-MS Merck) and 98% (n = 50/51) for ENTIS.
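The denominators reported above are consistent with 51 CDE items assessed for each DAP, with ENTIS counted as a single DAP. This reading is inferred from the reported figures rather than stated explicitly. A short arithmetic check, using only the numbers given in this section (the function name `aligned_percent` is hypothetical), is sketched below.

```python
ITEMS = 51  # CDE items assessed per DAP

# (number of DAPs, items directly matched or derived), as reported above
groups = {
    "pregnancy registries": (4, 196),
    "enhanced pharmacovigilance programmes": (2, 101),
    "ENTIS": (1, 50),  # assumed to count as one DAP
}


def aligned_percent(n_daps, n_aligned, items=ITEMS):
    """Share of item-by-DAP pairs that were directly matched or derived."""
    return 100 * n_aligned / (n_daps * items)


for name, (n_daps, n_aligned) in groups.items():
    pct = aligned_percent(n_daps, n_aligned)
    print(f"{name}: {n_aligned}/{n_daps * ITEMS} = {pct:.0f}%")

total_pairs = sum(n_daps for n_daps, _ in groups.values()) * ITEMS
total_aligned = sum(n_aligned for _, n_aligned in groups.values())
print(f"pooled: {total_aligned}/{total_pairs}")
```

The pooled total of 347 aligned pairs out of 357 equals the sum of the 305 directly matched and 42 derived variables reported in the pooled results.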

Each of the unavailable CDE items was unique to a single DAP; none of them was missing in more than one DAP. The seven not available CDE items were maternal pre-pregnancy BMI, medication route of administration, medication frequency of use, maternal death outcome (as the reporter is the mother herself and this would therefore appear as a case that is lost to follow-up), molar pregnancy or blighted ovum pregnancy outcome, and infant head circumference at birth (Table 3). The DAP variables that were divergent (with a different definition than the CDE items) related primarily to maternal age, which was not always based on maternal date of birth, but rather mother’s age at the last menstrual period or maternal age at reporting, and maternal pre-pregnancy BMI (Table 3).

Table 3 Details and comments on not available (A) and divergent (B) CDE items

3.2 Alignment with the CDE: ICH E2B(R3) Data Structure

The ICH E2B(R3) ICSR fields lack more of the ConcePTION primary data CDE variables and definitions than the data collection systems of the DAPs participating in this study (details in Tables S3 and S4 of the ESM). Eight key CDE items were not available in E2B(R3): prospective status, source of directly reported EDD, plurality, pregnancy outcome collection status, date of end of pregnancy, gestational age at end of pregnancy, details of congenital anomalies and infant malformation case classification.

4 Discussion

The primary data pregnancy pharmacovigilance CDE items proposed by Richardson et al. were used as a reference for standardising data reporting in pregnancy pharmacovigilance [5]. This study found that, for the pregnancy-specific data collection systems of both private and public DAPs participating in the study, a very high proportion of variables aligned with the ConcePTION primary data CDE items, with 97% (n = 347/357) of all variables directly matching existing fields or derived by combining other variables.

Although the DAPs participating in this study showed excellent alignment in terms of data elements collected, all operate differently. The Dutch Pregnancy Drug Register was the only dataset based solely on direct reporting by pregnant women, whereas all the other DAP datasets were based on reporting from HCPs and/or pregnant women. Additionally, the DAPs collect data in different contexts and for different reasons (e.g. legal and regulatory vs clinical), which may also lead to differences in reporting patterns and patient recruitment. These differences in patient recruitment and data collection may influence the results obtained by DAPs and hamper the ability to combine data sources, or to directly compare the risk or safety estimates across different datasets. Furthermore, follow-up procedures differed between DAPs. The DAPs perform follow-up until 1 year of life, except for ENTIS (where follow-up varies between participating centres, from outcome at birth to offspring age of 3 years) and MAPLE-MS (follow-up until 1 year only for infants with congenital anomalies). Again, these differences in follow-up should be taken into consideration when comparing neonatal and infant outcomes, as several relevant infant outcomes may manifest or only be detected later in life.

While the unavailable variables identified in the study might not appear to be of major interest to non-experts in the field, they are in some contexts important for an accurate analysis of pregnancy and infant safety data and to identify possible confounding by indication for product use. For example, maternal BMI was either not collected or collected at the beginning of pregnancy or at pregnancy registration instead of before pregnancy by ENTIS member organisations. Recording of maternal weight at advanced stages of gestation could result in an incorrect BMI calculation. Collecting accurate information on maternal BMI is important because obesity is associated with a higher risk of various maternal and foetal perinatal complications, and these risks are exacerbated with more severe obesity [8, 9]. Possible associated complications include congenital anomalies, gestational hypertension, pre-eclampsia, gestational diabetes, preterm birth and having a large for gestational age infant [10, 11]. However, the difference between pre-pregnancy BMI and BMI at reporting might be of limited clinical relevance, particularly where pregnancies are reported in early gestation. The DAPs that did not collect information on the route of exposure or frequency of drug use were MAH single-product registries with a specific route/frequency of administration. Thus, these not available variables should not lead to a loss of relevant information. The infant head circumference, which is not available in the Dutch Pregnancy Drug Register, is relevant in clinical practice as an easy screening instrument for paediatricians and has value in teratogen surveillance [12], but it has been reported to be an inaccurate tool for assessing children’s developmental outcome, as up to 85% of children measured with a very small head develop normally [13]. The study found that some pregnancy outcomes such as molar pregnancy and blighted ovum were not separately recorded by the Gilenya registry. Nevertheless, it is probable that many of these pregnancies were recorded as miscarriages, implying that the actual impact of this discrepancy on data quality is likely insignificant. Similarly, maternal death was not collected by the Dutch Pregnancy Drug Register as the primary reporters are mothers themselves, and the data collection design does not allow matching this point to the CDE. Where feasible, it should be technically possible to include the limited number of not available variables identified in the study in current pharmacovigilance data collection systems to match the CDE.

This study highlights the extent to which E2B(R3) fields are deficient in key ConcePTION primary data source CDE variables and definitions, in stark contrast to the pregnancy-specific data collection systems operated by DAPs participating in this study. As ICH E2B(R3) is the standardised procedure for the electronic transmission of ICSRs designed for spontaneous adverse event reporting, this may lead to a potential loss of important pregnancy and foetal-maternal information during data exchange among various parties, including MAHs, regulatory authorities, and primary reporters for pregnancy exposure reports. Although only a limited number of variables are not available, some of these variables are of high clinical relevance (e.g. gestational age at end of pregnancy, details of congenital anomalies). As the data exchange system for ICH E2B(R3) reporting could represent the basis for a common data model that could be used by stakeholders performing pregnancy safety studies, including the ConcePTION Primary Data CDE pregnancy-specific items in the E2B guideline would be of utmost importance. In recognition of these limitations, enhanced pharmacovigilance programmes use several additional data collection components such as structured checklists and questionnaires, thereby achieving a higher level of alignment with the definition of the CDE.

Our study presents a significant contribution to pharmacovigilance in pregnancy, as it is the first study to explore which variables are collected in different pregnancy pharmacovigilance systems and how they conform to the CDE. This study has the advantage of including both public and private DAPs and providing high-level details on the variables collected. However, this study also has limitations that should be taken into consideration. One of the main limitations is that only DAPs collecting pharmacovigilance data on MS drug exposure during pregnancy were included, which could limit the generalisability of the findings. It is noteworthy that all DAPs involved in this study have extensive experience and expertise in the area of pregnancy pharmacovigilance or data collection, which again could further impact generalisability.

The study was conducted in the ConcePTION project as a test to see whether data could be collected and combined using the novel methodological tool developed (the CDE). This study covers the first step of the project, exploring intended data collection, without evaluating data storage and data analysis, which will be addressed in future publications. Finally, our research focused only on the essential CDE items. It is important to note that the CDE could evolve over time in light of emerging evidence.

5 Conclusions

This study represents a first step in a process of standardising data collection by different stakeholders collecting data as part of collaborative pregnancy safety studies. Data access providers participating in this study presented a very high proportion of variables matching the ConcePTION Primary Data CDE items. The low proportion of divergent items and of items not collected, together with the possibility to adapt variables to match current data standards, gives the prospect that the alignment of definitions and harmonisation of pregnancy pharmacovigilance data by different stakeholders could be feasible. Importantly, this insight challenges perceived barriers and theoretical concerns regarding the scientific validity of combining diverse datasets to improve teratogen detection. Furthermore, this study indicates that previously collected data from different data collection systems could potentially be exploited more effectively.