Key Points

The paper provides insights into the key variables presented in the cases reporting ADRs during pregnancy.

In the absence of a pregnancy data element in the current electronic format for the submission of safety reports, the SMQ ‘Pregnancy and neonatal topics’ is commonly used to identify pregnancy cases, but it may be too broad and sub-optimal.

A novel algorithm was developed based on a set of conditions identified from the manual review of a random sample of cases retrieved from the SMQ PNT, to reliably identify cases reporting ADRs during pregnancy.

A subset of more than 200,000 cases reporting ADRs during pregnancy has been identified in EV and could be used to assess new methodologies for routine signal detection.

1 Introduction

Spontaneous reporting is one of the most effective methods to detect new and rare suspected adverse drug reactions (ADRs) [1] and its utilisation in signal detection has been optimised over time [1, 2]. Data from spontaneous reports have been used to evaluate the safety of medicine exposure during pregnancy in the absence of reliable data from clinical trials, but underreporting, data quality issues and the lack of relevant information are major drawbacks of spontaneous reporting systems [3]. Observational studies and disproportionality methods have therefore been used for hypothesis testing and signal generation regarding the impact of medicines on pregnancy outcomes or embryo-foetal development [4]. However, given the methodological difficulties, such as the background rate of specific individual birth defects, the variability in reporting practices and the complexity of pregnancy-related symptoms, signal detection in this population remains challenging [5].

Medicine use in pregnancy is the norm, not the exception. According to a study conducted in France, medicines are prescribed to up to 90% of pregnant women [6]. Nonetheless, the information available on the impact of medicines used during pregnancy lags considerably behind that available for other vulnerable populations [7], such as children, the elderly and patients with renal or hepatic impairment. Given that pregnant women may become ill, and people with a disease may become pregnant, it is important to generate and interpret data regarding the impact of medicine use in pregnancy, recognising that treating maternal disease, more often than not, benefits both mother and child [8, 9].

In 2001, EudraVigilance (EV) was created as a system for managing and analysing information on suspected adverse reactions to medicines which have been authorised or are being studied in clinical trials in the European Economic Area (EEA), and it has become one of the largest pharmacovigilance databases in the world [10]. As of 31 December 2023, EV included over 15.9 million unique suspected ADR case reports [11] submitted by EU national competent authorities (NCAs), marketing authorisation holders (MAHs) and sponsors in line with the EU legal requirement [12,13,14] and the accompanying guidelines [15]. Despite the several functionalities available in EV to support pharmacovigilance activities, a dedicated data field, to indicate whether the cases are associated with medicine exposure in pregnancy and to help streamline the reporting, is not yet available in the current ICH E2B(R3) format [16] for electronic transmission of individual case safety reports (ICSRs). Because signal detection is carried out routinely on the entire EV database, signals associated with pregnancy may be diluted and therefore may not surface. Considering that spontaneous reporting systems were developed largely as a consequence of thalidomide, it was decided to explore the possibility of identifying in EV a subset of cases reporting ADRs in pregnancy, with a view to developing signal detection methods specifically for this population.

In EV, reported ADRs are coded using terms from the Medical Dictionary for Regulatory Activities (MedDRA®) [17], which is hierarchical and includes five different levels: System Organ Class (SOC); High Level Group Term (HLGT); High Level Term (HLT); Preferred Term (PT) and Lowest Level Term (LLT). To support pharmacovigilance activities, PTs related to defined medical conditions or safety topics of regulatory interest are grouped in Standardised MedDRA Queries (SMQs). SMQs facilitate retrieval of cases as a first step when investigating safety issues related to medicines and may combine very specific terms (narrow scope) and less specific terms (broad scope). Such grouping is consistent with the description of the overall clinical manifestation associated with a particular adverse event and drug exposure. The SMQ Pregnancy and Neonatal Topics (PNT) [18] (Fig. 1) was developed to make it more compatible with regulatory goals related to pregnancy and neonatal topics [19] and it is normally used to identify pregnancy cases in pharmacovigilance reporting systems [20, 21]; however, the data retrieved using this approach cover a broader range of ADRs. Examples include: (1) terms for lack of therapeutic efficacy of contraceptives co-reported with terms related to drug exposure during pregnancy; (2) cases of ADR in the mother following use of abortifacients; (3) situations not related to in utero exposure (i.e., paediatric exposure); (4) reporting malpractice and coding quality issues where pregnancy terms are used incorrectly.

Fig. 1

Hierarchy structure of pregnancy and neonatal topics (SMQ). The above photographic content is protected by MedDRA’s copyright and extracted from chapter 2.83 "Pregnancy and neonatal topics (SMQ)" in the Introductory Guide for Standardised MedDRA Queries (SMQs) Version 27.0, 2024 [18]. A minor modification to the original table was made to include the SMQ levels available in MedDRA. MedDRA Medical Dictionary for Regulatory Activities, SMQ Standardised MedDRA Query

As part of a larger initiative [8] to strengthen the evidence base regarding medicine use in pregnancy, this manuscript reports on a novel approach to identify a subset of cases in EV that are most likely to be associated with medicine exposure in pregnancy, with the aim to improve signal detection for this population.

The first objective of this study was to assess the utility of the SMQ PNT (broad) in identifying ADRs associated with exposure during pregnancy. The second objective focused on leveraging the insights gained from this evaluation to develop a rule-based algorithm that more reliably identifies cases that are truly representative of an ADR occurring during pregnancy. Such ADRs may have an impact on the baby (e.g., growth retardation in utero, congenital malformation), on the mother (e.g., pre-eclampsia) or on both (e.g., miscarriage).

2 Methods

In alignment with the research objectives outlined in the Introduction, data extraction, development of the algorithm and assessment of the performance were carried out.

2.1 Retrieval of Cases in EV Using the SMQ PNT

A query in EV was performed (date of extraction: 15 September 2023) to retrieve the cases that reported at least one MedDRA PT included in the SMQ PNT broad. The broad approach was chosen over the narrow alternative to ensure the retrieval of all potential pregnancy cases in EV, thus facilitating a comprehensive understanding of the various reporting scenarios in pregnancy. In light of the low proportion of pregnancy cases in EV, we drew a random sample of 100 cases from those retrieved by the SMQ PNT to streamline the annotation process while still providing meaningful insights into its performance. Random selection was intended to make the sample representative of the population of interest for this study. A unique identifier was assigned to each retrieved case, and the sample was drawn from this pool using the ‘sample’ function in R. This subset of cases was reviewed independently by two authors (CZ and CdeV), who classified each case as ‘ADR occurring during pregnancy’ or ‘other’ based on the information provided in the case narrative. The two sets of classifications were then compared to determine the level of agreement between the reviewers and to identify any discrepancies. Final agreement was achieved by consensus.
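The sampling step described above can be sketched as follows. This is a minimal Python analogue of drawing with R's ‘sample’ function, not the authors' script: the identifiers, the pool size and the seed are hypothetical placeholders chosen for illustration only.

```python
import random


def draw_review_sample(case_ids, n=100, seed=2023):
    """Draw n case identifiers uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the draw is reproducible
    return rng.sample(list(case_ids), n)


# Hypothetical identifiers standing in for the cases retrieved by the SMQ PNT (broad) query.
retrieved_ids = [f"CASE-{i:06d}" for i in range(1, 5001)]
review_sample = draw_review_sample(retrieved_ids)
print(len(review_sample))
```

Fixing the seed is a design choice that lets both reviewers, and any later audit, work from the same 100 cases.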

2.2 Development of the Algorithm

A rule-based algorithm was then developed by incorporating the findings of our case review from the SMQ PNT, to systematically apply the inclusion and exclusion criteria detailed in Table 1.

Table 1 Description of the variables included in the algorithm developed in EudraVigilance, including the rationale for inclusion and exclusion

Cases classified as ‘other’ were reviewed to identify exclusion criteria for cases not representing a suspected ADR during pregnancy. In particular: cases where the medicine was not administered in the context of pregnancy; cases describing a normal pregnancy outcome or no pregnancy outcome, which should not be reported as an ICSR (in line with GVP VI [15]); and cases in the context of failed abortion or of contraceptive use leading to unintended pregnancy.

Conversely, the cases classified as ‘ADR occurring during pregnancy’ were reviewed to determine the inclusion criteria (Table 1), based on whether the ICSR:

(a) contained a record of gestational age;

(b) indicated a route of administration indicative of in utero exposure (i.e., transplacental or intra-amniotic);

(c) contained indications describing exposure associated with pregnancy;

(d) involved reactions describing exposure during pregnancy;

(e) involved reactions of neonatal disorders indicated as parent-child report (i.e., when information on both parent and child is provided) and/or congenital malformation.

The final algorithm used a combination of the above information structured in 10 different conditions, as depicted in Fig. 2 (together with the number of ICSRs included and excluded at every step).
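To make the structure of the rules concrete, the sketch below encodes the ten conditions as they are worded in the caption of Fig. 3. It is an illustration under stated assumptions, not the EVDAS implementation: the `Case` record and its field names are hypothetical, a case is assumed to qualify when at least one condition holds, and the exclusion criteria of Table 1 are not modelled.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Term names copied from the condition wording in the Fig. 3 caption.
IN_UTERO_ROUTES = {"intra-amniotic use", "transplacental use"}
PREGNANCY_EXPOSURE_HLT = "Exposures associated with pregnancy, delivery and lactation"
LACTATION_SMQ = "Lactation related topics (incl. neonatal exposure through breast milk)"
FOETAL = "Foetal disorders"
TERMINATION = "Termination of pregnancy and risk of abortion"
COMPLICATIONS = ("Pregnancy, labour and delivery complications and risk factors "
                 "(excl. abortions and stillbirth)")
NEONATAL = "Neonatal disorders"
CONGENITAL = "Congenital, familial and genetic disorders"


@dataclass
class Case:
    """Hypothetical, simplified view of one ICSR (field names are assumptions)."""
    gestation_period: Optional[str] = None
    administration_routes: Set[str] = field(default_factory=set)
    indication_hlts: Set[str] = field(default_factory=set)
    indication_smqs: Set[str] = field(default_factory=set)  # narrow level 2 SMQs
    reaction_smqs: Set[str] = field(default_factory=set)    # narrow level 2 SMQs
    congenital_anomaly: bool = False
    parent_child_report: bool = False


def matched_conditions(case: Case) -> Set[int]:
    """Return the numbers (1-10) of the conditions that the case satisfies."""
    checks = {
        1: case.gestation_period is not None,
        2: bool(case.administration_routes & IN_UTERO_ROUTES),
        3: (PREGNANCY_EXPOSURE_HLT in case.indication_hlts
            and LACTATION_SMQ not in case.indication_smqs),
        4: FOETAL in case.reaction_smqs,
        5: TERMINATION in case.reaction_smqs,
        6: COMPLICATIONS in case.reaction_smqs,
        7: NEONATAL in case.reaction_smqs and case.congenital_anomaly,
        8: NEONATAL in case.reaction_smqs and case.parent_child_report,
        9: CONGENITAL in case.reaction_smqs and case.congenital_anomaly,
        10: CONGENITAL in case.reaction_smqs and case.parent_child_report,
    }
    return {number for number, satisfied in checks.items() if satisfied}


def meets_inclusion(case: Case) -> bool:
    """Assumed combination rule: at least one of the ten conditions holds."""
    return bool(matched_conditions(case))
```

Returning the set of matched condition numbers, rather than a single flag, keeps the information needed to count intersections between conditions.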

Fig. 2

Attrition diagram of the criteria used in EudraVigilance to identify pregnancy cases. Data extracted in EV on 15 September 2023. EV EudraVigilance, HLGT High Level Group Term, HLT High Level Term, PT Preferred Term, SMQ Standardised MedDRA Query

2.3 Retrieval of Cases in EV Using the Algorithm

The set of rules and conditions constituting the algorithm was used to mine all cases received in EV (i.e., all clinical trials and all post-marketing cases). Following the steps performed for the evaluation of the SMQ PNT and applying the same data lock point (15 September 2023), a random sample of 100 EV cases was collected from the total cases retrieved by the algorithm and reviewed by the same authors to determine its utility in correctly identifying cases reporting ADRs in pregnancy. Agreement between the assessments was again determined and discrepancies were resolved by consensus.

2.4 Details on Data Analysis and Programmes Utilised

Cases were extracted from EV using the EudraVigilance Data Analysis System (EVDAS) [22]. The random sample of 100 cases was selected using the ‘sample’ function in the open-source language R. Figures 3 and 4 were generated with R, using the packages ‘ComplexUpset’ and ‘ggVennDiagram’.

Fig. 3

Upset plot of the intersection between the different conditions used to identify pregnancy cases. Data extracted from EV on 15 September 2023. EV EudraVigilance, HLGT High Level Group Terms, HLT High Level Terms, PT Preferred Term, SMQ Standardised MedDRA Query. The conditions are: (1) Gestation Period is not null. (2) Administration Route is equal to/is in intra-amniotic use, transplacental use. (3) Indication HLT is equal to/is in Exposures associated with pregnancy, delivery and lactation AND Indication SMQ Level 2 is not equal to/is not in Lactation related topics (incl. neonatal exposure through breast milk) (SMQ narrow). (4) SMQ Level 2 for the adverse reaction is equal to/is in Foetal disorders (SMQ narrow). (5) SMQ Level 2 for the adverse reaction is equal to/is in Termination of pregnancy and risk of abortion (SMQ narrow). (6) SMQ Level 2 for the adverse reaction is equal to/is in Pregnancy, labour and delivery complications and risk factors (excl. abortions and stillbirth) (SMQ narrow). (7) SMQ Level 2 for the adverse reaction is equal to/is in Neonatal disorders (SMQ narrow) AND Seriousness Congenital Anomaly is equal to/is in Yes. (8) SMQ Level 2 for the adverse reaction is equal to/is in Neonatal disorders (SMQ narrow) AND Parent Child Report is equal to/is in Yes. (9) SMQ Level 2 for the adverse reaction is equal to/is in Congenital, familial and genetic disorders (SMQ narrow) AND Seriousness Congenital Anomaly is equal to/is in Yes. (10) SMQ Level 2 for the adverse reaction is equal to/is in Congenital, familial and genetic disorders (SMQ narrow) AND Parent Child Report is equal to/is in Yes. Note that intersections with less than 500 cases are not displayed

Fig. 4

Upset plot of the number of pregnancy cases identified only by a single Condition. Data extracted from EV on 15 September 2023. EV EudraVigilance, HLGT High Level Group Term, HLT High Level Term, PT Preferred Term, SMQ Standardised MedDRA Query. The conditions are: (1) Gestation Period is not null. (2) Administration Route is equal to/is in intra-amniotic use, transplacental use. (3) Indication HLT is equal to/is in Exposures associated with pregnancy, delivery and lactation AND Indication SMQ Level 2 is not equal to/is not in Lactation related topics (incl. neonatal exposure through breast milk) (SMQ narrow). (4) SMQ Level 2 for the adverse reaction is equal to/is in Foetal disorders (SMQ narrow). (5) SMQ Level 2 for the adverse reaction is equal to/is in Termination of pregnancy and risk of abortion (SMQ narrow). (6) SMQ Level 2 for the adverse reaction is equal to/is in Pregnancy, labour and delivery complications and risk factors (excl. abortions and stillbirth) (SMQ narrow). (7) SMQ Level 2 for the adverse reaction is equal to/is in Neonatal disorders (SMQ narrow) AND Seriousness Congenital Anomaly is equal to/is in Yes. (8) SMQ Level 2 for the adverse reaction is equal to/is in Neonatal disorders (SMQ narrow) AND Parent Child Report is equal to/is in Yes. (9) SMQ Level 2 for the adverse reaction is equal to/is in Congenital, familial and genetic disorders (SMQ narrow) AND Seriousness Congenital Anomaly is equal to/is in Yes. (10) SMQ Level 2 for the adverse reaction is equal to/is in Congenital, familial and genetic disorders (SMQ narrow) AND Parent Child Report is equal to/is in Yes

3 Results

3.1 Analysis of a Random Set of 100 Cases from the SMQ PNT to Determine the Frequency of ‘Cases Reporting ADRs in Pregnancy’ and ‘Other’

Amongst the 100 randomly selected cases, 46 were classified as ‘other’, i.e., not representing a suspected ADR during pregnancy. These 46 ‘other’ cases were sorted into four main categories: (1) no medicine use was involved in pregnancy (n = 26, describing medicine use in children or people aged over 50); (2) an adverse outcome was not reported (n = 5, describing exposure in pregnancy without an adverse outcome); (3) ineffective abortion (n = 6, describing failure of induced abortion); (4) ineffective contraception (n = 9, describing unintended pregnancy).

3.1.1 Mining EV Data Using the Algorithm

Applying the exclusion and inclusion criteria described in Table 1, the algorithm retrieved 202,426 cases in EV (Fig. 2). In particular, 148,507 cases (73.4%) are related to the SMQ ‘Pregnancy, labour and delivery complications and risk factors (excl. abortions and stillbirth)’ (Condition 6), which contains the top 3 PTs retrieved by the algorithm (Table 2), i.e., ‘foetal exposure during pregnancy’, ‘exposure during pregnancy’ and ‘maternal exposure during pregnancy’, which are likely to be co-reported in EV with one or more ADRs during pregnancy. This is to be expected based on the guidance in the MedDRA Points to Consider [23], which recommends that terms such as ‘exposure during pregnancy’ be systematically included in pregnancy cases. The inclusion of the SMQ ‘Termination of pregnancy and risk of abortion’ (Condition 5) allowed for the identification of an additional 28,233 cases reporting adverse pregnancy outcomes such as ‘abortion spontaneous’, which is the 4th most retrieved PT by the algorithm (Table 2). This means that 87.3% of the potential pregnancy cases were identified using two narrow level 2 SMQs.
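The two percentages quoted above follow directly from the reported counts. The short check below is only a worked calculation; the variable names are illustrative.

```python
TOTAL_RETRIEVED = 202_426        # cases retrieved by the algorithm
CONDITION_6_CASES = 148_507      # cases related to Condition 6
CONDITION_5_ADDITIONAL = 28_233  # additional cases identified through Condition 5


def percent(part, whole):
    """Percentage of whole represented by part, rounded to one decimal place."""
    return round(100 * part / whole, 1)


share_condition_6 = percent(CONDITION_6_CASES, TOTAL_RETRIEVED)
share_two_smqs = percent(CONDITION_6_CASES + CONDITION_5_ADDITIONAL, TOTAL_RETRIEVED)
print(share_condition_6, share_two_smqs)  # 73.4 87.3
```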

Table 2 The 10 most frequently retrieved reactions (PTs) identified by the algorithm

Further refinement was achieved through the use of eight additional Conditions, as depicted in Fig. 2. Figure 3 shows the upset plot of the intersection between the different Conditions used to identify pregnancy cases, while Fig. 4 shows the number of cases that can be uniquely identified by a single Condition, confirming the above findings.

3.1.2 Analysis of a Random Set of 100 Cases from Algorithm to Determine the Frequency of ‘Cases Reporting ADRs in Pregnancy’ and ‘Other’

The review of 100 randomly selected cases retrieved by the algorithm classified 90 cases as reporting suspected ADRs during pregnancy and 10 cases as ‘other’. The latter comprised: (1) cases not describing exposure during pregnancy (n = 4); (2) cases describing exposure in pregnancy without an adverse outcome reported (n = 4); (3) cases describing failure of induced abortion (n = 1); and (4) cases describing contraception failure (n = 1).

3.1.3 Comparison Between the SMQ PNT and the Algorithm

Assuming that the 100 random cases used for each review were a good representation of the larger set, this gave the algorithm a positive predictive value (PPV) of 90% (95% CI 84–96%), compared with 54% (95% CI 44–64%) for the SMQ PNT. Querying EV using the SMQ PNT (broad) yielded 334,792 cases, while the pregnancy algorithm retrieved 202,426 cases. In terms of overlap, 188,324 cases were identified both by the SMQ PNT and the pregnancy algorithm (Fig. 5). Moreover, the algorithm retrieved 14,102 additional cases that had not been identified through the SMQ PNT. This was the result of including information regarding gestational age and route of administration in the algorithm, both of which are independent of the MedDRA terminology. Based on the estimated proportion of false positives among the cases retrieved by the SMQ PNT, we anticipate that the vast majority of cases excluded by the algorithm and captured by the SMQ PNT will be non-relevant pregnancy cases.
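The text does not state how the confidence intervals were computed. As an assumption, the sketch below uses the normal-approximation (Wald) interval for a proportion, which reproduces the reported bounds when rounded to whole percentages; the function name is illustrative.

```python
import math


def ppv_with_wald_ci(true_positives, reviewed, z=1.96):
    """PPV and its normal-approximation (Wald) 95% confidence interval."""
    ppv = true_positives / reviewed
    half_width = z * math.sqrt(ppv * (1 - ppv) / reviewed)
    # Clamp to [0, 1], since the Wald interval can overshoot near the extremes.
    return ppv, max(0.0, ppv - half_width), min(1.0, ppv + half_width)


for label, positives in (("algorithm", 90), ("SMQ PNT (broad)", 54)):
    ppv, low, high = ppv_with_wald_ci(positives, 100)
    print(f"{label}: PPV {ppv:.0%} (95% CI {low:.0%} to {high:.0%})")
```

With only 100 reviewed cases per sample the intervals are wide; an exact or Wilson interval would give slightly different bounds, particularly for the 90% estimate.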

Fig. 5

Overlap between cases identified by the pregnancy algorithm and by the SMQ Level 1 Pregnancy and Neonatal Topics (broad). Data extracted from EV on 15 September 2023
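The regions of the overlap diagram can be derived from the three reported totals. The check below is a worked calculation only: the count of cases retrieved solely by the SMQ PNT and the size of the union are derived here rather than quoted from the text.

```python
ALGORITHM_TOTAL = 202_426  # cases retrieved by the pregnancy algorithm
SMQ_PNT_TOTAL = 334_792    # cases retrieved by the SMQ PNT (broad)
OVERLAP = 188_324          # cases retrieved by both approaches

algorithm_only = ALGORITHM_TOTAL - OVERLAP  # matches the 14,102 additional cases reported
smq_pnt_only = SMQ_PNT_TOTAL - OVERLAP      # derived: captured by the SMQ PNT but not the algorithm
union = algorithm_only + OVERLAP + smq_pnt_only

print(algorithm_only, smq_pnt_only, union)  # 14102 146468 348894
```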

4 Discussion

This is the first algorithm developed in EV using structured data elements that have the potential to hold pregnancy-related information and improve the identification of suspected ADRs occurring during pregnancy. Our rule-based algorithm is built upon expert knowledge and demonstrated notable improvements in precision when compared to the SMQ PNT broad. While the algorithm predominantly retrieves cases with non-specific terms (e.g., ‘exposure during pregnancy’ which, in itself, does not constitute an ADR), the exclusion criteria ensure that only cases concomitantly reported with an ADR are primarily captured. The observed enhancements can be explained by the identification and inclusion of specific characteristics of the reviewed cases (e.g., filters by age, certain indications, and reactions) into the algorithm development process. Moreover, the additional cases retrieved by the algorithm and missed by the SMQ PNT can be attributed to the inclusion of specific data elements, such as the route of administration and the gestation period, which do not rely on MedDRA terminology. The added value of these specific data elements depends on the completeness of the information provided by the reporter. Overall, the improved PPV observed with the algorithm underscores the potential for practical applications to improve data retrieval in EV and support prioritisation tools for signal detection. Whilst it is pivotal to adopt a case-by-case approach to pregnancy signals (i.e., in line with the safety issue being assessed), this algorithm could facilitate an initial triage to automatically retrieve the more relevant cases and reduce the burden of manual inspection. Future studies could also explore the use of a more accurate denominator when calculating disproportionality. These findings could also inform future guidance for stakeholders on the critical minimum elements that should be reported in an ICSR to enhance signal detection in pregnancy.

While our study presents promising results, there are some limitations. For example, sensitivity and specificity could not be calculated because the numbers of true-negative and false-negative cases were not determined. Also, the upper age limit of 50 may have resulted in the erroneous exclusion of true pregnancy cases; however, we expect that number to be small. Moreover, in line with our exclusion criteria, this algorithm does not retrieve ADRs in the context of unintended pregnancy (i.e., where contraceptives were co-administered with other medicines) and/or in the context of normal pregnancy conditions and outcomes. For such cases, a targeted signal detection approach is recommended. Finally, upon manual review of 100 cases randomly selected from those retrieved by the algorithm, 10 were found not to be associated with an ADR during pregnancy and no further obvious opportunity to refine the algorithm was identified. Nonetheless, a 90% PPV is considered promising for developing pregnancy-specific signal detection methods in EV.

We anticipate that the algorithm will perform similarly in other comparable pharmacovigilance reporting databases that comply with the current format of electronic submission of safety reports and use MedDRA coding for ADRs. Furthermore, the algorithm could be fine-tuned according to each organisation’s objectives.

5 Conclusion

In conclusion, our study highlights the utility of a rule-based algorithm in EV to retrieve reported cases of suspected ADRs in pregnancy, thereby minimising the burden of manually excluding irrelevant cases. With a PPV of 90%, the algorithm outperformed the SMQ PNT, which had a PPV of 54%. This improvement supports the adoption of this algorithm to complement signal detection activities focused on medicine use in pregnancy.