Key Summary Points

Up to and including November 2017, less than 25% of European Economic Area reports in EudraVigilance were classified as non-serious; since then, this figure has risen to 61.6%.

Comparison to a set of known adverse drug reactions suggested that including the non-serious reports slightly increased the sensitivity of the signal detection system (from 10.8 to 11.2%) whereas the positive predictive value was unchanged (at 13.6%).

Comparing the signals of disproportionate reporting raised using only the serious reports with those raised using all reports (serious and non-serious) showed that 91.9% of signals of disproportionate reporting were unchanged. For the minority that changed, differences in the profile of both adverse events and products were noticed; the impact of these changes, even if currently limited, might require further assessment in the future.

1 Introduction

The European Union pharmacovigilance legislation adopted in December 2010 [1, 2] brought about significant changes to reporting requirements for suspected adverse reactions [3]. An important change was the new requirement for Marketing Authorisation Holders and National Competent Authorities to report to EudraVigilance (EV) all non-serious cases originating from the European Economic Area (EEA). In addition, the new legislation brought an obligation on Marketing Authorisation Holders and National Competent Authorities to record and report cases received from patients, who cannot report directly to EV. In order to put into effect these changes, enhancements to the EV system were required and an audit demonstrating that the technical requirements were met had to take place. Following these steps, on 22 November, 2017, the enhanced EV was launched, accommodating the increased volumes of reports, supporting the use of international standards, including the International Council for Harmonisation (ICH) E2B(R3) on the data elements for transmission of individual case safety reports, and increasing access, analytical tools and transparency [4]. With these system enhancements, the new reporting requirements entered into force.

Signals of new safety issues with medicines, or a new aspect of a known issue, may originate from multiple sources, with spontaneous reports of suspected adverse reactions to medicines contributing to more signals than any other source [5]. EudraVigilance is the system for managing and analysing this information in the EEA. Therefore, when it comes to the overall performance of the EV system, a legitimate question is whether the mandatory reporting of non-serious reports (NSRs) from 22 November, 2017 has had an effect on the detection of safety signals when signal detection processes are conducted in all other respects according to the established process [6, 7].

This paper reports on a series of analyses performed to assess the increase and nature of the NSRs and their effect on signal detection in EV. The work is a contribution to the Pharmacovigilance Risk Assessment Committee Strategy on Measuring the Impact of Pharmacovigilance Activities [8]. It is in line with the Pharmacovigilance Risk Assessment Committee implementing regulations on regularly reviewing the methodology(ies) used and publishing recommendations, and its results inform the pharmacovigilance process review and potential improvements.

2 Methods

The analyses were aimed at assessing the impact of NSRs on statistical routine drug safety monitoring at the European Medicines Agency (EMA) and hence the extraction of cases from EV followed the same default criteria applied when preparing the data for screening, which includes the calculation of signals of disproportionate reporting (SDRs) as described in section 4.1.5 of the guideline on ‘Screening for adverse reactions in EudraVigilance’ [7]. These reports are referred to as spontaneous in the rest of the article.

2.1 Descriptive Analysis

In order to assess the way the legislation has changed the reporting volumes, the trend of NSRs received in EV was examined across a period that included the implementation date of 22 November, 2017. The absolute number of all spontaneous reports (serious and non-serious) and of NSRs received in EV was calculated from 01/07/2012 to 30/06/2019, as well as the proportion of NSRs relative to the total. Results were further broken down by cases arising in the EEA only. As the legislation only mandated the reporting of NSRs from the EEA, most analyses focused on these reports.

Changes in numbers and proportions of NSRs reported to EV were analysed to identify possible patterns according to patient age groups, primary source qualification and country of origin. A potential complicating factor in the analysis was the coincident introduction of the possibility to submit the cases in the new ICH-E2B(R3) data format that requires seriousness to be stipulated at a reaction level, whereas in ICH-E2B(R2) the seriousness is assigned at a case level (meaning that only the case could be categorised as serious or non-serious). To deal with both reporting standards in the analysis, the same logic used with ICH-E2B(R2) cases was applied to cases reported in ICH-E2B(R3) (i.e. if at least one reaction was reported as serious, the case was considered serious). Moreover, some analyses focusing on cases reported in ICH-E2B(R3) have been included to understand how seriousness at a reaction level is reported.
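The harmonisation rule described above can be illustrated with a minimal sketch. The function names and the representation of a case as a list of Boolean reaction-level flags are assumptions made for illustration only; they do not reflect the actual EV data model.

```python
from typing import Sequence


def case_is_serious(reaction_flags: Sequence[bool]) -> bool:
    """Case-level seriousness derived from reaction-level flags.

    Mirrors the rule stated in the text: a case is treated as serious
    if at least one of its reactions is reported as serious.
    """
    return any(reaction_flags)


def reaction_mix(reaction_flags: Sequence[bool]) -> str:
    """Describe how seriousness is distributed across the reactions of a case."""
    flags = list(reaction_flags)
    if not flags:
        raise ValueError("a case needs at least one reaction")
    if all(flags):
        return "serious only"
    if not any(flags):
        return "non-serious only"
    return "mixed"


# Hypothetical case with one serious and two non-serious reactions
example_case = [False, True, False]
example_is_serious = case_is_serious(example_case)
example_mix = reaction_mix(example_case)
```

Under this rule a case with mixed reactions is counted as serious at the case level, which is why such reports never add to the NSR totals.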

2.2 Signals of Disproportionate Reporting

This part of the study focused on estimating the impact of the mandatory reporting of NSRs on the statistical detection of safety signals. Several disproportionality measures have been proposed for signal detection purposes [8,9,10,11]; the statistical indicator of potential drug safety problems used routinely in EV is the reporting odds ratio (ROR) [7] (Table 1 of the Electronic Supplementary Material [ESM]). The ROR is calculated in each of five macro geographical regions of reporting (Europe, North America, Japan, rest of Asia and rest of the world [12]) and an SDR is raised for a specific drug-reaction combination if, in at least one macro region, all the following criteria are satisfied:

the lower bound of the 95% confidence interval of the ROR is greater than one;

the number of individual cases is greater than or equal to three for active substances contained in medicinal products included in the additional monitoring list [15] or five for the other active substances;

the event belongs to the Important Medical Events list [16], a subset that reflects more closely the concerns of pharmacovigilance experts.
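As an illustration of the three criteria above, the following sketch uses the standard textbook form of the ROR from a 2×2 contingency table, with a Wald confidence interval on the log scale. The exact computation used in EV is defined in the ESM and the guideline [7] and may differ in detail; the function names, the table layout and the example counts are assumptions.

```python
import math


def ror_with_ci(a, b, c, d, z=1.96):
    """Reporting odds ratio and approximate 95% confidence interval.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports without the drug, with the event
    d: reports with neither
    Zero cells are not handled in this simple sketch.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    ror = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


def is_sdr_in_region(a, b, c, d, additional_monitoring, is_ime):
    """Apply the three criteria listed in the text within one macro region."""
    min_cases = 3 if additional_monitoring else 5
    if not is_ime or a < min_cases:
        return False
    _, lower, _ = ror_with_ci(a, b, c, d)
    return lower > 1.0


def is_sdr(tables_by_region, additional_monitoring, is_ime):
    """An SDR is raised if the criteria hold in at least one macro region."""
    return any(
        is_sdr_in_region(*cells, additional_monitoring, is_ime)
        for cells in tables_by_region.values()
    )


# Hypothetical counts for one drug-event combination in two regions
tables = {"Europe": (10, 990, 100, 98900), "Japan": (1, 200, 50, 20000)}
flagged = is_sdr(tables, additional_monitoring=False, is_ime=True)
```

In this toy example the European table alone satisfies all three criteria, so the combination is flagged even though the Japanese table has too few cases.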

For each of the analyses, the number and the character of SDRs arising from routine statistical signal detection methods performed by the EMA for substances included in Centrally Authorised Products (CAPs) were calculated and then compared for either (i) spontaneous serious reports only or (ii) all spontaneous reports (serious and non-serious).

2.2.1 Comparing SDRs

To assess whether including NSRs generated a different set of potential signals, and the magnitude of this change, SDRs raised with only the serious reports were compared to SDRs raised with all reports, and as a result three groups were identified: (i) SDRs raised in both scenarios; (ii) SDRs raised only from serious reports; and (iii) SDRs raised only from all reports (serious and non-serious).
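The three-way grouping can be expressed with simple set operations. The sketch below assumes each SDR is identified by a (substance, Preferred Term) pair; the example pairs are hypothetical.

```python
def compare_sdr_sets(sdrs_serious, sdrs_all):
    """Split SDRs into the three groups described in the text.

    sdrs_serious: SDRs raised using serious reports only
    sdrs_all:     SDRs raised using all reports (serious and non-serious)
    """
    serious = set(sdrs_serious)
    everything = set(sdrs_all)
    return {
        "both": serious & everything,
        "serious_only": serious - everything,
        "all_only": everything - serious,
    }


# Hypothetical (substance, Preferred Term) pairs
from_serious = {("drug A", "PT 1"), ("drug A", "PT 2"), ("drug B", "PT 3")}
from_all = {("drug A", "PT 1"), ("drug B", "PT 3"), ("drug C", "PT 4")}
groups = compare_sdr_sets(from_serious, from_all)
```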

2.2.2 Performance Measures

To compare the effectiveness of signal detection methods, a definition of the properties that constitute an effective signal detection system and a reference database to measure the extent to which these properties are achieved were required. Regarding the reference database, the publicly available PROTECT ADR database of labelled reactions for CAPs [13] that maps the wording in section 4.8 of the Summary of Product Characteristics to Preferred Terms (PTs) of the Medical Dictionary for Regulatory Activities (MedDRA®) was used. As the reference database includes only CAPs, the analysis described in this section was restricted only to CAPs.

In line with previous work [9, 10, 12], the measures chosen to compare the effectiveness of signal detection were sensitivity (the proportion of adverse drug reactions [ADRs] from the reference database that were flagged as SDRs) and positive predictive value (PPV, the proportion of SDRs that corresponds to ADRs present in the reference database, also called ‘precision’ and considered a measure of efficiency).
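The two measures defined above can be computed as follows. This is a minimal sketch assuming that both the SDRs and the reference ADRs are represented as (substance, Preferred Term) pairs; the example data are hypothetical.

```python
def sensitivity_and_ppv(sdrs, reference_adrs):
    """Sensitivity and positive predictive value against a reference set.

    sensitivity = reference ADRs flagged as SDRs / all reference ADRs
    PPV         = SDRs present in the reference / all SDRs
    """
    flagged = set(sdrs)
    reference = set(reference_adrs)
    true_positives = len(flagged & reference)
    sensitivity = true_positives / len(reference) if reference else float("nan")
    ppv = true_positives / len(flagged) if flagged else float("nan")
    return sensitivity, ppv


# Hypothetical example: 3 SDRs, 4 reference ADRs, 2 in common
sdrs = {("drug A", "PT 1"), ("drug A", "PT 2"), ("drug B", "PT 3")}
reference = {("drug A", "PT 1"), ("drug B", "PT 3"),
             ("drug C", "PT 4"), ("drug C", "PT 5")}
sens, ppv = sensitivity_and_ppv(sdrs, reference)
```

Because the reference contains only positive controls, any SDR absent from it lowers the PPV, whether or not it reflects a true but unlabelled association.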

The SDRs identified from serious reports and the SDRs from all reports were then compared with the ADRs in the PROTECT ADR database to calculate sensitivity and PPV. Several sensitivity analyses were conducted to assess whether results were dependent on some specific settings in the reference database and/or on the data used to calculate SDRs:


Reference dataset restricted only to post-marketing ADRs: the PROTECT ADR database allows us to recognize the ADRs that were identified or confirmed during post-marketing. As EV is used to identify post-marketing ADRs, restricting the reference database to such events might serve as a useful comparison;


Maximise the proportion of NSRs on the total spontaneous reports: even if from 22 November, 2017 the proportion of NSRs showed a marked increase, their weight on the totality of cases in EV is still modest.

Restricting the analysis to include only data from the implementation of the new legislation allowed an estimation of the impact of NSRs when their relative proportion was at the highest and it simulated what could happen in a few years’ time as the weight of NSRs increases. As the legislation only mandated the reporting of NSRs from the EEA, an additional analysis further restricted to include only reports from the EEA allowed us to focus on the region where the proportion of NSRs is even higher.


Use of the discontinued method to identify SDRs: before the implementation of the new guideline on ‘Screening for adverse reactions in EudraVigilance’ [7], the disproportionality measure was calculated on the totality of spontaneous reports in EV, without considering the split by macro region. Using this now-discontinued method was considered a helpful analysis as it spread the weight of NSRs clustered in the EEA.

All the analyses listed addressed the question of whether including NSRs in the set of reports where the disproportionality algorithm is applied had an effect on the number and characteristics of SDRs; an alternative analysis was to explore the use of NSRs separately, answering the question whether the set of NSRs by itself could generate additional SDRs to improve signal detection. In this approach, tested by Seabrooke et al. [12], reports in EV were divided into two subgroups, serious and NSRs, and the ROR was estimated within each subgroup. An SDR was considered to exist if the conditions for an SDR were met within either of the two subgroups.

2.2.3 Number of Reports per SDR

As an additional measure to assess the potential impact on workload, the average and median number of reports associated with each SDR generated in the two scenarios (only serious reports or all reports) were compared.

2.2.4 Individual Review of False Positives

The underlying assumption of using a reference database of only positive controls is that any SDR with the event not coded in the reference is considered as a false positive rather than an unknown association [14]. The consequence is that the estimate of PPV obtained will be slightly conservative, but, more importantly for the scope of this analysis, any comparative evaluation between SDRs from only serious reports and from all reports is unlikely to be misleading as both scenarios should roughly be equally disadvantaged by inaccuracies in the list of positive controls.

To have some empirical confirmation of the false positives in both scenarios, 40 SDRs considered as false positive and appearing only in one scenario (20 SDRs raised only from serious reports and 20 SDRs raised only from all reports) were selected from those with the highest ROR and reviewed individually and independently by two experienced assessors in signal management to confirm they were not potential signals.

2.2.5 Profiling of SDRs Appearing in Only One Scenario

The profiles of SDRs appearing only when the disproportionality algorithm was applied to serious reports and of SDRs appearing only when it was applied to all reports (serious and non-serious) were further analysed to investigate whether they differed in terms of reactions and substances involved, and whether any pattern could be established.

3 Results

3.1 Descriptive Analysis

3.1.1 Chronological Trend of NSRs in EV

Figure 1 shows a marked increase of NSRs both in absolute number and as a proportion of the total number of cases reported in EV. Figure 2 shows that the change was more pronounced when the EV analysis was restricted to EEA cases, where NSRs represent 61.6% of the reports submitted after the new reporting requirements on NSRs entered into force. Up to and including November 2017, less than 25% of EEA reports in EV were classified as non-serious; since then, this figure has risen to 61.6%.

Fig. 1

Quarterly numbers of total spontaneous and non-serious spontaneous reports in EudraVigilance

Fig. 2

Quarterly numbers of European Economic Area (EEA) total spontaneous and non-serious spontaneous reports in EudraVigilance

3.1.2 Reporting by Primary Source: EEA

Reports of suspected ADRs may be initiated by a patient, by a healthcare professional (HCP), or by both a patient and HCP. Litigation cases have not been considered here in line with the criteria described in section 4.1.5 of the guideline on ‘Screening for adverse reactions in EudraVigilance’ [7]. Table 1 shows the increases in the proportion of NSRs from each source following the legislation. A marked increase in the proportion of NSRs was seen in each of the three categories of primary source reporting, with reports originated by a patient remaining the category with the highest proportion of NSRs. The category of ‘patient and HCP’ saw the biggest increase in relative terms. The category of ‘only patient’ also had the largest increase in absolute numbers of reports (serious and non-serious) in comparison with the levels before the implementation of the legislation.

Table 1 Number and proportion of non-serious reports before and after the reporting rule change, split by primary source

3.1.3 Reporting by Age of Subject: EEA

There was some variation in increases in NSRs by age of the subject of the report (Table 2). The biggest increase in the relative proportion of NSRs was seen in the adult category and when age was not reported; paediatric remained the category with the highest proportion of NSRs among the categories with age reported.

Table 2 Number and proportion of non-serious reports before and after the reporting rule change, split by age group

3.1.4 Reporting by Country: EEA

Figure 3 shows that some countries were already sending high numbers of NSRs before the change in the reporting requirements, as EV permitted the voluntary submission of NSRs before the implementation of the new legislation. However, after the change, the proportion of NSRs was around 60% in most countries and the variation between countries was much reduced (see Table 2 of the ESM).

Fig. 3

Proportion of non-serious reports (NSRs) on the total spontaneous reports before and after the report rule change, split by European Economic Area countries (As the data for this analysis were retrieved prior to its withdrawal from the European Union, the UK has been considered as part of the EEA.)

3.1.5 Impact of the New ICH-E2B(R3) Reporting Standard: EEA

The ICH-E2B(R3) reporting standard requires seriousness to be assigned for each suspected reaction in the report, allowing serious and non-serious reactions to be individually identified in the same report. The left side of Table 3 shows the number and the type of adverse events for the cases reported in EV with the ICH-E2B(R3) format. The total number of reports received (200,549) represented more than a quarter (26.4%) of the total spontaneous cases received in EEA after the new legislation came into force. Case reports where both serious and non-serious reactions are reported represent a minority (approximately 8%).

Table 3 Total number of reports and mean number of adverse events per report submitted in ICH-E2B(R3)

Thus, while there is a possibility that the advent of ICH-E2B(R3) could have qualitatively changed the nature of some reports—as non-serious adverse reactions under ICH-E2B(R2) could have previously been hidden under what is now a report with both serious and non-serious reactions—we consider this to be limited in number and therefore unlikely to have a large impact on this study. Finally, the right side of Table 3 also shows how reports with both serious and non-serious reactions have on average twice the number of adverse reactions compared with the reports with only either serious or only non-serious reactions.

3.2 Impact on Statistical Signal Detection

3.2.1 Comparing SDRs

The SDRs under the two scenarios (only serious or all reports) were compared. The majority (83,170, representing 91.9% of all SDRs) were unchanged, with only a small fraction of SDRs appearing in one scenario only; no major changes were therefore expected in the subsequent analyses. Including NSRs led to an overall increase of 5.5% in SDRs, with 5981 new SDRs gained but 1337 lost (see Fig. 4).

Fig. 4

Signals of disproportionate reporting (SDRs) considering only serious or all spontaneous reports in EudraVigilance (threshold of at least five cases applied). NSRs non-serious reports
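The reported percentages can be reproduced from the three counts given above. The choice of denominators in this sketch is an inference: the union of SDRs from both scenarios for the share unchanged, and the serious-only total for the net increase. It is consistent with the published figures but is not stated explicitly in the text.

```python
# Counts reported in the text
common = 83170   # SDRs raised in both scenarios
gained = 5981    # SDRs raised only when NSRs are included
lost = 1337      # SDRs raised only from serious reports

serious_only_total = common + lost    # SDRs using serious reports only
all_reports_total = common + gained   # SDRs using all reports
union_total = common + gained + lost  # distinct SDRs across both scenarios

# Assumed denominators (see lead-in)
share_unchanged = common / union_total
share_discordant = (gained + lost) / union_total
net_increase = (all_reports_total - serious_only_total) / serious_only_total

print(f"unchanged:    {100 * share_unchanged:.1f}%")
print(f"discordant:   {100 * share_discordant:.1f}%")
print(f"net increase: {100 * net_increase:.1f}%")
```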

A small increase in the number of SDRs appeared when NSRs were added to serious reports. However, it is not immediately clear whether this increase can be attributed to the reliability of information contained in the NSRs or to the increase in the volume of reports. This is because an increase in volume of reports can also impact on the number of SDRs [9, 10, 12].

To investigate whether the increase in volume alone would increase the number of SDRs, a set of reports equal in number to the NSRs was created. To make this set not completely random, the same products and adverse events as in the NSR set were used, but their combination was assigned at random. Therefore, the set of random reports contained exactly the same number of occurrences of each product and adverse event as the NSRs, but the combinations between products and adverse events were different. When these reports, from which the original valid information had been replaced by random information, were used, an increase of 4.4% in SDRs was found, compared to 5.5% with true NSRs. These results were consistent over ten repetitions of the randomisation.
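One way to build such a randomised set is to shuffle the event column against the product column, which preserves the count of every product and every event while breaking their pairing. The sketch below assumes the simplified case of one product and one event per report; how multi-drug or multi-event reports were handled in the study is not described, so this is an illustration rather than the procedure actually used.

```python
import random
from collections import Counter


def randomise_pairs(pairs, seed=None):
    """Return product-event pairs with the same marginal counts but random pairing.

    pairs: list of (product, event) tuples, one per report (a simplification).
    """
    rng = random.Random(seed)
    products = [product for product, _ in pairs]
    events = [event for _, event in pairs]
    rng.shuffle(events)  # break the original product-event association
    return list(zip(products, events))


# Hypothetical non-serious reports
nsr_pairs = [("drug A", "nausea"), ("drug A", "headache"),
             ("drug B", "rash"), ("drug B", "nausea"), ("drug C", "rash")]
random_pairs = randomise_pairs(nsr_pairs, seed=1)

# The marginal counts are identical to those of the original set
same_products = Counter(p for p, _ in random_pairs) == Counter(p for p, _ in nsr_pairs)
same_events = Counter(e for _, e in random_pairs) == Counter(e for _, e in nsr_pairs)
```

Repeating the shuffle with different seeds corresponds to the repetitions of the randomisation mentioned in the text.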

3.2.2 Performance of Signal Detection: CAPs

Review of the reference database revealed that many (81%) of the ADRs at PT level listed in the Summary of Product Characteristics for the CAPs have never been reported in spontaneous reports to EV. This was unchanged by inclusion of NSRs (see Table 3 of the ESM).

In terms of performance, based on CAPs only, using both serious reports and NSRs resulted in a marginal increase in true and false positives compared to focusing only on serious reports. The results in Table 4 show a slightly higher sensitivity (by approximately 0.4 percentage points) and the same PPV with either three or five as the threshold for the number of cases to define an SDR. These results were qualitatively unchanged in all the sensitivity analyses performed. See Table 5 for results using only adverse drug reactions identified post-marketing in the reference database. The alternative analysis to explore the use of NSRs separately revealed that the set of NSRs by itself could generate only a very limited number of additional ADRs (26) and this approach led to a slightly lower sensitivity and roughly the same PPV when compared with the main approach used in this study of analysing serious reports and NSRs together (see Tables 4 and 5 of the ESM).

Table 4 Performance of signal detection methods on Centrally Authorised Products

To verify whether the impact on sensitivity and PPV can be mainly attributed to the reliability of information contained in the NSRs or to the increase in the volume of reports, the same set of random NSRs described in Section 3.2.1 was added. A small increase in sensitivity was still obtained but at the cost of a small decrease in PPV; using the five reports threshold, sensitivity rose from 10.8 to 10.9% while PPV decreased from 13.6 to 13.5%. These results were consistent over ten repetitions of the randomisation.

Table 5 Performance of signal detection methods on Centrally Authorised Products: sensitivity analysis using only adverse drug reactions identified post-marketing in the reference database

3.2.3 Number of Reports per SDR

Including NSRs generated 5.5% more SDRs; however, the average number of reports (individual case safety reports) associated with each SDR was similar to the average when only serious reports were considered, at just below 40, and the median was unchanged at nine in both scenarios.

3.2.4 Individual Review of False Positives

The individual and independent review by two signal management assessors of 20 SDRs unique to each scenario (i.e. raised from only serious or from all spontaneous reports) and not in the reference database did not identify any potential signals for further review.

3.2.5 Profiling of SDRs Appearing in Only One Scenario

The unique SDRs generated analysing only the serious reports were contrasted with those unique SDRs generated analysing serious and NSRs. Table 6 shows the MedDRA® PTs most reported in these discordant sets of SDRs.

Table 6 Top 12 MedDRA® PTs most reported in the discordant sets of signals of disproportionate reporting

On each side of the table, the PTs are ranked by frequency in the discordant set in the heading, but the ‘Rank’ column indicates the ordering of that particular PT amongst SDRs generated by the other report set. It can be seen that there were some unexpected changes in the SDRs found when NSRs were added, such as the serious disorder of Stevens–Johnson syndrome making it to the top place as well as other serious conditions (e.g. circulatory collapse or multiple organ dysfunction syndrome).

The emphasis of statistical signal detection might shift to different clinical domains with the inclusion of NSRs. An indication of such a change can be obtained by examining the relative risk that an SDR will be generated in each MedDRA® System Organ Class when NSRs are included compared with using only serious reports. Figure 5 shows the System Organ Classes ordered by this relative risk. For example, the risk that an SDR will occur in the ‘Reproductive system and breast disorders’ category is raised by a factor of 1.135 when NSRs are included.

Fig. 5

Number and relative risk of signals of disproportionate reporting by Medical Dictionary for Regulatory Activities (MedDRA®) System Organ Class when non-serious reports are included
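The exact definition of the relative risk by System Organ Class is not spelled out in the text. One plausible reading, used in the sketch below purely as an assumption, is the share of SDRs falling in a given System Organ Class when all reports are used, divided by the corresponding share when only serious reports are used. The class labels and counts are hypothetical.

```python
from collections import Counter


def soc_relative_risk(socs_serious, socs_all):
    """Ratio of SOC shares: all reports versus serious reports only.

    socs_serious: SOC label of each SDR raised from serious reports only
    socs_all:     SOC label of each SDR raised from all reports
    Assumed definition (see lead-in); SOCs with no SDR in the serious-only
    scenario are skipped because the ratio would be undefined.
    """
    counts_serious = Counter(socs_serious)
    counts_all = Counter(socs_all)
    n_serious = len(socs_serious)
    n_all = len(socs_all)
    return {
        soc: (counts_all[soc] / n_all) / (count / n_serious)
        for soc, count in counts_serious.items()
    }


# Hypothetical SOC labels, one per SDR
serious_scenario = ["SOC X", "SOC X", "SOC Y", "SOC Y"]
all_scenario = ["SOC X", "SOC X", "SOC X", "SOC Y", "SOC Y"]
relative_risk = soc_relative_risk(serious_scenario, all_scenario)
```

Under this reading, a value above one means that the System Organ Class accounts for a larger share of SDRs once NSRs are included.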

The SDRs that differ when contrasting the analyses of only serious reports vs serious reports and NSRs may also change the focus with respect to substances to be investigated. Table 7 shows that there is no overlap in the most frequently reported substances in the discordant SDRs. The display of frequency and ranking order follows the same rationale as in Table 6.

Table 7 Top ten most reported substances in the discordant sets of SDRs

4 Discussion

The change in the rules for reporting suspected ADRs to EV came into effect in November 2017 and has resulted in an increased level of reporting of NSRs, both in absolute number and in relation to the total. The change was more pronounced when the analysis was restricted to cases arising in the EEA, as expected because the relevant reporting rule change only applied to EEA reporting. The increase in the number of NSRs was also affected by the new legislation mandating the submission of patient and consumer reports to EV.

We have examined both the number and the character of SDRs arising from routine EMA statistical signal detection methods applied either solely to spontaneous serious reports or to all spontaneous reports, i.e. including the NSRs. The vast majority of SDRs (91.9%) were common to both scenarios.

Inclusion of NSRs produced some new SDRs and eliminated others. Overall, an increase of about 5.5% in SDRs was noted; however, inclusion of a similar number of computer-generated reports, obtained by randomising the adverse event and substance lists from the NSRs, also consistently increased SDRs by 4.4%. This suggests that only about 1.1 percentage points of the increase in SDRs can be attributed to the information contained in the NSRs. However, the randomised reports increased sensitivity only modestly (by 0.1 percentage points, from 10.8 to 10.9%), and this was counterbalanced by a corresponding decrease in PPV, whereas the real NSRs increased sensitivity from 10.8 to 11.2% while the PPV remained unchanged.

The maintenance of the PPV and consequently of efficiency in terms of SDRs per ADR detected is a positive finding and is reinforced by the finding that the average number of reports contributing to each SDR did not increase. Hence, the workload per ADR seems likely to have remained unchanged (even if the workload has increased overall). The different sensitivity analyses performed, looking at different time periods and/or focusing only on the set of ADRs identified post-marketing, showed the same pattern of slightly increased sensitivity and similar PPV, thus providing confidence in the robustness of the aggregated results.

The discordant SDRs highlighted only in one scenario are a minority (8.1% of the total SDRs); nevertheless, there were changes in the character of SDRs arising from reports that include NSRs. Both the nature of the adverse events and the sets of products have changed. However, the significance of these changes seems small when compared to the totality of the SDRs and the impact will only become clear in the long term as the proportion of NSRs in the database increases.

An interesting observation from the analysis was that there was considerable variation in transmission of NSRs between Member States prior to the legislation and some were already sending a high proportion if not all of their NSRs. Hence, an analysis of the potential impact of legislative changes could have been made prior to implementation and might have given useful information in shaping the legislation. The variation in transmission of NSRs between Member States decreased markedly after the legislation; the reasons for this remaining variation were not explicitly explored but likely reflect different cultures for reporting, different modalities being put in place by national competent authorities and different levels of promotion of patient reporting.

The analyses performed have some limitations. The timing to detect an ADR was not studied; although time is considered a key criterion for signal detection methods, the 91.9% overlap of SDRs between the two scenarios considered did not suggest an important change on any of the dimensions studied. The greater number of reports when NSRs are included would indicate that, if any effect on timing to detect an ADR was present, it would be on reaching more quickly the threshold on the number of cases to have an SDR.

Another limitation is that the reference database was based only on CAPs; however, it is not expected that Nationally Authorised Products (NAPs) would systematically differ from CAPs in terms of performance of statistical signal detection. The analysis showing that 91.9% of SDRs remained unchanged included NAPs and reinforces the suggestion that the conclusions would not be impacted.

Finally, the analysis of the effect of including NSRs was mainly limited to the impact on statistical signal detection; no attempt was made to establish whether there is any effect on subsequent phases of signal management including signal evaluation. Potential areas of further research could include determining whether NSRs provide more evidence to decide whether to validate, or not, a signal, or whether the proportion of NSRs on the total number of reports for each SDR could be used as an additional indicator to prioritise, or not, the drug-reaction combination for review.

5 Conclusions

The November 2017 change in reporting rules to EV mandating submission of NSRs of suspected ADRs originating in the EEA has resulted in a substantial increase in the number of such reports received. This change has led to a small increase in the sensitivity of the routine statistical signal detection used to detect safety signals at the EMA and has not affected its efficiency. The number of SDRs per ADR detected and the median number of reports contributing to each SDR have remained constant.

Some new SDRs were detected when NSRs were included in the calculations and some SDRs were eliminated. The characteristics of these discordant SDRs were different both with respect to the profile of the adverse event and with respect to the products. These changes, even if impacting only a small proportion of SDRs, require long-term monitoring and careful scientific evaluation to understand whether there is any impact on the type of statistical signal detected and any implications for methods and processes used. However, at the current time and based on these results, it is not considered necessary to amend the current EMA practices in screening EV. The total proportion of NSRs is likely to increase over time and further monitoring of their impact will be required.