Key Points

‘Black swan events’ are unexpected, severe situations that, in retrospect, can seem predictable. A subset of safety risks seen in pharmacovigilance are such black swan events.

Adequate data ingestion is core to safety. Machine learning (ML) during pharmacovigilance data ingestion should make both data intake and processing more effective and enable signal detection and management, or—at an absolute minimum—not hinder it.

Routine use of ML in pharmacovigilance data ingestion requires considering the potential for black swan events. It needs to support both adequate ingestion of familiar and common reporting patterns or attributes and unexpected changes in reporting that might signify a black swan event.

There are many manifestations of potential black swan events in pharmacovigilance data, but ML can be anticipated to support or enable the adequate ingestion of data on these events if ML best practice and—when needed—mitigatory strategies are employed.

1 Introduction

The ability to effectively capture, organise, compile, analyse, share, and act on highly heterogeneous data in a timely manner is at the heart of pharmacovigilance [1]. As data volumes grow [2] alongside complex and variable international regulatory requirements, there is a pressing need to harness emerging scientific and technological advances so that data management remains workable and effective and the focus remains on patient safety [3]. Pharmacovigilance systems are necessarily complex as they ingest data from very different sources. They are therefore made up of many distinct process steps with different purposes across data intake, processing, and reporting. All of this must happen in a timely and effective manner [4].

‘Intelligent automation’ uses automation technologies to streamline and scale decision making across organizations. It simplifies processes, frees up resources, improves operational efficiencies, and has various applications [5]. Intelligent automation incorporates robotic process automation and ML with other processes to enable rapid and effective intake, processing, and reporting of heterogeneous data at scale. To build trust in systems incorporating algorithmic learning, it is paramount that they can identify and appropriately manage unexpected, and even rare, data points.

2 Trusted Intelligent Automation and Data Ingestion

Given the large amount of manual effort routinely required in pharmacovigilance operations to effectively manage safety data, the promise of intelligent automation in pharmacovigilance is enormous [6, 7]. Rule-based systems are in production use, for example, in duplicate checking and data quality review [8, 9]. More and more opportunities for robotic process automation will be identified, and increasing numbers of solutions will therefore continue to have a clear impact on the automation of routine tasks. However, the complexity and breadth of medical knowledge limit the possibility of capturing all relevant information in rules [10]. The additional benefits of automation will be seen as ML capabilities in safety evolve to supplement rule-based data ingestion and communication [6]. ML has advanced enormously in recent years, and we believe that what are now standard ML approaches from other industries and areas can and will be increasingly applied in pharmacovigilance. Although the usefulness of ML will be limited for some current tasks by the current volume of training data [6], no such limitation should be anticipated for other tasks, particularly those that are the same as or similar to tasks in broader healthcare (e.g., voice input technology to facilitate work in electronic healthcare records [11]). Ultimately, ML will be routinely used across all parts of the pharmacovigilance data lifecycle, from data ingestion to analysis and augmented decision making, as automation becomes increasingly ‘intelligent’.

Effectively negotiating the regulatory challenges of validating emerging ML-based technologies is also key to building confidence and trust among pharmacovigilance stakeholders [12].

ML credibility and reliability are based on demonstrated learning from the types of training data most commonly seen during development. Although these training data may be representative and voluminous, they do not represent the universe of potential inputs. Further trust can be built by appropriately identifying and learning from inputs that fall outside the training data (outliers) by, for example, computing a low confidence score or triaging for manual (human) assessment.
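
As a minimal sketch of triaging on a confidence score, the following Python example routes a prediction either to automated processing or to human review. The names `TriageDecision`, `triage`, and `REVIEW_THRESHOLD`, and the threshold value itself, are illustrative assumptions rather than part of any described system; in practice a threshold would need to be chosen and validated for each task.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical cut-off; a real value would be set and validated per task.
REVIEW_THRESHOLD = 0.80


@dataclass
class TriageDecision:
    label: Optional[str]   # predicted label, or None when the model abstains
    confidence: float      # highest class probability reported by the model
    route: str             # "auto" or "human_review"


def triage(class_probabilities: Dict[str, float],
           threshold: float = REVIEW_THRESHOLD) -> TriageDecision:
    """Accept the top prediction only if its probability reaches the threshold."""
    if not class_probabilities:
        # No prediction available at all: always send to a human.
        return TriageDecision(None, 0.0, "human_review")
    label, confidence = max(class_probabilities.items(), key=lambda kv: kv[1])
    if confidence < threshold:
        # Low confidence: abstain rather than force a label onto the case.
        return TriageDecision(None, confidence, "human_review")
    return TriageDecision(label, confidence, "auto")
```

For example, `triage({"Headache": 0.95, "Migraine": 0.05})` would be routed to `"auto"`, whereas an evenly split prediction would be routed to `"human_review"` with no label assigned.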

Intelligent automation solutions must ultimately show that a combination of rules-based systems and ML can ensure, as effectively and reliably as possible, that data in safety databases are promptly available for analysis and reflect all known and relevant information from a suspected adverse event (AE) occurrence (see Fig. 1).

Fig. 1 ICSR process overview and how an ICSR ML-enabled process could approach the challenge of preventing loss of fidelity of data. HCP healthcare practitioner, ICSR individual case safety report, ML machine learning, PV pharmacovigilance

The use of ML in safety can be broadly distinguished by asking two questions [6]. First, can ML further our ability to find novel data patterns as we conduct signal detection or consider data aspects in our evaluations that might otherwise be suppressed or hidden and therefore missed (e.g., variable combinations for propensity score estimation in pharmacoepidemiology studies) [13]? Second, to what extent can we do existing data ingestion work, particularly the more routine operationally repetitive tasks, more efficiently and effectively through learning from data, or take on new work that was not previously possible [4]? For example, ‘autoencoding’ verbatim text to International Conference on Harmonisation Medical Dictionary for Regulatory Activities (MedDRA) coded AE terms during data entry [14] is an ML opportunity where pre-identified specific rules may be impossible or at least limiting, assuming sufficient training data exist to enable ML to systematically handle the data. Although the first use is out of scope for this manuscript, we should note that the analytic challenges to resolve across these two pharmacovigilance application areas are often very similar (e.g., identifying medical events in free text) [15]. However, the aims of ML in data entry and in data output analysis are very different: ML for analysis of data outputs aims to find interesting patterns and outliers, whereas ML in automation focuses on learning effectively from previous data to adapt processes and approaches. As an example, ML in data ingestion might help filter out noise and errors while ensuring that the original core data are represented with excellent fidelity and minimal information loss.

2.1 The Concept of Black Swan Events

Although unexpected findings affect all elements of drug (and vaccine) development, this is perhaps most keenly felt in the field of safety surveillance, where—in addition to surveillance of known or suspected safety issues and the potential occurrence in routine healthcare practice—the identification of previously unanticipated safety issues is of paramount importance. In refining our understanding of the safety of routinely used medicines, the capability to effectively identify previously unexpected safety signals is crucial [16].

Even a rare safety issue can lead to drastic changes in the recommended usage or even the withdrawal of a medicine or vaccine already used in routine healthcare delivery. An accurate understanding of the emerging safety profiles of medicines and vaccines during development is vital for effective and appropriate progressing of drug and vaccine development programmes.

The concept of ‘black swan events’ is of relevance to pharmacovigilance. Taleb [17] defines a black swan event as one of extreme impact that, although outside the realm of regular expectations (i.e., prospectively unpredictable), prompts humans to concoct explanations for its occurrence after the fact, making it seemingly explainable and predictable (i.e., retrospectively distorted). Humans are often irrational when considering risk and probabilities [18]. Sandman et al. [19] argued that the public’s response to risks is driven by the nature of the hazard and their outrage that such events could happen, accompanied by a strong urge to blame someone if and when they do occur. This tendency naturally leads to retrospective distortion as selective attention is given to different sources of risk. The social amplification of risk can be seen as an ex-post attempt to make up for failures to anticipate extreme events [20]. In addition to impact and retrospective predictability, all black swans are ‘outliers’ as they are unexpected given what has been seen before.

Europeans referred to a black swan as something impossible until actual black swans were discovered in Australia in 1697. The term black swan event has been perhaps most used in the financial industry, for example, to refer to the global financial crisis of 2008–2009 [21]. It has also been used to refer to the Ebola epidemic in Africa in 2015 [22]. Most recently, the COVID-19 pandemic has been referred to as a black swan event in terms of its widespread impact [23, 24].

Taleb’s definition of a black swan event as prospectively unpredictable depends on the observer. An event that was a black swan to one person (or algorithm) may have been predictable and therefore not a black swan to another. Presumably, at least some Australians in 1697 would have dismissed the notion of white swans. One needs to be aware of the possible existence of future potential black swans, and therefore systems and processes should detect, or at least flag during ingestion, data that might signify potential black swan events proactively and prospectively.

3 Black Swan Events in Safety Data

The concept of black swan events has only received cursory reference in the pharmacovigilance literature, and not in the context of intelligent automation [25, 26]. We consider a pharmacovigilance black swan event to be a drug or vaccine–AE combination that became a new, unexpected safety signal and then had a big impact on the benefit–risk profile of the medicine/vaccine, which some may have subsequently felt could have been identified earlier with the benefit of hindsight. For this manuscript, we define ‘big impact’ as a safety risk that significantly affects the known benefit–risk profile of a medicine or vaccine, leading to changes in medicine or vaccine utilization. ‘Unexpected’ could mean that data accruing are systematically different from what was previously in the pharmacovigilance system (data in general or, more narrowly, increased reporting of a specific drug/vaccine–AE pair; i.e., a ‘data outlier’) and/or that the emerging risk was unpredictable in terms of contemporaneous medical knowledge, irrespective of reporting pattern (a ‘medical outlier’). Table 1 shows examples of different historic pharmacovigilance safety signals and/or risks that could be considered black swan events. They are distinct in terms of the nature of medical knowledge at the time of risk identification or in terms of data outlier manifestation. Black swan events may be either medical or data outliers or both data and medical outliers. This table emphasizes the variety of potential black swan events that a pharmacovigilance system must look to identify a priori.

Table 1 Examples of different drug or vaccine–adverse event pairs that could be considered pharmacovigilance black swan events and how these exemplars can inform future machine learning data ingestion strategies

ML at its core requires learning from data, and algorithmic performance relies on sound, contemporaneous, up-to-date training data to learn. The use of ML in automated ingestion raises concern that outliers, including potential black swan events, may be smoothed away into the generality of the data as noise rather than attracting the focus they deserve as data points that may be systematically different from the other observed data and therefore require further investigation. Black swan events present one specific challenge to the trusted use of ML approaches in pharmacovigilance. ML researchers actively study the handling of ‘out-of-distribution’ data and providing uncertainty estimates for model predictions [27,28,29,30].
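
One common way of producing an uncertainty estimate is to compare the predictions of several independently trained models. The sketch below, using only the Python standard library, averages the class probabilities of the ensemble members and reports the normalised entropy of the average. The function names are illustrative assumptions, and this is one simple approach among many rather than the method of any cited work.

```python
import math
from typing import List, Sequence


def mean_prediction(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors produced by each ensemble member."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(m[i] for m in member_probs) / n_members for i in range(n_classes)]


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ensemble_uncertainty(member_probs: Sequence[Sequence[float]]) -> float:
    """Entropy of the averaged prediction, scaled to [0, 1] by log(n_classes).

    Values near 0 mean the members agree confidently; values near 1 mean the
    averaged prediction is close to uniform, e.g. because members disagree.
    """
    avg = mean_prediction(member_probs)
    if len(avg) < 2:
        return 0.0
    return predictive_entropy(avg) / math.log(len(avg))
```

A high value would suggest that the input may differ from the training data and could be flagged for closer inspection, although a low value alone does not guarantee that an input is in-distribution.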

Much of the safety data to be ingested into pharmacovigilance systems concern events that are neither unexpected (e.g., reports of well-established safety issues) nor serious (e.g., mild events), and some have minimal potential to impact the understood benefit–risk profile of a medicine [3]. A credible ML-enabled intelligent automation process for data ingestion needs to appropriately ingest the data manifestation of these known/familiar/predictable AEs and appropriately handle the rare new information, including potential pharmacovigilance black swan events. The data manifestation of a potential black swan event would be either data outliers in quantitative terms (i.e., drug–event pairs, or aspects of their reports, reported more frequently than might have been anticipated) or clinical outliers (i.e., clinically convincing data on safety reports alluding to clinical novelty) (see Table 1).
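
To illustrate what a quantitative data outlier might look like, the sketch below flags a drug–event pair whose observed report count is improbably high relative to an expected count. Modelling counts as Poisson, the function names, and the `alpha` cut-off are all simplifying assumptions made for illustration; they do not represent an established signal detection method or threshold.

```python
import math


def poisson_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected), by direct summation."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = 0.0
    for k in range(observed):   # accumulate P(X = 0) .. P(X = observed - 1)
        cdf += term
        term *= expected / (k + 1)
    # Clamp at zero to guard against floating-point round-off.
    return max(0.0, 1.0 - cdf)


def flag_pair(observed: int, expected: float, alpha: float = 0.001) -> bool:
    """Flag the pair when the observed count is improbably high (hypothetical cut-off)."""
    return poisson_tail(observed, expected) < alpha
```

With an expected count of 2 reports, 10 observed reports would be flagged under these assumptions, whereas 3 would not. A count-based check of this kind cannot recognise a clinical outlier, which depends on the content of individual reports rather than their volume.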

4 Machine Learning to Pre-Empt, Alert, and Mitigate Potential Black Swans During Safety Data Ingestion

In Sect. 4.1, we list some ML implementation strategies to pre-empt and alert to potential black swan events during data ingestion, so that signal detection with any approach is at least as likely to—as quickly as possible—identify the emerging safety issue as would have been possible with more manual data ingestion processes. Table 2 shows a range of examples of how ML might be applied to aspects of data ingestion and illustrates the pitfalls of injudicious ML implementation that could lead to poor representation of data in safety databases, potentially impacting signal detection. On the other hand, there is no certainty, of course, that manual curation and processing have always optimally helped in the identification of black swan events, and one can anticipate that, in the future, ML perhaps might identify some such deficiencies.

Table 2 Examples of using machine learning in the data ingestion process where inappropriate deployment could complicate downstream identification of an emerging signal

Further perspective is needed here. As discussed earlier, black swan events are considered from a specific human vantage point. Safety is littered with examples that some would consider black swan events, although some of these safety issues may have seemed unexpected from a contemporaneous medical knowledge viewpoint while exhibiting the more typical patterns of accumulating data seen with many emerging safety issues. Such typical data patterns include increasing volumes of reports describing a specific vaccine/drug–event pair or related concepts. The increasingly informative cumulative data conveyed on these reports represent typical pharmacovigilance information (e.g., well-known coded adverse drug reaction terms and other information important in safety data assessment), so that the ‘unexpectedness’ lies in the lack of plausibility of an association between the drug/vaccine and the AE. Safety issues presenting in this way would be unlikely to be problematic for effective intelligent automation with ML from an algorithmic perspective in terms of case intake and processing, and the data could be efficiently handled with minimum human involvement while retaining the fidelity of the stored and transmitted data to that originally received. Some of the black swan event examples in Table 1 have data patterns similar to those seen previously (volume and nature of data conveyed on safety reports); some do not. Therefore, emerging issues with different data patterns require different mitigatory surveillance strategies. There may be a lack of data for some, or data may differ from what was seen previously.

Although we concur with others that there is enormous potential for more effective learning from data for medicines and vaccines [31], practical approaches are needed to handle and learn effectively from ‘predictable’ data (i.e., data similar in properties to historic data such as ongoing reporting of common well-established AEs) but also to pre-empt and be aware of potential black swan events and optimally learn from data outliers. Such practical approaches could realize the full potential of data-driven technologies in pharmacovigilance and, fortunately, are well-studied in the technical ML literature; see the discussions of ‘out-of-distribution learning’ in Hendrycks et al. [30], Blundell et al. [32], Shafaei et al. [33], and Meinke et al. [34].

4.1 Considerations for Pharmacovigilance Use of Machine Learning to Avoid and Pre-Empt Potential Black Swan Events

The following points should be considered when using ML-based systems to enable pre-emption of and alerting to potential black swan events in data ingestion and related tasks in pharmacovigilance.

  • Have in place a pre-planned mitigatory process to provide essential confidence and trust in the use of ML systems. If a black swan does occur, have a process to deal with unexpected outcomes.

  • ‘Data drift’ can occur over time as models are used in production [35, 36]. That is, the data can change gradually, perhaps almost imperceptibly, until they are very distinct from the data at production launch. Data drift needs to be monitored by defining and then complying with ‘significant change’ triggers for human review [37, 38]. Such a trigger is a pre-defined threshold on one or more metrics of model performance that, if breached, redirects cases for more extensive human review instead of routine use of the ML model. Humans are generally better suited to deal with unexpected patterns outside of the expected scope. Such reviews may lead to changes to the ML model through retraining on new data and redeploying the updated algorithm.

  • Continually monitor for unanticipated changes in model performance metrics (e.g., leveraging extreme value theory, which focuses on statistical properties of rare events) [39].

  • Ensure metadata, which explain the data, are kept current for linking, grouping, and interpreting data fields for increased insight into the data profile.

  • Orthogonal data mining (i.e., using different algorithms that are as independent as possible to discover previously unknown patterns/outliers representing fragments or indicators of emerging black swans) reduces the probability of missing an outlier by using two distinct methods. Similarly, ensemble methods [27, 40] can also be used; these create multiple distinct models and then combine them to produce improved results.

  • Issue periodic model robustness challenges to find any weaknesses/limiting assumptions, such as the use of ‘adversarial attack’ approaches, which look to deliberately subvert otherwise reliable ML systems [41]; continually assess the impact of input to the model (e.g., with sensitivity analyses and input data perturbation); and observe changes in the amount or distribution of missing data. This approach would aim to understand model behaviour when interacting with unexpected data to create a more robust system or model.

  • Model prediction needs to account for rare events that do not occur in the training data. An ML method should always compute the level of confidence in its predictions and make it transparent to users, and should always have a way of returning a ‘NULL’, ‘prediction unable to be made’, or ‘requires human review’ response when prediction is not possible [42,43,44].

  • Frequently update the model with new training data to ensure ongoing model generalizability. Model retraining should be triggered by one or more metrics that measure drift in the data and its effect on model performance. Cost-sensitive loss functions, which weigh up the cost of missing a rare event, should be applied to judge the quality and capability of the models. If retraining is not triggered within a set amount of time, consider setting a time-based trigger for retraining.

  • Articulate clearly any uncertainty in ML models using methodological approaches. For example, if using a neural network model, add a diversity term in the loss function during training [45] and consider the uncertainty to the fullest extent possible in resulting use [46] to make the potential for when and where black swan events might occur as clear as possible to users.

  • Enhance credibility and useability to the extent possible by explainability, not only of how the model works and underlying assumptions but also regarding uncertainty in the model. Ensure trust in the evaluation and assessment methods used to develop the ML approach and maximize the reproducibility of ML analyses [47] to enable others to test and challenge the approach and reduce the chance of black swan events.

  • Employ existing explanation methods for prediction models to better understand the influential input parameters of a model and thereby enable one to determine whether a model is likely to be robust to changes in the environment or confounding factors [48].

  • Appropriate training of the users of the ML application is essential [49].

  • When practising ML, the pre-processing step is part of the model and should have precision and recall metrics associated with it, and the performance of the whole system should ideally be measured or constructed at once for ‘end-to-end learning’, starting from raw data with deep learning.

  • Data should never be smoothed or left out to make the ML easier to execute. Instead, ensure enough data, and use methods that address the ambiguity.

  • Critical analysis of known and likely mechanistic factors influencing treatment and diseases could be leveraged to better understand what factors of variation a model would have to be robust to and to formalise how changes in our understanding over time may necessitate model adaptation [50].
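
The ‘significant change’ and time-based triggers described in the list above could be sketched as follows. The example compares the distribution of a categorical input field (for instance, a coded term or report source) between a baseline window and a current window using the population stability index, one of several possible drift metrics. The function names, the threshold of 0.25, and the 180-day limit are illustrative assumptions only and would need to be defined and justified for any real system.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def category_shares(values: Sequence[str], categories: Sequence[str],
                    eps: float = 1e-6) -> Dict[str, float]:
    """Share of each category, floored at eps so unseen categories stay finite."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    return {c: max(counts.get(c, 0) / len(values), eps) for c in categories}


def population_stability_index(baseline: Sequence[str],
                               current: Sequence[str]) -> float:
    """PSI = sum over categories of (current - baseline) * ln(current / baseline)."""
    categories = sorted(set(baseline) | set(current))
    b = category_shares(baseline, categories)
    c = category_shares(current, categories)
    return sum((c[k] - b[k]) * math.log(c[k] / b[k]) for k in categories)


def needs_human_review(baseline: Sequence[str], current: Sequence[str],
                       psi_threshold: float = 0.25,          # hypothetical threshold
                       days_since_retrain: int = 0,
                       max_days_between_retrains: int = 180  # hypothetical limit
                       ) -> bool:
    """Trigger review on significant drift or when retraining is overdue."""
    drifted = population_stability_index(baseline, current) > psi_threshold
    overdue = days_since_retrain >= max_days_between_retrains
    return drifted or overdue
```

Monitoring the inputs in this way complements, rather than replaces, monitoring of model performance metrics, because input drift can occur without an immediately visible change in performance.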

5 Conclusion

ML is increasingly being explored in pharmacovigilance as a component of pharmacovigilance systems to enable adequate data ingestion and related tasks. Systems for intelligent automation need to effectively manage both commonly reported safety data and rarer data points such as black swan events. Safety needs to harness the standard approaches used in the broader ML community as well as recent advances and implement ML systems that recognise the possibility of rare events and are designed accordingly. Black swan events are, by their nature, a fact of life. The design of ML-based systems for pharmacovigilance needs to recognize and actively plan for the possibility of such events.

With trusted and reliable strategies to avoid and mitigate potential black swan events, overall systems leveraging ML can realize their maximum impact for data intake and processing, making overall surveillance activities more effective, including for safety.