FormalPara Key Points

The development of a drug safety web tool was used as a vehicle to investigate challenges regarding the adoption of emerging “intelligent” technical paradigms in pharmacovigilance (PV).

In user feedback, actionability, i.e. facilitating the decision-making processes of PV professionals, surpassed other factors such as usability or explainability of analytics as the cardinal challenge.

Heeding these findings, the PVClinical platform integrates heterogeneous drug safety data (spontaneous safety reports, observational studies, scientific publications, Twitter feeds) in a user-friendly interface.

Actionable systems such as the PVClinical platform are expected to significantly expedite the detection and prevention of adverse drug reactions (both in and out of the clinical setting).

1 Introduction

Adverse drug reactions (ADRs) are one of the major causes of morbidity and mortality, leading to severe hurdles in the development and authorization of novel therapeutics by industry and significant economic burdens to public healthcare providers. Indicatively, in 2018, Formica et al. estimated (1) the cost of ADRs to be between €2851 and €9015 for the inpatient setting and between €174 and €8515 for the outpatient setting; and (2) the impact of ADRs on the length of stay to be 9.2 ± 0.2 days (outpatient setting) and 6.1 ± 2.3 days (inpatient setting) [1]. Furthermore, the US Office of Disease Prevention and Health Promotion estimated that adverse drug events (ADEs) account for one in three of all hospital adverse events, relate to about 2 million hospital stays each year, and prolong hospital stays by 1.7 to 4.6 days. Regarding outpatient settings, each year ADEs account for over 3.5 million physician office visits, about 1 million emergency department visits, and approximately 125,000 hospital admissions [1,2,3,4]. Pharmacovigilance (PV) focuses on the collection and analysis of data regarding drug safety, and is formally defined as “the science and activities related to the detection, assessment, understanding and prevention of adverse effects or any other possible drug-related problems” [5].

PV activities aspire to identify and elaborate on potential new or partially documented ADRs (i.e. PV ‘signals’) and are currently driven by the investigation of individual case safety reports (ICSRs), along with data that derive from clinical trials, postmarketing surveillance and literature reviews. ICSRs are typically submitted by healthcare professionals (HCPs) or patients via spontaneous reporting systems (SRSs). The overall signal evaluation process can be further segmented into subprocesses (e.g. signal detection, triage, etc.). As part of this process, ICSRs are analyzed using specialized statistical metrics, mostly based on ‘disproportionality analysis’ (DA) approaches [6]. ICSRs play a prominent role in the overall risk management (RM) processes employed to validate that the risk–benefit relationship regarding a specific drug is beneficial in terms of public health. This RM approach of combining and analysing data from various data sources to identify and mitigate risks is a laborious, mostly manually driven, process that requires multifaceted interpretation of the relevant data through the lens of statistics, biological plausibility, disease pathobiology, pharmacology and alternative aetiologies due to potential confounders.

Beyond PV activities conducted by regulatory organizations or the pharmaceutical industry, ADR detection is also related to clinical practice. HCPs depend on clinical heuristic judgement that is laborious, time-consuming, and error-prone because it is heavily dependent on prior experience and specific case information. The management of serious ADR cases usually involves more than one clinical professional and iterative cycles of examining new findings, and critical thinking that will eventually lead to a differential diagnosis [7].

To this end, there is an obvious need for improvement in ADR investigation processes, both in the clinical environment and beyond (for example ADR signal management), in order to promptly consolidate information from multiple voluminous data sources, in a user-friendly fashion. To address this need, the prospect of engaging ‘intelligent’ technical approaches in the context of PV has previously been identified [8,9,10,11].

Currently, modern information technology (IT) approaches investigate new ways to support PV via data integration from medical and non-medical sources, such as scientific literature, biochemical databases (e.g. platforms with multiomics data, signalling pathway analysis and chemical properties of drug molecules) [12], electronic health records (EHRs), insurance claims or other observational databases, search engine logs and social media [13, 14]. While these data sources could provide a vast amount of data that could, in principle, be used for PV purposes, they also come with their own limitations and challenges regarding procedural, regulatory and technical aspects. The exploitation of these data sources has been an area of active research and the focus of numerous research projects and initiatives [15]. In this context, vast volumes of unstructured data (i.e. free text) are elaborated via the use of natural language processing (NLP) based on machine learning (ML) approaches, i.e. non-symbolic artificial intelligence (AI), e.g. via deep neural networks (DNNs) and support vector machines (SVMs) [16]. Furthermore, symbolic AI knowledge engineering (KE)-oriented approaches, such as the use of linked data and semantic web paradigms, are also actively investigated in the context of drug safety [17]. However, while ‘intelligent’ systems (ISs) seem promising, they are currently not widely adopted for PV purposes due to significant hurdles, which were recently analyzed by Bate and Hobbiger [18]. The importance and need to focus on these challenges in order to increase the potential impact of ISs in the healthcare domain as a whole has also been identified [19, 20].

PVClinical (Footnote 1) is a research project focusing on the development of a tool that could facilitate the investigation of potential ADR signals via the integration of heterogeneous data sources, i.e. SRS, EHR, social media and scientific literature. To this end, the PVClinical platform could be useful in various dimensions of ADR signal management, i.e. signal detection, strengthening and validation. Moreover, the PVClinical platform could also be useful in the clinical environment, as one of its main goals is to facilitate the investigation of potential ADRs by HCPs. In technical terms, PVClinical aims to build a web-based tool utilizing KE technologies, i.e. knowledge graphs built upon the Resource Description Framework (RDF; Footnote 2) and ontologies built using the Web Ontology Language (OWL; Footnote 3), in order to provide knowledge-intensive analytics adapted to the special characteristics of each data source, using well-defined terminologies and ontologies as reference concept hierarchies. While a detailed presentation of the PVClinical platform is out of the scope of this paper, a preliminary design of the platform, i.e. a first set of ‘user goals’/design objectives and its main information workflow, has previously been presented [21].

In this paper, we highlight the key challenges that impinge on the wide adoption of IS approaches in the clinical setting and beyond, based on the experience of the PVClinical design process. To this end, we emphasize the need to focus on actionability, which emerged as a top priority during this process.

2 Methods

As the PVClinical platform aims to be used by both PV professionals (pharmaceutical industry, regulatory organizations, etc.) and HCPs, it needs to be integrated into varying working environments, focusing on different types of end users applying different information processing workflows, with different kinds of goals/priorities. Thus, in order to be able to successfully integrate the designed platform into these heterogeneous contexts, a user-centred design approach was applied based on the methodology described by Natsiavas et al. [22], the main steps of which can be summarized as follows:

  1. Analysis of the currently applied Business Processes (BPs) based on the respective user scenarios.

  2. Definition of User Goals (UGs) upon the elaborated BPs, based on end-user feedback and a state-of-the-art analysis.

As user input was identified as a first-class priority for the system design, Design Thinking was adopted as the overall methodological design paradigm across the above steps. ‘Design Thinking’ is a user-centred design approach that evolves through rapid, iterative cycles of ideation, prototyping and testing, and is not yet widely adopted across healthcare software designs [23, 24]. This approach entails the active engagement of end users in the overall system design, potentially using several techniques (e.g. storytelling, interviews, use of paper prototypes, etc.). In the context of the PVClinical project design process, several personal interviews and discussions with end users (more than 25 clinicians and 5 PV professionals) were conducted. These discussions/interviews also included the demonstration of prototypes, originally in terms of static mock-ups and progressively in the form of real interactive application prototypes. Furthermore, ‘think aloud’ sessions were conducted in which end users navigated through the provided system functionality, expressing their thoughts (difficulties, challenges, etc.) while being recorded so that their responses could be further analysed (Fig. 1).

Fig. 1 Methodology overview for the design of the PVClinical project. PV pharmacovigilance

The need to elaborate on the respective BPs engaged with the various PV activities was identified early. BPs, which could also be referred to as ‘operational processes’, are defined as a collection of relevant and ordered structured activities/tasks aiming to produce a specific outcome [25]. For example, ADR evaluation can also be considered as a BP conducted in the context of a hospital, in tandem with other BPs (e.g. patient treatment, administrative processes, etc.). The use of ISs could reshape the current practices of ADR assessment, which today are typically performed manually and lack systematic support of specialized IT tools, and consequently could have significant impact on the respective BPs. To this end, workshops and interviews with various stakeholders were conducted in order to identify and analyse these BPs. Based on this input, integrating ‘intelligent technologies’ aimed at supporting PV activities in the context of real-world healthcare activities was identified early on as a major challenge.

These challenges were also depicted in the so-called UGs, which outline the priorities raised by the end users. UGs are defined as “abstract user requirements, not directly referring to specific technical solutions or components” [22], directly attributed to specific user actors or ‘roles’. The definition of UGs facilitates the early identification and resolution of potential conflicts between actors. During the ‘user requirement analysis’ and ‘design’ phases of the PVClinical platform, UGs have been analyzed based on feedback that was given by clinicians and PV professionals.

3 Results

3.1 Business Processes

Tables 1 and 2 depict the relevant BPs in which an IS focusing on potential signal investigation could be used, as identified through workshops and interviews with clinicians and PV professionals. It should be noted that these BPs could be elaborated at different levels of granularity, and could consequently be analyzed in more detail for each particular setting. However, such a lower-level description would offer little, as the details differ in practice between organizations and departments (even clinics in the same hospital apply different BPs due to variations in patients’ treatments and the distinct structure of medical facilities). Describing these BPs at a lower level would therefore not lead to generalizable conclusions, which highlights the need to balance the description of the distinct BPs against the need to avoid excessive detail.

Table 1 Business processes related to pharmacovigilance in the clinical environment
Table 2 Business processes related to pharmacovigilance out of the clinical environment

3.2 Business Processes Related to the Clinical Environment

3.2.1 BP1: Visit to the Outpatient Clinics

BP1 refers to patients visiting outpatient clinics and entails registration of the patient visit, their medical history stored in the hospital EHR, and the patient’s clinical examination and ePrescription.

3.2.2 BP2: Hospitalization

BP2 could be considered an extension of BP1 and refers to hospitalization and other clinical procedures typically conducted as part of the BP (e.g. surgery). As its final step, BP2 includes computerized physician order entries (CPOE), clinical notes maintenance, and patient discharge.

3.2.3 BP3: Quality of Healthcare Services Evaluation

BP3 relates to the evaluation of healthcare services in terms of quality assurance, based on clinically relevant metrics (ADRs, in-hospital infections, medical errors, etc.). These processes are conducted regularly in order to evaluate potential improvements in the clinical procedures applied, and they include the comparison of data produced by EHRs using the statistics provided by external data sources in order to identify critical differences.

3.2.4 BP4: Clinical Trials

BP4 relates to the design and execution of a clinical trial study, the definition of patient cohorts, data collection and curation, comparison with other clinical trials, and results reporting.

3.3 Business Processes Out of the Clinical Environment

3.3.1 BP5: Update of Periodic Safety Update Reports

BP5 entails the update of periodic safety update reports (PSURs) and points towards the review and statistical analysis of ICSRs, literature review, clinical trial data analysis and reporting to regulatory authorities.

3.3.2 BP6: Weekly Literature Review

BP6 refers to the weekly literature review, including the formation of queries containing keywords and synonyms against various literature sources.

3.3.3 BP7: Risk Management

Finally, BP7 relates to RM, and includes literature review, relevant clinical trial data review, and the calculation of risk factors.

3.4 User Goals

The identified UGs and their links to the respective BPs, based on end-user input and the respective workshops, are summarized in Table 3.

Table 3 User goals

Figure 2 qualitatively depicts the relationship of the UGs with the respective BPs. While it is clearly evident that BPs not related to routine clinical practice (i.e. BP4, BP5, BP6, BP7) support the biggest portion of UGs, it is also clear that a lot of UGs are related to clinically relevant BPs (BP1–BP3). Given the current low adoption of PV processes in the context of clinical treatment, this finding also implies that the integration of such IT tools could provide value for everyday clinical practice (Fig. 2).

Fig. 2 Relationship between user goals and business processes (0 indicates no link/red; 1 indicates linked/green). UG user goals, BP business processes

3.5 “I Don’t Care About Analytics, I Need a Rule …”

Beyond the above analysis, the unstructured input gathered during the discussions and ‘think aloud’ sessions is also very important. The general impression gained from the end users was that while almost everybody recognized the potential value of using such a system, they were also somewhat hesitant about how it would be integrated into their practical work routine. To this end, a number of key verbal phrases were identified and further elaborated (Table 4).

Table 4 ‘Think aloud’ feedback

It should be noted that these phrases do not reflect the overall end-user feedback but could be considered important ‘outlier’ points. As such, we highlight them because we argue that they provide useful insights. Interpreting them, we argue that beyond usability and explainability, the design of ‘intelligent’ IT systems should also be based on actionability, i.e. provide clear advantages in terms of the decision-making process instead of only providing figures and analytics. Furthermore, the lack of trust is also emerging as a key factor in terms of taking clinical decisions based on data and the quality of the data. As a whole, while providing numbers, figures and analytics might facilitate interpretation of data, practically, the end users also need clear guidance on whether they should trust these data (i.e. clarify if they could consider the data clear evidence of a potential PV signal) and how they should handle marginal situations in terms of rules or decision trees (e.g. thresholds on statistical measures).
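The kind of rule that end users asked for can be illustrated with a minimal Python sketch. It is not the PVClinical implementation: the `SignalRule` class, the `triage` function and the three action labels are hypothetical names introduced here, and the default thresholds (PRR of at least 2, chi-squared of at least 4, at least 3 cases) are only a frequently cited screening convention that any real deployment would need to validate for its own data sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalRule:
    """Hypothetical thresholds; defaults follow a commonly cited screening convention."""
    min_prr: float = 2.0    # proportional reporting ratio
    min_chi2: float = 4.0   # chi-squared statistic
    min_cases: int = 3      # number of reports for the drug-event combination


def triage(prr: float, chi2: float, cases: int,
           rule: SignalRule = SignalRule()) -> str:
    """Map disproportionality figures to one of three illustrative actions."""
    if prr >= rule.min_prr and chi2 >= rule.min_chi2 and cases >= rule.min_cases:
        return "investigate"   # all criteria met: treat as a potential signal
    if prr >= rule.min_prr:
        return "monitor"       # marginal case: elevated PRR with weak support
    return "no action"         # PRR below threshold


if __name__ == "__main__":
    print(triage(prr=2.5, chi2=5.0, cases=4))
    print(triage(prr=2.5, chi2=1.0, cases=2))
```

The point of the sketch is the shape of the output: the end user receives a named action rather than a set of figures, while the thresholds remain explicit and adjustable per data source.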

3.6 Design Approach

One of the main goals of the PVClinical platform is to integrate heterogeneous information via multiple data sources, focusing on both established data sources such as SRSs and emerging data sources such as social media. However, while typically used statistical metrics (proportional reporting ratio [PRR], reporting odds ratio [ROR], etc.) can be calculated on sources such as SRSs, they are not applicable to other sources such as social media. Therefore, the need to apply different metrics and specialized user interface (UI) structures for each input data source was clearly identified. Thus, the platform is designed as an integrated web application composed of distinct workspaces, one for each data source (Fig. 3).

Fig. 3 PVClinical platform design approach. UI user interface, RDF Resource Description Framework, MedDRA Medical Dictionary for Regulatory Activities, ATC Anatomical Therapeutic Chemical

3.6.1 Scenario Definition

Each potential ADR signal investigation corresponds to a ‘scenario’, typically related to a drug-event combination (DEC). Users can select a drug by using either the active ingredient or the trade name and the respective ADR via specialized UI controls (trees and free-text boxes), enabling multiple selections. The options provided are based on well-defined and widely accepted terminologies (i.e. World Health Organization Anatomical Therapeutic Chemical classification [WHO–ATC; Footnote 4] and Medical Dictionary for Regulatory Activities [MedDRA; Footnote 5]), providing hierarchically organized concepts that are stored in the form of knowledge graphs.
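As a rough illustration of how hierarchically organized concepts can drive a scenario selection, the following Python sketch stores a small fragment of the ATC hierarchy as subject-predicate-object triples and expands a selected concept to all of its descendants. It deliberately uses plain Python structures rather than an RDF library; the `TRIPLES` list, the `broader` predicate name and both functions are assumptions made for this example and do not describe the platform's actual data model.

```python
from collections import defaultdict

# Illustrative (child, predicate, parent) triples for a fragment of the ATC hierarchy.
TRIPLES = [
    ("C10AA05", "broader", "C10AA"),  # atorvastatin -> HMG CoA reductase inhibitors
    ("C10AA01", "broader", "C10AA"),  # simvastatin  -> HMG CoA reductase inhibitors
    ("C10AA", "broader", "C10A"),
    ("C10A", "broader", "C10"),
]


def build_children(triples):
    """Index the triples so that each parent maps to its direct children."""
    children = defaultdict(set)
    for child, predicate, parent in triples:
        if predicate == "broader":
            children[parent].add(child)
    return children


def expand(concept, children):
    """Return the selected concept together with all of its descendants."""
    selected, stack = set(), [concept]
    while stack:
        node = stack.pop()
        if node not in selected:
            selected.add(node)
            stack.extend(children.get(node, ()))
    return selected


if __name__ == "__main__":
    index = build_children(TRIPLES)
    print(sorted(expand("C10AA", index)))
```

Selecting a class-level node in a tree control would then translate into a query over every member drug, which is one plausible reason for keeping the terminologies in graph form.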

3.6.2 OpenFDA Workspace

The OpenFDA workspace provides an analytics gateway to the ICSRs referring to the respective investigation scenario, provided via the OpenFDA Application Programming Interface (API; Footnote 6). The provided analytics include various frequentist DA metrics such as PRR, likelihood ratio test (LRT) and ROR, and also more advanced statistical metrics that address the temporal component of signal detection (e.g. dynamic PRR, change-point analysis, change variance analysis, Bayesian change-point analysis) and could be extremely useful in premarketing PV processes (e.g. randomized controlled trials) that necessitate highly sensitive algorithms [26,27,28,29]. Furthermore, a ‘quick’ view is provided, reflecting the need to present information to HCPs in an ‘as simple as possible’ fashion and thereby facilitate decision making.
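For readers less familiar with DA metrics, the sketch below computes PRR and ROR from a 2×2 contingency table of report counts using their standard textbook definitions. The function names and the example counts are assumptions made for illustration; they are not taken from the platform or from any real dataset, and the code does not handle zero cells (which would raise a `ZeroDivisionError`).

```python
import math

# 2x2 table of report counts for one drug-event combination:
#   a: target drug, target event      b: target drug, other events
#   c: other drugs, target event      d: other drugs, other events


def prr(a, b, c, d):
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)]."""
    return (a / (a + b)) / (c / (c + d))


def ror(a, b, c, d):
    """Reporting odds ratio: (a / b) / (c / d) = (a * d) / (b * c)."""
    return (a * d) / (b * c)


def ror_ci95(a, b, c, d):
    """Approximate 95% confidence interval of the ROR on the log scale."""
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_ror = math.log(ror(a, b, c, d))
    return math.exp(log_ror - 1.96 * se), math.exp(log_ror + 1.96 * se)


if __name__ == "__main__":
    counts = (20, 80, 100, 9800)  # made-up counts, for illustration only
    print(f"PRR = {prr(*counts):.2f}")
    print(f"ROR = {ror(*counts):.2f}")
    low, high = ror_ci95(*counts)
    print(f"95% CI of ROR = ({low:.2f}, {high:.2f})")
```

Both metrics need the background counts `c` and `d`, i.e. a denominator of all other reports in the database, which a source such as social media does not readily provide.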

3.6.3 Observational Data Workspace

The observational data workspace enables the statistical analysis (e.g. based on pharmacoepidemiological metrics such as ‘incidence rate’) and visual representation of observational data (which could come from various sources, e.g. corporate data, data from the EHRs, claim databases and other SRS sources). Technically, the observational data workspace is based on the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM) and the overall software stack provided by the Observational Health Data Sciences and Informatics (OHDSI; Footnote 7) initiative.
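As a simple illustration of the ‘incidence rate’ metric mentioned above, the following Python sketch divides the number of new cases by the accumulated person-time at risk. It is a generic calculation, not the OHDSI implementation; the function names, the follow-up durations and the case count are hypothetical values chosen for the example.

```python
DAYS_PER_YEAR = 365.25


def person_years(follow_up_days):
    """Total person-time at risk, given each patient's follow-up in days."""
    return sum(follow_up_days) / DAYS_PER_YEAR


def incidence_rate(new_cases, total_person_years, per=1000.0):
    """New cases per `per` person-years at risk."""
    if total_person_years <= 0:
        raise ValueError("person-time at risk must be positive")
    return new_cases / total_person_years * per


if __name__ == "__main__":
    # Hypothetical cohort: follow-up in days for five exposed patients.
    follow_up = [365.25, 730.5, 182.625, 365.25, 182.625]
    total = person_years(follow_up)  # 5 person-years in total
    rate = incidence_rate(new_cases=1, total_person_years=total)
    print(f"{rate:.1f} cases per 1000 person-years")
```

Unlike PRR or ROR, this metric relies on a known population at risk, which is why it suits observational sources such as EHRs or claims databases rather than spontaneous reports.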

3.6.4 Twitter Workspace

The Twitter workspace provides a visual overview of discussion trends on Twitter, as social media have been identified as a data source that might complement the overall PV signal analysis process and are actively investigated as an emerging data source for PV [30].

3.6.5 Literature Investigation Workspace

The scientific literature investigation workspace enables the quick identification of relevant scientific publications via public and proprietary APIs, using PubMed as its original data source. In terms of functionality, the end user can annotate papers as being relevant or not and also keep notes for these papers.

3.6.6 Consolidating Report Workspace

The data presented in the various analysis workspaces are consolidated into one unified report, providing an overview of the collected data and also the remarks provided by the end user. The report produced can be extracted in a standardized FAIR principles-compatible format [31] based on RDF and can also be extracted in PDF format.

4 Discussion

PV activities are shifting from a traditional, passive paradigm of the SRSs to the paradigm of ‘active pharmacovigilance’, leveraging information from various available data sources. This significant overhaul in PV is expected to be driven by IT tools that utilize ‘intelligent’ technologies, also as part of everyday clinical practice in order to investigate or prevent potential ADRs.

However, these ‘intelligent’ technical paradigms come with limitations of various kinds [9, 18], the most important of which can be summarized as follows.

  • Various kinds of biases interfere with pre-existing knowledge and data typically used for training the respective algorithms.

  • The provided algorithms/tools struggle to perform well in real-world conditions where data are missing or imperfect.

  • Many AI/ML models lack explainability, meaning they cannot be interpreted or explained in terms of human reasoning, which significantly hinders the thorough validation of their outcomes, a process crucial for operations that are related to healthcare decisions.

Until now, the adoption of these digital approaches by PV stakeholders has been hampered by significant limitations related not only to the technical challenges of AI/ML algorithms but also to the integration of these technologies as part of everyday workflow [32]. In this work, we provide insights gained via the design process of the PVClinical platform, building a KE-based platform for ADR assessment. The hurdles in adopting ISs in PV activities focusing on the clinical context are elucidated (at least partly) by the identified UGs of the PVClinical platform. They intersect with practically all the BPs elaborated on; however, they can be generalized when referring to the adoption of ISs in the healthcare domain as a whole, beyond PV.

  • Fragmented Medical Datasets: Typically, the datasets available in a healthcare facility (e.g. a hospital) are mostly unstructured, incomplete, semantically unaligned, and ‘siloed’ among its various departments [33]. Moreover, when external datasets are available, they most frequently lack formal and computationally exploitable semantics. This spatial and semantic fragmentation prevents the available datasets from being aggregated; their integration could, in principle, be significantly facilitated by KE approaches (e.g. via the Linked Data paradigm and the use of Semantic Web technologies).

  • Inherent Technical Pitfalls of Intelligent Systems:

    • Versatility is a huge issue because, in the case of ML, most algorithms operate within very specific scenarios, whereas the real-life demands of clinical operations entail managing a multitude of heterogeneous sources, including ‘dirty’ or incomplete data. Increasing the versatility of ‘intelligent’ algorithms is not a trivial matter but could be facilitated by research networks where ‘real-world’ data would be used to validate algorithms under development (e.g. following the OHDSI initiative model; Footnote 8).

    • Validity is also another major concern that must be addressed, as, typically, in the healthcare context, tools/methods, etc. are systematically validated and regulated (e.g. via processes including clinical trials and well-defined RM approaches). Hence, the open availability of these IT tools (e.g. ‘intelligent’ algorithms) could facilitate their wide validation [34].

    • Interpretability is a very important issue in relation to the application of ISs in the healthcare setting, as ML algorithms in particular are often viewed as a ‘black box’ that hides the reasoning process producing the outcomes/results. While this might be considered a benefit for other purposes or domains, it is not acceptable in healthcare, where a clear explanation of why an IS provides a given outcome is essential.

  • Usability: User friendliness significantly affects the adoption of ISs, both in and out of the clinical setting. The pace at which doctors interact with patients and other clinical scientists is gruelling, therefore any ISs should generate outputs rapidly and with precision, in a concise, reproducible, and validated manner [35]. To this end, a key issue identified is the need to minimize necessary user interactions, which might be disruptive, as even in critical systems, alert fatigue can significantly reduce acceptance. Furthermore, focusing on the use of ISs, a major ergonomics issue is raised: How should an end user interact with (semi)automatic ‘intelligent’ software processes (e.g. an ML algorithm or formally stated knowledge structures)?

  • Legal Issues: Legal, ethics and regulation issues should also be identified as an important factor regarding the acceptance of ISs in the healthcare setting. For instance, the liability of clinical scientists in cases of malpractice is vague and therefore the legal framework should be elucidated, and potentially regulated, as it could disrupt the diagnosis, patient stratification, and therapy processes, and beyond [34]. Obviously, these considerations overlap with ethics issues. For example, the concept of consent, one of the main legal and ethical cornerstones, needs to be adapted, as obtaining the consent of a patient to process his/her data using ML or KE methods when he/she does not really understand how these algorithms work is pointless and ethically questionable.

  • Information Security: IS outcomes depend heavily on datasets, either in order to train ML algorithms or to construct computationally exploitable knowledge structures (e.g. ontologies). Thus, major issues are raised regarding data-based biases and potentially malicious data management.

The above challenges have already been elaborated, to some extent, in various articles [18,19,20, 36]. In this paper, we describe the user-centred design approach applied in the PVClinical platform design, based on a ‘design thinking’ approach and ‘think aloud’ sessions. Typically, usability studies based on ‘think aloud’- or ‘design thinking’- based approaches engage a small number of end users. Therefore, the data supporting the respective conclusions are insufficient and subjectively interpreted. As such, these data could be considered (at least to some extent) biased and this could be considered a limitation of the present study too.

Based on this work, we argue that beyond these challenges, actionability should be defined as one of the top priorities for the design of ISs. In using the term actionability, we refer to the ability of exploiting the system functionality provided, not only in terms of better understanding or explaining the data but also in terms of decision making regarding healthcare, regulatory or administrative issues. Based on the BPs elaborated on and the insights provided by end users, actionability would be the term used to summarize the need to proceed further rather than just navigating among the data, towards using the BPs as part of a concrete decision-making process. For example, identifying specific metrics and thresholds for each data source (or combination of data sources) could significantly facilitate the overall interpretation, and therefore the benefit, of adopting such tools. Especially in the context of clinical practice, where HCPs are not very familiar with PV metrics and their priority is not to elaborate on the respective statistics but rather to make a clinical decision, it is crucial to provide them with not only data but also specific actionable guidelines, e.g. well-defined and clear-cut statistical value thresholds, facilitating their decisions. To this end, while the guidelines regarding the adoption (testing, etc.) of ‘intelligent’ applications in the context of Drug Safety are actively evolving (e.g. the US FDA has recently published the ‘Artificial Intelligence/Machine Learning (AI/ML)–Based Software as a Medical Device (SaMD) Action Plan’ [37]), the discussion is not yet sufficiently focused on the decision-making process (Fig. 4).

Fig. 4 Combination of symbolic and non-symbolic AI technical paradigms, along with the use of emerging data sources to tackle pharmacovigilance goals and challenges. AI artificial intelligence

Based on a white paper produced by Oracle emphasizing the data challenges in PV [38], over 60% of PV stakeholders deploy or plan to deploy ISs. In order to overcome the above challenges, both technical and procedural advances are required. In terms of technical approaches, many of the above challenges are imposed by the hype of using ‘black-box’-based ML algorithms (non-symbolic AI), which provide no clear explanation of the reasoning process producing the respective outcome. We argue that KE-based approaches (symbolic AI) should be more heavily employed and alternative schemes such as hybrid intelligence [39] should also be investigated. Regarding the procedural issues, the need to move beyond data science to clinically related validation schemes is emphasized [40]. Furthermore, it is also evident that organizations need to prepare before adopting ISs in everyday practice [41]. Particularly regarding the information security challenges, a threat analysis or gap analysis [42] should be conducted prior to the deployment of ISs in order to mitigate potential risks. Furthermore, the barriers and facilitators regarding the adoption of IT systems in healthcare should also be taken into account [43].

5 Conclusion

In conclusion, we argue that the use of ISs in healthcare is moving towards the ‘trough of disillusionment’ in terms of the Gartner hype cycle (Footnote 9), with some prominent examples showing great promise that has not yet been confirmed in real-world healthcare practice [44]. However, given the pace of advancement of ISs, their wide adoption in other domains and their huge potential benefits, their future use in the healthcare setting, including for PV purposes, seems certain, in spite of their currently limited adoption. The development of ISs and their potential benefits and risks could be considered in analogy with the challenges imposed by the development of drugs in the 20th century. In terms of drug safety per se, beyond the ongoing ‘disillusionment’ phase, the integration of ISs should take into account user-centric design approaches to identify operational and usability gaps in order to facilitate the adoption of ISs and maximize their potential impact, further elaborating on ‘actionability’ aspects.