Key Points

Safety of immune checkpoint inhibitors (ICIs) is a currently unsettled issue in clinical practice, requiring multidisciplinary expertise and heterogeneous skills of oncologists for early diagnosis and timely management of immune-related adverse events (irAEs).

In the recent past, a plethora of real-world pharmacovigilance studies have characterized epidemiological features of irAEs, namely spectrum, kinetics, co-reporting, and fatality rate.

Considering the agnostic approval and evolving uses of different ICIs, including drug combination with targeted therapy, pharmacovigilance still plays a pivotal role in the postmarketing assessment of irAEs, thus supporting oncologists in optimal management.

1 Introduction

The advent of immune checkpoint inhibitors (ICIs) caused a paradigm shift both in drug development and in the treatment and prognosis of a wide range of different cancers, including melanoma and non-small cell lung cancer (NSCLC), with sustained long-term remission and prolonged survival [1]. However, by virtue of their mechanism of action, namely blocking either cytotoxic T-lymphocyte antigen 4 (CTLA-4) or programmed cell death 1 (PD-1), or its ligand (PD-L1), the excessively activated immune system results in off-target toxicities characterized by a unique and distinct spectrum of adverse effects, the so-called immune-related adverse events (irAEs) [2], with a different mechanistic basis depending on the damaged organ [3].

As a consequence of the rapid extension of therapeutic indications in different tumor types (the agnostic approval of pembrolizumab is a paradigmatic example) and settings, clinicians are increasingly facing both common and rare irAEs, which can affect virtually any organ or system in the body, requiring new skills for timely diagnosis and a multidisciplinary approach for successful patient management [4,5,6,7,8].

Although most common irAEs were identified during preapproval clinical development, their assessment and clinical characterization in terms of spectrum, timing and outcomes were only recently investigated through real-world, large-scale pharmacovigilance analyses. In particular, the analysis of spontaneous reporting systems (SRSs) allows real-time long-term monitoring of the safety profile of medicines, and also timely identification of rare irAEs worldwide [9].

In the recent past, pharmacovigilance studies have attracted considerable interest in the medical literature and a number of articles have been published: a search in MEDLINE using a combination of simple keywords (e.g. adverse event reporting system, spontaneous reporting system, pharmacovigilance database, FAERS, Vigibase, Eudravigilance) yielded 3283 articles (search performed on 15 January 2020), of which 1445 were published in the last 5 years (390 in 2019). This scenario may be attributable to multifold reasons, including the increased awareness by clinicians regarding the importance of pharmacovigilance (namely postmarketing surveillance) and relevant commitment of submitting AEs for regulatory purposes, public availability of large SRSs, and open-access tools to independently analyze international databases [10].

In particular, the so-called disproportionality analyses (DAs) are now the most frequently used epidemiological approach for postmarketing safety assessment, and the term ‘association’ is increasingly recorded in major specialized journals. When properly conducted, the performance of these studies is noteworthy for signal detection (i.e. the capacity to discriminate true from false positive drug–event associations) [11]. However, consolidated expertise is required to minimize bias not only during study conception and design but also in the proper interpretation of study results to avoid common mistakes by clinicians: incorrect use of SRSs to infer causality, assess the incidence of AEs, or provide risk stratification, which may ultimately result in unjustified alarm. Therefore, the debate regarding the proper use of these studies and the benefit of their publication, which arose in 2011 [12, 13], is now pressing [14, 15].

The case of oncology, and in particular immunotherapy, is not an exception, and a plethora of postmarketing DAs have documented the occurrence of various toxicities with ICIs in the real world [16, 17]. Therefore, clinicians may wonder whether and how these data will impact their routine practice, especially in the era of real-world evidence and relevant debate on their complementary role with randomized controlled trials (RCTs).

In this critical appraisal, we summarized the current landscape of pharmacovigilance studies in order to derive take-home messages for clinical practice, in particular what these studies add to current knowledge, what is still missing, and how to address knowledge gaps through future pharmacovigilance approaches. To this aim, a brief primer on DAs is first offered to guide clinicians towards better understanding of their clinical and research implications.

2 Pharmacovigilance in Oncology: Challenges and Opportunities

Pharmacovigilance in oncology is not straightforward compared with other medical areas. A variety of clinically significant toxicities have been described both for conventional chemotherapy and for targeted therapies, including biological agents and immunotherapy [18,19,20]. Frequent use of multiple therapeutic regimens makes it difficult to disentangle the adverse effects of individual drugs versus drug–drug interactions versus ‘innocent bystander’ effects [21, 22]. Moreover, complexity of patient histories results in high potential for confounding and effect modification (i.e. drug–disease interactions). Finally, the unique benefit–risk consideration may result in a higher threshold for recognizing and reporting AEs.

In the field of immunotherapy, the term ‘immunovigilance’ was also proposed [23], considering the documented increasing uptake and expectations of ICIs in clinical practice, as well as suboptimal knowledge about their actual safety profile due to accelerated approval and inherent limitations of RCTs in detecting rare toxicities. Therefore, immunotherapy likely represents a privileged setting to perform fourth-generation pharmacovigilance studies, namely big data, to assess emerging toxicities in real life, thus promoting safe prescribing.

3 Disproportionality Analysis: A Primer for Oncologists

As a general note, DAs cannot be used on their own to assess a drug-related risk and cannot replace clinical judgment in the individual patient; however, their role is indisputable and irreplaceable for rapid detection of rare and unpredictable AEs with a strong drug-attributable component, especially when combined with a case-by-case analysis (individual inspection of reports for causality assessment) and detailed descriptive data (reporting trend, demographic data, concomitant drugs, indications, outcome). Until now, there has been no recognized gold standard among disproportionality measures (frequentist or Bayesian approaches). In particular, frequentist methods such as the reporting odds ratio (ROR), proportional reporting ratio (PRR), and relative reporting ratio (RRR) are based on the case/non-case approach: if the proportion of reports of the adverse event of interest (cases), relative to reports of all other events (non-cases), is greater for a specific drug than for the comparator drugs, a disproportionality signal emerges and subsequent analytical investigations are usually required before taking regulatory actions [24].
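To make the case/non-case logic concrete, the following minimal sketch computes the ROR and PRR, with 95% CIs on the log scale, from a 2 × 2 table of report counts. The function name `ror_prr`, the cell labels a–d, and the counts in the example are illustrative assumptions, not data from any of the studies discussed here.

```python
import math

def ror_prr(a, b, c, d, z=1.96):
    """Frequentist disproportionality measures from a 2 x 2 table of reports.

    a: reports with the drug of interest AND the event of interest
    b: reports with the drug of interest and any other event
    c: reports with comparator drugs AND the event of interest
    d: reports with comparator drugs and any other event
    All four cells are assumed to be non-zero.
    """
    ror = (a * d) / (b * c)
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    prr = (a / (a + b)) / (c / (c + d))
    se_log_prr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))

    def ci(estimate, se):
        # Confidence interval computed on the log scale, then back-transformed
        log_est = math.log(estimate)
        return (math.exp(log_est - z * se), math.exp(log_est + z * se))

    return {
        "ROR": ror,
        "ROR_CI": ci(ror, se_log_ror),
        "PRR": prr,
        "PRR_CI": ci(prr, se_log_prr),
    }

# Hypothetical counts, for illustration only
result = ror_prr(a=20, b=980, c=100, d=98900)
print(round(result["ROR"], 2), round(result["PRR"], 2))
```

With these hypothetical counts, the event accounts for 2% of reports for the drug of interest against roughly 0.1% for the comparators, so both measures are well above 1. As stressed throughout this primer, such a value reflects reporting, not risk.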

While there are common thresholds for defining statistical significance (using a 95% confidence interval [CI]) and signaling criteria to perform and claim disproportionality (usually three or more cases are necessary), there is debate on relevant interpretation and clinical implications. Although, as a general rule, the higher the ROR/PRR/RRR value, the stronger the disproportion, disproportionality measures are only an indicator of risk (not a measure of risk) and express the extent of reporting (a proxy of actual occurrence). Only for a limited set of 15 known drug–event associations, disproportionality measures significantly correlated with the relative risks of analytical studies, thus providing an initial indication of the likely clinical importance of an adverse event [25].
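The traditional signaling criteria mentioned above can be sketched as a simple rule: at least three cases, and a lower limit of the 95% CI of the ROR above 1. The function name `is_signal`, its parameters, and all counts are hypothetical; individual studies may apply different thresholds or measures.

```python
import math

def is_signal(a, b, c, d, min_cases=3, z=1.96):
    """Flag a signal of disproportionate reporting from 2 x 2 report counts.

    a: drug of interest, event of interest; b: drug of interest, other events
    c: comparators, event of interest;      d: comparators, other events
    """
    # Too few cases, or an empty cell, means the ROR cannot be claimed
    if a < min_cases or min(b, c, d) == 0:
        return False
    log_ror = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower_limit = math.exp(log_ror - z * se)
    return lower_limit > 1.0

# Hypothetical examples
print(is_signal(2, 98, 100, 99800))    # only two cases: no signal claimed
print(is_signal(20, 980, 100, 98900))  # lower CI limit above 1: signal
print(is_signal(5, 995, 500, 98500))   # ROR close to 1: no signal
```

A flagged pair indicates only that statistically significant disproportionality exists; it says nothing about incidence or causality.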

Considering limitations of both spontaneous reports and pharmacovigilance databases (lack of exposure data, inability to firmly infer causality, the general under- and overreporting), incidence, reporting rate, risk quantification, and risk ranking cannot be claimed. The term ‘association’, usually found in several articles, should be used carefully to indicate that statistically significant disproportionality exists (referred to as a signal of disproportionate reporting or disproportionality signal). Notably, there is no consensus on whether the lack of significant disproportionality is actually a true negative finding (i.e. absence of risk) and, more importantly, on the controversial interpretation of negative disproportionality as a potential inverse causality (i.e. a signal of protective effect potentially indicative of a ‘beneficial drug reaction’) [26, 27]. In fact, negative findings from the disproportionality approach may be the result of comparator choice (see below) or the consequence of a ‘dilution effect’ by other frequently reported toxicities (disproportionality measures are interdependent).

Only under stringent criteria (no major distortions in reporting patterns, similar marketing approval, comparable penetration and utilization), can adverse event rates be compared, and disproportionality by therapeutic area may also be considered (i.e. comparing a drug with other agents within the same therapeutic class). Notably, the choice of comparator is a critical issue in pharmacovigilance because it may substantially affect results. Disproportionality by therapeutic area can be viewed as an intraclass analysis to provide a clinical perspective and reduce the so-called indication bias by selecting a dataset where only anticancer therapies are represented, using relevant reports as a proxy of exposure; this allows identification of a real-world subpopulation that presumably shares at least a set of common risk factors. It is anticipated that, in the majority of DAs published in the literature, the analysis by therapeutic area was not performed: ICIs were compared with all other drugs in the database, but direct comparison between anti-PD-1/PD-L1 and anti-CTLA-4 monotherapy was carried out. The latter approach is likely influenced by disease-specific (rather than treatment-specific) factors, considering that melanoma patients largely comprise the anti-CTLA-4 group, and may have distinct demographic and toxicity proclivities, compared with the more pan-tumor population treated with anti-PD-1/PD-L1.
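The influence of the comparator can be illustrated with a toy example in which the same drug–event pair is assessed against the whole database and against anticancer agents only. The `Report` structure, the drug labels, and every count below are invented for illustration and do not correspond to any real database.

```python
from collections import namedtuple

# Hypothetical, simplified report record
Report = namedtuple("Report", ["drug", "is_anticancer", "event"])

def ror(reports, drug, event, anticancer_only=False):
    """ROR for a drug-event pair, optionally restricting the comparator
    to reports of other anticancer agents (analysis by therapeutic area)."""
    a = b = c = d = 0
    for r in reports:
        if r.drug == drug:
            if r.event == event:
                a += 1
            else:
                b += 1
        elif not anticancer_only or r.is_anticancer:
            if r.event == event:
                c += 1
            else:
                d += 1
    return (a * d) / (b * c)

# Toy dataset: counts are invented
reports = (
    [Report("ICI", True, "colitis")] * 10
    + [Report("ICI", True, "other")] * 90
    + [Report("chemo", True, "colitis")] * 20
    + [Report("chemo", True, "other")] * 380
    + [Report("statin", False, "colitis")] * 5
    + [Report("statin", False, "other")] * 995
)

ror_all = ror(reports, "ICI", "colitis")                          # vs. all other drugs
ror_class = ror(reports, "ICI", "colitis", anticancer_only=True)  # vs. anticancer agents
print(round(ror_all, 2), round(ror_class, 2))
```

In this toy dataset the ROR falls from about 6.1 to about 2.1 when the comparator is restricted to anticancer agents, because the event is reported more often in that subpopulation than in the database as a whole.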

The existence of biases represents another technical issue when planning and designing DAs. In the oncological area, there are some peculiarities that warrant discussion. The so-called channeling bias (selective prescription of newer drugs to patients with more severe disease) is unlikely to be fully accounted for even through analysis by therapeutic area, considering the existence of first-, second-, and third-line therapies. Moreover, a non-negligible proportion of reports describe drug indication as an adverse event, a phenomenon known as indication bias. Finally, there is still no consensus on the interpretation of the outcome ‘death’ and reports of ‘drug ineffective’ [28, 29].

Therefore, when approaching DA in oncology, the existence of specific biases and relevant minimization strategies should be considered (Table 1). For a systematic discussion of all technical aspects in study concept, design, conduct, presentation, and discussion of results, the reader may refer to dedicated book chapters [10, 24] and expert documents such as the article on Good Signal Detection Practices [30].

Table 1 Key conceptual and technical challenges when performing disproportionality analysis in oncology

4 Methods

To summarize the current landscape of pharmacovigilance studies, a search strategy was performed in MEDLINE to extract relevant articles (performed as of 25 February 2020). This overview was not intended to be comprehensive, and eligible studies were postmarketing analyses performed on international SRSs, aiming to specifically investigate the safety/toxicities of ICIs, with minimum information to describe the epidemiology of irAEs. Notwithstanding the potential contribution and value of national databases, these studies were excluded after considering (1) the aim of the review (to provide a global perspective); (2) the limited catchment area of national archives and relevant influence of reporting pattern by local prescription features; and (3) better suitability and performance for drugs with well-established use [31]. Conversely, all types of pharmacovigilance analysis were included, using both a disproportionality and descriptive approach, considering the importance of a case-by-case assessment.

Therefore, a combination of the following keywords was used: spontaneous reporting system, spontaneous reporting database, pharmacovigilance database, FDA Adverse Event Reporting System, FAERS, Vigibase, Vigilyze, Eudravigilance, disproportionality analysis, disproportionality study, real-world study, toxicities, immune-related adverse events, immune checkpoint inhibitors, checkpoint inhibitors, anti-PD-1, anti-PD-L1, anti-CTLA-4, nivolumab, ipilimumab, pembrolizumab, atezolizumab, and durvalumab. Snowballing of retained articles was performed. For each included study, the following information was extracted: database used, time window of the analysis, type of analysis (descriptive or disproportionality), disproportionality measure (e.g. ROR), strategies for minimization of bias, irAEs of interest, relative frequency, fatality rate, time to onset and other key findings (combinations vs. monotherapy, anti-PD-1/PD-L1 vs. anti-CTLA-4). The relative frequency was derived from presented data by dividing the number of cases of a given irAE by the total number of events recorded for ICIs; the fatality rate (i.e. the proportion of death reports of the total reported events) was extracted from original studies.
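The two descriptive measures defined above reduce to simple proportions, sketched below. The function names and the counts are hypothetical and serve only to show the arithmetic; they are not taken from any included study.

```python
def reporting_proportion(event_reports, total_ici_reports):
    """Relative frequency (%) of a given irAE among all events recorded for ICIs."""
    return 100.0 * event_reports / total_ici_reports

def fatality_proportion(fatal_reports, event_reports):
    """Share (%) of reports of a given irAE in which death was the recorded outcome."""
    return 100.0 * fatal_reports / event_reports

# Hypothetical counts, for illustration only
rp = reporting_proportion(event_reports=150, total_ici_reports=10000)
fp = fatality_proportion(fatal_reports=30, event_reports=150)
print(f"reporting proportion: {rp:.1f}%, fatality proportion: {fp:.1f}%")
```

Both figures are proportions of reports; neither can be read as an incidence or a case fatality rate, because the number of exposed patients is unknown.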

5 Results

After full-text analysis of 36 potentially eligible studies, 30 met the inclusion criteria and were finally retained [9, 16, 17, 32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58]. Of these 30 studies, 14 performed DAs (Table 2) and 16 were only based on a descriptive design (Table 3), mainly due to the rarity of irAEs under investigation. It is anticipated that a direct comparison among studies, even those analyzing the same topic, cannot be formally carried out because of technical issues, including the time window of the analysis, removal of duplicates, drug codification, and calculation of disproportionality (disproportionality measure, comparator and adjustment).

Table 2 Overview of disproportionality studies
Table 3 Descriptive pharmacovigilance studies

The US FDA Adverse Event Reporting System (FAERS) and Vigibase (the largest worldwide databases collecting spontaneous reports) were exploited to perform DAs (using both frequentist and Bayesian approaches with traditional signaling criteria); no specific techniques were implemented to minimize potential bias, with the exception of dynamic disproportionality (i.e. calculation of disproportionality over time). The vast majority of studies compared ICIs with all other drugs in the database (only one study calculated disproportionality by therapeutic area and compared ICIs with anticancer agents) and also compared reporting of anti-CTLA-4 with anti-PD-1/PD-L1 medicines.

Only four studies (three disproportionality approaches) investigated the overall safety profile aiming to describe the spectrum of reporting and relevant differences between anti-PD-1/PD-L1 and anti-CTLA-4 monoclonal antibodies, while the vast majority focused on specific irAEs with relevant characterization: 11 articles described a single rare irAE (e.g. myocarditis, hypophysitis, hepatitis, systemic lupus erythematosus, multiple sclerosis, inflammatory bowel disease, hemophagocytic lymphohistiocytosis), while seven addressed a larger spectrum of events, including cardiovascular, neurological, hematological, ocular, skin, and endocrine toxicities. With few exceptions (myocarditis, hematological events), studies were not overlapping.

The relative frequency (for simplicity, referred to as reporting proportion) varied slightly depending on the event of interest, with endocrinopathies representing up to 8.6% (3.8% thyroid dysfunction), followed by pneumonitis (2.7%), hypothyroidism (1.8%), hyperthyroidism and colitis (1.5% each), and hypophysitis (1.3%). The vast majority of other irAEs have a reporting frequency of < 1%. The median fatality rate was 20.4% across all studies providing data, and ranged considerably across irAEs: from hyperthyroidism and hypothyroidism (0.15–0.93%) to myocarditis (consistent estimates among studies, ranging from 39.7 to 62.5% when co-occurring with myositis and myasthenia gravis) (Fig. 1). Median time to onset was 46 days, with earlier occurrence for combination regimens, compared with monotherapies, especially in the case of fatality (14.5 days vs. 40 days) [33].

Fig. 1 Scatter plot showing reporting proportion (i.e. rates of a given irAE compared with total reports) and fatality proportion (i.e. rates of reports where death was recorded as the outcome compared with total number of reports) of the different toxicities with immune checkpoint inhibitors. Data were extracted from published studies. In cases where more than one estimate was available, the highest value was used (worst-case scenario). DRESS drug reaction with eosinophilia and systemic symptoms, SJS Stevens–Johnson Syndrome, TEN toxic epidermal necrolysis

6 Discussion

6.1 Key Messages from Pharmacovigilance Studies

Overall, the collected body of evidence from postmarketing real-world settings underlines the urgent need to assess the actual epidemiological impact and risk–benefit profile of ICIs in clinical practice: a remarkable exponential increase in reporting consistently emerged in all studies. This may be viewed not only as a general phenomenon of pharmacovigilance (i.e. a direct consequence of latest European legislation, and increased awareness of clinicians regarding the culture of reporting) but also as a specific issue of immunotherapy: perceived expectations, rapid approval of drugs, and extension of therapeutic indications in different oncological settings. The case of pembrolizumab (first agnostic approval by the FDA) is noteworthy, and the drug has also been tested as a potential treatment option for progressive multifocal leukoencephalopathy [59]. Moreover, newer anti-PD-L1 agents such as atezolizumab have recently been approved in small-cell lung cancer and triple-negative breast cancer, with promising results [60, 61]. This scenario supports the need for continuing monitoring, especially for newer anti-PD-L1 agents.

6.2 What’s New: Spectrum and Epidemiology of Immune-Related Adverse Events (irAEs)

Collectively, ICIs cannot be considered as a homogeneous class, not only from a pharmacological viewpoint (different pharmacokinetic features) but also therapeutically, especially from a safety perspective. There is consensus that anti-CTLA-4 and anti-PD-1/PD-L1 drugs have different spectrums of toxicities, and, overall, patients receiving anti-PD-1/PD-L1 antibodies have a lower incidence of any grade irAEs than those under anti-CTLA-4 agents, with combinations increasing the incidence and severity of irAEs and hastening their onset [62, 63].

Data from pharmacovigilance, especially a disproportionality approach, corroborated and extended this knowledge regarding class-specific patterns of irAEs. Based on DAs, gastrointestinal (colitis) and endocrine disorders (hypophysitis, adrenal insufficiency, hypopituitarism) were more frequently reported with anti-CTLA-4 drugs, whereas thyroid dysfunction, pneumonitis and myocarditis were preferentially recorded with anti-PD-1/PD-L1 agents. Although autoimmune hepatitis appears to have a similar reporting frequency with both anti-CTLA-4 and anti-PD-1/PD-L1 drugs, cholangitis showed higher reporting with anti-PD-1/PD-L1 monoclonal antibodies. This finding is corroborated by recent case series: cholangitis can manifest as ‘large duct or small duct cholangitis’, and may have different clinical presentations, biochemical evolution, and outcome, including secondary sclerosing cholangitis and vanishing bile duct syndrome, even after discontinuation [64,65,66,67,68,69,70].

From an epidemiological perspective, the reporting proportion of irAEs (compared with all other events) appears very low, and only a minority of toxicities reached a relative frequency higher than 2% (thyroid dysfunction and pneumonitis). The first pharmacovigilance study aiming at characterizing immune-related signals found 1018 irAEs, corresponding to 2.6% of all reports with ICIs. These data underline that nonimmune-related reactions are substantially reported with ICIs, such as fatigue, pruritus, nausea, vomiting, diarrhea, and musculoskeletal complaints, in line with evidence from RCTs [71, 72].

Increased reporting over recent years was documented for serious events in the FAERS [73]. Nevertheless, underreporting is a well-documented phenomenon in pharmacovigilance and may vary considerably depending on the drug and the type of event; a recent analysis using FAERS reports and US sales estimated that only 20% of serious events were actually reported with biologics (i.e. underreporting can reach 80%) [74].

6.3 What’s New: Kinetics of irAEs

According to RCTs, the frequency of irAEs is dependent not only on the agents used, exposure time and the administered dose but also on the patient’s intrinsic risk factors. Conversely, the timing (kinetics) of appearance is often dictated by the affected organ system [75]. Variable onsets have been described for the different toxicities, from early occurrence (first week) to delayed events (up to 26 weeks for acute kidney injury), with 4–12 weeks as the main window of onset.

Based on data from pharmacovigilance, the first 2 months can be considered as the ‘critical pharmacovigilance window’ because the median onset was ∼ 40 days and the vast majority of events reported to international SRSs occurred in this period. Delayed events were described, including rheumatic diseases (median onset of 81 days for arthritis and mean onset of 196 days for systemic lupus erythematosus), diabetes mellitus (mean onset 116 days), serious cutaneous reactions (mean onset of 65 days for anti-CTLA-4 drugs), and thyroid dysfunction (median of 92 days with anti-PD-1/PD-L1 agents). Notably, for myocarditis, both rapid (median 16.5 days in combination regimens) and late events are described (median 178 days).

6.4 What’s New: Fatality Rate, Concurrent irAEs

The lower than expected reporting registered for some irAEs, such as hypothyroidism (compared with estimates from RCTs), likely suggests that multiple irAEs are reported in the same patient. In fact, overlap was recorded among irAEs by several studies, not only for the most frequently reported irAEs (endocrine, respiratory, and hepatobiliary disorders) but also among rare toxicities, including myocarditis, myositis, myasthenia gravis, neurological disorders, pneumonitis, and hepatitis.

Several pharmacovigilance studies also characterized the reporting of ICIs in terms of fatality. Collectively, these data pointed out that one-third of reported events have a fatal outcome (i.e. death was recorded as the final outcome of the reaction), and special attention should be paid to combination therapies and myocarditis events. Substantial fatality rates also emerged for toxic epidermal necrolysis (36%), hemophagocytic lymphohistiocytosis (26%), myositis (21%), hepatitis and myasthenia gravis (19% each), and pneumonitis (17.5%). The spectrum of fatal irAEs differs between agents: colitis was the most frequent cause of death in patients receiving anti-CTLA-4 drugs, whereas pneumonitis and hepatitis fatalities were mainly attributed to anti-PD-1/PD-L1, and myocarditis has the highest fatality rate overall [33]. These data underline the importance of timely recognition and early management of irAEs in order to limit the need for treatment interruption and minimize fatality.

6.5 What’s Missing: Management

Disappointingly, pharmacovigilance studies offer no clues on management strategies such as ICI discontinuation and treatment with corticosteroids. Several guidelines on the management of irAEs have been published, providing comprehensive general treatment algorithms for most of the frequently occurring irAEs, with clear recommendations on immunosuppressive treatment, including duration and tapering based on the severity of the irAE [76,77,78,79].

The reader should refer to dedicated guidelines for specific toxicities, especially for severe and/or treatment refractory irAEs [80]. In this evolving field, only expert opinions are available, and a personalized algorithm was recently proposed based on the immune-pathological pattern of each patient [4]. With few exceptions, ICIs should be continued with close monitoring for grade 1 toxicities, whereas grade 2 toxicity requires drug interruption, with permanent discontinuation advocated for high-grade toxicities (except for thyroid dysfunction) [see the Electronic Supplementary Material]. Overall, the decision to reintroduce the drug(s) should be made on an individual basis. As the use of ICIs becomes more common, the number of rare irAEs will increase, thus making suspicion and differential diagnosis vital for treatment and tailored management strategy [81].

Among the various toxicities, work-up and management of liver injury is challenging for different reasons [82], including the use of ICIs in hepatocellular carcinoma [83] and chronic hepatitis B infection, as well as the common presence of hepatic metastasis [84, 85]. Moreover, the type of liver injury does not affect work-up, whereas thresholds of liver function tests represent the main criteria for diagnosis and relevant management. Data from a French pharmacovigilance register of 536 patients with grade 3 hepatitis highlighted the importance of liver biopsy for a patient-guided approach to avoid corticosteroids [86], whereas analysis of 31 case reports of sclerosing cholangitis, besides highlighting clinical and pathological features, documented poor response to corticosteroid therapy [87]. A retrospective study conducted at MD Anderson Cancer Center on 5762 patients receiving ICIs found that 2% developed hepatotoxicity (especially in combination treatment), of whom 69% permanently discontinued relevant ICIs; 10 of 67 patients receiving corticosteroids had recurrent hepatotoxicity after the corticosteroid taper, and 31 patients resumed ICIs after transaminases improvement, 8 of whom (26%) developed recurrent hepatotoxicity. Notably, no differences emerged in terms of characteristics of liver injury, response to corticosteroids, and outcomes between individuals with and without possible pre-existing liver diseases [88].

6.6 What’s Next: An Evolving Research Agenda

Although specific knowledge gaps can be identified for each irAE, there are also common unsettled issues representing a research priority for future real-world pharmacovigilance studies. In particular, we can identify three main grey areas, with additional uncertainty in special populations and patients with pre-existing autoimmune diseases [89]. First, the role of rechallenge in patient management, and its impact on the outcome, is unclear. The re-introduction of ICIs in patients with previous irAEs poses ethical and scientific issues, especially for anti-CTLA-4 agents, which are contraindicated in cases of severe colitis due to the high risk of relapse and/or bowel perforation. Conversely, preliminary data suggest that rechallenge is feasible and acceptable with anti-PD-1/PD-L1 agents. In a cohort of 93 patients, 40 were rechallenged with the same anti-PD-1/PD-L1 agent; the same or a different irAE occurred in 55% of patients, with similar severity but shorter time linked to the occurrence of a second irAE (9 vs. 15 weeks) [90]. The latest Vigibase survey found recurrences in 28.8% of patients with irAEs (of 452 informative rechallenges), with higher rates for colitis, hepatitis, and pneumonitis [91].

Second, the predictive value of irAEs, namely actual correlation between occurrence of irAEs and effectiveness of ICIs, is a promising area of investigation, with preliminary prospective data [92]. While earlier studies in patients with melanoma suggested no association between irAE onset and anti-CTLA-4 efficacy, several lines of evidence suggest that irAE onset is predictive of anti-PD-1/PD-L1 response, namely progression-free survival, overall survival, and overall response rate, across a variety of solid tumors. However, before irAE onset is adopted clinically as a predictive biomarker for ICI response, larger prospective studies are needed to better understand the nuances between irAE characteristics (severity, site, management, timing of onset) and ICI effectiveness [93]. In fact, a potential confounding factor in this relationship was identified: the notion that patients who experience irAEs are those who remain on ICIs for longer time periods and thus have a better prognosis than those who do not, by virtue of their disease biology, could be a source of guarantee-time bias [94]. Recently, a mechanistic link was documented: a prospective cohort study on 73 patients with NSCLC exposed to anti-PD-1 drugs suggested that T cells recognizing shared antigens in skin and lung tumor infiltrated both sites, thus supporting the conclusion that the toxic effect mechanism was related to overlapping antigens in lung tumor and skin [95].

Third, comparative safety among ICIs is an unresolved and debated topic. Apart from the general notion that anti-CTLA-4 agents are considered more toxic than anti-PD-1/PD-L1 monoclonal antibodies, a reliable risk comparison is difficult owing to different therapeutic uses. Some authors have speculated that anti-PD-L1 drugs might be theoretically less toxic than anti-PD-1 agents due to preservation of PD-L2 signaling [62], although peculiarities do exist: avelumab (an anti-PD-L1 agent) causes infusion-related reactions in approximately one-quarter of patients, generally occurring during or just after the first four infusions. Reducing the rate of infusion, temporarily suspending the infusion, and administering premedications or corticosteroids, if needed, are all effective therapeutic measures. A network meta-analysis of 36 head-to-head RCTs provided a comparative assessment of ICIs. Atezolizumab (an anti-PD-L1) emerged with the most favorable safety profile in general, as well as nivolumab in the subgroup with lung cancer (probabilistic analysis). Among five ICIs, atezolizumab had the highest risk of hypothyroidism, nausea, and vomiting. The predominant treatment-related adverse events for pembrolizumab were arthralgia, pneumonitis, and hepatic toxicities. The main treatment-related adverse events for ipilimumab were skin, gastrointestinal, and renal toxicities. Nivolumab had a narrow and mild toxicity spectrum, mainly causing endocrine toxicities [96].

All major recently published systematic reviews consistently documented that patients treated with PD-1 inhibitors had an increased rate of any grade or high-grade irAEs compared with patients receiving PD-L1 inhibitors, with pneumonitis emerging as a key serious and frequently fatal event [97, 98]. A recent systematic review of 125 RCTs applied Bayesian multilevel regression models and estimated incidences using both fully reported and censored data (simulating individual patient-level meta-analysis): 66% of patients exposed to anti-PD-1/PD-L1 drugs developed at least one adverse event of any grade, and 14% developed at least one adverse event of grade 3 or higher severity, with hypothyroidism (6.07%) and hyperthyroidism (2.82%) being the most common [72]. Of note, 24% of pneumonitis cases were grade 3 or higher in severity, and pneumonitis was the most common cause of treatment-related death. In addition, hepatitis was found most likely to be serious, with 51% being grade 3 or higher. Nivolumab appeared to have a higher mean incidence of all-grade and grade 3 or higher adverse events, compared with pembrolizumab, but the mechanism and clinical significance are unclear. In all major systematic reviews recently published, PD-1 inhibitors were associated with a higher mean incidence of grade 3 or higher adverse events compared with PD-L1 inhibitors (odds ratio 1.58, 95% CI 1.00–2.54).

As anticipated, pharmacovigilance studies through DAs cannot directly compare ICIs for safety, but continuous vigilance is needed for the early detection of rare irAEs that may be unique to a given pharmacological class, and to better characterize the safety profile of each class. In particular, the impact of concomitant medications represents an emerging area to be explored, also in light of evolving immunotherapy combinations [99, 100].
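To make the logic of a disproportionality analysis concrete, the following is a minimal sketch of one widely used measure, the reporting odds ratio (ROR), computed from a 2×2 table of spontaneous reports. The function name, argument names, and the counts in the usage line are hypothetical and chosen only for illustration; they are not taken from any of the studies reviewed here. The sketch also assumes the common convention of a Wald-type 95% confidence interval on the log scale, with a lower limit above 1 taken as a signal of disproportionate reporting. Consistent with the caveat above, the ROR reflects reporting patterns and should not be read as a relative risk.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Reporting odds ratio (ROR) with a Wald-type confidence interval.

    Hypothetical 2x2 layout of spontaneous reports:
        a: event of interest, drug of interest
        b: all other events,  drug of interest
        c: event of interest, all other drugs
        d: all other events,  all other drugs
    z = 1.96 corresponds to an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        # The log-scale interval is undefined when any cell is empty.
        raise ValueError("all four cell counts must be positive")
    ror = (a * d) / (b * c)
    # Standard error of log(ROR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


# Illustrative counts only (not real data).
ror, lower, upper = reporting_odds_ratio(40, 60, 10, 90)
signal = lower > 1  # assumed signal criterion: lower CI limit above 1
```

Because the denominator is the number of reports rather than the number of exposed patients, two drugs with different reporting behaviors can yield different RORs for the same underlying risk, which is one reason such measures cannot rank ICIs by safety.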

7 Conclusion and Perspectives

A number of pharmacovigilance studies, using descriptive and disproportionality approaches, were conducted in 2019 to systematically characterize the variegated and multifaceted spectrum of toxicities with ICIs. A concerted effort is required by all stakeholders (regulatory agencies, patients, manufacturers, healthcare professionals) considering the fast-track approval of ICIs, their evolving clinical use due to rapid extension of therapeutic indications in different oncological settings with multiple anticancer combinations, and our imperfect prediction and characterization of irAEs. The recent description of solid organ transplant rejection [101] strengthened the role of pharmacovigilance as a tool for real-time safety monitoring of immunotherapy, especially for rare but potentially fatal complications. A direct comparison with data from RCTs is misleading, considering that incidence cannot be calculated through spontaneous reporting. Furthermore, the unexpectedly (and apparently) low reporting proportion of irAEs extracted from original studies has multiple possible interpretations, including the high reporting of fatigue, asthenia, nausea, and vomiting (although to a lesser extent compared with chemotherapy), and the co-occurrence of multiple irAEs in the same patient.

Data from pharmacovigilance, especially from disproportionality approaches, consistently confirmed the existence of a class-specific pattern of irAEs in the real world: colitis, hypophysitis, adrenal insufficiency, and hypopituitarism were more frequently reported with anti-CTLA-4 drugs, whereas thyroid dysfunction, pneumonitis, cholangitis, and myocarditis were preferentially recorded with anti-PD-1/PD-L1 agents. As regards the kinetics of occurrence, the median onset of approximately 40 days should remind clinicians of the importance of stringent monitoring during the initial 2-month window, especially when combination regimens are used. Fatal events are reported during this critical period, and particular attention should be devoted to myocarditis, hepatitis, and pneumonitis for anti-PD-1/PD-L1 drugs, and colitis for anti-CTLA-4 agents.

We put forward some suggestions for next-generation pharmacovigilance studies.

  1. Implement disproportionality techniques, including machine learning algorithms, to investigate potential drug interactions and predictors of irAE occurrence, especially considering multiple combinations.

  2. Link adverse event reporting to drug consumption (prescription/dispensation data) in order to calculate a reporting rate. This measure, accounting for empirical estimation of underreporting, might allow calculation of a minimum incidence rate from postmarketing real-world evidence, and may also provide a safety comparison [10].

  3. Address missing information on management (including outcomes after drug interruption and rechallenge) and describe the long-term consequences of irAEs.

  4. Prospectively characterize the exposure–response relationships (including actual assessment of the value of therapeutic drug monitoring) and the role of irAEs as potential biomarkers of effectiveness.
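As an illustration of the second suggestion, the arithmetic behind a reporting rate can be sketched as follows. This is a minimal example under stated assumptions, not a validated method: the function names, the choice of treated patients as the exposure denominator, the "per 1000" scaling, and the way underreporting is handled (dividing by an assumed fraction of events that are actually reported) are all hypothetical, as are the numbers in the usage lines.

```python
def reporting_rate(n_reports, n_exposed, per=1000):
    """Spontaneous reports per `per` exposed patients.

    n_reports: number of reports of the irAE of interest (hypothetical input)
    n_exposed: patients estimated from prescription/dispensation data
    """
    if n_exposed <= 0:
        raise ValueError("n_exposed must be positive")
    return per * n_reports / n_exposed


def underreporting_adjusted_rate(n_reports, n_exposed, reported_fraction, per=1000):
    """Reporting rate scaled by an assumed fraction of events that get reported.

    reported_fraction is an empirical guess in (0, 1]; the result is only as
    reliable as that guess and is illustrative, not an incidence estimate.
    """
    if not 0 < reported_fraction <= 1:
        raise ValueError("reported_fraction must be in (0, 1]")
    return reporting_rate(n_reports, n_exposed, per) / reported_fraction


# Illustrative numbers only (not real data).
crude = reporting_rate(50, 10_000)
adjusted = underreporting_adjusted_rate(50, 10_000, 0.1)
```

The crude reporting rate can be read as a lower bound, since unreported events are missing from the numerator; the adjusted figure depends entirely on the assumed reporting fraction, which is rarely known with precision.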

We endorse the concept of a global registry trial, combined with artificial intelligence techniques, and welcome the new reporting of adverse event outcomes proposed by the Side Effect Reporting Immuno-Oncology (SERIO) recommendations [102], to open the new era of pharmacovigilance 4.0 in immuno-oncology.