Key Points

Qualitative analysis of the relationship between toxicological findings and adverse drug reactions (ADRs) is one of the primary measures for determining the risk–benefit profile of a pharmaceutical.

We evaluated the potential of nonclinical safety assessments for predicting ADRs in humans treated with blood cancer drugs approved in Japan.

The results suggested that ADRs in clinical trials could be predicted on the basis of toxicity data obtained in animal tests.

1 Introduction

Nonclinical data play a fundamental role in new drug development; they can be used to assess potential safety risks. The International Conference on Harmonisation (ICH) M3(R2) recommends that nonclinical safety studies should be adequate to characterize potential adverse effects that might occur under the conditions of the clinical trial to be supported [1]. It also states that clinical trials defined by ICH E8 should be extended based on the demonstration of adequate safety in previous clinical trial(s), as well as on additional nonclinical safety information [1, 2]. Human pharmacology studies with biomarkers conducted at the early clinical phases do not play a key role for safety estimation in therapeutic exploratory and/or therapeutic confirmatory studies in Japanese new drug applications (NDAs) [3]; therefore, animal toxicology data are useful for the prediction of safety profiles during the late clinical phases.

Evaluating quantitative safety profiles is one of the primary measures for determining the risk–benefit profile of a pharmaceutical, and we recently conducted such a study for blood cancer drugs approved in Japan [4]. We examined safety indices, defined as the ratio of drug dose/exposure in animals at the no observed adverse effect level (NOAEL) to that in humans at the expected therapeutic dose. We used data from toxicokinetic studies, which are indispensable for safety assessment as stated in ICH S3A [5]. We categorized quantitative safety profiles into five types, from I (high) to V (low), and found that although some drugs for blood cancer treatment had low quantitative safety profiles (categories III, IV, and V), the safety profiles of those drugs were not discussed in the NDA dossiers [4]. In the regulatory reviews for drug approval, quantitative safety profiles can provide a certain amount of information for the evaluation of the risk–benefit balance. In addition, it is important to assess drug safety qualitatively by comparing nonclinical toxicology findings with adverse drug reactions (ADRs). However, there have been relatively few attempts to methodically assess the correlation between toxicity levels caused by the same drugs in animals and humans.

Igarashi et al., at the Japanese Pharmaceutical Manufacturers Association (JPMA), investigated published papers on general pharmacological studies and the clinical adverse reactions observed during new drug development [6]. They demonstrated that tests of cardiovascular functions, spontaneous locomotor activity, and intestinal transport are of considerable value in predicting ADRs. Furthermore, Olson et al. revealed that 71% of ADRs were observed in animals for the same target organ and the hematological, gastrointestinal, and cardiovascular ADRs were highly concordant [7]. However, evidence supporting the prediction of or extrapolation to human toxicities from the results of animal toxicology studies is scarce, and there is no consensus on this matter. Against this background, our primary objective in this study was to evaluate the potential of nonclinical safety assessments for predicting ADRs in humans treated with blood cancer drugs.

2 Methods

2.1 Data Source

We first reviewed data from drugs for blood cancer because severe adverse reactions were observed during clinical development and post-marketing surveillance of anticancer drugs. Moreover, the number of new molecular entities (NMEs) in this therapeutic area was suitable for this examination as a starting point, and this group contained not only small-molecule drugs but also macromolecular drugs such as antibody drugs. Drugs for blood cancer approved in Japan from September 1999 to November 2016 as NMEs were analyzed. NOAEL, maximum approved dose, exposure levels at NOAEL and maximum approved dose, lowest observed adverse effect level (LOAEL), toxicological findings obtained at LOAEL, and ADRs were extracted from NDA review reports by the Ministry of Health, Labor and Welfare (MHLW) (until March 2004), the Pharmaceuticals and Medical Devices Agency (PMDA) (from April 2004), the common technical document (CTD) [8] by marketing authorization holders, package inserts, and interview forms available on the PMDA website [27]. Data were obtained in accordance with Japanese domestic regulations such as Good Clinical Practice [9] and Good Laboratory Practice [10] guidelines complying with the Pharmaceutical Affairs Law. Of the 539 NMEs identified, 28 drugs for blood cancer were selected for analysis (Table 1).

Table 1 NMEs for blood cancer analyzed in this study

2.2 Data Handling

2.2.1 Safety Index

Safety indices are obtained from the ratio of doses and exposure levels in animals to those in humans. The safety index by dose (SI-D), safety index by maximum plasma concentration (C max) [SI-C], and safety index by area under the plasma concentration–time curve (AUC) [SI-A] were calculated according to the following equations [4].

SI-D = NOAEL (mg/kg/day)/maximum approved dose (mg/kg/day)

SI-C = C max at NOAEL (µg/mL)/C max at maximum approved dose (µg/mL)

SI-A = AUC at NOAEL (µg·h/mL)/AUC at maximum approved dose (µg·h/mL).
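The three ratios above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the `Exposure` container, the `safety_indices` function, and the numeric values are hypothetical, and an index is returned as `None` when either of its inputs is missing.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Exposure:
    """Dose and exposure at one reference point (hypothetical container).

    Used once for animals at the NOAEL and once for humans at the
    maximum approved dose.
    """
    dose_mg_per_kg_day: float
    cmax_ug_per_ml: Optional[float] = None
    auc_ug_h_per_ml: Optional[float] = None


def _ratio(animal: Optional[float], human: Optional[float]) -> Optional[float]:
    """Return animal/human, or None when a value is missing or the divisor is zero."""
    if animal is None or human is None or human == 0:
        return None
    return animal / human


def safety_indices(noael: Exposure, human: Exposure) -> Dict[str, Optional[float]]:
    """Compute SI-D, SI-C, and SI-A as animal-to-human ratios."""
    return {
        "SI-D": _ratio(noael.dose_mg_per_kg_day, human.dose_mg_per_kg_day),
        "SI-C": _ratio(noael.cmax_ug_per_ml, human.cmax_ug_per_ml),
        "SI-A": _ratio(noael.auc_ug_h_per_ml, human.auc_ug_h_per_ml),
    }


# Hypothetical values for illustration only; AUC is left unavailable.
example = safety_indices(
    Exposure(dose_mg_per_kg_day=3.0, cmax_ug_per_ml=0.5),
    Exposure(dose_mg_per_kg_day=1.5, cmax_ug_per_ml=2.0),
)
```

With these made-up inputs the dose-based index exceeds 1.0 while the C max-based index falls below 1.0, which is the situation the profile categories in the next subsection are meant to distinguish.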

2.2.2 Quantitative Safety Profile

The quantitative safety profile of each drug was assessed if both SI-D and SI-C or SI-A were available. The safety profiles fell into five categories based on the safety indices (Fig. 1): profile I, SI-D >1.0 and SI-C or SI-A >1.0; profile II, SI-D ≈1.0 and SI-C or SI-A ≈1.0; profile III, SI-D >1.0 and SI-C or SI-A <1.0; profile IV, SI-D <1.0 and SI-C or SI-A >1.0; and profile V, SI-D <1.0 and SI-C or SI-A <1.0 [4].

Fig. 1 Quantitative safety profile. NOAEL no observed adverse effect level

These categories comprise one approach to clarify safety characteristics including the balance between safety index by dose and that by exposure. Safety profile I shows that both dose and exposure levels for animals exceed those for humans; therefore, it is interpreted that there is a certain safety margin for a drug categorized in safety profile I, while a drug in safety profile V has no safety margin for either dose or exposure levels.
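The five categories can be expressed as a small decision function. The sketch below is illustrative only: the function name `classify_profile` is hypothetical, the caller is assumed to pass whichever exposure-based index (SI-C or SI-A) is being used, and the tolerance `tol` that decides what counts as "≈1.0" is an assumption, since no numeric threshold is given here.

```python
from typing import Optional


def classify_profile(si_d: float, si_exposure: float, tol: float = 0.05) -> Optional[str]:
    """Map a dose index and an exposure index to profile I-V.

    `tol` is an assumed tolerance for "approximately 1.0"; it is not
    taken from the study. Returns None for combinations the five
    definitions do not cover (e.g. one index exactly 1.0, the other not).
    """
    if abs(si_d - 1.0) <= tol and abs(si_exposure - 1.0) <= tol:
        return "II"
    if si_d > 1.0 and si_exposure > 1.0:
        return "I"
    if si_d > 1.0 and si_exposure < 1.0:
        return "III"
    if si_d < 1.0 and si_exposure > 1.0:
        return "IV"
    if si_d < 1.0 and si_exposure < 1.0:
        return "V"
    return None


# Hypothetical index values for illustration only.
profile = classify_profile(si_d=2.0, si_exposure=0.25)  # dose margin, no exposure margin
```

Checking profile II first is a design choice of this sketch: it keeps values that sit just above or below 1.0 on both axes from being assigned to one of the other four categories.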

2.2.3 Collection of Nonclinical Toxicological Findings and ADRs

2.2.3.1 Drugs for Which Toxicological Findings at LOAEL Are Available

Nonclinical toxicological findings at LOAEL for each drug were collected from the same nonclinical study mentioned in our previous report [4], that is, the study that gave the smallest NOAEL. To compare the toxicological findings with ADRs, names and the number of ADRs of ≥grade 3 were obtained from the clinical studies defined as pivotal. The grades were based on National Cancer Institute—Common Toxicity Criteria Version 2.0 [11], Common Terminology Criteria for Adverse Events (CTCAE) Version 3.0 [12], and CTCAE Version 4.0 [13]. In cases where the grades of severity were categorized as mild, moderate, and severe, the severe grade was considered to be ≥grade 3. Prioritization of pivotal studies used for analysis is shown in Table 2. We placed the utmost importance on studies with Japanese patients. If phase III data of Japanese patients were not obtained, phase II studies with Japanese patients were selected. In the case that no Japanese patient data were available other than a phase I study, foreign clinical data for which extrapolation to the Japanese population had been accepted based on ICH E5 [14] were used. As there was one drug (gemtuzumab ozogamicin) for which ADRs were not available in the source documents, adverse events (AEs) were substituted for ADRs.
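The prioritization described in this paragraph can be read as an ordered fallback. The following sketch encodes only the three tiers stated in the text; the `Study` record, its field names, and the example studies are hypothetical, and the full rules of Table 2 may contain details not captured here.

```python
from typing import List, NamedTuple, Optional


class Study(NamedTuple):
    """Hypothetical record describing one clinical study."""
    name: str
    phase: int
    japanese_patients: bool
    e5_extrapolation_accepted: bool = False


def select_pivotal(studies: List[Study]) -> Optional[Study]:
    """Pick a study using the order of preference described in the text."""
    tiers = [
        # 1. Phase III data from Japanese patients
        lambda s: s.japanese_patients and s.phase == 3,
        # 2. Phase II data from Japanese patients
        lambda s: s.japanese_patients and s.phase == 2,
        # 3. Foreign data whose extrapolation was accepted under ICH E5
        lambda s: (not s.japanese_patients) and s.e5_extrapolation_accepted,
    ]
    for tier in tiers:
        matches = [s for s in studies if tier(s)]
        if matches:
            return matches[0]
    return None


# Hypothetical example: only a Japanese phase I study plus accepted foreign data.
chosen = select_pivotal([
    Study("JP-001", phase=1, japanese_patients=True),
    Study("INT-301", phase=3, japanese_patients=False, e5_extrapolation_accepted=True),
])
```

When several studies fall in the same tier, this sketch simply returns the first one listed; how such ties were resolved in practice is not something the sketch tries to reproduce.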

Table 2 Handling of pivotal studies

2.2.3.2 Drugs for Which Toxicological Findings at LOAEL Are Not Available

If no toxicological study at a dose over NOAEL had been performed and toxicological findings at LOAEL were not available, nontoxic observations at NOAEL were collected. However, because observations at NOAEL were not considered as toxicity changes, it is not appropriate to compare such data with ADRs with a severe grade. For such drugs, clinical ADRs ≤grade 1 were taken for comparison.

2.2.4 Concordance of ADRs and Toxicological Findings

An ADR reported in a clinical study was considered concordant with a nonclinical toxicological finding when the same finding was made in a human and an animal, or similar observations were made for similar organs [6, 7, 15, 16] (Table 3). Concordance rate was calculated as follows:

Table 3 Toxicological findings in animals considered concordant with ADRs in humans

2.2.4.1 Drugs for Which Toxicological Findings at LOAEL Are Available

Concordance rate (%) = (number of ADRs or AEs of ≥grade 3 that are concordant with toxicological findings at LOAEL/total number of ADRs or AEs of ≥grade 3) × 100.

2.2.4.2 Drugs for Which Toxicological Findings at LOAEL Are Not Available

Concordance rate (%) = (number of ADRs or AEs of ≤grade 1 that are concordant with nontoxic observations at NOAEL/total number of ADRs or AEs of ≤grade 1) × 100.
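Both concordance formulas share the same shape and differ only in the grade filter. The sketch below illustrates this under several assumptions: the function name, the `(term, grade)` input format, and the example terms are hypothetical; each ADR term is counted once; and the set of concordant terms is assumed to have been determined beforehand by comparing the clinical and nonclinical findings.

```python
from typing import Iterable, Optional, Set, Tuple


def concordance_rate(
    adrs: Iterable[Tuple[str, int]],
    concordant_terms: Set[str],
    loael_available: bool = True,
) -> Optional[float]:
    """Return the concordance rate in percent, or None if no ADR is eligible.

    adrs: (term, grade) pairs from the pivotal study.
    concordant_terms: ADR terms judged concordant with the animal findings.
    loael_available: True  -> use ADRs of grade >= 3 (findings at LOAEL);
                     False -> use ADRs of grade <= 1 (observations at NOAEL).
    """
    if loael_available:
        eligible = [term for term, grade in adrs if grade >= 3]
    else:
        eligible = [term for term, grade in adrs if grade <= 1]
    if not eligible:
        return None  # denominator is zero, so the rate is undefined
    hits = sum(1 for term in eligible if term in concordant_terms)
    return 100.0 * hits / len(eligible)


# Hypothetical ADR list for illustration only.
example_adrs = [
    ("neutropenia", 4),
    ("thrombocytopenia", 3),
    ("nausea", 3),
    ("rash", 3),
    ("headache", 1),
]
rate = concordance_rate(example_adrs, {"neutropenia", "thrombocytopenia"})
```

In this made-up case, two of the four ADRs of grade 3 or higher are concordant, so the rate is 50%.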

2.3 Statistical Analysis

The SPSS software Version 23 (IBM, Armonk, NY, USA) was used to perform statistical analysis of the collected data. Comparisons were performed by the Mann–Whitney U test. Regarding the association between two variables measured on at least an ordinal scale, the Spearman rank-order correlation coefficient was used. A p value of <0.05 was considered statistically significant.
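For readers without SPSS, the two test statistics can be reproduced with the Python standard library. The sketch below is not the software used in the study. It computes only the Spearman coefficient and the Mann–Whitney U statistic, not their p values, which would require a statistics package. All function names and data values are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> float:
    """Mann-Whitney U statistic (the smaller of the two U values)."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Hypothetical data: profile categories coded 1 (I) to 5 (V) and rates in percent.
profiles = [1, 1, 3, 3, 5, 5]
rates = [5.0, 10.0, 15.0, 20.0, 30.0, 45.0]
rho = spearman_rho(profiles, rates)
u = mann_whitney_u([5.0, 10.0, 15.0], [20.0, 30.0, 45.0])
```

Coding the profile categories as ordered integers reflects the requirement that both variables be measured on at least an ordinal scale; tied categories receive averaged ranks.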

3 Results

Of the 28 drugs for blood cancer, 27 were eligible for analysis. Ibritumomab tiuxetan was excluded because the lack of data prevented calculation of the safety index. NOAEL, LOAEL, and clinical studies selected for the analysis are listed in Table 4. Table 5 shows the concordance rate, administration route, drug type, species, and quantitative safety profile for each drug. The concordance rate of bosutinib was not calculated because of the lack of information on the severity of ADRs. The concordance rate of each System Organ Class (SOC) categorized by CTCAE is listed in Table 6. ADRs concordant with nonclinical observations were found for many drugs in the SOCs "blood and lymphatic system disorders" and "investigations", that is, for 16 and 15 of 26 drugs, respectively.

Table 4 NOAEL, LOAEL, and pivotal studies
Table 5 Concordance rate, administration route, drug type, and quantitative safety profile
Table 6 Concordance rate of each System Organ Class categorized by CTCAE

The mean concordance rate of 26 drugs excluding bosutinib was 23.9% (median 18.5%), with a range of 0–84.8%. When stratified by the drug type, the mean concordance rates of small-molecule drugs and antibody drugs were 24.1% (median 18.5%) and 23.3% (median 14.3%), respectively. There was no significant difference between them (p = 0.839; Fig. 2). The mean concordance rates of drugs with nonclinical data for rodents, non-rodents, oral drugs, and injectable drugs were 24.7% (median 24.3%), 23.6% (median 16.7%), 16.4% (median 6.8%), and 31.4% (median 21.8%), respectively (Table 5). No significant differences between concordance rates were observed based on species (rodent vs. non-rodent; p = 0.935) and administration route (p = 0.169).

Fig. 2 Concordance rates of all, small-molecule, and antibody drugs. SD standard deviation

The mean concordance rates of drugs excluding bosutinib by quantitative safety profile [five types; from I (high) to V (low)] were 7.4% (I), 18.4% (III), and 37.0% (V) (Table 5). (No drug was categorized into II in this study. Bosutinib was categorized into IV, but its concordance rate was not calculable.) The concordance rate and quantitative safety profile were weakly correlated (Spearman’s r = 0.448, p = 0.047; Fig. 3).

Fig. 3 Correlation between concordance rate and quantitative safety profile

4 Discussion

The purpose of our study was to analyze the potential of nonclinical safety assessments in predicting ADRs in humans. We obtained the nonclinical toxicological findings and ADRs observed in clinical trials for each drug and examined the relationship between safety findings in animals and humans by calculating the concordance rates. In similar studies, the JPMA conducted systematic and retrospective surveys to analyze the concordance of toxicity in animal tests and ADRs in clinical trials [15, 16]. Igarashi et al. investigated 141 drugs approved in Japan [6]. They showed that general pharmacological studies of cardiovascular functions, spontaneous locomotor activity, and intestinal transport were useful in predicting ADRs [6]. More recently, Tamaki et al. conducted a study to examine the usefulness of nonclinical safety assessments in predicting ADRs in humans [17]. They revealed that 37% of ADRs were predictable based on concordant toxicological findings in animals [17]. This figure is higher than the mean concordance rate of our study (23.9%). However, considering that they targeted all drugs, excluding anticancer agents and vaccines, and collected ADRs with an incidence rate of ≥5%, these figures are comparable.

In a further investigation, we analyzed the correlation between the concordance rate and the quantitative safety profile obtained in our previous study [4]. As shown in Fig. 3, there was a significant correlation between these two factors. As the concordance rate varied over a wide range, from 0 to 84.8%, it was difficult to predict clinical ADRs in a comprehensive manner based on animal toxicological findings. However, a significant correlation between the concordance rate and the quantitative safety profile indicated that drugs with a low quantitative safety profile would show relatively high concordance rates. Examination of animal toxicological findings, especially for drugs with a low quantitative safety profile, has the potential to predict their clinical safety. For drugs with the quantitative safety profile of category III, IV, or V, the dose and/or exposure at clinical therapeutic use exceeded the dose/exposure at NOAEL. Therefore, some animal toxicological findings at LOAEL for such drugs might be reproducible in clinical use. If the toxicological finding at LOAEL is not available, the observations at NOAEL might provide useful information and help to predict ADRs to some extent. However, considering that the mean concordance rates of drugs with a high quantitative safety profile (category I: 7.4%) were lower than those of drugs with a low safety profile (category III: 18.4%, or category V: 37.0%), the overall risk–benefit of those drugs should be carefully considered, taking into account various aspects. When looking at concordance rates by SOC, ADRs concordant with nonclinical observations in blood and lymphatic system disorders and investigations were found for approximately 60% of the drugs in this study. In terms of blood cancer drugs, toxicological findings related to those SOCs might provide beneficial information to predict clinical ADRs in those SOCs.

We found that the median concordance rate of antibody drugs (14.3%) was lower than that of small-molecule drugs (18.5%), although there was no significant difference between them and the number of antibodies was small. Tamaki et al. reported that the proportion of correlated ADRs in small-molecule drugs was 46% and that in antibody drugs was 16%, indicating a trend similar to that of our results [17]. For small molecules, general toxicology tests are usually performed in rodents and non-rodents (ICH S9) [18]. However, ICH S6 [19] states that safety evaluation programs for biotechnology-derived pharmaceuticals should normally include two relevant species, but in certain justified cases, one relevant species may suffice. According to the ICH S6 guideline, the animal species for testing of monoclonal antibodies are those that express the desired epitope and demonstrate a similar tissue cross-reactivity profile as for human tissues.

In our study, four of the six antibody drugs had nonclinical data only for monkeys, and other species were not investigated. This might have contributed to the low concordance rate. Chapman et al. discussed the selection of species for toxicology studies of monoclonal antibodies [20]. They raised the concern that species cross-reactivity alone might not be sufficient to confirm species suitability. They referred to the case of TGN412, an anti-CD28 super-agonist monoclonal antibody, which induced a life-threatening cytokine storm in its first human study. Although there was no significant difference in concordance rates between antibody and small-molecule drugs, an appropriate way to predict the risk to humans, based on nonclinical toxicity findings for antibody drugs, is still needed. In 2014, seven of the top ten best-selling drugs in the world were biotechnology-derived pharmaceuticals [21] and many more biopharmaceutical products are under development [22]. Therefore, practical guidance for a risk–benefit assessment of biopharmaceuticals would be beneficial.

There is currently no established method to weigh the predictability of ADRs in humans on the basis of animal data. Bailey et al. conducted several studies on human drug safety using toxicity data obtained from animal tests [23–25]. They suggested that toxicity observed in animals occurs in humans. However, their data were not particularly consistent or reliable because of considerable variability and the lack of any clear pattern in the types of toxic effects. They overlooked the caveat that the absence of toxicity in animals provided essentially no insight into the likelihood of toxicity or absence of toxicity in humans. Perel et al. compared treatment effects reported in systematic reviews of clinical trials with those of their own systematic review of the corresponding animal experiments [26]. They concluded that many animal studies are of poor methodological quality and the lack of concordance between animal experiments and clinical trials is the result of bias, random error, or the failure of animal models to adequately represent human diseases.

Although we investigated all blood cancer drugs approved in Japan as NMEs from September 1999 to November 2016, our analysis was limited by the publicly available data. Some of the existing data could not be accessed because study reports in CTD M4 and M5 were not disclosed; only summary documents, such as CTD M2, are available. The amount of information available varies among drugs; some CTD M2 documents contain enough data for analysis but others do not. Moreover, as we focused on drugs for blood cancer, caution should be taken when generalizing our findings to drugs used in other therapeutic areas.

We found that the potential range of applications of nonclinical assessments in ADR predictions was substantial. However, our concordance rates differed from those reported in some other studies. Our analysis of the relationship between concordance rate and quantitative safety profile found a weak correlation, suggesting that ADRs are predictable on the basis of animal toxicities, especially for some drugs with low quantitative safety profiles. Perel et al. suggested that with the increasing number of systematic reviews of animal experiments, a quantitative approach to determine similarities between animal models and clinical trials should become possible [26]. Our study results should contribute to the development of this field.

5 Conclusion

Within the constraints of this study, our results suggest that toxicity findings observed in animal tests could be extrapolated to human treatments. This might allow the prediction of ADRs in clinical trials for some drugs with a low quantitative safety profile. Nonclinical safety assessments might be useful in predicting the clinical safety of such drugs.