Key Points

Evaluation of a drug’s potential effect on the QT interval can be generated with high confidence using exposure–response analysis of electrocardiogram and pharmacokinetic data from first-in-human studies.

This approach, ‘Early QT assessment’, can provide an alternative to the thorough QT studies in select cases.

1 Introduction

The thorough QT (TQT) study has been a key component of the clinical evaluation of the propensity of new drugs to cause QTc prolongation since the adoption of the International Conference on Harmonisation (ICH) E14 clinical guidance document in May 2005 [1]. The request to study each new drug in a specifically designed study in healthy subjects if justifiable from a tolerability and safety perspective, and otherwise in the target patient population, was triggered by a number of drug withdrawals in the 1990s for arrhythmias associated with QT prolongation [2, 3]. The TQT study has been successful in terms of detecting drugs with a QT effect and thereby avoiding the introduction of new drugs with an unknown QT liability to the market. However, this has had its price; based on a conservatively chosen threshold (10 ms) and the requirement that the QT effect is evaluated separately at each post-dosing timepoint, without consideration of the pharmacology of the drug, the study is overly sensitive and has therefore resulted in a number of ‘false’ positives, i.e., drugs are labeled as QT prolongers without a demonstrated underlying proarrhythmic risk [4, 5]. The TQT study is also resource intensive [6], and, if electrocardiogram (ECG) data could be generated with the same level of confidence from other studies routinely performed as part of clinical development, this would represent a more efficient approach, with other potential advantages, such as improved understanding of any liabilities early in clinical development. The ‘first-in-human’ studies [single ascending dose (SAD) and multiple ascending dose (MAD)] seem well suited for this purpose because achieved plasma levels of the parent compound and abundant metabolites often substantially exceed therapeutic levels later observed in patients. 
Provided serial ECG assessment and pharmacokinetic sampling are incorporated into the design, SAD and MAD studies represent an opportunity to generate ECG data with the same high quality as the TQT study [7–9]. Several doses of the investigational drug are typically administered to small cohorts with only six to eight subjects receiving active drug (and often only two per cohort receiving placebo), and the power to exclude small effects in a ‘by timepoint’ analysis for each dose group as in the TQT study is therefore unacceptably low [10]. If, on the other hand, exposure–response (ER) analysis is employed, all data across a wide range of plasma concentrations of the drug are used, and the power to detect and exclude small QT effects would be substantially improved [11].

The experience with ER analysis of ECG data has increased over the last decade, among both regulators and sponsors. The US FDA Interdisciplinary Review Team (IRT) for QT studies was formed shortly after the adoption of the ICH E14 document and has since provided sponsors with consistent advice on the design and analysis of TQT studies [12] and has independently reviewed and analyzed close to 400 TQT studies to date. ER analysis has become an integral part of the IRT review of data from QT assessment studies [11, 12] and has proven invaluable in terms of enhancing the confidence in characterizing drug-induced QTc prolongation. ER analysis is now routinely used to predict the QT effect in the targeted patient population, including clinical scenarios with doses and formulations not directly evaluated in the TQT study and QT effects in specific populations and under certain conditions (e.g., drug interactions) with increased exposure of the drug [13–19]. Extensive experience with QT-prolonging drugs demonstrates that the effect on the QT interval is directly related to plasma levels of the drug or main metabolites, with few exceptions (e.g., QT prolongation through inhibition of hERG protein trafficking, which is delayed in relation to peak plasma levels [20–22]). In our view, it therefore makes sense to focus on QT effects in relation to plasma concentration of the drug, rather than by timepoint without consideration of the pharmacology of the drug, and a wider role for ER analysis in the assessment of drug-induced ECG effects seems justified.

Even though the experience of many pharmaceutical sponsors from the application of ER analysis on data from first-in-human studies is favorable, publicly available reports are relatively scarce [7–9, 23]. Based on discussions with the FDA, a research collaboration was therefore initiated between the Clinical Pharmacology Leadership Group of the Consortium for Innovation and Quality in Pharmaceutical Development (IQ Consortium) [24] and the Cardiac Safety Research Consortium (CSRC [25]) with the intention of conducting a prospective study to evaluate whether ER analysis applied to ECG data from a small SAD-like clinical pharmacology study could serve as an alternative to the TQT study. In this commentary, an outline of the main results of the study is given and potential implications thereof are discussed.

2 The IQ-CSRC Prospective Study

The objective of the IQ-CSRC study was to evaluate whether QT assessment performed in early phase clinical studies using an intense ECG schedule and ER analysis can detect, and therefore also exclude, small QT effects with the same level of confidence as would a TQT study. The selection of drugs, doses, design of the study, and methods of analyses [26] were discussed and agreed upon with the FDA, and the results have recently been published [27]. Healthy subjects (n = 20) were enrolled into the study and underwent three separate treatment periods, during which study treatment or placebo was administered on 2 consecutive days. Six drugs with well-characterized QT effect were selected for the evaluation, as follows. Five ‘QT-positive’ drugs were administered at doses intended to cause QTc prolongation of around 9–12 ms on day 1 and 15–20 ms on day 2: (1) oral ondansetron 52 mg and intravenous ondansetron 32 mg [28]; (2) quinine 648 mg three times daily for four doses [29, 30]; (3) oral dolasetron 100 mg and intravenous dolasetron 150 mg [31]; (4) oral moxifloxacin 400 mg and intravenous moxifloxacin 800 mg [32, 33]; and (5) oral dofetilide 0.125 and 0.25 mg [34]. One negative drug, levocetirizine, was added at the same doses as in its TQT study, 5 and 30 mg [35, 36]. An incomplete block design resulted in each study drug being administered to nine subjects and placebo to six subjects in separate periods. Serial ECGs and pharmacokinetic samples were collected on each dosing day. The primary variable for the ER analysis was the change from pre-dose baseline QTcF (ΔQTcF). Prospective criteria were used to exclude hysteresis and to select the appropriate ER model. 
To claim that the study was able to demonstrate the QT effect of the five QT-positive drugs, the slope of the concentration–QTc effect relationship had to be statistically significantly different from zero, and the upper bound of the two-sided 90 % confidence interval (CI) of the predicted mean ∆∆QTcF had to be greater than 10 ms at the observed geometric mean peak plasma drug concentration (Cmax) of the lower dose on day 1. To exclude a QT effect for the ‘QT negative’ drug (levocetirizine), the upper bound of the two-sided 90 % CI of the predicted mean ∆∆QTcF had to be less than 10 ms at the observed geometric mean Cmax after the higher dose (30 mg) on day 2.
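The two decision rules above can be illustrated with a minimal sketch. It assumes an ordinary least-squares line fitted to pooled placebo-adjusted ∆QTcF values and a normal approximation for the two-sided 90 % CI; it does not reproduce the prespecified models of the study, which are described in [26]. The function names (`fit_line`, `predict_at`, `assess`), the rule that a slope is "significant" when its 90 % CI excludes zero, and the example data are all hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

THRESHOLD_MS = 10.0  # level of regulatory concern named in the text


def fit_line(conc, ddqtc):
    """Ordinary least-squares fit of ddQTcF (ms) on plasma concentration."""
    n = len(conc)
    mean_c = sum(conc) / n
    mean_y = sum(ddqtc) / n
    sxx = sum((c - mean_c) ** 2 for c in conc)
    sxy = sum((c - mean_c) * (y - mean_y) for c, y in zip(conc, ddqtc))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_c
    rss = sum((y - intercept - slope * c) ** 2 for c, y in zip(conc, ddqtc))
    return {"slope": slope, "intercept": intercept, "s2": rss / (n - 2),
            "mean_c": mean_c, "sxx": sxx, "n": n}


def predict_at(fit, conc, level=0.90):
    """Predicted mean effect at `conc` with a two-sided CI (normal approx.)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    pred = fit["intercept"] + fit["slope"] * conc
    se = sqrt(fit["s2"] * (1 / fit["n"]
                           + (conc - fit["mean_c"]) ** 2 / fit["sxx"]))
    return pred, pred - z * se, pred + z * se


def assess(fit, cmax, level=0.90):
    """Apply the 'positive' and 'negative' criteria at a given Cmax."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se_slope = sqrt(fit["s2"] / fit["sxx"])
    # Assumption: significance is judged by the CI of the slope excluding zero.
    slope_significant = abs(fit["slope"]) > z * se_slope
    _, _, upper = predict_at(fit, cmax, level)
    return {"slope_significant": slope_significant,
            "upper_bound_ms": upper,
            "positive": slope_significant and upper > THRESHOLD_MS,
            "negative": upper < THRESHOLD_MS}


# Made-up concentrations (ng/mL) and placebo-adjusted effects (ms).
example_conc = [0, 50, 100, 150, 200, 250]
example_effect = [0.5, 2.0, 5.5, 7.0, 10.5, 12.0]
example_result = assess(fit_line(example_conc, example_effect), cmax=250)
print(example_result)
```

With small cohorts a t quantile may be more appropriate than the normal quantile used here; the structure of the decision is unchanged.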

Data were available from eight to nine subjects receiving active drugs on day 1, from six to nine subjects on day 2, and from six subjects receiving placebo from both study days. Across all timepoints on day 1, the largest mean placebo-adjusted ∆QTcF (∆∆QTcF) was between 10 and 15 ms for all QT-positive drugs except hydrodolasetron (the active metabolite of dolasetron) with an effect of 6.5 ms; ∆∆QTcF was 1.8 ms for levocetirizine. On day 2, the largest mean ∆∆QTcF reached 10.2 and 12.2 ms for ondansetron and hydrodolasetron, respectively, and was above 20 ms for quinine (22.1 ms), moxifloxacin (33.4 ms), and dofetilide (24.5 ms); peak ∆∆QTcF on levocetirizine was 3.1 ms. A linear ER model provided the best fit of the data for all drugs except dofetilide, for which a maximum response (Emax) model was better according to prespecified model-selection criteria. A significant slope of the ER relationship was demonstrated, and the upper bound of the 90 % CI of the predicted effect at the observed Cmax of day 1 was above 10 ms for all positive drugs, i.e., all QT-positive drugs met the prespecified criteria (Table 1; Fig. 1). For the negative drug, levocetirizine, an effect exceeding 10 ms could be excluded at the observed mean Cmax (1005 ng/mL) of the higher dose, 30 mg. Two sensitivity analyses were performed. The first was performed to explore the scenario in which the peak QT effect was at the level of regulatory concern, i.e., the intended effect level on day 1 (9–12 ms). For this purpose, only data from day 1 with the lower doses of the positive drugs and from day 2 with the higher dose of the negative drug were used; criteria for positive and negative QT assessment were still met for all drugs (Table 2; Fig. 2). 
Since many SAD studies are of pure parallel-group design, active treatment periods for subjects who also received placebo were excluded in the second analysis, in effect creating a pure parallel-group comparison with six to seven subjects receiving active treatment and six other subjects receiving placebo. All drugs also met the prespecified criteria with this approach (Table 2; Fig. 3). An analysis of the ‘by timepoint’ effect on the PR and QRS intervals confirmed the known effects of quinine and dolasetron on cardiac conduction, with a largest mean ∆∆PR effect of ~16 ms for both on day 2 and a largest mean ∆∆QRS of 7.7 and 5.2 ms on day 2, respectively.

Table 1 Exposure response (QTc) analysis: the slope of the concentration/QTc relationship and the predicted ∆∆QTc effect at peak plasma drug concentration
Fig. 1

The predicted effect of dofetilide on ∆∆QTcF with a linear and a maximum response ER model (nine subjects receiving active drug and six receiving placebo for both study days). The solid black line with gray shaded area denotes the model-predicted mean placebo-adjusted ∆QTcF with 90 % confidence interval with the linear model, whereas the solid green line with the green shaded area shows the prediction with a maximum response model. The horizontal red line shows the range of plasma concentrations divided into deciles. Red squares with vertical bars denote the observed arithmetic means and 90 % confidence intervals for the placebo-adjusted ∆QTcF within each plasma-concentration decile. The placebo-adjusted ∆QTcF was derived as the individual ∆QTcF on active treatment minus the mean ∆QTcF predicted for placebo by the model. With both models, the slope of the exposure–response relationship was statistically significant and the upper bound of the 90 % confidence interval of the predicted QT effect at the observed peak plasma drug concentration on day 1 (0.42 ng/ml) was above 10 ms

Table 2 Exposure response (QTc) analysis: sensitivity analyses
Fig. 2

The predicted effect of dolasetron (hydrodolasetron) on ∆∆QTcF using data from the lower dose only (day 1) with nine subjects receiving active drug and six subjects receiving placebo; on this day, the largest mean ∆∆QTcF across timepoints was only 6.5 ms. The symbols are as in Fig. 1. The slope of the exposure–response relationship was statistically significant (0.016 ms per ng/ml; 90 % confidence interval 0.0008–0.032) and an effect on QTc above 10 ms could not be excluded (mean 6.8 ms; 90 % confidence interval 3.4–11.6) at the observed peak plasma drug concentration on day 1 (211 ng/ml)

Fig. 3

Exposure–response analysis of the QT effect of levocetirizine using data from parallel groups of subjects with six subjects receiving active drug and six subjects receiving placebo. The symbols are as in Fig. 1. An effect on ∆∆QTcF exceeding 10 ms could be excluded throughout the observed concentration range

3 Implications of the IQ-CSRC Prospective Study

The results from the IQ-CSRC study have been presented to the ICH E14 and S7B discussion group and were recently (December 2014) discussed at a public meeting at the FDA’s White Oak campus, co-organized by CSRC/IQ/FDA, with participation from main regulatory regions (USA/EU/Japan). The discussions were centered on the general applicability of the results, the lack of a positive control in studies intended to replace the TQT study, and limitations of the approach of applying ER analysis to routine SAD/MAD studies. We believe that the results of the IQ-CSRC study provide clear support for replacing the TQT study with ECG assessment in routine clinical pharmacology studies, and would like to share some thoughts on topics that must be addressed for wider acceptance of this approach.

3.1 Lack of Positive Control

It is unrealistic to expect that early phase clinical studies will routinely include a pharmacological positive control; it is therefore important to consider how the study’s sensitivity to detect small QT changes can be evaluated if data are to be used as a substitute for the TQT study. The positive control in a TQT study serves the purpose of demonstrating that the experimental conditions and the ECG methodology of the study are sensitive enough to detect a small effect of the investigational compound, should there be one. The positive control thereby provides reassurance against false negatives, i.e., the scenario where a study fails to detect a drug-induced QT effect, only to find out later in development (or after approval) that the drug causes proarrhythmias associated with pronounced QT prolongation. From a safety perspective, the risk of false negatives is therefore of key importance. However, this risk appears small when ER analysis is applied to early phase QT studies, provided a wide range of plasma concentrations of the drug has been achieved and an intense ECG/pharmacokinetic schedule has been implemented using the same experimental conditions and ECG methodologies as in TQT studies. Since published examples are relatively few [8, 9, 23, 37, 38] and not based on prospective series, a recently published report using a simulation approach can help to gain further understanding of the rate of false negatives and false positives in small-sized studies. A large number of small studies with 6–18 subjects receiving active treatment and six receiving placebo was simulated with data from five TQT studies: three studies with moxifloxacin with mean peak ∆∆QTcF effect of 12.5, 14.0, and 8.0 ms, one study with ketoconazole with a smaller QT effect (∆∆QTcF 7.6 ms), and one TQT study with a drug with a larger effect (∆∆QTcF 26 ms). 
A total of 1000 studies were simulated for each of five sample sizes of subjects on active treatment (n = 6, 9, 12, 15, and 18) for each study, i.e., a total of 25,000 studies [39]. The criterion for negative QT assessment was based on ER analysis and was the same as in this study, i.e., a QT effect (∆∆QTcF) exceeding 10 ms should be excluded. The rate of false negatives with a sample size of nine or more subjects receiving drug and six receiving placebo was 1 % in two of the three moxifloxacin datasets and around 5 % in the third. For ketoconazole, with a smaller peak effect than moxifloxacin, the rate of false negatives was larger with such small sample sizes (around 25–30 %). Similar results, i.e., a rate of false negatives around 5 %, have been obtained with simulation studies performed on TQT study data submitted to the FDA (personal communication, Dr. Jiang Liu, scientific lead of the IRT). These simulations lend support to the claim that the risk of false-negative results is low when ER analysis is applied to data from small studies with drugs that have a threshold QT effect but call for confirmation from real-life studies. The rate of false positives is obviously also of interest, since a high rate would be ineffective from a resource perspective: this rate was below 20 % with nine subjects receiving active (six receiving placebo) and near or below 10 % with 12 subjects in the simulation study discussed above [39]. Which rate of false positives is acceptable will most likely vary among drug developers, since in these cases additional studies would be needed. In case of a small or ambiguous QT effect, a TQT may be the best option; whereas in case of a clear QT effect, further characterization of the effect in the targeted patient population may be needed.
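The logic of such a simulation can be sketched as follows. This is not the method of [39]: it assumes that small studies are formed by drawing subjects at random from a larger pool, that the values are already placebo-adjusted, and that an ordinary least-squares line with a normal-approximation CI stands in for the ER model. The pool itself is synthetic (generated by the hypothetical helper `make_pool`), and the function names and parameters are illustrative only.

```python
import random
from math import sqrt
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.95)  # two-sided 90 % CI, normal approximation


def upper_bound_at(conc, effect, c_ref):
    """Upper 90 % bound of the OLS-predicted effect at concentration c_ref."""
    n = len(conc)
    mc, my = sum(conc) / n, sum(effect) / n
    sxx = sum((c - mc) ** 2 for c in conc)
    slope = sum((c - mc) * (y - my) for c, y in zip(conc, effect)) / sxx
    icpt = my - slope * mc
    s2 = sum((y - icpt - slope * c) ** 2
             for c, y in zip(conc, effect)) / (n - 2)
    se = sqrt(s2 * (1 / n + (c_ref - mc) ** 2 / sxx))
    return icpt + slope * c_ref + Z90 * se


def exclusion_rate(pool, n_active, c_ref, n_sim=1000, seed=0):
    """Fraction of simulated small studies in which a 10 ms effect is excluded.

    For a drug with a true effect of concern this is the false-negative rate.
    """
    rng = random.Random(seed)
    excluded = 0
    for _ in range(n_sim):
        subjects = rng.sample(pool, n_active)  # subjects drawn without replacement
        conc = [c for subj in subjects for c, _ in subj]
        eff = [y for subj in subjects for _, y in subj]
        if upper_bound_at(conc, eff, c_ref) < 10.0:
            excluded += 1
    return excluded / n_sim


def make_pool(n_subjects, true_slope, noise_sd, seed=1):
    """Synthetic stand-in for a TQT dataset: per-subject (conc, ddQTcF) pairs."""
    rng = random.Random(seed)
    levels = [0.0, 50.0, 100.0, 150.0, 200.0]
    return [[(c, true_slope * c + rng.gauss(0.0, noise_sd)) for c in levels]
            for _ in range(n_subjects)]


# One run per sample size, mirroring the n = 6, 9, 12, 15, 18 grid in the text.
pool = make_pool(n_subjects=40, true_slope=0.06, noise_sd=8.0)
for n in (6, 9, 12, 15, 18):
    print(n, exclusion_rate(pool, n_active=n, c_ref=200.0, n_sim=200))
```

Repeating the run over several source datasets and sample sizes gives the kind of grid described above (five datasets × five sample sizes × 1000 simulated studies = 25,000 studies).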

In summary, the risk of false negatives with ‘early QT assessment’ seems acceptably low, in our view, provided the experimental conditions are similar to those in TQT studies and an intense ECG schedule with a high-quality technique has been implemented, paired with pharmacokinetic determination at each timepoint. It should also be emphasized that the confidence in a negative QT assessment in the absence of a positive control will be higher if plasma levels of the drug substantially exceeding therapeutic levels are achieved; if this is not the case, a positive control may be required to gain confidence in the negative results of the drug.

3.2 How Generalizable are the Results: How to Ensure High Quality?

Since a QT effect at the level of concern, i.e., around 10 ms, could be detected in a consistent way in the IQ-CSRC study, both in the primary analysis and in the sensitivity analyses, it may be claimed that the results provide validation of the approach of applying ER analysis to QT data obtained in small-sized studies to replace the TQT study; it can thus be argued that the results create confidence in the approach in general. The important question then becomes to what extent the results can be repeated using other clinical sites, experimental conditions, and ECG methodologies. For TQT studies, the positive control serves as the ‘quality control’ with certain preset criteria that are to be met. There is no reason to believe that the same level of high-quality data cannot be generated from SAD/MAD studies performed in healthy volunteers at experienced clinical sites, but some data-driven metrics of quality may be needed, at least for the near- to mid-term future, as more experience is gained. Some research in this area has been performed using internal FDA data [40, 41], but tests have to be tailored and further defined to allow application to a typical SAD study without a full baseline day and with a small number of subjects. Quality tests could include metrics of heart rate stability within timepoints, a reproducible QT/RR curvature, within- and between-subject variability of the QT interval, and the time course of the adaptation of the QT interval to changes in heart rate. While heart rate and QT variability can be evaluated from extracted ECGs at prespecified timepoints, other metrics may be based on a richer sample of QT/RR data from continuous ECG (Holter) recordings. This should be realistically achievable, given that most ECG studies in healthy subjects are currently performed using continuous recordings. 
Work has been initiated in this regard on the IQ-CSRC study dataset and will need to be tested on routine SAD/MAD studies to allow definition of useful metrics. Based on experience from TQT studies, it is also worth pointing out that stringent control of experimental conditions and the use of standardized ECG techniques will result in lower variability of the QT interval measurements. The objective of these studies is to exclude a small QT effect by using a non-inferiority approach (see criterion below); consequently, the smaller the variability, the greater the chance of excluding an effect. In a way, this serves as an internal quality control.

The underlying concept of ER analysis applied to early phase clinical data is that there should be no need to change the design and sample size of a typical SAD or MAD study; often plasma levels of the drug well above therapeutic levels are achieved and the total number of subjects across dose groups substantially exceeds the sample size (n = 9) of the IQ-CSRC study. Within this framework, certain prerequisites have to be met to support a request for a TQT waiver based on early QT assessment. Sufficiently high plasma concentrations of the parent and abundant metabolites are critically important to support a claim of the absence of a QT effect at clinically relevant concentrations; if justifiable from a tolerability perspective, achieved levels should substantially exceed the highest observed levels in patients. For drugs or metabolites with pronounced accumulation on multiple dosing, sufficiently high plasma levels may not be possible to achieve with a SAD study, and the ECG assessment would in such cases be better performed in a multiple-dose setting; these are the same type of considerations as those in the choice between a single-dose and multiple-dose TQT study.

ER analysis is mentioned in the ICH E14 ‘questions and answers’ (Q&A) document from March 2014 as ‘promising in terms of enhancing our confidence to characterise QTc prolongation’ [42]. Based on the extensive experience with ER analysis for evaluation of QT effects and the results of the IQ-CSRC study, it now seems timely to consider an expanded role for ER analysis in the definitive assessment of a drug’s QT effect and whether ER analysis applied to early clinical phase data can serve as an alternative to the TQT study. For this purpose, the following criterion, analogous to the threshold in the TQT study, has been proposed [7] as a basis for a ‘negative QT assessment’, i.e., to demonstrate that a drug does not cause QT prolongation of concern:

  • The upper bound of the two-sided 90 % CI of the predicted placebo-adjusted ∆QTc should be below 10 ms at the highest clinically relevant plasma concentrations of the drug.
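Written out for the simplest case, the criterion can be stated as an inequality. The linear form of the ER model, the symbols $\hat{\theta}_0$, $\hat{\theta}_1$, and $C^{*}$, and the use of a normal quantile are assumptions introduced here for illustration and are not taken from [7]:

```latex
\[
\widehat{\Delta\Delta\mathrm{QTc}}(C^{*})
  + z_{0.95}\,\mathrm{SE}\!\left[\widehat{\Delta\Delta\mathrm{QTc}}(C^{*})\right]
  < 10\ \mathrm{ms},
\qquad
\widehat{\Delta\Delta\mathrm{QTc}}(C) = \hat{\theta}_0 + \hat{\theta}_1\, C
\]
```

Here $C^{*}$ denotes the highest clinically relevant plasma concentration and $z_{0.95} \approx 1.645$ is the quantile that gives the upper limit of a two-sided 90 % CI; with small samples a t quantile may be more appropriate.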

Therapeutic plasma concentrations and full pharmacokinetic characteristics will obviously not be known at the time of an early phase clinical study, which means that a two-tiered process would be needed, with confirmation of the utility of the dataset once the drug exposure in patients is well characterized, substantially later in development. If it then can be shown that achieved plasma concentrations in early phase clinical studies substantially exceed those seen in patients receiving chronic dosing, in terms of both parent drug and metabolites, a negative QT assessment using ER analysis may serve as a replacement for the TQT study [43].

Based on the increasing experience and confidence in ER analysis of ECG data among regulators and sponsors, and as a consequence of the discussions triggered by the IQ-CSRC study, the ICH E14 discussion group met in June 2015 in Fukuoka [44] and decided to revise the E14 guideline (personal communication, Drs. Stockbridge and Garnett, representing the FDA, and Dr. Keirns, Astellas, representing US PhRMA on the E14 discussion group). This will be handled as an amended Q&A document addressing the role of ER analysis of early phase clinical data and the requirements for replacing the TQT study using this approach, with advice on various aspects of the analysis and quality control of the data.

4 Conclusions

Based on extensive experience from TQT studies and from ECG assessment in patient trials, ER analysis has emerged as an important tool to evaluate the propensity of drugs to cause QT prolongation. A high concordance between the largest observed QT effect in TQT studies and the predicted QT effect at correspondingly high plasma levels of the drug using ER analysis has been observed. The IQ-CSRC prospective study was designed to evaluate whether ER analysis applied to small-sized, early phase clinical studies can be used to detect drugs with a QT effect at the level of regulatory concern. The study correctly identified five ‘QT-positive’ compounds and excluded a QT effect with a ‘QT-negative’ drug, levocetirizine. The study thereby provided validation of the concept of using ER analysis applied to early phase clinical studies to provide definitive QT assessment and serve as a replacement for TQT studies. In consequence, the ICH E14 clinical guidance document will likely be revised to allow an expanded role for ER analysis.