Historical Benchmarks for Quality Tolerance Limits Parameters in Clinical Trials

Background In 2016, the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use updated its efficacy guideline for good clinical practice and introduced quality tolerance limits (QTLs) as a quality control in clinical trials. Previously, TransCelerate proposed a framework for QTL implementation and parameters. Historical data can be important in helping to determine QTL thresholds in new clinical trials. Methods This article presents results of historical data analyses for the previously proposed parameters based on data from 294 clinical trials from seven TransCelerate member companies. The differences across therapeutic areas were assessed by comparing Alzheimer’s disease (AD) and oncology trials using a separate dataset provided by Medidata. Results TransCelerate member companies provided historical data on 11 QTL parameters with data sufficient for analysis for five parameters. The distribution of values was similar for most parameters with a relatively small number of outlying trials with high parameter values. Medidata provided values for three parameters in a total of 45 AD and oncology trials with no obvious differences between the therapeutic areas. Conclusion Historical parameter values can provide helpful benchmark information for quality control activities in future trials.


Introduction
In 2016, the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) E6 guideline introduced quality tolerance limits (QTLs) as a quality control in clinical trials. Since then, the industry has been learning about implementation. In support of this learning, TransCelerate prepared this manuscript to offer a frame of reference for future clinical trials.
QTLs set limits for detection of systematic issues impacting participants' safety or the reliability of trial results. Several recent papers fill in some of the missing details about implementing QTLs, including their relationship to risk-based monitoring [1] and a framework for implementation [2]. Using a risk-based approach, sponsors typically define which of their clinical trials will benefit from QTLs as a risk control. Each clinical trial then sets QTLs by defining the parameters to be observed, the units of measure, the data source(s), and their initial limit(s). TransCelerate's QTL Framework also contains examples of parameters [2].
While many of the parameters are familiar to those involved in clinical trials, using them to set quality limits and to guide quality decisions is still new. Implementation of QTLs brings new challenges in data specifications, collection, and analysis. Defining parameters and setting limits for a trial can be challenging as those need to be predefined (i.e., set before a trial starts and data are accumulated). Historical data, clinical and statistical expertise, and an understanding of the protocol can be helpful in considering how to set the thresholds for QTLs in a given trial. This manuscript was written to expand the available references that sponsors can consult when setting QTLs by providing historical benchmarks of parameter values.
In 2020, TransCelerate published an expanded and updated framework for QTL implementation (available at www.transceleratebiopharma.com and published as a peer-reviewed article [2]). The QTL parameters proposed by Bhagat et al. were used as a basis for this article [2]. TransCelerate collected and aggregated historical data provided by a subset of TransCelerate member companies and separately by Medidata. Because the parameters used are predominantly applicable to medium and large trials, this exercise focused on trials recruiting more than 50 trial participants. At the time this study was designed, the volume and characteristics of the data to be received were unknown; thus, no statistical hypothesis testing was planned.
Importantly, the objective of this article is not to provide recommendations on what parameter values should be applied as QTLs for future trials. These need to be independently determined by every sponsor. Instead, we are providing historical benchmark information on the values of five different parameters, which can be especially useful in conjunction with a sponsor's internal data and research when a sponsor utilizes the concept of 'expectation' as introduced in a 2017 TransCelerate publication on QTLs [3].

Historical Data Collection from TransCelerate Member Companies
Companies were asked to complete a data collection tool without using trial identifiers (e.g., trial name) so that data could be aggregated blindly. The data contribution was voluntary. Key trial characteristics captured for each trial were therapeutic area, phase, number of participants (categorized as less than 20, 21-50, 51-200, 201-1000, or more than 1000), route of drug administration, trial design, and when the trial was completed (within last five years vs. more than five years ago). Trial characteristics that were not available were left blank. Along with trial characteristics, this tool collected QTL parameters measured in each trial and their units and values. Each record (row) within the tool represented one parameter per trial. If a trial was assessed for multiple parameters, multiple lines were completed with the required information. As this was historical data collection, the parameters could be assessed after trial completion and were not used for monitoring the trial actively. Data collection tools completed by each responding member company were shared with the TransCelerate Project Manager, an independent consultant retained by TransCelerate to oversee the QTL initiative. For all planned aggregations, a dataset balancing exercise was performed to prevent results from being disproportionately driven by one member company. For this purpose, the Project Manager (1) calculated the percentage of data from each member company; (2) defined the number of study records that needed to be removed from the sets coming from top-contributing member companies to achieve a balanced dataset; and (3) assigned a random number to every record in a company dataset and removed records starting with the smallest random number and moving higher, until the number of records defined in step 2 had been removed. The activity was performed for each reported parameter independently.
The Project Manager then aggregated and anonymized the data based on the authors' instructions prior to being shared with the QTL team for analysis and interpretation. The aggregated results before and after the balancing exercise were consistent.
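The three balancing steps described above can be illustrated with a minimal Python sketch. It is not the procedure actually used by the Project Manager: the function name `balance_records`, the record layout, and in particular the cap `max_per_company` (the rule deciding how many records to remove in step 2) are assumptions made for illustration, since the exact balancing rule is not stated here.

```python
import random
from collections import defaultdict


def balance_records(records, max_per_company, seed=0):
    """Balance one parameter's records across companies.

    records: list of dicts with a 'company' key (hypothetical layout).
    max_per_company: assumed cap used to decide how many records to drop.
    """
    rng = random.Random(seed)

    by_company = defaultdict(list)
    for rec in records:
        by_company[rec["company"]].append(rec)

    # Step 1: share of the data contributed by each company.
    total = len(records)
    shares = {c: len(recs) / total for c, recs in by_company.items()}

    kept = []
    for company, recs in by_company.items():
        # Step 2 (assumed rule): records above the cap are to be removed.
        n_remove = max(0, len(recs) - max_per_company)
        # Step 3: give every record a random number and remove records
        # starting with the smallest number until n_remove are gone.
        keyed = sorted(((rng.random(), r) for r in recs), key=lambda p: p[0])
        kept.extend(r for _, r in keyed[n_remove:])
    return shares, kept


# Hypothetical data: company A contributes 6 of 10 records.
records = (
    [{"company": "A", "value": v} for v in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0)]
    + [{"company": "B", "value": v} for v in (1.5, 2.5)]
    + [{"company": "C", "value": v} for v in (0.5, 3.5)]
)
shares, kept = balance_records(records, max_per_company=3)
```

Because balancing was performed for each reported parameter independently, a function like this would be called once per parameter on that parameter's records only.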
Aggregations were conducted without formal statistical analyses or hypothesis testing. The majority of responding companies provided data on parameters as a percentage of trial participants and the exact number of trial participants was not collected; therefore, the results were presented as a mean of percentages.
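The practical consequence of reporting a mean of percentages can be seen in a short Python sketch. All numbers below are hypothetical and are not taken from the collected data; the trial sizes are included only to show what a participant-weighted (pooled) percentage would look like if exact participant counts had been available.

```python
from statistics import mean

# Hypothetical per-trial parameter values (percentage of participants).
trial_percentages = [2.0, 4.0, 12.0]

# Mean of percentages: every trial counts equally, regardless of size.
mean_of_percentages = mean(trial_percentages)

# Pooled percentage: requires exact participant counts (assumed here),
# so larger trials carry more weight.
trial_sizes = [1000, 500, 100]
affected = sum(p * n / 100 for p, n in zip(trial_percentages, trial_sizes))
pooled_percentage = 100 * affected / sum(trial_sizes)

print(f"mean of percentages: {mean_of_percentages:.2f}%")
print(f"pooled percentage:   {pooled_percentage:.2f}%")
```

In this made-up example the small trial with a high value pulls the mean of percentages well above the pooled figure, which is worth keeping in mind when comparing the benchmarks with participant-level rates.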

External Partner Data Collection
Clinical trials targeting specific indications were selected from the Medidata Enterprise Data Store (MEDS), comprising 22,000+ historical clinical trials, for de-identified aggregate analyses. In order to identify differences in parameter values between different therapeutic areas, Alzheimer's disease (AD) and oncology trials were compared. Selection of these indications was based on the fact that AD and oncology trials differ significantly and on the availability of the data. A Medidata statistician performed the analysis based on the authors' instructions. The information on protocol deviations (PDs) was retrieved from the disposition dataset and represents the disposition-related protocol deviations.
Aggregations were conducted without formal statistical analyses or hypothesis testing.

Historical Data Collection from TransCelerate Member Companies
Seven TransCelerate member companies provided historical data from a total of 294 trials on parameters previously proposed in the TransCelerate QTL Framework [2]. Each company provided data on 4-119 trials. The trials were predominantly late phase (72% Phase 2 or 3) and represented a wide range of therapeutic areas. The details of the trials are presented in Table 1.
The companies provided data on 2-10 parameters. The number of companies providing data on each parameter is presented in Table 2. In all further aggregations, only data contributed by at least five member companies are presented. This criterion was met for five parameters. The aggregated data for these parameters and trial subsets are presented in Table 3. The lowest mean value for the whole dataset was observed for the percentage of lost to follow-up trial participants and the highest for the percentage of trial participants with premature discontinuation of investigational drug. The possibility of comparisons between various trial characteristics was limited by the availability of sufficient data. Nevertheless, some association between the size of the trial and the percentage of trial participants with premature discontinuation of investigational drug was noted, with larger trials having higher rates of discontinuation. Otherwise, no meaningful differences in parameter values were linked with collected trial characteristics.
The distribution of values was similar for most parameters with clustering at or near zero and a relatively small number of outlying trials with high parameter values. This pattern was less visible for percentage of trial participants with premature discontinuation of investigational drug.

External Partner Data Analysis
As comparisons between different therapeutic areas and other trial characteristics were limited in the member company historical data collection, this was addressed using an alternate source of data. Medidata provided values for three parameters in 45 AD and oncology historical trials. The information on classification of all PDs was incomplete in the available datasets; therefore, the percentage of all patients with disposition-related PDs was assessed.
This analysis did not show any striking differences between the AD and oncology trials in any of the three analyzed parameters. However, there was a somewhat higher mean percentage of patients withdrawing consent in the oncology trials (Table 4). Also, trial characteristics (phase, design, and time of completion) did not appear to correlate with historical parameter values within the oncology dataset (Table 5). In general, for the three parameters aggregated, the mean parameter values were higher than medians in this dataset because of the small number of trials reporting high and outlying parameter values.
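The effect of a few outlying trials on the mean relative to the median can be reproduced with a small Python sketch. The values are hypothetical and chosen only to mimic the pattern described (most trials near zero, one outlier); they are not drawn from the Medidata dataset.

```python
from statistics import mean, median

# Hypothetical per-trial percentages: clustered near zero, one outlier.
values = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 25.0]

mean_value = mean(values)      # pulled upward by the outlying trial
median_value = median(values)  # reflects the typical trial

print(f"mean = {mean_value:.2f}%, median = {median_value:.2f}%")
```

For right-skewed distributions of this kind, a sponsor using historical values as benchmarks may want to look at both statistics rather than the mean alone.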

Discussion
Historical data analyses, together with medical and statistical characteristics of a trial, were previously proposed as factors to be considered in setting QTLs [3]. Many sponsors conduct historical data analyses using data from their own completed trials in order to generate benchmark data for setting QTLs in new trials. This activity may be difficult or impossible when a sponsor has a small portfolio of trials or when an organization is entering a new therapeutic area. This article shares benchmark data on parameters proposed as the basis for QTLs and summarizes the availability of public domain data that could be helpful in determining the historical values of the parameters.
For the purpose of this study, we collected historical data from several TransCelerate member companies. The first TransCelerate publication on QTLs was released in 2017 [3]. It included examples of parameters: percentage of ineligible participants, percentage of premature treatment discontinuations, and percentage of participants lost to follow-up. These are some of the same parameters for which the most responding member companies provided data for this publication. The data collection period for the article was short, and the member companies were encouraged to share data that were readily available. Therefore, the parameters most represented in the collected historical dataset may reflect the ease of data collection or frequency of use in QTL implementation projects.

Table 2. Number of member companies providing data on each parameter
Percentage of trial participants randomized who do not meet inclusion/exclusion criteria: 7
Percentage of trial participants with premature discontinuation of investigational drug: 6
Percentage of trial participants with withdrawal of informed consent: 5
Percentage of lost to follow-up trial participants: 6
Percentage of trial participants for whom trial endpoint data were not collected: 5
Percentage of trial participants with important protocol deviations other than eligibility: 3
Percentage of trial participants on rescue medication: 1
Percentage of trial participants who are non-compliant with investigational drug administration as defined in the protocol: 1
Percentage of trial participants censored for primary objective analysis: 2
Percentage of randomized trial subjects who were incorrectly stratified: 2

The parameters proposed by Bhagat et al. [2] can be grouped into those that revolve around missing data (including those due to withdrawal of consent) and those that pertain to PDs. The amount of data published on each type of parameter differs significantly, with more data available for parameters associated with missing data than for PD parameters.
Missing data from participants lost to follow-up were addressed in multiple previous publications from a variety of therapeutic areas [4-8], with rates of participants lost to follow-up consistent with the data gathered for this publication. In addition, Fewtrell (2008) and Kristman (2004) modeled loss to follow-up at rates similar to those reported here [9,10]. Each of these publications had a different definition of loss to follow-up and covered a variety of therapeutic areas; however, this variability is representative of our industry.
The value of loss to follow-up as a QTL that is measured, monitored, and managed is highlighted in a 2011 literature review [11]. The literature review concluded that loss to follow-up might change conclusions written in peer-reviewed published study results in 15% of randomized clinical trials with time to event outcomes. In addition, some trials do not adequately report loss to follow-up [11]. Crutzen et al. addressed differential attrition across treatment arms of a trial; they reviewed 100 randomized clinical trials in general and internal medicine and found a mean attrition rate of 13%, but no significant differential attrition [4].
Also, both our study and the Sweetman et al. study [12] found significant variability in the percentage of ineligible participants with a small subset of trials having high percentages of participants with eligibility PDs. Trials in our analysis preceded the existence of industry-wide guidance on defining and reporting of PDs [14]. The sponsors did not receive any instructions for categorization of the PDs (minor or major) or inclusion/exclusion of minor PDs in the calculations of the percentages of participants with eligibility PDs. It can be speculated that the small number of trials reporting outliers for this parameter had a systematic issue with eligibility PDs. Standardizing PDs and establishing QTLs as a control may have had a beneficial effect on trial quality.
The public domain can be a rich source of historical data for QTL parameters that pertain to completeness of observations. Introduction of CONSORT diagrams in the majority of publications and disposition tables in results published in trial registries allows retrieval of information on patients withdrawing consent and lost to follow-up in the majority of historical trials [15]. On the other side of the spectrum are the parameters for which PD information in the public domain is very limited and therefore subject to selection bias. The availability of data on completeness of intervention reflected in parameters pertaining to treatment compliance and premature discontinuations is somewhere in the middle.
In addition to calculating historical benchmarks for parameters that may provide some basis for QTLs, it may be important to understand how various trial characteristics affect the parameter values. In the data collected from the TransCelerate member companies, the only notable difference within the dataset based on study characteristics was observed for the percentage of trial participants with premature discontinuation of investigational drug. The mean parameter value was higher for larger trials. This finding is difficult to interpret without analyses of other trial characteristics. It can be hypothesized that larger trials usually represent later phases of development with longer treatment periods and more real-life settings, making premature drug discontinuations more prevalent. Alternatively, the size of a trial may correlate with the therapeutic area, e.g., late phase oncology trials are often in the 201-1000 participants size category and are known to have high treatment discontinuation rates due to adverse events [16,17]. The comparison of AD and oncology trials performed by Medidata did not show any striking difference between these two therapeutic areas in percentage of lost to follow-up trial participants and percentage of trial participants with protocol deviations. The slightly higher percentage of patients withdrawing consent in oncology trials may be purely due to chance. It may also reflect the fact that oncology patients often seek additional lines of treatment and may change care centers after progression. This, in combination with study burden and mortality, can make obtaining full follow-up on patients more difficult.
Overall, the findings in this paper did not provide a definitive answer to the question of whether trial characteristics, such as phase, therapeutic area, and trial size, have a significant impact on historical values of parameters proposed for QTLs. More studies and analyses are needed to answer the question. The data presented in this article may, however, make the planning and hypotheses formulation for such work easier.
The analyses presented in this article have some limitations. As this was the first look at historical data in the context of parameters explored by TransCelerate, conclusions were limited by the lack of hypothesis testing. The survey methodology, the wide variability among the trials for which TransCelerate member companies provided data, and the restriction of the Medidata dataset to AD and oncology trials together limited comparisons between different therapeutic areas, drug administration routes, and other trial characteristics. While the authors attempted to analyze trials representative of the types for which parameters proposed by Bhagat et al. were intended [2], the analyzed trials are just a fraction of the studies performed every year; therefore, selection bias cannot be excluded.
Historical parameter values presented here provide clinical trial sponsors with benchmark information that may help sponsors in setting QTLs in new clinical trials. In addition, public domain data on historical values of parameters proposed for QTLs are available, but variable in terms of exactly what is presented across trials. Regulatory authorities (e.g., United States Food and Drug Administration [US FDA]) are signaling potential revisions of approaches to trial results disclosure regulations to further align reporting on this type of information [18]. Also, industry forums and consortia may develop data exchange platforms that include historical data pertaining to PDs. Regardless, the body of historical data available to establish QTLs in new trials will continue to grow in the future.

Conclusion
The first TransCelerate publication on QTLs included examples of parameters [2]. For this paper, we collected historical data on these parameters from clinical trials conducted by TransCelerate member companies and from Alzheimer's disease and oncology trials in a dataset provided by Medidata, and we shared benchmark data on parameters that may serve as the basis for QTLs. We did not provide a definitive answer to the question of whether trial characteristics have a significant impact on historical values of parameters proposed for QTLs. More studies and analyses are needed to answer the question.