Key Points

Graphic displays of study treatment population-level data (study drug and comparator/placebo groups) should be linked to individual graphic timelines that depict serial biochemical measures in each study subject with acute or worsening liver injury, irrespective of the presumed etiology. Expert clinical narratives and time-linked diagnostic studies that provide valuable information for determining injury phenotype and causality should be appended to these graphic timelines

IT tools from sources such as CDISC should be used in the preparation of clinical trial findings pertinent to DILI analysis, in order to ensure complete and uniformly collected data results

Emergence of cases of mild, moderate or severe liver injury associated with a study drug should prompt the preemptive collection of DNA from all study subjects, both in the study drug and comparator treatment arms. In addition, serum samples from all study subjects obtained both before and during treatment should be stored to enable the future identification and study of candidate proteomic, metabonomic and other soluble biomarkers or predictors of DILI

1 Introduction

A workshop jointly sponsored by the Hamner-University of North Carolina Institute for Drug Safety Sciences and the European Innovative Medicines Initiative on best practices for the assessment of drug-induced liver injury (DILI) in clinical trials was convened in Boston, Massachusetts on November 9, 2012 [1]. Achieving an accurate early assessment of risk surrounding DILI in clinical trials continues to present a series of critical challenges for the developers of new drugs and biological agents, as well as government regulators. From a public health perspective there is a broad interest to (1) improve the quality, reliability and consistency of risk evaluation in clinical studies by diverse stakeholders, that include the large and small manufacturers of synthetic drugs and biological agents, academic investigators, governmental regulatory scientists and reviewers of New Drug Applications; (2) encourage broader enrollment of clinical study subjects who reflect ‘real-world’ patients, including those with different levels of pre-existing necro-inflammatory disease and fibrosis of the liver; (3) facilitate efficient and comprehensive regulatory review with attention paid to the impact on benefits and risks surrounding a new drug in individuals with increased susceptibility for the development of hepatotoxic reactions to the agent; and (4) provide a sound basis for aggregating data across drug development programs in order to predict the effects of demographic factors, background diseases and concomitant treatments on DILI and provide an expanding resource for pharmaceutical industry, academic and government scientists to discover new predictive biomarkers of DILI in a precompetitive space.

These objectives form a rationale for the development of best practices in data acquisition and analysis of hepatotoxicity in clinical trials. They are prompted by major gaps in the state of current knowledge to accurately assess the causal association of hepatic injury with exposure to newly developed study drugs or biological agents.

First, species-dependent differences in xenobiotic metabolism and regulatory pathways that determine toxic responses to drugs have prevented a dependable extrapolation of drug-related findings in animals to humans. Currently, there are no validated pre-clinical or clinical DILI biomarkers that reliably predict which new drugs can cause idiosyncratic DILI prior to occurrence of the actual events. In addition, clinical biomarkers have not yet been identified that indicate which patients are susceptible to serious DILI or whether adaptation or progression of liver injury will ensue after initiation of mild DILI. Second, DILI caused by different newly developed drugs may arise from different toxicological mechanisms and be associated with distinct clinical phenotypes as well as degrees of injury severity. These differences affect which clinical datasets are crucial to collect in clinical trials. Third, distinct susceptibilities to DILI are often related to demographic characteristics as well as baseline medical conditions in different populations. Required investigations surrounding an adequate evaluation of serious DILI cases in a clinical trial population must always include diagnostic testing for the systematic exclusion of all plausible competing etiologies for new-onset liver injury in similar patients, a thorough evaluation of pre-existing medical conditions and liver diseases, as well as the screening for relevant known pharmacogenomic markers of DILI risk.

Unfortunately, to achieve consensus for best practices in this arena there are a number of significant hurdles that must be overcome. On the one hand, it is self-evident that from a regulatory perspective, levels of refinement in the quantitative risk evaluation surrounding DILI that are required to adjudicate approvability of a new drug based on overall benefits and risks will depend on the strengths of the overall benefits. For example, the degree of quantitative precision that is required to assess risk for rare idiosyncratic liver failure surrounding a new treatment for a non-life threatening disorder of recurrent mild symptoms is higher than for a highly malignant tumor with a poor prognosis. On the other hand, in order to achieve uniform standards there is a necessity for establishing consistency in nomenclature and equivalency in data acquisition across drug development programs and investigational units for clinical studies. Comparability in the assessment of cases of hepatotoxicity across clinical trials in similar treatment populations will strongly depend on the development of common standards for the characterization of clinical courses of DILI and diagnostic testing to exclude non-drug related etiologies of liver injury.

This article discusses recommendations from a breakout session at the workshop that was charged to consider required data elements and best practices for data collection and standardization. The session paralleled others whose recommendations are summarized in companion articles on (1) methodology to assess clinical liver safety data [2], (2) causality assessment for suspected DILI [3], and (3) liver safety assessment in special populations (hepatitis B, C, and oncology trials) [4].

2 Standardization of DILI Terms

Standardization of nomenclature that characterizes hepatotoxicity in both pre-clinical and clinical settings is a prerequisite for establishing best practices in clinical trials. Characterization of clinically significant cases of DILI encompasses a need to define organ injury phenotype, the level of clinical severity that includes measures of liver function/dysfunction, and the level of likelihood of causal association with the study drug. Recently, there have been a number of international workshops and projects to establish standardization of terms for these attributes, most with reference to the characterization of hepatotoxicity caused by exposure to a marketed drug in non-study outpatients who are referred for evaluation [5–7]. In contrast, discussion about a plan to develop best practices as envisioned in this workshop specifically revolves around clinical trial enrollees treated with study drugs under development.

In clinical trials there is a unique opportunity to comprehensively monitor and systematically evaluate all enrollees at different time points from the pre-treatment phases to the end of the study. Generally, this assessment is complemented by a thorough evaluation of findings generated from pre-clinical in vitro and animal test systems.

DILI has been linked to a diverse range of drug and patient-specific pathological and/or clinical phenotypes and profiles of laboratory abnormalities [7] (see Table 1), implying that multiple potential mechanisms of injury are responsible for inciting hepatotoxicity.

Table 1 Pathological and clinical phenotypes of DILI

When appropriate, these terms can be used as phenotypic descriptors of DILI cases in clinical trials. Establishing best practices surrounding the use of these different descriptors will require further discussion. In addition to hepatocytes, other cell types, including biliary, sinusoidal and Kupffer cells, may be damaged or contribute to the injury process. Moreover, although DILI is often associated with acute damage effects, in some instances it has also been linked to subacute, persistent and chronic forms of injury.

With such a diversity of DILI phenotypes and clinical signatures, it is self-evident that establishing rigorous standards in nomenclature is critically important to achieve best practices across drug development programs.

In conjunction with a requirement to adhere to precise definitions of different clinically important states of liver injury and levels of organ function/dysfunction in cases that occur in clinical trials, it is also necessary to conform to commonly adopted lab units (e.g. International Units/liter for serum liver enzymes and mg/dl for bilirubin) and approaches to establishing cut points between normal and abnormal measurements of circulating liver enzymes. As drug-induced hepatotoxicity in every individual is a dynamic process which changes over time from initiation of liver injury to progression or resolution, these definitions must incorporate criteria defined by the timing and evolution of these events. For example, the ratio of serum alkaline phosphatase (ALP) to alanine aminotransferase (ALT) levels often may increase over the course of acute liver injury.

Thus, since acute hepatocellular and cholestatic injuries are defined by R values (the ratio of ALT fold upper limit of normal (ULN) to ALP fold ULN) that are greater than 5 and less than 2, respectively, establishing a convention that these should be measured at the onset of liver injury is crucial. In addition, establishing criteria for cut-off boundaries between normal and abnormal values of liver enzymes is critically important. In some instances, it is appropriate that the distribution of values within a specific demographic group or treatment population, if documented, should serve as a frame of reference for the measurement of an upper limit of normal. However, when study populations are composed of individuals with frequent and variable pre-treatment elevations of liver enzymes, such as in clinical trials to treat chronic viral hepatitis or non-alcoholic steatohepatitis (NASH), population-based upper limits of normal do not apply.
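
The R value calculation described above can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for illustration. The cut points of 5 and 2 are those stated in the text, and values between them are left unlabeled here because this section does not assign them a category.

```python
def fold_uln(value, uln):
    """Express a lab value as a multiple of its upper limit of normal (ULN)."""
    if value <= 0 or uln <= 0:
        raise ValueError("lab value and ULN must both be positive")
    return value / uln


def r_value(alt, alt_uln, alp, alp_uln):
    """R = (ALT as fold ULN) / (ALP as fold ULN), using values at injury onset."""
    return fold_uln(alt, alt_uln) / fold_uln(alp, alp_uln)


def injury_pattern(r):
    """Label the injury pattern using the cut points given in the text."""
    if r > 5:
        return "hepatocellular"
    if r < 2:
        return "cholestatic"
    # This section does not name the range from 2 to 5, so no label is assumed.
    return "unclassified (2 <= R <= 5)"


# Hypothetical onset values: ALT 400 U/L (ULN 40), ALP 120 U/L (ULN 120).
onset_r = r_value(alt=400, alt_uln=40, alp=120, alp_uln=120)
print(onset_r, injury_pattern(onset_r))
```

Because the ratio can shift over the course of an injury, the inputs should be the measurements taken at onset, in keeping with the convention proposed above.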

In the case of treatment trials for NASH, individual pre-treatment baseline measurements of serum liver enzyme and bilirubin may be optimal as reference levels to assess subsequent acute liver injury. However, in treatment trials for viral hepatitis the values of liver tests at their nadir, after treatment-induced viral suppression has occurred, may be more suitable as reference levels to assess possible DILI, if liver indicators later rise with continued treatment. These considerations are extensively discussed in a companion article [4].

At the time of enrollment of patients with pre-existing liver diseases, when should baseline measurements prior to treatment with a study drug be performed? In developing best practices, there has been general agreement that to reliably measure baseline enzymes prior to the initiation of a test drug, a minimum of two time points should be sampled. Baseline measurements at more than one time point are particularly important if the values of liver tests are likely to fluctuate as a result of the natural course of pre-existing liver disease. One proposal was that these measurements be performed during a clinical trial’s run-in phase, both just before the initiation of study drug treatment and one month earlier.

Another consideration in adopting best practices for measuring liver enzymes during clinical trials is the nature of the study drug. Certain drug groups, including many classes of chemotherapeutic agents are marked by a narrow therapeutic index. These products would be expected to cause dose-related liver injury in many study subjects in a range often close to or overlapping with therapeutic dosing. With such drugs, studies of dose response and duration of treatment effects on liver biochemical indicators are vital. In addition, it is important to identify extrinsic and intrinsic modifiers of study drug exposure in the liver, as well as factors which predict increased susceptibility to hepatotoxicity by the study drug at lower doses.

3 Domains that Influence What Data Elements Should be Collected and Standardized

The required elements of data acquisition and standards of data management in clinical trials are influenced by the clinical context in which each study drug will be used. To optimize best practices for acquiring and managing necessary data, five interconnected domains must be harmonized: study design, data acquisition, DILI case assessment and management, data management, and scientific and regulatory review. These domains inform one another with regard to the required elements they contain and can be modified for study drugs used in certain clinical contexts.

3.1 Impact of Study Design on Necessary Data Elements

Data required for a thorough analysis of DILI events in clinical trials are impacted by study inclusion/exclusion criteria. For example, when patients with a pre-existing liver disease are recruited into studies, it is especially important to accurately measure patient-level baseline values of all liver enzymes and other liver tests (e.g. serum bilirubin and INR) prior to treatment with the study drug. It was suggested at the best practices workshop that samples should be obtained at two or more time points one month apart in the pre-treatment phase to determine if these parameters are constant or subject to fluctuations. Because many drugs are metabolized and cleared by the liver, pre-existing liver diseases may significantly alter blood pharmacokinetic (PK) profiles, delivery rates of the drug to the liver and/or other tissues, hepatocellular modifications of the parent drug by phase I and II enzymes and/or secretion of the drug products in phase III out of hepatocytes into the bile or other extracellular compartments. Major disease-driven perturbations at any one of these steps can have important effects on dose-dependent study drug efficacy and/or risk for toxicity.

As described, these can be different, depending on the pathological processes, clinical severity and chronicity of the underlying liver disease. Thus, studies which identify changes in study drug and drug metabolite exposure levels due to reduced drug-protein binding in the circulation, portosystemic shunting, alterations in the drug’s hepatocellular metabolism and clearance in a study population with pre-existing liver disease are crucial for the evaluation of risk factors for hepatotoxicity. In a similar vein, a comprehensive assessment of drug-drug interactions that alter the uptake, metabolism and clearance of study drugs metabolized and/or cleared by liver cells is necessary if there is a toxic exposure threshold that may be crossed. Finally, change-of-function genetic polymorphisms of a study drug transporter protein or metabolizing enzyme, and proteins which determine hepatic immune responses to the drug (e.g. major histocompatibility complex Class I and II molecules) take on special importance when cases of DILI caused by ‘intrinsic’ factors are identified in clinical trials. When genomic variants are discovered in a rigorously scientific manner to be associated with susceptibility to study drug-specific DILI there may be a future opportunity to screen patients as an aid in patient risk management.

With reference to best practices, to evaluate performance characteristics of putative genetic markers the workshop attendees concluded that it is critically important to systematically collect DNA biospecimens from all study subjects, including controls treated with a comparator/placebo, as well as those treated with the study drug who do not develop DILI.

3.2 Protocols for Data Acquisition

Taking study design and study enrollment criteria into account, clinical study protocols must include detailed pre-specified instructions with comprehensive listings of all the clinical measurements and assays that should be performed at baseline and during treatment with the study drug/comparator, together with detailed timelines that describe when and how biospecimens should be obtained and handled by site investigators. Detailed protocols of required tests and clinical data should be provided for all study subjects and, separately, for all study subjects who develop acute liver injury or worsening of liver injury during treatment with the study drug or comparator, with follow-up data until the end of the study period (see Sect. 3.3). Protocols should also include a plan(s) to utilize expert internal and/or external consultation available to each drug developer to provide timely guidance on obtaining sufficient data for the comprehensive characterization of clinical phenotype and etiology as well as for effective ‘real-time’ clinical management of all cases of acute liver injury occurring during treatment with the study drug or comparator. If an emerging liver ‘signal’ comprised of cases of mild, moderate or severe liver injury is associated with a study drug, preemptive collection of genomic biospecimens from all study subjects in large clinical trials of the drug should be performed. In addition, serum samples obtained both before and during treatment of all study subjects should be stored to enable the future study of new proteomic, metabonomic and other soluble biomarkers or predictors of DILI (see below). In special circumstances, when there is a rising concern that a study drug may cause serious DILI, it may be necessary to unblind study subject treatment assignments of individuals with etiologically undiagnosed liver injury before the study is terminated.

Clinical development programs for drugs that will likely be used to treat patients with increased susceptibility to serious DILI after marketing has begun should contain sufficient data to identify and characterize important ‘risk factors’ and drug exposure effects. Target treatment populations that may be associated with an increased risk for DILI include patients with pre-existing liver disease, unique demographic risk-related characteristics, use of concomitant drugs responsible for potentially toxic drug-drug interactions with the study drug, reduced drug clearance, or genomic markers of increased risk.

3.3 Protocols for DILI Case Assessment and Management

As described above, complete protocols in clinical trials must provide instructions to obtain all required data elements that will enable full assessment of all cases of new onset or worsening liver injury. There are three broad reasons to establish a detailed map for clinical investigators to follow to obtain necessary data elements. First, it will provide a basis to mitigate risk for serious outcomes in individual study subjects with DILI based on prognostic considerations. Time-sensitive actions to optimally manage DILI include discontinuation of the study drug in a timely manner, avoiding re-challenge, modification of dosage, performance of time-sensitive diagnostic tests and institution of therapeutic interventions. Second, after evaluation of all DILI cases in the clinical trial database, it will provide a sound basis to predict risk for clinically serious DILI in post-market treatment populations based on analyses of phenotype, clinical severity, DILI incidence in the study population and identified factors tied to increased DILI susceptibility. Third, once uniform practices in data collection are adopted there will be a growing opportunity to aggregate data across clinical trials to facilitate the research of DILI.

3.4 FDA Guidance on Pre-Marketing Assessment of DILI as a Frame of Reference for Best Practices

A guidance on pre-marketing risk assessment of DILI in clinical trials was issued by FDA in 2009 [8]. FDA guidances generally do not establish legally enforceable responsibilities for industry, unless specific regulatory or statutory requirements are present. Rather, they describe the agency’s thinking on a topic and provide suggestions and recommendations. These can later be revisited and refined after careful science-based deliberation by FDA with stakeholders. Since the best practices workshop was convened by non-FDA parties, conclusions drawn by attendees do not represent an FDA position and should not be construed as replacing the 2009 guidance. At the workshop, there was a consensus that from a best practices perspective, points previously made in the FDA guidance pertinent to DILI case evaluation and management have proven very useful and should be followed. The guidance should be referred to directly, for this purpose.

In brief, the FDA guidance has emphasized that by definition Hy’s Law cases are marked by serious hepatocellular injury for which, after an appropriate work-up for possible alternative causes, the study drug has been identified as the most likely cause. The term ‘Hy’s Law’ refers to an original observation made by the late Dr. Hyman Zimmerman that at least 10% of patients who develop jaundice caused by drug-induced hepatocellular toxicity will develop liver failure. As described in the 2009 guidance, biochemical criteria consistent with Hy’s Law include elevations of serum ALT or AST >3X ULN, accompanied by increases of serum bilirubin levels >2X ULN that occur concomitantly or ensue within a one-month period. Since the hepatic injury associated with Hy’s Law is hepatocellular and not cholestatic in nature, the guidance notes that peak ALP levels are characteristically <2X ULN. Moreover, R values [ALT ÷ ALP (fold ULN)] >5 are consistent with hepatocellular DILI, as defined by an international consensus [5, 6]. The presence of Hy’s Law cases in a pre-approval clinical trial database raises substantial concern that post-marketing liver failure cases associated with the study drug are likely to occur, assuming that the incidence and prognostic implications of these events can be projected into a large treatment population. Attendees at the workshop agreed that new retrospective studies of liver safety databases for drugs known to have risk of causing liver failure would provide valuable opportunities to further investigate this link. Isolated serum ALT elevations often resolve even with continuation of treatment with an idiosyncratic hepatotoxic agent. Thus, such transient enzyme elevations, particularly those that are low or moderate, have little prognostic value.
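
As an illustration only, the biochemical criteria consistent with Hy’s Law described above could be screened for as in the following Python sketch. The function name, the input layout and the 30-day reading of ‘a one-month period’ are assumptions made for this example. The ALP condition is treated as a requirement here, although the text describes it only as characteristic. A positive screen is not a Hy’s Law case: that designation also requires a work-up that excludes alternative causes.

```python
from datetime import date


def meets_hys_law_biochemistry(at_obs, bili_obs, peak_alp_xuln, window_days=30):
    """Screen for the biochemical pattern consistent with Hy's Law.

    at_obs:   list of (date, higher of ALT and AST as a multiple of ULN)
    bili_obs: list of (date, serum bilirubin as a multiple of ULN)
    peak_alp_xuln: peak ALP as a multiple of ULN
    window_days: assumed length of "one month" (30 days is an assumption)
    """
    if peak_alp_xuln >= 2:
        # Peak ALP is characteristically <2X ULN in hepatocellular injury.
        return False
    for at_day, at_xuln in at_obs:
        if at_xuln <= 3:
            continue
        for bili_day, bili_xuln in bili_obs:
            gap = (bili_day - at_day).days
            # Bilirubin rise is concomitant with, or follows, the AT rise.
            if bili_xuln > 2 and 0 <= gap <= window_days:
                return True
    return False


# Hypothetical subject: AT 4X ULN on 1 March, bilirubin 2.5X ULN two weeks later.
at_series = [(date(2012, 3, 1), 4.0)]
bili_series = [(date(2012, 3, 15), 2.5)]
print(meets_hys_law_biochemistry(at_series, bili_series, peak_alp_xuln=1.2))
```
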
Since it is not yet possible to identify that small subset of study subjects with mild DILI marked by rises of aminotransferases who will fail to adapt to the study drug and progress to life-threatening forms of liver injury, the FDA guidance has set criteria that trigger a requirement for follow-up and evaluation of study subjects who develop abnormal test results as well as ‘stop-treatment’ recommendations that are conservative in favor of subject safety. The guidance stipulates that subjects who develop serum aminotransferase (AT) increases >3X ULN should be followed by repeat serum liver testing within 48–72 h. If repeat testing demonstrates that AT levels remain >3X ULN, or >2X pre-treatment values for subjects with elevated levels before study drug exposure, the guidance recommends close observation that includes elicitation of a detailed history and performance of diagnostic lab studies to exclude other causes of acute liver injury. In addition, continued liver testing 2–3X /week is recommended until resolution or return to baseline values of liver test results. Unfortunately, in the guidance there are notable gaps in setting criteria for initiating close observation of study subjects with pre-existing liver diseases marked by high baseline AT measures.
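
The follow-up triggers summarized above can be sketched in Python as shown below. The function names are hypothetical. The sketch assumes one reading of the text: subjects with elevated pre-treatment AT levels are judged against 2X their own pre-treatment value, and all others against 3X ULN. The guidance itself should be consulted for the authoritative wording.

```python
def needs_repeat_testing(at_xuln):
    """AT >3X ULN calls for repeat liver testing within 48-72 h."""
    return at_xuln > 3


def needs_close_observation(repeat_at_xuln, pretreatment_at_xuln=None):
    """Decide whether repeat testing supports starting close observation.

    pretreatment_at_xuln: pre-treatment AT as a multiple of ULN, or None
    when no pre-treatment elevation was documented.
    """
    if pretreatment_at_xuln is not None and pretreatment_at_xuln > 1:
        # Assumed reading: elevated baseline, so compare with 2X pre-treatment.
        return repeat_at_xuln > 2 * pretreatment_at_xuln
    return repeat_at_xuln > 3


print(needs_repeat_testing(3.5))                               # first result
print(needs_close_observation(3.4))                            # normal baseline
print(needs_close_observation(3.4, pretreatment_at_xuln=2.0))  # elevated baseline
```
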

The guidance also recommends that the study drug be immediately stopped when any of the following results are obtained by the site investigator: (1) ALT or AST >8X ULN; (2) ALT or AST remains >5X ULN over 2 weeks; (3) ALT or AST >3X ULN and (total bilirubin >2X ULN or INR >1.5); (4) ALT or AST >3X ULN with symptoms (e.g. fatigue, nausea and vomiting, right upper quadrant pain, fever, rash) or eosinophilia. The guidance also recommends that if an earlier episode of hepatotoxicity with the study drug has occurred, rechallenge should be avoided unless there are no other good therapeutic options.
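
A minimal Python sketch of the four stop-treatment criteria listed above follows. The class and field names are hypothetical, and ‘over 2 weeks’ is read here as more than 14 days, which is an assumption. The sketch returns the numbers of the criteria that are met, so that the reason for stopping can be recorded.

```python
from dataclasses import dataclass


@dataclass
class LiverStatus:
    at_xuln: float              # higher of ALT and AST, as a multiple of ULN
    bilirubin_xuln: float       # total bilirubin, as a multiple of ULN
    inr: float
    days_at_above_5x: int = 0   # consecutive days with AT >5X ULN
    symptomatic: bool = False   # e.g. fatigue, nausea, RUQ pain, fever, rash
    eosinophilia: bool = False


def stop_rules_met(s):
    """Return the numbers (1-4) of the stop criteria met by this status."""
    met = []
    if s.at_xuln > 8:
        met.append(1)
    if s.at_xuln > 5 and s.days_at_above_5x > 14:  # assumed: more than 14 days
        met.append(2)
    if s.at_xuln > 3 and (s.bilirubin_xuln > 2 or s.inr > 1.5):
        met.append(3)
    if s.at_xuln > 3 and (s.symptomatic or s.eosinophilia):
        met.append(4)
    return met


# Hypothetical subject with AT 4X ULN and bilirubin 2.5X ULN.
print(stop_rules_met(LiverStatus(at_xuln=4.0, bilirubin_xuln=2.5, inr=1.0)))
```
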

To evaluate individual cases of hepatic injury in clinical study subjects, the guidance has set forth a series of recommendations. First, it enumerates all of the critical elements that should be ascertained for characterizing cases of liver injury and incorporated into the case report forms (CRFs) of subjects, either treated with the study drug or comparator in order to fully describe the events, as well as demonstrate results of a complete battery of diagnostic studies. These are incorporated into Table 2.

Table 2 Critical Elements for Characterizing Cases of Liver Injury in Clinical Trials

Second, it has set forth recommendations to check for clinical correlates and perform diagnostic tests in subjects with possible study drug-induced hepatotoxicity that exclude all the main alternative causes of liver injury (see Table 3).

Table 3 Alternative Etiologies of Liver Injury other than Study Drug in Clinical Trials (the bolded items represent the minimal data to be collected)

Finally, the guidance makes recommendations for the overall assessment of liver-related adverse events in a clinical trials database, once studies of a new drug candidate are completed in preparation of a New Drug Application or Biologic License Application. These recommendations include assessment of the metabolism of the study drug and measurement of the incidences of liver injury, in individual trials and in the entire clinical trial database, with stratification by levels of severity, as defined both by biochemical and clinical parameters. Descriptions of these strata are incorporated into Table 5. Submissions to FDA should also contain narrative summaries of all cases marked by biochemical parameters conforming to Hy’s Law that include extensive demographic, clinical and laboratory data to achieve comprehensive differential diagnoses and causality assessments.

3.5 Workshop Discussion on Acquisition of Critical Data Elements

The workshop attendees concluded that best practices being considered to evaluate DILI in clinical trials should support and maintain consistency with recommendations in the FDA guidance. In concert with the FDA guidance, the workshop concluded that a number of additional best practice measures are in order. Many of these proposed or considered measures are summarized in the tables as italicized comments. First, in Phase I studies of new drugs serum liver chemistry testing should be conducted in all study subjects exposed to the drug on a frequent basis (at least every 2 or 3 days during the study period). Second, critical elements for characterizing cases of liver injury in all study subjects should include sequential measurements of a core of clinical and laboratory parameters in cases of serious liver injury, at specified phases of hepatic injury and recovery (see Table 2). When needed, consultation with a clinical subject matter expert is recommended to guide the performance of necessary additional diagnostic studies. Third, biochemical testing of serum samples for liver indicators that determine whether the study drug should be continued, discontinued or undergo dose modification should be performed in local labs at each investigator site, in order to expedite timely risk management decisions surrounding study subjects. Separate testing of duplicate serum samples should later be performed in clinical study central labs to ensure consistency in the follow-up analyses and scientific reviews of the data. 
Fourth, narratives of all clinically significant acute or worsening liver injury cases in study subjects receiving study drug or comparator/placebo, including those with biochemical abnormalities conforming to Hy’s Law, should be assembled by clinicians with an in-depth knowledge of diagnostic hepatology to facilitate an informed evaluation and the effective communication of both clinical case characteristics and causality of liver injury to academic and regulatory scientists. To enhance transparency and contextualization of the liver findings in clinical trials, graphic displays of peak liver test results of all study subjects should be uniformly provided, in conjunction with detailed time-based displays of the changing biochemical parameters for each individual manifesting acute or significant worsening of liver injury during treatment with the study drug or comparator/placebo. Each study-subject level graph should be linked to its corresponding clinician-assembled case narrative. A companion article describes the graphic tools in more detail [2].
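
The population-level display of peak liver test results mentioned above needs one peak value per subject and per test before anything is plotted. The following Python sketch shows that aggregation step only. The record layout and the test codes ‘ALT’ and ‘TBL’ are assumptions for this example and do not represent a CDISC structure.

```python
from collections import defaultdict


def peak_by_subject(records):
    """Collect the peak value of each liver test for each study subject.

    records: iterable of (subject_id, test_code, value_as_multiple_of_ULN)
    Returns {subject_id: {test_code: peak_value}}.
    """
    peaks = defaultdict(dict)
    for subject_id, test_code, value_xuln in records:
        current = peaks[subject_id].get(test_code)
        if current is None or value_xuln > current:
            peaks[subject_id][test_code] = value_xuln
    return {subject: dict(tests) for subject, tests in peaks.items()}


# Hypothetical serial measurements for two subjects.
serial_results = [
    ("S1", "ALT", 1.2),
    ("S1", "ALT", 6.0),
    ("S1", "TBL", 0.8),
    ("S2", "ALT", 0.9),
]
print(peak_by_subject(serial_results))
```

Each subject’s peaks can then be plotted as one point in the population display, while the full time series is kept for the individual timelines of subjects with acute or worsening liver injury.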

Finally, in drug development programs of agents with an emerging liver injury signal (pre-clinical findings of liver toxicity or drug-related increases of serum ATs in clinical study subjects) or in a class containing hepatotoxic drugs, there is a unique opportunity to preemptively bank DNA of study subjects receiving the study drug as well as controls, using methods that conform to legal and ethical standards. The FDA issued a new guidance in January 2013 on the premarket evaluation of clinical pharmacogenomics during early-phase clinical studies [9]. This guidance provides recommendations on when and how genomic investigations should be considered to address questions arising during drug development and regulatory review. In keeping with the general principles that are outlined in the guidance, should clinically serious hepatotoxicity turn out later to be causally linked to the study drug, banked DNA samples can be retrieved and analyzed to identify predictive biomarkers of increased susceptibility to DILI. It is recommended that informed consent for the collection and storage of genomic DNA (e.g. blood or buccal cell DNA) be obtained from all study subjects (study drug and control groups). Expanding on this concept, serum samples systematically obtained before and after the initiation of treatment could be similarly banked to enable the later study of proteomic, metabonomic and other soluble markers or predictors of DILI. Urine samples could likewise be stored for the study of metabonomic markers.

The workshop determined that, in keeping with the FDA guidance, a comprehensive battery of tests for alternative etiologies of liver injury should be performed (see Table 3). In addition, serum samples during the acute and resolution phases of liver injury should be stored to enable the post-hoc performance of diagnostic tests other than those specified in the study protocol. In one case, retrospective serological testing established an alternative etiology of acute liver injury: a clinical study subject with a possible diagnosis of DILI was found to have acute Type E viral hepatitis. Consistent with the FDA guidance, the workshop concluded that liver biopsies are only recommended if clinically indicated. In possible cases of acute DILI, they are generally not required to establish an alternative diagnosis. However, in special cases, liver histopathology may provide important diagnostic information, and the availability of liver tissue specimens or a pathology report may be useful. For this reason, when liver biopsies are performed, they should be stored and retrievable to enable post-hoc case evaluation.

3.6 Monitoring and Management of Study Subjects with Pre-existing Liver Disease

The workshop established that enrollment of study subjects with stable chronic elevations of ATs due to pre-existing liver disease is acceptable (see Table 4), but only in the absence of increased serum bilirubin or evidence of end-stage cirrhosis.

Table 4 Analysis & Management of Study Subjects in Special Populations with Pre-existing Liver Abnormalities in Clinical Trials

In brief, it was felt that subjects being studied for the treatment of oncological diseases with tumor involvement of the liver marked by stable elevations of serum ATs and ALP should be included in studies, if similar patients would be likely to receive treatment with the same drug in a post-market setting. Moreover, in contrast to a general recommendation to exclude patients with both hyperbilirubinemia and serum AT elevations from clinical trials of agents being developed to treat non-malignant conditions, the presence of tumor-associated hyperbilirubinemia with or without AT elevations would not necessarily disqualify study candidates from inclusion in clinical studies of anti-tumor agents. The workshop also determined that study subjects with pre-existing liver disease should only receive study drug if baseline liver test results obtained at two or more time points approximately one month apart show no major changes. As a frame of reference for the later monitoring of liver tests during treatment with the study drug, the baseline level of each biochemical indicator would be computed as the mean of these time-separated pre-treatment measures. In the event that any of these indicators increases to levels >2X above the nadir values during treatment with study drug, suggesting worsening liver injury consistent with possible DILI, confirmation and increased observation should be initiated, as outlined in the FDA guidance. Treatment stop rule options in the face of rising AT levels during treatment with a study drug in special populations with pre-existing liver diseases were discussed and are summarized in a companion article [4]. Meeting attendees did not reach a consensus on endorsing a specific set of instructions for discontinuation of treatment in these patients.

The workshop concluded that a series of analyses of cases with new onset or worsening liver injury in clinical trial databases of the study subjects should be performed (see Table 5).

Table 5 Stratification of Cases of Liver Injury in Study Subjects treated with Study Drug or Comparator in Clinical Trial Database

Consistent with the FDA guidance, these assessments should stratify cases based on levels of liver injury severity and then compare equivalent strata of the trial enrollees treated with study drug versus placebo or comparator agents. The workshop also determined that strata of interest would be defined by the fold increases of liver test results above the upper limits of normal. In studies of patients with pre-study liver test abnormalities, these strata would be defined by the fold increases over each individual’s baseline test results, as described in Table 4. Significant associations of these stratification groups with demographic characteristics, study drug dosing, other treatments, underlying conditions or other lab measures should be identified. If candidate DNA markers of DILI susceptibility have been identified by genome-wide association studies (GWAS) or targeted gene analysis (TGA), each stratum should be separately analyzed for the strength of this association.
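
The stratification logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workshop's method: the stratum boundaries (3, 5, 10 and 20-fold), the ULN of 40 U/L, the record layout and all ALT values are hypothetical, and Table 5 defines the actual strata. Subjects carrying pre-treatment measurements (the assumed `pre` field) are referenced to the mean of those measurements, and all other subjects are referenced to the ULN.

```python
from collections import Counter
from statistics import mean

# Hypothetical stratum boundaries (fold increase); Table 5 defines the actual strata.
STRATA_BOUNDS = [3, 5, 10, 20]


def baseline(pre_treatment_values):
    """Baseline = mean of the time-separated pre-treatment measures."""
    return mean(pre_treatment_values)


def fold_increase(value, reference):
    """Fold increase of a test result over its reference (ULN or subject baseline)."""
    return value / reference


def stratum(fold):
    """Label the highest boundary met, e.g. '>=5x', or '<3x' if none is met."""
    met = [b for b in STRATA_BOUNDS if fold >= b]
    return f">={max(met)}x" if met else f"<{STRATA_BOUNDS[0]}x"


def stratify(subjects, uln):
    """Count subjects per (arm, stratum) using each subject's peak on-treatment value.

    Subjects with pre-treatment abnormalities (a 'pre' list) are referenced to
    their own baseline; all others are referenced to the ULN.
    """
    counts = Counter()
    for s in subjects:
        reference = baseline(s["pre"]) if s.get("pre") else uln
        fold = fold_increase(max(s["on_treatment"]), reference)
        counts[(s["arm"], stratum(fold))] += 1
    return counts


# Illustrative ALT values in U/L (invented for the example); ULN assumed to be 40 U/L.
subjects = [
    {"arm": "study drug", "on_treatment": [35, 130, 220]},                # 5.5x ULN
    {"arm": "study drug", "pre": [90, 110], "on_treatment": [120, 310]},  # 3.1x baseline
    {"arm": "placebo", "on_treatment": [30, 44, 38]},                     # 1.1x ULN
]
counts = stratify(subjects, uln=40)
for key in sorted(counts):
    print(key, counts[key])
```

Counting by (arm, stratum) pairs keeps equivalent strata of the study drug and comparator groups side by side, which is the comparison the text calls for. Associations with demographics, dosing or candidate DNA markers would be analyzed within each stratum by separate statistical methods not shown here.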

3.7 Critical Diagnostic Tests and Time Course Data

As mentioned above, recent workshops have been convened to discuss standards and acquisition of valuable data elements pertinent to assessment of liver injury cases associated with exposure to a marketed drug in non-study outpatients who are referred for evaluation. Data elements deemed important for publication of cases previously identified by the NIH DILIN [10] and highlighted in the LiverTox website [11, 12] are included in Table 6.

Table 6 Important Data Elements for Summaries of Possible DILI Cases in Clinical Trials

These were also considered to be important by the workshop attendees for DILI assessment in clinical trials. The workshop also discussed the utility of testing for acetaminophen (APAP)-derived adducts in samples obtained from study subjects in whom APAP overdose is suspected (e.g. when very high AT levels are detected). Serum gamma-glutamyl transferase (GGT) often rises in isolation, in concert with elevations of ALP, as a result of high ethanol exposure, or after treatment with some cellular enzyme-inducing drugs. The workshop did not draw a firm conclusion about the value of this indicator to characterize DILI cases in clinical trials. With emerging pharmacogenomic markers of increased susceptibility to drug-specific DILI, relevant genomic test results, if available, should also be included as valuable data elements in the summaries of possible DILI cases in clinical trials. A key concept for best practices in the acquisition of critical data elements during clinical trials is that a prospective study setting offers a unique opportunity to serially gather both liver-related clinical and biochemical data over a specified sequence of time points, beginning at the study drug pre-treatment phase, continuing through the early, peak and late phases of liver injury, and ending at the time of resolution or stabilization of organ damage. The best practices workshop concluded that for each study subject with acute or worsening liver injury, irrespective of presumed etiology, all serial liver tests and other relevant data elements that are clinical indicators and/or biochemical measures of hepatic injury should be graphically displayed on a timeline. In addition, the expert clinical narrative and all time-linked diagnostic studies that shed light on injury phenotype and causality should be appended to the graphic displays of each study subject.
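
One way to picture the per-subject timeline is as a small data structure that holds serial lab results and time-linked events together. The sketch below is an illustration only. The class names, the field names, the subject identifier, the lab values and the appended event are all hypothetical, and expressing each result as a multiple of the ULN is an assumed display convention rather than a workshop requirement. A plotting library would consume the same series to draw the graphic display.

```python
from dataclasses import dataclass, field


@dataclass
class LabResult:
    study_day: int      # day relative to first dose; negative = pre-treatment
    test: str           # e.g. "ALT", "ALP", "TBL"
    value: float
    uln: float          # upper limit of normal for the reporting lab

    @property
    def x_uln(self) -> float:
        """Result expressed as a multiple of the ULN."""
        return self.value / self.uln


@dataclass
class Event:
    study_day: int
    description: str    # diagnostic study or narrative note


@dataclass
class SubjectTimeline:
    subject_id: str
    labs: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def series(self, test):
        """Time-ordered (study_day, multiple of ULN) pairs for one test."""
        rows = sorted((r for r in self.labs if r.test == test),
                      key=lambda r: r.study_day)
        return [(r.study_day, round(r.x_uln, 2)) for r in rows]

    def peak(self, test):
        """(study_day, multiple of ULN) at the peak; None if the test was not measured."""
        points = self.series(test)
        return max(points, key=lambda p: p[1]) if points else None

    def render(self):
        """Merge labs and appended events into one chronological text timeline."""
        rows = [(r.study_day, f"{r.test} {r.x_uln:.1f}x ULN") for r in self.labs]
        rows += [(e.study_day, f"[event] {e.description}") for e in self.events]
        return [f"day {d:>4}: {text}" for d, text in sorted(rows)]


# Invented example data; results may arrive out of order and are sorted on output.
t = SubjectTimeline("SUBJ-001")
t.labs += [
    LabResult(28, "ALT", 480, 40),
    LabResult(-7, "ALT", 32, 40),
    LabResult(14, "ALT", 120, 40),
    LabResult(56, "ALT", 60, 40),
    LabResult(28, "TBL", 1.8, 1.2),
]
t.events.append(Event(30, "hepatitis E serology sent"))
print("\n".join(t.render()))
```

Keeping diagnostic studies and narrative notes in the same chronological structure as the lab values means that each appended item stays tied to the phase of injury at which it was obtained.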

3.8 Data Management for Scientific and Regulatory Review

Once data in clinical trials pertinent to DILI analysis have been collected, there is a requirement to ensure that they are complete, accurately tabulated and conform to acceptable data standards. To this end, a number of measures should be undertaken. These include adopting standard representations of the data elements that are consistent with a requirement in the Food and Drug Administration Safety and Innovation Act (FDASIA). This US law, passed by Congress in 2012, stipulates that FDA must establish standardized clinical data terminology for electronic submissions and standardization of drug application data. After FDA issues draft guidance for stakeholder comment, this objective is to be fully implemented by 2017, using standardized clinical terminology developed by open standards development organizations. One example of a non-profit standards developing organization is the Clinical Data Interchange Standards Consortium (CDISC) [13]. CDISC supports the development of data standards for medical projects of any type, clinical study protocols, and the specification and reporting of test results. IT tools from sources such as CDISC should be used in the preparation of clinical trial findings pertinent to DILI analysis to ensure complete and uniformly collected data results. In the case of CDISC, the Study Data Tabulation Model (SDTM) is used to repair erroneous or missing data and ensure standardization of terms prior to submission of a drug or biologics application to regulatory authorities for review.

The use of uniform data standards in case report forms (CRFs) for DILI cases in each drug development program and across programs is critical for the performance of reliable and consistent clinical study reviews and will enable user-friendly interfaces with other clinical lab datasets in order to explore the effects of different variables on liver injury and/or organ function. It will also encourage further improvements in statistical and graphical tools that are available for DILI analysis and readily usable by various stakeholders. Adoption of data standards during the early planning stages of a clinical study will ensure uniformity of the data that will be later acquired and reduce errors that can stem from post-hoc efforts to harmonize and convert data. To standardize collection of data in CRFs, CDISC is developing protocols for Clinical Data Acquisition Standards Harmonization (CDASH). The use of such protocols developed by CDISC or other appropriate sources during the conduct of clinical studies increases the likelihood of capture of complete and uniform data to enable robust analyses of DILI events, as well as the pooling of findings across trials.
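
To make the idea of uniform capture concrete, the following Python sketch normalizes site-reported liver test records to one code and one unit per test and rejects incomplete or unmapped entries at the point of collection. It is a simplified illustration, not a CDISC implementation: the field names, test codes and synonym table are hypothetical, and a real study would take them from the controlled terminology of its chosen standard. The only clinical constant used is the conventional bilirubin conversion of approximately 17.1 µmol/L per mg/dL.

```python
# Hypothetical mapping from site-reported test names to one standard code each.
TEST_CODES = {
    "alt": "ALT", "sgpt": "ALT", "alanine aminotransferase": "ALT",
    "ast": "AST", "sgot": "AST", "aspartate aminotransferase": "AST",
    "alp": "ALP", "alkaline phosphatase": "ALP",
    "tbl": "BILI", "bilirubin": "BILI", "total bilirubin": "BILI",
}

# (standard code, reported unit) -> (standard unit, multiplication factor).
UNIT_FACTORS = {
    ("ALT", "u/l"): ("U/L", 1.0),
    ("AST", "u/l"): ("U/L", 1.0),
    ("ALP", "u/l"): ("U/L", 1.0),
    ("BILI", "umol/l"): ("umol/L", 1.0),
    ("BILI", "mg/dl"): ("umol/L", 17.1),  # 1 mg/dL bilirubin is about 17.1 umol/L
}

# Fields assumed to be mandatory on every record (illustrative choice).
REQUIRED = ("subject_id", "test", "value", "unit", "uln", "study_day")


def standardize(record):
    """Return a normalized copy of one lab record, or raise ValueError.

    Rejecting a record at entry is the point of the sketch: problems are
    surfaced when data are captured instead of during post-hoc harmonization.
    """
    missing = [k for k in REQUIRED if record.get(k) in (None, "")]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    code = TEST_CODES.get(record["test"].strip().lower())
    if code is None:
        raise ValueError(f"unmapped test name: {record['test']!r}")

    key = (code, record["unit"].strip().lower())
    if key not in UNIT_FACTORS:
        raise ValueError(f"unsupported unit {record['unit']!r} for {code}")
    unit, factor = UNIT_FACTORS[key]

    return {
        "subject_id": record["subject_id"],
        "test_code": code,
        "value": round(float(record["value"]) * factor, 2),
        "unit": unit,
        "uln": round(float(record["uln"]) * factor, 2),
        "study_day": int(record["study_day"]),
    }


# Invented record as a site might report it.
raw = {"subject_id": "SUBJ-001", "test": "Total Bilirubin", "value": "2.0",
       "unit": "mg/dL", "uln": 1.2, "study_day": 28}
print(standardize(raw))
```

Converting the ULN together with the result keeps fold-over-ULN calculations unchanged after unit conversion, so records pooled across sites or trials remain comparable.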

Finally, harmonization and application of these analytic tools will facilitate the accurate evaluation of study drug-related DILI risk and enable exploratory investigations of different factors that modify this risk. Standards-based approaches will also provide a platform for the efficient completion of comprehensive and reliable reviews by regulatory scientists and other stakeholders of large and complex clinical trial datasets, while facilitating the downloading of clinical and laboratory data into valuable graphic and analytic programs (e.g. IT review instruments used by FDA regulatory scientists, such as eDISH [14], Antiviral Information Management System (AIMS) [15], SAS tools including JMP Clinical [16] and MAED Service [17], R statistical and graphical tools [18], and JReview [19]).

3.9 Highlights and Recommendations

1. Different DILI phenotypes should be referenced using appropriate corresponding descriptors, as listed in Table 1.

2. In clinical studies that enroll subjects with pre-existing liver chemistry abnormalities, two or more serum samples should be obtained over a one month period in the pre-treatment phase to determine whether the abnormalities are constant or changing.

3. When a candidate drug may later be associated with DILI, the systematic collection of biospecimens for DNA analysis should be undertaken from all study subjects, including those treated with the study drug who do/do not develop DILI as well as those treated with the comparator or placebo.

4. Emergence of cases of mild, moderate or severe liver injury associated with a study drug should prompt the preemptive collection of DNA from all study subjects, both in the study drug and comparator treatment arms. In addition, serum samples from all study subjects obtained both before and during treatment should be stored to enable the future identification and study of candidate proteomic, metabonomic and other soluble biomarkers or predictors of DILI.

5. The retrospective examination of liver safety databases for drugs that are found to be causally associated with liver failure in large treatment populations would provide an opportunity to study DILI susceptibility factors and further investigate the link with any Hy’s Law cases observed in clinical trials.

6. Biochemical testing of serum for liver indicators should be performed at each investigator site to expedite timely risk management decisions. Duplicate samples should also be tested in the sponsor’s central lab(s) to ensure consistency of lab measurements for analysis and scientific review.

7. Consultation with clinical experts is recommended to guide diagnostic testing and analysis of cases with new onset or worsening liver injury during treatment with a study drug or comparator/placebo. Narratives of all cases including those with abnormalities that are consistent with Hy’s Law should be assembled by clinicians with appropriate expertise.

8. Liver injury strata of interest in each treatment arm include those that are defined by the fold increases of liver chemistry results above the upper limits of normal, as described in Table 5. In studies of subjects with pre-existing liver test abnormalities, these strata might be defined by the fold increases of each individual’s baseline test results.

9. The workshop did not draw a firm conclusion about the value of serum GGT as a routine indicator to characterize DILI cases in clinical trials.

10. With emerging pharmacogenomic markers of increased susceptibility to drug-specific DILI, relevant genomic test results, if available, should be included as valuable data elements in the summaries of possible DILI cases in clinical trials.

11. Graphic displays of study treatment population-level data (study drug and comparator/placebo groups) should be linked to individual graphic timelines that depict serial biochemical measures in each study subject with acute or worsening liver injury, irrespective of the presumed etiology. Expert clinical narratives and time-linked diagnostic studies that provide valuable information for determining injury phenotype and causality should be appended to these graphic timelines.

12. IT tools from sources such as CDISC should be used in the preparation of clinical trial findings pertinent to DILI analysis, in order to ensure complete and uniformly collected data results.

4 Conclusion

Many of the recommendations made in the session on required data elements and best practices for data collection and standardization are in alignment with those made in other parallel sessions on methodology to assess clinical liver safety data, causality assessment for suspected DILI, and liver safety assessment in special populations (hepatitis B, C, and oncology trials). Nonetheless, a few outstanding issues remain for future consideration. For example, reconciling different options for study drug stopping rules in patients with background liver diseases will require further discussion.