Key Points

Using chemical analysis to verify labeled contents in herbal and dietary supplement (HDS) products associated with liver injury helped increase reviewer confidence in assigning causality.

Future use of chemical analysis in causality assessment for HDS-related liver injury may increase confidence in the attribution of causality in challenging cases.

1 Introduction

Drug-induced liver injury (DILI) can lead to significant morbidity and mortality and accounts for a large percentage of acute liver failure cases in the USA and Europe [1]. Over-the-counter medications such as acetaminophen and prescribed medications, particularly antibiotics, are common causes of DILI [2]; however, herbal and dietary supplements (HDS) have become an increasingly common cause of liver injury as well [3]. The US Drug-Induced Liver Injury Network (DILIN) has prospectively enrolled cases of liver injury due to medications and HDS since 2004, with HDS being recognized as the second most common cause of DILI [4,5,6,7]. Furthermore, several reports demonstrate that the outcomes of HDS-associated liver injury can be severe, leading in some cases to liver transplantation or death [6, 7].

The diagnosis of DILI due to any agent is complicated by the lack of a specific diagnostic test that can provide incontrovertible proof of attribution, and it requires the elimination of other potential etiologies before attribution is assigned. Therefore, confidence that an agent is the cause of hepatotoxicity relies upon suspicion that the agent is implicated in injury, typically based on the chronology of exposure and a careful exclusion of other causes of liver injury. Various structured causality assessment tools exist that allow an evaluator to arrive at a quantitative likelihood of attribution [8,9,10,11,12,13,14]. The most commonly used structured causality assessment process, the Roussel Uclaf Causality Assessment Method (RUCAM), has been applied to both drugs and supplements [16]. A recent refinement of the RUCAM, the revised electronic causality assessment method (RECAM), offers a significant improvement in diagnostic precision and reliability as it is agent specific [16]. However, causality assessment and the attribution of injury to HDS remain challenging, as the RUCAM was not built with HDS in mind as causative agents and the RECAM has not been validated for HDS-associated liver injury.

Refinement of a causality assessment process for HDS is required given their extensive use and potential to cause severe liver injury. Our group previously found that HDS labels are often inaccurate, as confirmed by chemical analysis [17, 18]. However, few publications explore the value of chemical analysis in the causal attribution of liver injury to a product. Therefore, we aimed to determine the value of chemical analysis in the causality assessment process. We used the well-established DILIN process of diagnostic attribution, which is based on structured causality assessment complemented with expert opinion, as a reference standard and compared it with the same process incorporating chemical analysis on previously adjudicated HDS cases.

2 Methods

The DILIN was established by the National Institutes of Health (NIH) in 2003 to study DILI. The DILIN’s Prospective Study enrolls patients with suspected DILI within 6 months of onset. Patients eligible for enrollment meet predefined laboratory criteria of serum aspartate aminotransferase (AST) or alanine aminotransferase (ALT) > 5 times or serum alkaline phosphatase (ALP) concentration > 2 times the upper limit of normal (or baseline before exposure) on two consecutive occasions at least 24 h apart. Those with an unexplained total serum bilirubin greater than 2.5 mg/dL or an international normalized ratio (INR) above 1.5 after exposure are eligible as well. A more in-depth review of eligibility, evaluation, and enrollment in the DILIN Prospective Study has been described in a previous publication [5].
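The laboratory criteria above can be read as a simple screening rule. The following is a minimal sketch of that reading only, not the DILIN enrollment procedure: the `LabDraw` record, its field names, and the interpretation of "consecutive" as adjacent draws in time are assumptions made for illustration, and the clinical judgments ("unexplained", "after exposure", onset within 6 months) are not modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LabDraw:
    """One set of liver tests (hypothetical record layout)."""
    drawn_at: datetime
    ast_x_uln: float        # AST as a multiple of ULN (or of pre-exposure baseline)
    alt_x_uln: float        # ALT as a multiple of ULN (or of pre-exposure baseline)
    alp_x_uln: float        # ALP as a multiple of ULN (or of pre-exposure baseline)
    bilirubin_mg_dl: float  # total serum bilirubin
    inr: float


def enzymes_elevated(draw: LabDraw) -> bool:
    # AST or ALT > 5 x ULN, or ALP > 2 x ULN
    return draw.ast_x_uln > 5 or draw.alt_x_uln > 5 or draw.alp_x_uln > 2


def meets_lab_criteria(draws: list[LabDraw]) -> bool:
    ordered = sorted(draws, key=lambda d: d.drawn_at)
    # Enzyme criterion: two adjacent draws, both elevated, at least 24 h apart.
    for first, second in zip(ordered, ordered[1:]):
        far_enough = second.drawn_at - first.drawn_at >= timedelta(hours=24)
        if far_enough and enzymes_elevated(first) and enzymes_elevated(second):
            return True
    # Alternative criterion: bilirubin > 2.5 mg/dL or INR > 1.5 on any draw.
    return any(d.bilirubin_mg_dl > 2.5 or d.inr > 1.5 for d in ordered)
```

A real screening step would also need the reviewer's judgment that the abnormality is unexplained and followed exposure; this sketch covers only the numeric thresholds.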

The determination of attribution in the DILIN is based on a structured causality assessment approach complemented by consensus expert opinion. Three causality committee members, with expertise in DILI and hepatology (the clinical site principal investigator and two others), receive key clinical, laboratory, and diagnostic data abstracted from the DILIN baseline visit and clinical narrative [5]. Cases are graded as definite (> 95% likelihood), highly likely (75–95%), probable (50–74%), possible (25–49%), or unlikely (< 25%), reflecting the likelihood of liver injury being attributed to DILI (Table 1). In cases where more than one agent is implicated, each medication or product is scored separately for the likelihood that it was responsible for the injury. When the DILIN causality scores of the three assigned reviewers do not agree, the case is discussed by the entire committee on a teleconference wherein a final consensus score is assigned [5].
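For readers who prefer to see the scoring bands as a rule, the mapping from a reviewer's estimated likelihood to a DILIN category can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, a value of exactly 95% is placed in "highly likely" following the ranges as listed in this paragraph, and fractional values that fall between the listed integer ranges (for example 74.5%) are assigned to the lower category.

```python
def dilin_category(likelihood_pct: float) -> str:
    """Map an estimated likelihood (0-100%) to a DILIN causality category."""
    if not 0 <= likelihood_pct <= 100:
        raise ValueError("likelihood must be between 0 and 100")
    if likelihood_pct > 95:
        return "definite"
    if likelihood_pct >= 75:   # boundary handling at 95% is an assumption
        return "highly likely"
    if likelihood_pct >= 50:
        return "probable"
    if likelihood_pct >= 25:
        return "possible"
    return "unlikely"
```

In practice reviewers assign the category directly; the percentages describe what each category is meant to convey rather than a computed input.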

Table 1 Drug-induced liver injury network scoring categories

From its inception through December 2020, the DILIN enrolled 2203 patients, with HDS implicated in 23% of cases [6]. Since 2015, whenever available, implicated products have been collected from patients and submitted for chemical analysis to the National Center for Natural Products Research (NCNPR) at the University of Mississippi, which uses high-performance liquid chromatography with mass spectrometry to chemically analyze and catalog their contents. Specifically, the NCNPR assays for the presence of known hepatotoxins, including Camellia sinensis (also known as green tea) leaf (catechin, epicatechin, epigallocatechin, gallocatechin, epicatechin gallate, epigallocatechin-3-gallate or EGCG, caffeine, theobromine, theophylline), Garcinia cambogia fruit (hydroxycitric acid or HCA, garcinialactone, citric acid), Polygonum multiflorum root (resveratrol, emodin, aloe-emodin, physcion, polydatin, tetrahydroxystilbene or TSG), Scutellaria baicalensis (also known as skullcap) aerial parts (baicalein, wogonin, wogonoside, baicalin), Mitragyna speciosa (also known as kratom) leaf (mitragynine, speciogynine, paynantheine, corynoxine, 7-hydroxymitragynine, mitrafoline, corynantheidine, speciofoline, mitragynalinic acid), Withania somnifera (also known as ashwagandha) root (withaferin A, withanolide D, withanoside IV, withanone, withanolide A, withanolide B, withanoside V), Curcuma longa (also known as turmeric) rhizome (curcumin, demethoxycurcumin, bisdemethoxycurcumin), anthraquinones (sennoside A-B, emodin, aloe-emodin, rhein, chrysophanol, cascarin, catenarin, danthron, cascarosides A-F, 1,8-dihydroxyanthraquinone, 2-aminoanthraquinone, 1-amino-2,4-dibromoanthraquinone, emodin-8-O-β-D-glucopyranoside, chrysophanol-8-O-β-D-glucopyranoside, and aloe-emodin-8-O-β-D-glucopyranoside), and non-botanical compounds (anabolic steroids, pharmaceuticals).
For anabolic steroids and pharmaceuticals, the supplements were screened against the Agilent MassHunter Forensics and Toxicology Personal Compound Database (PCD) kit (9203 compounds) using accurate mass measurements. Searching the PCD library helps to identify compounds by matching their product ion mass spectra and fragment ions. If an anabolic steroid or pharmaceutical was detected during this screening process, its identity was further confirmed using reference standards.

The HDS are categorized on the basis of their marketed purpose for use, such as weight loss, performance enhancement, bodybuilding, and general wellness. Products collected from patients enrolled in the DILIN are assayed to assess whether the contents found reflect the labeled ingredients, when a physical label or an online list of purported ingredients exists.

For this study, cases were included if HDS was suspected as the cause of liver injury, the case had originally been scored as definite (> 95%), highly likely (75–94%), probable (50–74%), possible (25–49%), or unlikely (< 25%) DILI, and product had been collected from the patient for chemical analysis. Reviewers were blinded to the original causality scores, which had been assigned when the chemical analysis data were not available. There were 54 cases enrolled between 2004 and December 2020 that met these criteria (Fig. 1).

Fig. 1 Flowchart of DILIN cases included in analysis

The initial step in our approach was to repeat the standard DILIN structured causality assessment approach with expert opinion from a panel of DILIN investigators (Table 1). The purpose of repeating the original causality assessment process was to create contemporaneous baseline scores for cases reviewed by the same reviewers involved in the second step of our study. Three reviewers were assigned to each case. An overall score that reflected the likelihood that the case represented hepatotoxicity due to an agent or agents was assigned.

In the second step, the 54 cases underwent a modified DILIN structured causality assessment approach, in which chemical analysis data were provided during the causality assessment process. Three reviewers were again randomly assigned to the cases for the second step review. The DILIN Data Coordinating Center (DCC) at Duke University was responsible for the random assignment of cases to reviewers, the blinding of reviewers to the original causality assessment scores, and the presentation of chemical analysis data for each case. The chemical analysis data reflected (1) whether the analysis confirmed the labeled contents and (2) the presence of any potential hepatotoxin, as listed above, or pharmaceutical. As per the DILIN’s protocol, cases were circulated to reviewers for independent assessment and score assignment. Scores were then reconciled by email, or when necessary, by conference call and majority vote.
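The reconciliation step described above can be sketched in two parts: checking whether the three independent scores already agree, and, when they do not, tallying a vote. This is a simplified illustration, not the DCC's actual workflow; the function names are hypothetical, and the rule that a vote requires a strict majority is an assumption.

```python
from collections import Counter
from typing import Optional, Sequence


def initial_agreement(scores: Sequence[str]) -> Optional[str]:
    """Return the shared category if all independent reviewers agree, else None."""
    return scores[0] if scores and len(set(scores)) == 1 else None


def majority_vote(votes: Sequence[str]) -> Optional[str]:
    """Return the category with a strict majority of votes, else None."""
    if not votes:
        return None
    category, count = Counter(votes).most_common(1)[0]
    return category if count > len(votes) / 2 else None


def reconcile(scores: Sequence[str], call_votes: Sequence[str] = ()) -> Optional[str]:
    # Use the independent scores when unanimous; otherwise fall back to the vote.
    return initial_agreement(scores) or majority_vote(call_votes)
```

For example, `reconcile(["probable", "probable", "highly likely"])` returns `None` until votes from the reconciliation discussion are supplied, which mirrors the point that disagreement triggers discussion rather than an automatic score.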

The original overall causality scores (step 1) were compared with the overall causality scores resulting from causality assessment that incorporated chemical analysis (step 2). The data report the number and percentage of cases with the likelihood of DILI moving higher, lower, or remaining the same. To compare the causality scores in the first review with those in the second review, we examined the data in a 5 × 5 contingency table cross-classified by the five causality categories from the first and second reviews. We tested whether there was a statistically significant difference in the marginal distributions of this contingency table, that is, whether the five-category causality distribution in the first review was the same as the five-category causality distribution in the second review. A chi-squared test with four degrees of freedom was used to test the equality of the two marginal distributions. This test was carried out using the SAS procedure CATMOD with the weighted least squares approach, which accounts for the correlation arising because the same subjects were scored twice. All analyses were carried out in SAS version 9.4, and a p-value less than or equal to 0.05 was considered statistically significant.

3 Results

The causality assessment scores from the initial review without chemical analysis data (step 1) are compared with the scores assigned with chemical analysis data (step 2) in Table 2, and the marginal distributions of the causality assessment scores are displayed in Fig. 2. Overall, the addition of chemical analysis data to the causality assessment process shifted the likelihood scores toward higher levels of confidence that the implicated HDS was the cause of hepatotoxicity. Using the chemical analysis data, 37% (n = 20) of the 54 cases were scored with a higher likelihood of causal attribution compared with the baseline assessments (highlighted in Table 2). Specifically, of these 20 cases, 60% (12/20) increased from probable to highly likely, and 25% (5/20) increased from highly likely to definite. Two cases (10%, 2/20) had an increase in causality from possible to highly likely and one case (5%, 1/20) had an increase in causality from unlikely to probable. In one of these cases, anthraquinones were detected on chemical analysis; in the other, no new ingredients were detected but concern for anabolic steroid exposure remained. One case (5%) increased from unlikely to possible after chemical analysis revealed the presence of anabolic steroids. Among the 20 cases where causality likelihood increased, 14 had higher causality scores on reassessment because the products contained the following agents: anthraquinones (n = 1), anabolic steroids (n = 3), ashwagandha (n = 1), green tea (n = 7), turmeric (n = 1), and Garcinia cambogia (n = 1).

Table 2 Summary data of causality scores with and without chemical analysis

Fig. 2 Percentage change in overall case causality scores, as assessed using the DILIN causality assessment, before and after inclusion of chemical analysis data

No change in causality score was noted in 52% of cases, and 11% of cases were scored as a lower likelihood of DILI after chemical analysis was available. Of the 28 cases (52%) in which causality likelihood remained the same, most (26/28) had a definite or highly likely score at the initial assessment. In the six cases (11%) where causality decreased, an alternate diagnosis was being considered or the liver injury pattern was not consistent with the identified supplement. For example, in one case the supplement contained confirmed green tea extract, but the latency of liver injury was not thought to fit the clinical picture of HDS liver injury. Overall, there was a significant distributional shift to higher likelihood categories of DILI with the use of chemical analysis (p = 0.02): the distribution across the definite, highly likely, probable, possible, and unlikely categories was 9.3%, 40.7%, 33.3%, 11.1%, and 5.6% in step 1 versus 14.8%, 55.6%, 16.7%, 9.3%, and 3.7% in step 2, respectively (Table 2).
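As a quick arithmetic check, the marginal percentages reported above can be converted back to case counts out of 54. The counts produced below are inferred by rounding and are not copied from Table 2; the variable names are illustrative.

```python
CATEGORIES = ["definite", "highly likely", "probable", "possible", "unlikely"]
N_CASES = 54

# Marginal distributions as reported in the text (percent of 54 cases).
step1_pct = [9.3, 40.7, 33.3, 11.1, 5.6]   # without chemical analysis
step2_pct = [14.8, 55.6, 16.7, 9.3, 3.7]   # with chemical analysis


def to_counts(percentages, total):
    """Recover whole-case counts from rounded percentages."""
    return [round(p * total / 100) for p in percentages]


step1_counts = to_counts(step1_pct, N_CASES)
step2_counts = to_counts(step2_pct, N_CASES)

for name, before, after in zip(CATEGORIES, step1_counts, step2_counts):
    print(f"{name:>13}: {before:2d} -> {after:2d} ({after - before:+d})")

# Cases scored definite or highly likely in each review.
high_before = sum(step1_counts[:2])
high_after = sum(step2_counts[:2])
```

Under this rounding, both reviews sum to 54 cases, and the number of cases scored definite or highly likely rises from 27 in step 1 to 38 in step 2.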

Products promoted for general well-being were the most likely to shift to a higher degree of confidence in attribution as a result of chemical analysis, with 55% of such cases having an increased attribution in step 2. Bodybuilding and weight loss products accounted for 36% of the cases in which the causality score increased as a result of chemical analysis.

Table 3 lists the specific products and known hepatotoxic ingredients associated with increased likelihood scores in 15 of the 20 cases where the original causality score was probable, possible, or unlikely. In the five cases with an increased causality score despite the lack of identified known hepatotoxic ingredients, the experienced reviewers suspected hepatotoxic ingredients, but such ingredients were not identified in the chemical analysis provided. In these cases, concern remained about the potential hepatotoxicity of an HDS the patient may have been taking, but chemical analysis was not available because the concerning HDS was not collected. Among the 28 cases where the causality score remained the same, 12 had known hepatotoxic ingredients identified by chemical analysis. However, 9 of these 12 cases already had a causality score of definite or highly likely in step 1. In the three cases with an original causality score of probable, possible, or unlikely, the score did not change because the case either had a long latency that was not compatible with the identified hepatotoxic ingredient or lacked chemical analysis of other implicated products.

Table 3 Specific products and ingredients associated with increased causality scores after chemical analysis

Off-label known hepatotoxic ingredients (those not shown on the product label) were found in 22% (n = 12) of the 54 cases reviewed. The most common off-label ingredient, found in 7 of the 12 cases, was anabolic steroids. The other off-label ingredients were green tea in four cases and anthraquinones in one. We sought to determine how the chemical analysis influenced causality assessment; specifically, whether the accuracy of the product label, confirmation of a labeled ingredient known to be a hepatotoxin (e.g., catechins of green tea), or identification of any unlabeled ingredient (e.g., an anabolic steroid) might explain the change in confidence. However, the multiplicity of ingredients in a relatively limited number of products precluded a meaningful analysis to determine which of these three factors may have explained our findings. Nonetheless, a qualitative review of the products suggests that the presence of a known hepatotoxic ingredient confirmed by chemical analysis was a common factor leading to an increase in confidence of attribution.

4 Discussion

Our study demonstrates that chemical analysis structured to verify labeled contents and to assay for a limited number of known potential hepatotoxins has value in the causality assessment process for HDS-induced liver injury. The value is shown by an increase in the level of confidence in attribution of a product as the cause of liver injury, as reflected by DILIN scores showing higher degrees of likelihood of causal attribution.

Chemical analysis techniques have long been available and have been applied to nonprescribed products [16]. Common applications have included the detection of adulterants in performance-enhancing products and of pharmaceuticals that might promote the marketed purpose of the HDS [19, 20]. At best, the identification of an unlabeled ingredient provides only circumstantial evidence for injury. The confidence to connect a suspected analyte as the cause of injury is strengthened by other factors such as the timing of exposure, the exclusion of other causes that could explain injury, and, when available, a reported pattern of similar injuries associated with the identified agent. Until identification of possible hepatotoxins in HDS agents becomes more widely available, such as with in vitro or in vivo assays, only re-exposure and recurrence of injury can be accepted as incontrovertible evidence of causality. Of course, deliberate re-exposure is not tenable for ethical reasons, although cases of inadvertent re-exposure to some natural products have been reported in the literature, providing strong support for causality [21].

Our study did not aim to provide a diagnostic test that would change the current causality assessment processes but rather to demonstrate the potential value of chemical analysis as an adjunct to current causality assessment in challenging cases of suspected HDS-induced liver injury and as a potential research tool. Limitations of our study include the lack of analysis of all implicated agents in these cases and the uncertain reproducibility of this adjudication process. Although we have shown that confidence in attribution is increased by chemical analysis, more work is needed to expand the panel of analytes and to assess HDS from other categories of marketed purposes for use, as they were not included in our cases. Additionally, many of the cases were already considered high causality before chemical analysis, limiting our ability to see the full benefit of chemical analysis. Despite these limitations, our findings suggest that including chemical analysis results in the process of causality assessment may increase confidence in the attribution of causality in challenging cases.