
1 Introduction

In France, more than 2.5 million patients are receiving treatment with levothyroxine [1] and most are administered the product Levothyrox®. In March 2017, Merck Serono, the French subsidiary of the German pharmaceutical company, Merck KGaA, launched a new formulation of Levothyrox®. It is anticipated that this new formulation will soon be marketed in 21 European Union (EU) countries [2]. Although the two formulations were shown to be bioequivalent, several thousand patients reported adverse drug reactions (ADRs) following this replacement [3]. In this opinion paper, we report that more than 50% of the healthy volunteers enrolled in the study that demonstrated average bioequivalence (ABE) had an individual exposure ratio outside the a priori bioequivalence range. We therefore question the ability of an ABE trial to guarantee the switchability, within a patient, of the new and old levothyroxine formulations.

The objectives in developing a new formulation of Levothyrox® (hereafter named Levothyrox®NF) were twofold: to improve pharmaceutical stability and to ensure a potency specification over a shelf-life of at least 18 months. The active drug (synthetic l-thyroxine, levothyroxine, or L-T4) was the same as in the original formulation (hereafter named Levothyrox®OF). Only the excipients were changed, with lactose replaced by mannitol and citric acid, both of which the French authorities have described as excipients not known to have a recognised action on, or to affect, the administered dose of Levothyrox®NF [4]. Following this substitution, and over 13 months of marketing of the new formulation (from 27 March, 2017 to 17 April, 2018), as many as 31,411 patients declared ADRs to the French network of pharmacovigilance centres after switching from the old to the new Levothyrox® formulation; this number is approximately 1.43% of patients treated with Levothyrox®NF [3]. Most ADRs occurred shortly after this imposed change. For the 1745 patients whose thyroid status was documented before and after the switch between the two formulations, the official pharmacovigilance review reported that 23% were hypothyroid, 10% were hyperthyroid and 67% were normal for thyroid-stimulating hormone status [1].

The conclusion of the French regulatory agency is that it is not possible, from their data analysis, to suggest a hypothesis to account for these ADRs. The possibility of bio-inequivalence between the old and the new formulations has been excluded, following a large pharmacokinetic study comparing the formulations [5]. This conclusion was based on the 90% confidence interval for the ratio of the area under the plasma concentration–time curve (AUC), a measure of internal exposure, which lay within the pre-defined European regulatory limits of 90.00–111.11%, hereafter reported as 0.9–1.11 limits.

2 Precisely What is Indicated by an Average European Union or US Food and Drug Administration Bioequivalence Trial?

It is important to recognise that a bioequivalence (BE) trial, conducted in healthy volunteers according to both the EU 2010 revised guidance [6] and the corresponding US Food and Drug Administration (FDA) guideline [7], does not guarantee that each individual patient in the target population, who switches from an older reference (R) formulation to a new test (T) formulation, will be “similarly” exposed to levothyroxine, nor is it intended to do so. In the overview of comments received on the draft EU BE guideline [8], one can read the following comment from stakeholders “The draft guideline deals only with average bioequivalence. The Population and Individual bioequivalence approaches are not mentioned anywhere; therefore, it is not clear as to whether these approaches are acceptable”. The European Medicines Agency’s succinct but uninformative answer was, “The average bioequivalence approach is the recommended method to establish bioequivalence”. In commenting on the 2010 revision of the EU guideline, Morais and Lobato [9] drew attention to the conceptual shift in European Medicines Agency guidance between the previous and the 2010 revised EU guideline on BE, with the replacement of a clinically orientated guideline by a quality-control orientated guideline. This explains why the notion of “essential similarity”, which was the basis for comparability of two medicinal products, to support their interchangeability in clinical use, was deleted owing to a lack of a sound legal basis. Conversely, the adoption of a ‘quality-like’ approach implies “less reliance on judgment based on clinical considerations” [9]. The new objective is to ensure that formulation differences can be detected because “pharmacokinetic parameters such as AUC and Cmax are more sensitive to difference in formulation and manufacturing process than to clinical endpoints” [9]. 
This new EU position is legally more supportable than the previous guidance, but it implicitly considers healthy subjects involved in an ABE trial to be equivalent to homogeneous ‘walking’ chromatographic columns, rather than being representative of a future heterogeneous targeted patient population.

3 Formulation Substitution is Subject to National Regulation and is Not Dealt with by European Union Guidelines

Formulation switchability to support a substitution of one product with another is a scientific principle not dealt with by EU guidelines (vide infra). Substitution policy is a national issue, not one regulated by the EU [6]. In contrast, in the USA, the concept of individual bioequivalence (IBE), and its merits compared to ABE, have been extensively investigated [10,11,12]. It should be understood that the aim of ABE studies is solely to compare the population means between T and R products and thus to ensure that the mean (median) AUCs of the two formulations are sufficiently close to guarantee that their ratio is contained within the acceptable pre-defined regulatory limits. Average bioequivalence is typically used in the pre-marketing approval of new generic formulations. However, Levothyrox®NF is not a new generic formulation offered as a possible alternative to Levothyrox®OF for a new patient. It is a new formulation designed to replace Levothyrox®OF, and the number of patients on whom this change was imposed in France between March and June 2017 is estimated to be 2,188,432 [3]. Hence, the key question that should have been addressed before the marketing of Levothyrox®NF is: can a patient already treated with Levothyrox®OF be safely and effectively switched from this no longer available formulation to the new formulation? A study demonstrating ABE does not answer this question, i.e. the demonstration of ABE between Levothyrox®OF and Levothyrox®NF does not ensure their switchability.
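The difference between a statement about means and a statement about individuals can be illustrated with a minimal sketch. It is not the analysis of the regulatory trial: the data are purely synthetic, the function names (`abe_interval`, `fraction_outside`) are hypothetical, period and sequence effects are ignored, and a normal quantile is used in place of the Student t quantile, which is a reasonable approximation only for large samples.

```python
import math
from statistics import NormalDist, mean, stdev

BE_LOW, BE_HIGH = 0.90, 1.11  # acceptance limits quoted in the text

def abe_interval(auc_test, auc_ref, level=0.90):
    """Approximate confidence interval for the T/R geometric mean ratio.

    Assumption: simple paired analysis on the log scale, no period or
    sequence terms, normal quantile instead of Student t.
    """
    d = [math.log(t / r) for t, r in zip(auc_test, auc_ref)]
    se = stdev(d) / math.sqrt(len(d))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    m = mean(d)
    return math.exp(m - z * se), math.exp(m + z * se)

def fraction_outside(auc_test, auc_ref):
    """Fraction of subjects whose own T/R ratio lies outside the limits."""
    ratios = [t / r for t, r in zip(auc_test, auc_ref)]
    return sum(not (BE_LOW <= x <= BE_HIGH) for x in ratios) / len(ratios)

# Synthetic data (invented for illustration): half of the subjects are
# exposed about 22% more with T, the other half about 18% less.
auc_ref = [100.0] * 200
auc_test = [100.0 * math.exp(0.2), 100.0 * math.exp(-0.2)] * 100

ci_low, ci_high = abe_interval(auc_test, auc_ref)
abe_ok = BE_LOW <= ci_low and ci_high <= BE_HIGH
share_outside = fraction_outside(auc_test, auc_ref)
print(abe_ok, share_outside)
```

In this deliberately extreme data set the interval for the mean ratio sits comfortably inside 0.90–1.11, yet no individual ratio does; opposite individual deviations cancel in the mean.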

4 Appropriate Conceptual Framework to Document Switchability Between Two Formulations is Individual Bioequivalence

The concept underlying switchability is that each patient has his/her own individual therapeutic window, that is, a range of plasma concentrations providing appropriate efficacy and safety. If a formulation change is made, the new formulation should ensure a drug exposure profile precisely located in this individual therapeutic window, thereby ensuring unchanged safety and efficacy [12].

For thyroxine, the therapeutic window is narrow; it is classified as a narrow therapeutic index drug [13], dosage for which each patient should be carefully titrated. This is provided for, first, by the availability of multiple dosage product strengths and, second, by the reduction in the classical BE acceptance interval from 0.80–1.25 to 0.90–1.11.

The appropriate conceptual framework to document switchability is IBE; the explicit aim is to document the switchability between two formulations. The concept of IBE was introduced more than 25 years ago [14] to address the limitations of ABE trials in addressing the issue of switchability. An IBE study compares the exposure obtained with each formulation within each individual subject, thereby ensuring that each individual will respond similarly to the two formulations. Investigation of IBE requires comparing the closeness of the distribution of bioavailability between T and R formulations by establishing not only population means (as for ABE) but also two variance terms, namely, the within-subject variance and the variance estimating the subject-by-formulation interaction (for further detail and critical comments see [11, 12, 15]). This interaction term documents the extent to which the individual differences between T and R formulations are similar across individual subjects. The FDA reported that an interaction is important when about 10% or more of individuals’ R/T ratios are outside the pre-defined a priori BE range [12]. Individual bioequivalence has been both extensively discussed and challenged and then, finally, not adopted by regulatory authorities. It is beyond the scope of this paper to discuss in detail the advantages and limitations of IBE. However, simply concluding that the IBE concept is not clinically relevant because some authors or organisations consider that there is no evidence for failure of ABE for approved generics, such as [16], is not acceptable. Compared to ABE, IBE studies require more complicated and expensive designs and are associated with several regulatory issues. These include defining IBE, how to measure it and how to analyse data (for detailed reviews, see a series of 13 articles published in a special issue of Statistics in Medicine in 2000 expressing the advantages and disadvantages [17]).

We concur with the opinion of the FDA Individual Population Bioequivalence Working Group [12] that the subject-by-formulation interaction, the most critical variance term to explore for switchability, is highly relevant. In this commentary, it is proposed that the BE of the two formulations of Levothyrox®, and more especially the subject-by-formulation interaction to assess whether IBE is established for the formulations, merits further consideration.

5 For Levothyrox®, More Than 50% of Subjects Enrolled in a Large European Union Regulatory Average Bioequivalence Trial were Actually Outside the a Priori Bioequivalence Range

Because of public and media concerns, and the desire of the French regulatory authorities to ensure full transparency for this major public health crisis, the BE dossier, including its raw data, has been made public: it can be downloaded from the website of the Agence Nationale de Sécurité du Médicament et des Produits de Santé [18]. The dossier provided data on L-T4, hereafter named T4. The T4 concentration–time profiles of 204 healthy individuals, for both old and new formulations, were retrieved. Blood samples were taken before administration (baseline) and regularly up to 72 h post-administration. For each individual concentration–time profile, AUC was computed by the trapezoidal method. According to the 2010 European Medicines Agency guideline, “If the substance being studied is endogenous, the calculation of pharmacokinetic parameters should be performed using baseline correction so that the calculated pharmacokinetic parameters refer to the additional concentrations provided by the treatment” [6]. In our analysis, both the baseline-adjusted AUC, obtained by subtracting the baseline concentration from each post-administration concentration, and the unadjusted AUC were calculated, the latter to take account of overall T4 exposure when evaluating IBE. From the patient perspective, it is rational to recognise that it is this overall exposure that is clinically relevant.
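The two AUC variants described above can be sketched as follows. This is an illustration under stated assumptions, not the script distributed with the paper: the linear trapezoidal rule is used, baseline-corrected concentrations that fall below zero are kept as they are (the text does not say how such values were handled), and the function name `auc_trapezoid` and the sample values are hypothetical.

```python
def auc_trapezoid(times, conc, baseline=None):
    """Linear trapezoidal AUC over the sampled interval.

    If `baseline` is given, it is subtracted from every concentration
    first (baseline-adjusted AUC). Assumption: negative corrected values
    are not truncated.
    """
    if baseline is not None:
        conc = [c - baseline for c in conc]
    return sum(
        (t1 - t0) * (c0 + c1) / 2
        for t0, t1, c0, c1 in zip(times, times[1:], conc, conc[1:])
    )

# Hypothetical profile: sampling times in hours, T4 in arbitrary units.
times = [0, 1, 2, 4]
conc = [60, 80, 70, 60]

auc_unadjusted = auc_trapezoid(times, conc)
auc_adjusted = auc_trapezoid(times, conc, baseline=conc[0])
print(auc_unadjusted, auc_adjusted)  # 275.0 35.0
```

With a constant baseline, the adjusted AUC equals the unadjusted AUC minus the baseline multiplied by the sampling duration (here 275 − 60 × 4 = 35). The same absolute difference between two formulations therefore weighs much more heavily in a ratio of adjusted AUCs than in a ratio of unadjusted AUCs.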

The experimental design was a 2 × 2 cross-over. As the sequence of administration of the formulations to each individual was not reported in the public dossier, possible period or sequence effects were not considered in our analysis. However, for each individual subject, the exposure ratios AUCnew/AUCold (hereafter named IER) were computable for adjusted and non-adjusted T4 concentrations. This is of interest when documenting IBE, because as indicated above, the proportion of subjects outside the a priori BE interval (here 0.90–1.11) is directly related to the variance term measuring the subject-by-formulation interaction. This variance can, under some conditions, be estimated from the standard deviation of the individual mean formulation differences on a logarithmic scale (see [12] for explanation and [19] for demonstration). For example, assuming that the ratio of overall T/R means is 1 (as is the case for Levothyrox®) and assuming a bivariate normal distribution for the between-subject distribution, the proportion of individual T/R ratios outside the a priori BE interval of 0.80–1.25 is 13.7% for a standard deviation of 0.15 for the subject-by-formulation interaction, 0.15 being the cut-off value selected by the FDA [12]. The data and the R-script used to perform the computation and details of data analysis including management of missing data are available as Electronic Supplementary Material on the journal website.
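The 13.7% figure quoted above can be checked with a short calculation. The sketch assumes, as in the text, that the logarithm of the individual T/R ratio is normally distributed with mean 0 (overall T/R ratio of 1) and a standard deviation equal to the subject-by-formulation term; the function name `proportion_outside` is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_outside(sd_log_ratio, low, high, mean_log_ratio=0.0):
    """Expected share of individual T/R ratios outside [low, high].

    Assumption: log(T/R) is normal with the given mean and standard
    deviation on the logarithmic scale.
    """
    dist = NormalDist(mu=mean_log_ratio, sigma=sd_log_ratio)
    inside = dist.cdf(math.log(high)) - dist.cdf(math.log(low))
    return 1.0 - inside

# Classical 0.80-1.25 interval with the FDA cut-off of 0.15.
p_classical = proportion_outside(0.15, 0.80, 1.25)
print(f"{100 * p_classical:.1f}%")  # 13.7%

# Same standard deviation against the narrower 0.90-1.11 interval.
p_narrow = proportion_outside(0.15, 0.90, 1.11)
```

Because the narrower interval lies inside the classical one, the same standard deviation necessarily leaves a larger share of individual ratios outside it.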

Individuals were then classified into five groups, respectively corresponding to an IER in one of the following intervals: 0–0.8, 0.8–0.9, 0.9–1.11, 1.11–1.25, 1.25– (Table 1). Figure 1 illustrates the distributions of IER computed for T4 with and without adjustment for the baseline.
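The classification can be sketched in a few lines. The treatment of values falling exactly on a limit is an assumption (here a boundary value is assigned to the upper class), as the text does not specify it; the names `EDGES`, `LABELS` and `ier_class` and the example ratios are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

EDGES = [0.8, 0.9, 1.11, 1.25]
LABELS = ["0-0.8", "0.8-0.9", "0.9-1.11", "1.11-1.25", "1.25-"]

def ier_class(ier):
    """Return the label of the interval containing an individual exposure ratio.

    Assumption: intervals are closed on the left, so a ratio equal to a
    limit is counted in the class above that limit.
    """
    return LABELS[bisect_right(EDGES, ier)]

# Invented ratios, for illustration only.
example_iers = [0.75, 0.85, 0.95, 1.00, 1.20, 1.30]
counts = Counter(ier_class(x) for x in example_iers)
print(counts)
```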

Table 1 Number of individuals from 204 investigated subjects in each class of individual exposure ratio (IER)
Fig. 1 Distribution of individual exposure ratio (IER) [area under the curve new/area under the curve old] obtained with baseline-adjusted T4 (left panel), and unadjusted T4 (right panel) plasma concentrations. Blue vertical straight lines are the acceptable pre-defined limits, namely 0.9 and 1.11. An individual with an IER within these limits has an observed variation of exposure of less than 10% when switching from the old to the new formulation. Red dotted vertical straight lines, 0.8 and 1.25, are respectively, the limits below and above which the variation of exposure is greater than 20% when switching from the old to the new formulation

For the baseline-adjusted ratio (Fig. 1, left panel; Table 1), less than 50% of subjects (32.8%) were located in the a priori BE interval of 0.9–1.11, with an expected percentage having a 95% confidence interval of 26.4–39.7. The corresponding percentage for the unadjusted IER (Fig. 1, right panel; Table 1) was 83.3%, with a 95% confidence interval of 77.5–88.2.
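A confidence interval of this kind can be reproduced approximately with an exact binomial (Clopper–Pearson) calculation. Two points are assumptions rather than facts taken from the paper: the method used by the authors is not stated here, and the counts 67 and 170 out of 204 are inferred from the reported percentages (32.8% and 83.3%). The function names are hypothetical.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _root_increasing(g, iters=60):
    """Bisection on (0, 1) for a function g that increases through zero."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, level=0.95):
    """Exact two-sided confidence interval for a binomial proportion."""
    alpha = 1 - level
    lower = 0.0 if k == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    upper = 1.0 if k == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

# Counts inferred from the reported percentages (assumption).
adj_low, adj_high = clopper_pearson(67, 204)     # baseline-adjusted IER
unadj_low, unadj_high = clopper_pearson(170, 204)  # unadjusted IER
print(f"{100 * adj_low:.1f}-{100 * adj_high:.1f}")
print(f"{100 * unadj_low:.1f}-{100 * unadj_high:.1f}")
```

The printed bounds can be compared with the intervals reported above; small discrepancies would be expected if a different interval method was used.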

In the dossier, ABE was established on the adjusted AUC from zero to 72 h and, although statistical re-analysis of the data set was not possible owing to a lack of public information on trial design, it is acknowledged that the trial [5] and its analyses were conducted professionally according to current EU guidelines. However, it is proposed that IBE, focusing on intra-individual variability as well as on a possible subject-by-formulation interaction, merits consideration.

The published experimental design [5] was not planned for statistical analysis of an IBE and this report does not claim with statistical protection that the two formulations are not switchable. Nevertheless, plotting the observed IER highlights a major “warning signal” requiring consideration for two reasons. First, less than 50% of subjects are within the a priori BE interval of 0.90–1.11 when (in compliance with the EU guideline) the baseline-adjusted AUC is considered. Second, there is an apparently more favourable finding when the unadjusted AUC is considered. Whilst such data analysis is not recommended by the EU guidelines, it constitutes an important consideration when discussing the relevance of IBE. Indeed, for the healthy subjects in this trial, having normal thyroid function, the administered T4 likely triggered a negative feedback on endogenous T4 secretion, with a buffering effect on T4 plasma concentration, thus resulting in a smaller IER dispersion than when the adjusted AUC is considered. It can reasonably be hypothesised that such rapid physiological adjustments will be less efficient or even absent in the targeted clinical population, these patients having either reduced thyroid function or none at all. In these patients, it was ADRs that triggered the dosage adjustment required to ensure an individual euthyroid status. Therefore, the appropriateness of using healthy euthyroid subjects to assess BE for Levothyrox® formulations is questionable.

6 As More Than 50% of Individuals were Outside the A Priori Bioequivalence Range, the Existence of a Subject-by-Formulation Interaction is Not Unlikely

The fact that more than 50% of individuals were outside the a priori BE range suggests the existence of a subject-by-formulation interaction, as reported for several drugs (for a recent review see [20]). Indeed, such findings have been reported previously for thyroxine. It was shown that the magnitude of the influence of pH on the pharmacokinetics of levothyroxine is formulation dependent and that two formulations considered bioequivalent in healthy volunteers under fasting conditions may not necessarily be bioequivalent in patients with altered gastric pH [21], whereas the extent of absorption of a liquid formulation of T4 was not altered by proton-pump inhibitors [22]. Likewise, liquid T4 formulations are more efficacious than tablets in patients with malabsorption receiving T4 either for replacement or for suppressive therapy, whereas there were no significant differences in patients without malabsorption [23]. These literature reports indicate that there are clinical situations in which establishing equivalence for thyroxine in healthy volunteers may not translate unequivocally to equivalence in all patients. They illustrate potential concerns for many patients treated with Levothyrox®NF.

7 New, but not the Old, Levothyrox® Formulation Contains Mannitol, an Excipient Considered to be Critical for Drugs such as Levothyroxine Having a Low Permeability

A subject-by-formulation interaction can arise when either a subgroup of subjects or individual subjects have differing pharmacokinetic profiles for either a T or R formulation from the remainder of the population enrolled in a BE trial [19]. Mechanistically, it is attributable to some characteristic of this subpopulation leading to altered drug absorption. Levothyroxine is classified in the Biopharmaceutical Classification System as a Class III substance, i.e. one having high solubility but low permeability [24]. The new, but not the old, Levothyrox® formulation contains mannitol, an osmotic excipient considered to be critical [25], especially for Class III drugs (for a general review of the impact of osmotically active excipients on bioavailability and BE of Biopharmaceutical Classification System Class III drugs, see [26]). Indeed, low permeability compounds are often subject to site-dependent absorption, and their bioavailability can be dependent on gastrointestinal tract transit time, which may be influenced by mannitol. For example, the bioavailability of the H2-receptor antagonist, cimetidine, in a chewable tablet containing 2.264 g of mannitol, was reduced by 29% and this was due to a reduction in small intestine transit time of 20% [27]. The magnitude of effect of mannitol was shown to be dose dependent in the range of 0.755–2.265 g [28]. For the new formulation of Levothyrox®, the amount of mannitol is approximately 70 mg for a 100-mg tablet [29] and a patient may take two tablets. Whether a small amount of mannitol, of some 140 mg, can affect small intestine transit time and thereby be associated with decreased bioavailability of levothyroxine is not known. Moreover, according to Chen et al. [26] the quantitative dose–response relationship for mannitol on cimetidine/ranitidine absorption may not be extrapolated to other substances because, as well as an osmotic effect, an osmotically active excipient may influence either the absorption mechanism or the absorption site. For sorbitol, an isomer of mannitol, it has been reported that very small amounts (7, 50 or 60 mg) can affect drug absorption and this effect appears to be subject dependent [30].

8 Concluding Comments

In conclusion, a statistical analysis conducted in the conceptual framework of IBE would have enabled: (1) documentation of possible higher intra-individual variability for the new compared with the old formulation and, hence, possible reconsideration of the development of this new formulation; indeed, a fine individual subject calibration, due to day-to-day erratic variability in bioavailability, would be very difficult to establish, thus rendering less informative the ability of a snapshot sample to estimate the actual thyroid-stimulating hormone level; (2) consideration of a possible subject-by-formulation interaction, thus allowing both regulatory authorities and prescribing clinicians to be better placed to manage and systematically supervise all patients during transition from the old to the new formulation; and (3) thereby to anticipate a possible new titration for patients on whom the new formulation has been imposed. Average bioequivalence as the regulatory recommended BE approach notwithstanding, a requirement to explore a possible subject-by-formulation interaction to ensure switchability between products is justified, especially when millions of patients are involved. Such was the case for Concerta® (methylphenidate) and associated generic products; a subject-by-formulation analysis for each pharmacokinetic metric was recommended by the FDA in addition to the establishment of ABE [31]. Such analysis is warranted on the grounds of optimal risk management both for the millions of existing patients and for future EU patients undergoing thyroid-deficiency treatment with a drug, for which replacement of an old with a new formulation has been known for many years to be problematic internationally [32].