Key Points

In the United States, an approved biosimilar can be designated as “interchangeable.”

The United States Food and Drug Administration expects clinical data to support a demonstration of interchangeability.

Clinical studies that support interchangeability should be designed primarily to evaluate if clinical performance is altered by multiple switching between a reference product and its biosimilar and whether such switching will result in differences in pharmacokinetics or immunogenicity profiles.

1 Introduction

A biosimilar is a biologic drug that is “highly similar to a reference (originator) product, and for which there are no clinically meaningful differences between the two products in safety, purity, and potency” [1]. The development and approval of a biosimilar is based on extensive analytical, structural, and functional comparisons with the reference product, comparative nonclinical (in vivo) studies, clinical pharmacokinetics (PK) and/or pharmacodynamics (PD), and immunogenicity, and usually comparative clinical efficacy and safety assessments. Taken together, these data comprise the “totality of the evidence” supporting a conclusion of biosimilarity [2]. Provided that there is sufficient scientific justification, a biosimilar can be approved for use in all indications held by the reference product by a process of extrapolation, without the need to conduct comparative clinical studies for each indication [1].

According to the Biologics Price Competition and Innovation Act (BPCIA) in the United States (US), a biosimilar can be designated as “interchangeable”, whereby it may be substituted for the reference product without the intervention of the healthcare provider who prescribed the reference product [3]. This enables pharmacy-mediated substitution, where state laws allow [2]. Indeed, in most states, the biosimilar may be substituted for the reference product if the biosimilar is designated as interchangeable [2, 4, 5].

To meet the additional designation of interchangeability in the US (as defined in the BPCIA), “an applicant must provide sufficient information to demonstrate biosimilarity and also to demonstrate that the biological product can be expected to produce the same clinical result as the reference product in any given patient” [2]. Moreover, “if the biological product is administered more than once to an individual, the risk in terms of safety or diminished efficacy of alternating or switching between the use of the biological product and the reference product is not greater than the risk of using the reference product without such alternation or switch” [2]. A biosimilar product that has satisfied the regulatory requirements for demonstrating biosimilarity can be expected to have the same benefits and risks as its reference product, regardless of whether it has obtained a designation of interchangeability [6, 7].

In contrast to the regulatory situation in the US, the European Medicines Agency (EMA) does not have the legal remit to designate a product as being interchangeable and has not authorized the designation of any product for automatic substitution by a pharmacist without the intervention of the prescriber [8]. Instead, without requiring additional clinical studies, individual national regulatory authorities in the European Union (EU) may endorse the switching from one reference or biosimilar product to another at the pharmacy level without consent of the prescriber (“automatic substitution”) [9].

The aim of this review is to consider the challenges in conducting clinical studies to support a designation of interchangeability, as defined in the guidance from the US Food and Drug Administration (FDA) [2]. In addition, potential alternative approaches to generating sufficient clinical evidence to support a designation of interchangeability are presented.

2 Clinical Interchangeability Studies: Guidance from the FDA

Final guidance from the FDA provides an overview of the key scientific considerations in demonstrating interchangeability of a biosimilar with a reference product [2, 10]. The guidance also provides clarification regarding the clinical data that are expected to support demonstration of interchangeability, considerations for the design and analysis of clinical interchangeability studies, and recommendations regarding the use of a US-licensed reference product [2]. Furthermore, as described in the guidance, evidence to support the demonstration of interchangeability may be collected as part of the marketing application package for initial biosimilar approval [2]. Once interchangeability between a reference product and its biosimilar has been demonstrated in patients with a disease for which the reference product is approved, the biosimilar may be designated as an interchangeable biosimilar for use in that and other diseases for which the reference product is licensed but in which the biosimilar was not studied [2].

2.1 Goal of Studies Expected to Support the Interchangeability Designation

The aim of clinical interchangeability studies is not to re-establish similarity between a biosimilar and its reference product, but to meet the FDA’s expectations for the demonstration of interchangeability [2]. By definition, the biosimilar has already been deemed by the FDA not to be clinically different from the reference product [1]. As such, there is little reason to expect altered PK, heightened immunogenicity response, increased safety risk, or improved or diminished efficacy in patients who switch back and forth from a reference product to the corresponding biosimilar [2].

The FDA guidance on interchangeability mentions that one important possible concern of multiple switches back and forth between a reference product and its biosimilar is the potential for changes in immunogenicity [2]. Therefore, “switching studies are designed to assess whether one product will affect the immune response to the other, once the switch occurs, and whether this will result in differences in immunogenicity or PK profiles” [2]. This perceived drawback of multiple switching between a reference product and its biosimilar is based on theoretical concerns and on experience with switches between one biologic product and another that is different, and not its biosimilar [9, 11]. There is a low likelihood that the incidence, titers, or specificity of antidrug antibodies (ADAs) will increase or change as a result of multiple switches between a reference product and its biosimilar. This is particularly the case if the ADA response is highly restricted to the idiotype (anti-idiotypic antibodies), as with infliximab and adalimumab [12]. In addition, for a biosimilar that has been approved by the FDA or EMA, other product characteristics that could increase the immunogenicity potential of the molecule itself, such as aggregates, impurities, and oxidation, will have been shown to be highly similar to those of the reference product.

If a heightened or altered immunogenicity response were to occur as a consequence of multiple switching between a reference product and its biosimilar, it could create the potential for neutralization of biosimilars, their reference products, and related endogenous proteins; affect the PK and ultimately the efficacy of the product; and also result in hypersensitivity and other immune-mediated adverse events (AEs) [2]. In the scientific community, controversy remains regarding the need to conduct extensive clinical studies to evaluate the effects of switching (including multiple switches) on PK or immunogenicity.

2.2 Design of Studies Expected by the FDA to Support the Interchangeability Designation

In its guidance, the FDA recommends two potential designs for switching studies to evaluate interchangeability [2]. The first approach—a “dedicated switching study”—starts with a “lead-in period of treatment with the reference product, followed by a randomized two-arm period, with one arm incorporating switching between the proposed interchangeable product and the reference product (switching arm) and the other remaining as a non-switching arm receiving only the reference product (non-switching arm). The switching arm is expected to include at least two separate exposure periods to each of the two products (i.e., at least three switches, with each switch crossing over to the alternate product)” (Fig. 1) [2].

Fig. 1 Design of a clinical switching study: dedicated approach [2]. Adapted from US Food and Drug Administration [2]. AUCτ area under the concentration versus time curve in the dosing period, BS proposed interchangeable biosimilar product, Cmax maximum concentration, PK pharmacokinetics, RP reference product

The second approach—the integrated study—is designed so that biosimilarity and interchangeability can be supported by a single investigation. In the integrated, two-part study design, biosimilarity of the proposed interchangeable product and its reference product is assessed in the first stage with a parallel, head-to-head, comparative design. Subjects in the reference product arm are then re-randomized and, in the second stage, the dedicated switching study design is followed to evaluate interchangeability between the two products (Fig. 2) [3].

Fig. 2 Design of a clinical switching study: integrated approach [2, 3]. “Sowing confusion in the field: the interchangeable use of biosimilar terminology,” Laura McKinley (US Regulatory Policy), John M. Kelton (US Medical Affairs), and Robert Popovian (US Government Relations), Current Medical Research and Opinion, 2019, Published by Taylor & Francis. Adapted by permission of the publisher Informa UK Limited trading as Taylor & Francis Ltd, https://www.tandfonline.com [3]. BS proposed interchangeable biosimilar product, RP reference product

The FDA guidance provides recommendations to conduct clinical interchangeability studies in patients rather than in healthy volunteers, and to use PK parameters, rather than efficacy or safety endpoints, as primary endpoints. The guidance states that “the primary endpoint in a switching study or studies should assess the impact of switching or alternating between the use of the proposed interchangeable product and the reference product on clinical PK and PD (if available), since these assessments are generally most likely to be sensitive to changes in immunogenicity and/or exposure that may arise as a result of alternating or switching” [2, 3].

In addition, the FDA guidance provides recommendations on the evaluation of PK endpoints. The guidance states that “the last switching interval should be from the reference product to the proposed interchangeable product, where the duration of exposure to the proposed interchangeable product after the last switch is sufficiently long to allow for washout of the reference product (i.e., at least three or more half-lives)”. This allows assessment of the PK of the proposed interchangeable product in the switching arm and comparison with the PK of the reference product in the non-switching arm [2].
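As a rough illustration of the washout criterion quoted above, and assuming simple first-order (mono-exponential) elimination, which is an idealization introduced here and not part of the FDA guidance, the fraction of reference product remaining after k half-lives can be written as:

```latex
\[
\frac{C(k\,t_{1/2})}{C_0} = \left(\frac{1}{2}\right)^{k},
\qquad
k = 3 \;\Rightarrow\; \left(\frac{1}{2}\right)^{3} = 0.125 .
\]
```

Under this assumption, approximately 12.5% of the reference product concentration present at the time of the last switch would remain after three half-lives, so most of the exposure measured thereafter would reflect the proposed interchangeable product.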

2.3 Primary Endpoints of Clinical Interchangeability Studies: Statistical PK Assessment

As outlined in the final FDA guidance, studies that support interchangeability are designed primarily to assess whether the clinical performance is altered by multiple switching between a reference product and its biosimilar and, more specifically, whether such switching will result in differences in PK or immunogenicity profiles [2]. The FDA guidance recommends that, for intravenous administration, the area under the concentration versus time curve in the dosing period (AUCτ) should be considered as the primary study endpoint. For subcutaneous administration, the maximum (or peak) concentration (Cmax) and AUCτ should be considered as co-primary study endpoints [2]. These PK parameters should be analyzed using an equivalence approach, with the two-sided 90% confidence interval (CI) for the geometric mean ratio (GMR) of AUCτ and Cmax between the proposed interchangeable product and the reference product being within the range of 0.8–1.25 [2].
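The equivalence criterion described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes two independent arms (switching and non-switching), log-normally distributed PK values, and a large-sample normal approximation in place of the t-distribution or model-based analysis that an actual study would use. The function names and input values are hypothetical and do not come from the FDA guidance.

```python
import math
from statistics import NormalDist, mean, variance


def gmr_confidence_interval(test_values, reference_values, level=0.90):
    """Return (GMR, lower, upper) for the ratio of geometric means.

    Assumption: independent arms and a normal approximation on the
    log scale (illustrative only, not a regulatory analysis).
    """
    log_test = [math.log(v) for v in test_values]
    log_ref = [math.log(v) for v in reference_values]

    # Difference of log-scale means equals the log of the GMR.
    diff = mean(log_test) - mean(log_ref)

    # Standard error of the difference between two independent means.
    se = math.sqrt(
        variance(log_test) / len(log_test)
        + variance(log_ref) / len(log_ref)
    )

    # Two-sided 90% CI uses the 95th percentile of the standard normal.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)


def within_margins(lower, upper, low_margin=0.80, high_margin=1.25):
    """True if the whole CI lies inside the equivalence margins."""
    return low_margin <= lower and upper <= high_margin


if __name__ == "__main__":
    # Hypothetical AUC values for illustration; not study data.
    switching_arm = [410.0, 385.0, 450.0, 398.0, 432.0, 405.0]
    non_switching_arm = [400.0, 392.0, 441.0, 415.0, 420.0, 389.0]
    gmr, lo, hi = gmr_confidence_interval(switching_arm, non_switching_arm)
    print(f"GMR={gmr:.3f}, 90% CI=({lo:.3f}, {hi:.3f}), "
          f"within 0.80-1.25: {within_margins(lo, hi)}")
```

For subcutaneous products, the same check would be applied to both Cmax and AUCτ, and both CIs would need to fall within the margins.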

2.4 Current Evidence on the Effects of Switching

Most switching data pertaining to biosimilars are from studies designed to evaluate the effects of a single switch from a reference product to its biosimilar. A review of published data from clinical trials and post-marketing surveillance found no evidence to suggest that switching between biosimilars and their corresponding reference products results in significant safety concerns [13, 14]. Indeed, in the EU, the demonstration of biosimilarity, with rigorous post-marketing pharmacovigilance, is considered to be adequate to support switching in clinical practice [9, 14].

A systematic literature review of switching studies between related biologics (including biosimilars) identified no increased risk of immunogenicity-related AEs or decreased efficacy after a single switch from, or multiple switches between, a reference product and its biosimilar [15]. Recent studies have also shown that multiple switching between a reference product and its biosimilar has no apparent effect on efficacy, safety, or immunogenicity [16,17,18]. These multiple-switch studies were not designed to meet the FDA’s expectations for a designation of interchangeability because they did not assess PK parameters as primary endpoints [2]. However, many have demonstrated comparable efficacy, safety, and immunogenicity of the biosimilar to its reference product among patients who switched from the reference product to the biosimilar, compared to those who continued treatment with the biosimilar.

For example, in open-label extensions of the PLANETRA and PLANETAS studies, which compared biosimilar infliximab-dyyb to reference infliximab in patients with rheumatoid arthritis (RA) and ankylosing spondylitis, respectively, patients who switched from reference infliximab to infliximab-dyyb had efficacy, safety, and immunogenicity outcomes similar to those of patients who continued treatment with infliximab-dyyb [19, 20]. In addition, the NOR-SWITCH study, a randomized, double-blind non-inferiority trial that assessed switching from reference infliximab to biosimilar infliximab-dyyb in patients with any of six different immune-mediated inflammatory diseases, demonstrated that switching from reference infliximab to infliximab-dyyb (switch group) was not inferior to continuing treatment with reference infliximab (maintenance group). Overall, disease worsening occurred in 26% and 30% of patients in the maintenance and switch groups, respectively. The safety profile was also similar between the maintenance and switch groups [21]. Similar outcomes were observed among patients who switched from reference infliximab to infliximab-dyyb during the 26-week, open-label extension [22].

Three switches between biosimilar etanercept GP2015 (Erelzi®; etanercept-szzs; Sandoz International GmbH, Holzkirchen, Germany; Sandoz, Inc., West Princeton, NJ, USA) and etanercept reference product (Enbrel®, Amgen Inc., Thousand Oaks, CA, USA; EU authorized) did not adversely affect efficacy, safety, or immunogenicity in patients with chronic plaque-type psoriasis [16]. Switching five times between biosimilar filgrastim EP2006 (Zarxio®; filgrastim-sndz; Sandoz International GmbH, Holzkirchen, Germany; Sandoz, Inc., West Princeton, NJ, USA) and filgrastim reference product (Neupogen®, Amgen Inc., Thousand Oaks, CA, USA) did not result in clinically meaningful differences in efficacy or safety in patients with breast cancer [17]. Switching four times between biosimilar adalimumab GP2017 (Hyrimoz®; adalimumab-adaz; Sandoz International GmbH, Holzkirchen, Germany; Sandoz, Inc., West Princeton, NJ, USA) and adalimumab reference product (Humira®, AbbVie Ltd, Maidenhead, UK; AbbVie Inc., North Chicago, IL, USA) resulted in no detectable impact on efficacy, safety, or immunogenicity in patients with active, clinically stable, moderate-to-severe plaque psoriasis [18].

Assessing the effects of switching in a “real-world” setting can be challenging, as patients may be switched back and forth among different products in clinical practice [23]. Observational studies of the use of epoetins and granulocyte colony-stimulating factors in patients with chronic kidney disease or cancer in Italy have demonstrated that there is frequent switching between different biological products belonging to the same class in routine clinical care [24, 25]. Post-marketing studies conducted in population-based databases have provided additional evidence of the comparative effectiveness and safety of originator and biosimilar epoetins, and have shown that switching from originator epoetin to a biosimilar epoetin, or vice versa, is safe and effective [26,27,28].

Moreover, experience from the DANBIO registry revealed no negative impact of switching from infliximab reference product (Remicade®, Janssen Biologics B.V., Leiden, The Netherlands; Janssen Biotech, Inc., Horsham, PA, USA) to biosimilar infliximab CT-P13 (Inflectra®; infliximab-dyyb; Pfizer Inc., New York, NY, USA; Remsima®, Celltrion Healthcare, Co., Ltd, Incheon, Korea) on disease activity in patients with rheumatoid arthritis (RA), psoriatic arthritis (PsA), or axial spondyloarthritis [29]. The adjusted 1-year retention rate was lower in the infliximab-dyyb group compared with that among patients who received infliximab reference product. However, this study had several limitations, including incomplete data due to the observational approach, and the potential nocebo effect in patients in the infliximab-dyyb group. The nocebo effect refers to the emergence of new or worsening symptoms brought about by negative expectations regarding a therapeutic intervention, either inert or active [30,31,32]. In the BIO-SWITCH study, small changes in disease activity indices observed in patients with RA, PsA, and axial spondyloarthritis were also attributed to the nocebo effect [33].

3 Challenges with the Use of PK Endpoints in Supporting Interchangeability

The FDA guidance states that interchangeability is supported if the 90% CI of the ratio of the log-normally distributed PK parameters for systemic drug exposure, Cmax and AUCτ, between the proposed interchangeable product and the reference product falls completely within the symmetric bioequivalence range of 80–125% (symmetric on the log scale) [34]. This arbitrarily determined range is the same as that used in PK bioequivalence studies comparing proposed generic small-molecule drugs to their reference products, based upon the assumption that a difference in systemic drug exposure of up to 20% is not clinically significant. These margins have also been used to establish PK similarity in clinical studies comparing biosimilars to their reference products [2]. However, it is important to recognize that, in an interchangeability study, these PK assessments are not intended to re-establish PK similarity between the biosimilar and its reference product, but rather serve as sensitive outcome measures to assess the impact of multiple switching between a reference product and its biosimilar on therapeutic drug concentrations [2].
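The sense in which the 80–125% range is symmetric can be made explicit with a short calculation on the natural-log scale, on which the PK ratios are analyzed:

```latex
\[
\ln(0.80) \approx -0.223,
\qquad
\ln(1.25) \approx +0.223,
\qquad
\text{because } 1.25 = \frac{1}{0.80}.
\]
```

In other words, the lower limit corresponds to a 20% reduction in exposure, and the upper limit is its reciprocal, so the two limits lie at equal distances from a ratio of 1 on the log scale even though they appear asymmetric on the percentage scale.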

It is important to note that the establishment of symmetric equivalence margins of 80–125% for Cmax and AUCτ does not consider biologic plausibility (e.g., the known effect of potential immunogenicity on the PK of a given molecule). For certain molecules (e.g., monoclonal antibodies such as adalimumab and infliximab), the occurrence of ADAs affects PK only in one direction (i.e., lowering plasma drug levels). Therefore, investigating the possibility of increased drug levels after multiple switches for these molecules might not be warranted, since such an occurrence is unlikely. The clinical relevance of differences in PK on efficacy and safety, both of which are potential concerns with interchangeability, should also be considered. Indeed, for certain products, a difference of 25% in Cmax and AUCτ might not have a discernible effect on safety or efficacy (e.g., drugs with clinical doses that yield concentrations at the plateau of the therapeutic concentration–effect curve).

Several challenges are associated with the use of PK endpoints in clinical interchangeability studies. The sample size required for PK studies conducted in patients, rather than in healthy volunteers, may be relatively large for certain products (e.g., subcutaneously administered drugs) because of a high coefficient of variation (CV) for PK parameters. For example, it has been reported that the estimated CV for adalimumab is ~ 50% for Cmax or AUCτ in patients, compared with that of ~ 30% in healthy volunteers [35]. Additionally, determination of AUCτ and Cmax in patients requires intense sampling with multiple PK assessments over a relatively short period of time. Dosing and sampling deviations may occur, which could contribute to increased variability. Moreover, the high variability of PK parameters observed in patients may be due to several factors, such as the use of concomitant medications and the wide distribution of body weights, serum albumin levels, and disease activity.
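To illustrate why the higher CV in patients matters, the CV can be converted to a log-scale variance. The calculation below assumes log-normally distributed PK parameters and uses the common approximation that, for fixed margins and a fixed true GMR, the required sample size is proportional to the log-scale variance. It is an illustrative sketch and is not taken from the cited report [35]:

```latex
\[
\sigma^{2} = \ln\!\left(1 + \mathrm{CV}^{2}\right)
\]
\[
\mathrm{CV} = 0.50:\ \sigma^{2} = \ln(1.25) \approx 0.223,
\qquad
\mathrm{CV} = 0.30:\ \sigma^{2} = \ln(1.09) \approx 0.086
\]
\[
\frac{n_{\mathrm{patients}}}{n_{\mathrm{healthy}}}
\approx \frac{0.223}{0.086} \approx 2.6
\]
```

Under these assumptions, a study in patients with a CV of approximately 50% would need roughly 2.6 times as many subjects as a study in healthy volunteers with a CV of approximately 30%, before any allowance for dropout or non-evaluable subjects.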

In addition, the need for intensive PK sampling to assess Cmax and AUCτ requires multiple blood draws, which adds to patient burden and could negatively affect study enrollment and, importantly, increase dropout and rates of non-evaluable subjects (e.g., number of subjects who would not be evaluable for PK assessments). The combination of high PK variability and the rate of non-evaluable PK could render dedicated clinical interchangeability studies, which follow the FDA guidance strictly, unnecessarily large.

Furthermore, since the occurrence of ADAs has been described to affect PK in one direction only, by lowering plasma drug levels, a two-sided CI using asymmetric margins (with “relaxation” of the upper margin) may be a more meaningful approach for studies supporting interchangeability. Table 1 shows the sample size requirements for an interchangeability study of a biosimilar with reference adalimumab conducted in patients with RA, comparing a design that uses the symmetric equivalence margins of 80–125%, as recommended in the FDA guidance [2], with an example design that uses asymmetric margins of 80–140%. Using asymmetric margins could allow for a smaller sample size in cases where the GMR is ≥ 100%.

Table 1 Total sample size required for 90% co-primary power for symmetric and asymmetric equivalence margins
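The effect of relaxing the upper margin on sample size can be explored with a simple power calculation. The sketch below is not the method used to generate Table 1 and does not reproduce its values. It assumes a parallel two-arm design, a single PK endpoint (not co-primary endpoints), log-normal data, a normal approximation to the two one-sided tests procedure, and no dropout. All function names and default values are hypothetical.

```python
import math
from statistics import NormalDist

_NORMAL = NormalDist()


def tost_power(n_per_arm, cv, gmr, lower=0.80, upper=1.25, alpha=0.05):
    """Approximate power of two one-sided tests for a ratio of means.

    Assumptions (illustrative): parallel arms of equal size, log-normal
    data with coefficient of variation `cv`, normal approximation.
    """
    sigma = math.sqrt(math.log(1 + cv ** 2))   # log-scale SD
    se = sigma * math.sqrt(2 / n_per_arm)      # SE of log-mean difference
    z = _NORMAL.inv_cdf(1 - alpha)             # one-sided critical value
    delta = math.log(gmr)                      # true log ratio

    power = (
        _NORMAL.cdf((math.log(upper) - delta) / se - z)
        + _NORMAL.cdf((delta - math.log(lower)) / se - z)
        - 1
    )
    return max(0.0, power)


def n_per_arm_for_power(cv, gmr, target=0.90, lower=0.80, upper=1.25,
                        alpha=0.05, max_n=100_000):
    """Smallest per-arm sample size reaching the target power."""
    for n in range(2, max_n + 1):
        if tost_power(n, cv, gmr, lower, upper, alpha) >= target:
            return n
    raise ValueError("target power not reached within max_n")


if __name__ == "__main__":
    # CV of 0.5 mirrors the patient CV discussed in the text; the true
    # GMR values are hypothetical scenarios.
    for gmr in (1.00, 1.05):
        sym = n_per_arm_for_power(0.5, gmr, upper=1.25)
        asym = n_per_arm_for_power(0.5, gmr, upper=1.40)
        print(f"GMR={gmr:.2f}: n/arm symmetric={sym}, asymmetric={asym}")
```

In this approximation, widening only the upper margin reduces the required sample size whenever the true GMR is at or above 1, which is consistent with the qualitative point made above. A real sample size calculation would also need to account for co-primary endpoints and the expected rate of non-evaluable subjects.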

4 Challenges with the Use of Immunogenicity Assessments in Supporting Interchangeability

The FDA guidance states that clinical interchangeability studies should assess immunogenicity, which should be analyzed descriptively as a secondary endpoint [2]. We agree that a descriptive analysis of immunogenicity is adequate, as there are currently no established margins corresponding to the clinical relevance of differences in immunogenicity. Moreover, it is not possible to make meaningful comparisons of compounds with a low incidence of ADAs.

We believe that a key focus in assessing interchangeability should be examination of the evolution of immunogenicity profiles of the biosimilar and corresponding reference product as a consequence of switching. However, as similarity already has been established between the biosimilar and its reference product, it is expected that their immunogenicity profiles will be similar after alternating treatment between the biosimilar and reference product in patients. For example, studies of switching between biosimilar adalimumab ABP-501 (Amjevita™; adalimumab-atto; Amgen Inc., Thousand Oaks, CA, USA; Amgevita®, Amgen Europe B.V., Breda, The Netherlands) and the adalimumab reference product (Humira®) in patients with psoriasis showed that immunogenicity (as measured by antibody detection) was not affected by switching between treatments [36].

Furthermore, according to FDA guidance, a comparative clinical immunogenicity study (e.g., switching study) may be unnecessary to support the demonstration of biosimilarity or interchangeability for insulin products [37]. In contrast to other biologics, such as monoclonal antibodies, insulin is a relatively small protein with a straightforward structure that is well-characterized. Thus, if a comprehensive analytical evaluation demonstrates that the proposed biosimilar insulin is “highly similar” to its reference product, then “there would be little or no residual uncertainty regarding immunogenicity” and “minimal or no risk of clinical impact from immunogenicity” would be expected [37].

Finally, although switching between a reference product and its biosimilar is not expected to trigger immunogenicity, it is important to note that changes in the delivery device or formulation of any biological medication may result in increased immunogenicity. For example, after changes were made to both the formulation and delivery device of an epoetin alfa reference product, patients with chronic kidney disease who received this reference biologic medication developed anti-erythropoietin antibodies that resulted in fatal cases of pure red cell aplasia [11, 38,39,40]. However, the potential impact of such manufacturing changes on the safety and efficacy of a biologic product is evaluated according to a separate regulatory process for conducting a comparability assessment, which is distinct from the regulatory assessment of biosimilarity [41].

5 Challenges with the Use of Efficacy and Safety Endpoints in Clinical Interchangeability Studies

The FDA guidance states that “although assessments of efficacy endpoints can be supportive, at therapeutic doses many clinical efficacy outcomes would only be sensitive to large changes in exposure or immunogenicity, which may not be observed in a study of limited duration and with a limited number of switches” [2]. In addition, the use of efficacy and safety endpoints can present potential challenges in clinical interchangeability studies. It can be challenging to conduct a fully blinded study if different formulations of the biosimilar and its reference product are used, with potential differences in injection-site reactions or pain [42]. The nocebo effect may influence the assessment of efficacy, especially if the study is not fully blinded [43]. The FDA recognizes that efficacy endpoints are measured with less precision than PK parameters and might therefore necessitate larger sample sizes if an efficacy parameter were used as the primary endpoint; efficacy endpoints should therefore be analyzed descriptively [2].

The use of established efficacy endpoints in biosimilar trials may not be sensitive enough to detect potential differences in efficacy between a biosimilar and its reference product, since the use of such endpoints has not differentiated between two dissimilar biologic agents with distinct mechanisms of action. In the AMPLE trial, which compared subcutaneous administration of abatacept to that of adalimumab over 2 years in patients with RA, all clinical efficacy outcome measures [American College of Rheumatology (ACR) responses, changes in Disease Activity Score in 28 joints using C-reactive protein (DAS28-CRP), improvement in the Health Assessment Questionnaire–Disability Index (HAQ-DI), and radiographic progression] yielded essentially the same results for both drugs over the 2-year study duration, despite abatacept and adalimumab having completely different mechanisms of action [44].

6 Are Multiple-Switch Clinical Trials Needed to Support Interchangeability?

In light of the high standards for biosimilar approval in the US, biologic plausibility of the effect of switching, accumulated clinical evidence, and limitations of the endpoints used in interchangeability studies, there has been discussion in the scientific community regarding the need to re-evaluate how interchangeability is determined [45].

We believe that the FDA should be flexible when considering statistical approaches, endpoints, and overall study design for interchangeability switching studies. For example, the potential use of asymmetric margins, or even a non-inferiority design, rather than symmetric margins could be considered as an alternative statistical approach to test the equivalence of PK endpoints when it is known that immunogenicity could only hasten drug clearance. Efficacy endpoints, ideally including appropriate surrogate biomarkers, could be used instead of PK endpoints, particularly when the immunogenicity potential of the drugs is low.

Currently, as experience with multiple switches of biosimilars has been limited, there is a regulatory expectation to perform multiple-switch trials to support an interchangeability designation. However, it may not be possible to fully blind such switching trials to minimize the nocebo effect. In addition, the design of such multiple-switch trials has ethical implications. For example, the need for multiple in-patient visits and intensive PK sampling to assess PK parameters would add to patient burden. Understandably, the current regulatory expectations for demonstrating interchangeability are conservative. However, over time, the FDA may conclude that switching studies are unnecessary to support an interchangeability designation as new evidence (e.g., results from clinical trials and real-world observational studies) emerges to show that PK, immunogenicity, and efficacy are not affected by multiple switches between a reference product and its biosimilar.

On a case-by-case and individual product basis, and when scientifically justified, the FDA should support alternative study designs that could hasten the development process without affecting the ability to ensure that the risk in terms of alternating or switching between use of the interchangeable biological product and its reference product is not greater than the risk of using the reference product without switching. For example, the designation of interchangeability could be supported by the current analytical, functional, and clinical data that are required to establish biosimilarity, and complemented by post-marketing surveillance in registries or pragmatic randomized controlled trials that focus on patient-centered outcomes, such as tolerability and adherence [9, 46].

7 Conclusions

From a clinical perspective, an increased risk associated with switching between a biologic reference product and its biosimilar is improbable. The potential for changes in immunogenicity, such as a heightened or altered immunogenic response, as a consequence of switching between a reference product and its biosimilar is not supported by current evidence. However, there is a legal requirement to demonstrate that “for a biological product that is administered more than once to an individual, the risk in terms of safety or diminished efficacy of alternating or switching between use of the biological product and the reference product is not greater than the risk of using the reference product without such alternation or switch” [2]. The use of PK parameters as primary endpoints for clinical interchangeability studies to support an interchangeability designation of biosimilars creates new challenges. Interchangeability should be assessed on a case-by-case basis, considering the “totality of the evidence” and biologic plausibility. Alternative approaches to statistical analysis (e.g., use of asymmetric rather than symmetric margins to test equivalence) and study designs that meet the FDA’s expectations for demonstration of interchangeability should be considered.