Key Summary Points

Inflammatory bowel disease, broadly encompassing ulcerative colitis (UC) and Crohn’s disease, is a set of chronic, relapsing/remitting gastrointestinal diseases with long-term effects on patients’ quality of life and well-being.

Current systemic therapies, including corticosteroids, immunosuppressants, and anti-tumor necrosis factor alpha agents, are not effective for many patients and increase the risk of opportunistic infections, non-Hodgkin lymphoma, and other undesired side effects.

Etrolizumab is an investigational next-generation anti-integrin therapy with a dual action targeting two pathways of inflammation in the gut for the treatment of moderate-to-severe UC and Crohn’s disease.

Etrolizumab selectively inhibits α4β7 and αEβ7 to control both trafficking of immune cells into the gut and the inflammatory effects on the gut lining.

The etrolizumab phase 3 clinical trial program is the largest (> 3000 patients) and most comprehensive registrational program in UC and Crohn’s disease. It was developed not only to characterize the safety and efficacy of etrolizumab but also to advance the field by addressing unanswered clinical questions about the evaluation of treatment effects and treatment selection, drawing on an extensive repository of patient samples and data.

Digital Features

This article is published with digital features, including a summary slide and video abstract, to facilitate understanding of the article. To view digital features for this article go to https://doi.org/10.6084/m9.figshare.12006219.

Introduction

Inflammatory bowel disease (IBD) is a set of chronic, relapsing/remitting gastrointestinal diseases with long-term effects on patients’ quality of life and well-being. The two major types of IBD are ulcerative colitis (UC) and Crohn’s disease. Commonly used systemic therapies, including corticosteroids (CS), immunosuppressants (IS), and anti-tumor necrosis factor alpha agents (anti-TNFs), are not effective for many patients and increase the risk of opportunistic infections, non-Hodgkin lymphoma, and other undesired side effects [1,2,3]. More recently, the anti-integrin vedolizumab was approved for the treatment of patients with moderate-to-severe UC or Crohn’s disease who do not respond to conventional or TNFα inhibitor therapy, and the anti-interleukin (IL)-12/IL-23 antibody ustekinumab was approved for the treatment of moderate-to-severe UC and Crohn’s disease [4, 5]. However, there is a clear need for safer treatment options that limit the side effects of systemic immunosuppression and that have long-term efficacy in a broad patient population. Clinical research to improve the management of IBD is progressing, but even as new treatment options are being evaluated and approved for use, significant clinical questions remain unanswered. For example, with the emergence of new treatment options, evidence for comparative efficacy is critical to guide registration labeling and to define the optimal treatment paradigm. Few comparative efficacy trials in the postmarketing setting have been performed [6, 7]. Efforts to optimize treatment selection in the light of emerging clinical evidence are confounded by differences in and/or incomplete characterization of clinical trial populations and by the variable use of efficacy metrics/scales. The many endpoints in use, such as patient-reported outcomes (PROs), endoscopy, histology, and biomarkers, are poorly correlated, and the methodology for evaluating them varies from study to study.

In addition, while short-term symptomatic improvement is important to both patients and physicians, evaluating long-term outcomes is critical given the chronic nature of the disease. Therefore, it would be useful to conduct a comprehensive clinical program to generate data that can be used to systematically evaluate and correlate various short- and long-term efficacy and disease outcomes to inform optimal treatment selection.

IBD is characterized by a dysregulation of the immune system in genetically susceptible individuals in response to commensal microbiota and other environmental triggers [8]. Anti-integrins, such as etrolizumab and vedolizumab, are a new class of agents that selectively inhibit lymphocyte trafficking to and within the large and small intestines while avoiding broad-spectrum immunosuppression. Inhibition of the interaction between the integrin α4β7 and its ligand mucosal vascular addressin cell adhesion molecule 1 has been shown to be effective in both UC and Crohn’s disease [9, 10].

Etrolizumab is a next-generation anti-integrin with dual action that targets two pathways of inflammation in the gut. Unlike vedolizumab, which targets only α4β7, etrolizumab selectively inhibits α4β7 and αEβ7 to control both trafficking of immune cells into the gut and the inflammatory effects on the gut lining (Fig. 1). Etrolizumab uniquely inhibits αEβ7-expressing lymphocytes residing in the gut that have been shown to exhibit an inflammatory phenotype in patients with UC [11, 12]. Inhibition of the αE integrin, therefore, targets not only the αEβ7-expressing inflammatory cells present in the gut mucosa before the initiation of therapy but also those αEβ7-expressing inflammatory cells that may continue to traffic into the gut mucosa via the α4β1:vascular cell adhesion molecule 1 pathway that is not inhibited by β7 or α4β7 antagonists and has been shown to potentially play an important role in Crohn’s disease [13].

Fig. 1
Etrolizumab dual mechanism of action. IEL intraepithelial lymphocyte, MAdCAM-1 mucosal vascular addressin cell adhesion molecule 1, VCAM-1 vascular cell adhesion molecule 1

Results from the phase 2 study EUCALYPTUS have demonstrated a benefit of etrolizumab treatment over placebo in patients with moderate-to-severe UC [14]. A number of patients from EUCALYPTUS have now received more than 5 years of treatment with etrolizumab through enrollment in the open-label extension (OLE) phase 2 study SPRUCE. A robust phase 3 clinical program in UC and Crohn’s disease is ongoing and aims to evaluate the efficacy and safety of etrolizumab in well-defined patient populations in rigorous trials that include direct head-to-head comparisons against other approved biologics. Given the wealth of clinical trial and real-world data available for anti-TNF agents, infliximab and adalimumab were chosen as the comparators for the etrolizumab clinical trial program rather than newer biologics, for which the clinical evidence and therapeutic experience are more recent and more limited. Further, these clinical studies will evaluate historical clinical endpoints as well as several newer endpoints. Together, these studies will not only assess the efficacy of etrolizumab but will also provide a comprehensive data set to enhance future trial designs for IBD by allowing better understanding of the performance of and associations across various new endpoints and by identifying study inclusion criteria that may facilitate better measurement of treatment effects. Herein, we provide an overview of the comprehensive phase 3 clinical program of etrolizumab in UC and Crohn’s disease.

Methods

Study Designs

The etrolizumab phase 3 clinical program is designed to evaluate safety and efficacy in patients with moderately to severely active UC or Crohn’s disease who have had inadequate response or intolerance to prior CS, IS, and/or anti-TNFs. The program consists of six multicenter, prospective randomized controlled trials (RCTs) and two OLE studies (Figs. 2, 3). The OLE studies will provide many years of data, which is critical given the chronic nature of the disease.

Fig. 2
Ulcerative colitis trial designs. anti-TNF anti-tumor necrosis factor alpha agent, MCS Mayo Clinic score, OLI open-label induction, RB rectal bleeding score. *Patients who achieved a ≥ 3-point decrease and 30% reduction in MCS and ≥ 1-point decrease in RB or an absolute RB of 0 or 1 are randomly assigned to the maintenance arms

Fig. 3
Crohn’s disease trial designs. *Patients who achieved a ≥ 70-point reduction in Crohn’s disease activity index score from baseline are again randomly assigned to the maintenance arms

HIBISCUS I and II, GARDENIA, and LAUREL are investigating patients with UC who are anti-TNF-naive. HIBISCUS I and II are identical induction trials evaluating etrolizumab head to head against an active comparator, adalimumab, and against placebo; GARDENIA is a maintenance study evaluating etrolizumab against an active comparator, infliximab; and LAUREL is a maintenance trial evaluating etrolizumab against placebo. HICKORY is an induction/maintenance trial evaluating etrolizumab versus placebo in anti-TNF-experienced patients with UC. Patients from the five UC RCTs may be eligible to roll over to open-label treatment in COTTONWOOD.

In Crohn’s disease, BERGAMOT is an induction/maintenance trial evaluating anti-TNF-naive and -experienced patients. Registration requirements for new medications in Crohn’s disease were in flux at the time of study conception; hence, BERGAMOT was designed with an exploratory cohort 1, which, in collaboration with the US Food and Drug Administration (FDA), European Medicines Agency (EMA), and clinical experts in IBD, helped inform endpoint selection for pivotal cohort 3 [15, 16]. Eligible patients from BERGAMOT may be able to roll over to open-label treatment in JUNIPER.

The trials started in May 2014 and will run until enrollment is complete at centers across Asia, Australia, Europe, North America, and South America. All procedures involving humans are in accordance with the standards of all ethics committees and institutional review boards and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent will be obtained from all individual patients included in the study.

Patients

Key eligibility criteria for all trials are having a diagnosis of UC or Crohn’s disease for at least 3 months before screening, having moderately to severely active disease, and having had an inadequate response or intolerance to prior treatment with CS, IS, and/or anti-TNF. In UC, moderately to severely active disease is defined as a Mayo Clinic score (MCS) of 6–12 with a centrally read endoscopy subscore ≥ 2 and no subscore < 1. In Crohn’s disease, moderately to severely active disease is defined by the clinical assessment of a Crohn’s disease activity index (CDAI) score of 220–480, as well as a simple endoscopic score for Crohn’s disease (SES-CD) ≥ 7 or ≥ 4 for isolated ileitis. The pivotal cohort 3 of BERGAMOT also requires a stool frequency score (SF) ≥ 6 or SF > 3 and abdominal pain score (AP) > 1. Patients should remain on stable doses of oral 5-aminosalicylic acid (5-ASA), oral CS [equivalent to ≤ 30 mg/day prednisone (≤ 20 mg/day allowed for BERGAMOT)], budesonide (≤ 9 mg/day), probiotics, and IS (azathioprine, 6-mercaptopurine, methotrexate) for the induction period; patients receiving oral CS are required to taper CS during maintenance therapy. CS tapering begins at the end of induction at one of two rates: 5 mg/week for a CS dose > 10 mg/day and 2.5 mg/week for a CS dose ≤ 10 mg/day. The taper is expected to be completed within approximately 8 weeks after induction and at least 9 months before the primary endpoint assessment. Complete inclusion and exclusion criteria for each individual study will be fully detailed in future reports.
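The two-rate corticosteroid taper can be illustrated with a short calculation. The following Python sketch is not taken from the study protocols: it assumes prednisone-equivalent doses, weekly step-downs, and that the 5 mg/week rate applies to doses above 10 mg/day while the 2.5 mg/week rate applies at or below 10 mg/day. The function names are hypothetical.

```python
from typing import List


def taper_schedule(start_dose_mg: float) -> List[float]:
    """Return the daily CS dose (mg/day) week by week until it reaches 0."""
    if start_dose_mg < 0:
        raise ValueError("start dose must be non-negative")
    dose = float(start_dose_mg)
    doses = [dose]
    while dose > 0:
        # Assumed band boundaries: 5 mg/week above 10 mg/day,
        # 2.5 mg/week at or below 10 mg/day.
        weekly_step = 5.0 if dose > 10 else 2.5
        dose = max(0.0, dose - weekly_step)
        doses.append(dose)
    return doses


def weeks_to_complete(start_dose_mg: float) -> int:
    """Number of weekly reductions needed to reach 0 mg/day."""
    return len(taper_schedule(start_dose_mg)) - 1


if __name__ == "__main__":
    for start in (30, 20):
        print(start, taper_schedule(start), weeks_to_complete(start))
```

Under these assumptions, a patient entering maintenance at the 30 mg/day ceiling reaches 0 mg/day after 8 weekly steps, which is consistent with the approximately 8-week taper described above; a patient starting at 20 mg/day (the BERGAMOT ceiling) needs 6 steps.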

Interventions and Assessments

In the UC studies (Fig. 2), patients randomly assigned to etrolizumab receive 105 mg every 4 weeks. Depending on the protocol, patients may alternatively be randomly assigned to receive (1) adalimumab subcutaneously per approved dosing schedule (160 mg at week 0, 80 mg at week 2, and 40 mg at weeks 4, 6, and 8); (2) infliximab intravenously per approved dosing schedule (5 mg/kg at weeks 0, 2, and 6, and then every 8 weeks thereafter); or (3) placebo. The consideration to use adalimumab and infliximab as active comparators in the induction and maintenance phases of these studies was based on their approval and wide use as standard-of-care therapy at the time these trials were designed.
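For orientation, the dosing calendars described above can be written out explicitly. This Python sketch only restates the schedules given in the text; the assumption that the first etrolizumab dose falls at week 0, the horizon argument `last_week`, and all function names are illustrative and are not protocol details.

```python
from typing import Dict, List

# Adalimumab induction doses (mg) by study week, as listed in the text.
ADALIMUMAB_INDUCTION_MG: Dict[int, int] = {0: 160, 2: 80, 4: 40, 6: 40, 8: 40}


def etrolizumab_weeks(last_week: int) -> List[int]:
    """Weeks with a 105 mg etrolizumab dose (assumes the first dose at week 0)."""
    return list(range(0, last_week + 1, 4))


def infliximab_weeks(last_week: int) -> List[int]:
    """Weeks with an infliximab infusion: 0, 2, 6, then every 8 weeks."""
    loading = [week for week in (0, 2, 6) if week <= last_week]
    maintenance = list(range(14, last_week + 1, 8))
    return loading + maintenance


def infliximab_dose_mg(weight_kg: float) -> float:
    """Weight-based infliximab dose at 5 mg/kg."""
    return 5.0 * weight_kg


if __name__ == "__main__":
    print(etrolizumab_weeks(12))
    print(infliximab_weeks(30))
    print(infliximab_dose_mg(70))
```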

In the Crohn’s disease study (Fig. 3), patients were randomly assigned to receive induction therapy for 14 weeks of (1) subcutaneously administered etrolizumab 105 mg every 4 weeks; (2) subcutaneously administered etrolizumab 210 mg every 4 weeks; or (3) placebo and then, depending on induction response, maintenance dosing with (1) etrolizumab 105 mg every 4 weeks or (2) placebo. A higher dose of etrolizumab was included in the Crohn’s disease induction study because of the lack of phase 2 data and in order to explore any dose-ranging effects; notably, αE expression is higher in the ileum and proximal colon, which is frequently involved in Crohn’s disease, raising the possibility that a higher dose may be warranted to optimize efficacy in Crohn’s disease [17]. The BERGAMOT trial was initiated when clinical trial endpoints were in transition from the historical registrational endpoint of CDAI to endoscopy and PROs. The lack of evidence to quantify benefit from the more recent endoscopic and patient-reported measures made it challenging to formulate clinical/statistical assumptions for a head-to-head superiority or non-inferiority trial; therefore, the BERGAMOT study employed a placebo comparator.

All treatments were double blinded unless otherwise noted; HIBISCUS I and II, GARDENIA, and BERGAMOT used a double-dummy design to ensure masking of treatment or dose, respectively.

Assessments done before randomization and at regular intervals throughout the study period included physical examination, neurologic assessment, electrocardiogram, hematology and serum chemistries, serum for pharmacokinetic analysis of etrolizumab, antidrug antibody tests, stool sample analysis, biopsies (histology), immunohistochemistry, and colonoscopy. PROs, including the Inflammatory Bowel Disease Questionnaire, EuroQoL-5D, UC-PRO tool (“signs and symptoms” and “systemic symptoms” modules), Crohn’s disease (CD)-PRO tool (“signs and symptoms” and “systemic symptoms” modules), components of CDAI (SF, AP, general well-being), and components of MCS [SF and rectal bleeding score (RB)], are collected daily using electronic PRO devices. The etrolizumab program uses the signs and symptoms modules of the UC-PRO and CD-PRO (UC-PRO/SS and CD-PRO/SS), the first PRO tools to undergo a rigorous development process outlined by the FDA, with input from patients and clinical experts [18, 19].

Outcomes

For the UC RCTs (Table 1), the primary endpoints are based on clinical response (≥ 3-point decrease and 30% reduction in MCS and ≥ 1-point decrease in RB or an absolute RB of 0 or 1), remission (MCS ≤ 2, with individual subscores ≤ 1 and an RB of 0), or clinical remission (MCS ≤ 2 with individual subscores ≤ 1) at the time points indicated in the individual protocols. The primary outcome measures for COTTONWOOD are long-term efficacy (based on the partial MCS assessed at 12-week intervals), remission and endoscopic remission at week 108, and the incidence and severity of adverse events.
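The three MCS-based definitions above can be expressed as simple predicates. The Python sketch below is an illustration rather than the study's analysis code. It assumes the standard four MCS subscores, each scored 0 to 3, and reads the clinical response criterion as (≥ 3-point and ≥ 30% decrease in MCS) and (≥ 1-point decrease in RB or absolute RB ≤ 1); class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MayoScore:
    stool_frequency: int
    rectal_bleeding: int
    endoscopy: int
    physician_global: int

    def __post_init__(self) -> None:
        for value in self.subscores:
            if not 0 <= value <= 3:
                raise ValueError("each MCS subscore must be between 0 and 3")

    @property
    def subscores(self) -> Tuple[int, int, int, int]:
        return (self.stool_frequency, self.rectal_bleeding,
                self.endoscopy, self.physician_global)

    @property
    def total(self) -> int:
        return sum(self.subscores)


def clinical_response(baseline: MayoScore, current: MayoScore) -> bool:
    """>= 3-point and >= 30% MCS decrease, plus an RB decrease >= 1 or RB of 0 or 1."""
    drop = baseline.total - current.total
    mcs_ok = drop >= 3 and drop >= 0.30 * baseline.total
    rb_drop = baseline.rectal_bleeding - current.rectal_bleeding
    rb_ok = rb_drop >= 1 or current.rectal_bleeding <= 1
    return mcs_ok and rb_ok


def clinical_remission(current: MayoScore) -> bool:
    """MCS <= 2 with every individual subscore <= 1."""
    return current.total <= 2 and max(current.subscores) <= 1


def remission(current: MayoScore) -> bool:
    """Clinical remission with the additional requirement of RB = 0."""
    return clinical_remission(current) and current.rectal_bleeding == 0
```

For example, a patient moving from subscores (3, 2, 2, 2) to (1, 1, 1, 1) meets clinical response (a 5-point, 56% decrease with a 1-point RB decrease) but not clinical remission, because the total of 4 exceeds 2.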

Table 1 Ulcerative colitis trials

The co-primary endpoints for BERGAMOT are clinical remission (unweighted AP ≤ 1 and SF ≤ 3) and endoscopic improvement (≥ 50% reduction in SES-CD from baseline) at weeks 14 and 66 (Table 2). The primary outcome measures for JUNIPER are long-term efficacy (based on clinical remission assessed at 12-week intervals), endoscopic remission [SES-CD ≤ 4 (≤ 2 for patients with ileal disease) with no segment > 1] at week 108, and the incidence and severity of adverse events.
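The BERGAMOT co-primary endpoints reduce to two threshold checks. The Python sketch below is a minimal illustration with hypothetical function names. It takes the AP and SF values as already-summarized scores; how daily diary entries are averaged into those values is not specified here and is left to the caller.

```python
def cd_clinical_remission(ap_score: float, sf_score: float) -> bool:
    """Unweighted abdominal pain score <= 1 and stool frequency score <= 3."""
    return ap_score <= 1 and sf_score <= 3


def endoscopic_improvement(baseline_ses_cd: int, current_ses_cd: int) -> bool:
    """At least a 50% reduction in SES-CD from baseline."""
    if baseline_ses_cd <= 0:
        raise ValueError("baseline SES-CD must be positive")
    reduction = baseline_ses_cd - current_ses_cd
    return reduction >= 0.5 * baseline_ses_cd


if __name__ == "__main__":
    # Hypothetical patient: AP 0.8, SF 2.5, SES-CD falling from 12 to 6.
    print(cd_clinical_remission(0.8, 2.5))
    print(endoscopic_improvement(12, 6))
```

Because the endpoints are co-primary, each is evaluated on its own across treatment arms; the two functions are therefore kept separate rather than combined into a single composite.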

Table 2 Crohn’s disease trials

Secondary and exploratory endpoints include other endoscopic measures, histology, quality of life, safety, etrolizumab pharmacokinetics, and biomarkers assessed at various time points in the respective protocols and will help characterize the comprehensive efficacy and safety profile of etrolizumab, as well as any exposure–response correlations. Exploratory endpoints aim to assess the relationship between different endpoints, the ability to use biomarkers to predict and assess treatment response, and the potential to use baseline and assessment data to personalize care for patients with IBD. In addition, the etrolizumab studies evaluated the frequency and severity of all serious and nonserious adverse events that occurred during the conduct of the study.

Statistical Analyses

A detailed statistical analysis plan will be produced for each study and finalized before the primary analysis. Briefly, sample sizes for each study indicated in Figs. 2 and 3 were calculated to provide greater than 80% power at the two-sided 5% significance level for the primary efficacy endpoints based on previously observed active and placebo rates. All formal statistical comparisons for categorical data will use the Cochran–Mantel–Haenszel test statistic, adjusting for appropriate stratification factors. Continuous endpoints will be analyzed using an analysis of covariance model with the appropriate stratification factors and the baseline value of the studied measure as a covariate. For all analyses, the point estimate, 95% confidence intervals, and P value will be reported.
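To make the stratified comparison concrete, the following Python sketch computes the Cochran–Mantel–Haenszel statistic for a binary endpoint across strata using only the standard library. It is an illustration under stated assumptions, not the studies' analysis code: no continuity correction is applied, the P value comes from the chi-square distribution with 1 degree of freedom, and the counts in the usage example are invented.

```python
import math
from typing import Iterable, Tuple

# One stratum = (responders_active, n_active, responders_control, n_control)
Stratum = Tuple[int, int, int, int]


def cmh_test(strata: Iterable[Stratum]) -> Tuple[float, float]:
    """Return (CMH chi-square statistic, P value) for stratified 2x2 tables."""
    diff_sum = 0.0
    var_sum = 0.0
    for resp_a, n_a, resp_c, n_c in strata:
        total = n_a + n_c
        if total < 2:
            continue  # a stratum this small carries no information
        resp_total = resp_a + resp_c
        nonresp_total = total - resp_total
        # Expected responders in the active arm under the null hypothesis.
        expected = n_a * resp_total / total
        # Hypergeometric variance of the active-arm responder count.
        variance = (n_a * n_c * resp_total * nonresp_total
                    / (total ** 2 * (total - 1)))
        diff_sum += resp_a - expected
        var_sum += variance
    if var_sum == 0:
        raise ValueError("no informative strata")
    statistic = diff_sum ** 2 / var_sum
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Invented counts for two strata (for example, baseline CS use yes/no).
    stat, p = cmh_test([(20, 50, 10, 50), (20, 50, 10, 50)])
    print(round(stat, 3), p)
```

In practice a validated statistical package would be used; the point of the sketch is only to show how the stratification factors enter the test, with each stratum contributing its own expected count and variance before the contributions are pooled.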

Strengths and Limitations

Etrolizumab for Patients with Moderate-to-Severe UC or Crohn’s Disease

The etrolizumab phase 3 clinical program is the largest and most comprehensive registrational program in UC and Crohn’s disease and will recruit more than 3000 patients with IBD. This program will characterize the effect of etrolizumab on numerous outcomes, including multiple clinical, endoscopic, histologic, PRO, and quality-of-life indices; an ambitious biomarker discovery program is also included. An extensive safety database with years of data will also result from the two OLE studies and will be useful in confirming the expected favorable safety profile of etrolizumab.

Which Is the Optimal First-Line IBD Therapy Option?

Data from this program will help clinicians make informed treatment decisions to optimize patient outcomes. Collectively, the trials study a wide range of patients, including patients treated with etrolizumab, infliximab, adalimumab, or placebo. Anti-TNFs have been a mainstay in first-line treatment of moderate-to-severe UC and Crohn’s disease; although new treatments have been approved, clinicians must navigate treatment choice using indirect comparisons, which are limited by the disparate methodologies used across trials. HIBISCUS I and II and GARDENIA will be the first phase 3 registration trials in UC to generate efficacy data for a new agent head to head against adalimumab and infliximab.

Planned subgroup analyses, including stratification by disease severity, disease location, and treatment history, may also provide insights into optimal treatment selection for patients. A key population of interest is anti-TNF-experienced patients, a difficult-to-treat population with limited options and uncompelling clinical data to guide therapeutic decisions. Preliminary analyses from the open-label induction cohort of HICKORY have suggested that treatment with etrolizumab is associated with an improvement in clinical, biomarker, endoscopic, and histologic outcomes in this hard-to-treat population [20, 21]. The etrolizumab program also includes the assessment of various biomarkers to characterize their relation to disease prognosis, prediction of response to therapy, and measurement of response. Collectively, data from this program will not only support the efficacy and safety of etrolizumab but also will provide rich data for physicians to decide when and how to best use etrolizumab.

What Is the Optimal Method to Assess Disease Activity and Discern Treatment Effect?

The clinical trial landscape of IBD is constantly evolving to find the optimal method to assess disease activity and discern the treatment effect of new agents. At the time the BERGAMOT study was initiated, there was a lack of clarity on regulatory endpoints for new agents in Crohn’s disease. The CDAI has historically been plagued with high placebo remission rates, limiting its ability to discern treatment effect [22]. Preliminary analyses from BERGAMOT exploratory cohort 1 observed similarly high rates of CDAI-based remission with placebo [23]. Health authorities have since indicated that evaluation of new agents in Crohn’s disease should include PROs and endoscopic assessment of mucosal inflammation [15]. As no validated PRO exists for Crohn’s disease, the PRO2, an index using a weighted composite of AP and SF based on CDAI, has been suggested for use in clinical trials [24]; however, initial analyses from BERGAMOT observed similarly high PRO2 remission rates with placebo, complicating assessment of treatment effect. Unweighted measures of AP and SF may better represent clinically meaningful improvements and provide a better assessment of impact of therapy. Assessment of cohort 1 from BERGAMOT supported the move from CDAI-based endpoints to the current co-primary endpoints of clinical remission (unweighted AP ≤ 1 and SF ≤ 3) and endoscopic improvement with an observed drop in placebo rates [23].

In alignment with the focus on PROs by health authorities, Roche and Genentech, in cooperation with a small consortium, have worked to develop the UC-PRO and CD-PRO [18, 19]. The UC-PRO and CD-PRO are modular instruments that were designed to comprehensively assess the signs, symptoms, and impact of UC and Crohn’s disease, capturing the experience from the perspective of the patient. These are the first PRO tools that have undergone a rigorous development process outlined by the FDA, with input from both patients and clinical experts. Health authorities are currently reviewing this tool and are in the midst of evaluating its use for PRO assessment; in fact, the EMA has issued a letter of support encouraging data sharing and further studies to validate this novel tool [25]. The signs and symptoms modules of the UC-PRO and CD-PRO (UC-PRO/SS and CD-PRO/SS) are included as secondary endpoints across the etrolizumab program—data from these can help support widespread use in clinical trials and in practice.

Objective assessment of inflammation is also important when discerning treatment effect. Central reading of endoscopies has emerged as the gold standard to reduce local reader bias and reduce placebo rates. However, there is no consensus on the optimal number of central readers, the inclusion of local readers, and the optimal adjudication method. Endoscopies from cohort 1 of BERGAMOT were read by a local reader and two central readers, allowing for the systematic evaluation of the performance characteristics of five different endoscopy reading models. Preliminary analysis has suggested that in Crohn’s disease, models with two readers provide the greatest discrimination in discerning treatment effect, and a model that includes at least one central reader and a local reader with consensus among readers determined on a sliding scale does so with the least requirement for a third reader to reach consensus [26]. Analyses further detailing differences between endoscopy read models and local versus central readers will be confirmed using data from BERGAMOT; these insights can not only support the assessment of efficacy of etrolizumab but also inform central reading methodology.

What Clinical Trial Endpoints Are Clinically Relevant?

The translation of clinical trial endpoints to guide decisions in clinical practice is key; correlations between clinical, endoscopic, and histologic outcomes and patient prognosis have yet to be fully elucidated.

Improvement in the endoscopic appearance of the intestinal mucosa is recognized as an important goal of therapy for its association with improved clinical outcomes [27]. However, specifics about this recommendation were recently under discussion. In patients with UC, initial evidence supported that patients achieving a Mayo endoscopic score (MES) of 0 or 1 had a similarly reduced rate of colectomy compared with those with an MES of 2 or 3 [28]. More recently, in a cohort with a median follow-up of 48 months, an MES of 0 was associated with a significantly lower rate of colectomy than an MES of 1 [29]. The UC RCTs of the etrolizumab program include endoscopic improvement (MES = 0 or 1) and endoscopic remission (MES = 0) as endpoints. Combined with long-term enrollment in the COTTONWOOD OLE, these data may provide insights into the clinical relevance of an MES = 0 compared with an MES = 1. In Crohn’s disease, the long-term impact of achieving endoscopic and deep remission was explored in patients in the CALM trial [30]. The study found that patients in clinical remission (CDAI < 150) did not have lower rates of the composite endpoint of major adverse outcomes reflecting Crohn’s disease progression (new internal fistula/abscess, stricture, perianal fistula/abscess, Crohn’s disease hospitalization, or Crohn’s disease surgery); however, patients who achieved endoscopic remission [Crohn’s disease endoscopic index of severity (CDEIS) < 4 with no deep ulcerations] or deep remission (CDAI < 150, CDEIS < 4 with no deep ulcerations, and no steroids for at least 8 weeks) were significantly less likely to have a major event over time than those who did not. Similarly, BERGAMOT includes endoscopic remission as a key secondary endpoint and, together with long-term enrollment in the JUNIPER OLE, will explore the implications of endoscopic endpoints on patient prognosis.

Emerging evidence suggests the importance of histologic normalization, specifically the resolution of neutrophilic infiltration, as a key endpoint that is associated with better clinical outcomes in patients with UC [31, 32]. A variety of histologic scoring systems have been suggested for use, but evaluation of each scoring system has been largely limited by the small sample sizes and/or retrospective nature of the studies in which they have been assessed [33]. Recently, newer indices have undergone formal evaluation and are in use in the etrolizumab UC trials. The UC trials include endpoints of histologic remission, defined as a Nancy Histological Index (NHI) score of 0 or 1 [34]. The NHI was selected for formal assessment as it offers an easily interpreted index that concisely captures the “absence of neutrophils,” the most important factor to establish histologic remission, as an NHI score of 0 or 1 [35]. There will also be the opportunity to assess other scales such as the Robarts Histopathology Index [36] and to determine which components of the scores are associated most strongly with long-term outcomes. Conversely, in Crohn’s disease, consensus about histology is much less clear owing to differences in disease pathophysiology, presentation patterns of inflammation within the bowel regions (i.e., skip lesions), and the immaturity of scoring indexes. Data from BERGAMOT will allow for the exploration of these questions.
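The UC endoscopic and histologic endpoint definitions used in the program are simple cut-offs on ordinal scales, as the following Python sketch shows. It assumes the usual ranges of 0 to 3 for the MES and 0 to 4 for the NHI; the function names are hypothetical and the sketch is illustrative only.

```python
def endoscopic_improvement_uc(mes: int) -> bool:
    """Endoscopic improvement: Mayo endoscopic score of 0 or 1."""
    if not 0 <= mes <= 3:
        raise ValueError("MES must be between 0 and 3")
    return mes <= 1


def endoscopic_remission_uc(mes: int) -> bool:
    """Endoscopic remission: Mayo endoscopic score of 0."""
    if not 0 <= mes <= 3:
        raise ValueError("MES must be between 0 and 3")
    return mes == 0


def histologic_remission_uc(nhi: int) -> bool:
    """Histologic remission: Nancy Histological Index of 0 or 1."""
    if not 0 <= nhi <= 4:
        raise ValueError("NHI must be between 0 and 4")
    return nhi <= 1
```

A patient with an MES of 1 therefore counts toward endoscopic improvement but not endoscopic remission, which is the distinction whose prognostic relevance the long-term data are intended to examine.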

A Vast Biobank of Patient Samples

The breadth of this program will not only comprehensively evaluate the safety and efficacy of etrolizumab but can also help address unanswered clinical questions. The etrolizumab clinical program will generate an extensive repository of samples (blood, biopsy, and stool) from more than 3000 patients, collected longitudinally throughout the trials, and data generated from these trials will allow for the investigation of additional measures of prognosis and response. Data from a broad patient population will be available to explore the correlation of disease activity assessments, evaluate the effect of various endpoints on long-term outcomes, and validate new tools to predict treatment outcome and improve patient prognosis.

In summary, the etrolizumab phase 3 clinical program is ongoing and will comprehensively characterize etrolizumab and inform its place in therapy. Although there remain many gaps in knowledge in the clinical landscape of IBD, information generated by this large program will improve the ability of clinical trials to identify effective treatments in IBD and will provide evidence to help guide clinicians to optimally care for their patients.