Adults diagnosed with mCRPC between January 1, 2013 and March 31, 2019 were identified from the US Flatiron Health Electronic Health Record (EHR)-derived de-identified database, an oncology EHR database that has previously been used to investigate treatment patterns and outcomes in mCRPC [10, 11, 20]. The US Flatiron Health database is a longitudinal database comprising de-identified patient-level structured and unstructured data, curated via technology-enabled abstraction. Patients in Flatiron data have been shown to be similar in sex and geographic distribution to patients in US cancer registries, such as the Surveillance, Epidemiology, and End Results Program or the National Program of Cancer Registries, but appeared to be diagnosed at later stages of disease and had a different age distribution than patients in the cancer registries.
The Flatiron Health EHR-derived de-identified database includes normalized data elements (e.g., demographics, diagnoses, visits, laboratory/vitals results, medication administration/orders/prescriptions, performance status, insurance), enhanced data elements curated specifically for the mCRPC target population (e.g., date of the initial PC diagnosis, cancer stage at the initial PC diagnosis, Gleason score at the initial PC diagnosis, date of first metastasis, date of CRPC diagnosis, and LOTs), and month/year of death.
All database records were statistically de-identified and certified as fully compliant with US patient confidentiality requirements outlined in the Health Insurance Portability and Accountability Act of 1996. Since this study relied exclusively on de-identified patient records and did not involve the collection, use, or dissemination of individually-identifiable data, institutional review board approval was not necessary.
Study Design and Population
A retrospective longitudinal observational cohort design was used (Supplemental Fig. 1). The study sample included males aged 18 years and older with mPC and CRPC diagnoses between January 1, 2013 and March 31, 2019 and adenocarcinoma histology (Supplemental Fig. 2). The date of mCRPC diagnosis was defined as either the date of mPC or the date of CRPC diagnosis, whichever came later. If the mPC date preceded the CRPC date, the patient was assumed to have transitioned to mCRPC from the metastatic hormone-sensitive PC clinical state (mHSPC); if the CRPC date preceded the mPC date, the patient was assumed to have transitioned to mCRPC from the non-metastatic CRPC clinical state (nmCRPC).
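The index-date rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's code (the analyses were run in SAS); the names `McrpcIndex` and `derive_mcrpc_index` are hypothetical, and the handling of identical mPC and CRPC dates is an assumption, since the text does not address that case.

```python
from datetime import date
from typing import NamedTuple


class McrpcIndex(NamedTuple):
    index_date: date   # date of mCRPC diagnosis (index date)
    prior_state: str   # clinical state the patient is assumed to come from


def derive_mcrpc_index(mpc_date: date, crpc_date: date) -> McrpcIndex:
    """Return the mCRPC date (the later of the two dates) and the assumed prior state."""
    index_date = max(mpc_date, crpc_date)
    if mpc_date < crpc_date:
        # Metastasis documented first, castration resistance later.
        prior_state = "mHSPC"
    elif crpc_date < mpc_date:
        # Castration resistance documented first, metastasis later.
        prior_state = "nmCRPC"
    else:
        # Assumption: same-day dates are not covered by the text.
        prior_state = "undetermined"
    return McrpcIndex(index_date, prior_state)
```

For example, a patient with an mPC date of June 1, 2015 and a CRPC date of February 1, 2016 would be indexed on February 1, 2016 and classified as progressing from mHSPC.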
Baseline characteristics were measured from the initial PC diagnosis to the mCRPC diagnosis date, as recorded in the EHR data. Patients were followed from the mCRPC diagnosis date to the end of observed clinical activity in Flatiron data or death. All patients had at least 1 month of follow-up post-index by design (Supplemental Figs. 1, 2). No minimum number of documented treatments was required at study entry.
Outcomes and Measurements
Patient demographic and clinical characteristics were measured during the baseline period and at the index date, based on information available in the normalized and enhanced Flatiron Health EHR-derived de-identified data. The proportion of patients treated at oncology centers in community settings versus academic centers was reported.
Treatment patterns pre-mCRPC diagnosis included the different classes of treatments used between the PC diagnosis and mCRPC diagnosis, both overall and during relevant pre-mCRPC clinical states [i.e., non-metastatic hormone-sensitive PC (nmHSPC), mHSPC, and nmCRPC clinical states]. A clinical states model has been previously used to track the progression of disease from the initial PC diagnosis to mCRPC diagnosis and to identify patient populations most in need of treatment [23,24,25].
Treatment patterns post-mCRPC diagnosis included the number of observed LOTs per patient, LOT sequences post-mCRPC diagnosis by treatment class (identified using a PC-specific proprietary algorithm), regimens by LOT (class and agent-level), time between the date of mCRPC diagnosis and first line (1L) start, and time on and off treatment for LOTs by class. The number of observed LOTs per patient was evaluated both overall and among the subgroup of patients who died during the study period. For the latter, the number of observed LOTs per patient was not affected by censoring due to end of data availability, and thus provides a more accurate picture of the total number of LOTs the patient received before succumbing to the disease.
For the regimens by LOT analyses, the following five treatment groups were analyzed at class level: NHA (i.e., abiraterone, enzalutamide, apalutamide, and darolutamide), chemotherapy (e.g., docetaxel, cabazitaxel), Sip-T, other therapy (i.e., thalidomide, BCG vaccine, nivolumab, atezolizumab, lenalidomide, durvalumab, ipilimumab, pembrolizumab, targeted therapies, radium-223, and clinical study drugs; non-NHA hormonal therapies and bone therapy agents were not included, consistent with the LOT algorithm), and any combination of the four classes above.
For all classes of therapy, time on a given LOT was defined as the time from the first order/administration to the last order/administration of the agents in the LOT (for chemotherapy, this definition is conservative, as the full cycle of the last administration was not included), irrespective of censoring. Similarly, time off treatment was defined as the time between the last order/administration for a given LOT and the first order/administration for the next LOT. Given that the last observed LOT may be censored due to the end of data availability, and its duration thus underestimated, the proportion of patients whose last LOT was censored is also reported. When the last LOT was an IV therapy, censoring was defined as end of data availability (due to end of clinical activity or death) within 62 days of the last administration of the IV therapy; when the last LOT was an oral therapy, censoring was defined as end of data availability within 90 days of the last day of supply of the oral therapy.
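The time-on, time-off, and last-LOT censoring definitions above can be illustrated with the following Python sketch. It is a sketch under stated assumptions rather than the study's implementation: the `Lot` record and all function names are hypothetical, durations are returned in days (the text reports months but does not specify the conversion), "within N days" is read as a gap of at most N days, and an oral LOT without a recorded supply end date falls back to the last order date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Lot:
    first_order: date                       # first order/administration in the LOT
    last_order: date                        # last order/administration in the LOT
    route: str                              # "iv" or "oral"
    last_supply_day: Optional[date] = None  # oral only: last day of supply


def days_on(lot: Lot) -> int:
    """Time on a LOT: first to last order/administration."""
    return (lot.last_order - lot.first_order).days


def days_off(current: Lot, nxt: Lot) -> int:
    """Time off treatment: last order of one LOT to first order of the next."""
    return (nxt.first_order - current.last_order).days


def last_lot_censored(lot: Lot, end_of_data: date) -> bool:
    """Flag the last LOT as censored if data end soon after the last exposure."""
    if lot.route == "iv":
        anchor, window = lot.last_order, 62
    else:
        # Assumption: fall back to the last order date if no supply end is recorded.
        anchor, window = lot.last_supply_day or lot.last_order, 90
    return (end_of_data - anchor).days <= window
```

Under this reading, an IV LOT whose last administration was on July 1 is flagged as censored if data availability ends on August 15 (45 days later), but not if it ends on December 1.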
Finally, OS was assessed by LOT, after the start of 1L, 2L, and 3L, respectively, accounting for censoring. We also assessed whether patients who move rapidly through multiple LOTs have worse OS than patients who advance more slowly through multiple LOTs.
Baseline characteristics and treatment patterns were analyzed using descriptive statistics [i.e., proportions, means, standard deviations (SD), medians, and interquartile ranges (IQR)]. The median time on and off treatment (months) and the corresponding IQRs were assessed across patients' LOT sequences based on the patients' observed time on/off treatment during the study period, irrespective of censoring.
OS was analyzed using time-to-event analyses among patients who did not have other primary malignancies occurring concurrently with mCRPC and who did not participate in randomized controlled trials (RCTs). This analysis relied on a composite mortality variable described previously. Kaplan–Meier (KM) analyses were used to assess OS after 1L, 2L, and 3L start and to derive median OS and OS rates. Patients without a death event after each LOT start were censored at the end of observed clinical activity in Flatiron Health EHR-derived de-identified data. In addition, we reported results from a multivariate Cox proportional-hazards regression model that assessed the impact of LOT number post-mCRPC diagnosis (i.e., 1L, 2L, 3L, or 4L+; modeled as a time-dependent exposure), adjusted for the following covariates measured at or before mCRPC diagnosis (modeled as fixed, i.e., time-independent, covariates): age, time from initial PC diagnosis to mCRPC diagnosis, ECOG performance status, disease progression to mCRPC, Gleason score, hemoglobin level, lactate dehydrogenase level, albumin level, serum alkaline phosphatase level, prostate-specific antigen (PSA) level, site of metastasis, opioid use, cancer stage, prostatectomy/surgery, and radiation. The hazard ratios for this time-dependent exposure compared the risk of death between patients who had more versus fewer LOTs at each point in time post-mCRPC diagnosis, adjusting for potential confounders. Univariate Cox regression models were also fitted for both the main exposure and each covariate.
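To make the KM step concrete, the following Python sketch computes the standard product-limit estimate and the median OS from follow-up times measured from a LOT start. It illustrates the general technique only and is not the SAS procedure used in the study; the function names are hypothetical, and the median is taken here as the first death time at which estimated survival falls to 0.5 or below, which is one common convention.

```python
from typing import List, Optional, Tuple


def kaplan_meier(durations: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return [(time, S(time))] at each distinct death time.

    durations: follow-up time from LOT start to death or censoring.
    events:    True if the patient died, False if censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which estimated survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Censored patients contribute to the risk set up to their last observed activity but never trigger a drop in the curve, which is how the estimate accounts for incomplete follow-up.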
All analyses were performed using SAS Enterprise Guide v.7.1 (SAS Institute, Cary, NC, USA).