Introduction

Lung cancer is the leading cause of cancer death worldwide, and most lung cancers are non-small cell lung cancer (NSCLC) (Jemal et al. 2010). More than half of all NSCLCs are diagnosed at an advanced stage; therefore, prognosis is often poor (National Cancer Institute 2013). In recent decades, platinum-based doublet chemotherapy has been used as first-line treatment for patients with advanced NSCLC (Hotta et al. 2004; Lilenbaum et al. 2005; D’Addario et al. 2005; Azzoli et al. 2009). However, such treatment is associated with only modest improvements in overall survival (OS) and quality of life (QOL) (Schiller et al. 2002; Rajeswaran et al. 2008). New treatment strategies, particularly the introduction of molecular-targeted agents and appropriate patient selection based on histology and/or genotyping, have resulted in marked progress in recent years, and OS in advanced NSCLC patients has improved (Ramalingam et al. 2011; Kobayashi and Hagiwara 2013; Kaneda et al. 2013). Although there are reports in the literature of OS of over 24 months (Arrieta et al. 2010; Maemondo et al. 2010; Niho et al. 2012), currently available treatments are not able to cure; attempts to find a curative treatment are ongoing (Black and Morris 2012; Berardi et al. 2013; Bayraktar and Rocha-Lima 2013).

OS is the gold standard endpoint for clinical trials assessing the efficacy of new drugs for the treatment of NSCLC (Food Drug Administration 2011). OS is defined as the time from random assignment of the first treatment to death. Measuring OS as an endpoint in clinical trials requires a large number of patients and increasingly long follow-up, thus potentially increasing the cost of development and delaying time to approval. Additional issues with OS as an endpoint include the confounding impact of therapies given upon progression, and death not related to cancer (Garon 2012). Progression-free survival (PFS) has become an accepted alternate endpoint in assessing efficacy in advanced NSCLC (Food Drug Administration 2011; Garon 2012; Johnson et al. 2011). However, there are examples of improvement in PFS without an OS benefit (Lima et al. 2009), and an OS benefit without PFS improvement (Cheema and Burkes 2013). The recent development of multiple lines of therapy for patients with advanced NSCLC who progress after first-line treatment has increased post-progression survival (PPS). Moderate to large improvements in PPS have reduced or eliminated the OS benefit in comparative clinical trials of first-line treatment, even when a significant PFS benefit is demonstrated (Lima et al. 2009). Therefore, PFS is probably the only rational endpoint for current clinical trials, particularly those with a crossover design (Booth and Eisenhauer 2012; Mok 2011).

Hotta et al. (2011) and Hayashi et al. (2012) reported that PPS was highly associated with OS in first-line chemotherapy for advanced NSCLC, whereas the correlation between PFS and OS was moderate. This relationship was also found in second- and third-line chemotherapy for advanced NSCLC (Hayashi et al. 2013). PPS may be one of the factors reducing the OS benefit, but little is known regarding what factors other than PPS affect the association between OS and PFS. The aim of the present study was to identify factors affecting the association between OS and PFS, particularly those causing longer OS than that estimated based on PFS.

Materials and methods

Trial selection and database construction

Controlled trials for first-line treatment of advanced NSCLC published between January 1, 2003 and December 31, 2012 were identified through a systematic search of PubMed. Keywords used were “controlled clinical trial,” “first-line treatment” and “NSCLC.” The results were limited to articles published in English.

All retrieved abstracts were reviewed in accordance with pre-specified inclusion and exclusion criteria. Included studies were randomized phase II or III clinical trials of first-line therapy of advanced NSCLC that presented results for both OS and either PFS or time to progression (TTP). Studies were excluded if the treatment was adjuvant, maintenance or second-line therapy. Studies that investigated only immunotherapy regimens and those that were designed to assess combined modality treatment including radiation therapy and surgery were also excluded. To avoid bias, two observers (MA and MK) independently abstracted the data from the articles.

Data abstraction

Abstracted data included publication year and reference, patient characteristics (age, gender, race, Eastern Cooperative Oncology Group Performance Status (PS), disease stage, histological type of NSCLC, smoking history, and genotype), treatment information (chemotherapy and/or molecular-targeted regimen), trial characteristics (study phase, study period, region (with/without Asian countries), number of countries, number of sites, and number of patients), and efficacy information (median PFS, TTP, and OS in months). Percentages of males, PS 1, disease stage IV, and squamous cell carcinoma were used as variables for gender, PS, disease stage, and histological type of NSCLC, respectively. OS and either PFS or TTP were determined for all the treatment arms using published data or survival curves. PFS and TTP were collectively referred to as PFS in order to increase the sample size, as done in recent reports (26,27). PPS was defined as OS minus PFS for each trial.

Data analysis

To assess the correlation between OS and either PFS or PPS, we used Spearman’s rank correlation coefficient (r). We conducted a simple linear regression analysis to obtain a regression line between PFS and OS and calculated the estimated OS from the PFS based on the regression equation. According to the ratio of observed OS/estimated OS, the treatment arms were classified into three groups: <0.8 (OS-reduced group), 0.8–1.2 (OS-association group), and >1.2 (OS-extended group). Factors influencing higher and lower than predicted OS were initially examined by univariate logistic regression analysis using a fixed-effect model, comparing the OS-extended group with the OS-association group and the OS-reduced group with the OS-association group. After identifying potential influencing factors, multivariate logistic regression analysis was conducted in a stepwise fashion to further investigate factors that contribute to OS extension. A p value < 0.05 was considered statistically significant throughout the analyses except where otherwise noted. The analyses were conducted using StatsDirect software (ver. 2.7.9; StatsDirect Ltd. UK).
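The first steps of this analysis (deriving PPS, computing Spearman’s r, and fitting the PFS-to-OS regression line) can be sketched as follows. This is a minimal illustration only: the study itself used StatsDirect, the arm-level values below are invented rather than taken from any trial, and all function and variable names are hypothetical.

```python
# Illustrative sketch only: the values below are invented, not trial data.

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


# Hypothetical median PFS and OS (months) for five treatment arms.
pfs = [3.0, 4.5, 5.0, 6.2, 9.5]
os_months = [8.0, 9.0, 10.5, 12.0, 27.0]
pps = [o - p for o, p in zip(os_months, pfs)]  # PPS = OS minus PFS

r_os_pfs = spearman(os_months, pfs)
r_os_pps = spearman(os_months, pps)
intercept, slope = fit_line(pfs, os_months)
estimated_os = [intercept + slope * p for p in pfs]
```

Ranks are averaged across ties, which is the usual convention for Spearman’s coefficient; whether StatsDirect applies further tie corrections is not stated in the text.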

Results

Characteristics of the trials

A total of 175 potentially relevant trials were identified. Initially, 46 trials were excluded for at least one of the following reasons: other malignancies, non-randomized, phase I/II, review articles, combination analyses, subgroup analyses, and duplicate references. A further 46 trials were excluded because the trials were in a second-line setting or involved maintenance therapy after first-line treatment. Finally, after excluding trials without information on the necessary endpoints (OS, PFS or TTP) or patient baseline characteristics, 65 trials were considered to be highly relevant for the present study. The selection process for the randomized controlled trials is shown in Fig. 1.

Fig. 1 Flow chart showing the progress of trials through the review

The main characteristics of the 65 trials are listed in Table 1. A total of 140 treatment arms and 23,337 patients with advanced NSCLC were included. The median number of patients per trial was 70.5 (range 20–863), with most trials having a high proportion of males (71.0 %: range 20.5–95.7 %). The average median age was 61.1 years (range 56–78). Among the 140 treatment arms, there were 86 chemotherapy arms, 7 molecular-targeted therapy arms, and 47 combination therapy arms. Eighty-three arms were from phase II trials and 57 from phase III trials. Trials were classified into two groups by the year in which the trial started, considering the timing of the introduction of molecular-targeted agents: between 1998 and 2003 (59 arms), and between 2004 and 2008 (67 arms). The median OS was 9.9 months (range 3.5–30.5), and median PFS was 5.0 months (range 1.7–10.8) for all arms. Information on the number of participating countries and the race of trial subjects was very limited. There were limited data on genotyping of NSCLC and treatment after progression.

Table 1 Characteristics of 140 treatment arms in the 65 trials

Relation between OS and either PFS or PPS

The relation between OS and either PFS or PPS for the 140 arms is shown in Fig. 2. PPS was strongly associated with OS (Spearman’s r = 0.841, p < 0.0001), whereas PFS was more moderately associated with OS (r = 0.689, p < 0.0001). The regression line between OS and PFS was:

$$\text{OS} = 1.801 + 1.749 \times \text{PFS} \quad \left( r^{2} = 0.439 \right)$$
Fig. 2 Correlation between overall survival (OS) and either (a) progression-free survival (PFS) or (b) post-progression survival (PPS) for 140 arms of 65 clinical trials of first-line treatment for patients with advanced NSCLC. The coefficients of correlation (r) between OS and either PFS or PPS were 0.662 and 0.935, respectively. The size of each circle is proportional to the number of patients in the corresponding arm

Characteristics of the OS-extended group

Based on the ratio of the observed OS to the estimated OS, we classified the treatment arms into three groups: OS-reduced group (ratio: <0.8), OS-association group (ratio: 0.8–1.2), and OS-extended group (ratio: >1.2). Characteristics of the three groups are summarized in Table 2. There were 20 arms (14.3 %) from 14 trials in the OS-extended group (Arrieta et al. 2010; Maemondo et al. 2010; Niho et al. 2012; Hirsch et al. 2011; Park et al. 2007; Mok et al. 2009a, b; Heymach et al. 2008; Grossi et al. 2012; Gridelli et al. 2007; Gebbia et al. 2010; Chen et al. 2007; Lilenbaum et al. 2008; Ramlau et al. 2008; Table 3). The range of the observed OS/estimated OS ratio was 1.22–2.75. In the OS-association group, there were 94 arms of 53 trials, and in the OS-reduced group, there were 26 arms of 18 trials. After excluding the arms from the OS-extended group, the correlation between OS and PFS improved (Spearman’s r = 0.789, Supplement Figure 1).
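As a worked illustration of this classification, the sketch below applies the regression line reported above (intercept 1.801, slope 1.749) and the three ratio cut-offs to a single arm. The function and constant names are hypothetical, and the example input simply reuses the overall median PFS (5.0 months) and median OS (9.9 months) reported for all arms; it does not represent any specific trial arm.

```python
# Coefficients taken from the regression line reported in the text;
# the names used here are hypothetical.
INTERCEPT = 1.801
SLOPE = 1.749


def estimated_os(pfs_months):
    """Estimated OS (months) from PFS using the reported regression line."""
    return INTERCEPT + SLOPE * pfs_months


def classify_arm(pfs_months, os_months):
    """Return (observed/estimated OS ratio, group label) for one arm."""
    ratio = os_months / estimated_os(pfs_months)
    if ratio < 0.8:
        return ratio, "OS-reduced"
    if ratio > 1.2:
        return ratio, "OS-extended"
    return ratio, "OS-association"


# Illustration: median PFS and median OS reported across all arms.
ratio, group = classify_arm(5.0, 9.9)
print(f"estimated OS = {estimated_os(5.0):.3f} months, "
      f"ratio = {ratio:.2f}, group = {group}")
```

With these inputs the estimated OS is 1.801 + 1.749 × 5.0 = 10.546 months, so the ratio is about 0.94 and the arm falls in the OS-association group.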

Table 2 Characteristics of each category classified by the ratio of observed OS/estimated OS
Table 3 Design and characteristics of trials of the OS-extended group

Statistically significant differences on univariate logistic regression analysis between the OS-association group and the OS-extended group were found in the following variables: study phase, area, region, number of sites per trial, number of patients per arm, average age, the proportion of males, percentage of squamous cell carcinoma, and smoking history (Table 2).

Exclusion of arms from the OS-reduced group did not improve correlation between OS and PFS. Therefore, further analysis of the OS-reduced group was not conducted.

Identification of factors influencing OS extension

We selected types of drugs (chemotherapeutic agent, others), study period (1998–2003, 2004–2008), number of patients (<150, ≥150), average age (<63, ≥63), percentage of males (<70, ≥70), percentage of patients with PS 1 (<60, ≥60), percentage of patients with stage IV disease (<80, ≥80), and percentage of patients with squamous cell carcinoma (<30, ≥30) as potential influencing factors (Supplement Table 1). Region, number of sites, and smoking history were excluded from the logistic regression analyses because the number of arms with this information was small.

On univariate logistic regression analyses using these potential factors, we identified variables such as number of patients, average age, percentage of males, and histological cancer type as factors potentially influencing extension of OS with statistically significant association (Table 4). On further multivariate logistic regression analyses, number of patients less than 150 per study arm, average age younger than 63 years, and percentage of patients in the study arm with squamous cell carcinoma of <30 % were identified as statistically significant influencing factors for extended OS as shown in Table 4.
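For a single dichotomized factor, the odds ratio from a univariate logistic regression equals the cross-product ratio of the corresponding 2 × 2 table, and its Wald confidence interval follows from the standard error of the log odds ratio. The sketch below illustrates this calculation. The counts are invented for illustration (only the group totals of 20 and 94 arms come from the text), and the function name is hypothetical; the actual estimates are those reported in Table 4.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95 % CI from a 2 x 2 table.

    a: OS-extended arms with the factor present
    b: OS-association arms with the factor present
    c: OS-extended arms with the factor absent
    d: OS-association arms with the factor absent
    """
    or_value = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z * se_log_or)
    upper = math.exp(math.log(or_value) + z * se_log_or)
    return or_value, lower, upper


# Hypothetical split of the 20 OS-extended and 94 OS-association arms
# by a binary factor such as "<150 patients per arm" (counts are invented).
or_value, lower, upper = odds_ratio(a=12, b=30, c=8, d=64)
print(f"OR = {or_value:.2f} (95 % CI {lower:.2f} to {upper:.2f})")
```

A confidence interval that excludes 1 corresponds to p < 0.05 on the Wald test. The multivariate stepwise analysis adjusts each factor for the others and cannot be reproduced from 2 × 2 tables alone.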

Table 4 Influencing factors identified by univariate and multivariate analysis for the OS-extended group

Discussion

New treatment strategies, particularly the introduction of molecular-targeted agents and appropriate patient selection based on histology and/or genotyping, have progressed markedly in recent years, and the OS in advanced NSCLC patients has improved (Ramalingam et al. 2011; Kobayashi and Hagiwara 2013; Kaneda et al. 2013). There are now examples of improvement in PFS without an OS benefit (Lima et al. 2009), and an OS benefit without PFS improvement (Cheema and Burkes 2013). The recent development of multiple lines of therapy for patients with advanced NSCLC who progress after first-line treatment has been associated with an increase in PPS.

We confirmed that OS was more strongly associated with PPS than with PFS among 140 arms of 65 phase II and III clinical trials for first-line treatment of advanced NSCLC. Our results are similar to those previously reported (Hotta et al. 2011; Hayashi et al. 2012). These strong associations between PPS and OS have also been shown in the first-line treatment of advanced colorectal (Petrelli and Barni 2013) and breast cancers (Saad et al. 2010a, b).

Broglio and Berry (2009) demonstrated that PPS has an important impact on the association between PFS and OS. When PPS is short, a PFS benefit results in a statistically significant OS benefit; however, moderate to long PPS reduces or eliminates the OS benefit in comparative clinical trials of first-line treatment, even if a significant difference in PFS is observed in randomized trials. If OS is used as the primary endpoint, subsequent treatments given after the experimental treatment should be considered in clinical trial design (by collecting data or defining subsequent treatment options) because PPS may be a potential confounding factor. However, none of the reports we reviewed mentioned these details.

Improving OS remains the gold standard of clinical trials. However, when an OS benefit is diluted and masked by longer PPS, OS may not be the most appropriate primary endpoint for assessing the clinical effect of first-line treatment. Although PFS is not a very good surrogate of OS, particularly when PPS is long, PFS should be considered an attractive endpoint, because it is available earlier than OS, is less influenced than OS by competing causes of death, and is not influenced by PPS. Notably, in advanced breast cancer, there have been few phase III trials where OS was used as the primary endpoint (Verma et al. 2011). It has become increasingly common for PFS or TTP to be used as a primary endpoint in recent phase III randomized trials of first-line treatment for advanced breast cancer (Saad et al. 2010a, b; Saad and Katz 2009) and metastatic colorectal cancer (Saad et al. 2010a). In these cancers, it is well known that subsequent lines of therapy play a major role in determining OS (Saad et al. 2010a, b; Tang et al. 2007; Chirila et al. 2012).

In advanced NSCLC, PFS has not been accepted as a surrogate for OS (Laporte et al. 2013). However, the US Food and Drug Administration has a draft guidance regarding the use of PFS as a clinical endpoint, which is likely to be accepted if the observed magnitude of effect is substantial and robust (Food Drug Administration 2011). The recent progress of multi-line treatment after first-line treatment for advanced NSCLC has provided longer PPS, which has resulted in a reduced OS benefit when OS is the primary endpoint, as already seen in advanced breast and colorectal cancers (Cufer et al. 2013), and PFS is a common primary endpoint in current randomized clinical trials. A search of clinical trial databases by Soria et al. (2010) found that more than 150 trials use PFS as the primary endpoint in stage III/IV NSCLC. Schrimpf et al. (2013) propose that PFS, supplemented by measures such as patient-reported outcomes (e.g., QOL) and/or treatment toxicity, could capture the clinical benefit in NSCLC studies of individualized therapies with clear patient selection. Mandrekar et al. (2010) reported that PFS or failure-free survival at 12 weeks was a stronger predictor of subsequent patient survival than tumor response and proposed that this be used routinely as an endpoint in phase II trials for advanced NSCLC. This could lead to more accurate assessment of the true efficacy of new drugs for advanced NSCLC.

Our meta-analysis, which found a moderate association between PFS and OS, suggests that PFS is not an appropriate surrogate for OS. We conducted a subgroup analysis to identify factors associated with OS longer than that estimated from PFS. Among 65 trials, we identified 20 arms of 14 trials as the OS-extended group, wherein the observed OS was more than 20 % longer than the estimated OS based on PFS. When these 20 arms were excluded from the analysis, the correlation between OS and PFS improved. It was noteworthy that there were factors in addition to PPS involved in reducing the association. Univariate analysis identified four variables significantly relevant to OS extension (p < 0.05): number of patients (<150/arm), mean age (<63 years), percentage of males (<70 %), and histology of NSCLC (<30 % of squamous cell cancer). Multivariate analysis showed that three of these four variables were significantly associated with the OS-extended group: number of patients less than 150 per arm, mean age younger than 63 years, and squamous cell cancer <30 %.

The most widely accepted prognostic determinants are disease stage and PS (Mountain 1997). Male gender, age older than 60 years, non-squamous histology, smoking, and weight loss are known to be prognostic factors (Charloux et al. 1997). Therefore, it is not unexpected that age younger than 63 years and squamous cell carcinoma <30 % were influencing factors of OS extension in our findings. PS is a valid prognostic factor (Belbaraka et al. 2010), but was not identified as such in our study. This might be due to the fact that mainly PS 0 or PS 1 patients were enrolled in most of the clinical trials (PS 2 was only seen in 4.2 % of all patients). The percentage of smokers was excluded from the regression analyses because of small sample size, but this could be expected to have an influence on OS in NSCLC clinical trials. Interestingly, no effect of molecular-targeted agents on OS extension was observed, as no statistical significance was observed in either the treatment regimen or the period in which the trials were conducted. The number of trials in which Asian countries were involved was significantly greater in the OS-extended group than in the OS-association group. However, this factor was not analyzed further due to limited sample size. Participation of Asian countries in recent global clinical trials has been increasing, and an effect of region and race might need to be considered in future clinical trials.

When PFS is used as the primary endpoint in phase II trials and OS is used in phase III trials, our results suggest factors to be considered in protocol design in order to elucidate the true clinical benefit of experimental drugs for the first-line treatment of advanced NSCLC: (1) increasing the number of sites and the number of patients, which would improve the association between OS and PFS, and (2) adjusting for patient baseline characteristics, particularly those relevant to prognostic factors, in both phase II and phase III trials.

Several technical limitations of our study should be acknowledged. First, our study was based on published trial-level summaries rather than individual patient data, and many complex conditions were involved. Second, there was a limitation in terms of available parameters. Conducting trials in Asia seemed to influence OS extension, but the small sample size meant that this could not be analyzed further. Finally, data on subsequent treatment after progression were not available; such data may be very important when PPS is considered as a potential confounder.

In conclusion, we identified number of patients and well-known prognostic factors including age and histological cancer type as factors influencing longer OS. These factors should be considered for patient eligibility, when PFS is used as a surrogate primary endpoint for OS in randomized clinical trials of first-line treatment for patients with advanced NSCLC.