Background

In Europe, prostate cancer (PC) is the second most frequent cancer in men, with an incidence of 9.55 per 1000 person-years when men are invited to screening and 6.23 per 1000 person-years otherwise [1]. Early diagnosis, improved by PSA testing, has recently allowed a better estimation of its incidence [2]. Over the last decades, much progress has been made in the treatment of patients with PC, partly explained by improved prediction of disease progression based on scoring systems [3, 4]. The objectives of assessing a PC patient’s risk of future adverse health events are i) to avoid over-treatment of patients at low risk of PC-related recurrence or death, and ii) to avoid under-treatment of high-risk patients.

Although guidelines are available for such stratified medical decision making [5, 6], some questions remain unresolved. One of the main issues concerns the trade-offs between the benefits and the costs of possible treatment options in terms of both survival and health-related quality of life (HRQoL). Several studies have shown that PC patients are ready to make trade-offs between their quantity and their quality of life [7,8,9,10], especially when they are given balanced information about the different treatment options [11]. For instance, elderly patients may never experience disease progression to the metastatic stage during their remaining lifetime [12], while treatments aiming at preventing disease progression can substantially deteriorate their HRQoL [13]. Younger men may also prefer interventions that preserve their HRQoL, at the potential cost of a shorter progression-free survival. From a patient-centered medical decision-making perspective, treatments should therefore be compared against each other by weighing their benefits and costs in terms of both survival and HRQoL.

In this context, we conducted a mini-review. This type of study provides a focused review of the literature, the main objective being to raise questions or to suggest new hypotheses for research. We aimed to examine whether the trade-offs between survival and HRQoL are considered in journals with a high impact factor, and to suggest recommendations for future studies based on patient-centered endpoints.

Methods

Literature search strategy

A literature search was conducted in the PubMed database for papers published between May 01, 2013, and May 01, 2015. In order to obtain a picture of the main trends in the medical literature, we focused on nine prominent journals in oncology or general medicine (impact factor ≥ 15 in 2013). We used « prostatic neoplasms » as the Medical Subject Headings (MeSH) term and « randomized controlled trial » as the publication type. The search query used in PubMed is presented in Additional file 1. The PRISMA-P (Preferred Reporting Items for Systematic review and Meta-Analysis Protocols) checklist is provided in Additional file 2.

Data extraction

All papers resulting from this search were reviewed independently and in a blinded manner by two reviewers (Y Foucher, M Lorent, or E Dantan). The first task was to exclude papers that did not report randomized controlled trials, that were not original works, that had no patient follow-up, or that presented non-comparative analyses. The second task was to collect the following characteristics from the selected papers: the study design, the patients’ inclusion criteria, the patients’ maximum follow-up duration, the compared treatments, the sample size in each arm, the endpoints, the statistical methods used, any reference to the results of an additional paper, and the financial support. Disagreements between reviewers were resolved by discussion. We used Zotero to manage the records.

Results

Retained studies

The PubMed search identified 42 papers (Fig. 1). Because we only considered randomized clinical trials comparing at least two interventions, 12 publications were excluded: six re-analyses of clinical trials evaluating the prognostic capacities of markers or models [14,15,16,17,18,19]; one study related to body mass index (no comparison of treatments) [20]; one study without patients’ follow-up [21]; one paper without original results [22]; one case-cohort study [23]; one study without a control group [24]; and one diagnostic study [25]. Finally, 30 papers [1, 26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54] were retained and are described in Table 1. As detailed in the last column, entitled “other results”, two papers referred to the trial NCT00887198 [27, 47] and three papers referred to the trial NCT00699751 [37, 42, 48].

Fig. 1 Flow diagram of the literature search strategy and the patient-reported outcomes used

Table 1 Description of the 30 studies comparing at least two interventions in a population of patients with, or at risk of, PC

Collected endpoints

Among the 30 papers, only 8 [26,27,28, 33, 35, 42, 43, 53] were partially based on the collection of Patient Reported Outcomes (PRO). Their median follow-up was 38 months (range from 12 to 52 months) versus 54 months (range from 3 months to 18 years) in the 22 remaining papers. Among these 8 papers, six [27, 28, 33, 35, 42, 53] compared the consequences of the treatments on patient HRQoL using the Functional Assessment of Cancer Therapy-Prostate (FACT-P) questionnaire [55, 56]. The FACT-P is an internationally validated questionnaire specifically designed to assess the HRQoL of men with PC. It is derived from the FACT-General (FACT-G) questionnaire, with an additional subscale of 12 items specific to PC (the Prostate Cancer Subscale, PCS). The FACT-G is a 27-item self-report questionnaire measuring general HRQoL in cancer patients (regardless of the tumor type). A high FACT-P total score indicates better HRQoL. Note that some indexes are also derived from the FACT-P: the Trial Outcome Index (TOI), based on the physical and functional well-being subscales of the FACT-G and the PCS, and the FACT Advanced Prostate Symptom Index (FAPSI), which includes eight items from the FACT-P. The two remaining papers compared the interventions in terms of specific PRO: Araujo et al. [26] assessed the patients’ pain with the Short Form of the Brief Pain Inventory (BPI-SF) [57, 58], while Pisansky et al. [43] focused on sexual disorders with the International Index of Erectile Function [59], the Sexual Adjustment Questionnaire [60], and the Locke Marital Adjustment Test [61]. Among the six papers using the FACT-P questionnaire, two also employed the BPI-SF questionnaire [27, 35]. Note that only the study by Basch et al. [27] presented a PRO measure (pain intensity) as the primary endpoint. Nevertheless, this paper referred to the same randomized clinical trial initially reported by Ryan et al. [47], which was designed (in particular regarding the sample size determination) using co-primary endpoints: the radiographic progression-free survival and the overall survival. Therefore, among the 27 trials included in the review, none was specifically designed to analyze the consequences of interventions in terms of HRQoL as a primary endpoint.
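As an illustration of how the scores described above relate to each other, the following minimal sketch combines already-computed subscale scores into the FACT-G score, the FACT-P total score and the TOI. The function name, the dictionary keys and the example values are hypothetical, and item-level scoring (including any reverse-coding or handling of missing items) is deliberately left out; the official scoring manual should be followed for real data.

```python
def fact_p_summary(subscales):
    """Combine subscale scores into FACT-G, FACT-P total and TOI.

    `subscales` is a dict with hypothetical keys:
      pwb (physical), swb (social/family), ewb (emotional),
      fwb (functional) well-being, and pcs (Prostate Cancer Subscale).
    The values are assumed to be subscale scores already computed
    from the individual items.
    """
    required = ("pwb", "swb", "ewb", "fwb", "pcs")
    missing = [key for key in required if key not in subscales]
    if missing:
        raise KeyError(f"missing subscale(s): {missing}")

    # FACT-G: the four general well-being subscales.
    fact_g = sum(subscales[key] for key in ("pwb", "swb", "ewb", "fwb"))
    # FACT-P total: FACT-G plus the prostate-specific subscale.
    fact_p_total = fact_g + subscales["pcs"]
    # TOI: physical and functional well-being plus the PCS.
    toi = subscales["pwb"] + subscales["fwb"] + subscales["pcs"]
    return {"fact_g": fact_g, "fact_p_total": fact_p_total, "toi": toi}


# Invented subscale scores for one patient at one visit.
example = fact_p_summary({"pwb": 22, "swb": 20, "ewb": 18, "fwb": 19, "pcs": 30})
print(example)
```

Keeping the three summaries in one function makes explicit that the TOI is not independent of the FACT-P total score: both share the physical, functional and PCS components.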

Statistical analyses used to compare consequences in terms of HRQoL

Among the eight papers including some results related to PRO [26,27,28, 33, 35, 42, 43, 53], two main strategies were adopted: i) the analysis of the time to HRQoL change, defined as a relative change from baseline higher than a given percentage, or ii) the absolute difference between the HRQoL means at baseline and at a given post-baseline time.

More precisely, the time to HRQoL change was explored in four papers [27, 28, 33, 35]. The statistical analyses were based on the Kaplan-Meier estimator associated with the Log-Rank test or the Cox model. The definitions considered for the time to HRQoL change were heterogeneous:

  • In the study by Basch et al. [27], the authors studied the time from baseline to: a 10-point decrease of the FACT-P total score, or a 9-point decrease of the FACT-G score, or a 9-point decrease of the TOI.

  • In the study by Beer et al. [28], the authors studied the time from baseline to a 9-point decrease of the FACT-P total score.

  • In the study by Fizazi et al. [33], the authors studied two different endpoints: i) the time to deterioration of symptoms in the FAPSI, and ii) the time to deterioration of HRQoL in the FACT-P total score. In both cases, the threshold used was not specified.

  • In the study by Fizazi et al. [35], the authors studied the time from baseline to a 10-point decrease of the FACT-P total score or death from any cause, whichever occurred first. Note that the authors compared additional HRQoL endpoints, but without taking into account the time-dependent characteristic of the HRQoL: the percentage of patients with at least a 10-point improvement in the FACT-P total score at any post-baseline assessment and the percentages of patients with at least a 3-point improvement in the five FACT-P subscales (physical wellbeing, social or family wellbeing, emotional wellbeing, functional wellbeing, and PCS). The six percentages were compared by using the stratified Cochran-Mantel-Haenszel test.
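To make the definitions listed above concrete, the following minimal sketch derives a time-to-deterioration endpoint from repeated questionnaire scores. It is not the method of any of the cited trials: the function name, the data layout, the "at least `threshold` points" reading of a decrease, and the choice to censor at the last assessment are all assumptions made for illustration. Passing a death time mimics a composite endpoint (deterioration or death, whichever occurs first).

```python
def time_to_deterioration(visits, threshold=10.0, death_time=None):
    """Return (time, event_observed) for one patient.

    visits: list of (time_in_months, score); the earliest visit is
            taken as baseline (assumed layout).
    threshold: decrease from baseline, in points, counted as a
               deterioration (assumed to mean "at least" this many).
    death_time: optional time of death in months; if given, death
                counts as an event (composite endpoint).
    """
    ordered = sorted(visits)
    baseline_time, baseline_score = ordered[0]

    for time, score in ordered[1:]:
        if death_time is not None and time > death_time:
            break  # ignore any record dated after death
        if baseline_score - score >= threshold:
            # First assessment at which the decrease reaches the threshold.
            return (time - baseline_time, True)

    if death_time is not None:
        # No deterioration observed before death: death is the event.
        return (death_time - baseline_time, True)

    # No event: censored at the last available assessment.
    return (ordered[-1][0] - baseline_time, False)


# Invented FACT-P total scores at 0, 3, 6 and 9 months.
deteriorated = time_to_deterioration([(0, 120), (3, 115), (6, 108), (9, 112)])
censored = time_to_deterioration([(0, 120), (3, 118), (6, 114)])
composite = time_to_deterioration([(0, 120), (3, 118), (6, 114)], death_time=8)
print(deteriorated, censored, composite)
```

Because the score is only observed at scheduled visits, the returned time is the visit at which the deterioration was first recorded, not the unobserved moment it occurred; this is the interval-censoring problem that such a simple derivation ignores.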

In the two remaining studies using the FACT-P, Parker et al. [42] compared the mean change in the FACT-P total score from baseline to week 16 (Student t-test), while Vitolins et al. [53] compared the 12-week HRQoL level by considering six different endpoints (analysis of variance): the FACT-P total score, the FACT-G score, the social wellbeing, the physical wellbeing, the emotional wellbeing, the functional wellbeing and the PCS.

Interestingly, the 8 papers partially based on PRO collection [26,27,28, 33, 35, 42, 43, 53] were unevenly distributed between curative and palliative treatments. Among the 12 papers related to curative treatments, only 1 paper (8.3%) collected PRO [43]. In contrast, among the 18 papers related to palliative treatments, 7 papers (38.9%) collected PRO [26,27,28, 33, 35, 42, 53].

Merging the survival and the HRQoL dimensions

All papers analyzed these two dimensions separately, except for two [35, 36]. In the study by Fizazi et al. [35], the time to the first event between the HRQoL decrease and the patient’s death was studied. Heijnsdijk et al. [36] used Quality-Adjusted Life-Years (QALYs) to merge the information about survival and HRQoL in order to perform a cost-effectiveness analysis of PC screening. Nevertheless, in their study, the HRQoL was not individually collected: assumptions were made based on other data published in the literature.

Discussion

In the treatment of PC, the most effective intervention in terms of survival may not necessarily be the best one from the patient’s perspective if the survival gain involves serious HRQoL deterioration due to treatment side effects on sexual, urinary and bowel functions. Thus, in randomized clinical trials, it appears important to describe the trade-offs between survival and HRQoL. Along these lines, the Food and Drug Administration (FDA) has published a guidance document promoting the inclusion of patient-reported outcome measures in drug development [62]. Moreover, several steps have been identified and proposed for a more patient-centered approach to drug development [63, 64], including patient-centered outcome research, which aims to allow the voices of patients to be heard in assessing the value of health care options. In order to evaluate what is currently done in PC clinical research, we performed a mini-review focusing on randomized clinical trials published between 2013 and 2015 in medical journals with a high impact factor.

Among the 30 selected studies, only two papers attempted to merge patient survival and HRQoL in a single endpoint. The first one, by Fizazi et al. [35], compared the time to the first event between the patient’s death and the HRQoL deterioration. However, treating death and HRQoL deterioration as equally important raises questions. The second one, by Heijnsdijk et al. [36], computed QALYs to conduct a cost-effectiveness analysis of PC screening. Although QALYs were primarily designed for economic evaluation purposes, they could also prove useful for clinical decision making [65, 66]. In the late 1990s, the concept of Q-TWIST (Quality-adjusted Time WIthout Symptoms of disease and Toxicity of treatment), which is nearly identical to that of QALYs, was used by physicians to present the results of PC clinical trials [67, 68]. Broadly speaking, QALYs are computed by assigning to each health state a synthetic HRQoL score, called the “utility score”, ranging from zero (death) to one (perfect health), so that each year of life is weighted by the utility score corresponding to the patient’s health state. More precisely, 1 QALY represents 1 year alive in perfect health. For instance, a patient living 10 years with a utility of 0.8 accrues 8 QALYs (10 × 0.8). A patient living 12 years with a utility of 0.6, for example after a more effective intervention with important side effects, accrues fewer: 7.2 QALYs (12 × 0.6). The main limitation of the study by Heijnsdijk et al. [36] is that the utility scores used to calculate QALYs were not individually collected during the trial but were retrieved from the literature.
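The arithmetic above can be written as a short sketch. The function name and the representation of a patient’s trajectory as a list of (years, utility) periods are assumptions made for illustration; discounting of future years, which economic evaluations often apply, is deliberately omitted.

```python
def qalys(periods):
    """Sum of (years x utility) over consecutive health-state periods.

    periods: list of (years, utility) pairs, where utility ranges
             from 0 (death) to 1 (perfect health).
    """
    total = 0.0
    for years, utility in periods:
        if years < 0:
            raise ValueError("duration must be non-negative")
        if not 0.0 <= utility <= 1.0:
            raise ValueError("utility must lie between 0 and 1")
        total += years * utility
    return total


# The two patients from the text.
patient_a = qalys([(10, 0.8)])  # 10 years at utility 0.8
patient_b = qalys([(12, 0.6)])  # 12 years at utility 0.6
print(f"{patient_a:.1f} vs {patient_b:.1f} QALYs")

# A trajectory with a change of health state (invented values):
# 2 years at 0.9 followed by 3 years at 0.5.
patient_c = qalys([(2, 0.9), (3, 0.5)])
print(f"{patient_c:.1f} QALYs")
```

Representing the trajectory as a list of periods shows why individually collected utilities matter: the result depends on how long each patient actually spends in each health state and on the utility attached to it, which literature-based values can only approximate.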

Among the 30 selected papers, only six collected HRQoL, and only as a secondary endpoint with short-term follow-up. Two additional papers compared the interventions in terms of specific PRO. This low proportion of PRO-based papers (8/30) is even more striking for curative treatments (1/12) than for palliative treatments (7/18). The analyses of HRQoL were always performed separately from those related to patient survival. This way of presenting results did not allow an interpretation of the potential trade-offs between quantity and quality of life. The short follow-up in these studies is also an important limitation for balancing long-term quantity and quality of life. Moreover, even though six papers used the FACT-P questionnaire, the statistical analyses were highly heterogeneous. For instance, among the four papers in which the time from baseline to HRQoL change was described, the definitions of the HRQoL change differed, and neither the interval censoring nor the informative censoring due to patient death was taken into account in the analyses. As previously emphasized by Efficace et al. [69], who reported that only one-fifth of randomized clinical trials in PC reported PRO data adequately enough to draw meaningful conclusions, our results indicate that methodological improvements in HRQoL analyses are essential for better interpretation by physicians. For instance, Martin et al. [70] have recently provided useful guidelines for better standardizing patient-centered outcomes.

As a matter of fact, specific methodological issues related to PRO analysis, such as missing data management or the choice of a threshold for a minimal important change in HRQoL level, do not seem to be considered or discussed in most of the six PRO-based studies of our review. Indeed, information on missing data description and analysis is often lacking, which is unfortunate. Such data are likely to be missing not at random, which might lead to biased estimates of the treatment effect. Moreover, the choice of thresholds for the time to HRQoL change is either unjustified or refers to the concept of Minimal Clinically Important Difference (MCID) proposed by Cella et al. [55]. The latter constituted an important step, but it should nevertheless be noted that a sample-dependent, statistically based approach was used, which did not rely on the patient’s perspective.

In this mini-review, we voluntarily restricted our study to trials published between 2013 and 2015 in medical journals with a high impact factor. This limits the generalizability of the findings. Firstly, we did not include the year 2016, in which several important studies were published. For instance, the ProtecT clinical trial aimed to compare active monitoring, radical prostatectomy, and external-beam radiotherapy for the treatment of clinically localized PC [71, 72]. The authors described the clinical endpoints [71] and the patient-reported endpoints [72] separately, in two different papers. Again, this illustrates the need to develop future clinical trials that better consider the balance between quantity and quality of life in a single endpoint, such as QALYs. Secondly, many important studies are not published in high-impact journals. The researchers who publish in high-impact journals have distinct profiles compared with those who publish in low-impact journals [73], and cancer trials with positive outcomes are more likely to be published in high-impact journals [74]. Note also that the main urological journals were not included because their impact factors were lower than 15.

However, these limitations do not invalidate the central message of our mini-review. Our aim was not to propose a complete systematic review, but rather to illustrate the paradox between the acknowledgment that treatment choice involves trade-offs between quality and quantity of life and the scarcity of studies that take them into account. Among the 30 selected studies from high-impact journals, none precisely describes the potential trade-offs between quantity and quality of life. Based on this result, one can reasonably suggest further consideration of composite patient-centered outcomes in future clinical trials, especially those published in high-impact journals. Future studies should also take into consideration some psychological aspects that may affect HRQoL [75, 76] and the important role of the family [77].

Conclusion

In conclusion, our mini-review suggests that recent clinical trials published in high-impact journals are not designed to precisely describe the potential trade-offs between the quantity and the quality of life. It is now time to avoid designing trials that mainly, or even only, consider clinical efficacy. Composite patient-centered outcomes merging the quantity and the quality of life are needed to propose the most appropriate treatment in patients’ best interest. We recommend the use of indicators such as QALYs as the primary endpoint in future clinical trials.