In this issue, Kunikowska et al. [1] provide a landmark study with the first clinical use of a combination of simultaneously administered (“cocktail”) radionuclides. In their theoretical paradigm [2], each radioisotope, with its spectrum of emitted beta energies, has an optimal target diameter in which most of the beta energy is absorbed and so-called “energy escape” is prevented, thereby optimising the radiation dose to the tumour. A cocktail of 50% 90Y-DOTATATE (“90Y”) + 50% 177Lu-DOTATATE (“177Lu”) was used in 25 patients and compared with another group of 25 patients treated with 90Y-DOTATATE only. This followed pre-clinical work by de Jong et al., who, using a neuroendocrine tumour rat model with one small and one large tumour subcutaneously implanted, found striking differences in overall survival and progressive disease in favour of the combination 90Y+177Lu compared with each single-agent treatment. Kunikowska et al. found greater overall survival in patients treated with 90Y+177Lu than with 90Y alone and concluded that the tandem therapy is more effective, although tumour response and progression-free survival were not significantly different.

A critical issue of this and many other radionuclide therapy studies is that patients were not randomised to receive 90Y alone or 90Y+177Lu. In the results section, the authors state that one cohort of patients was first treated with 90Y, and then another cohort with 90Y+177Lu once 177Lu became available in Poland. It cannot be stressed enough how important it is to comprehend the introduction of bias to clinical oncology trials and to address its potential sources in a systematic way. There are several tools available, for example the Cochrane risk of bias toolbox [3]. As clinical trials appear to be biased towards an exaggeration of treatment differences, guidelines for reporting of clinical trials have also been adopted by several cancer journals [4]. These include description of quality control methods; unaccounted patients; inevaluability rate; exclusion of ineligibility; power analysis and sample size; initial target sample size; control patients; patient subsets; and methods of statistical analysis. According to these guidelines, (definite) claims of therapeutic efficacy cannot be made on the basis of non-randomised trials unless the disease is so rare or the prognosis so poor that controlled randomisation is practically impossible. On the other hand, it has been argued that randomisation is not necessary because matched historical or concurrent controls can be selected. However, this is a misconception: the purpose of randomisation is not to ensure that treatment groups are medically equivalent with respect to known characteristics (which the Kunikowska study checked by statistically comparing a number of known patient characteristics), but to ensure that unknown biasing factors are randomly distributed, so that a statistically significant difference is not the result of a non-random difference in these unknown prognostic factors.

Kaplan–Meier survival curves are very suitable to assess overall survival because they use the actual observed time-of-death data and hold the estimated probability of survival constant in between these times. It would have been desirable for all patients to have attained the same observation period of 36 months, as mentioned in the methods section. In general, survival models such as Kaplan–Meier analysis classify “censoring” events, such as patients being alive at the end of a clinical trial, as non-informative (i.e. random). In that case, the censoring time is independent of the prognosis of the patient, and hence such events may be considered non-informative. However, early withdrawal of a patient for reasons such as lack of compliance, early death and loss to follow-up can be informative. This can also be the case if not overall survival but rather other endpoints such as “event-free survival” are used. Consequently, it is essential to obtain follow-up information actively on all patients before analysis. In the study of Kunikowska et al., overall survival was significantly better but event-free survival was not, “event” being defined as progression, relapse or death. In other words, 90Y+177Lu patients did not have a better tumour response, remain tumour-stable for longer or show slower progression, but they did live longer. This seems odd, as one would assume tumour-related death to be the consequence and implication of a preceding tumour progression. In non-randomised trials, there are three main sources of bias to consider: differences in prognostic variables, in co-interventions and/or in outcome measurements. One explanation could thus be a higher number of disease-unrelated deaths as a result of differences in the distribution of unknown prognostic variables, resulting in a higher death rate in the group receiving 90Y only.
Often, the referral pattern changes with the availability of a new therapy; over time, more advanced and often also less advanced cases are included for therapy. Regarding the presence or lack of co-interventions, such as simultaneously administered systemic therapy, additional necessary surgery, or even alternative medicine, this seems not to be an issue, but it is not specifically mentioned. Finally, a drift in diagnostic tests for measuring outcome can be problematic, particularly in cohort studies. Tumour progression on the basis of a subjective assessment of symptoms has to be evaluated by the same observers, who are blinded to the treatment arm. Systematic discussion of possible sources of bias and their direction of influence increases the strength of evidence, especially in non-randomised studies.
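To make the handling of censored patients in a Kaplan–Meier analysis concrete, the following is a minimal sketch of the product-limit estimate. The function name `kaplan_meier` and the follow-up data are hypothetical illustrations and are not taken from the study by Kunikowska et al.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed death time.

    times  -- follow-up time of each patient (e.g. in months)
    events -- 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Patients still under observation just before time t,
        # including those who are censored exactly at t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up data in months (illustration only).
times = [6, 12, 12, 20, 30, 36, 36, 36]
events = [1, 1, 0, 1, 0, 0, 0, 0]

for t, s in kaplan_meier(times, events):
    print(f"{t:>3} months: S(t) = {s:.3f}")
```

In this sketch a censored patient simply leaves the at-risk set without lowering the curve, which is only justified if the censoring is non-informative.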

While a log-rank test can determine whether the risk of death is systematically higher or lower in one group over the whole period, another approach, as mentioned in the study by Kunikowska, is to apply a proportional hazards (Cox regression) model. To model the survival of an individual patient $i$, the following is used: $h_i(t) = \mathrm{HR}_i \times h_0(t)$. This says that the risk of death for a particular patient is determined by a general “baseline” risk of death $h_0(t)$ that can vary in time and a hazard ratio ($\mathrm{HR}_i$) that is constant in time (the so-called proportional hazards assumption). This HR is the ratio of the risk of death in (treatment) group A and the risk of death in (treatment) group B, for example 90Y+177Lu versus 90Y alone. The survival model of Kaplan–Meier is a robust model that has no additional assumptions, but in non-randomised studies differences in populations with regard to prognosis can heavily influence the results. The Cox regression model makes it possible to determine which factors, or co-variables, influence survival by fitting a regression model of the logarithm of the relative risk: $\ln(\mathrm{HR}) = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots$ Why the authors used their particular co-variables and not others mentioned in the literature, such as the presence or absence of a carcinoid syndrome, markers of proliferation, extent of hepatic metastases or histological features, is unclear. An interesting approach would have been to see how the hazard ratio of the treatment group (90Y+177Lu versus 90Y) would change with the addition of other prognostic factors to the model. If such an observed change in hazard ratio were large, then this would indicate the possibility that more confounding co-variables were present and that the observed outcome was less certain, emphasising the need for a randomised controlled trial.
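The two relations of the proportional hazards model can be illustrated numerically. The sketch below assumes hypothetical regression coefficients and a hypothetical baseline survival; none of the numbers come from the study. It also uses the standard consequence of a time-constant hazard ratio that the survival of a patient equals the baseline survival raised to the power of the hazard ratio.

```python
import math


def hazard_ratio(betas, covariates):
    """HR = exp(beta_1*x_1 + beta_2*x_2 + ...), so ln(HR) is linear in the co-variables."""
    return math.exp(sum(b * x for b, x in zip(betas, covariates)))


def survival(baseline_survival, hr):
    """Under proportional hazards, S_i(t) = S_0(t) ** HR_i."""
    return baseline_survival ** hr


# Hypothetical coefficients (illustration only):
# first co-variable: treatment (1 = 90Y+177Lu, 0 = 90Y alone),
# second co-variable: one binary prognostic factor.
betas = [-0.7, 0.4]

hr_reference = hazard_ratio(betas, [0, 0])  # 90Y alone, factor absent
hr_cocktail = hazard_ratio(betas, [1, 0])   # 90Y+177Lu, factor absent

s0 = 0.60  # hypothetical baseline survival probability at some time t
print(f"HR reference: {hr_reference:.2f}, survival {survival(s0, hr_reference):.2f}")
print(f"HR cocktail:  {hr_cocktail:.2f}, survival {survival(s0, hr_cocktail):.2f}")
```

Refitting such a model with additional prognostic co-variables and comparing the treatment coefficient before and after is one way to carry out the sensitivity check described above; the fitting itself is not shown here.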

Where to go from here? An intriguing point is that the paradigm of “energy escape” seemingly contradicts the known paradigm of “cross-fire”, in which, by virtue of the long range of the beta particle, cells several cell diameters away, including those that have scarcely accumulated the radionuclide, can be effectively treated. If one assumes a homogeneous mixture and equal affinity of 90Y-DOTATATE and 177Lu-DOTATATE, it may be expected that equal amounts of 90Y and 177Lu are distributed among a wide range of differently sized tumours. While use of 177Lu in small tumours would provide a clear advantage in humans, the theoretical optimal diameters of 177Lu and 90Y (2 mm and 34 mm, respectively), taken together with the size ranges reported in this study (25–75% in the range 39–91 mm and 30–75 mm, respectively, for 90Y and 90Y+177Lu), would imply a theoretical undertreatment of larger tumours in two ways, i.e. by the limited range of 177Lu within the tumour and by the lack of cross-fire by 90Y. In a palliative therapy, in which tumour size is important for morbidity and quality of life, this may be a different effect from that in an adjuvant setting, in which small subclinical tumours may be paramount to tumour recurrence. Since the authors did not evaluate a group treated solely with a 177Lu-labelled somatostatin analogue, it remains unclear whether the seemingly superior outcome of the cocktail group is driven only by the addition of 177Lu or whether it is really an effect of the combination. With this latter argument in mind, it seems reasonable to evaluate this approach in a three-armed prospective trial. In our opinion, especially in the light of personalised medicine [5], such a trial should be performed by employing state-of-the-art dosimetric approaches. This could potentially lead to the development of a “personalised cocktail” approach.
Abiding by the rules of experimental oncotherapy under a strict protocol, needed to increase the level of evidence and the acceptance of upfront therapy, the study by Kunikowska et al. shows that this approach could be successful and its paradigm clinically relevant. Thus, this study may well represent the beginning of an important development in clinical radionuclide therapy.