Sir,

Thank you to J.L. Vincent for pointing out the great value of well-designed multicentre randomised controlled trials (RCTs) as the highest level of evidence for evaluating the effects of therapeutic interventions in the critical care setting [1]. Multicentre trials allow faster recruitment of adequate numbers of patients, and the quality of the studies is improving over time. The author also stated that multicentre RCTs account for the vast amount of patient heterogeneity and guarantee greater generalisability of the results to all patient populations and intensive care units (also called external validity).

Although all the points addressed by J.L. Vincent can be confirmed, the statement that results of multicentre RCTs are more easily applicable to all patient populations deserves more attention. For instance, recent landmark trials performed in critically ill patients [2, 3] showed a beneficial effect of new interventions, generating renewed confidence in the possibility of decreasing mortality and morbidity in severe diseases (ALI/ARDS, severe sepsis). But to what extent are the results of these studies applicable to the specific patient? Which criteria should the attending physician use to decide whether the results apply to the admitted patient? Is the "multicentre" attribute sufficient to guarantee generalisability? The answers are as follows:

  1.

    We study the patients in a trial not to find out anything about them but to predict what may happen to future patients given these treatments. When a large sample size is needed to detect a statistically significant difference between treatments, the magnitude of the beneficial effect is moderate. This is reflected by the 95% confidence interval. If the upper limit is close to one (the line of no effect), the effect estimate, although statistically significant (p<0.05), remains uncertain. In other words, the clinical relevance of the intervention under consideration might be very small for part of the reference population. This may be the case for the ARDSnet [2] and PROWESS [3] trials, where the 95% confidence intervals were 0.65–0.93 and 0.69–0.94, respectively. Simply shifting ten events from one group to the other can change the conclusions. Can such results be reliably and easily extrapolated and generalised?

  2.

    The extent to which it is wise or safe to generalise should be judged in individual circumstances, and there may not be a consensus [4]. Arguably, many RCTs use overly restrictive inclusion criteria to maximise the effect of the intervention under investigation (or the power of the study), so that the degree of safe generalisability is reduced. For example, in the ARDSnet trial only 13% of the ARDS patients admitted to the participating intensive care units were enrolled in the trial [5]. Data of this kind are not available for the PROWESS trial. Not surprisingly, phase IV trials, in which the inclusion/exclusion criteria are not strictly controlled, may return controversial or even opposite results, and the intervention may not be implemented in daily care. Rigour of methodology in performing (multicentre) RCTs is welcome, but criteria for the applicability of results to the local patient should be investigated more extensively. Generalisation of RCTs is an intriguing, still open, issue.
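
The argument in point 1, that an upper confidence limit close to one is sensitive to a handful of events, can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions. The event counts are hypothetical and are not the data of the ARDSnet or PROWESS trials. The interval is a Wald interval on the log relative risk, chosen here for simplicity and not necessarily the method used in those trials. "Shifting ten events" is read here as counting ten control-group deaths in the treatment group instead.

```python
import math


def relative_risk_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Relative risk of group A versus group B with an approximate 95% CI.

    Uses the Wald interval on the log scale (an assumed, simple method).
    """
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial samples
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical trial: 430 patients per arm (illustrative numbers only)
before = relative_risk_ci(130, 430, 170, 430)

# Same hypothetical trial after moving ten events from control to treatment
after = relative_risk_ci(140, 430, 160, 430)

for label, (rr, lower, upper) in (("before", before), ("after", after)):
    print(f"{label}: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With these hypothetical counts the upper limit lies below one before the shift and above one after it, so the nominal statistical significance disappears although only ten of 860 patients changed category.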