Osteoarthritis is a common form of joint disease in the elderly and may cause severe pain and disability. Besides nonsteroidal anti-inflammatory drugs, the mainstay of management for osteoarthritic pain, transcutaneous electrostimulation (TENS), ultrasound and opioids have been advocated as viable treatment options [1–3]. However, the evidence for their effectiveness and safety is contradictory. The Cochrane database of systematic reviews offers a series of recently published reviews dealing with osteoarthritis, clarifying the therapeutic role of these interventions. In critically appraising what we already know using the GRADE Working Group grades of evidence, Rutjes et al. rate the quality of the originally published reports for key clinical core outcomes: pain, function and the number of patients experiencing an adverse event. In many of these instances, the overall quality of evidence is rated as low because, among other reasons, only a small proportion of included studies report a specific outcome (e.g., 8 out of 18 studies report the number of patients withdrawn because of adverse events) [3]. In these cases, the authors warn readers against the likely risk of selective outcome reporting bias.

What is selective outcome reporting and how can it lead to bias? Is selective outcome reporting related to publication bias?

In this paper, we briefly summarize the threats to meta-analyses due to selective outcome reporting and discuss recent discoveries in this field.

Publication bias

Randomised clinical trials (RCTs) are considered superior to observational studies in obtaining a precise and statistically unbiased estimate of the effects of an intervention. Unfortunately, although RCTs are desirable experiments in searching for moderate treatment effects, they are still prone to subtle forms of bias. The family of publication-related biases plays a major role, and can lead to unrealistic estimates of drug effectiveness or alter the risk–benefit ratio. Publication bias originates in a prejudiced peer-review attitude: many reviewers are predisposed against recommending the publication of studies reporting non-significant findings [4]. Such studies tend to take longer to find their way into the published literature, or remain unpublished. Consequently, they are more difficult to find and include in a meta-analysis than studies producing statistically significant results [5]. If a study is not published because of its results (publication bias), the omission of negative unpublished trials can lead to an inflation of intervention effects [6]. Chan and Altman [7] conclude in a milestone paper that “The medical literature represents a selective and biased subset of study outcomes”. Pushing the concept by Chan and Altman to the extreme, the effect size observed in a meta-analysis of different studies could be, on average, an accurate estimate of the extent of net publication bias operating in a specific field.
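The inflation of intervention effects described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model of any cited study: the true effect is set to zero, every trial has the same standard error, significant favourable trials are always published, and other trials are published with a probability of 10%. All names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_pooled_effects(n_trials=5000, true_effect=0.0, se=0.2,
                            publish_prob_nonsig=0.1, seed=1):
    """Return (mean of all trial estimates, mean of published estimates).

    With equal standard errors, the simple mean equals the fixed-effect
    (inverse-variance) pooled estimate.
    """
    rng = random.Random(seed)
    all_estimates = []
    published = []
    for _ in range(n_trials):
        # Each trial estimates the same true effect with sampling error.
        estimate = rng.gauss(true_effect, se)
        all_estimates.append(estimate)
        # Assumption: significant favourable results are always published;
        # every other result is published only occasionally.
        significant_and_favourable = estimate / se > 1.96
        if significant_and_favourable or rng.random() < publish_prob_nonsig:
            published.append(estimate)
    return statistics.mean(all_estimates), statistics.mean(published)


mean_all, mean_published = simulate_pooled_effects()
print(f"pooled effect, all trials:       {mean_all:.3f}")
print(f"pooled effect, published trials: {mean_published:.3f}")
```

Under these assumptions the pooled estimate from all trials stays close to zero, whereas the pooled estimate from the published subset is positive even though no true effect exists.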

The awareness of publication bias prompted the use of early registration of trial protocols, and as a response to the 1997 FDA Modernisation Act [8], the US National Library of Medicine established the web-based registry ClinicalTrials.gov in 2000 [9]. In 2004, the International Committee of Medical Journal Editors announced that any clinical trial aimed at publication in major scientific journals must be registered by September 2005 in a public clinical trials registry before participant enrolment [10]. This resulted in a large increase in the number of trials registered within ClinicalTrials.gov, which since 2007 has also required the reporting of the trial start date, and of primary and secondary outcome results within 2 years of trial completion.

Nevertheless, evidence suggests that registration does not guarantee the timely publication of clinical trials in the scientific literature, and that the quality of the information provided during the registration process is often poor [11]. In particular, studies with statistically significant results or large sample sizes are more likely to be published than those with nonsignificant results, and are published earlier. Moreover, the nonpublication of studies is often due to failure to submit rather than rejection by journals [12].

Selective reporting bias

Selective reporting bias in a study is defined as the selection of a subset of analyses to be reported. When the selection process occurs in relation to outcomes, we refer to it as selective outcome reporting [13]. However, selective reporting may also occur in relation to subgroup analyses [14], per protocol rather than intention to treat analyses [15], as well as other analyses [16]. Selection may be driven by the wish to avoid redundancy of similar outcomes (e.g., two health-related quality of life scales measuring the same criterion) or by the futility of some outcomes. Researchers, reviewers and editors are all involved in selecting the most interesting and attention-worthy outcomes, or in saving precious journal space. In these cases, selective reporting is less problematic. When outcome selection is driven by the statistical significance or the effect size, we refer to this as selective outcome reporting bias. In other words, in the presence of selective outcome reporting bias, published results are prone to the ‘statistically significant’ cliché: new statistically significant outcomes are introduced at the time of publication; statistically significant secondary outcomes are upgraded to primary end points; and nonsignificant primary outcomes are possibly omitted from reports [17].

Empirical evidence of outcome reporting bias

The existence of outcome reporting bias has been widely suspected for years, although until recently little was known about its prevalence and its impact on systematic reviews. The availability of trial protocols made it evident that published reports often do not correspond to the registered study protocol. The findings of the principal studies that compared outcomes between protocols and publications are summarised in Table 1 (freely adapted from Table 5 of Dwan et al. [18]). In comparing trial publications to protocols, Dwan’s systematic review finds that 40–62% of studies have at least one primary outcome that was changed, introduced, or omitted, and that outcomes that are statistically significant have higher odds of being fully reported (range of odds ratios: 2.2–4.7).

Table 1 Relevance of outcome reporting bias in major published studies in the field

Rising et al. [19] illustrate the problems of selective outcome reporting bias with the example of the trials submitted to the Food and Drug Administration (FDA) in approved New Drug Applications (NDAs). The publication rate of efficacy trials submitted to the FDA and the trial characteristics as submitted to the FDA were compared with trials and characteristics published in peer-reviewed journals. Forty-one primary outcomes from the NDAs were omitted from the papers; the papers included 15 additional outcomes that favoured the test drug and 2 other neutral or unknown additional outcomes. There were 43 outcomes in the NDAs that did not favour the NDA drug, and of these, 20 were not included in the papers. Thus, the papers included more outcomes favouring the test drug than did the NDAs. These findings indicate that there are discrepancies between the data and the conclusions in NDAs and those published in medical journals, which tend to lead to more favourable presentation to practitioners of the NDA drugs.

The prevalence and impact of outcome reporting bias in a large unselected cohort of Cochrane reviews have been investigated in the ORBIT (Outcome Reporting Bias In Trials) project. This study finds that one-fifth of the statistically significant meta-analyses of the review primary outcomes are not robust to outcome reporting bias, and one-quarter will have overestimated the treatment effect by 20% or more [20].

These studies on selective outcome reporting help to explain the preponderance of favourable results observed in the medical literature. Three additional considerations can provide further explanation.

First, the number of outcomes reported in protocols is remarkably high, and it is generally not possible to reliably discern the primary from the secondary outcomes. Although not all the cited studies reported all the measured outcomes, Chan et al. [21] find the median number of efficacy outcome measures per trial to be 20 (10–90th centiles: 5–63) in one study that included 122 trials, and 11 (10–90th centiles: 5–63) in another study that included 519 trials [7]. The specification of a high number of outcomes can increase the potential for data dredging, where potentially significant associations are stumbled upon during data analysis. Often only those that are found to be significant are reported.
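The effect of measuring many outcomes can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that the outcomes are statistically independent, that no true effect exists, and that each outcome is tested at a 5% significance level. Outcomes in real trials are usually correlated, so the figures are a rough guide rather than an estimate for any cited study. The function name is hypothetical.

```python
def prob_at_least_one_false_positive(n_outcomes: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent null outcomes reaches p < alpha."""
    if n_outcomes < 0:
        raise ValueError("n_outcomes must be non-negative")
    # Probability that none is significant is (1 - alpha) ** n_outcomes.
    return 1.0 - (1.0 - alpha) ** n_outcomes


for k in (1, 5, 11, 20):
    p = prob_at_least_one_false_positive(k)
    print(f"{k:2d} outcomes: {p:.1%} chance of at least one spurious finding")
```

Under these assumptions, a trial measuring 11 outcomes has roughly a 43% chance of producing at least one spurious significant result, and a trial measuring 20 outcomes roughly a 64% chance.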

Second, in clinical trials, the primary outcome is the most important measure of efficacy, on which the sample size is calculated. The assumptions for this calculation are based on previously observed data or published results. When the primary outcome is replaced by a secondary one, these assumptions may differ, and analyses performed on an inadequate sample size may lead to erroneous results, usually favourable. Indeed, sample size and statistical methods in published trials are often clearly discrepant with respect to the pre-specified protocol [22]. The addition and removal of outcomes together with the sample size recalculation carry a high risk of bias. In fact, the prevalence of favourable results from the cohort of trials examined in the reported surveys is very high, and may reflect no true differences [18].
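The link between the sample size calculation and the choice of primary outcome can be sketched numerically. The example below uses the standard normal approximation for comparing two group means and entirely hypothetical standardised effect sizes (0.5 for the planned primary outcome, 0.3 for a secondary outcome promoted afterwards). It is an illustration of the reasoning, not a reconstruction of any cited trial, and the function names are assumptions.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def n_per_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants per arm for a two-sided, two-group comparison of means
    (normal approximation; effect_size is the standardised difference)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


def achieved_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power with n participants per arm; the negligible
    contribution of the opposite tail is ignored."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * math.sqrt(n / 2) - z_alpha)


n = n_per_arm(0.5)  # trial sized for the planned primary outcome
print(f"planned sample size per arm: {n}")
print(f"power for the planned outcome (d = 0.5): {achieved_power(0.5, n):.2f}")
print(f"power for a promoted outcome (d = 0.3):  {achieved_power(0.3, n):.2f}")
```

With these hypothetical values, a trial sized for 80% power on its planned primary outcome has less than 40% power for the smaller effect, so an analysis of the promoted outcome rests on an inadequate sample size.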

Third, the reporting of preliminary results favourable to the treatment under investigation has come under scrutiny. As a result, scientists have been advised to critically appraise any favourable results presented at meetings and congresses or in the abstracts of published studies [23]. This high prevalence of favourable results can be related to outcome reporting bias in secondary or subgroup analyses, and such results are often not followed or confirmed by peer-reviewed full paper publications.

It clearly appears that even in the era of evidence-based medicine, most of the knowledge on which we base our clinical practice is at high risk of being, in reality, a true lie. True because any scientific truth is an exercise in trust [24]. True because this knowledge passed a reviewer’s critical appraisal, and received a favourable assessment.

A lie because even the most successful and appreciated studies are simply the ones that may suffer the worst net bias [25]. A lie because even the meta-analytic process is not always robust enough to detect selective outcome reporting bias [20].

Conclusions and policy implications for researchers and reviewers

Given the variability in protocol registration of clinical trials and in the quality of information provided, editors, peer reviewers, and readers of medical journals must carefully scrutinise trial registration records as a first step in the critical appraisal and interpretation of the results [26]. Any discrepancy with the trial protocol should be reported in the published article so that the clinician can evaluate the potential for bias. With specific regard to outcome reporting bias, outcomes should be defined precisely, because vague definitions leave room for tampering. The standardisation of outcomes in specific clinical areas, if implemented, will reduce the potential for bias [27, 28].

Those who allow outcome reporting bias to occur should be called to account by the scientific community. The adoption of reporting guidelines and quality assessment tools such as those promoted by the EQUATOR Network, The Cochrane Collaboration and the GRADE Working Group [29] may improve the conduct and reporting of trials. All new Cochrane reviews, like the ones about osteoarthritis [3], will include a risk of bias assessment step to ascertain the likelihood of outcome reporting bias, following the guidelines presented in the Cochrane Handbook (Chap 8.13) [16]. Finally, policy makers should defend open access to research. Formal legislation granting timely public access to the protocols approved by research ethics committees and to basic trial results, regardless of their potential publication in medical journals, should be encouraged and protected. All these efforts together will increase the trust in science.