Background

Meta-analyses of medical studies are conducted in order to synthesise research evidence on the subject of interest and provide an epidemiological evaluation of results from primary studies [1]. The use of meta-analysis allows us to quantify the pooled effect of an exposure variable, such as a risk factor or intervention, on an outcome of interest using the results from all available primary studies [2]. The precision of the pooled estimate and the statistical power of the hypothesis test in a meta-analysis are usually higher than those of the primary studies because more data contribute to the overall pooled estimate [3].

Only primary studies with a common outcome can be pooled in a meta-analysis, and so dichotomisation of continuous outcomes presents a difficulty over and above the loss of power [4], underestimation of effect size [5], and the need for larger samples [6, 7] associated with the practice. When different cut-points for a particular continuous outcome have been used in primary studies, their results cannot be compared in a meta-analysis [4]. Pooling primary studies with the continuous and binary forms of an outcome in separate meta-analyses [8] may lead to conflicting results and conclusions [8, 9] due to loss of power and selection bias. More precisely, the primary studies included in the calculation of pooled estimates may differ between the continuous and dichotomous forms, depending on the data presented in the separate reports. A meta-analysis may therefore not include all the primary studies carried out on a research question, leading to an incomplete and potentially biased summary of the evidence. Further, information from the same primary study may be used in both meta-analyses, thus making the results repetitive and not necessarily confirmatory [9].

Peacock et al. [10] have previously described a distributional method for use in primary studies which permits researchers to present both the comparison of means and the comparison of proportions. This method involves transforming the difference in means between two groups into a comparison of the proportions of subjects that fall below (or above) a threshold of interest, to give a ‘distributional estimate’ expressed as a difference in proportions, risk ratio (RR) or odds ratio (OR). The standard error for the distributional estimate is derived as a function of the means and standard deviations of the sample using the delta method, and so inferences drawn from the comparison of proportions reflect inferences about the comparison of means. The purpose of this study was to use the distributional method described above to illustrate how dichotomisation of a continuous outcome in primary studies may result in biased estimates of pooled RRs and ORs in meta-analysis, particularly when either outcome includes only a subset of the available primary studies. To do this, we considered an outcome that is commonly reported as dichotomous and/or continuous: birthweight, analysed either as continuous (g) or as dichotomous (low birthweight (LBW): < 2,500 g). This threshold (2,500 g) is clinically relevant both in clinical trials and in epidemiology, spanning various areas of health research.
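The idea behind the distributional estimate can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the formulas published in [10] or the code of the Stata ado-file: it assumes the outcome is normal in each group, estimates each group's mean and standard deviation separately, uses the large-sample approximation Var(s) ≈ s²/(2(n − 1)), and applies a first-order delta method on the log scale. The function names and the example numbers are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution: cdf, pdf, inv_cdf


def distributional_log_rr(mean_t, sd_t, n_t, mean_c, sd_c, n_c, threshold=2500.0):
    """Return (log RR, SE) for the proportion below `threshold`.

    Assumption (ours): the outcome is normal in each group, so the
    proportion below the threshold is Phi((threshold - mean) / sd).
    """

    def log_p_and_variance(mean, sd, n):
        z = (threshold - mean) / sd
        p = STD_NORMAL.cdf(z)
        # Delta method for z: Var(mean) = sd^2 / n, Var(sd) ~ sd^2 / (2(n - 1)),
        # with sample mean and SD independent under normality.
        var_z = 1.0 / n + z * z / (2.0 * (n - 1))
        # Delta method for log p: d(log p)/dz = pdf(z) / p.
        var_log_p = (STD_NORMAL.pdf(z) / p) ** 2 * var_z
        return log(p), var_log_p

    log_p_t, var_t = log_p_and_variance(mean_t, sd_t, n_t)
    log_p_c, var_c = log_p_and_variance(mean_c, sd_c, n_c)
    return log_p_t - log_p_c, sqrt(var_t + var_c)


def rr_with_ci(log_rr, se, level=0.95):
    """Back-transform a log RR and its SE to (RR, lower, upper)."""
    z_crit = STD_NORMAL.inv_cdf(0.5 + level / 2.0)
    return exp(log_rr), exp(log_rr - z_crit * se), exp(log_rr + z_crit * se)


if __name__ == "__main__":
    # Hypothetical birthweight summaries (g): intervention vs control.
    log_rr, se = distributional_log_rr(3400, 500, 200, 3300, 500, 200)
    print(rr_with_ci(log_rr, se))
```

Because the standard error is built from the means and standard deviations alone, the resulting RR inherits its precision from the continuous data instead of from counts of infants below the cut-point.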

Methods

Search strategy

Searches of electronic databases were conducted in PubMed, Embase, Web of Science and the Cochrane Database of Systematic Reviews from January 2010 to December 2011 using search terms ‘birthweight’ OR ‘birth weight’. The search was limited to meta-analyses and human studies in PubMed and Embase. The references of the papers that met the inclusion criteria were searched for additional studies.

Meta-analyses in which birthweight was an outcome variable (either primary or secondary) were eligible. Meta-analyses were excluded if birthweight was a risk factor and not an outcome and if the systematic review did not include a meta-analysis on a birthweight outcome. If the birthweight outcome was ‘small and/or large for gestational age’ or presented in terms of correlation coefficient, the meta-analyses were excluded. Other exclusion criteria were genetic and ecological studies. Titles and abstracts were screened and the full texts of studies which met the eligibility criteria were retrieved. The flowchart showing the search strategy is presented in Figure 1 in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [11].

Figure 1

Flow diagram of search process for meta-analyses included in this study.

Illustrative analyses

A secondary analysis was performed in order to illustrate the consequences of dichotomisation. The distributional method was used to obtain distributional RRs and ORs for each of the primary studies included in each meta-analysis, using the reported sample means and standard deviations. These distributional estimates for LBW were then pooled to obtain a summary distributional RR with confidence intervals (CIs) using either the fixed or random effects model as appropriate. We refer to this pooled estimate as the ‘pooled distributional estimate’. This process was undertaken using meta-analyses for which the means and standard deviations of the birthweight data in the pooled primary studies could be accessed. The steps for applying the distributional method to obtain distributional RRs and ORs have been set out in another paper [10], and a Stata ado-file is available (http://wwwhomes.uni-bielefeld.de/osauzet/distributional.htm).
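The pooling step can be illustrated with a short Python sketch. The text does not state which estimators were used, so this block assumes generic inverse-variance weighting for the fixed effects model and the DerSimonian-Laird moment estimator of between-study variance for the random effects model; the function name and the input values are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def pool_log_estimates(log_estimates, standard_errors, random_effects=False):
    """Pool study-level log RRs (or log ORs); return (pooled log estimate, SE).

    Fixed effects: inverse-variance weights 1 / se^2.
    Random effects: weights 1 / (se^2 + tau^2), with tau^2 from the
    DerSimonian-Laird estimator (an assumption of this sketch).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_estimates)) / sum_w

    if random_effects:
        # Cochran's Q measures heterogeneity around the fixed effects estimate.
        q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_estimates))
        df = len(log_estimates) - 1
        c = sum_w - sum(w ** 2 for w in weights) / sum_w
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
        weights = [1.0 / (se ** 2 + tau2) for se in standard_errors]
        sum_w = sum(weights)
        pooled = sum(w * y for w, y in zip(weights, log_estimates)) / sum_w

    return pooled, sqrt(1.0 / sum_w)


if __name__ == "__main__":
    # Hypothetical distributional log RRs and standard errors from three studies.
    log_rrs = [-0.42, -0.10, -0.31]
    ses = [0.20, 0.15, 0.25]
    z_crit = NormalDist().inv_cdf(0.975)
    for use_random in (False, True):
        est, se = pool_log_estimates(log_rrs, ses, random_effects=use_random)
        print(use_random, exp(est), exp(est - z_crit * se), exp(est + z_crit * se))
```

Pooling is done on the log scale and back-transformed, which is the usual convention for ratio measures.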

For this illustrative study, we assume that birthweight follows a normal distribution as required by the distributional method [10].

All the analyses were performed using Stata version 12.0 [12].

Results

Meta-analyses included in study

A total of 772 papers were retrieved from the search and, of these, 76 published meta-analyses which met the inclusion criteria were included in this study. Fifty-two (52/76) of the meta-analyses included in this study combined randomised controlled trials, and in 24/76, observational studies were pooled. Details of the meta-analyses included in this study can be found in an additional file [see Additional file 1]. Thirty-seven percent (28/76) of the meta-analyses reported only the dichotomous form of the outcome while 26% (20/76) reported only the continuous form. In one meta-analysis (1/76), birthweight was analysed as continuous for one intervention and as binary for another. Thirty-six percent (27/76) of the meta-analyses presented both binary and continuous forms of the variable. Among these, 7/27 reported results that were statistically significant for one outcome and not for the other. The number of meta-analyses reporting either dichotomous or continuous outcomes is presented in a flow diagram (see Figure 2). None of the 76 meta-analysis papers included in this study discussed the presentation of two separate meta-analyses for the two forms of the outcome.

Figure 2

Flow diagram showing details of meta-analyses included in this study.

Using the distributional method in secondary meta-analysis

Tables 1 and 2 show the results of the secondary meta-analyses performed to illustrate how the results might have looked had the distributional method been used to give means and proportions in all primary studies. Secondary analyses were performed using data reported in meta-analyses (N = 39/76) where primary study means and standard deviations were reported and could be accessed (that is, meta-analyses reporting only a continuous outcome: N = 21/39; meta-analyses reporting both continuous and dichotomous outcomes: N = 18/39).

Table 1 Secondary analyses in meta-analyses reporting only birthweight mean difference outcome (N = 21)
Table 2 Secondary analyses for meta-analyses reporting both continuous and dichotomous outcomes (N = 18)

For the meta-analyses reporting only the mean difference outcome (N = 21/39) for an intervention/exposure, secondary analyses provide distributional estimates for low birthweight that reflect those of the mean differences (see Table 1).

Where both forms of outcome were reported, the number of pooled primary studies for each differed in 16/18 meta-analyses, so the results for different outcomes were based on different subsets of the available data (see Table 2). In 2/18 [39, 44], the same studies were combined for each outcome and, although the distributional RRs were similar to the published RRs, the CIs were narrower, with inferences consistent with those of the mean differences.

The distributional estimates provided similar inferences to the mean difference outcomes in 17/18 meta-analyses (see Table 2), confirming that the distributional estimates are valid. In the one remaining meta-analysis (1/18) [40], where the distributional estimates were not consistent with the mean difference outcome (see Table 2), many of the primary studies were very small with very low means, and so the distributional method would not be recommended [10].

Secondary analyses could not be performed on 28/76 meta-analyses which reported only the dichotomous form of the outcome. In a further 9/76 meta-analyses reporting both dichotomous and continuous forms of the birthweight outcome, details of the pooled primary studies’ data could not be accessed and the reasons are outlined in an additional file [see Additional file 2].

Discussion

The aim of this study was to illustrate how dichotomisation of a continuous outcome in primary studies may result in biased estimates of pooled RRs and ORs in meta-analysis, using published meta-analyses reporting the birthweight outcome as an example. There is a difficulty in comparing results on the basis of statistical significance; a comparison of means will be more powerful than a comparison of proportions below a cut-point in the same datasets and, therefore, the former are more likely to be statistically significant. This is one reason why using a distributional method in primary studies to estimate proportions below a cut-point carries an advantage in that estimates of mean differences are comparable to estimates based on comparison of proportions [10].
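The loss of power from dichotomising can be made concrete with a small calculation. The Python sketch below is an illustration under our own assumptions, not an analysis from this study: it takes two normally distributed groups with a common standard deviation and equal sizes, and compares the expected z statistic for the difference in means with the expected z statistic for the difference in observed proportions below the cut-point (unpooled binomial variance). All numbers and names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def expected_z_statistics(mean_c, mean_t, sd, n_per_group, threshold=2500.0):
    """Return (z for difference in means, z for difference in proportions).

    Both are expected values of the test statistics under the stated
    normal model, not results from observed data.
    """
    # Comparison of means: difference divided by its standard error.
    z_means = (mean_t - mean_c) / (sd * sqrt(2.0 / n_per_group))

    # Comparison of dichotomised outcomes: true proportions below the
    # threshold, with the usual binomial standard error.
    std_normal = NormalDist()
    p_c = std_normal.cdf((threshold - mean_c) / sd)
    p_t = std_normal.cdf((threshold - mean_t) / sd)
    se_props = sqrt(p_c * (1 - p_c) / n_per_group + p_t * (1 - p_t) / n_per_group)
    z_props = (p_c - p_t) / se_props
    return z_means, z_props


if __name__ == "__main__":
    # Hypothetical: control mean 3300 g, intervention mean 3400 g, SD 500 g.
    z_means, z_props = expected_z_statistics(3300, 3400, 500, 200)
    print(f"z (means) = {z_means:.2f}, z (proportions) = {z_props:.2f}")
```

In this hypothetical setting the comparison of means sits at the conventional 5% significance boundary, while the comparison of dichotomised proportions falls well short of it on the same data.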

Researchers commonly dichotomise continuous data such as birthweight because differences in means may be difficult to interpret, whereas a difference in the percentage of low birthweight may be more meaningful. When a continuous outcome is dichotomised in some primary studies but not others, this may cause difficulties for the meta-analyst. The process of selecting primary studies for inclusion in a meta-analysis may include deciding between studies reporting the continuous or the binary form of an outcome. In such cases, the set of studies reporting the continuous outcome may be different from the set reporting the binary form. Where a different set of studies is combined for each of the two outcomes, there is the possibility of biased results. This is illustrated in Table 2, where the number of studies and/or sample sizes are very different for the two outcomes in 16/18 meta-analyses. For these meta-analyses, although the distributional estimates gave inferences similar to those of the mean differences, they were not comparable with the published LBW results because they are based on different sets of data.

We do not imply that the distributional approach gives the more accurate result in our illustration as the limited availability of data in published papers prevented its application for all studies. The best option is for primary studies to report both forms for birthweight as other researchers may wish to see them.

Several methods have been developed for combining individual studies reporting continuous and binary outcomes in one meta-analysis to obtain a single summary measure [8, 9, 52]. Whitehead et al. [8] obtained a summary log-odds ratio, while other authors [9, 52] have recommended converting the estimates from individual studies to effect sizes and then combining these. These methods are helpful in allowing all studies to be pooled but do not overcome the problem of the loss of power when dichotomising.

We have used the distributional method in secondary meta-analysis to demonstrate how dichotomisation in primary studies may result in inconsistent estimates in the context of meta-analyses. We are not advocating the distributional method as a tool for meta-analysts who are using aggregate data, but rather wish to highlight its usefulness in primary studies.

Conclusions

Researchers who wish to dichotomise continuous outcomes in primary studies may consider using the distributional approach to obtain the difference in proportions or RR/OR to present alongside differences in means. Where this has not been done, and if the individual study outcomes follow a normal distribution and means, standard deviations and sample sizes are given, the meta-analyst can compute distributional estimates for use in pooled summaries. In this way, meta-analyses will be less subject to selective outcome bias.