In simple terms, meta-analysis has been described as “a statistical analysis that combines or integrates the results of several independent clinical trials considered by the analyst to be combinable”, and it requires the same methodological rigor that is applied to other forms of research [1]. It is increasingly used in the medical literature to measure various effect sizes and to establish, with statistical confidence, the risks and benefits of a particular clinical intervention [13]. It offers the advantage of integrating the results of independent clinical trials investigating comparable interventions by determining an average effect size for the combined data. This allows data from small studies, which alone offer limited guidance to clinical practice, to be incorporated into evidence-based recommendations [13]. Its objective is to present a balanced, precise and impartial summary of the existing research outcomes on a particular topic.

While the results of individual studies can provide useful clinical information, there are often limits to the clinical guidance they can offer. Single studies may be too small and underpowered to detect small but clinically relevant treatment effects, and thus may not provide reliable evidence on which to base clinical practice [3, 4]. It is expensive, time-consuming and logistically difficult to undertake large studies of sufficient magnitude to demonstrate small but important treatment effects with certainty [3], especially in the present environment of global financial crises. The statistical methodology of meta-analysis offers a logical solution to this clinical dilemma.

Results from well-conducted meta-analyses offer considerable benefit to the clinical sciences over narrative (unsystematic) reviews or single studies [3, 5]. By providing a more objective evaluation of already available data, meta-analysis may assist in understanding differences in observations between studies and in clarifying the resultant clinical debate [3, 6]. It allows data from independent studies evaluating similar interventions, which on their own provide limited or conflicting evidence because of small sample sizes and variable methodological quality, to be combined with confidence, providing a clearer picture of the risks or benefits of a particular treatment [3, 6]. In this way, meta-analysis provides greater precision in the estimation of treatment effects than can be achieved through individual studies [6]. It also provides a more objective and reproducible approach to reporting the conclusions of a review [3]. As with any other inferential statistical method, the ability to generalize results beyond the studied samples to the wider population is a further potential benefit [3]. Moreover, meta-analysis has the potential to stimulate research in areas where evidence is shown to be lacking, while offering insight and guidance into the design and magnitude of the studies required to demonstrate treatment effects with adequate statistical power [3, 5–7].

However, like many other statistical methods, meta-analysis is not without limitations. A poorly conducted meta-analysis may provide unreliable and misleading results [1]. The pooling of biases from multiple sources, both inherent in the included studies and within the processes used to conduct the meta-analysis, remains a major criticism of the procedure [8]. In 1996, to address the suboptimal reporting of meta-analyses, an international group developed guidance known as the QUOROM statement (QUality Of Reporting Of Meta-analyses), which focused on the reporting of meta-analyses of randomized controlled trials [9]. In 2005, the name was changed from QUOROM to PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) to encompass both systematic reviews and meta-analyses; the guidelines were revised, with a number of items deemed essential retained and others added to the checklist [10]. Adherence to the PRISMA guidelines is an essential requirement for compiling a good quality and robust meta-analysis. The PRISMA statement consists of a 27-item checklist and a four-phase flow diagram, which together provide step-by-step guidance for producing unbiased, transparent and complete reporting of a meta-analysis. It is imperative that authors follow the latest version of the PRISMA guidelines, which is readily available on the internet [11].

Meta-analyses are performed using either fixed effects or random effects models. The decision as to which is most appropriate depends on the context to which the meta-analysis is applied. The fixed effects model is underpinned by the assumption that a single, identical true treatment effect is common to every study included in the meta-analysis and, as such, that only within-study variation is present [12–14]. In practice, this model results in the larger studies having the greatest influence within the analysis, with smaller studies having considerably less effect on the combined estimate of effect size [12]. The random effects method (DerSimonian and Laird) assumes that the true treatment effects in the individual studies may differ from each other, allowing for a range of normally distributed true effect sizes across the studies included in the meta-analysis [12–15]. It acknowledges that different studies may estimate different, though related, treatment effects, so that the combined estimate represents the mean of the true effects in the populations studied [12]. In addition to within-study variation, the assumption that each study represents a slightly different population adds a further layer of sampling error, namely between-study variation [12–15]. The random effects model thereby provides adjustment for unmeasured differences and biases between studies [13]. As the between-study component is the only difference between the models, the two will produce comparable results in a meta-analysis in which no residual differences between studies are present [13]. Under the random effects model, smaller studies are weighted more heavily than they would be under a fixed effects model, and larger studies influence the combined estimate less strongly [12]. In this sense the random effects model may be described as a variation on the fixed effects model that incorporates less precision to account for variation beyond sampling error. That is, in the presence of heterogeneity, a random effects model will yield wider confidence intervals and more conservative estimates of the mean effect size, with correspondingly lower statistical significance, than would be attained through a fixed effects model [6, 15]. However, the random effects model has limitations of its own that may result in misleading interpretations.
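To make the two weighting schemes concrete, the short sketch below pools a set of effect sizes under both an inverse-variance fixed effects model and the DerSimonian and Laird random effects model. It is a minimal illustration in Python: the study values are invented for demonstration and the pool_effects() helper is ours, not drawn from any of the papers discussed here.

```python
# Minimal sketch: inverse-variance fixed effects pooling and
# DerSimonian-Laird random effects pooling. All study values below
# are hypothetical and serve only to illustrate the weighting.
import numpy as np

def pool_effects(effects, std_errs):
    """Return (fixed_effect, random_effect) pooled estimates."""
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(std_errs, dtype=float) ** 2

    # Fixed effects: weight each study by the inverse of its variance,
    # so larger (more precise) studies dominate the pooled estimate.
    w_fixed = 1.0 / variances
    fixed = np.sum(w_fixed * effects) / np.sum(w_fixed)

    # DerSimonian-Laird estimate of the between-study variance tau^2,
    # derived from Cochran's Q statistic.
    k = len(effects)
    q = np.sum(w_fixed * (effects - fixed) ** 2)
    c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random effects: adding tau^2 to each study's variance flattens
    # the weights, giving smaller studies relatively more influence.
    w_random = 1.0 / (variances + tau2)
    random = np.sum(w_random * effects) / np.sum(w_random)
    return fixed, random

# Hypothetical log odds ratios and standard errors from five trials.
fixed, random = pool_effects([0.10, -0.20, 0.35, 0.05, 0.50],
                             [0.15, 0.20, 0.25, 0.10, 0.30])
print(f"fixed = {fixed:.3f}, random = {random:.3f}")
```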

Model selection for meta-analysis should be guided by the context of the data being studied [12]. In clinical practice the assumptions of the fixed effects model are rarely plausible [12]. Differences in patient demographics, health care practitioner skills and practices, hospital service availability and implementation of differing hospital policies are just a few of the many factors that render the assumptions of the fixed effects model void when applied to meta-analyses evaluating therapeutic or clinical interventions. Therefore, logic would suggest the random effects model is more appropriate for meta-analysis in these settings [12].

An alternative approach to model selection is to base the decision on the detection of heterogeneity using Cochran’s Q statistic or the I² index. Some authors advocate that the presence of heterogeneity in the average effect size mandates the use of a random effects model to account for within- and between-study variability, while its absence suggests the use of a fixed effects model [15, 16]. However, owing to the low power of the tests for heterogeneity, genuine between-study variation may remain undetected [12, 15], which may in turn result in an inappropriate application of the fixed effects model to substantially dispersed data [12]. Conversely, applying a random effects model to a data set with less than expected dispersion will not affect the outcomes obtained: the random effects model will simply behave like the fixed effects model [12]. For these reasons, the random effects model is recommended for meta-analyses of clinical trials, even in cases where tests of heterogeneity fail to reject the null hypothesis of equality of average effect sizes [6, 12, 15].
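The decision rule just described can be stated in a few lines. The sketch below applies it to hypothetical data, with an assumed conventional significance threshold of 0.10 for Cochran’s Q; as argued above, the low power of this test is precisely why the rule can mislead.

```python
# Minimal sketch of the heterogeneity-driven model selection rule
# (hypothetical data; the 0.10 threshold is a common convention, not
# taken from the source). Under homogeneity, Q follows a chi-square
# distribution with k - 1 degrees of freedom.
import numpy as np
from scipy.stats import chi2

effects = np.array([0.10, -0.20, 0.35, 0.05, 0.50])
w = 1.0 / np.array([0.15, 0.20, 0.25, 0.10, 0.30]) ** 2

pooled = np.sum(w * effects) / np.sum(w)
q = np.sum(w * (effects - pooled) ** 2)
p = chi2.sf(q, df=len(effects) - 1)  # p-value for Cochran's Q

model = "random effects" if p < 0.10 else "fixed effects"
print(f"Q = {q:.2f}, p = {p:.3f} -> naive rule selects the {model} model")
```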

Heterogeneity is the variation occurring between studies and may arise from differences in the studied populations, the interventions imposed, the exposures studied, methodological design and the outcomes obtained [14]. Although some degree of heterogeneity is inevitable in a meta-analysis owing to the realities of clinical practice [17, 18], the degree of between-study heterogeneity present determines the quality and legitimacy of the results obtained [8]. Detection of heterogeneity within and between the included studies is therefore an essential step in meta-analytical procedures, as it has implications for proceeding with the analysis and for interpreting the computed results [8, 18]. The Q test and the I² index are the methods commonly used in meta-analysis to detect heterogeneity. An I² index of 0 % suggests that no between-study variability is present and that all observed variation is a result of sampling error; conversely, the closer the I² index approaches 100 %, the greater the extent to which the observed variation can be attributed to between-study variability rather than to sampling error alone. Both methods have limitations. First, they have low statistical power to detect heterogeneity when study numbers are small and may be over-sensitive when large numbers of studies are present [16]. Secondly, when reported without its 95 % confidence interval, the I² index is open to misinterpretation about the degree of heterogeneity present [19]: an I² estimated at 0 % may still have a wide 95 % confidence interval, and an upper confidence limit reaching values that reflect substantial heterogeneity indicates that considerable heterogeneity cannot be ruled out [19]. For these reasons it is recommended that the I² index always be reported with its 95 % confidence interval [19].
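The computations behind both statistics are brief. The following minimal sketch (reusing the same hypothetical study values) computes Cochran’s Q, the I² index and, because the paragraph above recommends reporting it, a test-based 95 % confidence interval for I² following the Higgins and Thompson approach; for brevity it assumes Q > 0 and at least three studies.

```python
# Minimal sketch: Cochran's Q and the I^2 index, with a test-based
# 95 % confidence interval for I^2 (Higgins & Thompson method).
# All study values are hypothetical; assumes Q > 0 and k >= 3.
import numpy as np

def heterogeneity(effects, std_errs):
    effects = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(std_errs, dtype=float) ** 2
    k = len(effects)
    pooled = np.sum(w * effects) / np.sum(w)

    # Cochran's Q: weighted squared deviations from the fixed effects pool.
    q = np.sum(w * (effects - pooled) ** 2)

    # I^2: percentage of total variation attributable to between-study
    # variability rather than sampling error (truncated at 0 %).
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0

    # Test-based 95 % CI via H = sqrt(Q / (k - 1)).
    h = np.sqrt(q / (k - 1))
    if q > k:
        se_ln_h = 0.5 * (np.log(q) - np.log(k - 1)) / (
            np.sqrt(2.0 * q) - np.sqrt(2.0 * k - 3.0))
    else:
        se_ln_h = np.sqrt(1.0 / (2.0 * (k - 2)) *
                          (1.0 - 1.0 / (3.0 * (k - 2) ** 2)))
    h_lo = max(1.0, np.exp(np.log(h) - 1.96 * se_ln_h))  # truncate H at 1
    h_hi = max(1.0, np.exp(np.log(h) + 1.96 * se_ln_h))
    to_i2 = lambda x: (x ** 2 - 1.0) / x ** 2 * 100.0
    return q, i2, to_i2(h_lo), to_i2(h_hi)

q, i2, lo, hi = heterogeneity([0.10, -0.20, 0.35, 0.05, 0.50],
                              [0.15, 0.20, 0.25, 0.10, 0.30])
print(f"Q = {q:.2f}, I2 = {i2:.1f} % (95 % CI {lo:.1f} % to {hi:.1f} %)")
```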

The vulnerability of meta-analysis to publication bias is a commonly expressed concern and one of the major criticisms of the technique, as the validity of a meta-analysis relies on a thorough representation of eligible studies being located [6, 8, 14, 20]. The selective publication of studies, such as those with statistically significant results and those from large multi-centre trials, in preference to smaller studies or those demonstrating little treatment effect, is well recognized in the literature [18, 21, 22]. Studies with statistically significant results are three times more likely to be published, and published more promptly, than those that do not report a significant treatment effect [22]. Studies sponsored by pharmaceutical companies are less likely to be published than those funded by government or other organizations, and multi-centre studies are more likely to be published than single-centre studies [23–25]. These realities of publishing practice mean that pooled effect sizes obtained exclusively from the published scientific literature may overstate the magnitude of harm or benefit of an intervention [18, 21], with serious implications for applying meta-analysis recommendations to clinical practice. Several statistical methods are available for the investigation of publication bias. These may be broadly categorized as methods developed to assess the presence of publication bias, to examine its impact, to adjust for its assumed presence, and to predict the number of missing studies and thus the likelihood that publication bias is present [26].

Funnel plots are the traditional and most widely used method for detecting publication bias in meta-analyses [6, 8, 14, 18, 22, 27, 28]. They are a form of scatter plot in which the treatment effects of the individual studies are plotted against a measure of study precision, such as the standard error, the inverse standard error (precision), study size or variance [22, 26, 28]. A funnel plot of a meta-analysis free of publication bias is symmetrical, with the points forming an inverted funnel shape around the overall treatment effect [6, 8, 20, 28]. The expected inverted funnel shape arises because precision increases with study size, and because larger, more highly powered studies are expected to be outnumbered by smaller studies showing a wider scatter of outcomes [26, 28]. However, the interpretation that symmetry equates to the absence, and asymmetry to the presence, of publication bias has been criticized as misleading and too simplistic, owing to the number of other factors that can affect the shape and symmetry of funnel plots [6, 20, 28]. The presence of true heterogeneity, or the overestimation of treatment effects due to flawed research methodology, has an impact on the shape and direction of the funnel plot [20, 28]. The model of meta-analysis applied (i.e. fixed or random effects) can also affect the detection of publication bias: as random effects models give relatively greater weight to smaller studies than the fixed effects model, the magnitude of bias is increased when publication bias is present in a meta-analysis to which the random effects model is applied [13, 28]. Finally, the assessment of funnel plot symmetry or asymmetry is generally conducted visually and is therefore subjective [6, 21, 29]. Nevertheless, funnel plots continue to be promoted for the assessment of publication bias [6] and appear to have merit as a method of data exploration [14].
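By way of illustration, the sketch below draws a funnel plot for a set of invented study results and, as a numeric complement to the subjective visual reading, runs Egger’s regression test for asymmetry. It is a hypothetical example assuming matplotlib and statsmodels are available; it is not a reconstruction of any analysis discussed in this issue.

```python
# Minimal sketch: funnel plot plus Egger's regression test on
# hypothetical study results (log odds ratios and standard errors).
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

effects = np.array([0.10, -0.20, 0.35, 0.05, 0.50, 0.25, -0.05])
ses = np.array([0.15, 0.20, 0.25, 0.10, 0.30, 0.18, 0.12])

# Funnel plot: effect size against standard error, with the y-axis
# inverted so the most precise studies sit at the top of the funnel.
plt.scatter(effects, ses)
plt.gca().invert_yaxis()
plt.xlabel("Treatment effect (log odds ratio)")
plt.ylabel("Standard error")
plt.title("Funnel plot")
plt.savefig("funnel.png")

# Egger's test: regress the standardized effect on precision (1/SE);
# an intercept significantly different from zero suggests asymmetry.
X = sm.add_constant(1.0 / ses)
fit = sm.OLS(effects / ses, X).fit()
print(f"Egger intercept = {fit.params[0]:.3f}, p = {fit.pvalues[0]:.3f}")
```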

Pooling results from multiple studies retrospectively inevitably pools the biases contained in each individual study, and this remains a principal criticism of meta-analysis [8, 30]. Common biases include those related to location, language and methodological quality. Even if positive and negative biases partially cancel each other, their varying magnitudes and the lack of information on their direction make the assessment process complicated and uncertain.

Sensitivity analyses are undertaken to explore sources of bias or effect modifiers suspected to be present among the included studies [8, 18]. They are performed by altering features of the initial analysis with a view to assessing the robustness of the results obtained when different aspects of the analysis are changed [31, 32]. Examples include exclusion of studies on the basis of methodological quality or sample size; analysis with both random and fixed effects statistical models; and inclusion of studies in which unclarified or ambiguous reporting resulted in earlier exclusion [7, 8, 31, 32]. Methods for the detection of publication bias are a specific form of sensitivity analysis [14]. The decision to perform a sensitivity analysis may be made during the planning stages of a meta-analysis; often, however, it is the decisions and uncertainties identified while undertaking the review that highlight the need for these investigations [31, 32]. Sensitivity analyses help identify the impact of a specific factor on the effect size and allow effect sizes to be compared under various study conditions.
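As a concrete example, a simple leave-one-out sensitivity analysis recomputes the pooled estimate with each study excluded in turn; a result that swings markedly when a single study is dropped signals that the conclusion rests heavily on that study. The sketch below reuses the hypothetical pool_effects() helper and study values from the earlier fixed/random effects sketch.

```python
# Minimal sketch of a leave-one-out sensitivity analysis, reusing the
# hypothetical pool_effects() helper defined in the earlier sketch.
import numpy as np

effects = np.array([0.10, -0.20, 0.35, 0.05, 0.50])
ses = np.array([0.15, 0.20, 0.25, 0.10, 0.30])

for i in range(len(effects)):
    keep = np.arange(len(effects)) != i  # mask excluding study i
    fixed, random = pool_effects(effects[keep], ses[keep])
    print(f"excluding study {i + 1}: fixed = {fixed:.3f}, "
          f"random = {random:.3f}")
```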

In this issue of Hernia, three independent meta-analyses on the use of heavyweight versus lightweight mesh in inguinal herniorrhaphy are published. The first paper, by Uzzaman et al. [33], is based on the older QUOROM guidelines [9], which have been superseded by the PRISMA guidelines [10]; the authors have therefore missed a number of checklist items. They have not provided the PRISMA flow chart of information through the different phases of the meta-analysis, which gives a snapshot of the various stages of the review process; instead they have used the old QUOROM flow chart. Secondly, the authors utilised fixed effects methods to calculate the effect size for a number of outcome variables. In our opinion this methodology is questionable even in the absence of heterogeneity, as discussed above. The meta-analysis also lacks methodological rigor and contains a number of errors. The authors have analysed erroneous sample sizes (i.e. incorrect numbers of hernias) for several studies, including Bringman et al. [34] (correct sample size 494, LW = 251 and HW = 243), Smietanski [35] (correct sample size 380, LW = 208 and HW = 172) and Smietanski et al. [36] (correct sample size 182, LW = 92, HW = 90). The pooled data for a number of variables are therefore incorrect, and the results unreliable and invalid.

The authors rightly excluded the studies by Nikkolo et al. [37] and Puccio et al. [38] because patients were not blinded to the interventions. Their exclusion of the studies by Paajanen [39] and Toreivia et al. [40] was surprising, as the randomization methodology, though not perfect, was adequate for inclusion in the meta-analysis. The study by Champault et al. [41] was rightly excluded, as it combines laparoscopic and open inguinal hernia repair, which we consider to be two different interventions altogether; its inclusion would have introduced procedural bias. Table 3 [33] addresses the risk of bias in the included studies, although this could have been expanded to include, among other factors, location, the surgeons’ experience in performing this type of surgery and language. The authors’ definition of heavyweight mesh (>80 g/m²) versus lightweight mesh (40 g/m²) is debatable, and factors such as pore size should have been included in the description of the two types of mesh. Certainly, all types of mesh shrink over time. The question remains whether lightweight mesh shrinks more than heavyweight mesh; if so, only long-term data will determine the true recurrence rate, an issue that might be worth pursuing in future publications by these authors. The authors have made no attempt to assess publication bias, which may have an impact on the cumulative evidence. This is not unusual: a recent publication revealed that 30 (63 %) of 48 meta-analyses failed to make any reference to publication bias [42]. The PRISMA 2009 checklist [10] makes the reporting of this issue one of the key points when compiling a meta-analysis.

The second meta-analysis, by Smietanski et al. [43], included eight trials, two more than that of Uzzaman et al. [33], and shows far more robust methodological precision. The authors have followed the latest PRISMA guidelines and have included most, though not all, of the checklist items. They appropriately utilised the random effects method for analysis even in the absence of heterogeneity, in line with the arguments above. The inclusion of the study by Paajanen [39] provides over 221 extra patients for analysis. However, the inclusion of the study by Nikkolo et al. [37], which, unlike the other seven studies, did not blind patients to the type of mesh used (patient bias), is one of the weaknesses of this paper. The authors have taken due care in their analysis, and no major discrepancies or errors were detected in any of the studies (sample sizes based on numbers of patients or hernias) (Table 1 [43]); the results are therefore accurate, reliable and valid. The authors have provided a number of funnel plots for the detection of publication bias, the simplest and most commonly used method.

The last meta-analysis to grace the pages of this journal, by Junsheng et al. [44], is in our view flawed from the outset. The authors have tried to compare the effects of lightweight and heavyweight meshes in two entirely different procedures, namely open (anterior) hernia repair and laparoscopic hernia repair. We strongly believe that this is akin to comparing apples and oranges. Secondly, they have tried to analyse randomized controlled trials (RCTs) together with non-randomized controlled trials, which is inappropriate. In RCTs, subjects are randomized either to a treatment group (lightweight mesh) or to a control group (heavyweight mesh). By the very nature of chance, random assignment tends to balance covariates, so that there are no systematic differences (bias) in measured or unmeasured covariates between subjects assigned to the treated and control groups; if randomization is performed correctly, differing outcomes indicate a treatment effect. A major issue is that non-randomized or observational studies are more exposed and prone to bias because of poor control for unmeasured confounding variables. The medical literature, as is the case with this systematic review and meta-analysis, is nonetheless full of non-randomized and uncontrolled studies. Whether the results of such studies should be dismissed completely or considered useful additional information for systematic reviews remains a moot point; to the best of our knowledge, however, the practice of combining RCTs with non-randomized controlled trials is frowned upon and considered improper. The authors used Egger’s test for the detection of publication bias; however, as the analysis is flawed, its inclusion is meaningless.

The importance of methodological rigor in conducting a good quality meta-analysis, which is considered to represent the highest level of evidence in conceptualizations of evidence-based practice, cannot therefore be overstated. The number of factors that directly and indirectly affect the determination of the estimated effect size is large, and it is incumbent upon authors to take due care in framing and conducting their search strategies using the latest guidelines and checklists. This issue of Hernia, however, highlights some disturbing trends in the publication of meta-analyses. First, the eagerness of authors to submit substandard meta-analyses, without taking due care over their search strategies or using the right methodological and analytical tools, is evident. This lack of rigor produces misleading and erroneous results which, if adopted in routine surgical practice, may cause patients more harm than good. Secondly, the willingness of journals to accept these meta-analyses on the basis of reports from referees who may lack experience in meta-analysis and be unfamiliar with the latest guidelines is concerning. The acceptance of substandard meta-analyses may have a negative impact on a journal’s standing in the surgical community. To address these concerns, it is imperative that journals produce guidelines for these meta-analyses, insist that authors adhere rigidly to them, and ensure that the review process involves referees with a thorough knowledge of meta-analysis (both clinical and statistical). It is important for authors and journals alike to identify the limitations and flaws in the meta-analyses they conduct and publish, so that there is absolute transparency in terms of outcomes.