A random effects metaanalysis model with BoxCox transformation
 1.4k Downloads
 3 Citations
Abstract
Background
In a random effects metaanalysis model, true treatment effects for each study are routinely assumed to follow a normal distribution. However, normality is a restrictive assumption and the misspecification of the random effects distribution may result in a misleading estimate of overall mean for the treatment effect, an inappropriate quantification of heterogeneity across studies and a wrongly symmetric prediction interval.
Methods
We focus on problems caused by an inappropriate normality assumption of the random effects distribution, and propose a novel random effects metaanalysis model where a BoxCox transformation is applied to the observed treatment effect estimates. The proposed model aims to normalise an overall distribution of observed treatment effect estimates, which is sum of the withinstudy sampling distributions and the random effects distribution. When sampling distributions are approximately normal, nonnormality in the overall distribution will be mainly due to the random effects distribution, especially when the betweenstudy variation is large relative to the withinstudy variation. The BoxCox transformation addresses this flexibly according to the observed departure from normality. We use a Bayesian approach for estimating parameters in the proposed model, and suggest summarising the metaanalysis results by an overall median, an interquartile range and a prediction interval. The model can be applied for any kind of variables once the treatment effect estimate is defined from the variable.
Results
A simulation study suggested that when the overall distribution of treatment effect estimates are skewed, the overall mean and conventional I ^{2} from the normal random effects model could be inappropriate summaries, and the proposed model helped reduce this issue. We illustrated the proposed model using two examples, which revealed some important differences on summary results, heterogeneity measures and prediction intervals from the normal random effects model.
Conclusions
The random effects metaanalysis with the BoxCox transformation may be an important tool for examining robustness of traditional metaanalysis results against skewness on the observed treatment effect estimates. Further critical evaluation of the method is needed.
Keywords
Metaanalysis Random effects model Skewed data BoxCox transformationAbbreviations
 BC
Proposed model using BoxCox transformation without sign inversion
 BCSI
Proposed model using BoxCox transformation with sign inversion
 EXP
Shifted exponential distribution
 GLMMs
Generalised linear mixed models
 IPD
Individual patient data
 IQR
Interquartile range
 LN
Shifted lognormal distribution
 MCMC
Markov chain Monte Carlo
 N
Normal distribution
 NRE
Normal random effects model
 nSN1
Skewnormal distribution with moderate negative skewness
 nSN2
Skewnormal distribution with large negative skewness
 pSN1
Skewnormal distribution with moderate positive skewness
 pSN2
Skewnormal distribution with large positive skewness
 RMSE
Root mean square error
Background
Metaanalysis is a useful statistical tool for combining results from independent studies, for example where estimates of a treatment effect (e.g odds ratio, mean difference or standardised mean difference) from randomised controlled trials are pooled in order to make inferences about an overall summary effect. A random effects metaanalysis model that assumes different true treatment effects underlying different studies is often needed as it allows for unexplained heterogeneity across studies [1]. In the random effects model, the true treatment effects for each study are usually assumed to follow a normal distribution; thus, an overall mean (summary) effect is obtained by estimating the mean parameter of this distribution.
In this article, we focus on problems caused by an inappropriate normality assumption of the random effects distribution, in particular in regard to the impact on the mean effect estimate, quantification of heterogeneity and prediction interval. Turner et al. [2] suggested that the misspecification of the random effects distribution seriously affected the estimates of the random effects variances. Lee and Thompson [3] showed that the shape of the predictive distributions of the treatment effect was substantially affected by the shape of the assumed random effects distribution. The normality assumption may therefore be a restrictive assumption for metaanalysts who are interested in producing a summary treatment effect, quantifying heterogeneity and deriving a prediction interval, especially if the true random effects distribution is skewed.
Alternative parametric distributions have been considered for the random effects distribution in mixed models; for example, tdistribution [4], gamma or mirrored gamma distribution [2], Laplace (doubleexponential) distribution [5], skewed normal or skewed tdistribution [3], mixture distributions [6]. And also, as an approach to outliers in metaanalysis, Baker and Jackson [7] proposed a model that allows the random effects to be longtailed, which provides a downweighting of outliers and removes the necessity for an arbitrary decision to exclude the outliers. Gumedze and Jackson [8] used likelihood ratio test statistics to detect and downweight outliers in the metaanalysis. However, each has disadvantages as discussed in Lee and Thompson [3]; for example, the mixture distributions can fail in situations where there are a few outliers. When assuming a skewed distribution for the random effects in a metaanalysis, the mean and the variance are not appropriate representatives for summarising the skewed true treatment effects. The overall mean for the skewed treatment effects would be pulled in the direction of the extreme observed estimates; hence, it could result in misleading conclusions from the metaanalysis. It is also not straightforward to quantify the impact of heterogeneity, such as I ^{2}, if there is a nonnormal random effects distribution. Indeed, Higgins et al. [9] mentioned that some alternative parametric distributions may not have parameters that naturally describe an overall effect, or the heterogeneity across studies.
Here, we propose a novel random effects metaanalysis model, where a BoxCox transformation [10] is applied to the observed treatment effect estimates. The aim of the BoxCox transformation is to achieve approximate normality of the overall distribution of the observed treatment effect estimates after transformation. The use of the BoxCox transformation in linear models has been studied extensively [11, 12, 13, 14]. In particular, Gurka et al. [15] provided an extension of the BoxCox transformation to linear mixed models and demonstrated that a single transformation parameter would simultaneously help achieve normality of both the random effects and the residual error. However, the BoxCox transformation has not been used commonly in the context of metaanalysis. Indeed, a work by Kim et al. [16] is the only metaanalytic application of the BoxCox transformation that we are aware of. They proposed a multivariate response BoxCox regression model for modelling individual patient data (IPD). However, the approach by Kim et al. [16] cannot apply to the cases of more readily available aggregate data (such as observed estimates of the treatment effect and their standard errors), because their model just allows the individual patient responses to be transformed and thus requires IPD. We rather consider transforming the observed treatment effect estimates using the BoxCox transformation and suggest summarising the overall effect by an overall median rather than the overall mean, and quantifying the impact of heterogeneity by an interquartile range rather than commonly used I ^{2}. The method no longer requires the IPD.
In this section, we introduce two motivating examples which will be used for illustrating the proposed model. In the “Methods” section, we introduce the standard normal random effects models, and describe how to make the Bayesian inference in the random effects metaanalysis from the following viewpoints: the overall mean effect, the heterogeneity and the prediction interval. And then, we describe our new random effects model with the BoxCox transformation. In the “Results” section, we conduct a simulation study to examine the performance of our proposed model under some situations where true random effects follow nonnormal distributions, and compare the results with those from the standard normal random effects model. Moreover, we illustrate our proposed model using the examples. Finally, we conclude this article with some discussion.
Motivating examples
Example 1: Teacher expectancy on pupil IQ
Example 2: Antidepressants for reducing pain in fibromyalgia syndrome
Hauser et al. [20] reported a metaanalysis of randomised controlled trials to investigate the efficacy of antidepressants for fibromyalgia syndrome, which is a chronic pain disorder associated with multiple debilitating symptoms. 22 trials using different classes of antidepressants were involved in the analysis, and estimates of the standardised mean difference in pain (for the antidepressant group minus the control group) were combined using a random effects model. The data was obtained from Figure 3 in Riley et al. [21]. Figure 1b shows a forest plot and a simple histogram of estimates of the standardised mean differences, with negative values indicating a benefit for the antidepressants. The histogram suggests the presence of negative skewness on the estimates.
Methods
Normal random effects model
where θ _{ i } is the true (but unknown) treatment effect for the ith study and is represented by the sum of θ and u _{ i }. The u _{ i } is assumed to follow a normal distribution with mean zero and variance τ ^{2}, indicating that the true treatment effect for the ith study, θ _{ i }, is normally distributed about θ with the betweenstudy variance τ ^{2}. ε _{ i } is a sampling error within the ith study and is assumed to follow a normal distribution with mean zero and variance \(\sigma _{i}^{2}\), where the withinstudy variance \(\sigma _{i}^{2}\) is commonly considered to be known. Of key interest is an estimate of the mean parameter of the random effects distribution, θ, as this provides the mean treatment effect of the included studies. Also of interest is an estimate of τ ^{2}, to quantify the amount of heterogeneity and to derive a 95 percent prediction interval [9].
Bayesian estimation of model parameters
where b is a constant value given by practitioners. It is well known that the results from Bayesian metaanalyses could be potentially sensitive to the choice of prior distributions, especially to the prior of the betweenstudy variance τ ^{2} (e.g. see Lambert et al. [22] for the details). Various noninformative priors for τ ^{2} have been suggested in previous researches; for example, a uniform prior on τ [23, 24], a uniform prior on log(τ ^{2}) [25], an inversegamma prior on τ ^{2} [26] and a halfCauchy prior on τ [24]. We consider the uniform prior on τ in the range of (0,b), where the upper limit, b, should be decided according to the individual situations. The uniform prior on the standard deviation increasingly becomes known as a reasonable alternative to a more general inversegamma prior on variance (e.g. see Gelman [24] for the details). In practice the sensitivity of specified priors should be investigated by applying many other priors for the parameters or by using prior distribution based on empirical evidence [27, 28], though we in this article avoid the extensive discussion for the prior. In the Bayesian framework, a posterior mean and a 95 percent credible interval are commonly used for summarising the posterior distribution. We implement our Bayesian analysis by using Markov chain Monte Carlo (MCMC) methods, with a free R software and its rstan package (see the Stan Modelling Language User’s Guide and Reference Manual [29] for the details). The source code for conducting metaanalyses with the normal random effects model (1) is shown in Additional file 1.
Quantification of heterogeneity
Here, s ^{2} is referred to as ‘typical’ withinstudy variance. In this article, we calculate I ^{2} based on the estimated τ ^{2} during each sample of the Bayesian estimation process; that is, we summarise the posterior distribution of I ^{2} derived by using samples of τ ^{2} drawn from its posterior distribution.
Prediction interval
Following estimation of model (1) using the MCMC, a (100−q) percent prediction interval is obtained by taking (q/2)th and (100−q/2)th quantiles of samples drawn from the predictive distribution (5). For example, lower and upper bounds of 95 percent prediction interval are given by 2.5th and 97.5th quantiles of samples from the predictive distribution, respectively. Note that this is just one option for obtaining the 95 percent prediction interval, and other ways of defining the interval can be chosen depending on where we want to take the lower and upper limits. When interest lies in predicting probability that the treatment is effective by more than a clinically important difference in a new study, we can find this by calculating the proportion of samples drawn from the predictive distribution which satisfy a specified criteria for the effectiveness of the treatment (e.g. odds ratio < 80 percent). The sampling can be achieved by first drawing samples of parameters from the posterior distribution p(θ,τ ^{2}y) and then drawing samples from p(θ _{new}θ,τ ^{2}) with fixed parameters obtained in the previous step [4]. In the second step, the drawing is performed by θ _{new}∼N(θ,τ ^{2}). In this manner, the prediction interval accounts for the heterogeneity in true treatment effects and naturally incorporates all parameter uncertainty (e.g. in θ and τ ^{2}). It should be interpreted differently from the credible interval for the mean effect, which only indicates the uncertainty in the mean effect itself, not the entire distribution of true treatment effects across studies [21].
Random effects model with BoxCox transformation
BoxCox transformation
for y _{ i }+α>0 (i=1,…,k), where we keep y _{ i } for ease of notation, though y _{ i } could refer to any continuous measure (not just an effect size). λ and α denote a transformation and a shift parameter respectively, and these parameters are estimated from the observed data. \(\dot {g}(\alpha)\) is a geometric mean of y _{ i }+α for i=1,…,k. The normalisation using the geometric mean \(\dot {g}(\alpha)\) could lead a stable estimation of λ and α, in comparison with a standard BoxCox transformation without the normalisation.
To be exact, it is proper to assume that the transformed variable y _{ i }(λ,α) follows a truncated normal distribution except for the case of λ=0, because of the condition that y _{ i }+α must be a positive value. When interest lies in inference in original scale before transformation (not in the scale after transformation), we need to specify the distribution of the observed values before the BoxCox transformation and deal with the truncation precisely [31, 32, 33, 34]. However, these are beyond the scope of this article. For mathematical convenience, we below assume that the transformed variable y _{ i }(λ,α) follows a normal distribution with no consideration of the truncation.
Proposed metaanalysis model and its estimation
The model structure is basically same as the normal random effects model (1), though now the BoxCox transformation (6) is applied to the observed treatment effect estimates for each study and μ _{ i } denotes a true effect of the BoxCox transformed variable y _{ i }(λ,α) which has a ‘known’ variance of \(\phi _{i}^{2}(\lambda,\alpha)\) (see section below). The proposed model aims to improve the overall normality of the observed treatment effects estimates (y _{ i }) across studies; their overall distribution is the sum of the random effects distribution of true effects and the withinstudy sampling distribution of estimates. As long as the studies have reasonable sample size, the withinstudy sampling distribution of y _{ i } will be approximately normal due to the central limit theorem. However, there is no such guarantee for the random effects distribution [7], and thus any asymmetry in the random effects distribution will consequently cause asymmetry in the overall distribution for y _{ i }. The following processes are required to implement the proposed random effects model (7).
Definition of variance of the BoxCox transformed treatment effect estimate
where recall α is the shift parameter, λ is the transformation parameter, \(\dot {g}(\alpha)\) is the geometric mean of y _{ i }+α for i=1,…,k, μ is the mean parameter of the random effects distribution in the transformed scale and \(\sigma _{i}^{2}\) is the withinstudy variance from the ith study. The relationship between variances before and after transformation has been applied for stabilising variance [35, 36] or representing inhomogeneity variances in linear models with BoxCox transformation weighting [37].
Frequentist estimation of λ and α
We treat the transformation parameter λ and the shift parameter α as nonstochastics; i.e. we first estimate these parameters by a maximum likelihood estimation, and then make inference about the other parameters μ and τ ^{2} conditioning on \(\lambda =\hat {\lambda }\) and \(\alpha =\hat {\alpha }\), where \(\hat {\lambda }\) and \(\hat {\alpha }\) are maximum likelihood estimates of λ and α respectively. Maruo and Goto [34] has investigated the influence of not considering the uncertainty associated with estimation of λ, and showed the confidence interval around the median from an univariate analysis with the BoxCox transformation was slightly liberal (from two to three percent).
Substituting the estimates \({\hat {\mu }(\lambda,\alpha)}\) and \({\hat {\tau }^{2}(\lambda,\alpha)}\) into l _{ λ,α }(μ,τ ^{2}), then we have a log likelihood \(l_{\lambda,\alpha }(\hat {\mu }(\lambda,\alpha)\), \(\hat {\tau }^{2}(\lambda,\alpha))\) for the fixed λ and α. Then, we obtain a set of (λ,α) for which the log likelihood takes the largest value as the approximate values of \(\hat {\lambda }\) and \(\hat {\alpha }\).
An issue known as nonregular problem is caused in the maximum likelihood estimation of α because the range of the distribution is determined by the unknown shift parameter α [38, 39]. For example, it is argued that the likelihood function of α fails to have a local maximum [38]. In this article, we focus on the inference in the original scale before transformation; hence, we assume the concern about the estimation of α would not have substantial impact than if we were interested in the exact estimation of the transformation and the shift parameter (λ and α) themselves. This could be an area of further research.
Bayesian estimation of model parameters
We assume the vague priors for μ and τ ^{2} in the same way as (2); i.e. μ∼N(0,10000) and τ∼U(0,b), where b is a constant value given by practitioners. It is straightforward to draw samples from the posterior distribution (10) by MCMC. The source code for conducting metaanalyses with the proposed model (7) is shown in Additional file 1, which includes the step of finding the maximum likelihood estimates of λ and α.
Interpretation of results
A median overall treatment effect
The median (13) can now be used for the inference of an overall (summary) treatment effect on the original scale. We recommend using the median as a representative of centre of skewed distributions, which is more robust than the mean against the skewness and the outliers on the observed treatment effect estimates.
Quantification of heterogeneity using the ratio of IQR squares
from the proposed model (7) would be close to the square root of betweenstudy variance from the normal random effects model (1). For this comparability, we recommend using the measure of (14), which is known as normalised IQR, for quantifying the magnitude of the heterogeneity.
The ratio of IQR squares would be comparable with the I ^{2} when the treatment effect estimates are normally distributed, because of the comparability between the IQR and the betweenstudy variance.
Prediction interval
A (100−q) percent prediction interval can be obtained by taking (q/2)th and (100−q/2)th quantiles of samples drawn from the predictive distribution (17). Again, note that this is just one option for obtaining the 95 percent prediction interval as mentioned in the previous section.
Another transformation for dealing with the negative skewness
As described in the previous section, the BoxCox transformation (6) requires the condition that y _{ i }+α must be a positive value for i=1,…,k, which can cause difficulty in estimating the model parameters. This may also be more problematic when the treatment effect estimates have negative skewness, because the shift parameter is subject to inflation in such situation. In order to avoid the negative skewness on the treatment effect estimates, we here consider another transformation using a sign inversion. The transformation with the sign inversion described below will be applied only when the observed treatment effect estimates are negatively skewed.
where \(\dot {h}(\alpha)\) is now a geometric mean of −y _{ i }+α for i=1,…,k. For each study, the same withinstudy variances can be assigned to the inverted treatment effect estimates.
for obtaining samples from the predictive distribution.
Implementation of our proposed model
 1.
Calculate the weighted sample skewness (18).
 2.
If the weighted sample skewness calculated in Step 1 takes a negative value, invert the sign of observed treatment effect estimates and then move to Step 3; otherwise just move to Step 3.
 3.
Calculate the maximum likelihood estimates of the transformation (λ) and the shift (α) parameter using the loglikelihood function (9).
 4.
Perform the Bayesian estimation (MCMC sampling) for the other parameters given the maximum likelihood estimates of the transformation and the shift parameter.
 5.
Calculate the measures of interest (overall median, normalised IQR and ratio of IQR squares) using the MCMC samples obtained in Step 4.
 6.
Draw samples from the predictive distribution using the MCMC samples obtained in Step 4, and calculate the prediction interval.
Steps 1 and 2 are needed only when applying the sign inversion. Without the sign inversion, we will begin the procedure from Step 3.
Results
Simulation study
We conducted a simulation study to examine the comparative performance of the standard normal random effects model (1) and the proposed model (7). Since the proposed model allows the presence of skewness on the treatment effect estimates, we supposed some situations where the true treatment effects had a skewed distribution, and compared results from the two models in terms of the estimation of overall effect and the quantification of heterogeneity. We also supposed another situation where the treatment effect estimates were normally distributed. In such situation, the two models are expected to provide similar results.
Design
Overview of the simulation study
Step 1  Choose a random effects distribution f(ψ) from candidates including normal distributions, skewnormal distributions, shifted exponential distributions and shifted lognormal distributions, where ψ represents a true parameter vector of the random effects distribution. 
Step 2  Choose the number of studies (k), mean of the distribution for the withinstudy variance (σ ^{2}) and true parameters of the random effects distribution (ψ). 
Step 3  Draw a withinstudy variance of the treatment effect estimate for the ith study (i=1,…,k); \(\tilde {\sigma }_{i}^{2}\sim N(\sigma ^{2},0.040)\) conditioned on \(0.010<\tilde {\sigma }_{i}^{2}<(2\sigma ^{2}0.010)\). 
Step 4  Draw a sampling error of the treatment effect estimate for the ith study (i=1,…,k); \(\tilde {\epsilon }_{i}\sim N(0,\tilde {\sigma }_{i}^{2})\), where \(\tilde {\sigma }_{i}^{2}\) is obtained in Step 3. 
Step 5  Draw a true treatment effect for the ith study (i=1,…,k); \(\tilde {\theta }_{i}\sim f(\psi)\), where f(ψ) is the specified random effects distribution with the true parameter ψ. 
Step 6  Obtain a treatment effect estimate for the ith study (i=1,…,k); \(\tilde {y}_{i}=\tilde {\theta }_{i}+\tilde {\epsilon }_{i}\), where \(\tilde {\epsilon }_{i}\) and \(\tilde {\theta }_{i}\) are obtained in step 4 and step 5 respectively. 
Step 7  Using \(\tilde {y}_{i}\) and \(\tilde {\sigma }_{i}^{2}\) for i=1,…,k, fit the normal random effects model (1) and the proposed model (7) separately. 
Step 8  Obtain a posterior median and a 95 percent credible interval of the overall mean from the normal random effects model (1), and those of the overall median from the proposed model (7). Check whether their credible intervals contain the true overall median of 0.000. 
Step 9  Obtain a posterior median and a 95 percent credible interval of the I ^{2} from the normal random effects model (1), and those of the ratio of IQR squares from the proposed model (7). Check whether their credible intervals contain the true ratio of IQR squares given by one of either (20.0%, 40.0%, 80.0%). 
Step 10  Repeat Steps 1 to Step 9 10,000 times. 
Step 11  Using the posterior medians of the overall mean or the overall median obtained in Step 8, compute a bias and a root mean square error around the true overall median of 0.000. 
Step 12  Obtain a coverage probability of the overall mean or the overall median by computing the proportion of the time that the 95 percent credible intervals contained the true overall median of 0.000. 
Step 13  Using the posterior medians of the I ^{2} or the ratio of IQR squares obtained in Step 9, compute a bias and a root mean square error around the true ratio of IQR squares given by one of either (20.0%, 40.0%, 80.0%). 
Step 14  Obtain a coverage probability of the I ^{2} or the ratio of IQR squares by computing the proportion of the time that the 95 percent credible intervals contained the true ratio of IQR squares given by one of either (20.0%, 40.0%, 80.0%). 
Scenarios of random effects distributions and their true parameters
Random effects  Normalised  Ratio of  

distribution  Scenario  True parameter  Median  IQR  IQR squares 
Scenario 13: Normal distribution (N)  
N(mean,variance)  1  N(0,0.1581^{2})  0.000  0.025  20.0% 
2  N(0,0.2582^{2})  0.000  0.067  40.0%  
3  N(0,0.6325^{2})  0.000  0.400  80.0%  
Scenario 46: Skewnormal distribution with moderate positive skewness (pSN1)  
SN(location,scale,slant)  4  S N(−0.1724,0.2547,5)  0.000  0.025  20.0% 
5  S N(−0.2812,0.4159,5)  0.000  0.067  40.0%  
6  S N(−0.6883,1.0198,5)  0.000  0.400  80.0%  
Scenario 79: Skewnormal distribution with large positive skewness (pSN2)  
SN(location,scale,slant)  7  S N(−0.1734,0.2557,20)  0.000  0.025  20.0% 
8  S N(−0.2829,0.4192,20)  0.000  0.067  40.0%  
9  S N(−0.6923,1.0258,20)  0.000  0.400  80.0%  
Scenario 1012: Skewnormal distribution with moderate negative skewness (nSN1)  
SN(location,scale,slant)  10  S N(0.1715,0.2546,−5)  0.000  0.025  20.0% 
11  S N(0.2803,0.4154,−5)  0.000  0.067  40.0%  
12  S N(0.6874,1.0195,−5)  0.000  0.400  80.0%  
Scenario 1315: Skewnormal distribution with large negative skewness (nSN2)  
SN(location,scale,slant)  13  S N(0.1725,0.2556,−20)  0.000  0.025  20.0% 
14  S N(0.2820,0.4191,−20)  0.000  0.067  40.0%  
15  S N(0.6914,1.0255,−20)  0.000  0.400  80.0%  
Scenario 1618: Shifted exponential distribution (EXP)  
EXP(rate,shift)  16  E X P(5.1507,−0.1348)  0.000  0.025  20.0% 
17  E X P(3.1542,−0.2202)  0.000  0.067  40.0%  
18  E X P(1.2877,−0.5377)  0.000  0.400  80.0%  
Scenario 1921: Shifted lognormal distribution (LN)  
LN(mean,variance,shift)  19  L N(0,0.1578^{2},−1)  0.000  0.025  20.0% 
20  L N(0,0.2569^{2},−1)  0.000  0.067  40.0%  
21  L N(0,0.6147^{2},−1)  0.000  0.400  80.0% 
In Step 2, we set the number of studies, mean of the distribution for the within study variance and true parameters of the random effects distribution. In Steps 36, we obtained treatment effect estimates for each study. In particular, the withinstudy variance \(\sigma _{i}^{2}\) was drawn from a normal distribution with mean σ ^{2} and variance 0.040 conditioned on \(0.010<\sigma _{i}^{2}<(2\sigma ^{2}0.010)\). The mean of the normal distribution, σ ^{2}, was chosen so that the ‘typical’ withinstudy variance (4), which depended on the number of studies involved in the metaanalysis, became 0.100 on average. We set the value of σ ^{2} to either 0.1089, 0.1122, 0.1147, 0.1158 in each simulation with k=5,10,20,40, respectively.
In Step 7, using the generated metaanalysis data, we fit the normal random effects model (1) and the proposed model (7) separately. In the proposed model, we also applied the transformation with the sign inversion for the negatively skewed data. Note that the transformation with the sign inversion is applied only when the observed treatment effect estimates are negatively skewed. And then, in Steps 89, we computed the posterior medians and the 95 percent credible intervals of: the overall mean and the I ^{2} from the normal random effects model (1), the overall median and the ratio of IQR squares from the proposed model (7).

Bias around the true overall median: (mean of the posterior medians of the overall mean/the overall median) −(true overall median of 0.000)

Root mean square error (RMSE) around the true overall median: ((standard deviation of the posterior medians of the overall mean/the overall median) ^{2}+(bias around the true overall median)^{2}) ^{1/2}

Coverage probability of the true overall median (%): the proportion of the time that the 95 percent credible intervals of the overall mean/the overall median contained the true overall median of 0.000

Bias around the true ratio of IQR squares: (mean of the posterior medians of the I ^{2}/the ratio of IQR squares) −(true ratio of IQR squares given by one of either (20.0%, 40.0%, 80.0%))

RMSE around the true ratio of IQR squares: ((mean of the posterior medians of the I ^{2}/the ratio of IQR squares) ^{2}+(bias around the true ratio of IQR squares)^{2}) ^{1/2}

Coverage probability of the true ratio of IQR squares (%): the proportion of the time that the 95 percent credible intervals of the I ^{2}/the ratio of IQR squares contained the true ratio of IQR squares given by one of either (20.0%, 40.0%, 80.0%)
We notice that using the terms of bias, RMSE and coverage probability for the results from the normal random effects model is not necessarily correct. This is because the normal random effects model provided the results of overall mean and I ^{2}, which were not the targeted true values (or the reference values). However, in this article, the overall median and the ratio of IQR squares are highly recommended for representing the overall effect and quantifying the heterogeneity in the metaanalysis of skewed data, respectively. Then, the above quantities are useful for assessing how the findings under the two models can be different from the recommended inferential measures in skeweddata situations.
Estimation
Before estimation of model (7) for a particular simulated dataset, the grid search procedure was performed for estimating λ and α for the dataset. The candidate values of λ were specified in a range of −3.00≤λ≤6.00 with a step size of 0.01. We considered constituting a subset of α as the minimum values of {(y _{ i }+α):i=1,…,k}; i.e. α ^{∗}=α+ min{y _{ i }:i=1,…,k}. The candidate values of α ^{∗} were specified in a range of 0.01≤α ^{∗}≤2.01 with a step size of 0.10.
We used the normal and the uniform prior for the mean and the variance parameter respectively, as described in the previous section. The upper limit of the uniform prior distribution on τ was given by b=10 for each model. For the Bayesian estimation of model (1) and (7), the iterative process of the MCMC algorithm produced three chains each with 20,000 samples of parameters. We discarded the first 2,000 samples (socalled burnin samples) in order to prevent dependence on the starting values. And also, we took a sample at only every 2nd iteration (thinning) in order to avoid autocorrelation between the samples taken. Therefore in total, 24,000 samples of parameters were drawn. We graphically checked the convergence of MCMC sampling using first 5 simulations for each scenario, with no diagnostic methods.
Results

Figure 2 plots the results for the overall mean or the overall median, with the betweenstudy variation (the true ratio of IQR squares: Small = 20.0%, Moderate = 40.0%, Large = 80.0%) on the horizontal axis. The number of studies was fixed as k = 20.

Figure 3 plots the results for the I ^{2} or the ratio of IQR squares, with the betweenstudy variation (the true ratio of IQR squares: Small = 20.0%, Moderate = 40.0%, Large = 80.0%) on the horizontal axis. The number of studies was fixed as k = 20.

Figure 4 plots the results for overall mean or the overall median, with the number of studies (k = 10, 20 and 40) on the horizontal axis. The true ratio of IQR squares was fixed as 80.0% (i.e. the scenario of large betweenstudy variation).

Figure 5 plots the results for the I ^{2} or the ratio of IQR squares, with the number of studies (k = 10, 20 and 40) on the horizontal axis. The true ratio of IQR squares was fixed as 80.0% (i.e. the scenario of large betweenstudy variation).
The nominal level of coverage probability is 95 percent. All the scenarios of the random effects distributions are displayed in the same panels in order of the normal (N), the skewnormal with moderate positive skewness (pSN1), the skewnormal with large positive skewness (pSN2), the skewnormal with moderate negative skewness (nSN1), the skewnormal with large negative skewness (nSN2), the shifted exponential (EXP) and the shifted lognormal (LN) from left to right. And also, in each figure, (i) cross marks and solid lines represent the normal random effects model, (ii) black circle marks and broken lines represent the proposed model using BoxCox transformation (6), (iii) black triangle marks and dotted lines represent the proposed model using BoxCox transformation with the sign inversion (19) for the negatively skewed data. We below refer to the normal random effects model, the proposed model without and with the sign inversion as NRE, BC and BCSI respectively.
Overall treatment effect
When the normal distributions were assumed as the true random effects distribution, the NRE and the BCSI provided unbiased estimations of the overall effect, regardless of the scenarios of the betweenstudy variation and the number of studies. The overall median from the BC was subject to a negative bias in the scenario of the large betweenstudy variation and the small number of studies, though this bias decreased as the number of studies increased. The NRE, the BC and the BCSI provided similar RMSEs, except for the scenario of the small number of studies where the RMSEs from the NRE were smaller than those from the BC and the BCSI. The coverage probabilities from the NRE were slightly larger than those from the BC and the BCSI in all the scenarios, but all these coverage probabilities were close to the nominal level of 95 percent.
When the skewnormal distributions were assumed as the true random effects distribution, the overall means from the NRE were pulled in the direction of skewness and substantially different from the true zero overall median, especially in the scenarios of the large betweenstudy variation. In the scenarios of the positive skewness (pSN1 and pSN2), the biases of the overall mean from the NRE increased positively; conversely in the scenarios of negative skewness (nSN1 and nSN2), those increased negatively. The degree of bias was larger in the scenario of the large skewness (pSN2 and nSN2). And also, regarding the overall means from the NRE in the scenarios of the large betweenstudy variation and the large number of studies, the RMSEs were inflated and the coverage probabilities were substantially below the nominal level of 95 percent. On the other hand, the overall medians from the BC and the BCSI had the smaller biases regardless of the scenarios of the betweenstudy variation and the number of studies. In the scenario of the negative skewness (nSN1 and nSN2) and the large betweenstudy variation, the BC was subject to a negative bias, though this bias decreased as the number of studies increased. The BC and the BCSI provided quite similar RMSEs and coverage probabilities in the scenarios of the positive skewness (pSN1 and pSN2); while, in the scenarios of the negative skewness (nSN1 and nSN2) and the large betweenstudy variation, the coverage probabilities from the BC were below the nominal level of 95 percent. This indicates that the BC could have difficulty in dealing with the negatively skewed data as expected, but the BCSI performs well.
When the shifted exponential and the shifted lognormal distributions were assumed as the true random effects distribution, the overall means from the NRE were substantially different from the true zero overall median, especially in the scenarios of the large betweenstudy variation. And also, in such situation, the RMSEs were seriously inflated and the coverage probabilities were below the nominal level of 95 percent, which became more noticeable as the number of studies increased. This would be because the scenarios including larger number of studies tended to generate more heavytailed data. On the other hand, the overall medians from the BC and the BCSI were similar and had much smaller biases in comparison with the overall means from the NRE. The BC and the BCSI also provided similar results of the RMSE and the coverage probability, which were much better than those from the NRE especially in the scenarios of the large betweenstudy variation and the large number of studies.
In summary, we found that the overall mean from the NRE could be substantially influenced by the skewness on the random effects distribution. Taking into account that the overall mean from the NRE was pulled in the direction of skewness and had the lower coverage probability, the NRE might therefore produce overall effect estimates that do not reflect the median treatment effect if the overall distribution of treatment effect estimates is skewed or heavytailed. Moreover, it was indicated that the sign inversion in the BoxCox transformation could be an effective way for precisely estimating the overall median of the negatively skewed treatment effect estimates.
Quantification of heterogeneity
When the normal distributions were assumed as the true random effects distribution, the NRE, the BC and the BCSI provided similar results of the bias, the RMSE and the coverage probability, regardless of the scenarios of the betweenstudy variation and the number of studies.
When the skewnormal distributions were assumed as the true random effects distribution, the NRE, the BC and the BCSI provided similar results in almost all of the scenarios. In the scenarios of the large betweenstudy variation, the coverage probabilities of I ^{2} from the NRE were slightly lower than those of the ratios of IQR squares from the BCSI. In the scenarios of the negative skewness (nSN1 and nSN2), the ratios of IQR squares from the BC were subject to negative biases and had the larger RMSEs in comparison with the NRE and BCSI. This again indicates that the BC could have difficulty in dealing with the negatively skewed data.
When the shifted exponential and the shifted lognormal distributions were assumed as the true random effects distribution, the I ^{2} values from the NRE were larger than the ratios of IQR squares from the BC and the BCSI in the scenarios of the large betweenstudy variation. The RMSEs from the NRE, the BC and the BCSI were quite similar, though the coverage probabilities of I ^{2} from the NRE were seriously below the nominal level of 95 percent in the scenarios of the large betweenstudy variation, compared with those of the ratios of IQR squares from the BC and the BCSI. This became more noticeable as the number of studies increased, which would be again because the scenarios including larger number of studies tended to generate more heavytailed data. The BC and the BCSI provided quite similar results of the bias, the RMSE and the coverage probability, regardless of the scenarios of the betweenstudy variation and the number of studies.
In summary, we found that the I ^{2} from the NRE was influenced by the skewness on the random effects distribution. In particular, the heavytailed data seriously affected the estimation of I ^{2} in the NRE. Moreover, it was again indicated that the sign inversion in the BoxCox transformation could be an effective way for precisely estimating the ratios of IQR squares of the negatively skewed treatment effect estimates.
Performance when the number of studies is small
Additional file 2: Table S1 shows results for the scenario of the number of studies k=5. In regard to the estimation of the overall treatment effect, having small number of studies had a limited influence on bias. Indeed, the biases on the overall median from the BC and the BCSI as well as the overall mean from the NRE were similar to those for the scenario of the number of studies k=10, except for the overall median from the BC for the scenario of the negative skewness (nSN1 and nSN2) where the negative biases were increased. However, the coverage probabilities from the BC and the BCSI were below nominal level of 95 percent for almost all the scenarios. In particular, the BCSI provided around 90 percent coverage probabilities for the scenarios of the small and the moderate betweenstudy variation. In contract, the coverage probabilities from the NRE were substantially above the nominal level of 95 percent. These indicate an issue of metaanalysing the small number of studies. In regard to the quantification of heterogeneity, the NRE, the BC and the BCSI were subject to large positive bias of the I ^{2} or the ratio of IQR squares, which inflated their RMSEs. From these findings, we conclude our proposed model is applicable even when the number of studies is 5, but may have difficulty in ensuring sufficient accuracy in estimation of the overall treatment effect and quantification of heterogeneity.
Application
Consider now application to the examples described in the previous section. We applied the normal random effects model (1) and the proposed model (7) to each example, and estimated the posterior distributions of parameters of interest in each model. The transformation with the sign inversion was also applied to example 2 (the weighted sample skewnesses were 2.123 and −1.847 in example 1 and 2 respectively). Note that the transformation with the sign inversion is applied only when the observed treatment effect estimates are negatively skewed.
Estimation
Before estimation of model (7) for each example, the grid search procedure was performed for estimating λ and α. The candidate values of λ were specified in a range of −3.00≤λ≤6.00 with a step size of 0.01. We considered constituting a subset of α as the minimum values of {(y _{ i }+α):i=1,…,k}; i.e. α ^{∗}=α+ min{y _{ i }:i=1,…,k}. The candidate values of α ^{∗} was specified in a range of 0.01≤α ^{∗}≤2.01 with a step size of 0.10.
We used the normal and the uniform prior for the mean and the variance parameter respectively, as described in the previous section. The upper limit of the uniform prior distribution on τ was given by b=10 for each model. For the Bayesian estimation, the iterative process of the MCMC algorithm produced three chains each with 2,000,000 samples of parameters. We discarded the first 5000 samples (socalled burnin samples) in order to prevent dependence on the starting values. And also, we took a sample at only every 5th iteration (thinning) in order to avoid autocorrelation between the samples taken. Therefore in total, 1,185,000 samples of parameters were drawn. We again graphically checked whether the burnin samples were sufficient and that the MCMC chains converged, with no diagnostic methods.
Overall treatment effect and quantification of heterogeneity
Posterior median and 95 percent credible interval of: overall mean and square root of betweenstudy variance from the normal random effects model, overall median and normalised IQR from the proposed model
NRE  BC  BCSI  

Square root of  
betweenstudy  Normalised  Normalised  
Overall mean  variance  Overall median  IQR^{a}  Overall median  IQR^{a} 
Post. (s.d.)  Post. (s.d.)  Post. (s.d.)  Post. (s.d.)  Post. (s.d.)  Post. (s.d.) 
(95% CI)  (95% CI)  (95% CI)  (95% CI)  (95% CI)  (95% CI) 
Example 1: Teacher expectancy on pupil IQ  
0.083 (0.061)  0.146 (0.087)  0.030 (0.051)  0.084 (0.074)  n/a  n/a 
(−0.021,0.222)  (0.011,0.344)  (−0.058,0.144)  (0.004,0.278)  
Example 2: Antidepressants for reducing pain in fibromyalgia syndrome  
−0.418 (0.067)  0.164 (0.097)  −0.369 (0.056)  0.094 (0.077)  −0.361 (0.057)  0.098 (0.081) 
(−0.567,−0.298)  (0.013,0.384)  (−0.489,−0.267)  (0.005,0.291)  (−0.484,−0.259)  (0.005,0.306) 
Posterior median and 95 percent credible interval of: I ^{2} from the normal random effects model, ratio of IQR squares from the proposed model; 95 percent prediction intervals from each model
NRE  BC  BCSI  

Ratio of IQR  Ratio of IQR  
I ^{2} (%)  95%  squares^{a} (%)  95%  squares^{a} (%)  95% 
Post. (s.d.)  prediction  Post. (s.d.)  prediction  Post. (s.d.)  prediction 
(95% CI)  interval  (95% CI)  interval  (95% CI)  interval 
Example 1: Teacher expectancy on pupil IQ  
44.9 (24.0)  (−0.284,0.500)  20.9 (21.9)  (−0.179,0.393)  n/a  n/a 
(0.5,81.9)  (0.1,73.6)  
Example 2: Antidepressants for reducing pain in fibromyalgia syndrome  
39.1 (22.7)  (−0.879,−0.001)  17.3 (19.1)  (−0.732,−0.118)  18.4 (19.8)  (−0.753,−0.112) 
(0.4,78.0)  (0.1,65.7)  (0.1,67.9) 
Prediction interval and predictive probability
Table 4 also shows the 95 percent prediction intervals from the two models. In example 2, the results from the BC and the BCSI were quite similar. In both examples, the prediction intervals from the BC and the BCSI were substantially narrower than those from the NRE. This is likely due to the inflation of the betweenstudy variance from the NRE. Especially in example 2, the BC and the BCSI provided much stronger evidence of efficacy of the treatment, with the upper bound of the 95 percent prediction interval now much further below 0. Figure 6b shows the predictive distributions from the NRE, the BC and the BCSI for each example. The 95 percent prediction intervals were also depicted on the same panel, where the cross, the black circle and the black triangle represent the medians of predictive distribution from the NRE, the BC and the BCSI, respectively. We found the BC and the BCSI provided skewed prediction intervals, which reflects the asymmetry detected and the asymmetric predictive distribution; whereas the NRE method gave symmetrical prediction intervals in both examples.
Example 1.
Since a positive value indicates a higher mean score for the treatment group, we obtained the predictive probability of a beneficial treatment by counting the number of samples drawn from the predictive distribution which were larger than specified values on the horizontal axis. The probability curve from the NRE was located entirely over those from the BC. That is, the NRE predicted larger probabilities of the true effect being in favour of treatment than the BC. For instance, the probability of the true treatment effect being above 0.1 was 0.428 for the NRE but 0.221 for the BC.
Example 2.
Since a negative value indicates a benefit for the antidepressants, we obtained the predictive probability by counting the number of samples drawn from the predictive distribution which were smaller than specified values on the horizontal axis. The results from the BC and the BCSI were quite similar. When the size of specified difference was small (e.g. from −0.3 to 0.0), the predictive probabilities from the NRE were slightly smaller than those from the BC and the BCSI. In contrast, when the size of specified difference was large (e.g. from −0.8 to −0.4), the predictive probabilities from the NRE were larger than those from the BC and the BCSI.
Discussion
We proposed a new random effects model based on the BoxCox transformation to deal with skewness in the overall distribution of the observed treatment effect estimate for metaanalysis. The simulation study shows that the proposed model has the potential to provide more appropriate inferences in the presence of skewness, especially in regard to the estimation of the overall treatment effect and the quantification of heterogeneity. The simulation study indicates that the normal random effects model gives an overall mean that is pulled in the direction of skewness, and is thus an inappropriate summary for representing the centre of skewed data. Similarly, I ^{2} from the normal random effects model can be inflated given skewed treatment effect estimates, because it overestimates the random effects variance. This also causes prediction intervals that are too wide.
Our proposed model substantially reduces these problems. It is flexible that the observed data determine the shape of distribution and thus the required BoxCox transformation to ensure the normality of transformed treatment effect estimates. We suggest using the overall median effect to summarise the proposed metaanalysis model, back on the original scale of interest. The median is known as a more robust summary measure than the mean against the skewness and the outliers on the observed data. We also defined the ratio of IQR squares under the proposed model for quantifying the impact of heterogeneity in the metaanalysis. When considering the skewed data, the variance is no longer the best measure for describing the spread of the distribution. We recommend the (normalised) IQR of the true effects distribution as a measure for quantifying the extent of the heterogeneity. The ratio of IQR squares has an aspect of the proportion of the betweenstudy variation that is due to the heterogeneity across studies to the total variation in the treatment effect estimates, which is the same concept as I ^{2} from the normal random effects model. In the simulation study, the ratio of IQR squares reduced the inflation of I ^{2} when the treatment effect estimates were skewed or heavytailed.
We note that our simulation assumes that sample sizes in each study are large enough for the central limit theorem to apply, such that (a) treatment effect estimates do have a normal distribution within studies, (b) the variance of the estimate is wellestimated (such that it can essentially be assumed known). Thus, situations with small studies are not considered, but this would be useful for further research.
The application to the two examples illustrated the two models could provide different conclusions for the summary effect and the amount of heterogeneity for the same metaanalysis data. In addition, given skewness the applications indicate the proposed model better predicts the treatment effect in a new study over the normal random effects model. The normal random effects model provided symmetric predictive distributions and its 95 percent prediction interval; on the other hand, the proposed model provided the skewed shape of predictive distributions and its asymmetric 95 percent prediction interval as expected. The difference in the shape of predictive distributions had a significant impact on the predictive probabilities that the treatment is effective in a new study. Another limitation is that, although our simulations covered a wide range of scenarios and were computationally intensive, other scenarios still need to be investigated. In particular, we did not consider when sample sizes within studies are small, and we only considered when it could be correctly assumed that study estimates were normally distributed and their variances were known. This allowed any asymmetry to be due to the random effects distribution, rather than the withinstudy distributions. Further research in situations of small trials and/or rare events would be welcome.
Note that the parameters included in our proposed model are estimated by two stages. We first get the point estimates of the transformation and the shift parameter (λ and α) using the profile likelihood function, and then estimate the other parameters (μ and τ) using the Bayesian approach conditioned on \(\lambda =\hat {\lambda }\) and \(\alpha =\hat {\alpha }\). Although a simultaneous estimation of all parameters (λ, α, μ and τ) within the framework of Bayesian approach is more straightforward, we have had a difficulty in the convergence of MCMC sampling for this. Therefore, we take a procedure of first finding a transformation to normalise the treatment effect estimates (i.e. the transformation and the shift parameter are dealt with as nonstochastics), and then making inferences conditioned on the maximum likelihood estimates of the transformation and the shift parameter.
However, there are some limitations of the proposed model and further research is required. For interpretation and presentation of the metaanalysis results, the Bayesian approach is used for estimating the model parameters. We are interested in functions of the estimated parameters rather than the estimated parameters themselves; i.e. (i) the overall median (13) which represents a summary treatment effect, (ii) the normalised IQR (14) which quantifies the magnitude of heterogeneity, (iii) the ratio of IQR squares (16) which quantifies the impact of heterogeneity. The Bayesian approach using the MCMC method is straightforward enough to estimate these measures with variability, because uncertainty (i.e. variance estimation) of the estimated measures can be obtained by using MCMC samples directly (e.g. mean, median, standard deviation, 2.5th and 97.5th quantiles of the MCMC samples), without additional distributional assumptions or asymptotic approximations. The 95 percent prediction interval is also computed in a simple manner by the Bayesian approach, which can be obtained by taking 2.5th and 97.5th quantiles of samples drawn from the predictive distribution (17). However, a frequentist approach may be another useful option in some situations. When considering the frequentist estimation for the proposed model, it is not straightforward to derive asymptotic distributions (and also the 95 percent confidence intervals) of the maximum likelihood estimators for the measures of interest. We expect a bootstrap method can be used for solving this issue. In addition, it is not necessarily clear how the choice of prior for the betweenstudy variance parameter makes impact on the results from our proposed model. The uniform prior on the standard deviation scale has been known as a reasonable noninformative prior for the conventional normal random effects model, though this may not be the case for our proposed model. Further extensive simulation studies will be needed for assessing this.
Finally, we note that our proposed model assumes the metaanalysis data available is representative of the populations of interest (like all metaanalysis models). In particular, if asymmetry in the observed treatment effect estimates is due to bias, for example publication bias and selective reporting, then the summary result, the heterogeneity measures and the predictive inference may be inappropriate (as then the random effects distribution is inappropriately captured). Then, a concern may arise when it is difficult to distinguish possible causes of skewness on the observed treatment effect estimates. Several scenarios can be considered for the reason why the treatment effect estimates are skewed across studies; for example, (i) the treatment effect distribution suffers from the publication bias and/or the selective reporting, (ii) the treatment effect distribution is a mixture of two different distributions, (iii) the treatment effect distribution is truly skewed, (iv) the treatment effect distribution is skewed simply due to estimation errors. It should be performed first to explore the possible causes of skewness. When the scenario (i) or (ii) is true, our proposed model may not be appropriate, but is likely to be more robust than the normal random effects model. This is because extreme study results locating on one side are expressed as the tail of the treatment effect distribution in our proposed model. Therefore, we conclude our proposed model is applicable for all the scenarios and to is likley to produce more suitable metaanalysis results in comparison to the conventional normal random effects model. However, further research and extended simulations are needed to critically examine this in more detail, especially in situations where publication and selection biases are causing the asymmetry. Our proposed model aims to reduce nonnormality in the random effects distribution by observing the nonnormality in the overall distribution of the y _{ i }’s. Therefore, it is likely to perform best when the betweenstudy heterogeneity is large relative to the withinstudy variability, such that skewness in the overall distribution can be detected and will be due to asymmetry in the true treatment effects. Further research may also consider applying the BoxCox transformation to just the random effects distribution in model (1) (i.e. to just the θ _{ i }’s). Although Gurka et al. [15] suggests that the BoxCox transformation in a mixed effects model should be viewed in terms of their success in normalising the total error, the y _{ i } themselves do not need to be transformed, and thus left on their original scale familiar to metaanalysts.
Conclusions
We proposed a random effects metaanalysis with BoxCox transformation to deal with the skewness in metaanalysis data. The proposed metaanalysis model has the potential to provide more robust inferences for summary treatment effects when the random effects distribution is skewed. It could be used to examine the robustness of traditional metaanalysis results, heterogeneity measures, and predictive inferences to skewed random effects distributions. However, further research would be welcome to examine the method in further simulated and empirical examples.
Notes
Acknowledgements
Not applicable.
Funding
Not applicable.
Availability of data and materials
The methods used to generate the simulated datasets are described in the Results section. The data used in the examples are included in published articles.
Authors’ contributions
YY devised the study and wrote the first draft. All authors contributed to the interpretation of the results and approved the final version.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary material
References
 1.Borenstein M, Hedges LV, Higgins JPT, Rothstein HR. A basic introduction to fixedeffect and randomeffects models for metaanalysis. Res Synth Methods. 2010; 1:97–111.CrossRefPubMedGoogle Scholar
 2.Turner RM, Omar RZ, Thompson SG. Bayesian methods of analysis for cluster randomized trials with binary outcome data. Stat Med. 2001; 20:453–72.CrossRefPubMedGoogle Scholar
 3.Lee KJ, Thompson SG. Flexible parametric models for randomeffects distributions. Stat Med. 2008; 27:418–34.CrossRefPubMedGoogle Scholar
 4.Smith TC, Spiegelhalter DJ, Thomas A. Bayesian approaches to randomeffects metaanalysis: a comparative study. Stat Med. 1995; 14:2685–99.CrossRefPubMedGoogle Scholar
 5.Demidenko E. Mixed Models: Theory and Applications. Hoboken, NJ: Wiley; 2005.Google Scholar
 6.Böhning D. Computerassisted Analysis of Mixtures and Applications: Metaanalysis, Disease Mapping and Others. Boca Raton, FL: Chapman and Hall/CRC; 2000.Google Scholar
 7.Baker R, Jackson D. A new approach to outliers in metaanalysis. Health Care Manage Sci. 2008; 11:121–31.CrossRefGoogle Scholar
 8.Gumedze FN, Jackson D. A random effects variance shift model for detecting and accommodating outliers in metaanalysis. BMC Med Res Methodol. 2011; 11:19.CrossRefPubMedPubMedCentralGoogle Scholar
 9.Higgins JPT, Thompson SG, Spiegelhalter DJ. A reevaluation of randomeffects metaanalysis. J R Stat Soc: Ser A. 2009; 172:137–59.CrossRefGoogle Scholar
 10.Box GEP, Cox DR. An analysis of transformations. J R Stat Soc Ser B. 1964; 26:211–46.Google Scholar
 11.Carroll RJ, Ruppert D. Power transformations when fitting theoretical models to data. J Am Stat Assoc. 1984; 79:321–8.CrossRefGoogle Scholar
 12.Sakia RM. Retransformation bias: a look at the boxcox transformation to linear balanced mixed anova models. Metrika. 1990; 37:345–51.CrossRefGoogle Scholar
 13.Sakia RM. The boxcox transformation technique: a review. The Stat. 1992; 41:169–78.Google Scholar
 14.Lipsitz SR, Ibrahim JG, Molenberghs G. Using a boxcox transformation in the analysis of longitudinal data with incomplete responses. Appl Stat. 2000; 49:287–96.Google Scholar
 15.Gurka MJ, Edwards LJ, Muller KE, Kupper LL. Extending the boxcox transformation to the linear mixed model. J R Stat Soc: Ser A. 2006; 169:273–88.CrossRefGoogle Scholar
 16.Kim S, Chen MH, Ibrahim JG, Shah AK, Lin J. Bayesian inference for multivariate metaanalysis boxcox transformation models for individual patient data with applications to evaluation of cholesterollowering drugs. Stat Med. 2013; 32:3972–990.CrossRefPubMedPubMedCentralGoogle Scholar
 17.Raudenbush SW. Magnitude of teacher expectancy effects on pupil iq as a function of the credibility of expectancy induction: A synthesis of findings from 18 experiments. J Educ Psychol. 1984; 76:85–97.CrossRefGoogle Scholar
 18.Raudenbush SW, Bryk AS. Empirical bayes metaanalysis. J Educ Stat. 1985; 10:75–98.CrossRefGoogle Scholar
 19.Hartung J, Knapp G, Sinha BK. Statistical MetaAnalysis with Applications. Hoboken, NJ: Wiley; 2008.CrossRefGoogle Scholar
 20.Häuser W, Bernardy K, Üceyler N, Sommer S. Treatment of fibromyalgia syndrome with antidepressants: a metaanalysis. J Am Med Assoc. 2008; 301:198–209.CrossRefGoogle Scholar
 21.Riley RD, Higgins JPT, Deeks JJ. Interpretation of random effects metaanalyses. Br Med J. 2011; 342:549.CrossRefGoogle Scholar
 22.Lambert PC, Sutton AJ, Burton PR, Abrams KR, Jones DR. How vague is vague? a simulation study of the impact of the use of vague prior distributions in mcmc using winbugs. Stat Med. 2005; 24:2401–028.CrossRefPubMedGoogle Scholar
 23.Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. London: Chapman and Hall/CRC; 2003.Google Scholar
 24.Gelman A. Prior distributions for variance parameters in hierarchical models. Bayesian Anal. 2006; 1:515–34.CrossRefGoogle Scholar
 25.Spiegelhalter DJ. Bayesian methods for cluster randomized trials with continuous responses. Stat Med. 2001; 20:435–52.CrossRefPubMedGoogle Scholar
 26.Spiegelhalter DJ, Abrams KR, Myles J. Bayesian Approaches to Clinical Trials and Healthcare Evaluation. London: Wiley; 2004.Google Scholar
 27.Turner RM, Davey J, Clarke MJ, Thompson SG, Higgins JPT. Predicting the extent of heterogeneity in metaanalysis, using empirical data from the cochrane database of systematic reviews. Int J Epidemiol. 2012; 41:818–27.CrossRefPubMedPubMedCentralGoogle Scholar
 28.Rhodes KM, Turner RM, Higgins JPT. Predictive distributions were developed for the extent of heterogeneity in metaanalyses of continuous outcome data. J Clin Epidemiol. 2014; 68:52–60.CrossRefPubMedGoogle Scholar
 29.Stan Modeling Language User’s Guide and Reference Manual. http://mcstan.org/. Accessed 13 July 2017.
 30.Higgins JPT, Thompson SG. Quantifying heterogeneity in a metaanalysis. Stat Med. 2002; 21:1539–58.CrossRefPubMedGoogle Scholar
 31.Goto M, Inoue T, Tsuchiya Y. On estimation of parameters in powernormal distribution. Bull Inf Cybern. 1984; 21:41–53.Google Scholar
 32.Johnson NL, Kotz S, Balakrishnan N. Continuous Univariate Distributions. New York: John Wiley & Sons; 1994.Google Scholar
 33.Maruo K, Shirahata S, Goto M. Underlying assumptions of the powernormal distribution. Behaviormetrika. 2011; 38:85–95.CrossRefGoogle Scholar
 34.Maruo K, Goto M. Percentile estimation based on the powernormal distribution. Comput Stat. 2013; 28:341–56.CrossRefGoogle Scholar
 35.Bartlett MS. The use of transformations. Biometrics. 1947; 3:580–619.CrossRefGoogle Scholar
 36.Kulinskaya E, Morgenthaler S, Staudte RG. Meta Analysis: A Guide to Calibrating and Combining Statistical Evidence. Chichester: John Wiley & Sons; 2008.Google Scholar
 37.Box GEP, Hill WJ. Correcting inhomogeneity of variance with power transformation weighting. Technometrics. 1974; 16:385–9.CrossRefGoogle Scholar
 38.Atkinson A, Pericchi LR, Smith RL. Grouped likelihood for the shifted power transformation. J R Stat Soc Ser B. 1991; 52:473–82.Google Scholar
 39.Cheng RCH, Traylor L. Nonregular maximum likelihood problems. J R Stat Soc Ser B. 1995; 57:3–44.Google Scholar
 40.Azzalini A. A class of distributions which includes the normal ones. Scand J Stat. 1985; 12:171–8.Google Scholar
 41.Azzalini A. The SkewNormal and Related Families. Cambridge: Cambridge University Press; 2013.CrossRefGoogle Scholar
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.