Background

Much has been discussed about disadvantages of the odds ratio (OR) as a measure of association in cross-sectional studies, cohort studies and clinical trials [18], and instead of it, relative risk (RR) or prevalence ratio (PR) according to the study design are proposed. Although some authors suggest that it is enough the outcome to be rare (<10 %) between unexposed [4, 5, 9] or rare in the general population [10, 11], Torman [12] showed that OR is only a good approximation of PR or RR and therefore can be interpreted as such, when the outcome is rare in the two strata of exposure.

In the case of binary outcomes and independent data, several alternatives to logistic regression have been proposed. One of them is the log-binomial model [13, 14], a generalized linear model with binomial response and log link function. Another proposal is the use of Poisson regression with robust variance [1517]. Since robust Poisson regression assumes that the outcome has a Poisson distribution, probabilities larger than 1 can be estimated [16]. However, there may be convergence problems when fitting the log-binomial model [16, 18]. The use of the log-binomial model as a first choice of analysis was recommended by some authors [17, 19, 20] who compared this and other methods through simulation and observed that, when the log-binomial model converges, the resulting estimates of RR or PR have better properties. The Bayesian approach for the log-binomial model was proposed as a way to solve the convergence problems and the median point estimator and equal-tail credible interval (CI) were recommended [21]. Torman and Camey (unpublished manuscript) explored other Bayesian estimators and their final recommendation was to use the mode as a point estimator and the same credible interval.

For binary outcomes from dependent data, as in cluster randomized clinical trials or multilevel modelling, only frequentist proposals are known by the authors. Zou and Donner [22] and Yelland et al. [23] proposed Poisson and log-binomial regression models estimated with Generalized Estimating Equations (GEE), and compared them through simulation. They both verified that the GEE log-binomial model may have convergence problems. Yelland et al. [23] also found convergence problems with the GEE Poisson model, though less frequently than the GEE log-binomial. Zou and Donner [22] stressed that the proposed GEE models should be used only if the number of clusters is greater than 50. Moreover, the authors identified that the GEE Poisson model can estimate probabilities greater than 1. In the analysis of dependent data, in addition to GEE based models (also called marginal models), mixed models (also called conditional models) can be adopted. Yelland et al. [24] noted that there may be convergence problems when using the mixed-effects log-binomial model and that solutions for this problem are lacking.

On the estimation of PR or RR with polytomous outcomes, only papers with frequentist approach were found. Camey et al. [25] evaluated the performance of separated log-binomial and robust Poisson models, where several dichotomous outcomes are created and the desired model is fitted for each one. Comparisons with estimates from multinomial logistic regression were carried out by simulation and the conclusion was that the proposed approaches are more accurate and precise. The final recommendation was that fit of separate log-binomial models should be tried first and only resort to separate robust Poisson regressions if convergence problems occur. However, when considering separate dichotomous outcomes, the true multinomial nature of the response is ignored and there is no guarantee that the coefficients found will produce valid probabilities for the reference category. Another proposal is the log-multinomial model [26], which considers the correct distribution of the outcome, however, it may face convergence problems and estimate probabilities outside the correct interval. This last problem was not expected since the correct distribution of the outcome is adopted.

In this paper it is intended to exemplify the use of the Bayesian approach for the log-binomial model with independent data and extend that approach to support dependent data and multinomial outcomes. For this purpose four examples will be used.

Methods

Independent data and binary outcome

Data from a cohort of 65 patients admitted to a hospital in Porto Alegre for acute decompensated heart failure (ADHF) are used to illustrate the estimation of RR on binary outcomes with independent data. The outcome is death in three days after admission. The predictors considered were sodium (mEq/L), septum (mm) and pulmonary artery systolic pressure (PASP, mmHg), all continuous. In addition to estimating the RR of predictors, it is intended to get a formula to calculate a risk of death score that can be used in a practical way at admission. Since the estimated probabilities should naturally measure risk of death, it was decided to use the fitted model itself as a formula to calculate the risk score.

The corresponding log-binomial model is given by:

$$ {\theta}_i=P\left({Y}_i\Big|{X}_{1i},{X}_{2i},{X}_{3i}\right) = exp\left({\beta}_0+{\beta}_1{X}_{1i}+{\beta}_2{X}_{2i}+{\beta}_3{X}_{3i}\right)\kern1.25em i=1,\dots, 65 $$

where Y i  = 1 if the i-th individual died within the follow-up period and 0 otherwise, X 1i is his sodium level, X 2i his septum measure and X 3i his measure of PASP. A very important feature of the log-binomial model, which can cause convergence problems, is the fact that the coefficients of the model must be restricted to the condition that θ i  ≤ 1, ∀ i, i.e., only where the probabilities of each individual be between 0 and 1. In this model, exp(β 1), exp(β 2) and exp(β 3) are the respective RRs of predictors (in the specific case of this data set, which is a cohort).

Two frequentist approaches were fitted to these data, robust Poisson regression and log-binomial model, both using the R 3.0.0 [27] function glm, with the sandwich package [28] for robust Poisson; and the Bayesian approach for the log-binomial model using Markov Chain Monte Carlo (MCMC) with the OpenBugs 3.2.2 program together with the R BRugs package [29, 30]. To compare the predictive power of the probabilities estimated by each model, ROC curves were obtained with the R Epi package [31]. The codes for this and other models used in this article can be found in the supporting web material (Additional file 1).

Dependent data and binary outcome

To illustrate RR and PR estimation with the existence of dependency between observations, two sets of data were used. The first is from a cluster clinical trial, introduced by Kerry and Bland [32]. The aim was to verify the effect of an intervention on the practice of requiring radiology examinations, used by general practitioners in a certain hospital. For this, 34 doctors were divided equally in intervention and control groups, and for each patient referred for X-ray, it was evaluated if the requirement was in compliance with the guidelines.

In this context, the mixed-effects log-binomial model is given by:

$$ {\theta}_{ij}=P\left({Y}_{ij}=1\Big|{X}_{ij,ui}\right) = exp\left({\beta}_0+{\beta}_1{X}_{ij}+{u}_i\right)\kern1.5em i=1,\dots, 34\kern1em j=1,\dots, {n}_i $$

where Y i j  = 1 if the j-th examination of the i-th doctor was in accordance with the guidelines and 0 otherwise, X i j  = 1 if the i-th the doctor was in the intervention group and 0 if it was in the control group, u i is the effect due to the i-th doctor, for which it is supposed that u i  ~ N(0, σ 2 u ) and n i is the number of patients seen by the i-th physician. The GEE log-binomial model equation is the same equation of the independent data case, just no longer assuming independence between observations. In the GEE Poisson model, assumption of Poisson distribution is added to the outcome.

Three frequentist models were fitted to this data set, using SAS version 9.3 (SAS Institute, Cary NC): mixed-effects log-binomial model with Proc GLIMMIX, GEE log-binomial model and GEE Poisson model with Proc GENMOD using an exchangeable working correlation matrix. This matrix was chosen to make possible the estimation of the intraclass correlation coefficient (ICC) [33], which measures the degree of data dependence. The Bayesian approach for the mixed-effects log-binomial model was performed again with BRugs.

The second data set comes from a cross-sectional study with multilevel design on evaluation of the Unified Health System (Sistema Único de Saúde, SUS) [34] by the users. Data were collected by the SUS General Ombudsman Office Department (Departamento de Ouvidoria Geral do SUS) of the Strategic and Participatory Management Secretariat (Secretaria de Gestão Estratégica e Participativa) of the Ministry of Health (Ministério da Saúde), through telephone contact. The inclusion criteria were to be 16 years old or older and to have used SUS in the last 12 months. Respondents were inhabitants of 61 municipalities, and multilevel analysis was adopted to consider the expected dependence among individuals residing in the same municipality. The outcome was user dissatisfaction. Predictor variables related to municipalities and individuals were considered. For comparison of the different fits it was used the final model presented by the authors, obtained from a sample of 12,879 interviews. The mixed effects logistic regression was fitted using SAS Proc GLIMMIX and the mixed-effects log-binomial model estimated via MCMC. Here, the use of logistic regression aims to compare differences between OR and PR in a large sample. The equation of the mixed-effects log-binomial model is similar to the one in the cluster clinical trial example, adding predictors.

The extension of the log-binomial model to incorporate mixed effects through the Bayesian approach was made by adding a normally distributed random effect in the linear predictor of the model, which can be seen in the given code. This same term was added in the place where the restriction for probabilities between 0 and 1 is implemented in the MCMC code.

For the mixed-effects logistic model there is more than one way to estimate ICC, here we use the following formula [35]:

$$ IC{C}_{logist}=\frac{\sigma_u^2}{\sigma_u^2+\frac{\pi^2}{3}} $$

This expression is obtained by considering that the binary outcome comes from a continuous latent variable and that this one conforms to a model with residuals following the standard logistic distribution. Using the same reasoning for the mixed effects log-binomial model [see Additional file 1], a model is reached for the continuous latent variable with residuals following an exponential distribution with mean equals to 1, which leads to following formula for the ICC:

$$ IC{C}_{log- bin}=\frac{\sigma_u^2}{\sigma_u^2+1}, $$

easily estimated by point and interval estimators in the Bayesian approach. However, in the frequentist approach to logistic regression no direct way was found to obtain the confidence interval (CI) for the ICC in SAS. Likewise, in the trial data it was not found a way to estimate between clusters variance with GEE, nor to estimate the confidence interval of the ICC for this method.

Independent data and polytomous outcome

The last database evaluated is one provided in the book of Hosmer and Lemeshow [36] on the birth weight of 189 live births. For illustration of the models with polytomous outcome, birth weight was divided into 4 categories (<2.5 kg, 2.5 kg to 3 kg, 3 kg to 4 kg, 4 kg or more) and the third one was considered the reference category. Thus, a binary variable can be defined for each category of the outcome: Y i j  = 1 if the i-th born had birth weight belonging to the j-th category, j = 0,1, …, 3, j = 0 being the reference category and i = 1, …, 189. Mother’s age (X 1i ) and her smoking condition during pregnancy (X 2i ) were used as predictors.

The log-multinomial model defined in this context is:

$$ {\theta}_{ij}=P\left({Y}_{ij}=1\Big|{X}_{1i},{X}_{2i}\right)= \exp \left({\beta}_{j0}+{\beta}_{j1}{X}_{1i}+{\beta}_{j2}{X}_{2i}\right)\kern0.75em j=1,\dots, 3. $$

It is redundant to estimate coefficients for the reference category (j = 0) since

$$ {Y}_{i0}=1-{\displaystyle {\sum}_{j=1}^3{Y}_{ij}},\ \forall i,\ \mathrm{then}\ P\left({Y}_{i0}=1\Big|{X}_{1i},{X}_{2i}\right)=1-{\displaystyle {\sum}_{j=1}^3P\left({Y}_{ij}=1\Big|{X}_{1i},{X}_{2i}\right)},\ \forall i $$

Three frequentist analyzes were performed on these data: separated robust Poisson regressions, separate log-binomial models and log-multinomial model. The log-multinomial model was fitted in Stata version 9.2 (Stata Corp, College Station, Texas) with syntax courtesy of Leigh Blizzard. Separate regressions were performed in R, the same way as in the case of binary outcomes. To implement the log multinomial model via MCMC, in addition to the restriction for probabilities to be between 0 and 1, it is necessary to restrict the probability of the reference category be the complement of the sum of the probabilities of the other categories [see Additional file 1].

Details common to all the examples

To choose the numbers of interactions, burn-in period and thin for MCMC, graphical analysis and Gelman and Rubin statistic [37] were used. At least 1000 iterations were used for estimation. The prior distributions assigned to the models coefficients were normal with zero mean and variance 106 as suggested by Chu and Cole [21]. In the case of models for dependent data, the uniform distribution from 0.01 to 100 for the deviation of the random effects was adopted, as suggested by Gelman [38] in the context of normal mixed models. The priors used are all vague in order to minimize their influence on the results. The mode and the equal tails credible interval were used as Bayesian point and interval estimators, respectively.

For all analyzes, point and 95 % confidence/credible interval estimates will be shown. For comparisons between methods, the ranges of the intervals were calculated. For comparison of point estimates, one of the frequentist methods was adopted as a reference and the relative difference in percentage (Δ %) from the other methods was calculated. Additional information about computational time will also be presented for the Bayesian models considering running in a computer with 3.40 GHz processor and 4 GB RAM.

Results

For the Bayesian models, all the chains simulated were considered well mixed after the chosen burn-in and thin was applied. This was checked through comparison of the trajectory plots before and after the discard of some generated values. Also, the Gelman and Rubin statistics was near to one in all situations. Details of convergence check are shown for the cluster randomized trial in the supporting web material (Additional file 1) as an example.

Independent data and binary outcome

In the cohort of 65 patients with heart failure, 15 died (23.1 %), therefore, OR may not be a good approximation to RR. Table 1 shows the results of the analyses performed. The frequentist log-binomial model did not converge and therefore does not appear. It can be observed that the point estimates of the coefficients differed more than the estimates of RRs, with percentage differences ranging from 12.5 to 38.6 %. Except for the intercept, the ranges of the intervals differed in the second decimal place, and the robust Poisson regression generally had a shorter range. In terms of significance of predictors, the methods diverged only on PASP, which was considered significant by robust Poisson regression and not significant by the Bayesian method.

Table 1 Results of the ADHF patients cohort analyses

Figure 1 shows a scatter plot comparing the probabilities predicted by each method. Although there is a high correlation (r = 0.984), it is apparent that the predicted probabilities are different since there are several points distant from the straight line of equality. It is also apparent that robust Poisson produced two unacceptable estimates of probability, i.e., greater than 1. An individual with 13 mm septum, 136 mEq/L of sodium and 100 mmHg of PASP got a 1.211 probability of death predicted by Poisson and another individual with 12 mm septum, 127 mEq/l of sodium and 53 mmHg of PASP got a probability of death estimated in 1.008. All those values are outside the normality ranges recommended, but are plausible. Both patients died. Figure 2 shows the ROC curves of the probabilities estimated by Poisson regression and by the log-binomial model via MCMC. Probabilities predicted by Poisson regression have an area under the curve slightly larger than the Bayesian method, but the optimal cutoff point of the Bayesian method is higher.

Fig. 1
figure 1

Scatterplot of probabilities predicted through robust Poisson regression versus MCMC Log-binomial and straight line of equality

Fig. 2
figure 2

ROC curve of probabilities predicted through (a) Robust Poisson regression and (b) Bayesian log-binomial model

Dependent data and binary outcome

In the cluster clinical trial presented in Kerry and Bland [33], among the 429 requests in the intervention group, 341 (79.48 %) were in compliance with the guidelines. Now, among the 702 requests in the control group, 509 (72.51 %) were adequate. In Table 2 the results of the models fitted for these data are shown. The results of the GEE Poisson model were virtually identical to the GEE log-binomial and so were deleted. Convergence was not obtained when fitting the mixed-effects log-binomial model via SAS. The largest difference was observed in the point estimate of the ICC. The ranges of CIs were slightly wider in the Bayesian approach. Both methods detected significant effect of the intervention.

Table 2 Results of the analyses of the cluster clinical trial on guidelines for radiology requests

Table 3 presents estimates based on the SUS users’ satisfaction survey data. Among the respondents, 7,875 (61.15 %) were classified as dissatisfied with the SUS. Large differences can be noted between logistic regression’s OR and the PR estimated by log-binomial model via MCMC. Range of Bayesian CI was shorter for all parameters. The methods differed only on the significance of three variables: percentage of literate population, health units per hundred thousand inhabitants and graduate or higher education level and the CI obtained by MCMC for all of them included the value 1 while that obtained by logistic regression excluded it.

Table 3 Results of multilevel analysis of the SUS users’ satisfaction data

Independent data and polytomous outcome

Among 189 births in the database taken from Hosmer and Lemeshow [36], 59 (31.22 %) weighed below 2.5 kg, 38 (20.10 %) between 2.5 and 3 kg, 83 (43.92 %) between 3 and 4 kg, and 9 (4.76 %) of 4 kg or more.

The results of the analyses conducted are shown in Table 4 and comparative measurements in Table 5. When fitting the log-binomial model for category 4 kg or more there was a non-convergence message, but results were produced and it was decided to show them. In general, the separated log-binomial models produced the more similar to the log-multinomial model estimates. As for the directions of associations, no discrepancy occurred. However, for the last category of the outcome, the Bayesian method was the only one which identified the association of smoking as significant and the only one which did not identify association of age as significant. Bayesian and log-multinomial methods produced, in general, wider range intervals.

Table 4 Results of the analyses of the low birth weight data with multinomial outcome
Table 5 Comparisons among analyses of the low birth weight data with multinomial outcome

Making up predictions of probabilities for each Poisson regression, no case of probability greater than 1 occurred. However, when adding the predicted probabilities for the three outcomes, in one case a value greater than 1 is obtained. The same occurred for the separate log-binomial models and for the log-multinomial model.

Discussion

Some interesting features learned from the analysis performed are worth mentioning. For the cohort of patients with ADHF, a low degree of agreement between the probabilities estimates by the Bayesian and the frequentist method was observed. This was expected since the largest differences were found in the point estimation of the coefficients. The MCMC log-binomial model produced lower probabilities estimates. This shrinkage happens probably because of the correct parametric restriction. Poisson regression can estimate probabilities greater than one so the probabilities are inflated for this data. Also, the Bayesian method produced a higher optimal cutoff point in the ROC curve, which was more coherent with the outcome (death).

For the SUS example, the random effects logistic regression can produce estimates of OR quite distant from the PR, and all of its estimates overestimate the association of predictors, a property already widely discussed about OR in the context of independent data [2, 49, 11].

For the birth data from Hosmer and Lemeshow [36], regardless of the reference category, at least two other outcomes cannot be considered rare (occurred for more than 10 % of the sample). This fact is very likely to occur with at least one of the categories in polytomous data. So, for polytomous data, there will be very few situations in each the OR will be a good approximation of RR or PR.

Also for this dataset, it was observed that all the frequentist methods estimated probabilities greater than one for the reference category. So, only the Bayesian model can be used to obtain valid predicted probabilities for the reference outcome.

It is important to stress that the choice of the reference category for the outcome in the log-multinomial model will not affect the interpretation of the RRs of the other categories, unlike what happens in the multinomial logistic regression. The chosen category must be the one for which identifying associated predictors is of no interest. However, in the separate models it is possible to estimate also a model for that reference category, but this is not interesting because it does not enable comparisons with the log multinomial model.

Conclusions

The Bayesian approach was presented in this paper as a unified way to estimate relative risk (or prevalence ratio) for situations with binary outcomes and dependent or independent data and for polytomous outcomes in independent data. It was not illustrated here, but the extension to the case of polytomous outcomes and dependent data can be made with the same fit made for dichotomous outcomes.

It was shown that the Bayesian approach overcomes difficulties of convergence common in the frequentist approach for the log-binomial model. It correctly restricts the parameter space to produce valid probabilities, which is especially fundamental in cases such as the study of patients with ADHF where, besides associated predictors being known, it is desired to build a prediction score. Chu and Cole [21] showed that in addition, a restriction for a space of values of covariates not observed in the sample can be programmed, so even for values of the predictors of patients outside the study, the estimated probabilities may be valid.

For dependent data, the proposed Bayesian method overcomes the difficulty of convergence of the mixed-effects log-binomial model. Besides, it has the advantage that it allows obtaining estimates of the random effect variance (point and interval) and the ICC interval more directly than the GEE method. An expression to calculate the ICC for random effects log-binomial models was also proposed. Such expression still needs to be compared with the ones proposed by Yelland et al. [24].

For polytomous data, it was seen that the separate methods and the log-multinomial model may not produce valid probabilities for the reference category of the outcome. Even though it did not occur in this work, the use of separate Poisson regressions and the log-multinomial model can still result in invalid probabilities for the outcomes analyzed. The log-multinomial model may still face convergence problems, and is only implemented in a commercial computer application (Stata). The proposed MCMC methodology produced coherent estimates and through the use of free programs (R and OpenBugs).

A limitation of the Bayesian approach is that the computational time can be quite large. This occurred especially with the data on low birth weight and on satisfaction about SUS. In the first one there was a high correlation between the values generated, a large number of values had to be generated to get a reasonable sample for estimation, even using a high thin. In the second one, the huge sample size and the large number of parameters to be estimated were responsible for a slow performance of the routine. One alternative tested to overcome this problem was the use of Laplace integration [39] by means of the package INLA [40], which has the advantage of being much faster than MCMC. The method worked very well for the examples with dependent data, but for the patients with ADHF data it produced probabilities greater than 1, and therefore the results were not shown. More studies are necessary to understand whether this limitation can be overcome. Another alternative that can be investigated is the use of other MCMC softwares, like JAGS [41] and the more recent STAN [42]. The modification of the considered priors can also be an alternative to improve the performance of the Bayesian approach, especially relating the variance of the random effect. Regarding the priors of the coefficients, Chu and Cole [21] also evaluated the normal priori with variance 102 and concluded that it produced results very similar to that of variance equal to 106.

A limitation of this study is that only empirical comparisons between methods were proceeded. So, strong recommendations about which method is usually better cannot be made. Bayesian methods appeared to be promising since they can deal correctly with the probabilities involved in analyzing both binary and polytomous outcomes. Simulations were performed by Chu and Cole [21] and Torman and Camey (unpublished manuscript) only for the case of binary outcome and independent data. Plan is to make simulation studies also for situations with binary outcome and dependent data and for polytomous outcomes and independent and dependent data.