Modeling perinatal mortality in twins via generalized additive mixed models: a comparison of estimation approaches
Abstract
Background
The analysis of twin data presents a unique challenge. Second-born twins on average weigh less than first-born twins and have an elevated risk of perinatal mortality. It is not clear whether the risk difference depends on birth order or their relative birth weight. This study evaluates the association between birth order and perinatal mortality by birth order-specific weight difference in twin pregnancies.
Methods
We adopt generalized additive mixed models (GAMMs), a flexible extension of generalized linear mixed models (GLMMs), to model the association. Estimation of such models for correlated binary data is challenging. We compare Bayesian and likelihood-based approaches for estimating GAMMs via simulation. We apply the methods to the US matched multiple birth data to evaluate the association between twins’ birth order and perinatal mortality.
Results
Perinatal mortality depends on both birth order and relative birthweight. Simulation results suggest that the Bayesian method with half-Cauchy priors for variance components performs well in estimating all components of the GAMM. The Bayesian results were sensitive to prior specifications.
Conclusion
We adopted a flexible statistical model, GAMM, to precisely estimate the perinatal mortality risk differences between first- and second-born twins whereby birthweight and gestational age are nonparametrically modelled to explicitly adjust for their effects. The risk of perinatal mortality in twins was found to depend on both birth order and relative birthweight. We demonstrated that the Bayesian method estimated the GAMM model components more reliably than the frequentist approaches.
Keywords
Penalized splines, Generalized linear mixed models, Penalized quasi-likelihood, Laplace approximation, Markov chain Monte Carlo, Variance components
Abbreviations
- DPQL: Double penalized quasi-likelihood
- GAMM: Generalized additive mixed models
- GLMM: Generalized linear mixed model
- MACP: Mean average coverage probability
- MASE: Mean average squared error
- MCMC: Markov chain Monte Carlo
- REML: Restricted maximum likelihood
- SGA: Small for gestational age
Background
Twins are 2–4 times more likely to die in the perinatal period than singletons [1]. Among twins, second-born twins are known to be at higher risk of perinatal mortality than first-born twins [2, 3, 4]. While birthweight and gestational age are both well-known determinants of perinatal mortality [5], birthweight is more likely to be a major component of the risk difference between first- and second-born twins because co-twins are usually delivered at the same gestational age. Moreover, second twins, on average, weigh less than first twins [6]. It is unclear whether the mortality risk differences between second and first twins depend on birth order or birthweight.
Luo et al. [5] showed that perinatal mortality risk differences in second vs first twins depended on their relative birth size: risks were similar when birthweights were similar, increasingly higher as second twins weighed less, and progressively lower as second twins weighed more. However, in the conditional logistic regression model used, they controlled for the effect of small for gestational age (SGA) via a binary indicator (1 = yes; 0 = no) based on gestational age and birthweight. Controlling for a binary version of continuous confounder(s) may lead to residual confounding [7]. In this analysis, we evaluated the association of birth order with perinatal mortality after adjusting for both birthweight and gestational age, among others.
Because birthweight and gestational age may have nonlinear associations with mortality, we used generalized additive mixed models (GAMMs) [8] that employ unknown smooth functions to model nonlinear covariate effects, and random effects to account for correlation in twin-pairs. Smooth functions can be estimated in various ways [8, 9, 10]; here, we used penalized regression splines represented as mixed model components [11]. This allows the use of mixed model methodology and software to make systematic inference on all model components for the GAMMs.
Although several methods are available for estimating GAMMs, in practice results may vary widely depending on the method used. Motivated by the difficulties we encountered when analyzing the perinatal twin mortality, we investigate the performance of different methods via simulation in a setting similar to the twin data situation.
In this paper, we systematically compare the performance of the Bayesian and likelihood-based estimation techniques for inference in the GAMMs via a simulation study. We also apply these methods to the US matched multiple birth data to study the association between birth order and perinatal mortality by birth-order specific weight difference in twins.
Methods
Generalized additive mixed models (GAMMs)
Generalized additive mixed models (GAMMs) [8] extend generalized linear mixed models (GLMMs) [12] to allow nonlinear functional forms between independent variables and the response. They provide a flexible modeling framework to use additive nonparametric functions to model the effects of continuous covariate(s) while using random effects to model correlation between responses. Estimating the nonparametric smooth function by penalized regression splines, the GAMM can be expressed as a GLMM. Details on the GAMM and its mixed model representation are provided in the Additional file 1: Supplementary Material.
Estimation of GAMMs
The GAMM model parameters may be estimated via frequentist or Bayesian approaches. The frequentist approaches rely on approximation methods while Bayesian methods use Markov Chain Monte Carlo (MCMC). We investigate two approximation methods: double penalized quasi-likelihood (DPQL) [8, 13] and the Laplace approximation [14], as well as a Bayesian approach [15].
The Bayesian methods require specification of prior distributions, a non-trivial task for variance components [16]. For the GAMM, an appropriate choice for the prior distributions of variance components is crucial because curve estimation depends on the variance components; over (under) estimation of the variance components corresponds to undersmoothing (oversmoothing).
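To see why the variance components control smoothness, the following sketch uses one common mixed model representation of a penalized spline, the truncated linear basis described in Ruppert et al. [11]. The basis, the knots \( \kappa_k \), and the Gaussian working model with error variance \( \sigma_\varepsilon^2 \) are assumptions made for illustration only; the representation actually used is given in Additional file 1.

```latex
f(x) = \beta_0 + \beta_1 x + \sum_{k=1}^{K} u_k (x - \kappa_k)_{+},
\qquad u_k \sim N(0, \sigma_u^2),
\\[4pt]
(\hat{\beta}, \hat{u}) = \arg\min_{\beta, u}
\; \lVert y - X\beta - Zu \rVert^2 + \lambda \lVert u \rVert^2,
\qquad \lambda = \frac{\sigma_\varepsilon^2}{\sigma_u^2}.
```

Under these assumptions, a larger \( \sigma_u^2 \) implies a smaller penalty \( \lambda \) and hence a wigglier fitted curve, which is the sense in which overestimating a variance component corresponds to undersmoothing.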
Analysis of twins perinatal mortality data
Methods
Data and models
To study the relationship between perinatal mortality and birth order, we used the 1995–1998 matched multiple birth dataset from the United States National Center for Health Statistics (NCHS). For all multiple births in 1995–1998, the NCHS data contained information on perinatal and infant mortality, and on maternal and pregnancy characteristics. An extended version (1995–2000) of this dataset was used by Luo et al. [5].
There were a total of 446,570 matched births. Because of missing or implausible data, we excluded matched births (15.6% of the total) that met any of the following criteria: (i) triplet and higher order multiple births (n = 23,672); (ii) unknown breech presentation (n = 5041); (iii) unmatched twins (n = 3650; see Martin et al. [17] for details); (iv) twins with unknown birth order (n = 3507); (v) extreme gestational ages (< 23 weeks or > 42 weeks, n = 17,475); (vi) extreme birthweights (< 500 g or > 6000 g, n = 4589); (vii) twins not delivered at the same gestational week (n = 9918); and (viii) birthweight difference between second and first twins greater than 100% (n = 1758). The final study cohort included 376,960 twin births in 188,480 twin pregnancies.
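The cohort arithmetic can be checked directly from the counts reported above. The short Python sketch below only re-adds the published numbers; the dictionary keys are shorthand labels, not variable names from the NCHS file.

```python
# Exclusion counts as reported in the text, keyed by a shorthand label.
exclusions = {
    "triplet or higher-order births": 23672,
    "unknown breech presentation": 5041,
    "unmatched twins": 3650,
    "unknown birth order": 3507,
    "extreme gestational age": 17475,
    "extreme birthweight": 4589,
    "different gestational week": 9918,
    "birthweight difference > 100%": 1758,
}

total_matched_births = 446570

n_excluded = sum(exclusions.values())
n_remaining = total_matched_births - n_excluded
pct_excluded = 100 * n_excluded / total_matched_births
n_pairs = n_remaining // 2  # two births per twin pregnancy

print(n_excluded, n_remaining, n_pairs, round(pct_excluded, 1))
```

The totals agree with the reported 15.6% exclusion rate and the final cohort of 376,960 births in 188,480 pregnancies.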
To assess perinatal mortality risk differences between second- and firstborn twins by birth order-specific weight difference, we conducted a stratified analysis following Luo et al. [5]. Based on the birthweight difference between twins, we divided the dataset into 7 strata as follows: (i) within ±5% (similar); (ii) first twins heavier by 5–15%; (iii) first twins heavier by 15–25%; (iv) first twins heavier by ≥ 25%; (v) second twins heavier by 5–15%; (vi) second twins heavier by 15–25%; and (vii) second twins heavier by ≥ 25%.
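The stratification rule can be sketched as a small function. The text does not state the denominator of the percentage difference or how exact boundary values were handled, so the choices below are assumptions: the difference is taken relative to the lighter twin, a difference of exactly 5% counts as similar, and the upper limits of the 5–15% and 15–25% bands are exclusive. The function name is hypothetical.

```python
def weight_difference_stratum(first_g, second_g):
    """Return a stratum label for a twin pair from birthweights in grams.

    Assumptions (not stated in the source): the percentage difference is
    relative to the lighter twin, and exactly 5% is treated as "similar".
    """
    if first_g <= 0 or second_g <= 0:
        raise ValueError("birthweights must be positive")
    lighter = min(first_g, second_g)
    pct = 100.0 * abs(first_g - second_g) / lighter
    if pct <= 5:
        return "similar (within 5%)"
    heavier = "first" if first_g > second_g else "second"
    if pct < 15:
        band = "5-15%"
    elif pct < 25:
        band = "15-25%"
    else:
        band = ">=25%"
    return f"{heavier} twin heavier by {band}"


print(weight_difference_stratum(2500, 2450))
print(weight_difference_stratum(3000, 2500))
print(weight_difference_stratum(2000, 2600))
```

A different denominator (for example, the first twin's weight) would shift pairs near the band edges, so the function is only a guide to the structure of the seven strata.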
To estimate the odds ratios (ORs) and 95% confidence intervals (CIs) of perinatal death comparing second vs first twins in each stratum, we used GAMMs. This approach is broadly similar to the conditional logistic regression approach of Luo et al. [5]. We adjusted for potential confounders including fetal sex, presentation, birthweight, gestational age, and mode of delivery. Birthweight and gestational age effects were modelled nonparametrically. We did not adjust for maternal characteristics or any other factors common to a twin pair as these were perfectly matched for twins.
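For the binary perinatal death indicator y_{hij}, the model in the h^{th} stratum is a logistic GAMM with a random intercept b_{hi} for each twin pair:
\[ \operatorname{logit}\left\{ \Pr\left( y_{hij} = 1 \mid b_{hi} \right) \right\} = x_{hij}^{T}\beta_{h} + f_{1}(\text{birthweight}_{hij}) + f_{2}(\text{gestational age}_{hij}) + b_{hi}, \qquad b_{hi} \sim N\left(0, \sigma^{2}_{int}\right), \]
where i = 1, ..., m_{h} indexes the twin pair in the h^{th} stratum and j = 1, 2 indexes twins within pairs; the fixed effects covariates x_{hij} included an intercept, birth order, fetal sex, presentation, and mode of delivery; and f_{1}(birthweight_{hij}) and f_{2}(gestational age_{hij}) are centred twice-differentiable smooth functions of birthweight and gestational age, respectively. The random intercept variance σ^{2}_{int} and all other model parameters are stratum-specific, and m_{h} denotes the number of twin pairs in the h^{th} stratum. Note that, in our analysis, birthweight and gestational age were linearly correlated (r = 0.72). The correlations between all other covariates were negligible.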
Data analysis
We fitted the model in each stratum using three approaches:
1. DPQL under maximum likelihood (ML) estimation.
2. Laplace approximation. For DPQL and Laplace methods, standard errors of the estimated fixed effects and smooth functions were obtained from a posterior covariance matrix as in Lin and Zhang [8].
3. A Bayesian approach in which noninformative priors were used for all parameters. Specifically, N(0, 10^{6}) distributions were used for all fixed effects (β), while half-Cauchy priors [16] with scale parameter set to 25 were considered for each variance component (e.g., for σ^{2}_{int}). We ran two chains with 55,000 iterations after discarding the initial 5000 burn-in iterations. The chains were thinned by keeping every 50th iteration, and estimates were the sample medians. Convergence of the chains was assessed following Gelman and Rubin [21] and also by visually examining the trace plot, density plot, and sample autocorrelation function for each parameter.
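As an illustration of the Gelman and Rubin [21] diagnostic mentioned in item 3, the following Python sketch computes the potential scale reduction factor for one scalar parameter from several equal-length chains. The function name and the toy chains are hypothetical; the formula is the classic between-chain and within-chain variance comparison, not necessarily the exact variant reported by the software used in the analysis.

```python
import random
import statistics


def gelman_rubin_rhat(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(1)
# Two chains sampling the same distribution: the diagnostic should be near 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(2)]
# Two chains centred at different values: the diagnostic should be well above 1.
stuck = [[rng.gauss(0.0, 1.0) for _ in range(1000)],
         [rng.gauss(3.0, 1.0) for _ in range(1000)]]

print(gelman_rubin_rhat(mixed), gelman_rubin_rhat(stuck))
```

Values close to 1 suggest that the chains agree; values clearly above 1 indicate that the chains have not mixed and that longer runs or a different parameterization may be needed.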
All analyses were carried out in R, using the glmmPQL [22] and gamm4 [23] functions for the DPQL and Laplace approximation methods, respectively. The Bayesian analysis via MCMC was performed using JAGS [24], a mature, declarative language for Bayesian model fitting.
Results of the data analysis
Table 1. Characteristics of mothers and twin births included in the twins perinatal mortality study
Characteristic | Mothers | First-born twins | Second-born twins
---|---|---|---
Mothers, n (%) | 188480 | |
Race | | |
White | 149459 (79.3) | |
Black | 31912 (16.9) | |
Other | 7109 (3.8) | |
Age | | |
< 20 | 13192 (7.0) | |
20–34 | 140992 (74.8) | |
≥ 35 | 34296 (18.2) | |
Newborns^{a} | | 188480 | 188480
Sex, boy | | 94326 (50.1) | 94654 (50.2)
Gestational age, week | | 35.7 (3.2) | 35.7 (3.2)
Birth weight, gram | | 2407.5 (615.5) | 2383.9 (618.5)
Breech/Malpresentation | | 40832 (21.7) | 51661 (27.4)
Cesarean | | 100271 (53.2) | 108413 (57.5)
Table 2. Stratified comparisons of second and firstborn twins: rates and ORs of perinatal death
Variable | Twin births, n (%) | Perinatal death, firstborn, n (per 1000) | Perinatal death, secondborn, n (per 1000) | OR^{a} (95% CI), Laplace fit | OR^{a} (95% CI), Bayesian fit^{b} | Variance of random intercepts, Laplace fit | Variance of random intercepts, Bayesian fit
---|---|---|---|---|---|---|---
Birth weight, heavier in %^{c} | | | | | | |
Heavier firstborn twin | | | | | | |
≥ 25% | 32,940 (8.74) | 358 (21.74) | 989 (60.05) | 4.15 (2.31, 6.13) | 3.42 (2.47, 4.70) | 104.2 | 3.5
15 to < 25% | 35,810 (9.50) | 295 (16.48) | 490 (27.37) | 2.31 (1.46, 3.65) | 1.97 (1.58, 2.49) | 74.7 | 5.5
5 to < 15% | 72,230 (19.16) | 565 (15.64) | 723 (20.02) | 1.68 (1.33, 2.12) | 1.39 (1.20, 1.62) | 42.3 | 4.4
Similar birth weight | | | | | | |
within ±5% | 109,998 (29.18) | 1040 (18.91) | 1174 (21.35) | 1.48 (1.28, 1.72) | 1.27 (1.13, 1.43) | 31.8 | 5.4
Heavier secondborn twin | | | | | | |
5 to < 15% | 69,804 (18.52) | 617 (17.68) | 608 (17.42) | 1.16 (0.90, 1.50) | 1.19 (0.97, 1.40) | 69.9 | 4.7
15 to < 25% | 32,118 (8.52) | 354 (22.04) | 334 (20.80) | 0.86 (0.67, 1.12) | 0.91 (0.71, 1.20) | 73.9 | 5.6
≥ 25% | 24,060 (6.38) | 506 (42.06) | 351 (29.18) | 0.14 (0.07, 0.26) | 0.33 (0.25, 0.45) | 107.5 | 3.3
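The crude death rates in Table 2 can be reproduced from the counts, which also clarifies their denominator: each rate is deaths per 1000 births of that birth order, so the denominator is half the number of twin births in the stratum. The Python sketch below recomputes the rates from the published counts; the stratum labels are shorthand.

```python
# (twin births, deaths among firstborn, deaths among secondborn), from Table 2.
strata = {
    "first heavier, >=25%": (32940, 358, 989),
    "first heavier, 15 to <25%": (35810, 295, 490),
    "first heavier, 5 to <15%": (72230, 565, 723),
    "similar, within 5%": (109998, 1040, 1174),
    "second heavier, 5 to <15%": (69804, 617, 608),
    "second heavier, 15 to <25%": (32118, 354, 334),
    "second heavier, >=25%": (24060, 506, 351),
}


def rate_per_1000(deaths, births):
    """Crude death rate per 1000 births."""
    return 1000 * deaths / births


rates = {}
for label, (twin_births, deaths_first, deaths_second) in strata.items():
    births_per_order = twin_births // 2  # one firstborn and one secondborn per pair
    rates[label] = (
        round(rate_per_1000(deaths_first, births_per_order), 2),
        round(rate_per_1000(deaths_second, births_per_order), 2),
    )
    print(label, rates[label])

total_twin_births = sum(t[0] for t in strata.values())
print(total_twin_births)
```

These are unadjusted rates; the ORs in the table come from the fitted GAMMs and cannot be recovered from the counts alone.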
The variance estimates of the random intercepts are shown in the last two columns of Table 2. Estimates obtained from the Laplace method were implausibly large (roughly 6 to 33 times those of the Bayesian method), possibly because the Laplace approximation performs poorly for binary data with small cluster size (n_{i} = 2) and very low event probability, or because the method did not converge well but reported no warning. The Bayesian estimates of the variance of random effects indicated large heterogeneity between twin pairs (nearly 5 in most strata).
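One way to put these variance estimates on an interpretable scale is the latent-variable intraclass correlation for a random-intercept logistic model, \( \sigma^2_{int} / (\sigma^2_{int} + \pi^2/3) \). This summary was not reported in the analysis; the Python sketch below is an illustrative aid, and the function name is hypothetical.

```python
import math


def latent_icc(random_intercept_variance):
    """Latent-scale intraclass correlation for a random-intercept logistic model."""
    if random_intercept_variance < 0:
        raise ValueError("variance must be non-negative")
    residual = math.pi ** 2 / 3  # variance of the standard logistic distribution
    return random_intercept_variance / (random_intercept_variance + residual)


# A Bayesian estimate near 5 versus the largest Laplace estimates in Table 2.
for variance in (5.0, 104.2, 107.5):
    print(variance, round(latent_icc(variance), 3))
```

On this scale a variance near 5 already corresponds to a within-pair correlation of about 0.6, while variances above 100 would imply that co-twins are almost perfectly correlated, which is one way to see why the Laplace estimates look implausible.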
In summary, the ORs estimated by the different methods disagreed by a noticeable margin, the shapes of the nonlinear associations varied widely, one method (DPQL) failed to converge, and the variance component estimates differed markedly. Because it was unclear which estimates should be reported, we conducted a simulation study to investigate the performance of the Bayesian and frequentist approaches for estimating GAMMs in a twin-data setting.
Simulation study
Methods
Data generation: mimicking twin-pairs data setting
Analysis of the simulated data
Each simulated dataset was analyzed using the following approaches:
1. DPQL under maximum likelihood (ML) or restricted ML (REML) estimation.
2. Laplace approximation. For DPQL and Laplace methods, standard errors of the estimated fixed effects and smooth functions were obtained following the same procedure as for the twin data analysis.
3. A Bayesian approach similar to that used for the perinatal mortality data analysis, with β ~ N(0, 10^{6}) but considering three alternative independent prior specifications for each variance component: (i) Uniform (0, 100); (ii) Half-Cauchy with scale parameter set to 25; and (iii) Inverse Gamma (0.001, 0.001). The Bayesian methods using priors (i)-(iii) are referred to below as Bayesian-UNIF, Bayesian-HC, and Bayesian-IG, respectively. The Bayesian estimates were medians from 55,000 iterations of the MCMC algorithm after discarding the first 5000 iterations as burn-in. We ran a single chain and thinned it by keeping every 50th iteration.
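To give a feel for how the three priors in item 3 differ, the Python sketch below evaluates each density at a very small value and at a moderate value of a variance component. The shape/scale parameterization of the inverse gamma, the evaluation points 0.01 and 0.75, and all function names are assumptions chosen for illustration; the priors are taken to be on the variance itself, as stated in the text.

```python
import math


def uniform_pdf(x, upper=100.0):
    """Uniform(0, upper) density."""
    return 1.0 / upper if 0.0 < x < upper else 0.0


def half_cauchy_pdf(x, scale=25.0):
    """Half-Cauchy density with the given scale, supported on x > 0."""
    if x <= 0:
        return 0.0
    return 2.0 / (math.pi * scale * (1.0 + (x / scale) ** 2))


def inverse_gamma_pdf(x, shape=0.001, scale=0.001):
    """Inverse gamma density (shape/scale parameterization, an assumption)."""
    if x <= 0:
        return 0.0
    log_pdf = (shape * math.log(scale) - math.lgamma(shape)
               - (shape + 1.0) * math.log(x) - scale / x)
    return math.exp(log_pdf)


def density_ratio(pdf, small=0.01, moderate=0.75):
    """How much more prior density sits at a tiny variance than at a moderate one."""
    return pdf(small) / pdf(moderate)


for name, pdf in [("Uniform(0, 100)", uniform_pdf),
                  ("Half-Cauchy(25)", half_cauchy_pdf),
                  ("Inverse Gamma(0.001, 0.001)", inverse_gamma_pdf)]:
    print(name, density_ratio(pdf))
```

Under these assumptions the uniform and half-Cauchy priors are essentially flat over this range, whereas the inverse gamma prior places far more density near zero. This is one possible reason, offered here only as a hedged reading, why an inverse gamma prior might pull a variance estimate downward when the data are weakly informative.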
Performance indicators
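Estimation of the smooth functions was summarized by the mean average squared error (MASE), the mean average coverage probability (MACP), and the mean average coverage length (MACL) of the point-wise CIs, where \( \mathbbm{1} \)(.) denotes an indicator function and \( {\hat{f}}_L \) and \( {\hat{f}}_U \) are the lower and upper limits of the point-wise CI, respectively.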
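A minimal Python sketch of these indicators follows, assuming the usual definitions: within each simulated dataset the squared error \( (\hat{f}(x) - f(x))^2 \), the coverage indicator \( \mathbbm{1}({\hat{f}}_L \le f(x) \le {\hat{f}}_U) \), and the interval length \( {\hat{f}}_U - {\hat{f}}_L \) are averaged over the evaluation points, and these averages are then averaged over simulated datasets. The exact definitions used in the study may differ in detail, and the function names and toy numbers are hypothetical.

```python
from statistics import fmean


def replicate_indicators(truth, estimate, lower, upper):
    """Average squared error, coverage, and CI length over one replicate's grid."""
    n = len(truth)
    if n == 0 or not (len(estimate) == len(lower) == len(upper) == n):
        raise ValueError("all inputs must be non-empty and of equal length")
    ase = fmean([(e - t) ** 2 for e, t in zip(estimate, truth)])
    acp = fmean([1.0 if lo <= t <= up else 0.0
                 for t, lo, up in zip(truth, lower, upper)])
    acl = fmean([up - lo for lo, up in zip(lower, upper)])
    return ase, acp, acl


def simulation_indicators(replicates):
    """Return (MASE, MACP, MACL): the per-replicate averages, averaged again."""
    per_replicate = [replicate_indicators(*r) for r in replicates]
    mase = fmean([r[0] for r in per_replicate])
    macp = fmean([r[1] for r in per_replicate])
    macl = fmean([r[2] for r in per_replicate])
    return mase, macp, macl


# Toy example: a true curve on three grid points and two simulated fits.
truth = [0.0, 1.0, 0.0]
replicates = [
    (truth, [0.1, 0.8, 0.0], [-0.05, 0.65, -0.15], [0.25, 0.95, 0.15]),
    (truth, [0.0, 1.0, 0.0], [-0.15, 0.85, -0.15], [0.15, 1.15, 0.15]),
]
mase, macp, macl = simulation_indicators(replicates)
print(mase, macp, macl)
```

In the toy example the first fit misses the peak of the curve, so its interval fails to cover the truth at that point, mirroring the low coverage around peaks described in the simulation results.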
Results of the simulation study
Table 3. Estimate, 95% confidence/credible interval (CI), mean average squared error (MASE), mean average 95% coverage probability (MACP), and mean average coverage length (MACL) for model parameters estimated via various approaches when number of clusters m = 1000, cluster size n_{i} = 2, and \( {\rho}_{x_1.{x}_2}=0.7 \). True values: \( {\sigma}_{int}^2=0.75 \) and β_{trt} = 0.7.
Method | \( {\hat{\sigma}}_{int}^2 \) | PRB | 95% CI | \( {\hat{\beta}}_{trt} \) | PRB | 95% CI | f_{1}(x_{1}) MASE | f_{1}(x_{1}) MACP | f_{1}(x_{1}) MACL | f_{2}(x_{2}) MASE | f_{2}(x_{2}) MACP | f_{2}(x_{2}) MACL
---|---|---|---|---|---|---|---|---|---|---|---|---
Event probability = 0.05 | | | | | | | | | | | |
DPQL (ML) | 15.83 | 2010.60 | (4.56, 27.10) | 1.24 | 76.44 | (0.23, 2.24) | 7.510 | 0.32 | 1.71 | 11.784 | 0.33 | 1.72
DPQL (REML) | 30.45 | 3959.66 | (16.39, 44.56) | 1.07 | 52.38 | (0.01, 2.13) | 6.472 | 0.34 | 1.79 | 12.584 | 0.34 | 1.70
Laplace ML | 56.02 | 7369.66 | (5.71, 106.34) | 0.79 | 13.21 | (0.07, 1.51) | 0.763 | 0.70 | 1.76 | 0.907 | 0.73 | 1.82
Bayesian (Uniform Prior) | 0.96 | 27.42 | (0.06, 2.88) | 0.75 | 6.54 | (0.28, 1.25) | 0.148 | 0.94 | 1.39 | 0.112 | 0.94 | 1.24
Bayesian (Half-Cauchy Prior) | 0.87 | 15.40 | (0.06, 2.71) | 0.72 | 3.25 | (0.29, 1.22) | 0.142 | 0.94 | 1.27 | 0.103 | 0.94 | 1.15
Bayesian (IG Prior) | 0.39 | −48.40 | (0.01, 2.20) | 0.71 | 1.71 | (0.27, 1.18) | 0.149 | 0.93 | 1.25 | 0.103 | 0.93 | 1.13
Event probability = 0.5 | | | | | | | | | | | |
DPQL (ML) | 0.87 | 15.70 | (0.40, 1.34) | 0.64 | −8.55 | (0.42, 0.86) | 0.032 | 0.89 | 0.57 | 0.023 | 0.88 | 0.48
DPQL (REML) | 0.98 | 30.92 | (0.52, 1.44) | 0.66 | −5.86 | (0.42, 0.90) | 0.028 | 0.91 | 0.58 | 0.024 | 0.90 | 0.47
Laplace ML | 0.36 | −52.49 | (0.12, 0.60) | 0.66 | −5.77 | (0.43, 0.89) | 0.037 | 0.87 | 0.58 | 0.024 | 0.87 | 0.49
Bayesian (Uniform Prior) | 0.82 | 8.99 | (0.33, 1.40) | 0.71 | 1.35 | (0.47, 0.97) | 0.032 | 0.95 | 0.70 | 0.024 | 0.95 | 0.61
Bayesian (Half-Cauchy Prior) | 0.80 | 6.04 | (0.34, 1.36) | 0.71 | 0.76 | (0.47, 0.96) | 0.032 | 0.95 | 0.67 | 0.023 | 0.95 | 0.58
Bayesian (IG Prior) | 0.72 | −6.50 | (0.19, 1.30) | 0.69 | −0.92 | (0.47, 0.95) | 0.033 | 0.94 | 0.68 | 0.023 | 0.95 | 0.58
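The PRB column is not defined in the text. Assuming it denotes percent relative bias, 100 × (estimate − true value) / true value, the Python sketch below recomputes it from the rounded estimates in the table above. The recomputed values match the reported ones up to the rounding of the displayed estimates, which supports but does not prove this reading.

```python
def percent_relative_bias(estimate, truth):
    """Percent relative bias: 100 * (estimate - truth) / truth (assumed definition)."""
    if truth == 0:
        raise ValueError("truth must be non-zero")
    return 100.0 * (estimate - truth) / truth


# (label, displayed estimate, true value, reported PRB), taken from the table.
rows = [
    ("DPQL (ML), variance, p = 0.05", 15.83, 0.75, 2010.60),
    ("Laplace ML, variance, p = 0.05", 56.02, 0.75, 7369.66),
    ("Bayesian-HC, variance, p = 0.05", 0.87, 0.75, 15.40),
    ("Bayesian-IG, variance, p = 0.05", 0.39, 0.75, -48.40),
    ("Laplace ML, variance, p = 0.5", 0.36, 0.75, -52.49),
    ("DPQL (ML), beta_trt, p = 0.05", 1.24, 0.70, 76.44),
]

for label, estimate, truth, reported in rows:
    recomputed = percent_relative_bias(estimate, truth)
    print(f"{label}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

Because the displayed estimates are rounded to two decimals, differences of up to about 0.7 percentage points between recomputed and reported values are expected.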
The lower panel of Fig. 2 compares the empirical pointwise coverage probabilities of the 95% CIs of f_{1}(x_{1}) and f_{2}(x_{2}) obtained using different estimation methods. The coverage probabilities (CP) of the CIs from all Bayesian methods were close to the nominal value (95%), except where biases in the estimated nonparametric functions were noticeable. In contrast, CIs from the DPQL (ML) and Laplace methods had low coverage overall, with DPQL (ML) performing worse (mean CP 33%) than the Laplace method (mean CP 73%), and their coverage was lowest around the peaks of the curves. In those regions, coverage probabilities from the fully Bayesian methods were also low, but nonetheless much better than those of the frequentist methods.
Discussion
We re-analyzed twins perinatal mortality data to study the association between birth order and perinatal mortality by adopting the flexible GAMMs in which continuous covariates (birthweight and gestational age) were nonparametrically modelled to adjust for their effects more completely. Overall, how best to estimate flexible regression curves when the outcomes are correlated and binary is unclear, especially when cluster sizes are small. Thus, we analyzed twins data estimating GAMMs by different frequentist and Bayesian methods, and used simulated data to compare the performance of these estimation techniques for a setting similar to the twin-data.
Using the 1995–1998 matched multiple birth data from the US National Center for Health Statistics (NCHS), we obtained results that varied across estimation methods. Our simulation results for small cluster size (n_{i} = 2) with low event probability (similar to the NCHS data) suggested the superiority of the Bayesian method in estimating all model components, especially using the Half-Cauchy (HC) priors for the variance components. We thus rely on the results from the Bayesian-HC fit for our data analysis. These results suggest that the risk of perinatal mortality depended on the twins’ birth order and that the risk differences in second vs first twins depended on their relative birthweight. Second twins were more likely to die than first-born co-twins when they had similar (within ±5%) birthweights (adjusted OR = 1.27, 95% CI: 1.13, 1.43). The risks of perinatal death for second-born twins were progressively higher as they weighed less than first-born twins (adjusted ORs: 1.39, 1.97 and 3.42 when they weighed 5–15%, 15–25% and ≥ 25% less, respectively) and increasingly lower as they weighed more (adjusted ORs: 1.19, 0.91 and 0.33 when they weighed 5–15%, 15–25% and ≥ 25% more, respectively; most of the ORs were significantly different from 1). Similar to the simulation results, the Bayesian analysis using uniform priors for variance components (Bayesian-UNIF) yielded slightly larger ORs, whereas using inverse gamma priors (Bayesian-IG) yielded slightly smaller ORs, as compared to the Bayesian-HC method (see Additional file 1: Table S1 for the results).
The effect of relative birthweight was also confirmed by Luo et al. [5] but they did not find any significant association between birth order and perinatal mortality when both twins had similar (within ±5%) birthweights (OR = 0.97, 95% CI: 0.84, 1.12). Also, the ORs they obtained from the stratified analyses were closer to 1 in most cases. This may be due to using different models, or adjusting for different sets of confounders. They used a binary indicator ‘small for gestational age’ to control for the effect of birthweight and gestational age, which might lead to residual confounding [7].
Similar to the findings from the simulation study, the Laplace estimate of the variance of the random intercepts in each stratum was unusually large, indicating extreme heterogeneity between twin pairs. The fitted smooth curves for birthweight and gestational age by the Laplace method were less likely to capture the true shapes of association because of the poor estimates of the variance components. Curve estimation in a GAMM depends largely on the estimates of the variance components, and the Laplace method yielded poor estimates of these, as is evident from the simulation study. The DPQL method failed to fit the model in each stratum, in agreement with the simulation study, in which DPQL often failed to converge.
The observed performance of the DPQL and Laplace approximation in estimating the model components in the simulation study was not surprising, as they are known to yield biased estimates for small cluster sizes. However, we demonstrated the strength of Bayesian methods when the system was stressed, i.e., when cluster size and event probability were small. Frequentist methods with a more refined likelihood approximation (e.g., adaptive Gaussian quadrature) might improve performance, but they are not feasible here: the mixed model representation of GAMMs involves a large number of random effects, and Gaussian quadrature is not computationally efficient for more than four random effects [25].
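The computational argument against quadrature can be made concrete with a short Python sketch. It assumes a tensor-product rule with the same number of nodes for every random effect, so the number of likelihood evaluations per cluster grows exponentially with the number of random effects; the choice of 10 nodes per dimension is hypothetical.

```python
def quadrature_grid_size(nodes_per_dimension, n_random_effects):
    """Number of nodes in a tensor-product quadrature grid."""
    if nodes_per_dimension < 1 or n_random_effects < 0:
        raise ValueError("need at least one node and a non-negative dimension")
    return nodes_per_dimension ** n_random_effects


for n_random_effects in (1, 2, 4, 10, 20):
    print(n_random_effects, quadrature_grid_size(10, n_random_effects))
```

With four random effects the grid already has 10,000 nodes per cluster, and the spline coefficients in the mixed model representation of a GAMM add many more dimensions than that.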
There are some limitations to this study. First, we analyzed the NCHS 1995–1998 twin matched data that we had access to. Unfortunately, the updated version of the data from 1995 to 2000 was not publicly available during this analysis. We do not believe that the results would have changed appreciably given two more years of data. Next, we considered a stratified analysis for twin-data analysis, although a single model that included an interaction term for birth order and relative birth size might be more appropriate to study their association with perinatal mortality by estimating these effects using the whole dataset at once. Stratification was used to make our results comparable to Luo et al. [5], and to reduce the computational resources required for handling a huge dataset. Finally, we omitted a potential confounder, zygosity (Monozygotic-MZ/Dizygotic-DZ), because the data was not available. MZ twins are likely to be more correlated both for birthweight and for potential mortality than DZ twins.
Conclusion
We adopted a sophisticated statistical model, GAMM, to precisely estimate the perinatal mortality risk differences between first- and second-born twins from a large dataset in which birthweight and gestational age were nonparametrically modelled to explicitly adjust for their effects. Overall, the perinatal mortality risk differences in second vs first twins were found to depend on both birth order and relative birthweight. We demonstrated that the Bayesian method (especially using half-Cauchy prior for variance component) estimates the GAMM model components more reliably than the frequentist approaches for small cluster size.
Notes
Acknowledgements
We thank two reviewers whose comments have greatly improved this manuscript.
Authors’ contributions
MM, JH and AB determined the overall scope of this study. MM planned the analytical strategies, analyzed the data, interpreted the results, designed and carried out the simulation study, and wrote the manuscript; AB supervised the whole work. JH offered feedback on the data analysis and manuscript and contributed to interpreting the results. All authors read and approved the final manuscript.
Funding
AB is funded by the Fonds de recherche Santé Québec (FRQS). The funding body had no role in the design of the study; in the collection, analysis, or interpretation of the data; or in writing the manuscript.
Ethics approval and consent to participate
In this paper we use secondary data on matched multiple births from the United States National Center for Health Statistics (NCHS) Vital Statistics, 1995–1998. The data are available online (https://www.nber.org/data/vital-statistics-matched-multiple-births-data.html) and no ethical approval was required.
Consent for publication
Not Applicable.
Competing interests
The authors declare that they have no competing interests.
References
1. Parker JD, Schoendorf KC, Kiely JL. A comparison of recent trends in infant mortality among twins and singletons. Paediatr Perinat Epidemiol. 2001;15:1–8.
2. Smith GCS, Pell JP, Dobbie R. Birth order, gestational age, and risk of delivery related perinatal death in twins: retrospective cohort study. BMJ. 2002;325:1004–9.
3. Armson BA, O’Connell C, Persad V, Joseph KS, Young DC, Baskett TF. Determinants of Perinatal mortality and serious neonatal morbidity in the second twin. Obstet Gynecol. 2006;108(3):556–64.
4. Smith GCS, Fleming KM, White IR. Birth order of twins and risk of perinatal death related to delivery in England, Northern Ireland, and Wales, 1994-2003: retrospective cohort study. BMJ. 2007;334:576–80.
5. Luo ZC, Ouyang F, Zhang J, Klebanoff M. Perinatal mortality in second-vs firstborn twins: a matter of birth size or birth order? Am J Obstet Gynecol. 2014;211(153):e1–8.
6. Sheay W, Ananth CV, Kinzler WL. Perinatal mortality in first- and second-born twins in the United States. Obstet Gynecol. 2004;103:63–70.
7. Benedetti A, Abrahamowicz M. Using generalized additive models to reduce residual confounding. Stat Med. 2004;23:3781–801.
8. Lin X, Zhang D. Inference in generalized additive mixed models by using smoothing splines. J R Stat Soc B. 1999;61(2):381–400.
9. Wang Y. Mixed effects smoothing spline analysis of variance. J R Stat Soc B. 1998;60:159–74.
10. Gu C. Smoothing Spline ANOVA Models. New York: Springer-Verlag; 2002.
11. Ruppert D, Wand MP, Carroll RJ. Semiparametric Regression. Cambridge: Cambridge University Press; 2003.
12. Breslow NE, Clayton DG. Approximate Inference in Generalized Linear Mixed Models. J Am Stat Assoc. 1993;88:9–25.
13. Mullah MAS, Benedetti A. Effect of Smoothing in Generalized Linear Mixed Models on the Estimation of Covariance Parameters for Longitudinal Data. Int J Biostat. 2015. https://doi.org/10.1515/ijb-2015-0026.
14. Molenberghs G, Verbeke G. Models for Discrete Longitudinal Data. New York: Springer-Verlag; 2005.
15. Gilks WR, Richardson S, Spiegelhalter DJ, editors. Markov Chain Monte Carlo in Practice. London: Chapman and Hall; 1996.
16. Gelman A. Prior distribution for variance parameters in hierarchical models. Bayesian Anal. 2006;1(3):515–33.
17. Martin A, Brady EH, Candace MC, Sally AC, Margaret LS, Martha LM. The Matched Multiple Birth File. 1998; Available at ftp://ftp.cdc.gov/pub/health-statistics/nchs/Datasets/mmb2/Methdoc9500.pdf.
18. Wood SN. Thin plate regression splines. J R Stat Soc B. 2003;65:95–114.
19. Wood SN. Generalized Additive Models: An Introduction with R. New York: CRC Press; 2006.
20. Ruppert D. Selecting the Number of Knots for Penalized Splines. J Comput Graph Stat. 2002;11(4):735–57.
21. Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences (with discussion). Stat Sci. 1992;7:457–72.
22. Ripley B, Venables B, Bates DM, Hornik K, Gebhardt A, Firth D. MASS 7.3-45. R package. 2016. http://cran.r-project.org.
23. Wood S, Scheipl F. gamm4 0.2-4. R package. 2016. http://cran.r-project.org.
24. Plummer M. Jags version 1.0.3 manual. Technical Report; 2009.
25. Chen J, Liu L, Johnson BA, O'Quigley J. Penalized likelihood estimation for semiparametric mixed models, with application to alcohol treatment research. Stat Med. 2013;32:335–46.
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.