Bayesian model with application to a study of dental caries
Abstract
Background
Dental caries is a significant public health problem and a disease with multifactorial causes. In sub-Saharan Africa, Ethiopia is one of the countries with a high recorded burden of dental caries. This study aimed to determine the risk factors affecting dental caries using both Bayesian and classical approaches.
Methods
This was a retrospective cohort study of dental caries patients at Hawassa Haik Poly Higher Clinic from March 2009 to March 2013. A Bayesian logistic regression procedure was adopted to make inferences about the parameters of a logistic regression model. The purpose of this method was to generate the posterior distribution of the unknown parameters given both the data and a prior density for those parameters.
Results
In this study the prevalence of natural dental caries was 87% and that of non-natural dental caries was 13%. The 18–25 age group had a higher prevalence of dental caries than the other age groups. In the Bayesian logistic regression, rural residence, not cleaning teeth, being from SNNPR and belonging to the 18–25 age group were statistically significant. The Bayesian approach is becoming more popular in data analysis than classical statistics because the technique is regarded as more robust and precise.
Conclusions
The Bayesian approach was found to perform better than the classical method, as the standard errors in the Bayesian approach were smaller than those of classical logistic regression. The Bayesian credible interval was shorter than the confidence interval for all significant risk factors. Age, sex, place of residence, region and habit of cleaning teeth were found to have a significant effect on dental caries.
Keywords
Bayesian approach, Dental caries, Binary logistic regression, MCMC, Posterior distribution, Prior density
Abbreviations
- CI: Confidence Interval
- HPD: Highest Posterior Density
- MCMC: Markov chain Monte Carlo
- OR: Odds Ratio
- SNNPR: Southern Nations, Nationalities, and Peoples' Region
- WinBUGS: Windows version of Bayesian inference Using Gibbs Sampling
Background
Dental caries is a microbial, multifactorial disease that destroys the hardest substance of the human body, the enamel [1]. The World Health Organization (WHO) identifies it as one of the most important public health issues [2]. Dental caries is now on the rise and is becoming a major public health problem worldwide: 60–90% of children and nearly 100% of adults have dental cavities, often leading to pain and discomfort [3].
The problems associated with dental caries reduce the quality of life of affected individuals and of society, with disparities related to well-known issues of socioeconomic status, lack of preventive efforts, and dietary changes [4]. The burden of dental caries can affect school attendance, eating and speaking, which in turn impairs growth and development [5, 6].
Dental caries is a public health problem in both developed and developing countries [7]. Deteriorating oral health is an emerging public health concern in developing countries, yet little attention has been given to oral health in most sub-Saharan countries. The extent of caries and periodontal diseases, and their associated risk factors, have not been widely studied at the community level [8]. The burden is increasing gradually owing to growing consumption of sugary substances, poor oral care practices and inadequate use of health services [9]. Previous studies in Ethiopia showed that the prevalence of dental caries differs between localities: 48.5% in Finote Selam [10], 21.8% in Bahir Dar city [11] and 78.2% at the Debre Tabor General Hospital dental clinic [11, 12].
Dental caries causes tooth pain, discomfort, eating impairment, tooth loss and delayed language development. Furthermore, dental caries affects children’s concentration in school and places a financial burden on families [13, 14]. Risk factors such as sex, age, dietary habits, socioeconomic status and oral hygiene status are associated with an increased prevalence and incidence of dental caries in a population [15]. In one study, persons suffering from dental caries were examined for the type of caries in relation to different factors, and the occurrence of dental caries was found to be slightly higher in females (51.45%) [16].
Age is directly and strongly associated with the prevalence of dental caries: with increasing age the number of surfaces affected by caries increases, plateauing at around 50 years of age [17]. Teeth should be cleaned thoroughly at least twice a day using a fluoride toothpaste. Brushing helps remove plaque and food particles from the tooth surface, and flossing helps remove them from the areas between the teeth. In Ethiopia, existing dental health services are limited. Even though the burden of dental caries in the country is high, not much is known about the factors affecting it in the study area. Therefore, this study aimed to determine the statistical association between dental caries and selected risk factors among patients attending the dental clinic at Hawassa Haik Poly Higher Clinic.
Methodology
The binary logistic regression model can be written as

$$\log\left(\frac{P_{i}}{1 - P_{i}}\right) = \beta_{0} + \beta_{1} X_{1i} + \cdots + \beta_{k} X_{ki},$$

where P_{i} is the probability of experiencing the outcome of interest for subject i, X_{1i},..., X_{ki} are risk factors and β_{j} denotes the j^{th} regression coefficient [19]. Based on this model, the effect of each risk factor on the outcome can be expressed as an odds ratio. Binary outcomes are common in cohort studies, including retrospective ones. Logistic regression yields an odds ratio that approximates the risk ratio when the risk of the outcome is low (< 10%). There is broad agreement in the literature that the risk ratio is preferred over the odds ratio in such studies when the outcome is not rare (10% or more), because the odds ratio then overstates the risk ratio. To obtain a model-based estimate of risk ratios, log-binomial regression has been recommended. However, this model may fail to converge, and alternatives such as robust Poisson regression have been proposed for these situations [20]. The log-binomial regression model is similar to the logistic regression model, except that it assumes a log link instead of a logit link and hence provides risk ratios instead of odds ratios. It can be presented as

$$\log\left(P_{i}\right) = \beta_{0} + \beta_{1} X_{1i} + \cdots + \beta_{k} X_{ki}.$$
Based on this model, the effect of each risk factor on the outcome can be expressed as a risk ratio (RR). Fitting the log-binomial model can be challenging: especially with continuous covariates, the fit may fail to converge when the maximum likelihood estimate (MLE) is close to or on the boundary of the parameter space [21]. The log-binomial model is commonly used to estimate the RR, and the OR estimated by logistic regression is often used to approximate the RR when the outcome is rare. However, regardless of the prevalence of the outcome, the exposed and unexposed risks predicted by logistic regression may be used to estimate the RR. When maximum likelihood estimation is used to fit the logistic model, estimating the standard error (SE) of the RR is difficult. To overcome this difficulty and to provide a flexible modeling framework, we developed a Bayesian logistic regression (BLR) model to estimate the OR with an associated credible interval.
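For orientation, the posterior targeted by a Bayesian logistic regression can be written out as follows. This is a sketch under an assumed prior: independent normal priors with means μ_{j} and variances σ_{j}^{2} are a common choice, but the text does not state which prior was used. Here y_{i} ∈ {0, 1} denotes the outcome for subject i, n the number of subjects and k the number of risk factors.

```latex
\begin{aligned}
P_i &= \frac{\exp\left(\beta_0 + \sum_{j=1}^{k} \beta_j X_{ji}\right)}
            {1 + \exp\left(\beta_0 + \sum_{j=1}^{k} \beta_j X_{ji}\right)}, \\
L(\beta \mid y) &= \prod_{i=1}^{n} P_i^{\,y_i} \left(1 - P_i\right)^{1 - y_i}, \\
\pi(\beta_j) &= \frac{1}{\sqrt{2\pi\sigma_j^{2}}}
                \exp\left(-\frac{(\beta_j - \mu_j)^{2}}{2\sigma_j^{2}}\right),
                \qquad j = 0, \dots, k, \\
\pi(\beta \mid y) &\propto L(\beta \mid y) \prod_{j=0}^{k} \pi(\beta_j).
\end{aligned}
```

The product of a Bernoulli likelihood and normal priors has no closed-form normalizing constant, which is why simulation methods are needed to summarize it.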
The resulting posterior distribution is complex and does not reduce to a known distribution. To characterize it, we use MCMC to simulate random draws from the posterior distribution. The Markov chain Monte Carlo method is a general method that generates values of β (the unknown parameters) from an appropriate distribution and then corrects the generated values so that they better approximate the desired posterior distribution [23]. The Gibbs sampling algorithm draws a value for each variable in turn from its distribution conditional on the current values of the other variables. It is a special case of the Metropolis–Hastings algorithm in which the proposed value is always accepted. Suppose that we partition the parameter vector of interest into components; Gibbs sampling then updates each component in turn given the current values of the others. Convergence of an MCMC algorithm refers to whether the algorithm has reached its equilibrium (target) distribution [24]. Several diagnostics have been developed to monitor convergence, such as time series (trace) plots, density plots, autocorrelation plots and the Gelman–Rubin statistic [25].
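The sampling idea can be illustrated with a minimal sketch. Note the assumptions: this is a random-walk Metropolis sampler written in plain Python, not the Gibbs sampler run in WinBUGS; it fits an unadjusted model with gender as the only covariate, using the gender counts from the cross-tabulation reported in the results; and the N(0, 10²) priors, step size, seed and function names are illustrative choices rather than the authors' settings.

```python
import math
import random

# Gender-by-outcome counts from the cross-tabulation: (natural, non-natural)
FEMALE = (2501, 327)
MALE = (2728, 451)


def log_posterior(b0, b1, prior_sd=10.0):
    """Log posterior (up to a constant) for logit(P) = b0 + b1 * male."""
    logp = 0.0
    for male, (natural, non_natural) in ((0, FEMALE), (1, MALE)):
        eta = b0 + b1 * male
        log1pexp = math.log1p(math.exp(eta))
        # log P = eta - log(1 + e^eta) and log(1 - P) = -log(1 + e^eta)
        logp += natural * (eta - log1pexp) - non_natural * log1pexp
    # Assumed independent N(0, prior_sd^2) priors on both coefficients
    logp -= (b0 ** 2 + b1 ** 2) / (2.0 * prior_sd ** 2)
    return logp


def metropolis(n_iter=20000, burn_in=2000, step=0.05, seed=1):
    """Random-walk Metropolis; returns post-burn-in draws and acceptance rate."""
    rng = random.Random(seed)
    b0, b1 = 0.0, 0.0
    current = log_posterior(b0, b1)
    draws, accepted = [], 0
    for it in range(n_iter):
        c0 = b0 + rng.gauss(0.0, step)
        c1 = b1 + rng.gauss(0.0, step)
        candidate = log_posterior(c0, c1)
        # 1 - random() lies in (0, 1], so the logarithm is always defined
        if math.log(1.0 - rng.random()) < candidate - current:
            b0, b1, current = c0, c1, candidate
            accepted += 1
        if it >= burn_in:
            draws.append((b0, b1))
    return draws, accepted / n_iter


draws, rate = metropolis()
mean_b0 = sum(d[0] for d in draws) / len(draws)
mean_b1 = sum(d[1] for d in draws) / len(draws)
print(f"posterior means: b0={mean_b0:.3f}, b1={mean_b1:.3f}, acceptance={rate:.2f}")
```

With priors this vague, the posterior means should sit close to the maximum likelihood estimates from the same 2×2 table, which is a quick sanity check on any sampler of this kind.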
Results of analysis
Tabulation of the response variable with each explanatory variable
Variable | Categories | Non-natural dental caries, n (%) | Natural dental caries, n (%) |
---|---|---|---|
Gender | Female | 327 (11.6) | 2501 (88.4) |
Gender | Male | 451 (14.2) | 2728 (85.8) |
Residence | Urban | 434 (11) | 3510 (89) |
Residence | Rural | 344 (16.7) | 1719 (83.3) |
Region | SNNPR | 183 (19.6) | 751 (80.4) |
Region | Others | 595 (11.7) | 4478 (88.3) |
Age | <=18 | 90 (9.4) | 872 (90.6) |
Age | 18–25 | 295 (14.5) | 1742 (85.5) |
Age | 26–35 | 211 (13.2) | 1392 (86.8) |
Age | >=35 | 182 (13.0) | 1223 (87.0) |
Clean teeth | Yes | 68 (18.2) | 305 (81.8) |
Clean teeth | No | 710 (12.6) | 4924 (87.4) |
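The row percentages and the overall prevalence can be recomputed directly from the counts in the tabulation. The sketch below does so for a few rows; the dictionary layout and function name are illustrative, and only the counts themselves come from the table.

```python
# Counts per category from the tabulation: (non-natural, natural)
counts = {
    ("Gender", "Female"): (327, 2501),
    ("Gender", "Male"): (451, 2728),
    ("Age", "<=18"): (90, 872),
    ("Age", "18-25"): (295, 1742),
    ("Age", "26-35"): (211, 1392),
    ("Age", ">=35"): (182, 1223),
}


def row_percentages(non_natural, natural):
    """Return (non-natural %, natural %) within one category, rounded to 0.1."""
    total = non_natural + natural
    return round(100.0 * non_natural / total, 1), round(100.0 * natural / total, 1)


for (variable, category), (non_natural, natural) in counts.items():
    pct_non, pct_nat = row_percentages(non_natural, natural)
    print(f"{variable} {category}: non-natural {pct_non}%, natural {pct_nat}%")

# Overall prevalence, summing over the gender categories
total_non = sum(v[0] for k, v in counts.items() if k[0] == "Gender")
total_nat = sum(v[1] for k, v in counts.items() if k[0] == "Gender")
overall_natural = round(100.0 * total_nat / (total_non + total_nat), 1)
print(f"overall natural caries: {overall_natural}%")
```

Within each category the two percentages sum to 100, and the overall figure reproduces the 87% prevalence quoted in the abstract.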
Time series plot
Density plot
The density plots for all risk factors indicate that each coefficient has a unimodal density, which suggests that the simulated parameter values have converged (Fig. 1 and Appendix: Fig. 2).
Autocorrelation plot
From Fig. 1 and Appendix: Fig. 2, we observed that the autocorrelation for all parameters becomes low at a lag of 50. Thus, an approximately independent sample can be obtained by rerunning the algorithm with the thinning interval set to 50. The plots also show that the independent chains mixed well and overlapped one another, which supports convergence.
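The effect of thinning can be sketched on a synthetic chain. The example below does not use the study's MCMC output: it simulates a strongly autocorrelated AR(1) series as a stand-in for a sampler trace, and the coefficient 0.9, the chain length and the seed are arbitrary illustrative values.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given non-negative lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain)
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return cov / var


def ar1_chain(n, phi, seed):
    """Synthetic AR(1) series used here as a stand-in for MCMC output."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


chain = ar1_chain(20000, 0.9, seed=7)
acf_lag1 = autocorrelation(chain, 1)
acf_lag50 = autocorrelation(chain, 50)

# Keep every 50th draw, mirroring a thinning interval of 50
thinned = chain[::50]
print(f"lag 1: {acf_lag1:.2f}, lag 50: {acf_lag50:.2f}, kept {len(thinned)} draws")
```

Successive draws are highly correlated at lag 1 but nearly uncorrelated at lag 50, so keeping every 50th value trades sample size for near-independence.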
Gelman–Rubin statistics
The Gelman–Rubin statistic is one way of checking convergence in Bayesian analysis, and it can be applied only when multiple chains are used. In the Gelman–Rubin plot, the width of the interval from the pooled runs is shown in green, the average width of the intervals within the individual runs is shown in blue, and their ratio is also plotted; for plotting purposes the pooled and within interval widths are normalized to have an overall maximum of one (Appendix: Fig. 2).
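A numerical version of this diagnostic can be sketched as below. Be aware of the substitution: this is the classic variance-based potential scale reduction factor, not the interval-width variant that WinBUGS plots, and the synthetic chains, seed and function name are illustrative only.

```python
import math
import random


def gelman_rubin(chains):
    """Variance-based potential scale reduction factor for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    # Variance of the chain means (between-chain variance divided by n)
    between = sum((mu - grand_mean) ** 2 for mu in means) / (m - 1)
    # Average of the within-chain sample variances
    within = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1) for c, mu in zip(chains, means)
    ) / m
    pooled = (n - 1) / n * within + between
    return math.sqrt(pooled / within)


rng = random.Random(3)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(5000)] for _ in range(2)]
# Second chain shifted by 5 to mimic chains stuck in different regions
shifted = [mixed[0], [x + 5.0 for x in mixed[1]]]

print(f"well-mixed chains: {gelman_rubin(mixed):.3f}")
print(f"separated chains:  {gelman_rubin(shifted):.3f}")
```

Values close to 1 are consistent with convergence, while values well above 1 indicate that the chains have not yet settled on the same distribution.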
Results of classical approach
The odds ratio (OR) is one of the most frequently used measures of association between a risk factor and an outcome in epidemiology, and the risk ratio (RR) is likewise an important index for quantifying the strength of association between natural dental caries and a suspected risk factor. The main reason for the popularity of the OR is that it is the measure of association usually provided by logistic regression models. There is a large body of literature discussing the relationship between the OR and the RR, and there is still an ongoing debate on the appropriateness of odds ratios versus prevalence ratios as measures of effect in retrospective cohort studies. It is known that the OR overestimates the RR when the prevalence of the outcome of interest is larger than 10%. In this study, the logistic model provided a better fit to the data than the log-binomial and Poisson models, each of which can be problematic. Using a Poisson model with a robust standard error generally makes an adequate correction for the standard error. The log-binomial model may fail to converge, which is not uncommon.
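The gap between the OR and the RR for a common outcome can be illustrated with the gender counts from the cross-tabulation. This sketch is unadjusted, so its values differ from the model-based estimates; the conversion used is the standard 2×2 identity RR = OR / (1 − P0 + P0 × OR), where P0 is the risk in the reference group, and the function name is illustrative.

```python
def risk_ratio_from_or(odds_ratio, p0):
    """Convert an odds ratio to a risk ratio given the reference-group risk p0."""
    return odds_ratio / (1.0 - p0 + p0 * odds_ratio)


# Natural dental caries by gender, from the cross-tabulation
p_female = 2501 / (2501 + 327)  # risk in the reference group
p_male = 2728 / (2728 + 451)

crude_or = (p_male / (1 - p_male)) / (p_female / (1 - p_female))
crude_rr = p_male / p_female
converted_rr = risk_ratio_from_or(crude_or, p_female)

print(f"crude OR = {crude_or:.4f}")
print(f"crude RR = {crude_rr:.4f}")
print(f"RR recovered from OR = {converted_rr:.4f}")
```

Because the outcome is present in well over 10% of patients, the OR lies noticeably further from 1 than the RR, so it should not be read as a risk ratio here.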
Model Summary for classical approach
Variables | Logistic: Estimate (S.E.) | Logistic: OR | Logistic: 95% CI (log-odds scale) | Logistic: p-value | Robust Poisson: RR | Robust Poisson: 95% CI (log scale) | Robust Poisson: p-value |
---|---|---|---|---|---|---|---|
Intercept | 1.723 (0.203) | 5.600 | 1.3313, 2.1259 | < 2e-16 *** | 0.822 | −0.350, −0.043 | 4.705e-10 *** |
Gender (ref = Female) | | | | | | | |
Male | −0.2037 (0.0786) | 0.8157 | −0.3582, −0.0502 | 0.009502 ** | 0.974 | −0.081, 0.028 | 0.007982 ** |
Residence (ref = urban) | | | | | | | |
Rural | −0.3701 (0.0813) | 0.6907 | −0.5290, −0.2103 | 5.30e-06 *** | 0.951 | −0.110, 0.009 | 1.536e-05 *** |
Region (ref = others) | | | | | | | |
SNNPR | 0.4943 (0.0970) | 1.6393 | 0.3023, 0.6827 | 3.47e-07 *** | 1.081 | −0.0012, 0.158 | 7.055e-06 *** |
Age group (ref = < 18) | | | | | | | |
18–25 | −0.48289 (0.12817) | 0.6170 | −0.7387, −0.2357 | 0.000165 *** | 0.945 | −0.137, 0.0255 | 4.203e-05 *** |
26–35 | −0.32040 (0.13420) | 0.7259 | −0.5873, −0.0606 | 0.016969 * | 0.966 | −0.119, 0.051 | 0.015609 * |
>=35 | −0.2884 (0.1376) | 0.7494 | −0.56156, −0.0215 | 0.036091 * | 0.970 | −0.117, 0.057 | 0.038829 * |
Clean teeth (ref = no) | | | | | | | |
Yes | 0.38855 (0.14164) | 1.4748 | 0.10399, 0.6600 | 0.006082 ** | 1.061 | −0.0545, 0.177 | 0.017764 * |
The OR of 1.475 indicates that the odds of natural dental caries were 47.5% higher for patients who clean their teeth than for patients who do not, controlling for the other variables in the model. The OR of 0.8157 for gender indicates that the odds of natural dental caries for males were 0.8157 times those for females, that is, about 18% lower.
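The odds ratios quoted above follow from exponentiating the logistic coefficients, and the same transformation moves the interval endpoints from the log-odds scale to the OR scale. The sketch below uses estimates and interval limits copied from the classical model summary; the dictionary and function names are illustrative.

```python
import math

# (estimate, lower limit, upper limit) on the log-odds scale,
# copied from the classical logistic model summary
coefficients = {
    "Male": (-0.2037, -0.3582, -0.0502),
    "Rural": (-0.3701, -0.5290, -0.2103),
    "SNNPR": (0.4943, 0.3023, 0.6827),
    "Clean teeth (yes)": (0.38855, 0.10399, 0.6600),
}


def to_odds_ratio(estimate, lower, upper):
    """Exponentiate a coefficient and its interval limits."""
    return math.exp(estimate), math.exp(lower), math.exp(upper)


for name, (estimate, lower, upper) in coefficients.items():
    odds_ratio, or_lower, or_upper = to_odds_ratio(estimate, lower, upper)
    change = (odds_ratio - 1.0) * 100.0
    print(
        f"{name}: OR = {odds_ratio:.4f} "
        f"(95% CI {or_lower:.4f} to {or_upper:.4f}), odds change {change:+.1f}%"
    )
```

An interval that excludes 0 on the log-odds scale excludes 1 on the OR scale, so significance is unchanged by the transformation.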
Results of Bayesian approach
Model summary for Bayesian approach
Parameters | Mean (β) | S.E. (β) | MC error | Median | HPD 2.5% | HPD 97.5% |
---|---|---|---|---|---|---|
α (intercept) | 1.726 | 0.1976 | 0.003618 | 1.724 | 1.343 | 2.112 |
Gender (ref = Female) | | | | | | |
β_{1} (Male) | −0.2038 | 0.07863 | 5.056E-4 | −0.2039 | −0.3579 | −0.04931 |
Residence (ref = urban) | | | | | | |
β_{2} (Rural) | −0.3699 | 0.08083 | 5.049E-4 | −0.37 | −0.5282 | −0.2109 |
Region (ref = others) | | | | | | |
β_{3} (SNNPR) | 0.4957 | 0.09676 | 9.845E-4 | 0.4963 | 0.3043 | 0.6835 |
Age group (ref = < 18) | | | | | | |
β_{4} (18–25) | −0.484 | 0.1279 | 0.001494 | −0.4827 | −0.7383 | −0.2373 |
β_{5} (26–35) | −0.3205 | 0.1338 | 0.001507 | −0.3193 | −0.5867 | −0.06207 |
β_{6} (>=35) | −0.2887 | 0.137 | 0.001513 | −0.2879 | −0.5606 | −0.022 |
Clean teeth (ref = no) | | | | | | |
β_{7} (Yes) | 0.389 | 0.1395 | 0.002127 | 0.3909 | 0.1126 | 0.6593 |
Model comparison
The model comparison between classical approach and Bayesian approach
Parameters | MLE: Estimate | MLE: S.E. | MLE: 95% CI lower | MLE: 95% CI upper | MLE: CI length | Bayesian: β | Bayesian: SD | Bayesian: 95% credible interval lower | Bayesian: 95% credible interval upper | Bayesian: interval length |
---|---|---|---|---|---|---|---|---|---|---|
α (intercept) | 1.723 | 0.203 | 1.3313 | 2.1259 | 0.795 | 1.7 | 0.1976 | 1.343 | 2.112 | 0.769 |
β_{1} (Male) | −0.2037 | 0.078 | −0.358 | −0.050 | 0.308 | −0.204 | 0.078 | −0.3579 | −0.04931 | 0.308 |
β_{2} (Rural) | −0.3701 | 0.0813 | −0.5290 | −0.2103 | 0.319 | −0.3699 | 0.08083 | −0.5282 | −0.2109 | 0.317 |
β_{3} (SNNPR) | 0.4943 | 0.097 | 0.3023 | 0.6827 | 0.380 | 0.4957 | 0.09676 | 0.3043 | 0.6835 | 0.379 |
β_{4} (18–25) | −0.4829 | 0.1282 | −0.7387 | −0.2357 | 0.503 | −0.484 | 0.1279 | −0.7383 | −0.2373 | 0.501 |
β_{5} (26–35) | −0.3204 | 0.1342 | −0.5873 | −0.0606 | 0.527 | −0.3205 | 0.1338 | −0.5867 | −0.06207 | 0.5246 |
β_{6} (>=35) | −0.2884 | 0.138 | −0.5616 | −0.0215 | 0.540 | −0.2887 | 0.137 | −0.5606 | −0.02299 | 0.538 |
β_{7} (Yes) | 0.38855 | 0.142 | 0.10399 | 0.660 | 0.556 | 0.389 | 0.1395 | 0.1126 | 0.6593 | 0.5467 |
Discussion
The prevalence of non-natural dental caries in the present study was 13%. We found that the odds of non-natural dental caries were higher for males than for females. Similarly, a study of the prevalence of dental caries in Northwest Ethiopia found that prevalence differed between males and females [26]. The highest proportion of dental caries was observed in the 18–25 age group, whereas the lowest proportion was in the < 18 age group, which is supported by [9]. Urban patients were more likely to have dental caries than rural patients. A possible reason is that patients who live in urban areas tend to consume more sweets than rural patients. This is consistent with [27], which reported differences in oral health-related behavior between urban and rural residents; in that study the prevalence of daily use of toothpicks was consistently and significantly higher among urban than rural residents.
Conclusions
In this study we compared the performance of Bayesian logistic regression with that of classical logistic regression. Age, gender, region, place of residence and habit of cleaning teeth were risk factors associated with dental caries. A comparison of the classical and Bayesian logistic regressions shows lower standard errors of the estimated coefficients under the Bayesian approach. Likewise, when the Bayesian estimates were compared with those from maximum likelihood, the Bayesian credible interval was shorter than the confidence interval for all factors.
Notes
Acknowledgements
The authors would like to thank the Director of Hawassa Haik Poly Higher Clinic for permission to conduct this study and to publish this paper. We would also like to express our highest gratitude to the staff of Hawassa Haik Poly Higher Clinic for collecting the data.
Funding
Not applicable.
Availability of data and materials
The datasets generated during the current study are available from the corresponding author on reasonable request.
Authors’ contributions
MSW analyzed and interpreted the data and designed the study, collected the data and contributed to writing the manuscript. DBB participated in data analysis and interpretation and review of the manuscript. Both authors participated in the preparation of the manuscript and approved the final manuscript.
Ethics approval and consent to participate
This study was approved by the research, evaluation and ethical review committee of Hawassa University. Informed verbal consent was obtained from all study subjects since the issue is not culturally sensitive, and the consent procedure was approved by the same committee.
Consent for publication
The authors consent to the publication of this research.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Diouf M, et al. Dental caries and associated determinants among students of the military School of Saint Louis (Senegal). Open Journal of Epidemiology. 2017;7(04):299.
- 2. Muller M, et al. Epidémiologie de la carie dentaire. Encycl Méd Chir Odontologie. 1997;1:23–10.
- 3. World Health Organization. Oral health fact sheet no. 318, April 2012. 2012.
- 4. Bagramian RA, Garcia-Godoy F, Volpe AR. The global increase in dental caries. A pending public health crisis. Am J Dent. 2009;22(1):3–8.
- 5. Kathmandu RY. The burden of restorative dental treatment for children in third world countries. Int Dent J. 2002;52(1):1–9.
- 6. Petersen PE, et al. The global burden of oral diseases and risks to oral health. Bulletin of the World Health Organization. 2005;83:661–9.
- 7. Marsh PD. Are dental diseases examples of ecological catastrophes? Microbiology. 2003;149(2):279–94.
- 8. Berhane HY, Worku A. Oral health of young adolescents in Addis Ababa: a community-based study. Open Journal of Preventive Medicine. 2014;4(08):640.
- 9. World Health Organization. Prevention Methods and Program for Oral Diseases. WHO Technical Report Series 713. Geneva: WHO; 1984.
- 10. Teshome A, Yitayeh A, Gizachew M. Prevalence of Dental Caries and Associated Factors Among Finote Selam Primary School Students Aged 12–20 years, Finote Selam Town, Ethiopia. Age. 2016;12(14):15–7.
- 11. Mulu W, et al. Dental caries and associated factors among primary school children in Bahir Dar city: a cross-sectional study. BMC Research Notes. 2014;7(1):949.
- 12. Tafere Y, et al. Assessment of prevalence of dental caries and the associated factors among patients attending dental clinic in Debre Tabor general hospital: a hospital-based cross-sectional study. BMC Oral Health. 2018;18(1):119.
- 13. Moses J, Rangeeth B, Gurunathan D. Prevalence of dental caries, socio-economic status and treatment needs among 5 to 15 year old school going children of Chidambaram. J Clin Diagn Res. 2011;5(1):146–51.
- 14. Zhang S, et al. Dental caries status of Bulang preschool children in Southwest China. BMC Oral Health. 2014;14(1):16.
- 15. Okoye L. Caries experience among schoolchildren in South-eastern Nigeria: 15. Caries Res. 2010;44(3):177.
- 16. Khan AA, Jain SK, Shrivastav A. Prevalence of dental caries among the population of Gwalior (India) in relation of different associated factors. European Journal of Dentistry. 2008;2:81.
- 17. Treasure E, et al. Factors associated with oral health: a multivariate analysis of results from the 1998 adult dental health survey. Br Dent J. 2001;190(2):60.
- 18. Hosmer DW Jr, Lemeshow S, Sturdivant RX. Applied logistic regression. Vol. 398: John Wiley & Sons; 2013. https://www.wiley.com/en-us/-p-9780470582473.
- 19. Wilson JR, Lorenz KA. Modeling binary correlated responses using SAS, SPSS and R. Vol. 9: Springer; 2015. https://www.springer.com/gp/book/9783319238043.
- 20. Janani L, et al. Statistical Issues in Estimation of Adjusted Risk Ratio in Prospective Studies. Arch Iran Med. 2015;18(10):713–9.
- 21. Petersen MR, Deddens JA. A comparison of two methods for estimating prevalence ratios. BMC Med Res Methodol. 2008;8(1):9.
- 22. Gelman A, et al. Bayesian data analysis: Chapman and Hall/CRC; 1995. http://www.stat.columbia.edu/~gelman/book/.
- 23. Ntzoufras I. Bayesian modeling using WinBUGS. Vol. 698: John Wiley & Sons; 2011. www.stat-athens.averb.gr/~jbn/winbugs-book.
- 24. Albert J. Bayesian computation with R: Springer Science & Business Media; 2009. https://www.springer.com/gp/book/9780387922973.
- 25. Walsh B. Introduction to Bayesian analysis. Lecture notes for EEB. 2002;1:596z.
- 26. Ayele FA, et al. Predictors of dental caries among children 7–14 years old in Northwest Ethiopia: a community based cross-sectional study. BMC Oral Health. 2013;13(1):7.
- 27. Blay D, Åstrøm AN, Haugejorden O. Oral hygiene and sugar consumption among urban and rural adolescents in Ghana. Community Dent Oral Epidemiol. 2000;28(6):443–50.
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.