1 Significance

Child well-being may increase the probability of child survival and improve child health, and it is important to conduct research on child health-related issues and identify risk factors that have a significant impact on child morbidity. This study helps to fill the gap left by previous studies. In addition, this study used Bayesian inference with informative priors on child morbidity in Bangladesh. In this case, we used historical priors as an informative prior distribution, which may provide an improved estimate. Therefore, the key message of this study was to use the historical prior distribution as prior information to increase the accuracy of the study results.

2 Introduction

Child morbidity refers to the state of being unwell as a result of specific conditions or illnesses; the term covers the prevalence of health problems that affect the well-being of children [1]. Infectious diseases, such as pneumonia, diarrhea, and malaria, are the leading causes of under-five child deaths globally [2]. Alarmingly, two-thirds of global child mortality occurs in underdeveloped countries [3]. Reducing global child mortality therefore requires urgent attention to child morbidity in low- and middle-income countries, particularly as these regions are struggling to meet the Sustainable Development Goal (SDG) target on child mortality reduction (target 3.2), which aims to bring child deaths down to 25 per thousand live births by 2030 [4].

While there has been a notable decline in global under-five child mortality in recent decades [5], progress in developing and under-developed regions, such as Africa and Bangladesh, has not been satisfactory [6]. Reports from the World Health Organization indicate that Sub-Saharan Africa and South Asia bear approximately 80% of all child deaths worldwide [5]. A recent study conducted in Tanzania reported 63 child deaths per thousand live births in the 2016 estimate, although the rate had declined by 42% [7]. A separate study in Tanzania, drawing on data from 35 hospitals, identified respiratory distress as the primary cause of early neonatal death, accounting for approximately 21% of cases [8].

Bangladesh, a country facing the challenges of child morbidity, experiences a substantial number of child deaths annually. The nationally representative “Bangladesh Demographic and Health Survey” estimated that approximately 45 children per thousand live births died before their fifth birthday [9]. Pneumonia accounts for 19% of child deaths per year in Bangladesh [10]. According to another Bangladeshi study, nearly 17% of children under the age of five were reported to have diarrhea, and more than 20% were reported to have colds and fever [3]. Fever is another common pediatric condition that is considered a major burden on the global public health system [11]. This literature makes clear that comprehensive and continuous efforts are needed to ensure further reductions in child mortality and to improve child well-being in order to achieve the SDGs.

Child well-being may increase the probability of child survival, and improving child health is one of the major policy agendas for governments, especially in developing countries [12]. It is therefore critical to conduct research on child health-related issues and to identify risk factors that have a significant impact on child morbidity. Several studies have examined the prevalence of child morbidity and its significant factors using the traditional (classical) binary logistic regression model [13, 14]. Additionally, a recent study in Bangladesh adopted a machine learning logistic classifier to identify the risk factors of child morbidity [15]. However, maximum likelihood estimation in the classical approach can be biased when the sample size is small [16]. In this situation, the Bayesian approach can be used as an alternative.

Based on previous studies, Bayesian inference may produce a more accurate estimate than the classical likelihood estimation procedure when prior information is available [17]. While many researchers opt for flat or non-informative priors when employing the Bayesian approach, our study used informative priors [18, 19]. By incorporating informative priors derived from historical data, we were able to enhance the model’s accuracy and achieve improved estimation of parameter values. As a result, our Bayesian analysis yielded more robust and reliable results compared to traditional classical inference methods [20]. Despite various research efforts in the field of child morbidity, there is a noticeable research gap concerning the application of Bayesian inference to explore child morbidity in Bangladesh. In light of this gap, the primary objectives of this study are twofold: first, to investigate the impact of demographic determinants on child morbidity, and second, to identify the risk factors through Bayesian logistic regression with an informative prior distribution.

Understanding the complexities of child morbidity and its influencing factors is crucial for effective policy formulation and targeted interventions. By addressing child morbidity, especially in vulnerable regions like Bangladesh, we can make a significant impact on global child mortality rates. This research aims to contribute valuable insights into child health, supporting efforts to achieve the SDGs and promoting the well-being of children worldwide.

3 Materials and methods

3.1 Data source

This study used data from the Bangladesh Demographic and Health Survey (BDHS) 2017–18, a nationally representative survey funded by the United States Agency for International Development (USAID) in Bangladesh. The data can be accessed at https://dhsprogram.com/data/available-datasets.cfm.

3.2 Sample design and sample size

The authorities of the Bangladesh Demographic and Health Survey used a two-stage stratified sampling design to conduct this survey. In the first stage, 675 Enumeration Areas (EAs) were selected, and in the second stage 30 households were selected from each EA. The 2017–18 BDHS thus covered 20,250 residential households. From these sampled households, 8421 mothers/caretakers with living under-five children were interviewed.

3.3 Dependent variable

Child morbidity among under-five children is measured using three types of child health problems, namely fever, diarrhea, and acute respiratory infection (ARI). Since the target respondents for this study were mothers/caretakers with under-five children, all respondents were asked questions related to these three types of child illness. Table 1 describes these health-related problems with their responses and codes.

Table 1 The module of under-five child morbidity

When a child suffers from at least one of these problems, the child is classified as a “Morbidity child” (“Yes”); otherwise the child is classified as “No” [15]. That is, the main outcome variable of this study was a binary response variable, which can be written as:

$$Child\,Morbidity=\left\{\begin{array}{ll}1, & \text{Yes: the child experienced at least one of these health problems} \\ 0, & \text{No: the child experienced none of these health problems}\end{array}\right.$$
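The coding rule above can be sketched in a few lines of Python. This is only an illustration: the function name `child_morbidity` and the three example records are hypothetical and are not taken from the BDHS data files.

```python
def child_morbidity(fever: bool, diarrhea: bool, ari: bool) -> int:
    """Return 1 if the child had at least one of the three conditions, else 0."""
    return int(fever or diarrhea or ari)


# Hypothetical records: (fever, diarrhea, ARI) as reported by the mother/caretaker.
records = [
    (True, False, False),
    (False, False, False),
    (False, True, True),
]
outcomes = [child_morbidity(*record) for record in records]
print(outcomes)  # [1, 0, 1]
```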

3.4 Independent variable

A set of sociodemographic risk factors that could determine the morbidity of children under five years of age in Bangladesh was considered; these are described in detail in Table 2.

Table 2 Description of independent variables

3.5 Statistical analysis and software

This study split the analysis into three parts: descriptive analysis, bivariate analysis, and multivariate analysis. The descriptive analysis reports the percentage distribution of the study variables. Bivariate analysis was conducted to examine the association between the selected variables and under-five child morbidity. The chi-square test, which assesses the association between two categorical variables, was employed for this analysis. The chi-square test statistic \(({\chi }^{2})\) can be defined as:

$${\chi }^{2}=\sum \frac{{(Observed\,frequency-Expected\,frequency)}^{2}}{Expected\,frequency} \sim {\chi }_{(r-1)(c-1)}^{2}$$

where r is the number of categories for the independent variable and c is the number of categories for the dependent variable.
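As a worked illustration of this statistic, the sketch below computes it for a small contingency table in plain Python. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the survey counts.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: (row total * column total) / n
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: rows = child sex (male, female), columns = morbidity (yes, no).
table = [[30, 20], [20, 30]]
stat, df = chi_square_statistic(table)
print(stat, df)  # 4.0 1
```

With one degree of freedom, a statistic of 4.0 exceeds the 5% critical value of about 3.84, so independence would be rejected for these illustrative counts.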

In the multivariate setup, the effect of an independent variable on the child morbidity status among under-five children can be determined using logistic regression. Here, Bayesian logistic regression with an informative prior was used to identify risk factors for under-five morbidity in Bangladesh.

The traditional binary logistic regression model is widely used to explain the probability of a binary response as a function of covariates [13, 14]. In Bayesian binary logistic regression, leveraging prior knowledge allows us to attain more precise estimates compared to the traditional logistic regression estimation approach [16]. Bayesian binary logistic regression is based on Bayes' theorem, under which the posterior distribution is proportional to the product of the likelihood and the prior:

$$f\left(\beta |{Y}_{i},{X}_{ip}\right)\propto f({Y}_{i}|\beta )\times f(\beta )$$

Here, \(f\left(\beta |{Y}_{i},{X}_{ip}\right)\) is the posterior distribution of the parameters \(\beta\); \(f({Y}_{i}|\beta )\) is the likelihood function, which represents the probability of observing the data \(({Y}_{i})\) given specific values of the unknown parameters \((\beta )\); and \(f(\beta )\) is the prior distribution, which represents our initial beliefs or knowledge about the unknown parameters \(\beta\) and can be extracted from previous studies. In this study, child morbidity (\({Y}_{i}\)) is the binary dependent variable, which follows a Bernoulli distribution with parameter \({\pi }_{i}\), and \({X}_{i1},\dots ,{X}_{ip}\) is a set of independent variables. The link function can therefore be written as,

$$logit\left({\pi }_{i}\right)=log\frac{{\pi }_{i}}{1-{\pi }_{i}}={\beta }_{0}+{\beta }_{1}{X}_{i1}+\dots +{\beta }_{p}{X}_{ip}$$

where,

$${\pi }_{i}=\frac{\mathrm{exp}\,({\beta }_{0}+{\beta }_{1}{X}_{i1}+\dots +{\beta }_{p}{X}_{ip})}{1+\mathrm{exp}\,({\beta }_{0}+{\beta }_{1}{X}_{i1}+\dots +{\beta }_{p}{X}_{ip})}$$
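To make the link between coefficients, probabilities, and odds ratios concrete, the following sketch evaluates the two expressions above for one binary covariate. The coefficient values `beta0` and `beta1` are hypothetical and are not estimates from this study.

```python
import math


def inv_logit(eta: float) -> float:
    """Map a linear predictor to a probability: exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients: intercept and the effect of one binary covariate.
beta0, beta1 = -0.8, 0.4

p_reference = inv_logit(beta0)          # covariate = 0
p_exposed = inv_logit(beta0 + beta1)    # covariate = 1

# The ratio of the two odds equals exp(beta1), the odds ratio for the covariate.
odds_ratio = (p_exposed / (1 - p_exposed)) / (p_reference / (1 - p_reference))
print(odds_ratio, math.exp(beta1))
```

The two printed values agree up to floating-point error, which is why exponentiated coefficients are reported as odds ratios.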

Using the value of \({\pi }_{i}\), the likelihood function \(f({Y}_{i}|\beta )\) for a given sample of size \(n\) can be written as follows, where \({M}_{i}\) denotes the number of trials for observation \(i\) (\({M}_{i}=1\) for the binary outcome considered here):

$$f\left({Y}_{i}|\beta \right)=\prod_{i=1}^{n}\left(\begin{array}{c}{M}_{i}\\ {Y}_{i}\end{array}\right){\left({\pi }_{i}\right)}^{{Y}_{i}}{\left(1-{\pi }_{i}\right)}^{{M}_{i}-{Y}_{i}}$$
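In practice the likelihood is evaluated on the log scale. The sketch below does this for the Bernoulli case, assuming one trial per child so that the binomial coefficient equals 1; the function name, data, and coefficient values are hypothetical.

```python
import math


def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of a logistic model.

    beta[0] is the intercept; each row of X holds the covariates of one child.
    Uses the identity y*log(pi) + (1-y)*log(1-pi) = y*eta - log(1 + exp(eta)).
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = beta[0] + sum(b * x for b, x in zip(beta[1:], x_i))
        total += y_i * eta - math.log1p(math.exp(eta))
    return total


# Hypothetical data: one binary covariate, four children.
X = [[1], [0], [1], [0]]
y = [1, 0, 0, 1]
print(log_likelihood([0.0, 0.0], X, y))  # equals 4 * log(0.5)
```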

Here, \({\beta }_{j}\) are the unknown parameters, which are estimated with the help of a prior distribution. There are two types of prior in Bayesian analysis: (a) flat/non-informative priors and (b) informative priors. In this study we use an informative prior, termed the “historical prior distribution”, which is extracted from previous survey data (i.e., the Bangladesh Demographic and Health Survey, 2014, which is available at https://dhsprogram.com/data/available-datasets.cfm). This historical prior may improve the precision of the parameter estimates [20, 21]. The most common priors for logistic regression parameters, which are used here, are of the form

$${\beta }_{j}\sim N\left({\mu }_{j},{\sigma }_{j}^{2}\right).$$
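As a sketch of how the pieces combine, consider the Bernoulli case (one trial per child) and assume that the normal priors are independent across coefficients and are also placed on the intercept; both are assumptions made here for illustration. The log posterior is then, up to an additive constant that does not depend on \(\beta\),

$$\log f\left(\beta |{Y}_{i},{X}_{ip}\right)=\sum_{i=1}^{n}\left[{Y}_{i}\log {\pi }_{i}+\left(1-{Y}_{i}\right)\log \left(1-{\pi }_{i}\right)\right]-\sum_{j=0}^{p}\frac{{\left({\beta }_{j}-{\mu }_{j}\right)}^{2}}{2{\sigma }_{j}^{2}}+\text{constant}$$

The second sum pulls each \({\beta }_{j}\) toward its prior mean \({\mu }_{j}\), and the pull is stronger when the prior variance \({\sigma }_{j}^{2}\) is small.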

To identify the values of \({\mu }_{j}\) and \({\sigma }_{j}^{2}\), this study applied maximum likelihood estimation to the previous survey dataset to estimate the unknown parameters, and then used the parametric bootstrap resampling procedure for efficient computation of the Bayes prior distributions [22].
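The idea can be sketched with a deliberately simplified model. The example below assumes an intercept-only logistic model, for which the maximum likelihood estimate has the closed form logit of the sample proportion, and uses hypothetical historical data; the study's actual prior was built from a multi-covariate model fitted to the 2014 survey, which this sketch does not reproduce.

```python
import math
import random
import statistics


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def bootstrap_normal_prior(y_hist, n_boot=2000, seed=1):
    """Parametric bootstrap of the intercept of an intercept-only logistic model.

    Returns the mean and variance of the bootstrap estimates, which can serve as
    (mu, sigma^2) of a normal prior for the intercept.
    """
    rng = random.Random(seed)
    n = len(y_hist)
    p_hat = sum(y_hist) / n                      # fitted probability (MLE)
    estimates = []
    for _ in range(n_boot):
        # Simulate a new data set from the fitted model, then re-estimate.
        successes = sum(1 for _ in range(n) if rng.random() < p_hat)
        # Keep the proportion strictly inside (0, 1) so the logit stays finite.
        p_star = min(max(successes / n, 0.5 / n), 1.0 - 0.5 / n)
        estimates.append(logit(p_star))
    return statistics.mean(estimates), statistics.variance(estimates)


# Hypothetical historical sample: 140 of 400 children with morbidity.
y_hist = [1] * 140 + [0] * 260
mu, sigma2 = bootstrap_normal_prior(y_hist)
print(mu, sigma2)
```

For these illustrative data the bootstrap mean should land close to logit(0.35), and the variance close to the large-sample value 1 / (n p (1 - p)).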

In practice, directly calculating the posterior distribution may not be analytically feasible, especially for complex models. Therefore, numerical methods such as Markov Chain Monte Carlo (MCMC) simulation are often used to approximate the marginal posterior distribution of the unknown parameters. MCMC methods generate samples from the posterior distribution, allowing researchers to make statistical inferences and estimate model parameters. For a deeper understanding of the numerical methods used to calculate and determine the posterior distribution in Bayesian statistics, we recommend reading the article [23].
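A minimal random-walk Metropolis–Hastings sampler for a logistic model with normal priors is sketched below. It is a plain textbook version intended only to show the accept/reject mechanics, not the sampler configuration used in this study; the grouped counts, prior values, step size, and iteration numbers are all hypothetical.

```python
import math
import random


def log_posterior(beta, groups, prior):
    """Log-likelihood for grouped binary data plus independent normal log-priors."""
    lp = 0.0
    for x, n, s in groups:          # covariate value, children, children with morbidity
        eta = beta[0] + beta[1] * x
        lp += s * eta - n * math.log1p(math.exp(eta))
    for b, (mu, sigma2) in zip(beta, prior):
        lp -= (b - mu) ** 2 / (2.0 * sigma2)
    return lp


def metropolis_hastings(groups, prior, n_iter=20000, burn_in=2000, step=0.15, seed=42):
    rng = random.Random(seed)
    beta = [mu for mu, _ in prior]  # start the chain at the prior means
    current = log_posterior(beta, groups, prior)
    draws = []
    for it in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        candidate = log_posterior(proposal, groups, prior)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
        if it >= burn_in:           # discard the burn-in draws
            draws.append(list(beta))
    return draws


# Hypothetical data: 150/500 ill when x = 0 and 200/500 ill when x = 1.
groups = [(0, 500, 150), (1, 500, 200)]
prior = [(-0.8, 0.04), (0.4, 0.04)]   # hypothetical (mu_j, sigma_j^2)
draws = metropolis_hastings(groups, prior)
beta1_mean = sum(d[1] for d in draws) / len(draws)
print(beta1_mean, math.exp(beta1_mean))
```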

For the simulation, convergence was achieved after 150,000 iterations per chain, following a burn-in period of 500 iterations, with thinning that retained every 99th element for each model. Four Markov chains were used in the simulation. In this study, we present parameter estimates as odds ratios with 95% credible intervals. In Bayesian statistics, credible intervals express the uncertainty in parameter estimates; they are derived directly from the posterior distribution and therefore incorporate both prior knowledge and the observed data.
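The post-processing described here (thinning, then summarizing draws as an odds ratio with a credible interval) can be sketched as follows. The posterior draws are hypothetical, evenly spaced values used only to keep the example deterministic, and the use of the posterior median as the point summary and of an equal-tailed interval are assumptions of this sketch.

```python
import math


def thin(draws, every):
    """Keep every `every`-th draw, starting with the first."""
    return draws[::every]


def odds_ratio_summary(beta_draws, level=0.95):
    """Posterior median odds ratio and equal-tailed credible interval."""
    ors = sorted(math.exp(b) for b in beta_draws)

    def quantile(q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(ors) - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return ors[lo] + (ors[hi] - ors[lo]) * (pos - lo)

    alpha = 1.0 - level
    return quantile(0.5), quantile(alpha / 2), quantile(1 - alpha / 2)


# Hypothetical posterior draws of one coefficient, from 0.30 to 0.50.
beta_draws = [0.30 + 0.002 * k for k in range(101)]
median_or, lower, upper = odds_ratio_summary(beta_draws)
print(median_or, lower, upper)
print(thin(list(range(10)), 3))  # [0, 3, 6, 9]
```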

The MCMC approximation was implemented in Stata 16 using the “bayesmh” command to estimate the marginal posterior distribution of each parameter.

4 Results

4.1 Prevalence of child morbidity

Figure 1 shows the regional prevalence of child morbidity in Bangladesh. Of the 8421 mothers/caretakers with living under-five children, 35.6% reported a child affected by a morbidity condition. Fever was the most common condition, affecting 33.1% of the children, followed by diarrhea (4.7%) and acute respiratory infection (ARI) (3.0%). Among the divisions, Barisal had the highest overall prevalence of child morbidity (41.8%) and also the highest prevalence of the individual conditions.

Fig. 1 Spatial illustration of prevalence of child morbidity in Bangladesh

4.2 Association between socio-demographic factors and child morbidity

Table 3 presents the percentage distribution of the selected socioeconomic and demographic characteristics among the 8421 participants. Most children were aged 12–35 months (39.9%) or 36–59 months (38.8%). Slightly over half of the children were male (52.1%), and 47.9% were female. Most children showed no evidence of being underweight (78.2%), while 21.8% did. An overwhelming majority were singleton births (98.4%). Regarding maternal education, 48.5% of the children’s mothers had a secondary level of education, and 28.6% had primary education. The majority of mothers were not working (59.4%). In terms of wealth status, 41.6% of the children were from poor families, and 39.6% were from rich families. The majority of the children came from Muslim families (92.0%). Most of the children’s families had access to mass media (54.7%), while 45.3% did not. Over two-thirds of the children lived in rural areas (72.6%), and 27.4% lived in urban areas. Furthermore, according to the \({\chi }^{2}\) test of independence, Table 3 shows that child morbidity was statistically significantly associated with child age, child sex, underweight status, maternal education, wealth status, religion, and place of residence.

Table 3 Percentage distribution and association between selected covariates and child morbidity among under-five children in Bangladesh

4.3 Identifying factors contributing to child morbidity using Bayesian logistic regression

This study fitted a Bayesian binary logistic regression model, which combines the likelihood function with the prior distribution. The informative prior distribution was taken from the previous survey and is termed the "historical prior distribution". To extract this historical prior, this study used the Bangladesh Demographic and Health Survey, 2014 data set and then developed the posterior distribution. Markov Chain Monte Carlo (MCMC) simulation via the Metropolis–Hastings algorithm was used, with 4 Markov chains; convergence of the simulation was reached after 150,000 iterations per chain, following 500 burn-in samples and thinning that retained every 99th element of the chain.

The Bayesian inference results in Table 4 indicate that child age was significantly associated with child morbidity. Children younger than 12 months were 1.51 times more likely to be affected by child morbidity, while children aged 12–35 months had nearly 48% higher odds (OR = 1.48, 95% CI: 1.36, 1.58), compared with children aged 36–59 months.

Table 4 Odds ratios (OR) and 95% credible interval (95% CI) from Bayesian (informative prior) logistic regression

Children’s sex also had a significant association with child morbidity: male children were 1.07 times more likely to be affected by child morbidity conditions than female children. Children who were underweight had 31% higher odds (OR = 1.31, 95% CI: 1.21, 1.40) of being affected by child morbidity conditions compared to children who were not underweight.

Maternal education also showed a significant association with child morbidity. Children with an uneducated mother had a 21% higher risk of developing child morbidity conditions (OR = 1.21, 95% CI: 1.04, 1.40). Children whose mothers had primary or secondary education were each 1.30 times more likely to be affected by child morbidity conditions.

Results from our Bayesian inference showed that children from Muslim families were 25% more likely to experience child morbidity conditions (OR = 1.25, 95% CI: 1.11, 1.39) compared to children from families of other religions.

Trace plots of the samples are useful for evaluating the convergence of the MCMC sampling algorithm; the fluctuation in the plots shows how stably the chain explores the distribution of each parameter. Figure 2 shows the posterior distribution for the coefficients of the model parameters under the normal (informative) prior distributions. MCMC trace plots of the model parameters show how well the samples mix; good mixing is an indication that the Bayesian inference is successful, that the informative prior distributions used are accurate, and that the Markov chain has converged. To assess MCMC convergence, the Gelman-Rubin diagnostic compares the results of several Markov chains. If the Gelman-Rubin diagnostic statistic (\({R}_{c}\)) is less than 1.1 for all model parameters (\(\beta\)), one can be fairly confident that convergence has been reached [24]. In this study, \({R}_{c}<1.1\) for \({\beta }_{i};i=1,\dots ,11\), so we are confident that the Markov chain has converged. Here, \({\beta }_{1}=<12\,months\,children\), \({\beta }_{2}=12-35\,months\,children\), \({\beta }_{3}=Male\,children\), \({\beta }_{4}=Underweight\,children\), \({\beta }_{5}=Uneducated\,mother\), \({\beta }_{6}=Primary\,educated\,mother\), \({\beta }_{7}=Secondary\,educated\,mother\), \({\beta }_{8}=Poor\,wealth\,status\), \({\beta }_{9}=Middle\,wealth\,status\), \({\beta }_{10}=Muslim\,family\), and \({\beta }_{11}=Rural\,residence\).
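The Gelman-Rubin statistic compares between-chain and within-chain variability. The sketch below implements its basic form for a single parameter; statistical software may apply additional corrections, so values can differ slightly from those reported by a given package. The chains are simulated, hypothetical draws rather than output from this study.

```python
import random
import statistics


def gelman_rubin(chains):
    """Basic potential scale reduction factor for one parameter.

    `chains` is a list of m chains, each a list of n draws of the same parameter.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    within = statistics.fmean([statistics.variance(chain) for chain in chains])  # W
    between = n * statistics.variance(chain_means)                               # B
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(0)
# Four hypothetical chains that sample the same distribution (well mixed).
mixed = [[rng.gauss(0.4, 0.1) for _ in range(1000)] for _ in range(4)]
# The same chains shifted apart, mimicking chains stuck in different regions.
stuck = [[x + 0.2 * k for x in chain] for k, chain in enumerate(mixed)]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print(r_mixed, r_stuck)
```

Chains that agree give a value close to 1, while chains centred in different places give a value well above the 1.1 threshold.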

Fig. 2 Trace plots for model parameters of the Bayesian logistic regression model with informative prior

5 Discussion

In recent decades, Bangladesh has made significant progress in improving its child health sector [25]. As a result, the child mortality rate in Bangladesh fell by more than 65% between 1990 and 2017 [26]. Despite this progress, the reduction in the child morbidity rate has not been successful. Child health issues therefore remain a major challenge for Bangladesh and need further attention. This study was designed with this challenge in mind, and its objectives were set accordingly. The main aim was to identify the risk factors related to child morbidity with the help of an informative prior and to describe their impact, which will help policymakers in decision making. To this end, we applied the Bayesian logistic regression model with the MCMC simulation method and presented the trace plots of this model, which indicate that the Bayesian inference is successful, as the informative prior distributions used are accurate and the Markov chain has converged.

This study revealed that more than one-third of children in Bangladesh suffer from fever, diarrhea, or ARI-related problems. These findings align closely with the results of a previous study conducted in Bangladesh, reaffirming the persistent prevalence of these health conditions among children [27]. However, to address this issue effectively, it is crucial to examine the factors behind these high morbidity rates. Our study identified a higher child morbidity rate in the Barisal division. This elevated rate may be attributable to inequalities in healthcare utilization for common illnesses among under-five children [28].

The findings of this study identified various socio-demographic risk factors that have a significant impact on child morbidity in Bangladesh. Child age has a notable impact on child morbidity: compared with children aged 36–59 months, children aged < 12 months or 12–35 months were more likely to suffer from at least one of the child health problems. This finding is consistent with several previous studies [29, 30]. One possible reason for the higher morbidity rates among children aged < 12 months is that they have more frequent contact with others, increasing their exposure to infectious agents [31]. Another significant finding of this study was that male children had a higher risk of morbidity than their female counterparts. Previous studies based on nationally representative survey data found approximately similar results [15, 32]. Research has shown that, in humans, females tend to exhibit stronger humoral and cellular immune responses to infections or antigenic stimulation than males [33].

Child nutrition serves as a crucial indicator of child health outcomes, as well-nourished children generally experience better overall health. The Bayesian logistic regression model in this study showed that underweight children had higher odds of experiencing child morbidity compared to their well-nourished counterparts. A previous study conducted in Cambodia showed that lower nutritional status in children increases the risk of illness [34].

Maternal education is always an important demographic indicator of child health outcomes. According to this study’s findings, children born to less educated mothers had a higher risk of suffering from fever, diarrhea, or ARI than children of highly educated mothers. Educated mothers tend to be more aware of their child’s health, preventive care, and the effective use of modern health services [35]. This finding is consistent with previous studies conducted in Africa [36] and Nepal [37].

Furthermore, this study found that children of Muslim mothers had a higher risk of morbidity than children of non-Muslim mothers. A recent study covering 15 sub-Saharan African countries demonstrated that Muslims had lower immunization coverage for all their children [38].

6 Strengths and limitations of the study

The main strength of this study was that the data came from a large and nationally representative survey. A further methodological strength was that it fitted a Bayesian logistic regression model using an informative prior extracted from a historical survey data set from Bangladesh. To ensure the accuracy and reliability of the prior distribution, a resampling technique known as “bootstrapping” was used, which allowed for precise estimation of the mean and variance from historical data. Researchers from Bangladesh who incorporated a historical prior reported improved estimates compared to a flat prior distribution [20]. While our study provides valuable insights into child morbidity risk factors in Bangladesh, several limitations should be acknowledged. First, the cross-sectional design limits our ability to establish causality between the identified risk factors and child morbidity. Future research should consider longitudinal or experimental designs for causal inference. Additionally, owing to data limitations, this study was unable to incorporate several important factors, such as environmental factors and healthcare utilization patterns.

7 Conclusion

This study concludes that child morbidity is a serious public health issue in Bangladesh. Binary logistic regression with Bayesian MCMC simulation was used to identify the risk factors, drawing on results from a previous survey as prior information. Child age, sex, nutrition, maternal education, and wealth status are important determinants that influence child morbidity in Bangladesh. It is therefore necessary to expand child health programs to reduce the risk of childhood disease. Based on our study findings, we recommend implementing health education and awareness campaigns for parents and caregivers, focusing on child nutrition, and promoting timely healthcare-seeking behaviors to reduce morbidity rates. Additionally, enhancing maternal education is crucial. Since this study used a bootstrap algorithm to extract the prior distribution from the BDHS 2014 survey data, further research should explore different types of prior distribution by using Bayesian meta-analysis [39].