Background

The gap in infant mortality rate (IMR) and life expectancy between Sub-Saharan Africa (SSA) and other parts of the world has attracted the attention of several researchers. A World Bank report of 2011 pointed out that the IMR was 75/1000 in SSA versus 11/1000 in developed countries [1]. The same report pointed out that half of the ten million children who die every year are in SSA. The World Bank dataset from 1960 to 2005 suggests that low life expectancy at birth is more pronounced in Middle Africa than in the other sub-regions of SSA [2]. The World Bank records of 2017 indicated that the IMR was 51.50/1000 in SSA [3]. The Central African Republic had the highest IMR (87.60/1000), Mauritius had the lowest (11.60/1000), and the IMR in Rwanda was 28.90/1000. Several studies have examined factors that could lower infant mortality and have made recommendations, but the IMR remains a problem in SSA.

The multiple events model for infant mortality at the Kigali University Teaching Hospital analysed in [4] leaves open the question of whether the adopted model is stable. The main causes of instability may be correlation among the covariates or a relatively small sample size [5]. One way of assessing instability in survival regression models is the use of re-sampling techniques [6]. The analysis in [4] is a non-re-sampled model that used the primary dataset of the year 2016. The two observable events per subject are death and the occurrence of at least one of the common conditions that may also lead to the death of infants in the long term. It was found that the Marginal Risk Set Model (MRSM), also known as the Wei, Lin and Weissfeld Model (WLWM), fitted the data well. The WLWM is among the multiplicative methods for analysing ordered events found in [7]. Other multiplicative models include the Andersen-Gill Model (AGM) and the Prentice, Williams and Peterson Model (PWPM) [8].

The present study uses two popular nonparametric methods of re-sampling, namely the bootstrap, which is based on random samples drawn with replacement [9], and the jackknife, which is based on sampling by leaving out one observation at a time [9]. The size of the sample in [4] is 2117 and the records cover the year 2016. If the model proves stable after re-sampling, its results could be assumed to hold in the longer term. Several manuscripts on re-sampling in survival analysis are limited to the re-sampled Cox proportional hazards model and to estimating standard errors of the survival and hazard functions, such as [6, 10,11,12,13], where the bootstrap is involved; [13,14,15,16], in which the jackknife is involved; or [17,18,19,20,21,22], where hazard and survival functions with their respective standard errors are of interest. The present study analyses the bootstrap-based MRSM with 1000 replicates and the jackknife-based MRSM. The results are then compared to those of the MRSM.

Methods

Dataset

The time to event data of 2117 newborns at the Kigali University Teaching Hospital (KUTH) were recorded from 1 January to 31 December 2016. At KUTH, all newborns are recorded in registries with all details of parents and clinical outcomes of each newborn. The information in the registry provides references to card indexes that give information on the clinical behavior of babies after leaving the hospital. KUTH, the site of interest in this study, is a central hospital to which most complicated childbirths countrywide are transferred. In 2016, KUTH recorded a relatively high incidence of stillborn cases (69 stillborn babies or 3.259%) and a relatively high infant mortality rate (3.873%). Table 1 summarises the information on newborns at KUTH over the study time.

Table 1 Summary on newborns under study

The study is interested in subjects with correct information on the covariates of interest. Two events per subject are observed, namely death and the incidence of at least one chronic disease or complication such as severe oliguria, severe prematurity, very low birth weight, macrosomia, severe respiratory distress, gastroparesis, hemolytic, trisomy, asphyxia and laparoschisis. Apart from the event status and the time to event, 11 covariates are recorded. They are subdivided into demographic covariates, which include the age and the place of residence of parents; clinical covariates for female parents, which include obstetric antecedents, type of childbirth and previous abortion; and clinical covariates for babies, which include APGAR, gender, number of births at a time, weight, circumference of the head, and height. Table 2 gives a description of the variables of interest.

Table 2 Description of variables in the dataset on newborns at Kigali University Teaching Hospital (KUTH) during the period 01-January-2016 to 31-December-2016

Statistical methods

Marginal risk set model

Assume that h(t|xi) is the hazard function of the survival time T given the p fixed covariates xi = (xi1, xi2, ..., xip). Let h0(t) be the baseline hazard function, that is, the hazard function when xi = (0, 0, ..., 0); then

$$ h\left(t|{\mathbf{x}}_i\right)={h}_0(t)\ \exp \left({\boldsymbol{\beta}}^{\prime }{\mathbf{x}}_i\right) $$
(1)

where β = (β1, β2, ..., βp) is a p-dimensional vector of model parameters [23]. Define an indicator function as

δij(t) = 1 if individual i is at risk of the jth event and δij(t) = 0 otherwise.

The marginal risk set model (MRSM), or the Wei, Lin and Weissfeld Model (WLWM), assumes that events are unordered, where each event has its own stratum and each data point appears in all strata [4, 24]. In other words, the kth time interval per subject is in the kth stratum, k = 1, 2, ..., n.

The hazard function for the jth event for the individual i is given by

$$ h\left(t|{\mathbf{x}}_i\right)={\delta}_{ij}(t)\ {h}_{0j}(t)\ \exp \left({\boldsymbol{\beta}}_j^{\prime }\ {\mathbf{x}}_i\right) $$
(2)

Maximum likelihood and parameter estimation

Let ]0, τi[ be the interval of time in which individual i is observed, with ni the number of events of individual i along ]0, τi[, and assume that two events cannot occur simultaneously in continuous time. The likelihood for the outcomes ni along ]0, τi[ is given by

$$ L\left(\varPhi \right)=\prod \limits_{i=1}^n{L}_i\left(\varPhi \right) $$

where

$$ {L}_i\left(\varPhi \right)=\prod \limits_{j=1}^{n_i}h\left({t}_{ij}|{\mathbf{x}}_i\right){e}^{-\underset{0}{\overset{\tau_i}{\int }}{\delta}_{ij}(v)h\left(v|{\mathbf{x}}_i\right) dv}. $$
(3)

In (3), individual i has ni events with ni ≥ 0 at times \( {t}_{i1}\le {t}_{i2}\le \cdots \le {t}_{i{n}_i} \).

The appropriate partial likelihood functions for tied time to event data are well described in [24] and in [25] and include Breslow’s, Efron’s and Cox’s techniques. The maximum likelihood estimates are obtained by solving the system

$$ \left\{\begin{array}{c}\frac{\partial \ln L\left(\varPhi \right)}{\partial \alpha }=0\\ {}\frac{\partial \ln L\left(\varPhi \right)}{\partial \beta }=0\end{array}\right. $$
(4)

where α is known as the baseline parameter vector while β is a vector of model parameters. The Newton-Raphson method is one of the numerical methods used for solving system (4). The adequacy of the likelihood estimates is checked by finding the elements αα, αβ, βα and ββ of the information matrix and assuming that, as \( n\to \infty \), \( \hat{\varPhi}-\varPhi \to N\left(0,{\Im}^{-1}\left(\hat{\varPhi}\right)\right) \) in distribution [4, 26].
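The Newton-Raphson iteration mentioned above can be illustrated with a minimal sketch. It is not the fitting routine used in the study: it assumes a plain, unstratified Cox partial likelihood with Breslow handling of ties, estimates only β (the baseline parameters α are not modelled), and the function names `score_and_information` and `fit_cox_newton_raphson` as well as the toy data are hypothetical.

```python
import numpy as np


def score_and_information(beta, time, status, X):
    """Score vector and observed information of the Breslow partial likelihood."""
    weights = np.exp(X @ beta)
    p = X.shape[1]
    score = np.zeros(p)
    info = np.zeros((p, p))
    for i in np.flatnonzero(status == 1):
        at_risk = time >= time[i]              # risk set at the i-th event time
        w_r, x_r = weights[at_risk], X[at_risk]
        s0 = w_r.sum()
        mean = (w_r[:, None] * x_r).sum(axis=0) / s0   # weighted covariate mean
        second = (x_r.T * w_r) @ x_r / s0              # weighted second moment
        score += X[i] - mean
        info += second - np.outer(mean, mean)
    return score, info


def fit_cox_newton_raphson(time, status, X, max_iter=50, tol=1e-10):
    """Newton-Raphson updates beta <- beta + I(beta)^{-1} U(beta) until the step is tiny."""
    time = np.asarray(time, dtype=float)
    status = np.asarray(status, dtype=int)
    X = np.asarray(X, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        score, info = score_and_information(beta, time, status, X)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Standard errors from the inverse information at the final estimate.
    _, info = score_and_information(beta, time, status, X)
    se = np.sqrt(np.diag(np.linalg.inv(info)))
    return beta, se


# Toy data (hypothetical): four subjects, one binary covariate, no censoring.
beta_hat, se_hat = fit_cox_newton_raphson(
    time=[1, 2, 3, 4], status=[1, 1, 1, 1], X=[[1], [0], [1], [0]]
)
```

For this toy dataset the score equation can be solved by hand and gives exp(β) = (1 + √17)/2, which offers a simple check of the iteration. The standard errors use the asymptotic normality assumption stated above.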

In MRSM, n is assumed to be the maximum number of events per subject while τk, k = 1, 2, ..., n are the times to events per subject along the study time with range [0, T]. The study time is partitioned into n + 1 intervals of the form

$$ 0-{\tau}_1,0-{\tau}_2,...,0-{\tau}_n,0-T. $$
(5)
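The marginal layout can be sketched in code. The following is a minimal illustration of the usual WLW data arrangement, in which every subject contributes one row to every event stratum and time is always measured from 0; it is an assumption about how setup (5) translates into a data table, and the names `Row` and `wlw_layout` and the example values are hypothetical.

```python
from typing import List, NamedTuple, Sequence


class Row(NamedTuple):
    subject: int    # subject identifier
    stratum: int    # event number k (one stratum per event)
    time: float     # time from 0 to the k-th event, or to end of follow-up
    status: int     # 1 if the k-th event was observed, 0 if censored


def wlw_layout(subject: int, event_times: Sequence[float],
               follow_up_end: float, n_events: int) -> List[Row]:
    """Build one row per stratum for a single subject.

    The subject appears in all n_events strata. If the k-th event was
    not observed, the row is censored at the end of follow-up.
    """
    ordered = sorted(event_times)
    rows = []
    for k in range(1, n_events + 1):
        if k <= len(ordered):
            rows.append(Row(subject, k, ordered[k - 1], 1))
        else:
            rows.append(Row(subject, k, follow_up_end, 0))
    return rows


# Hypothetical subject: one event at day 30, followed up to day 365,
# with at most two events per subject.
example_rows = wlw_layout(subject=1, event_times=[30.0],
                          follow_up_end=365.0, n_events=2)
```

A Cox model stratified on `stratum` can then be fitted to the stacked rows of all subjects, which is the sense in which each data point appears in all strata.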

STATA 15 provides the results of the MRSM by applying the Cox Proportional Hazards Model (CPHM) to the dataset in the setup (5). The proportional hazards assumption is tested by checking the patterns of the survival functions for the groups of each covariate. Figure 1 presents these patterns, obtained by Kaplan-Meier estimation. The patterns are approximately parallel for the covariates of interest. This supports the construction of the MRSM with all the covariates.

Fig. 1 Plots of the survival function per groups of covariates
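The group-wise survival curves used for this visual check rely on the Kaplan-Meier product-limit estimator. A minimal sketch is given below; the function name `kaplan_meier` and the toy data are hypothetical, and the study itself obtained the curves in STATA.

```python
import numpy as np


def kaplan_meier(time, status):
    """Return the distinct event times and the Kaplan-Meier survival estimate.

    time   : follow-up times
    status : 1 if the event was observed, 0 if censored
    """
    time = np.asarray(time, dtype=float)
    status = np.asarray(status, dtype=int)
    event_times = np.unique(time[status == 1])
    survival = []
    current = 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)                     # subjects still under observation
        events = np.sum((time == t) & (status == 1))    # events occurring at t
        current *= 1.0 - events / at_risk
        survival.append(current)
    return event_times, np.array(survival)


# Toy data (hypothetical): five subjects, two of them censored.
km_times, km_survival = kaplan_meier(time=[1, 2, 2, 3, 4], status=[1, 1, 0, 1, 0])
```

Applying the function separately to each level of a covariate and plotting the resulting curves gives the kind of comparison shown in Fig. 1.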

Re-sampled MRSM

The Bootstrap Marginal Risk Set Model (BMRSM) is the inference of model (2) based on bootstrap samples (see Appendix). The BMRSM consists of applying model (2) to each of the B bootstrap samples xi*k, k = 1, ..., B, of the covariates xi, i = 1, ..., p. Bootstrap model parameter estimation in the presence of tied events uses either the Breslow, Efron or Cox approach. The bootstrap standard error is obtained by using Eq. (6) of the Appendix.

As for the BMRSM, the Jackknife Marginal Risk Set Model (JMRSM) consists of applying model (2) to each of the n jackknife samples xi*k of the covariates xi, i = 1, ..., p, using the Breslow, Efron or Cox approach for estimating the jackknife model parameters. The jackknife standard error is given by Eq. (7) found in the Appendix.
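The two re-sampling schemes can be sketched generically. The code below assumes the textbook bootstrap and jackknife standard error formulas, which may differ in detail from Eqs. (6) and (7) of the Appendix, and it uses the sample mean as a stand-in estimator so that the example stays self-contained; in the study the estimator would be the fitted MRSM coefficient vector and the re-sampling units would be subjects. The names `bootstrap_se` and `jackknife_se` are hypothetical.

```python
import numpy as np


def bootstrap_se(data, estimator, n_replicates=1000, seed=0):
    """Standard deviation of the estimator over samples drawn with replacement."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    estimates = np.array([
        estimator(data[rng.integers(0, n, size=n)])   # one bootstrap sample
        for _ in range(n_replicates)
    ])
    return estimates.std(axis=0, ddof=1)


def jackknife_se(data, estimator):
    """Jackknife standard error from the n leave-one-out estimates."""
    data = np.asarray(data)
    n = data.shape[0]
    estimates = np.array([
        estimator(np.delete(data, i, axis=0))         # leave observation i out
        for i in range(n)
    ])
    centred = estimates - estimates.mean(axis=0)
    return np.sqrt((n - 1) / n * np.sum(centred ** 2, axis=0))


# Stand-in example (hypothetical data): standard error of a sample mean.
sample = np.random.default_rng(1).normal(size=50)
se_boot = bootstrap_se(sample, np.mean, n_replicates=1000)
se_jack = jackknife_se(sample, np.mean)
```

For the sample mean, the jackknife standard error coincides with the usual s/√n, which gives a quick sanity check; the bootstrap value varies slightly with the seed and the number of replicates.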

Results

Using Breslow estimation [27], Table 3 presents unadjusted MRSM, BMRSM, JMRSM and corresponding adjusted models. Unadjusted and adjusted MRSM, BMRSM and JMRSM are also presented in Tables 4 and 5 for Efron [28] and Cox estimation [29].

Table 3 Breslow estimation
Table 4 Efron estimation
Table 5 Cox estimation

The results of the unadjusted JMRSM are relatively close to those of the unadjusted MRSM (Table 3). The standard errors in JMRSM and MRSM are close for all covariates. The standard errors in BMRSM and MRSM are also close, except for all levels of the covariate childbirth, where the standard error in BMRSM is about 4 times that of MRSM, and for the upper levels of the covariates weight, head and height, where the standard error in BMRSM is about 20 times that of MRSM. Significant differences between levels of covariates are found at the same covariates for MRSM, BMRSM and JMRSM, except at the upper level of the covariate abortion, where significance is suggested by the MRSM only. Following the recommendations of Parzen and Lipsitz [30], the χ2 test statistics suggest a higher performance of the jackknife CPHM (JCPHM) as compared to the CPHM and the bootstrap CPHM (BCPHM), since the χ2 is lower throughout for the JCPHM.

Discussion

The overall results of MRSM, BMRSM and JMRSM under the different approaches to handling ties (Tables 3, 4 and 5) are, as expected, not critically different. The STATA default method (Breslow) is therefore retained for the analysis. The JMRSM is adopted for checking stability since its results are closer to those of the MRSM than the results of the BMRSM are. The similarity between MRSM and JMRSM suggests that the MRSM may be stable. The global analysis upholds the significant differences at all levels of the covariates age, gender, number and APGAR and at the intermediate levels of the covariates weight and head.

The re-sampled adjusted models with the Breslow technique for handling tied events suggest that the risk of death or of developing a chronic disease is lower for babies whose parents are aged 20 to 34 years than for babies whose parents are under 20 years old and for babies whose parents are 35 years and above. Basinga et al. [31] argue that unintended pregnancy induces abortion in Rwanda. Their study suggests a relatively higher rate of unintended pregnancies among teenagers as compared to the other age ranges, which contributes, on the one hand, to the increase of the infant mortality rate. On the other hand, the study by Olausson et al. [32] confirms a relatively higher risk for teenage pregnancies due to biological immaturity. As for advanced maternal age, Lampinen et al. [33] point out that it is associated with relatively poorer pregnancy outcomes due to the observed higher incidence of chronic medical conditions among older women.

The results show that the risk for male babies is higher than that for female babies. This agrees with the usual better survival outcome of females reported in several manuscripts such as [34] or [35]. Multiple babies survive better than singleton babies; this, however, contradicts the results of studies conducted in Sub-Saharan Africa by Monden and Smits [36] and Pongou et al. [37]. It may be due to the small number of multiple newborns recorded at KUTH during the year 2016. The survival outcomes of babies whose APGAR is below 4/10 are poorer than those of babies with a higher APGAR score. Babies whose weight ranges from 2500 g to 4500 g survive better than those whose weight is below 2500 g and those whose weight is above 4500 g, while babies whose head circumference ranges from 32 cm to 36 cm survive better than those whose head circumference is below 32 cm. The results for APGAR, weight and circumference of the head agree with the recommendations of clinical medicine and related manuscripts such as [38].

The study shows that the BMRSM is close to the JMRSM and MRSM for all significant covariates, but the BMRSM shows relatively higher standard errors for some non-significant covariates. The discrepancy between standard errors after re-sampling for covariates such as childbirth, weight, head and height suggests instability of the MRSM at these specific covariates, and this emphasizes their non-significance in the MRSM.

The present analysis is limited to eleven covariates. Unavailable covariates concerning parents that could improve the models are, for example, demographic covariates such as the parent’s education level, employment and income; behavioral covariates, namely smoking habit, alcohol consumption and diet; and physiotherapeutic variables such as sports activity level. These variables are not recorded in the registry at KUTH.

Conclusion

The Marginal Risk Set Model (MRSM) and the related re-sampled models using the bootstrap (BMRSM) and the jackknife (JMRSM) are described and compared using the dataset on infant mortality. The JMRSM and MRSM displayed relatively close results. The risk is higher for babies whose parents are under 20 years old as compared to babies of older parents. Babies born with APGAR greater than or equal to 7/10 were found to have a better survival outcome than those born with APGAR less than 4/10 and those whose APGAR ranges between 4/10 and 6/10. The risk is lower for babies with normal weight as compared to underweight and overweight babies. The survival outcomes for babies with normal circumference of the head were found to be better than for those with a relatively small head. The study suggests that pregnancy of parents under 20 years old should be avoided; appropriate clinical ways of protecting pregnancy against any cause of infant abnormality could also help in lowering infant mortality.