1 Background of the Study

Tuberculosis and HIV have been closely linked since the emergence of AIDS and TB is the most common infectious disease affecting HIV-sero positive individuals and causing to their death [1, 2]. Globally, the number of TB patients who had been diagnosed with HIV status reached 2.1 million in 2010 which is equivalent to 34 % of notified cases of TB. Of the 8.8 million incident cases worldwide an estimated 1.1 million (13 %) were found to be co-infected with HIV [3].

Today, HIV and TB treatments are common in many societies and the use of drugs has altered the joint dynamics of both diseases. About one third of 39.5 million HIV infected people worldwide were co-infected with TB [4] and up to 50 % of individuals living with HIV are expected to develop TB. Many TB carriers who were infected with HIV are 30–50 times more likely to develop active TB than those without HIV. For individuals infected with HIV the presence of other infections including TB tends to increase the rate of HIV replication. This acceleration may result in higher levels of infection and rapid HIV progression to the AIDS stage [5].

Over the past 20 years, HIV has fuelled TB notification rates which have increased 3–5 fold in many African countries. By 2007, the continent accounted for 79 % of the global burden of HIV associated TB [6]. Worst affected were those countries in the east and south of the continent where HIV prevalence rates are highest. In South Africa and Swaziland approximately 1 % of the population develops TB annually. Notification rates in some poor communities in South Africa have even increased to over 2 % per year rates that are almost unprecedented in the era of short course multi-drug chemotherapy [7, 8].

WHO ranked Ethiopia as seventh among the 22 high burden countries with TB where the estimated annual incidence and prevalence, respectively, were 379 and 643 cases per 100,000 populations [9]. The prevalence of HIV among TB patients was estimated as high as 41 % [10, 11]. In Ethiopia routine data obtained from 44 locations in the year 2005/2006 showed that about 41 % of TB patients were HIV+. In addition, another routine data set collected in 2006/2007 estimated that some 31 % of TB patients were HIV+ [12]. TB was the cause of 76,000 deaths in Ethiopia out of which 30 % were among HIV+ patients [6]. Beside the rate of mortality caused by the co-infection, the extent of the negative impact on the quality of livelihood resulting from mental disorders was studied [13].

In many medical cases more than one clinical outcome are measured longitudinally at the same time on the same subject where these measured clinical outcomes are correlated. For example, SBP and DBP for hypertensive patients and CD4 count, beta-2-macroglobulin and weight for HIV infected patients were measured longitudinally at same time. Since they are highly related changes in either often affect changes in the other. In such cases the univariate longitudinal analysis does not take into account correlation between observations on different response variables at each time points. Beside this knowing how the evolution of one is related to the evolution of the other, as well as how the association changes or evolves overtime is difficult from univariate longitudinal analysis. Joint modeling of longitudinal data in other way round accounts two types of correlations which are known to be serial correlation and cross correlation. The serial correlation accounts correlation between observations at different time points within a subject and the cross correlation accounts correlation between different responses at each time point. If different types of outcomes are measured at each time point, the correlation structure is more complicated and hence, more difficult for drawing inference [14, 15].

Therefore, the primary objective of present study was to model jointly weight which act as a marker for TB patients and CD4 count which act as a biomarker for HIV infected individuals. Furthermore, the study deals with linear mixed modeling of weight and CD4 count measurements to know the factors that affect the mean change of each outcome variable overtime.

2 Materials and Methods

2.1 Data Source

The HIV/TB co-infection data used for this study were obtained from Jimma University Specialized Hospital HIV and TB Outpatient Clinics, South West of Ethiopia. The study population consists of all HIV/TB co-infected patients who were 18 years old and above, and who were on ART treatment any time between 1st February 2009 and 1st July 2014. Of the total 850 co-infected patients from both clinics at the hospital, 254 patients who have at least one CD4 count and weight at the same time were considered for the study. All the patients’ epidemiological, laboratory and clinical information were obtained from the patients chart of ART followup retrospectively from patients card.

2.2 Ethical Consideration

Ethical clearance was obtained from Department of Statistics of Jimma University. Personal information was kept confidentially without disclosing to others during data collection from patient cards.

2.3 Variables of the Study

The two outcome variables considered for this study were the longitudinal CD4 count measurement and weight of co-infected patients which is measured in kg. The CD4 cell counts per mm\(^3\) of blood, which is considered as a biomarker was measured approximately within 6 months interval whereas the weight of the patients were measured at each patients visit. The 14 independent covariates considered for the separate and joint modeling are listed in Table 1 with respective of their categories.

Table 1 List of covariates used in the separate and joint modeling

2.4 Model Specification

Longitudinal responses may arise in two common situations. One is when the measurements taken on the same subject at different times and the other is when the measurements taken on related subjects. In both cases, the responses are likely to be correlated [14]. For longitudinal data, two sources of variations were considered which were the within-subject which arises during the measurements within each subject, and between subject variation which arises during the measurement between different subjects. Modeling within subject variations help us to study changes overtime while modeling between subject variation help us to understand differences between subjects.

2.4.1 Univariate Linear Mixed Effects Model

Before going to the joint modeling of longitudinally measured weight and CD4 count of HIV/TB co-infected patients, the linear mixed model were employed for each outcome to identify an appropriate covariates that predicts the mean change of weight and CD4 count overtime. Therefore, the linear mixed-effects model handling the two of variation is

$$\begin{aligned} \mathbf {y}_j = \mathbf {X^{\prime }}_j\mathbf {\beta }_j + \mathbf {Z^{\prime }}_j\mathbf {b}_j + \mathbf {\varepsilon }_j \end{aligned}$$
(1)

where \(\mathbf {y}_j\) is the \(n_j\times 1\) vector of observed response values, \(\mathbf {\beta }_j\) is the \(p\times 1\) vector of fixed-effects parameters, \(\mathbf {X}_j\) is the \(n_j\times p\) observed design matrix corresponding to the fixed-effects, \(\mathbf {b}_j\) is the \(q\times 1\) vector of random-effects parameters, \(\mathbf {Z}_j\) is the \(n_j\times q\) observed design matrix corresponding to the random-effects, and \(\mathbf {\varepsilon }_j\) is the \(n_j\times 1\) vector of residuals for the jth response. The corresponding assumptions for model (1) are \(\mathbf {b}_j\sim N_{q_j}(\mathbf {0}, \mathbf {\Psi }_j)\) and \(\mathbf {\varepsilon }_j\sim N_{n_j}(\mathbf {0}, \mathbf {\Sigma }_j)\); where \(\mathbf {\Psi }_j\) and \(\mathbf {\Sigma }_j\) are the variance-covariance matrix for \(\mathbf {b}_j\) and \(\mathbf {\varepsilon }_j\) for each outcome variable respectively.

2.4.2 Methods of Parameter Estimation

Suppose that a random sample of \(n_j\) for jth response observations is obtained from a univariate normal linear mixed effects model as defined in equation (1). Then, the likelihood of the model parameters given the vector of \(n_j\) observations is defined as

$$\begin{aligned} L=L(\mathbf {\beta _j,\gamma _j;y_j})=\frac{exp\left\{ -\frac{1}{2}(\mathbf {y_j-X^{\prime }_j\beta _j})^{\prime }\mathbf {V}^{-1}(\mathbf {y_j-X^{\prime }_j\beta _j})\right\} }{(2\pi )^{\frac{n}{2}}|\mathbf {V}|^{\frac{1}{2}}} \end{aligned}$$
(2)

where \(\mathbf {\beta }_j\) is a vector of fixed-effect parameters and \(\mathbf {\gamma }_j\) is a vector containing the variance parameters for the jth response. Hence, the log-likelihood of the model parameters is defined as

$$\begin{aligned} \begin{aligned} l&=l(\mathbf {\beta _j,\gamma _j;y_j})=-\frac{n}{2}\log (2\pi )-\frac{1}{2}\log |\mathbf {V}|-\frac{1}{2}(\mathbf {y_j-X^{\prime }_j\beta _j})^{\prime }\mathbf {V}^{-1}(\mathbf {y_j-X^{\prime }_j\beta _j})\\&=-\frac{n}{2}\log (2\pi )-\frac{1}{2}\left[ \log |\mathbf {V}|+(\mathbf {y_j-X^{\prime }_j\beta _j})^{\prime }\mathbf {V}^{-1}(\mathbf {y_j-X^{\prime }_j\beta _j})\right] \end{aligned} \end{aligned}$$
(3)

The required parameters of the model were obtained by maximizing the log-likelihood function with respect to \(\mathbf {\beta }_j\) and \(\mathbf {\gamma }_j\). However, the maximum likelihood approach may produce variance parameters that are biased downwards since they are based on the assumption that the fixed-effects parameters are known [16]. To handle this problem the present study used restricted maximum likelihood (REML) method, which treat fixed-effects as parameters rather than constant.

2.4.3 Multivariate Linear Mixed Model

The mixed model can be easily extended to include multiple response variables by further stacking the data and defining a specific variance-covariance structure for the random effects. In our case, consider modeling weight and CD4 count measurements of co-infected patients jointly overtime by incorporating random intercepts and slopes in order to model the correlations overtime between the two responses. As we defined earlier, the primary aim of this study was to model jointly the longitudinally measured CD4 count and weight of HIV/TB co-infected patients which was modeled by multivariate linear mixed model.

Let \(y_{ijk}\) represent the ith observation from the jth subject for the kth response variable, where \(i=1,2,\ldots ,n_{jk}; j = 1,2,\ldots ,s; k=1,2\). Also define \(N_k=\sum _{i=1}^sn_{jk}\) and \(N=N_1+N_2\). The vector \(\mathbf {y}_{jk}=[y_{1jk},y_{2jk},\ldots ,y_{n_{jk}jk}]^{\prime }\) represents the \(n_{jk}\) observations of the kth response variable from the jth subject and the vector \(\mathbf {y}_k=[\mathbf {y}_{1k}, \mathbf {y}_{2k},\ldots ,\mathbf {y}_{sk}]^{\prime }\) represents the \(N_k\) observations for the kth response variable across all response variables and subjects. Finally, the vector \(\mathbf {y}=[\mathbf {y}_{1}, \mathbf {y}_{2},\ldots ,\mathbf {y}_{k}]^{\prime }\) represents the N observations across all response variables and subjects. In the context of modeling our response variables, the linear mixed-effects models for each response variable for subject j taken at time t can be specified as

$$\begin{aligned} \begin{aligned} y_{j1}(t)=\mu _1(t)+a_{j1}+b_{j1}(t)+\varepsilon _{j1}(t)\\ y_{j2}(t)=\mu _2(t)+a_{j2}+b_{j2}(t)+\varepsilon _{j2}(t) \end{aligned} \end{aligned}$$
(4)

where \(\mu _k(t)\) refers to the average evolution (of the kth response over time) and is a function of the fixed effects. The subject specific random intercepts \(a_{jk}\) and slopes \(b_{jk}(t)\) describe how the subject specific profiles deviate from the average profile for the kth response. The two response trajectories are joined together by assuming a joint distribution for the vector of random-effects, \(\mathbf {b}_j\), such as

$$\begin{aligned} \mathbf {b}_j= \left[ \begin{array}{c} a_{j1}\\ b_{j1}\\ a_{j2}\\ b_{j2}\\ \end{array} \right] \sim N(\mathbf {0, \Psi }) \end{aligned}$$

where the variance-covariance matrix for the random effects, \(\mathbf {\Psi }\), has the following structure:

$$\begin{aligned} \mathbf {\Psi }= \left[ \begin{array}{cccc} \sigma ^2_{a_1} &{}\quad \sigma _{a_1b_1} &{}\quad \sigma _{a_1a_2} &{}\quad \sigma _{a_1b_2}\\ \sigma _{b_1a_1} &{}\quad \sigma ^2_{b_1} &{}\quad \sigma _{b_1a_2} &{}\quad \sigma _{b_1b_2}\\ \sigma _{a_2a_1} &{}\quad \sigma _{a_2b_1} &{}\quad \sigma ^2_{a_2} &{}\quad \sigma _{a_2b_2}\\ \sigma _{b_2a_1} &{}\quad \sigma _{b_2b_1} &{}\quad \sigma _{b_2a_2} &{}\quad \sigma ^2_{b_2}\\ \end{array} \right] \end{aligned}$$
(5)

The error components for each response, which are independent of the random effects can be taken to be correlated or uncorrelated \((\sigma _{12}=0)\), such that the error components are defined as

$$\begin{aligned} \left[ \begin{array}{c} \varepsilon _{1j}\\ \varepsilon _{2j}\\ \end{array} \right] \sim N\left( \left[ \begin{array}{c} 0\\ 0\\ \end{array} \right] , \left[ \begin{array}{cc} \sigma ^2_1 &{}\quad \sigma _{12}\\ \sigma _{21} &{}\quad \sigma ^2_2\\ \end{array} \right] \right) \end{aligned}$$

2.4.4 Special Cases for the Random Effect Variance Matrix

We obtain special cases for the variance-covariance matrix of the random effects by making specific assumptions for the variance-covariance matrix \(\mathbf {\Psi }\). The first assumption is when the two outcome variable could be taken to be completely independent at any point in time, we impose their variance-covariance matrix which has the following special form given by

$$\begin{aligned} \mathbf {\Psi }= \left[ \begin{array}{cccc} \sigma ^2_{a_1} &{}\quad \sigma _{a_1b_1} &{}\quad 0 &{}\quad 0\\ \sigma _{b_1a_1} &{}\quad \sigma ^2_{b_1} &{}\quad 0 &{}\quad 0\\ 0 &{}\quad 0 &{}\quad \sigma ^2_{a_2} &{}\quad \sigma _{a_2b_2}\\ 0 &{}\quad 0 &{}\quad \sigma _{b_2a_2} &{}\quad \sigma ^2_{b_2}\\ \end{array} \right] \end{aligned}$$

Within a response variable, the random intercept and slope induce within-subject correlations in the repeated measures overtime, while assuming independence between subjects. Moreover, this model assumes that the two responses are completely independent. The results for the model would be identical to fitting two separate random-effect models.

The second assumption is when the two response variables could be taken to be completely dependent. In this case, the two responses essentially “share” the same set of random effect parameters (intercept and slope). When two parameters are completely dependent, the correlation between them is equal to one. This occurs when the covariance between the parameters is equal to the square root of the product of their respective variances. Most notation, however, define the model with a \(2\times 1\) vector of random effects, such as

$$\begin{aligned} \mathbf {b}= \left[ \begin{array}{c} a_j\\ b_j\\ \end{array} \right] \sim N(\mathbf {0, \Psi }), with\quad \mathbf {\Psi }= \left[ \begin{array}{cc} \sigma ^2_a &{}\quad \sigma _{ab}\\ \sigma _{ba} &{}\quad \sigma ^2_b\\ \end{array} \right] \end{aligned}$$

Clearly, the aforementioned structure imposes strong assumptions on the relationship between the two response variables. It is very unlikely that the two responses would exhibit complete dependence in the association between the random slopes and between the random intercepts. One advantage of this model, when the assumption is tenable, is that it drastically reduces the number of random effects that must be estimated when the number of response variables is large. For models with a large number of response variables, estimation would likely be impossible if the shared-parameters (or alternative approach) were not used.

2.4.5 Association of the Evolutions (AOE)

One important question that may be addressed with a joint mixed-effects model is how the evolution of one response is associated with the evolution of another response (“association of the evolutions”). By definition, the correlation between the evolutions for the two random slopes is given by

$$\begin{aligned} r_e=\frac{Cov(b_1,b_2)}{\sqrt{Var(b_1)}\sqrt{Var(b_2)}}=\frac{\sigma _{b_1b_2}}{\sqrt{\sigma ^2_{b_1}}\sqrt{\sigma ^2_{b_2}}} \end{aligned}$$
(6)

It may be noted that the above expression is produced using those values from the \(\mathbf {\Psi }\) matrix defined in (5).

2.4.6 Evolution of the Association (EOA)

A similar idea that may be investigated using a joint mixed effects model is how the association between the responses evolves overtime (“evolution of the association”). Assuming uncorrelated errors, the marginal correlation between the two responses as a function of time is given by

$$\begin{aligned} \begin{aligned} r_m(t)&=\frac{Cov(y_{j1}(t), y_{j2}(t))}{\sqrt{Var(y_{j1}(t))}\sqrt{Var(y_{j2}(t))}}\\&=\frac{\sigma _{a_1a_2}+t\sigma _{a_1b_2}+t\sigma _{a_2b_1}+t^2\sigma _{b_1b_2}}{\sqrt{\sigma ^2_{a_1}+2t^2\sigma _{a_1b_1}+2t^2\sigma ^2_{b_1}+\sigma ^2_1}\sqrt{\sigma ^2_{a_2}+2t^2\sigma _{a_2b_2}+2t^2\sigma ^2_{b_2}+\sigma ^2_2}} \end{aligned} \end{aligned}$$
(7)

Assuming correlated errors, the marginal correlation between the two responses as a function of time is given by

$$\begin{aligned} \begin{aligned} r_m(t)&=\frac{Cov(y_{j1}(t), y_{j2}(t))}{\sqrt{Var(y_{j1}(t))}\sqrt{Var(y_{j2}(t))}}\\&=\frac{\sigma _{a_1a_2}+t\sigma _{a_1b_2}+t\sigma _{a_2b_1}+t^2\sigma _{b_1b_2}+\sigma _{12}}{\sqrt{\sigma ^2_{a_1}+2t^2\sigma _{a_1b_1}+2t^2\sigma ^2_{b_1}+\sigma ^2_1}\sqrt{\sigma ^2_{a_2}+2t^2\sigma _{a_2b_2}+2t^2\sigma ^2_{b_2}+\sigma ^2_2}} \end{aligned} \end{aligned}$$
(8)

The delta method could be used to obtain 95 % confidence bounds for \(r_m(t)\) at any particular point in time. Two observations can be made from the uncorrelated errors by noticing \(t = 0\) the marginal correlation reduces to

$$\begin{aligned} r_m(t)=\frac{\sigma _{a_1a_2}}{\sqrt{\sigma ^2_{a_1}+\sigma ^2_1}\sqrt{\sigma ^2_{a_2}+\sigma ^2_2}} \end{aligned}$$
(9)

which is essentially the correlation between the two random intercepts. In fact, when the error components are small, the closer the marginal correlation at \(t = 0\) approximates the correlation between the random intercepts. Also, as t increases \(r_m(t)\) converges to \(r_e\) for the case with uncorrelated errors, and to

$$\begin{aligned} r_m(t)=\frac{\sigma _{a_1a_2}+\sigma _{12}}{\sqrt{\sigma ^2_{a_1}+\sigma ^2_1}\sqrt{\sigma ^2_{a_2}+\sigma ^2_2}} \end{aligned}$$
(10)

for the case of correlated errors, which indicates that the absolute value of the marginal correlation at \(t = 0\) cannot be higher than the correlation between the random intercepts. It may also be noted that as t increases the marginal correlation converges to the correlation between the random slopes, while the variance-covariance parameters of the random effects determine the shape of the marginal correlation function [17].

2.4.7 Joint Model Estimation Techniques

In the particular context of random-effects models, so-called adaptive quadrature rules can be used [18], were the numerical integration is centered on the estimates of the random effects, and the number of quadrature points is then selected in terms of the desired accuracy. To illustrate the main ideas, we consider Gaussian and adaptive Gaussian quadrature, designed for the approximation of integrals of the form

$$\begin{aligned} \int f(z)\phi (z)dz \end{aligned}$$
(11)

for a known function f(z) and for \(\phi (z)\) the density of the multivariate standard normal distribution. Therefore, first standardize the random effects such that they get the identity covariance matrix. Then, the likelihood contribution for subject i equals

$$\begin{aligned} f_i(\mathbf {y}_i|\mathbf {\beta ,\gamma ,\Psi })=\prod _{i=1}^{n_i}\int f_{ij}(y_{ij}|\mathbf {b}_i,\mathbf {\beta ,\gamma })f(\mathbf {b}_i|\mathbf {\Psi })d\mathbf {b}_i \end{aligned}$$
(12)

where \(\mathbf {b}_i\) is \(q\times 1\) dimensional vector of unknown random effects and \(\mathbf {b}_i \sim N(\mathbf {0,\Psi })\), \(\mathbf {\beta }\) is a vector of fixed-effects parameters and \(\mathbf {\gamma }\) is a vector containing the variance parameters. f(z) and for \(\phi (z)\) denotes the density of the multivariate standard normal distribution.

2.4.8 Model Selection Criteria

Several model selection procedures exists but none of which were the best. To have an appropriate model for the univariate LMM and multivariate LMM most commonly known model selection criterions; Akaike Information Criterion (AIC) [19] and the Bayesian Information Criterion (BIC) [14] were considered for this study.

$$\begin{aligned} AIC&=-2\log L+2p \end{aligned}$$
(13)
$$\begin{aligned} BIC&=-2\log L+npar\log (N) \end{aligned}$$
(14)

where \(-2 \log L\) is twice the negative log-likelihood value for the model, p is the number of estimated parameters, npar denotes the total number of parameters in the model and N is the total number of observations used to fit the model. Smaller values of AIC and BIC reflect an overall better fit.

Table 2 Demographic and clinical characteristics of the study patients

3 Results and Discussion

3.1 Descriptive Statistics

Some demographic information and baseline characteristics of all patients disaggregated by patient status were presented in Table 2. Regarding the sex composition of patients, out of total of 254 co-infected patients 139 (54.72 %) of them were males and 18 (56.25 %) death were also occurred in male group in comparison with female group. More than half 146 (57.48 %) of the co-infected patients belongs to orthodox religious group whereas 17 (6.69 %) belongs to protestant religious group. Of the total deaths occurred in these categories small number 2 (6.25 %) of deaths were occurred in protestant religious group. When we look at the educational level category of the co-infected patients, 108 (42.52 %) attended primary education while only 17 (6.69 %) attended tertiary education. Table  2 also depicts that 8 (3.15 %), 23 (9.06 %), 123 (48.43 %) and 100 (39.37 %) co-infected patients’ were at baseline clinical stage-I, stage-II, stage-III and stage-IV respectively. Totally, 40 (93.75 %) deaths were occurred in both clinical stage-III and stage-IV at baseline time in comparison with remaining two baseline clinical stages. About 13 (40.63 %) and 10 (31.25 %) death were occurred in ambulatory and working categories of patients functional status respectively. To this end, 37 (66.07 %) and 12 (21.43 %) were missed to followup from these two groups respectively in comparison with bedridden group at baseline during the time of co-infection periods.

Mean with the corresponding standard deviation at each time points with respective sample sizes for both outcome variables was shown in Table 3. The average number of square root of CD4 count was 11.965 at baseline. There was a general increment in the mean value up to 30 months starting from baseline and starts declining. However, when we look at the standard deviations up to 36 months there was slight variation among the measurement times where smaller variation in square root of CD4 measurement was observed at 48 month. Also it can be observed from Table 3, smaller mean weight of co-infected patients were observed at the baseline of co-infection period and larger mean weight was observed at 48 month with larger weight variation among the co-infected patients.

Table 3 Summary measures of covaraites at each time points with respective sample sizes

3.2 Exploring Individual Profile and the Mean Structure

To underpin the model building and visualize the pattern of weight and CD4 measurements of the patients overtime, the overall individual plots were considered.

Fig. 1
figure 1

Individual profile plot and the evolution of mean structure overtime for both covariates

The graph (Fig. 1) demonstrates the variability (within and between patients) in square root of CD4 count and weight measurements of HIV/TB co-infected patients. The red line loess smoothing technique suggests that the mean structure growth of both variables was in a linear and quadratic pattern overtime. Nevertheless, the rate of increment was high in square root of CD4 count than weight.

3.3 The Separate Longitudinal Analysis

The separate longitudinal analysis were started from the fixed-effect modeling to select an appropriate covariates that predicts weight and square root of CD4 count where this linear model only considers the between source of variation among co-infected patients. First, we have used stepwise automatic variable selection method to select appropriate fixed-effects from Table 1. After arriving at an optimal subset, the selected fixed-effects model was fitted with different random effects for both responses to have an appropriate linear mixed-effects models. The summary of the linear mixed-effects models which were modeled by considering different random effects were shown in Table 4.

Table 4 Selection of random effects to be included in the linear mixed-effects model

From Table 4, for both responses only considering the quadratic time effect to the model worsen the LMM. However, considering the linear time effects to the quadratic time effect improves the LMM with only quadratic time effects since its inclusion lowers the AIC and BIC values. Finally, when we look at the improvement of the model with inclusion of random intercept to that of random linear and quadratic time effects of the LMM, there was an improvement of the model and this model have lower AIC and BIC values than the remaining six LMMs. Hence, we consider the LMM with random intercept linear and quadratic time effects as an appropriate model for the separate models.

Table 5 Separate models for each covariates

As we can observe from the fitted LMM for weight of co-infected patients (Table 5), there was significance difference weight at baseline by sex, patients status and educational level since the estimated coefficient for male, missed the follow up and secondary education categories were significant (\(p<0.05\)). But there was no significance difference weight at baseline by the functional status category of co-infected patients. When we look at the mean change in weight overtime, the linear time have positive significant effects whereas the quadratic time have negative significant effects. And also, the mean change overtime in weight of co-infected patients were differ by the functional status of the patients since the interaction of linear time with working functional status have significant effects on weight at 5 % level of significance. When we look at the fitted LMM for square root of CD4 count, there was a significance difference in square root of CD4 measurement at baseline by the functional status of the patients since the estimated coefficient of bedridden category was significant at 5 % level of significance. Likewise the case of weight, the linear time have positive effect on mean change of square root of CD4 count overtime whereas the quadratic time change have negative effect at 5 % level of significance.

3.4 Joint Analysis of Square Root of CD4 Count and Weight

To have an appropriate joint model that represent the longitudinally measured CD4 count and weight of the co-infected patients, different candidate joint model with different random effect for the joint modeling were considered. The AIC and BIC were used as a guideline in selecting covariates for the model. A smaller AIC and BIC values were generally indicates a better model. The results of the fitted models information criterion were displayed in Table 6.

Table 6 Joint model selection criterion

Table 6 elucidates that the joint model with random effects was considered as the worst model compared to the joint model with inclusion of random linear and quadratic slopes since it have larger AIC and BIC values. Moreover, inclusion of random linear slope to the random intercept for the joint modeling improves the joint model with random intercept only. However, when we include quadratic slope time effects for the joint modeling to that of with random intercept and linear slope, the joint model becomes worst since it has larger AIC and BIC. Therefore, we consider the joint model with random intercept and linear slope as an appropriate joint model since it have smaller AIC and BIC values than the remaining models. Table 6 also shows that the value of fit statistics (AIC and BIC) for all joint models were less than that of the separate model. This indicates that joint model was fitting better the data as compared to the univariate model. Therefore, our joint analysis of these two outcomes was justified and the separate analysis results were likely to be biased. The result of selected appropriate joint model for the longitudinally measured CD4 count and weight of HIV/TB co-infected patients was shown in Table 7.

Table 7 Joint analysis of longitudinal square root of CD4 cell count and weight

As can be seen from the fitted joint model (Table 7), covariates like sex, educational level and functional status were the factors significantly contributing to the prediction of HIV/TB co-infected patients weight at baseline. Table 7 also suggests that the linear time effect has a positive effect on the mean change of weight whereas the quadratic time change has negative effect overtime at 5 % level of significance. To this end, weight change of the patients were differ by functional status since the estimated coefficient for ambulatory functional status group with linear time interaction have significant effect.

From the fitted joint model, the baseline CD4 cell count was differ by patient status and functional status since the estimated coefficients for missed patient status category in comparison with transferred category, ambulatory and bedridden functional status category in comparison with working functional status category have significant effects at 5 % level of significance. With regard to the mean change in square root of CD4 measurements overtime, the linear time effect has a positive sign and found to be statistically significant. Nevertheless, the quadratic time effect has a significant negative effect on the mean change of CD4 count overtime. Further, the mean change in CD4 cell count of the patients differ by functional status category since the estimated coefficient of ambulatory functional status with time interaction in comparison with working functional status category with time interaction have a significant effects on the mean change of CD4 count overtime at 5 % level of significance.

The estimated variance-covariance matrix for the final joint model was

$$\begin{aligned} \hat{\mathbf {\Psi }}= & {} \left[ \begin{array}{cccc} \hat{\sigma }^2_{a_1} &{}\quad \hat{\sigma }_{a_1b_1} &{}\quad \hat{\sigma }_{a_1a_2} &{}\quad \hat{\sigma }_{a_1b_2}\\ \hat{\sigma }_{b_1a_1} &{}\quad \hat{\sigma }^2_{b_1} &{}\quad \hat{\sigma }_{b_1a_2} &{}\quad \hat{\sigma }_{b_1b_2}\\ \hat{\sigma }_{a_2a_1} &{}\quad \hat{\sigma }_{a_2b_1} &{}\quad \hat{\sigma }^2_{a_2} &{}\quad \hat{\sigma }_{a_2b_2}\\ \hat{\sigma }_{b_2a_1} &{}\quad \hat{\sigma }_{b_2b_1} &{}\quad \hat{\sigma }_{b_2a_2} &{}\quad \hat{\sigma }^2_{b_2}\\ \end{array} \right] \\= & {} \left[ \begin{array}{cccc} 78.9581&{}\quad 0.1228 &{}\quad -224.94 &{}\quad -0.5296\\ 0.1228 &{}\quad 0.0219 &{}\quad -1.3535 &{}\quad 0.02124\\ -224.94 &{}\quad -1.3535 &{}\quad 907.20 &{}\quad -0.1613\\ -0.5296 &{}\quad 0.02124 &{}\quad -0.1613 &{}\quad 0.0777\\ \end{array} \right] \end{aligned}$$

and the resulting correlation matrix was

$$\begin{aligned} \mathbf {R}= \left[ \begin{array}{cccc} \hat{\rho }_{a_1a_1} &{}\quad \hat{\rho }_{a_1b_1} &{}\quad \hat{\rho }_{a_1a_2} &{}\quad \hat{\rho }_{a_1b_2}\\ \hat{\rho }_{b_1a_1} &{}\quad \hat{\rho }_{b_1b_1} &{}\quad \hat{\rho }_{b_1a_2} &{}\quad \hat{\rho }_{b_1b_2}\\ \hat{\rho }_{a_2a_1} &{}\quad \hat{\rho }_{a_2b_1} &{}\quad \hat{\rho }_{a_2a_2} &{}\quad \hat{\rho }_{a_2b_2}\\ \hat{\rho }_{b_2a_1} &{}\quad \hat{\rho }_{b_2b_1} &{}\quad \hat{\rho }_{b_2a_2} &{}\quad \hat{\rho }_{b_2b_2}\\ \end{array} \right] = \left[ \begin{array}{cccc} 1 &{}\quad 0.093 &{}\quad -0.841 &{}\quad -0.214\\ 0.093 &{}\quad 1 &{}\quad -0.303 &{}\quad 0.514\\ -0.841 &{}\quad -0.303 &{}\quad 1 &{}\quad -0.019\\ -0.214 &{}\quad 0.514 &{}\quad -0.019 &{}\quad 1\\ \end{array} \right] \end{aligned}$$

From the estimated variance-covariance matrix (\(\hat{\mathbf {\Psi }}\)) of the estimated joint model for the random effects, it can be seen that variability was lower for square root of CD4 count (\(\hat{\sigma }^2_{a_1}=78.95\)) than weight (\(\hat{\sigma }^2_{a_2}=907.20\)). Beside, the correlation matrix shows that (\(\mathbf {R}\)) the baseline subject specific baseline CD4 measurement and weight were negatively correlated (\(\hat{\rho }_{a_1a_2}=-0.841\)). On the other hand, there was a moderate positive association between subject specific change in square root of CD4 and weight of the patient overtime (\(\hat{\rho }_{b_1b_2}=0.541\)). There was a negative cross correlation between the baseline weight and time slope of square root of CD4 measurement (\(\hat{\rho }_{b_1a_2}=-0.303\)) overtime. Similarly, there was a negative correlation between subject specific baseline weight and the patient specific change in square root of CD4 overtime (\(\hat{\rho }_{b_2a_1}=-0.214\)) overtime. It can thus be concluded that there was a moderate positive association between the mean change of weight and square root of CD4 counts overtime.

4 Conclusion

In this study univariate and bivariate linear mixed methods were considered for fitting two continuous response variables measured longitudinally. Based on separate analysis, the evolution of weight was significantly differ with respect to sex (female), baseline age, time, quadratic time, missed patient status category, secondary education category and the interaction of linear time with working functional status. Like wise, the baseline CD4 measurement was differ by patient status and functional status since the estimated coefficients for missed patient status category in comparison with transferred category, ambulatory and bedridden functional status category in comparison with working functional status category have significant effect. The results of this study also demonstrates that the joint modeling of longitudinally CD4 count and weight measurements fits the data better than those obtained from the separate model. To sum up, the joint model suggests that there was a moderate positive association between the mean change of weight and square root of CD4 count overtime.