Study design and data
Data for this cross-sectional study were obtained from the Demographic and Health Surveys (DHS), which are nationally representative household surveys conducted in low- and middle-income countries. This study used data from 19 DHS surveys conducted in sub-Saharan Africa between 2003 and 2012 that were available as of October 2014 and that included rapid HIV test results and questions on self-reported tobacco use. The DHS uses a multi-stage, stratified sampling design with households as the sampling unit. Within each sampled household, all women and men meeting the eligibility criteria are interviewed. Because the surveys are not self-weighting, weights are calculated to account for unequal selection probabilities as well as for non-response. With weights applied, survey findings represent the full target populations. The DHS surveys include a household questionnaire, a women’s questionnaire and, in most countries, a men’s questionnaire. All three DHS questionnaires are implemented across countries with similar interviewer training, supervision, and implementation protocols.
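To illustrate how sampling weights enter a point estimate, the following is a minimal sketch of a weighted proportion. The data and the function name are hypothetical. The sketch does not reproduce the DHS weighting procedure and ignores stratification and clustering, which matter for variance estimation.

```python
def weighted_proportion(indicator, weights):
    """Weighted share of respondents with indicator == 1."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * x for x, w in zip(indicator, weights)) / total_weight


# Hypothetical respondents: 1 = current smoker, 0 = non-smoker.
smoker = [1, 0, 0, 1, 0]
# Hypothetical sampling weights (assumed values, not DHS weights).
weight = [0.5, 1.5, 1.0, 2.0, 1.0]

unweighted = sum(smoker) / len(smoker)
weighted = weighted_proportion(smoker, weight)
print(unweighted, weighted)
```

The two estimates differ whenever respondents with different outcomes carry different weights, which is why unweighted percentages from a non-self-weighting survey can be misleading.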
For the HIV testing, blood spots were collected on filter paper from a finger prick and transported to a laboratory for testing. The laboratory protocol includes an initial enzyme-linked immunosorbent assay (ELISA) test, followed by retesting of all positive samples and 5–10 % of the negative samples with a second ELISA. For samples with discordant results on the two ELISA tests, a new ELISA or a Western Blot is performed. Participation in the test was voluntary, and before blood samples were collected each selected participant was asked to provide informed consent to the testing. To ensure confidentiality, the HIV test results were anonymously linked to individual questionnaire information.
Respondents were explicitly asked “Do you currently smoke cigarettes?” Those who responded ‘yes’ to this question were defined as current cigarette smokers, whereas those who responded ‘no’ were defined as current non-smokers.
Individual level factors
The following individual-level factors were included in the models: sex of the respondent (male versus female), respondent’s age in completed years (18 to 24, 25 to 34, 35 to 44, or 45 or older), educational attainment (no education, primary, secondary or higher), marital status (never married versus ever married) and occupation (working or not working). The DHS did not collect direct information on household income or expenditure. We used the DHS wealth index as a proxy indicator for socioeconomic position. The methods used in calculating the DHS wealth index have been described elsewhere [28, 29]. Briefly, an index of economic status for each household was constructed using principal components analysis based on the following household variables: number of rooms per house; ownership of a car, motorcycle, bicycle, fridge, television and telephone; and any kind of heating device. From this index, DHS wealth index tertiles (poor, middle, and rich) were calculated and used in the subsequent modelling.
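The principal components step can be sketched as follows. This is a simplified illustration on hypothetical data, not the DHS procedure described in [28, 29]: it standardizes the asset indicators, scores each household on the first principal component, and cuts the scores at the one-third and two-thirds quantiles. The function name, the example households and the sign convention for the component are assumptions.

```python
import numpy as np


def wealth_tertiles(assets):
    """Score households on the first principal component and cut into tertiles.

    assets: 2-D array, one row per household, one column per asset indicator.
    Returns (scores, labels) with labels in {"poor", "middle", "rich"}.
    """
    X = np.asarray(assets, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # a constant column carries no information; avoid 0/0
    Z = (X - X.mean(axis=0)) / std

    # Eigen-decomposition of the covariance of the standardized data.
    # np.linalg.eigh returns eigenvalues in ascending order, so the last
    # eigenvector is the first principal component.
    _, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    loading = eigvecs[:, -1]
    if loading.sum() < 0:  # the sign is arbitrary; orient so more assets = higher score
        loading = -loading

    scores = Z @ loading
    cuts = np.quantile(scores, [1 / 3, 2 / 3])
    names = np.array(["poor", "middle", "rich"])
    labels = names[np.digitize(scores, cuts)]
    return scores, labels


# Hypothetical households; columns: rooms, car, fridge, television.
households = [
    [1, 0, 0, 0],
    [2, 0, 0, 1],
    [2, 0, 1, 1],
    [3, 1, 1, 1],
    [4, 1, 1, 1],
    [1, 0, 0, 0],
]
scores, labels = wealth_tertiles(households)
print(list(labels))
```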
We used the term neighbourhood to describe clustering within the same geographical living environment. Neighbourhoods were based on sharing a common primary sample unit within the DHS data. The sampling frame for identifying primary sample units in the DHS is usually the most recent census. This unit of analysis was chosen for two reasons. First, the primary sample unit is the most consistent measure of neighbourhood across all the surveys, and thus the most appropriate identifier of neighbourhood for this cross-region comparison. Second, for most of the DHS conducted, the sample size per cluster meets the optimum size with a tolerable precision loss.
The following neighbourhood-level factors were included in the models: place of residence (rural or urban area) and neighbourhood poverty, illiteracy and unemployment rates. We dichotomized the neighbourhood poverty, illiteracy and unemployment rates (low and high) to allow for non-linear effects and to provide results that were more readily interpretable in the policy arena. Median values served as the reference group for comparison.
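A dichotomization of this kind can be sketched as below. The sketch assumes that the median across neighbourhoods is the cut point, that values strictly above it are labelled high, and that ties at the median are labelled low; the text does not state these details. The neighbourhood identifiers and rates are hypothetical.

```python
from statistics import median


def dichotomize_at_median(rates):
    """Label each neighbourhood 'high' if its rate exceeds the median, else 'low'."""
    cut = median(rates.values())
    # Assumption: ties at the median fall in the 'low' group.
    return {psu: ("high" if rate > cut else "low") for psu, rate in rates.items()}


# Hypothetical neighbourhood poverty rates keyed by primary sample unit.
poverty_rate = {"psu_01": 0.10, "psu_02": 0.35, "psu_03": 0.50, "psu_04": 0.80}
poverty_level = dichotomize_at_median(poverty_rate)
print(poverty_level)
```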
Country-level data were collected from the reports published by the United Nations Development Program. At the country level, we included percentage rural population and intensity of deprivation. Intensity of deprivation is the average percentage of deprivation experienced by people in multidimensional poverty. Like the wealth index, intensity of deprivation was computed using principal components analysis, but at the country level, based on data on household deprivations in education, health and living standards. The country-level variables were also categorized into two levels (low and high).
In the descriptive statistics, the distribution of respondents by key variables was expressed as percentages.
We used multivariable multilevel logistic regression models to analyse the association of individual compositional and contextual factors with current cigarette smoking among people living with HIV. We specified a 3-level model for the binary response (current cigarette smoking versus not currently smoking), with people living with HIV (at level 1) nested within neighbourhoods (at level 2) nested within countries (at level 3) (see Fig. 1).
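One conventional way to write such a 3-level random-intercept logistic model is given below. The notation is an assumption introduced for illustration: $y_{ijk}$ is the smoking indicator for individual $i$ in neighbourhood $j$ of country $k$; $\mathbf{x}_{ijk}$, $\mathbf{z}_{jk}$ and $\mathbf{w}_{k}$ are the individual-, neighbourhood- and country-level covariates; $u_{jk}$ and $v_{k}$ are the neighbourhood and country random intercepts.

```latex
\begin{aligned}
y_{ijk} &\sim \operatorname{Bernoulli}(\pi_{ijk}) \\
\operatorname{logit}(\pi_{ijk}) &= \beta_0
  + \boldsymbol{\beta}_1^{\top}\mathbf{x}_{ijk}
  + \boldsymbol{\beta}_2^{\top}\mathbf{z}_{jk}
  + \boldsymbol{\beta}_3^{\top}\mathbf{w}_{k}
  + u_{jk} + v_{k} \\
u_{jk} &\sim N(0, \sigma_u^2), \qquad v_{k} \sim N(0, \sigma_v^2)
\end{aligned}
```

Under this notation, the empty model drops all covariate terms and retains only $\beta_0$, $u_{jk}$ and $v_{k}$.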
We constructed five models. The first model, an empty or unconditional model without any explanatory variables, was specified to decompose the amount of variance that existed between the country and neighbourhood levels. The second model contained only individual-level factors, the third model contained only neighbourhood-level factors, and the fourth model contained only country-level factors. Finally, the fifth model simultaneously controlled for individual-, neighbourhood- and country-level factors (Full Model).
Fixed effects (measures of association)
The results of fixed effects (measures of association) were reported as odds ratios (ORs) with their 95 % credible intervals (CrIs). Bayesian statistical inference provides probability distributions for measures of association (ORs), which can be summarized with 95 % credible intervals (95 % CrI), rather than 95 % confidence intervals (95 % CI). A 95 % credible interval can be interpreted as there being a 95 % probability that the parameter takes a value in the specified range.
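As an illustration of how posterior draws translate into an odds ratio with a credible interval, the sketch below exponentiates draws of a log-odds coefficient and takes the 2.5th, 50th and 97.5th percentiles. The draws are simulated and hypothetical, not study output, and the use of the posterior median as the point estimate and of an equal-tailed interval are assumptions.

```python
import numpy as np


def summarize_odds_ratio(log_odds_draws):
    """Posterior median OR and equal-tailed 95 % CrI from log-odds draws."""
    odds_ratios = np.exp(np.asarray(log_odds_draws, dtype=float))
    lower, mid, upper = np.percentile(odds_ratios, [2.5, 50.0, 97.5])
    return {"or": float(mid), "lower": float(lower), "upper": float(upper)}


# Simulated draws standing in for an MCMC chain (hypothetical values).
rng = np.random.default_rng(0)
draws = rng.normal(loc=0.5, scale=0.1, size=5000)

summary = summarize_odds_ratio(draws)
print(summary)
```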
Random effects (measures of variation)
The possible contextual effects were measured by the intraclass correlation (ICC) and the median odds ratio (MOR). We measured similarity between respondents in the same neighbourhood and within the same country using the ICC. The ICC represents the percentage of the total variance in the probability of reporting current cigarette smoking that is related to the neighbourhood and country levels, i.e. it is a measure of clustering of the odds of reporting cigarette smoking in the same neighbourhood and country. The ICC was calculated using the linear threshold model (latent variable method). Following the ideas of Larsen et al. on neighbourhood effects, we reported the random effects in terms of odds. The MOR expresses the second- or third-level (neighbourhood or country) variance as an odds ratio and estimates the probability of being a current cigarette smoker that can be attributed to neighbourhood and country context. An MOR equal to one indicates no neighbourhood or country variance. Conversely, the higher the MOR, the more important the contextual effects are for understanding the probability of being a current cigarette smoker.
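Both measures can be computed directly from the estimated random-intercept variances. The sketch below uses the standard latent variable formulation, in which the individual-level variance of a logistic model is fixed at π²/3, and the usual MOR formula exp(√(2σ²) × Φ⁻¹(0.75)). The variance values are hypothetical, and treating the neighbourhood ICC as the share of variance at the neighbourhood and country levels combined is one common convention assumed here.

```python
import math
from statistics import NormalDist

# Individual-level variance of the standard logistic distribution (about 3.29).
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def icc_three_level(var_country, var_neighbourhood):
    """Latent variable ICCs for a 3-level random-intercept logistic model."""
    total = var_country + var_neighbourhood + LOGISTIC_RESIDUAL_VAR
    return {
        "country": var_country / total,
        # Assumption: respondents in the same neighbourhood also share a country.
        "neighbourhood": (var_country + var_neighbourhood) / total,
    }


def median_odds_ratio(variance):
    """MOR for one level: exp(sqrt(2 * variance) * 75th percentile of N(0, 1))."""
    z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
    return math.exp(math.sqrt(2.0 * variance) * z75)


# Hypothetical variance estimates (not taken from the study).
icc = icc_three_level(var_country=0.40, var_neighbourhood=0.80)
mor_country = median_odds_ratio(0.40)
mor_neighbourhood = median_odds_ratio(0.80)
print(icc, mor_country, mor_neighbourhood)
```

A variance of zero gives an MOR of exactly one, matching the interpretation that an MOR of one indicates no variation at that level.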
Model fit and specifications
We checked for multi-collinearity among explanatory variables by examining the variance inflation factor (VIF), all diagonal elements in the variance-covariance (τ) matrix for correlation between −1 and 1, and diagonal elements for any elements close to zero. None of the results of the tests provided reasons for concern. Thus, the models provide robust and valid results. The MLwiN software, version 2.31, was used for the analyses [36, 37]. Parameters were estimated using the Markov Chain Monte Carlo procedure. The Bayesian Deviance Information Criterion was used as a measure of how well our different models fitted the data. A lower value of the Deviance Information Criterion indicates a better fit of the model.
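For illustration, variance inflation factors can be obtained as the diagonal of the inverse of the correlation matrix of the explanatory variables, which is equivalent to 1/(1 − R²) from regressing each variable on the others. The sketch below uses simulated, hypothetical covariates; it is not the authors' code, and no particular VIF threshold is implied.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF per column, computed as the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))


# Simulated covariates (hypothetical): x3 is nearly a copy of x1.
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + 0.1 * rng.normal(size=500)

vif = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(vif)
```

In this example the independent covariate has a VIF close to one, while the two nearly collinear covariates have very large VIFs.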