Trachoma is a blinding disease caused by recurrent ocular Chlamydia trachomatis infection, an organism that produces chronic inflammation of the tarsal conjunctiva. This is characterised by sub-epithelial follicles, which may meet the definition for the sign trachomatous inflammation—follicular (TF) [1]. TF is the sign whose prevalence in 1–9-year-olds is used to determine whether public health-level interventions against active (inflammatory) trachoma are needed [2]. Through repeated reinfection [3, 4], conjunctival scarring may develop, eventually causing the eyelashes to turn inward and touch the globe, a state known as trachomatous trichiasis (TT). In-turned eyelashes that abrade the cornea can result in corneal opacity and blindness [1]. Corrective surgery [5, 6] or epilation [7] are used to manage TT.

Ocular chlamydial transmission is declining in many countries [8,9,10,11], suggesting exceptional progress in interrupting the transmission cycle. Until recently, TT prevalence was evaluated only within the context of TF surveys [12]. As TT plays an essential role in trachoma elimination, it remains important to focus on areas where TT is still a public health problem, even in the absence of TF.

The pathogenesis of trachoma, as implicitly conceptualized within WHO recommendations for district-level interventions, is one of repeated episodes of active trachoma incrementally increasing the cumulative risk of TT. It should be noted that this may be a simplistic outlook on the complicated pathway to TT, and additional elements may influence progression. However, active trachoma is a pre-requisite on TT’s causal pathway, with moderate to high prevalences of TF being a proxy for current transmission of ocular C. trachomatis, and TT a proxy for historic transmission. The prevalence levels of these signs are therefore signals for C. trachomatis transmission intensity at different times (TF is current, and TT is historic and cumulative). Even though TF prevalence and TT prevalence are markers of transmission at different time points or over different time scales, in areas where antibiotic mass drug administration (MDA) for trachoma [13] has not yet occurred, it is often assumed that ocular C. trachomatis transmission intensity has remained more or less constant over decades, and that TF prevalence and TT prevalence will therefore closely correlate. This assumption is reasonable if community-level access to water, sanitation, hygiene and anti-chlamydial antibiotics has been constant or has changed only gradually. However, such an assumption is not always valid [14, 15].

Many national programmes [8] have successfully reduced TF prevalence in children aged 1–9 years below the elimination threshold of 5% [16] in some or many districts. To eliminate trachoma as a public health problem, district-level TT prevalence must also be reduced below 0.2% in adults aged ≥15 years [16]. Whether or not active trachoma and TT are public health problems are two separate questions. In Nigeria, for example, 94 local government areas in six states mapped through the Global Trachoma Mapping Project (GTMP) yielded district-level TF prevalence estimates below the elimination threshold and district-level TT prevalence estimates above the threshold [17,18,19,20,21,22,23]. This is attributable to historic transmission intensity being considerably higher than the contemporary one. It is important to better understand the factors associated with high TT burden, so as to develop more targeted control interventions. Understanding where TT cases are likely to occur could help to guide strategic placement of TT intervention services.

Thanks to the GTMP, high-quality geolocated trachoma and water, sanitation and hygiene (WASH) data have become increasingly available. The GTMP was launched in December 2012 with the aim of mapping the global prevalence of trachoma in all suspected-endemic districts, through completion of population-based prevalence surveys. It systematically collected trachoma and WASH data across 1546 districts in 29 countries, nearly exclusively in areas where control activities, including antibiotic MDA, had not yet occurred [24]. These data can be used to further our understanding of TT distribution.

In this study, we attempted to identify risk factors that, in addition to TF, might be associated with variation in community-level TT prevalence. To this end, we fitted binomial mixed models, with random effects at community level, to GTMP baseline data from ten countries. We then tested for residual spatial correlation and, in countries where this was detected, used geostatistical methods to model the variation in TT prevalence between countries.



Ten GTMP collaborating countries provided data for this study: Benin, Cote d’Ivoire, Democratic Republic of the Congo (DRC), Ethiopia, Guinea, Malawi, Mozambique, Nigeria, Sudan and Uganda. Data provided were from 15,051 clusters (or communities) within 624 trachoma elimination intervention-naïve evaluation units (EUs) (Table 1). Individual-level information on the presence or absence of TF and TT, as well as on water and sanitation access of geolocated households, was provided.

Table 1 Summary of GTMP data included in the analysis

Community-level TT prevalence was calculated as the ratio between the number of adults aged ≥15 years with trichiasis in at least one eye and the number of adults aged ≥15 years examined. Community-level TF prevalence was calculated as the ratio between the number of children aged 1–9 years with TF in at least one eye and the number of children aged 1–9 years examined.
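
The two ratios above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's actual pipeline: the record layout (keys `community`, `age`, `tf`, `tt`), the function name and the example records are all assumptions made for the sketch.

```python
from collections import defaultdict


def community_prevalence(records, sign, min_age, max_age=float("inf")):
    """Return {community: prevalence} for one clinical sign within an age band.

    `records` is an iterable of dicts with hypothetical keys:
    'community', 'age' (years) and one boolean per sign ('tf' or 'tt')
    that is True when the sign is present in at least one eye.
    """
    examined = defaultdict(int)
    positive = defaultdict(int)
    for person in records:
        if min_age <= person["age"] <= max_age:
            examined[person["community"]] += 1
            if person[sign]:
                positive[person["community"]] += 1
    # Communities with nobody examined in the age band are simply absent.
    return {c: positive[c] / n for c, n in examined.items()}


# Illustrative records only (not GTMP data).
people = [
    {"community": "A", "age": 4, "tf": True, "tt": False},
    {"community": "A", "age": 7, "tf": False, "tt": False},
    {"community": "A", "age": 40, "tf": False, "tt": True},
    {"community": "A", "age": 62, "tf": False, "tt": False},
    {"community": "B", "age": 3, "tf": False, "tt": False},
    {"community": "B", "age": 30, "tf": False, "tt": False},
]
tf_prev = community_prevalence(people, "tf", min_age=1, max_age=9)  # children aged 1-9 years
tt_prev = community_prevalence(people, "tt", min_age=15)            # adults aged >=15 years
```

Keeping the numerator and denominator counts separate, rather than storing only the ratio, is convenient because a binomial model needs both.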

Physical and social environmental factors are hypothesized to play an important role in the natural history of trachoma. These factors could conceivably alter the rate of progression to TT (Fig. 1).

Fig. 1 Conceptual framework of environmental risk factors influencing progression to TT

Facial cleanliness is well established as being associated with TF [25,26,27,28,29,30]. Access to water is necessary to facilitate personal hygiene practices. Previous studies have found an association between distance to water and risk of trachoma [25, 31,32,33]. There is mixed evidence on whether higher-density populations of Musca sorbens, the fly vector for ocular C. trachomatis, are associated with a greater risk of trachoma [27, 34]. M. sorbens prefers to breed on human faeces left exposed on the soil [35, 36], and so it may be that latrine ownership has a protective association against active trachoma [37]. For this analysis, community-level WASH indicators were created from the GTMP household-level WASH dataset (Appendix 1). The categorization of these indicators was informed by the WHO/UNICEF Joint Monitoring Programme for Water Supply and Sanitation (JMP) [38]. We calculated the prevalence of access to each categorized WASH indicator.

Previous studies have shown that lower precipitation levels and higher temperatures can lead to an increase in the risk of TF [39]. Therefore, we selected climate-related factors, including annual total precipitation, mean temperature, aridity index and potential evapo-transpiration (PET), for this analysis. Gridded maps at 1 km² resolution of annual total precipitation and mean temperature were extracted from the WorldClim database [40]. The aridity index and PET raster datasets, also at 1 km² resolution, were obtained from the Consortium for Spatial Information (CGIAR-CSI) [41]. CGIAR-CSI modelled aridity index and PET using the data available from WorldClim as input parameters.

It has been suggested that the frequent sandstorms that occur in some areas of Sudan cause eye trauma [42]. Irritation of the eyes leads to rubbing with fingers, which could potentially accelerate progression to TT. Hence, in our analysis we considered the proportion of sand in topsoil as a potential risk factor for TT. These data were obtained from the ISRIC-World Soil Information project included in the Harmonized Soil Map of the World [43].

We speculate that access to healthcare and other services is associated with developed infrastructure, and therefore sought an infrastructure indicator. Light density at night has been shown to be correlated with local economic activity and gross production rate at different scales [44, 45]. Night light emission captured by the Operational Linescan System instrument on board a satellite of the Defence Meteorological Satellite Program was used as a proxy measure of poverty across Africa [46, 47]. A gridded map of straight-line distances to stable lights, namely night light emissivity > 0, was subsequently produced from the raw night light raster for 1997. This historic year was chosen because we were interested in a measure of infrastructure during the childhood of survey participants, rather than at the time of the surveys themselves; the mean age of participants was 36 years.

All the aforementioned environmental datasets were derived from georeferenced raster files and converted to a standardized resolution of 5 km × 5 km. The georeferenced data were linked in ArcGIS 10.1 (ESRI, Redlands, CA, USA). When the spatial resolution of the 1 km² covariates needed to be coarsened, we estimated the mean value in a 5 km × 5 km window using the aggregate tool in the Spatial Analyst toolbox of ArcGIS 10.1 (Appendix 2).
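
The coarsening step can be illustrated with a short Python sketch. It is not the ArcGIS implementation: the function name is hypothetical, the grid is a plain list of rows, and treating `None` as a NoData cell that is ignored in the mean is an assumption made for the example.

```python
def aggregate_mean(grid, factor=5):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks.

    `grid` is a list of equal-length rows; None marks a NoData cell
    (an assumption of this sketch). Blocks on the right and bottom edges
    may be smaller than factor x factor when the grid size is not a multiple.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r0 in range(0, n_rows, factor):
        coarse_row = []
        for c0 in range(0, n_cols, factor):
            cells = [
                grid[r][c]
                for r in range(r0, min(r0 + factor, n_rows))
                for c in range(c0, min(c0 + factor, n_cols))
                if grid[r][c] is not None
            ]
            # A block containing only NoData stays NoData.
            coarse_row.append(sum(cells) / len(cells) if cells else None)
        coarse.append(coarse_row)
    return coarse


# A 10 x 10 grid of 1 km cells whose value equals the row index.
fine = [[float(r) for _ in range(10)] for r in range(10)]
coarse = aggregate_mean(fine, factor=5)  # 2 x 2 grid of 5 km cells
```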

To identify collinearity among the selected variables, we used the variance inflation factor (VIF) [48], defined as

$$ \mathrm{VIF}_j = \frac{1}{1-R_j^2} $$

where $R_j^2$ is the fraction of variance in the j-th explanatory variable that is explained by the other explanatory variables.
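
As a worked illustration of this definition, the sketch below regresses each variable on the others by ordinary least squares and converts the resulting R² into a VIF. It uses only the Python standard library; the function names and the toy data are assumptions made for the example, and it is not the software used in the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 from an OLS regression of y on the predictor columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * v for bj, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """VIF_j = 1 / (1 - R_j^2) for each column; fails if a column is perfectly collinear."""
    return [
        1.0 / (1.0 - r_squared(col, columns[:j] + columns[j + 1:]))
        for j, col in enumerate(columns)
    ]


# Toy data: the first pair is uncorrelated, the second pair is nearly collinear.
uncorrelated = vif([[1, 2, 3, 4], [1, -1, -1, 1]])
collinear = vif([[1, 2, 3, 4], [1, 2, 3, 5]])
```

For the uncorrelated pair both VIFs equal 1, whereas the nearly collinear pair yields large values, which is the pattern that would flag a variable for exclusion.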

Model formulation

Let $i$ index communities, and let $p_i$ denote the probability of having TT in the i-th community, $\beta_0$ the intercept and $U_i$ the community-level unstructured random effect. We fitted the following nested binomial mixed models, where $\mathrm{TF}_i$ is the community-level TF prevalence and $\gamma$ is the regression coefficient for its effect on the log-odds of TT:

$$ M1:\ \log\left(\frac{p_i}{1-p_i}\right)=\beta_0+U_i; $$
$$ M2:\ \log\left(\frac{p_i}{1-p_i}\right)=\beta_0+\gamma\,\mathrm{TF}_i+U_i; $$
$$ M3:\ \log\left(\frac{p_i}{1-p_i}\right)=\beta_0+\gamma\,\mathrm{TF}_i+\sum_{j=1}^{q}\beta_j d_{ij}+U_i, $$

where $d_{ij}$, $j = 1, \dots, q$, in M3 are the explanatory variables described in the previous section and $\beta_j$ are their regression coefficients. We used the likelihood ratio test to select among the three models defined above.
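
A likelihood ratio comparison of two nested models, such as M1 against M2, can be sketched as follows. This is a minimal standard-library illustration under the usual chi-square approximation; the function names and the log-likelihood values are hypothetical, and fitting the mixed models themselves is outside the scope of the sketch.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))  # closed form for df = 1
    else:
        a, q = 1.0, math.exp(-z)             # closed form for df = 2
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, extra_params):
    """Return (statistic, p-value) for a reduced model nested within a full model."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, extra_params)


# Hypothetical maximized log-likelihoods for M1 and M2 (one extra parameter, gamma).
stat, p_value = likelihood_ratio_test(-100.0, -98.0, extra_params=1)
```

With these illustrative values the statistic is 4.0 on one degree of freedom, so M2 would be preferred at the 5% level.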

In fitting M3, we also carried out variable selection using a backward stepwise approach, starting from the mixed effects model with all the variables included. The likelihood ratio test was used to test for the significance of each variable, with terms removed one by one until all those remaining were significant at the 5% level.

To assess the presence of residual spatial correlation, we first obtained a point estimate of the community-level unstructured random effects $U_i$ from the best model identified in the previous step, and then computed the empirical semi-variogram. A semi-variogram provides insights into the rate of decay of spatial autocorrelation in the data. It does this by computing half the mean squared difference between pairs of residuals as a function of the distance between their associated geographical locations. A flat semi-variogram is interpreted as evidence against the presence of spatial correlation. To test for spatial correlation more formally, we also generated 95% confidence intervals under the assumption of spatial independence. These intervals were obtained by computing semi-variograms on 1000 randomly permuted point estimates of $U_i$, while holding the geographical locations fixed.
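
The permutation procedure can be sketched in standard-library Python. The sketch uses the usual convention of half the mean squared difference per distance bin, and it assumes that coordinates are already projected to kilometres so that Euclidean distance is meaningful; the function names, the binning scheme and the toy data are assumptions made for the example.

```python
import math
import random


def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Half the mean squared difference between pairs of values, per distance bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            b = int(math.dist(coords[i], coords[j]) // bin_width)
            if b < n_bins:
                sums[b] += 0.5 * (values[i] - values[j]) ** 2
                counts[b] += 1
    # Bins containing no pairs are reported as None.
    return [s / c if c else None for s, c in zip(sums, counts)]


def permutation_envelope(coords, values, bin_width, n_bins, n_perm=1000, seed=0):
    """Pointwise 2.5% and 97.5% quantiles of semi-variograms of permuted values."""
    rng = random.Random(seed)
    shuffled = list(values)
    sims = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # locations stay fixed, values are reassigned
        sims.append(empirical_semivariogram(coords, shuffled, bin_width, n_bins))
    lower, upper = [], []
    for b in range(n_bins):
        col = sorted(s[b] for s in sims if s[b] is not None)
        if not col:
            lower.append(None)
            upper.append(None)
            continue
        lower.append(col[max(0, math.ceil(0.025 * len(col)) - 1)])
        upper.append(col[min(len(col) - 1, math.ceil(0.975 * len(col)) - 1)])
    return lower, upper


# Toy example: three communities on a line, 1 km apart.
toy = empirical_semivariogram([(0, 0), (1, 0), (2, 0)], [0.0, 1.0, 2.0], 1.0, 3)
```

An observed semi-variogram that leaves the envelope at short distances would be read as evidence of residual spatial correlation.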

In cases where we found evidence of spatial correlation, we fitted geostatistical binomial logistic models, in which $U_i$ was modelled as a spatial Gaussian process with a stationary and isotropic correlation function. All the geostatistical models were fitted using the PrevMap package [49].


The output of the cluster-level collinearity tests suggested that temperature, precipitation, aridity index and PET were collinear with one another (Table 2). Since aridity index was highly correlated with each of the other three indicators, we retained this variable and excluded the remainder.

Table 2 Multicollinearity test results for gridded covariates

The strength of association for variables in the full mixed effects model varied between countries (Table 3). There was very strong evidence of association (p < 0.05) between community-level TF prevalence and TT prevalence in all countries except Guinea and Uganda. In contrast, there was evidence of association with access to latrines in 4 of 10 countries (DRC, Ethiopia, Mozambique and Nigeria (p < 0.01)), with access to improved latrines in 2 countries (Nigeria and Uganda (p < 0.05)), and with water source variables in 3 countries (Ethiopia, Nigeria and Guinea (p < 0.05)). Observed relationships with environmental factors were equally heterogeneous, with associations observed with aridity index in DRC, Ethiopia, Nigeria and Sudan, and with sand/soil fraction only in Benin and Ethiopia. Night light was associated with TT prevalence in DRC, Ethiopia, Mozambique and Nigeria.

Table 3 Relative increase in odds derived from a multivariate binomial logistic model where community-level prevalence of TT in adults aged ≥15 years is dependent on a 10% increase in community-level prevalence of TF in children aged 1–9 years
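
The relative increase in odds for a 10% increase in TF prevalence relates to the TF coefficient of the logistic model through an exponential. The sketch below assumes that TF prevalence enters the model as a proportion between 0 and 1 and that "10%" means an absolute increase of 0.10; both are assumptions of this illustration, as are the function names and the example value.

```python
import math


def odds_ratio(gamma, delta_tf=0.10):
    """Odds ratio for TT associated with an absolute increase delta_tf in TF prevalence."""
    return math.exp(gamma * delta_tf)


def coefficient_from_odds_ratio(odds_ratio_value, delta_tf=0.10):
    """Logit-scale coefficient implied by an odds ratio for an increase of delta_tf."""
    return math.log(odds_ratio_value) / delta_tf


# Illustrative value: a 20% increase in odds per 0.10 increase in TF prevalence.
gamma_example = coefficient_from_odds_ratio(1.20)
relative_increase = odds_ratio(gamma_example) - 1.0
```

Under these assumptions, an odds ratio of 1.20 corresponds to a coefficient of about 1.82 on the logit scale.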

The nested mixed effects models (Table 4) show that when TF was added as a fixed effect, the proportional reduction in variance ranged from 0.06 (Nigeria) to 0.42 (Benin). When environmental risk factors were added, the proportional change in variance ranged from 0.25 (Ethiopia) to 0.79 (Cote d’Ivoire). In all countries, variance continued to decrease as TF and then environmental risk factors were added to the model.
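
The proportional reduction in variance can be written as a one-line calculation. The sketch below assumes that the quantity compared is the estimated variance of the community-level random effect and that the reference is the intercept-only model M1; the function name and the variance values are hypothetical.

```python
def proportional_variance_reduction(var_reference, var_extended):
    """Share of the reference model's random-effect variance removed by the extended model."""
    if var_reference <= 0:
        raise ValueError("reference variance must be positive")
    return (var_reference - var_extended) / var_reference


# Hypothetical variances: 1.00 under M1 and 0.58 after adding TF (M2).
reduction = proportional_variance_reduction(1.00, 0.58)
```

With these illustrative numbers the reduction is 0.42, that is, 42% of the between-community variance in the intercept-only model is accounted for by the added term.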

Table 4 Comparison of variance explained by each mixed effects model

The best models, selected using the likelihood ratio test, are shown in Table 5. DRC, Ethiopia and Nigeria maintained the largest number of variables significant at the 5% level. These three countries also had the largest quantities of data available.

Table 5 Relative increase in odds derived from a multivariate binomial logistic model where community-level prevalence of TT in adults aged ≥15 years is dependent on a 10% increase in community-level prevalence of TF in children aged 1–9 years. Community-level household prevalence of improved sanitation and hygiene facilities as well as gridded covariates were included along with community-level TF

Semi-variograms generated with Pearson’s residuals of the best fitting non-spatial binomial models suggest presence of residual spatial correlation in Benin, DRC, Ethiopia, Mozambique and Sudan. The 95% confidence intervals generated under the assumption of spatial independence demonstrate spatial correlation in these countries (Fig. 2).

Fig. 2 Semi-variograms generated with Pearson’s residuals derived from the best fitting non-spatial mixed effects model. The empirical semi-variogram (black dashed line) is displayed with 95% confidence intervals (red dashed lines) created by generating 1000 simulations. All distances are in kilometres

The distance at which spatial correlation fell below 5% ranged from 3.0 km (in Ethiopia) (95% credible interval 1.6–6.0 km) to 14.2 km (in Mozambique) (95% credible interval 3.3–76.4 km), corresponding with a very rapid decline in spatial correlation with distance at larger scales, after accounting for covariates (Table 6).
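
To make the "distance at which spatial correlation fell below 5%" concrete, the sketch below assumes an exponential correlation function, ρ(d) = exp(−d/φ), with scale parameter φ in kilometres. The functional form is an assumption of this illustration, chosen because it is a common stationary and isotropic choice, and the function names are hypothetical.

```python
import math


def exponential_correlation(distance_km, phi_km):
    """Assumed correlation between two locations separated by distance_km."""
    return math.exp(-distance_km / phi_km)


def practical_range(phi_km, threshold=0.05):
    """Distance at which the assumed correlation falls to the given threshold."""
    return -phi_km * math.log(threshold)


# With phi = 1 km, the correlation drops to 5% at phi * ln(20) km.
range_km = practical_range(1.0)
```

Under this assumed form, the distance is roughly three times φ, so a reported distance of 3.0 km would correspond to a scale parameter of about 1 km.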

Table 6 Scale of community-level TT prevalence spatial correlation in kilometres when accounting for covariates significant at the 5% level, by country with 95% confidence intervals


We investigated factors associated with community-level TT prevalence after considering community-level TF prevalence and spatial dependency, to try to understand what causes this variation. We demonstrated considerable variation in the relationship between community-level TF and TT. When accounting for other covariates, the mixed effects models demonstrated a strong association between community-level TF and TT in eight of ten countries. These models estimate that a 10% increase in community-level TF prevalence is associated with an increase in the odds of TT of 20 to 86%, varying with setting. Benin, Cote d’Ivoire and Mozambique had exceptionally high increments in odds ratios with increasing TF, whereby a 10% increase in community-level TF prevalence was associated with an increase in TT odds of 78, 86 and 74%, respectively. These high increments in odds ratios for TT lead us to speculate that reductions in TF prevalence in these environments will be quickly followed by a reduction in the incidence of TT. The relatively low increments in odds ratios in Ethiopia and Nigeria, where a 10% increase in community-level TF prevalence was associated with an increase in TT odds of only 23 and 20%, respectively, suggest a slight disconnect between historic and current transmission. This could be a signal of (1) a change in transmission dynamics over time, (2) population movement, (3) our use of all trichiasis, rather than only trichiasis due to trachoma, as the dependent variable, or (4) factors other than C. trachomatis influencing TT. Further analysis is needed to explore how including data on the presence or absence of trachomatous conjunctival scarring in eyes with trichiasis would influence the relationships that we observed here.

Importantly, in these models, the proportion of variance explained by TF ranges from 6% (in Nigeria) to 42% (in Benin). This range highlights the complex relationship between the distribution of TT and the distribution of TF. Environmental covariates, on average, explain an additional 9% (in Ethiopia) to 46% (in Nigeria) of variance. Our models suggest that while community-level TF prevalence is generally the strongest single predictor of TT, it does not fully explain the variation in community-level TT prevalence, implying that, occasionally, high-TT-prevalence populations will be found where TF is rare. It has been widely observed that dry conditions (parameterized in our analysis as a low aridity index) are a risk factor for TF in children [31, 50,51,52]. We found an extension of this association in three of our countries, in the form of an association of low aridity index with increased TT prevalence. However, in Sudan we observed an unexpectedly positive association between community-level TT prevalence and aridity index. This counter-intuitive relationship may be attributable to coinfections facilitated in humid climates. It has been shown that coinfection with other bacteria [53], such as Streptococcus pneumoniae and Staphylococcus aureus [54], could influence progression of TT [55].

High levels of self-reported latrine use by adults, aridity index and 1997 night light had strongly significant associations with TT prevalence in only three of ten countries. This suggests that hygiene practices, dry climate and historic infrastructure may link to increased community-level TT prevalence in some settings, but generally they do not. Previous studies have clearly shown the association between access to WASH and risk of TF [25, 32, 39, 56] and so it is not surprising that our models, which account for TF prevalence, generally do not demonstrate significant residual associations between TT and WASH variables. The variation in direction of association may be an artefact of WASH improvements over time, or—hypothetically—existence of latrines themselves could contribute to facilitating M. sorbens breeding if the latrines are not appropriately maintained, thereby deterring some potential users whilst protecting householders from legal or peer pressure to build an adequate facility. The development of TT requires many previous C. trachomatis infections [4] and so populations that historically had poor WASH access may now have high TT burdens, even if the WASH situation has since improved.

Many other studies have identified correlates of high TF prevalence, including potential socio-economic, demographic and environmental risk factors [25, 32, 39, 56,57,58], and have explored TF’s spatial distribution at different geographic scales [33, 52, 59,60,61]. However, few studies have specifically examined TT’s environmental risk factors and spatial distribution [42, 62, 63]. These previous studies were limited by the amount of data they incorporated, and their conclusions therefore had constrained generalizability. Our models, developed using large datasets from ten countries with outcome data considered to be gold standard [64], reached similar conclusions and so provide additional validation to this previous work.

The variation between countries in directions of association of environment-related indicators and the variation in spatial structure indicate that fitting a single model to the whole set of data is inadvisable. These variations are presumably attributable to country context.

There may be several explanations for the inconsistency of associations between large-scale indicators and community-level TT seen between countries. Studies have shown that post-operative recurrence of TT [55, 65, 66] and incidence of scarring [67, 68] may be important influencers of TT prevalence. It would be valuable for future models to further explore these elements. Our modelling approach did not capture recent or historical population movement. Migration could certainly play a role in the geographic distribution of TT. It is also important to note that different ethnic groups may have different progression rates to TT. For example, a study in the Gambia found that a polymorphism in the TNF-α gene promoter was associated with scarring, and was found more frequently among Mandinkas than other ethnic groups [69].

We observed residual spatial correlation in only five countries (Benin, DRC, Ethiopia, Mozambique, and Sudan), suggesting that in the remaining countries there are no outstanding large-scale environmental factors influencing progression to TT.

In the geostatistical models, we identified a very rapid decline in spatial correlation with distance at larger scales, after accounting for covariates. This suggests that only communities very close to one another have similar levels of TT.


The lack of consistent risk factors beyond community-level TF raises concerns that the models identified artefacts that are not generalizable, such as non-trachomatous trichiasis, or that the clinical history of trachoma varies substantially between settings. This underlines the importance of understanding local context when designing interventions for at-risk populations. Whilst our findings are not generalizable across countries, they can provide general direction for where to initiate case finding activities. As has been found in the Guinea Worm eradication program, active surveillance and case finding will be essential as trachoma elimination endpoints draw closer [70]; these activities become more expensive as prevalence drops [71]. This uniquely large and standardized analysis provides important insight into the variation in community-level TT distribution and identifies substantial variation in the relationship between community-level TF and TT prevalence. For some countries, important environmental risk factors were identified which can be used to inform case finding efforts, by providing insight into where TT cases are most likely to be found. Our findings suggest that in some countries it is possible to inform strategic location of TT management services, potentially improving efficiency of the end-game of trachoma elimination.