Background

Visceral leishmaniasis (VL) in the Indian sub-continent (ISC) is a disease caused by chronic infection with the protozoan parasite Leishmania donovani, transmitted by the Phlebotomus argentipes sandfly. Clinically manifest visceral leishmaniasis, also called kala-azar (KA), is progressive with a high mortality rate, and characterized by prolonged fever and an enlarged liver and/or spleen. Clinical and laboratory diagnostics are imprecise [1–4], partly because only a small proportion of infected individuals develop disease (so that the presence of infection alone is not diagnostic), and partly because the clinical features of VL overlap with those of other endemic diseases (e.g. hyperreactive malarial splenomegaly, typhoid fever and disseminated tuberculosis), so that clinical presentation alone is not diagnostic. The current diagnosis generally relies on clinical features, specifically a fever lasting at least 14 days and a palpable liver/spleen (hepatomegaly/splenomegaly), and elevated rK39 antibodies (based on immuno-chromatographic assay) rather than evidence of active current infection. This combination of fever duration and rK39 rapid testing has high sensitivity [5].

Epidemiologically, KA is spatially highly heterogeneous with focal ‘hotspots’ of infection that move over time, and periodic epidemics on a timescale of decades [6, 7]. The control campaign in the ISC, which has been running since 2005, has focused on elimination as a public health problem (less than 1 new case per 10,000 people per year), defined at local geographical scales (per subdistrict in both India and Bangladesh, known as an upazila in Bangladesh). Progress has been made towards the target by implementing novel case detection strategies, rapid diagnostic testing and vector control activities. In Bangladesh, KA incidence declined in all districts in the 8 years following the start of the control campaign (2006–2013) from the previous 8 years (1998–2005) [8]. The highest number of cases was reported in 2006, after which the annual number of cases decreased significantly. In the period from 2008 to 2013, only 16 upazilas had average incidence rates above the elimination target (ranging from 1.06-18.25/10,000 people/year) [8]. In Nepal, where KA was only endemic in south-eastern districts neighbouring the state of Bihar in India and incidence rates were much lower (1-10/10,000 people/year in 2007–2008 [9]), the elimination target has been reached for two consecutive years. However, Bihar, which accounts for 70-80 % of the KA cases in India [10], is still far from the target with an estimated incidence of 22–29.8/10,000 people/year in 2006–2007 [11], and more recent estimates of 1–5 cases/10,000 people/year [10, 12].

Although vector control activities (in particular indoor residual spraying (IRS)) are a pillar of the elimination programme [13, 14], they appear to have uncertain and variable effectiveness, likely due to sub-optimal implementation and, in some areas, insecticide resistance [15–17]. However, a randomised controlled trial of IRS and insecticide-treated bed nets in Bangladesh from 2006 to 2007 showed a 70–80 % reduction in sandfly density up to 5 months post intervention [18], and recent modelling of IRS suggests that in low and medium endemicity settings (5–10 KA cases/10,000 people/year) effective IRS may be sufficient to reach the 1 case/10,000 people/year elimination target [19].

Progress in KA case reduction over the past decade has been largely attributed to improved timeliness of diagnoses and more effective treatment [20]. Given that the 'natural' epidemiology of the disease is typified by recurrent epidemics followed by long periods of low incidence, and noting that the current control is dependent on substantial external resources, effort and clinical awareness, there is considerable potential for future resurgence without a sustained elimination effort [20, 21]. Other major issues for the elimination programme include high levels of under-reporting (the ratio of actual to reported KA cases in the ISC is estimated to range between 2:1 and 8:1 [22, 23]), and the unknown contribution of asymptomatically infected individuals, who potentially form a large infectious reservoir, to transmission [24, 25].

Mathematical and statistical modelling of infectious diseases has a successful history of combining epidemiological data, biological understanding and clinical knowledge into quantitative frameworks that can be used to both interpret disease incidence (in terms of infection patterns) and predict the impact of proposed interventions. Visceral leishmaniasis is unusual in that there have been relatively few previous modelling attempts, mostly driven by the lack of quantitative data [26]. There are, to our knowledge, only two recent, high quality, longitudinal epidemiological studies: the KALANET bed net trial in India and Nepal between 2006 and 2009 [27, 28], and the studies of Bern et al. in Bangladesh from 2002 to 2010 [29–31]. Consequently, there is still a large amount of uncertainty about the natural history of the disease, in particular about its incubation period and the proportion of asymptomatically infected individuals that develop KA. From here on we treat the incubation period as being synonymous with the duration of asymptomatic infection, since in our modelling we do not initially distinguish between asymptomatically infected individuals that develop KA and those that do not, but we note that the duration of asymptomatic infection may be different for the two groups and also test this hypothesis (see Additional file 1). Previous estimates for the incubation period have ranged from 2 to 6 months [32–34], while estimates for the proportion of asymptomatic individuals that progress to KA have varied hugely, from 0.33 % [33] to 25 % [31]. Better quantification of these aspects of the disease is critical for developing effective models, guiding efforts to eliminate VL and maintaining elimination once it is reached.

Towards this end, we analyse the prospective, longitudinal data from a 3-year study in Bangladesh in the period 2002–2004 (details of the study and epidemiological analyses have been reported elsewhere [29, 31]). We use annual data on rK39 positivity and positivity of the leishmanin skin test (LST), together with KA diagnosis, to estimate the rates of progression between different disease states. The aim is to provide preliminary, quantitative estimates of waiting times (i.e. times spent in each state) and paths of progression that will feed into future transmission model development.

Methods

Data

The study took place in a single community in Fulbaria upazila, Mymensingh district, Bangladesh between January 2002 and June 2004. Fulbaria was chosen due to its high reported KA incidence for the three years prior to the study. In 2002 the community had a population of approximately 12,000 people and was divided into 9 'paras' (sections), of 100–500 houses each. Cross-sectional household surveys of the 3 paras with the highest reported KA incidence were conducted from January to April in 2002, 2003 and 2004. All individuals who lived at least 6 months in the study area in the 3 years before the first survey in 2002 were included. The protocol was approved by the International Centre for Diarrhoeal Disease Research, Bangladesh (ICDDR,B) Research and Ethical Review Committees and the Institutional Review Board of the Centers for Disease Control and Prevention (CDC).

The data recorded included demographic information (age and sex), present and past KA cases (back to 1999), and risk factors (such as sleeping location, bed net use, diet, and animal ownership). For participants ≥ 3 years of age, capillary blood samples were taken for serology testing and the leishmanin skin test (LST) was applied intradermally. The blood samples were tested by an enzyme-linked immunosorbent assay (ELISA) with recombinant K39 (rK39) antigen [35, 36] and a modified protocol that included a standard titration curve of a pool of known positive sera on each plate of blood samples [37]. Concentration units (CU) were assigned to the standard titration curve (with the highest concentration on the curve assigned a value of 1000CU), and the optical density of the serum specimens converted into CU. The positive ELISA cut-off was set at 20CU—the 99th percentile of the distribution of ELISA readings from 38 individuals living in a non-VL-endemic region of Bangladesh. A second cut-off of 61CU was introduced for diagnosis of active KA, with a sensitivity and specificity of 97 % and 98.9 % respectively for sera from the study population based on receiver-operator-characteristic analysis [37].

The antigen for the LST was a suspension of \( 5 \times 10^{6} \) promastigotes/mL of the WHO-approved MHOM/TN/80/IPT1 strain of L. infantum. The test was applied following the standard protocol: 0.1 mL of antigen was injected intradermally on the inside of the forearm and 48–72 h later the induration of the skin was measured in two perpendicular directions [38, 39]. In accordance with international consensus, the LST result was deemed positive if the mean of the two measurements was ≥ 5 mm [31, 40]. There was evidence of loss of leishmanin potency in this study in the 2003 and 2004 survey rounds [30], and so at later time points the number of individuals with positive LST reactivity is likely to be an underestimate (testing showed that the L. infantum antigen had a sensitivity of 70 % compared with L. amazonensis antigen in 2004).

A past case of KA was defined as an illness with ≥ 2 weeks of fever and at least one of: weight loss, abdominal swelling or skin darkening, with clinical improvement after 20 days of intramuscular injections of sodium stibogluconate (SSG) (the treatment for KA prescribed by national guidelines at the time). A present case of KA was defined as one that fulfilled the same definition plus splenomegaly and/or hepatomegaly and a positive rK39 ELISA result or rK39 dipstick test [29].

In total, data were collected on 2,410 out of 2,507 individuals living in 509 houses in the 3 paras during the study. Of these individuals, 47 % were male and 53 % were female, and 2,152 had at least one rK39 ELISA or LST reading between 2002 and 2004. There were 182 cases of KA from the start of 1999 to the end of the study in June 2004: 125 cases with onset before 2002 and 57 with onset from 2002 to the end of the study (see Table 1). There were only 5 relapses to active KA following treatment, and the incidence of post kala-azar dermal leishmaniasis (PKDL) was very low, with only 4 confirmed cases out of the 182 KA cases (all of which were in 2004). Consequently, we have not included development of PKDL in our modelling.

Table 1 Summary of the data

Statistical analysis

Following the identification of delays between onset of symptoms and diagnosis and treatment [20, 41], a descriptive analysis of the key time periods in the data was performed. To investigate the impact of serological status on progression to disease, we analysed the risk of progression to KA for those with a particular sero-status at baseline, and those who seroconverted during the study. Kaplan-Meier curves were plotted and Cox proportional hazards regression models fitted to test for associations between (i) initial rK39 seropositivity and KA and (ii) rK39 seroconversion and KA, following a previous analysis by Hasker et al. [28].
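As a rough illustration of the product-limit (Kaplan-Meier) estimate that underlies such curves, the sketch below computes the survival function from follow-up times and event indicators. The function name and the toy data are hypothetical and are not taken from the study; in practice a dedicated survival analysis package would normally be used.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each individual
    events : 1 if the individual developed KA at that time, 0 if censored
    """
    events_at = Counter(t for t, e in zip(times, events) if e == 1)
    exits_at = Counter(times)      # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits_at):
        d = events_at.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits_at[t]     # remove both events and censored individuals
    return curve


# Hypothetical toy data: four individuals, one of them censored at t = 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

The censored individual contributes to the risk set up to t = 2 but never triggers a drop in the curve, which is what distinguishes this estimate from a simple proportion.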

Hasker et al. analysed serology data from four different cohorts in two large studies, two from the KALANET trial in India and Nepal and two from the Tropical Medicine Research Centre (TMRC) project run in Bihar, India since 2008 [42, 43], to determine the association between rK39 and direct agglutination test (DAT) antibody titres and progression to KA. For the KALANET trial, rK39 ELISA results were available for 2006 only, so only the risk of KA as a function of baseline seropositivity could be assessed, but for the TMRC surveys blood samples were tested using rK39 ELISA at each survey, so seroconversion was also investigated. Hasker et al. found that there was a strong association between high rK39 titres at baseline and progression to KA, and an even stronger association between seroconversion to a high titre and subsequent progression.

In our analysis, we take the 2002 serology survey as the baseline survey and the 2003 survey as the follow-up survey. We use the rK39 ELISA cut-offs described above to define seronegativity (rK39 ELISA reading < 20CU), moderate seropositivity (20CU ≤ rK39 ELISA < 61CU) and strong seropositivity (rK39 ELISA ≥ 61CU). These differ slightly from the cut-offs used in the Hasker study, in which the cut-off for seropositivity was given by the mean optical density of known negative sera plus two standard deviations, and the cut-off for strong seropositivity was determined by the percentage point optical density with the highest combined sensitivity and specificity for identifying individuals diagnosed with KA in the last 2 years. Since the cut-off for strong seropositivity for our data was determined using samples from individuals with active KA as positive controls, it is likely that it corresponds to a higher rK39 titre and is more specific for KA. Nevertheless, the cut-offs for the second TMRC cohort in [28], which was from a higher endemicity region, correspond closely to those used in our analysis.
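The three sero-status categories can be written as a simple threshold rule. The sketch below uses the two cut-offs stated above (20 CU and 61 CU); the function name and the "-", "+", "++" labels are illustrative choices.

```python
NEGATIVE_CUTOFF_CU = 20.0  # readings below this are seronegative
STRONG_CUTOFF_CU = 61.0    # readings at or above this are strongly seropositive


def sero_status(elisa_cu):
    """Classify an rK39 ELISA reading (in concentration units)."""
    if elisa_cu < NEGATIVE_CUTOFF_CU:
        return "-"    # seronegative
    if elisa_cu < STRONG_CUTOFF_CU:
        return "+"    # moderately seropositive
    return "++"       # strongly seropositive
```

For example, a reading of exactly 20 CU falls in the moderately seropositive band, because the lower cut-off is inclusive.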

For the analysis of KA progression risk with seroconversion, we classified seroconvertors from the 2002 survey to the 2003 survey into different groups. Individuals that sero-deconverted from being either strongly seropositive or moderately seropositive to seronegative were grouped together (sero-deconvertors), as were those that remained either seronegative or seropositive (non-convertors), who were taken as the reference group. Individuals whose titre increased between surveys were grouped into seroconvertors (who went from being seronegative to seropositive) and strong seroconvertors (who went from being seronegative or seropositive to strongly seropositive). People that were strongly seropositive at both surveys were treated as a separate group.
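The grouping described above can be sketched as a lookup on the pair of sero-statuses at the two surveys. The labels "-", "+" and "++" and the function name are illustrative. The text does not say how a change from strong to moderate seropositivity was handled, so this sketch returns "unclassified" for that case as an explicit assumption.

```python
def conversion_group(status_2002, status_2003):
    """Group an individual by change in sero-status between surveys.

    Statuses are "-" (seronegative), "+" (moderately seropositive)
    or "++" (strongly seropositive).
    """
    pair = (status_2002, status_2003)
    if pair == ("++", "++"):
        return "strongly seropositive at both surveys"
    if status_2003 == "++":
        # From "-" or "+" to "++".
        return "strong seroconvertor"
    if pair == ("-", "+"):
        return "seroconvertor"
    if status_2003 == "-" and status_2002 in ("+", "++"):
        return "sero-deconvertor"
    if pair in (("-", "-"), ("+", "+")):
        return "non-convertor"  # reference group
    # Assumption: "++" -> "+" is not covered by the description in the text.
    return "unclassified"
```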

Multi-state Markov model of natural history of VL

Multi-state Markov models provide an informative way of analysing the natural history of a disease, by describing how an individual moves through a series of disease states (e.g. healthy, infected, recovered, dead) in continuous time. The movement of individuals between states is governed by a set of transition intensities, \( q_{rs} \) (\( r, s = 1, \dots, R \)), where R is the number of states. Each \( q_{rs} \) with \( r \ne s \) represents the instantaneous risk of moving from state r to state s, and the diagonal entries are defined as \( q_{rr} := -\sum_{s \ne r} q_{rs} \). The transition intensities may depend on time t and a set of (potentially individual-specific) explanatory variables z (i.e. \( q_{rs} = q_{rs}(t, z) \)), and are summarised in an R × R matrix, Q, whose rows sum to zero. The aim of fitting the multi-state model to data on observations of individuals’ disease states is to estimate the transition intensity matrix Q.
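To make the definition concrete, the following minimal sketch builds an intensity matrix from its off-diagonal rates and fills in the diagonal so that each row sums to zero. The helper name and the three-state example rates are hypothetical and are unrelated to the VL model.

```python
def build_intensity_matrix(off_diagonal, n_states):
    """Build Q from a dict {(r, s): q_rs} with 1-based states and r != s."""
    Q = [[0.0] * n_states for _ in range(n_states)]
    for (r, s), rate in off_diagonal.items():
        if r == s:
            raise ValueError("only off-diagonal rates may be supplied")
        if rate < 0:
            raise ValueError("transition intensities must be non-negative")
        Q[r - 1][s - 1] = rate
    for r in range(n_states):
        # Diagonal entry: minus the total rate of leaving state r.
        Q[r][r] = -sum(Q[r][s] for s in range(n_states) if s != r)
    return Q


# Hypothetical three-state example (healthy -> ill -> dead, healthy -> dead).
example_Q = build_intensity_matrix({(1, 2): 0.5, (1, 3): 0.1, (2, 3): 0.2}, 3)
```

An absorbing state, such as death, simply has a row of zeros because no rate leaves it.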

Multi-state Markov models are particularly useful for modelling panel data on disease progression, such as that described above, where individuals are observed at approximately regular intervals, but the exact times between follow-up visits vary and limited information is available about the individuals between follow-up visits. This means that changes in individuals’ disease states generally occur at unknown times.

Following Stauch et al. [33] and with a view to developing a transmission model, we model the natural history of VL as shown in Fig. 1. Individuals are classified into 5 different disease states—susceptible, asymptomatically infected, symptomatically infected (active KA), recovered/dormant, and dead—according to their KA status and the results of the rK39 ELISA and LST as shown in Table 2 (see Table A1 in Additional file 1 for the full classification including censored states for missing tests). Susceptible individuals (state 1) have negative rK39 ELISA and LST readings, are not currently symptomatic, and their most recent rK39 ELISA test was negative. If individuals have a positive ELISA, but a negative LST and have not previously suffered KA they are classed as being asymptomatically infected (state 2). On development of symptoms an individual is recorded as having KA (state 3) and remains in this state from the date of fever onset to the end of treatment or, if the individual is untreated, one year after the onset of symptoms (98 % of patients that have KA during the study have a date of onset of symptoms and 91 % have a date of start of treatment; prior to and after the study period these data are often missing and so these dates are recorded as unknown or uncertain). However, asymptomatically infected individuals may also progress to a 'recovered/dormant' state (state 4) without developing symptoms. Recovered/dormant individuals are classified as those who are LST positive, have recovered from their KA symptoms or have sero-deconverted from positive to negative rK39 ELISA. Recovered/dormant individuals may relapse to KA or return to being susceptible. Deaths due to KA and other causes (state 5) are included in the dataset, so individuals may be absorbed from any of states 1 to 4 into state 5.

Fig. 1 Flow diagram for multi-state Markov model of natural history of VL

Table 2 Classification of individuals into different disease states in multi-state model
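A simplified reading of the classification rules described in the text can be sketched as follows. The argument names, the order in which the rules are checked, and the omission of the censored states for missing tests (Table A1 in Additional file 1) are all assumptions of this sketch rather than details taken from Table 2.

```python
def disease_state(dead, current_ka, past_ka, elisa_positive,
                  lst_positive, sero_deconverted=False):
    """Return the model state (1-5) for one observation of one individual.

    1 = susceptible, 2 = asymptomatically infected, 3 = active KA,
    4 = recovered/dormant, 5 = dead.
    """
    if dead:
        return 5
    if current_ka:
        return 3
    # Assumed precedence: any marker of past infection places the
    # individual in the recovered/dormant state.
    if lst_positive or past_ka or sero_deconverted:
        return 4
    if elisa_positive:
        return 2
    return 1
```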

The transition intensity matrix for this 5-state model of VL is

$$ Q=\begin{pmatrix} q_{11} & q_{12} & 0 & 0 & q_{15} \\ 0 & q_{22} & q_{23} & q_{24} & q_{25} \\ 0 & 0 & q_{33} & q_{34} & q_{35} \\ q_{41} & 0 & q_{43} & q_{44} & q_{45} \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}. $$

We will assume that the \( q_{rs} \) are independent of the number of individuals in each state, and thus are time-independent. For this model, the proportion of those asymptomatically infected who develop symptoms (excluding individuals that die) is the ratio of the transition rate from asymptomatic infection to KA to the total rate of progression to KA or recovery:

$$ \mathrm{Probability}\ \mathrm{of}\ \mathrm{developing}\ \mathrm{symptoms}=\frac{q_{23}}{q_{23}+{q}_{24}}. $$
(1)
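The ratio in (1) follows from a standard property of competing exponential waiting times: with constant intensities, the times to KA and to recovery are exponentially distributed, and the probability that KA comes first is its rate divided by the sum of the two rates. The sketch below checks this by simulation. The function name, the default sample size and the example rates are hypothetical.

```python
import random


def simulate_symptom_fraction(q23, q24, n=20000, seed=1):
    """Fraction of simulated individuals who reach KA before recovery."""
    rng = random.Random(seed)
    progressed = 0
    for _ in range(n):
        time_to_ka = rng.expovariate(q23)        # exponential with rate q23
        time_to_recovery = rng.expovariate(q24)  # exponential with rate q24
        if time_to_ka < time_to_recovery:
            progressed += 1
    return progressed / n


# Hypothetical rates of 1 and 3 per unit time: analytic value is 1 / (1 + 3).
simulated = simulate_symptom_fraction(1.0, 3.0)
analytic = 1.0 / (1.0 + 3.0)
```

With these rates the simulated fraction should lie close to the analytic value of 0.25, up to Monte Carlo error.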

Model fitting

The model was fitted using the multi-state modelling package msm in R [44]. This package allows for the fact that some observations are exact (such as dates of death) and others are censored (such as the date of seroconversion) (see Additional file 1 for further details). The package estimates the transition intensity matrix Q and its confidence intervals by maximising the likelihood of the model given the data (see Additional file 1 for full details). The model also estimates the durations of the asymptomatic and symptomatic stages (the waiting times in states 2 and 3). The BFGS (Broyden-Fletcher-Goldfarb-Shanno) optimisation method (a quasi-Newton hill-climbing method that uses analytic derivatives for the optimisation [45]) in the optim function was used for finding the maximum likelihood. The confidence interval for the proportion of asymptomatic individuals that develop KA was calculated by bootstrap resampling of the data and refitting of the model with 1000 bootstrap samples [44].
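The bootstrap step can be illustrated with a generic percentile interval. This is only a sketch of the resampling idea: the statistic here is a simple proportion standing in for the full refit of the multi-state model, the data are hypothetical, and the function is not the routine used in the analysis.

```python
import random


def percentile_bootstrap_ci(data, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Resample individuals with replacement, then recompute the statistic
        # (in the real analysis this step would refit the model).
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    lower_index = max(0, round(alpha / 2 * n_boot))
    upper_index = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return estimates[lower_index], estimates[upper_index]


def proportion(values):
    return sum(values) / len(values)


# Hypothetical outcomes: 15 of 100 infected individuals develop symptoms.
toy_outcomes = [1] * 15 + [0] * 85
ci_lower, ci_upper = percentile_bootstrap_ci(toy_outcomes, proportion)
```

Resampling whole individuals rather than single observations keeps each person's sequence of states intact, which matters for longitudinal data.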

Results and discussion

Delays to treatment

Figure 2(a)-(c) show the distributions of onset-to-treatment, onset-to-diagnosis and diagnosis-to-treatment times for all KA patients for whom these times were recorded, along with the median, mean and standard deviation of each distribution. The median time of 120 days from onset to start of treatment is much longer than figures recently reported for Bangladesh (58 days) [20] and Nepal (55 days), but is comparable with those for Bihar (104 days) [41]. The discrepancy between our data and recently reported figures likely reflects the poorer state of the health care system and the greater cost of treatment in Bangladesh at the time, and the fact that there was no active detection programme for KA cases before 2005 [8, 46]. However, methodological differences between the studies may also account for some of the difference; for example, the dates of symptom onset and treatment from 1999 to 2002 were retrospectively ascertained in our study, so may be subject to recall bias.

Fig. 2 Delays to treatment. Distributions of (a) onset-to-treatment time, (b) onset-to-diagnosis time, and (c) diagnosis-to-treatment time. Sample sizes (n), medians, means and standard deviations (SDs): (a) n = 147, median = 120 days, mean = 133 days, SD = 90 days; (b) n = 67, median = 90 days, mean = 111 days, SD = 94 days; (c) n = 64, median = 12 days, mean = 24 days, SD = 35 days

Risk of progression to KA

In the analysis of the association between rK39 sero-status and progression to KA, 1,515 individuals who had not had KA previously and who had a serological measurement at baseline were included, amongst whom there were a total of 43 KA cases. The Kaplan-Meier curve in Fig. 3(a) illustrates that being strongly seropositive at baseline was much more predictive of progression to clinical symptoms than moderate seropositivity or negative serology: 29 % of strongly seropositive individuals progressed to KA compared to 3 % and 2 % of moderately seropositive and seronegative individuals, respectively. Testing these associations using Cox regression modelling showed that there was relatively little difference in the risk of progression for seronegative and moderately seropositive individuals, but a much higher hazard ratio (HR) for those with strong seropositivity (HR 17.7, 95 % CI 8.05–38.8, Table 3). This matches the analysis of Hasker et al. [28], which found hazard ratios for progression to KA ranging from 1.6 to 4.9 for seropositive individuals and from 7.7 to 39.6 for strongly seropositive individuals compared to seronegative individuals in 4 cohorts studied in Bihar, India and Terai, Nepal. However, the proportion of strongly seropositive individuals progressing to KA in the Bangladesh study was much higher than in all of the cohorts in [28] (29 % compared to 1.1–7.7 %) except for the TMRC cohort in Muzaffarpur, Bihar that was selected based on high reported KA incidence in the year prior to the study (where the proportion was 23.3 %).

Fig. 3 Kaplan-Meier curves for risk of progression to KA. Progression risk (with censoring) by (a) serology status at baseline, and (b) seroconversion from baseline survey (2002) to second survey (2003). Dots show where individuals were lost to follow-up; dashed lines show 95 % confidence intervals

Table 3 Progression to KA depending on baseline rK39 sero-status. Hazard ratios and p-values estimated from fitted Cox proportional hazards regression models

As well as sero-status at baseline, seroconversion was an important marker for progression to KA. The seroconversion analysis was performed using 1,372 individuals that had rK39 ELISA readings from both the 2002 and 2003 surveys, 33 of whom developed KA. As expected from the figures shown in Table 3, a transition to strong seropositivity from either negative or moderately positive serology at baseline was associated with a high progression rate to KA compared with no seroconversion (HR 165, 95 % CI 74.6–365). Individuals that were strongly seropositive at both surveys also had a high risk of developing clinical symptoms (HR 61.5, 95 % CI 19.3–196). Seroconversion to moderate seropositivity was associated with an approximately 5-fold increase in risk of KA over no seroconversion, but the difference in progression for sero-deconvertors and non-convertors was not significant (Table 4). These results are similar to those for the highly endemic villages in Muzaffarpur, Bihar in [28], which showed a hazard ratio for KA of 123.9 for individuals that became strongly seropositive compared with those that remained seronegative. However, a far greater proportion of high-titre seroconvertors progressed to KA in our data than in the previous study (13/18, 72 % as opposed to 9/37, 24.3 %). Also, unlike in this analysis, there was no significant association between moderate-titre seroconversion and progression to KA in the previous study. This is likely to be partly due to the differences, described in the Methods section, in the definitions of the cut-off values for seropositivity and strong seropositivity between the two studies.

Table 4 Progression to KA depending on change in serology status from first survey to second survey. Hazard ratios and p-values estimated from fitted Cox proportional hazard models. Sero-status: seronegative (−), seropositive (+), strongly seropositive (++)

The strong association between seroconversion to a high rK39 antibody titre and progression to KA in the study data suggests that rK39 ELISA could be used to predict KA prior to the onset of symptoms. However, since most high-titre seroconvertors who developed clinical symptoms did so before the second survey (Fig. 3(b)), more frequent testing would be required to use rK39 ELISA to pre-diagnose KA. Furthermore, even if practical constraints allowed for such testing, there is no suitable prophylactic treatment available at present, due to the toxicity of drugs currently used to treat KA.

Natural history of VL

We first fitted the Markov model in its simplest form with constant transition intensities, i.e. treating individuals in the same state as having the same risk of infection and disease. The estimated transition intensity matrix for this 5-state model is

$$ Q=\begin{pmatrix} -0.22 & 0.21 & 0 & 0 & 0.005 \\ 0 & -2.49 & 0.36 & 2.11 & 0.02 \\ 0 & 0 & -2.61 & 2.48 & 0.13 \\ 0.31 & 0 & 0.01 & -0.33 & 0.006 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, $$
(2)

and the negative log-likelihood of the model is − log L = 1760.5 for the fitted intensities. The movement of individuals through the different disease states in the model with the estimated transition intensities is simulated in Additional file 2.
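For a time-homogeneous Markov model, the probabilities of being in each state after a time t are given by the matrix exponential P(t) = exp(tQ). The sketch below evaluates this for the fitted intensities using a truncated Taylor series with scaling and squaring, in plain Python. Two assumptions are made: the diagonal is recomputed from the rounded off-diagonal entries so that each row sums exactly to zero, and the helper names are illustrative. This is a standard calculation and not a reproduction of the simulation in Additional file 2.

```python
# Rounded off-diagonal intensities from the fitted matrix (2), states 1-5.
OFF_DIAGONAL = [
    [0.0, 0.21, 0.0, 0.0, 0.005],
    [0.0, 0.0, 0.36, 2.11, 0.02],
    [0.0, 0.0, 0.0, 2.48, 0.13],
    [0.31, 0.0, 0.01, 0.0, 0.006],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]


def with_diagonal(off_diagonal):
    """Return Q with each diagonal entry set to minus the row's exit rate."""
    Q = [row[:] for row in off_diagonal]
    for r, row in enumerate(Q):
        row[r] = -sum(x for s, x in enumerate(row) if s != r)
    return Q


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(Q, t=1.0, squarings=10, terms=20):
    """Approximate P(t) = exp(t Q) by scaling and squaring."""
    n = len(Q)
    scale = t / (2 ** squarings)
    A = [[q * scale for q in row] for row in Q]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes A^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)   # undo the scaling
    return result


Q_FITTED = with_diagonal(OFF_DIAGONAL)
P_ONE = transition_probabilities(Q_FITTED, t=1.0)
```

Each row of `P_ONE` gives the distribution over states one time unit later for an individual starting in that row's state; the last row stays at the absorbing dead state.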

Proportion of asymptomatic individuals that develop KA

The estimated intensities show that for asymptomatically infected individuals the probability of developing clinical symptoms (from (1)) was

$$ \frac{q_{23}}{q_{23}+{q}_{24}}=\frac{0.36}{0.36+2.11}=0.147, $$

with a 95 % bootstrap confidence interval (CI) of 0.126–0.200, and that recovery from KA with correct treatment was nearly 20 times as likely as dying from KA (\( q_{34}/q_{35} = 19.5 \)). This estimate for the proportion of asymptomatic individuals that develop KA, 14.7 % (95 % CI 12.6–20.0 %), is much larger than the figure of 0.33 % (95 % CI 0.22–0.49 %) estimated by Stauch et al. [33] from their SIRS model of VL transmission fitted to the KALANET trial data, and the figure of 4 % used in the transmission model of Medley et al. [20] based on the average progression to KA from baseline seropositivity in the KALANET data [28]. It is, however, similar to the proportion of DAT seroconvertors that developed KA in high-endemicity villages in the KALANET trial (10.1 %) [47], and the KA progression rate of asymptomatics identified by rK39 and PCR positivity in a study in two highly endemic villages in Bihar in 2005–2006 (23 %) [24]. It also agrees reasonably well with the 4:1 ratio of cases of seroconversion to KA reported by Bern et al. [31] for the data from 2002 to 2004. The variation in these estimates may reflect differences in various factors between the different study locations and periods, including parasite virulence, the immune and nutritional status of the host population, and the part of the epidemic curve the study population was on (incidence was climbing steeply in Fulbaria in 2000–2004 [7]). However, Stauch et al.’s estimate may be a considerable underestimate of the actual proportion that progress to KA due to the rapid cycling of individuals through asymptomatic infection in their model caused by fitting to cross-sectional data. It is likely, therefore, that Stauch et al. underestimate the contribution of KA patients to transmission relative to that of asymptomatic individuals.
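The two ratios quoted above can be recomputed from the rounded entries of the fitted matrix (2). With the rounded values the results are roughly 0.146 and 19.1; the small gap to the published 0.147 and 19.5 presumably reflects rounding of the printed intensities, although the source does not say so explicitly.

```python
# Rounded intensities as printed in the fitted matrix (2).
q23, q24 = 0.36, 2.11   # asymptomatic -> KA, asymptomatic -> recovered/dormant
q34, q35 = 2.48, 0.13   # KA -> recovered/dormant, KA -> dead

# Probability of developing symptoms, as in equation (1).
p_symptoms = q23 / (q23 + q24)

# Relative likelihood of recovering from KA versus dying from it.
recovery_to_death = q34 / q35

print(f"P(symptoms) from rounded rates: {p_symptoms:.3f}")
print(f"q34/q35 from rounded rates: {recovery_to_death:.1f}")
```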

Average durations of asymptomatic infection and KA

The mean waiting times in the different disease states for the estimated transition intensities are shown in Table A2 in Additional file 1. The mean duration of asymptomatic infection was 147 days (95 % CI 130–166 days) and that of symptomatic infection was 140 days (95 % CI 123–160 days). Both these estimates are much longer than estimates from previous models. Stauch et al. [33] estimated that asymptomatic infection lasted on average 72 days (95 % CI 69–75 days) and that KA patients that received successful first-line treatment were cleared of parasites after 31 days and took 105 days on average from developing symptoms to become DAT negative and LST positive (i.e. to fully recover). Medley et al. [20] assumed that the duration of asymptomatic infection is 80 days in their model. Clearly, longer infection and disease durations can lead to increased transmission, as there are more opportunities for sandflies to become infected through feeding on humans. Based on our estimates, asymptomatic individuals are likely to contribute significantly to transmission even if their infectivity to sandflies is low relative to symptomatic individuals, due to the long asymptomatic infection period and the large ratio of asymptomatic to symptomatic individuals (although this proportion is smaller than that estimated by Stauch et al. [33]). If our estimates of the infection durations are representative of current high endemicity areas, more effort should be focused on reducing time to treatment through active case detection and early diagnosis, and greater surveillance of asymptomatic infection is required to identify individuals that are likely to develop KA and to estimate their contribution to transmission.
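In a continuous-time Markov model the mean waiting time in state r is \( -1/q_{rr} \). The sketch below applies this to the rounded diagonal entries of the fitted matrix (2), assuming that the intensities are expressed per year, which is the unit that reproduces the reported durations. The function name is illustrative.

```python
DAYS_PER_YEAR = 365.25  # assumption: intensities in (2) are per year


def mean_sojourn_days(q_rr):
    """Mean waiting time in a state, in days, from its diagonal intensity."""
    if q_rr >= 0:
        raise ValueError("diagonal intensity of a transient state must be negative")
    return -DAYS_PER_YEAR / q_rr


asymptomatic_days = mean_sojourn_days(-2.49)  # state 2
ka_days = mean_sojourn_days(-2.61)            # state 3
```

These values come out within a day of the reported means of 147 and 140 days.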

Immunity

The estimated mean waiting time in the dormant/recovered stage was 1110 days (95 % CI 988–1247 days), which is much longer than that in the model of Stauch et al. [33], where individuals remained DAT positive after asymptomatic infection or KA treatment for 74 days (95 % CI 65–84 days) and LST positive for 307 days (95 % CI 260–356 days) on average, corresponding to a total time in the dormant/recovered stage of 381 days. We note that our estimate for the total time spent immune or with dormant infection is also likely to be an underestimate, due to the decrease in the sensitivity of the LST over the course of the study. Our estimate suggests that cellular immunity to the parasite can last for multiple years after asymptomatic infection or successful treatment for KA, as reported elsewhere [48], rather than being lost within a year as suggested by Stauch et al. Indeed, the age distribution of LST positivity in 2002, when the LST had highest sensitivity (Figure A1 in Additional file 1), shows an increase in the proportion of LST-positive individuals with age, and Bern et al. [31] calculated from the data that there was a 48 % (95 % CI 38–59 %) increase in the chance of being LST positive with each 10-year increase in age, which suggests that cellular immunity wanes slowly. Of 530 individuals who tested LST positive in 2002, only one developed KA over the next two years, compared with 43 of the 1000 individuals who tested negative (relative risk = 0.04, 95 % CI 0.006–0.32), indicating that LST positivity represents effective immunity against KA. Given the increasing prevalence of LST positivity with age, its potentially long duration and the strong protection it offers against KA, control efforts should strive for 100 % detection and treatment of KA cases, particularly among individuals below 30 years of age, who are less likely to be immune. The LST should also form a routine part of epidemiological studies to enable effective monitoring of levels of immunity within the population and better prediction of the risk of KA outbreaks, though this will require the production of leishmanin antigen of sufficient quality to avoid issues associated with low antigen sensitivity [30].
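The relative risk quoted above can be checked directly from the counts given in the text (1 of 530 LST-positive and 43 of 1000 LST-negative individuals developing KA). The sketch below uses a standard Wald interval on the log scale. That this is the method behind the published interval is an assumption, although it gives the same rounded values.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of group A versus group B with a Wald CI on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts taken from the text: LST positive (group A) vs LST negative (group B).
rr, lower, upper = relative_risk(1, 530, 43, 1000)
print(f"RR = {rr:.2f}, 95% CI {lower:.3f}-{upper:.2f}")
```

The wide interval reflects the single KA case among LST-positive individuals, which dominates the standard error.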

Model fit

To assess how well the multi-state model fits the data with the estimated transition intensity matrix in (2), we compared the observed number and prevalence of individuals in each state during the study period from 2002 to 2004 to the expected number and prevalence from the model. As shown in Table A7 and Figure A3 in Additional file 1, the overall fit of the model is good, with the observed and expected prevalences matching very closely for susceptible individuals, KA patients and recovered/dormant individuals, and fairly closely for asymptomatically infected individuals and dead individuals.
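For a time-homogeneous model, one standard way to obtain expected state occupancy at a later time is to apply the transition probability matrix P(t) = exp(Qt) to the initial state counts. The Python sketch below illustrates this with a small pure-Python matrix exponential. The state ordering, all rates and the starting counts are hypothetical, and the procedure is an assumption about how such a comparison can be made rather than a description of the analysis reported in Table A7.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(q, t, squarings=10, terms=12):
    """P(t) = exp(Q t), by scaling and squaring a truncated Taylor series."""
    n = len(q)
    scale = t / 2 ** squarings
    step = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term now holds (Q t / 2^s)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def intensity_matrix(rates, n_states):
    """Build Q from off-diagonal rates {(from, to): rate}; rows sum to zero."""
    q = [[0.0] * n_states for _ in range(n_states)]
    for (i, j), rate in rates.items():
        q[i][j] = rate
        q[i][i] -= rate
    return q


# States: 0 susceptible, 1 asymptomatic, 2 KA, 3 recovered/dormant, 4 dead.
# All rates (per day) and starting counts are hypothetical, for illustration.
rates = {(0, 1): 0.0005, (0, 4): 0.00003,
         (1, 2): 0.0010, (1, 3): 0.0057, (1, 4): 0.00003,
         (2, 3): 0.0068, (2, 4): 0.0003,
         (3, 0): 0.0009, (3, 4): 0.00003}
Q = intensity_matrix(rates, 5)
initial_counts = [1500, 100, 50, 350, 0]

P = transition_probabilities(Q, t=365.0)
expected_counts = [sum(initial_counts[i] * P[i][j] for i in range(5))
                   for j in range(5)]
print([round(c, 1) for c in expected_counts])
```

Expected counts obtained in this way can be set against the observed counts at each survey time. A production analysis would normally rely on an established matrix exponential routine rather than this hand-rolled one.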

We also compared the model to a 6-state model in which asymptomatically infected individuals were split into two separate states, one for those who subsequently progressed to KA (‘pre-symptomatics’) and one for those who recovered without developing symptoms (‘asymptomatics’), to determine whether the duration of asymptomatic infection differed between the two groups (see Additional file 1 for full details). Fitting the 6-state model to the data gave similar results to the 5-state model for the mean durations of the different disease stages (127 days for KA, 95 % CI 113–143 days, and 1108 days for the time spent immune or with dormant infection, 95 % CI 987–1244 days) and for the proportion of infected individuals that develop symptoms (13.8 %, 95 % CI 9.7–19.4 %). The asymptomatic infection durations were also fairly similar, at 135 days (95 % CI 109–167 days) for pre-symptomatics and 159 days (95 % CI 138–183 days) for asymptomatics.

Covariates

A number of factors may be associated with altered KA risk, including sex, age and consistent use of bed nets (Table A3 and Figure A2 in Additional file 1). Individuals aged between 0 and 14 were at highest risk of KA (9.9 %), whereas KA incidence was significantly lower in adults aged over 45, only 2.8 % of whom developed KA over the course of the study. Males had a slightly higher incidence of KA than females (9.2 % compared to 7.8 %), while the use of bed nets more than halved the risk of KA (6.7 % compared to 14.7 %).

To further investigate the effects of these variables on risk of infection and disease, we fitted the model with each variable included as a covariate on the transition intensities. The results are summarised in Table A4 in Additional file 1, which gives hazard ratios for each covariate with 95 % confidence intervals. This analysis allows us to investigate which parts of the disease progression are affected by each covariate. For example, the hazard ratios for q12 and q34 for bed net use are 0.72 (95 % CI 0.52–1.00) and 1.44 (95 % CI 1.06–1.96) respectively, suggesting that bed net use reduces the rate of leishmanial infection by 28 % and increases the rate of recovery from KA by 44 % relative to no bed net use. This is potentially due to bed nets preventing infected sandflies from biting humans and either infecting or reinoculating them, and suggests that, with proper and widespread use, bed nets could form an effective part of VL control.
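The percentage changes quoted for bed net use follow directly from the hazard ratios: a hazard ratio HR multiplies the corresponding transition intensity, so the relative change is (HR − 1) × 100 %. The short Python sketch below applies this to the reported values. The helper names and the baseline intensity in the last step are hypothetical and serve only to show how a covariate scales a rate under a proportional-intensities assumption.

```python
def percent_change(hazard_ratio):
    """Percentage change in a transition rate implied by a hazard ratio."""
    return (hazard_ratio - 1.0) * 100.0


def adjusted_intensity(baseline_rate, hazard_ratio, covariate=1):
    """Proportional-intensities scaling: baseline * HR**covariate."""
    return baseline_rate * hazard_ratio ** covariate


# Hazard ratios (estimate, lower, upper) for bed net use, as quoted in the text.
bed_net_hazard_ratios = {
    "q12 (infection)": (0.72, 0.52, 1.00),
    "q34 (recovery from KA)": (1.44, 1.06, 1.96),
}

for transition, (hr, lower, upper) in bed_net_hazard_ratios.items():
    print(f"{transition}: {percent_change(hr):+.0f}% "
          f"(95% CI {percent_change(lower):+.0f}% to {percent_change(upper):+.0f}%)")

# Hypothetical baseline infection rate (per day) for individuals without bed nets.
baseline_q12 = 0.0005
print(adjusted_intensity(baseline_q12, 0.72))
```

Expressed this way, the interval for the infection rate runs from a 48 % reduction to no change, which is worth keeping in view when interpreting the 28 % point estimate.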

Performing a similar analysis on sex indicates that females have a lower rate of progression from asymptomatic infection to recovery/dormant infection (HR for q24 0.73, 95 % CI 0.57–0.94) and a higher rate of return from recovery/dormant infection to susceptibility (HR for q41 1.36, 95 % CI 1.07–1.72). While Table A3 and Figure A2 in Additional file 1 suggest that the risk of KA generally decreases with age, the risk does not decrease linearly. Individuals aged 15–45 appear to be at increased risk of infection compared with those aged 0–14 (HR for q12 1.31, 95 % CI 0.99–1.73), but are more likely to recover from asymptomatic infection without developing KA (HR for q24 1.35, 95 % CI 1.03–1.77) and less likely to recover from KA (HR for q34 0.75, 95 % CI 0.56–1.00). The risk of death from KA is higher for adults aged over 45 than for children aged 0–14 (HR for q35 5.19, 95 % CI 1.28–21.0). As expected, the risk of death due to other causes is much higher for adults aged over 45 than for children aged 0–14 (HR for q15, q25 and q45 15.9, 95 % CI 4.5–55.4). Table A5 in Additional file 1 shows the probability of developing KA from asymptomatic infection for the different groups for each covariate. The probability is fairly similar across all groups and covariates, at approximately 0.15–0.16, apart from for 0–14-year-olds, who have a higher probability of developing symptoms of 0.17, and those aged over 45, who have a much lower probability of 0.06. These differences have implications for the design of surveillance systems; for example, they suggest that children are likely to be a more sensitive indicator of continued transmission, whereas most infection in adults is asymptomatic. Comparison of the model with each of the covariates to the model without any covariates using the likelihood ratio test and Akaike information criterion (Table A6 in Additional file 1) reveals that including each of the covariates significantly improves the fit of the model to the data. The largest improvement in the fit is given by including age-group as a covariate, which decreases the negative log-likelihood for the model to −log L = 1720.7 (p = 1.8 × 10^−10 for the likelihood ratio test against the model with no covariates).
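The likelihood ratio test and Akaike information criterion used for this comparison can be sketched as follows. Only the negative log-likelihood of the age-group model (1720.7) is taken from the text. The negative log-likelihood of the model without covariates and both parameter counts are hypothetical placeholders, so the printed p-value is illustrative and is not expected to reproduce the reported one.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from the closed forms for df = 2 (even) or df = 1 (odd) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # ... and apply Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(nll_null, nll_alt, extra_params):
    """Test statistic 2 * (nll_null - nll_alt) and its chi-square p-value."""
    statistic = 2.0 * (nll_null - nll_alt)
    return statistic, chi2_sf(statistic, extra_params)


def aic(nll, n_params):
    """Akaike information criterion from a negative log-likelihood."""
    return 2.0 * n_params + 2.0 * nll


nll_with_age = 1720.7        # reported in the text
nll_no_covariates = 1760.0   # hypothetical placeholder
base_params = 7              # hypothetical number of baseline intensities
extra_params = 14            # hypothetical number of added covariate effects

statistic, p_value = likelihood_ratio_test(nll_no_covariates, nll_with_age,
                                           extra_params)
print(statistic, p_value)
print(aic(nll_no_covariates, base_params),
      aic(nll_with_age, base_params + extra_params))
```

The AIC penalises the extra covariate effects, so a lower AIC for the covariate model indicates that the improvement in likelihood outweighs the added parameters.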

Conclusions

By reanalysing a detailed dataset on the development of clinical VL in Bangladesh in 2002–2004, we have been able to provide an independent estimate for the proportion of asymptomatically infected individuals who progress to KA of 14.7 % and an estimate for the asymptomatic infection period of 147 days. Both these estimates are similar to those reported in the literature by other means [32, 34], but much higher and longer respectively than those used in the main previous modelling studies [20, 33, 49].

Our analysis also shows that high rK39 levels, and in particular seroconversion to a high rK39 titre, are good predictors of progression to clinical VL, providing independent support for the results from a previous study [28]. This suggests that it may be possible to screen individuals to identify those who are likely to progress to clinical VL, improve their access to treatment and potentially reduce their infectious period and onward transmission through targeted IRS.

The role of seroconverting and symptomatic individuals in transmission depends not only on the proportion of individuals in each state and the lengths of time they are in each state, but also on their infectivity to sandflies. The relative infectivity of asymptomatics and symptomatics has rarely been studied [50, 51], although one xenodiagnostic study in Ethiopian patients and vectors suggests that there may be marked changes in infectivity with parasitaemia [52], a measure which was not recorded in this study.

We have also highlighted that the time from onset of symptoms to treatment in this area of Bangladesh in the early 2000s was considerably longer than in recent times [41]. During the 2002–2004 study period, Bangladesh experienced a major shortage of sodium stibogluconate, the only KA treatment drug in use at the time [53]. The shortage led to lack of supply in government health facilities and price gouging in the private marketplace [54]. At the time the study began, the median time from onset to treatment was 6 months and the only available drug was provided by the project; this shortened to 3–4 months over the course of the study. Improvements in drug availability after 2005, and especially after the implementation of active case detection and treatment with short course liposomal amphotericin B in Fulbaria [55], may have helped to drive reductions in incidence in Bangladesh by shortening the period symptomatic individuals spent in the community prior to treatment [20].

Despite our limited understanding of the natural history of VL [26], there are only two detailed epidemiological studies of the progression of leishmanial infection and disease. If we are to further refine control strategies to bring VL to local elimination, such studies will be invaluable, particularly if they can assist in identifying individuals who will develop KA or who may contribute most substantially to transmission.

Our intention is to use the results of our statistical modelling to develop transmission dynamics models of VL to evaluate the effect of potential interventions and the feasibility of achieving the 2020 elimination goals. At present, our estimate of the rate of infection, q12, is independent of time and of the prevalence of infection, whereas in a transmission model it should be a function of the number of infectious sandflies, which itself depends on the infectiousness of the human population. Within such a framework, we can also consider the spatial kernel of transmission and the impact of individual circumstances such as livestock ownership, nutritional status and sleeping location. To refine our estimates of the asymptomatic infection period and proportion of infected individuals that develop KA, and to assess the sensitivity and specificity of the rK39 ELISA and LST used in the study, we need to account for misclassification of individuals’ disease states in the multi-state model due to errors in the test results. This can be achieved using a hidden Markov model, in which individuals’ observed states can be misclassifications of their true, underlying disease states.
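The idea behind the hidden Markov approach can be illustrated with the forward algorithm, which sums over all possible true state paths when computing the likelihood of an observed sequence of test-based states. The Python sketch below is a simplified discrete-time version with three hypothetical states. The initial distribution, the transition probabilities between visits and the misclassification probabilities are all assumed values for illustration, not estimates from the study.

```python
def forward_likelihood(initial, transition, emission, observations):
    """Likelihood of an observed state sequence under a hidden Markov model.

    initial[i]       : probability the true state is i at the first visit
    transition[i][j] : probability of moving from true state i to j between visits
    emission[i][o]   : probability of observing state o when the true state is i
    """
    n = len(initial)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * transition[i][j] for i in range(n)) * emission[j][obs]
            for j in range(n)
        ]
    return sum(alpha)


# Hypothetical states: 0 susceptible, 1 infected, 2 recovered/dormant.
initial = [0.8, 0.1, 0.1]
transition = [[0.9, 0.1, 0.0],
              [0.0, 0.6, 0.4],
              [0.2, 0.0, 0.8]]
# Assumed misclassification: e.g. a truly infected individual is observed
# as susceptible with probability 0.15 (imperfect test sensitivity).
emission = [[0.95, 0.05, 0.00],
            [0.15, 0.85, 0.00],
            [0.05, 0.00, 0.95]]

observed = [0, 1, 2]  # susceptible, then infected, then recovered
print(forward_likelihood(initial, transition, emission, observed))
```

Setting the emission matrix to the identity recovers the ordinary multi-state likelihood without misclassification. In a continuous-time setting the transition matrix between visits would be derived from the intensity matrix and the time elapsed.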

The results of this study and the potential for future development highlight the importance of detailed, longitudinal studies for improving understanding of VL and creating datasets that can be used for the design of interventions. As our understanding of the disease develops, the requirements for such data change, indicating that datasets such as these must be continually gathered.