Background

Leptospirosis is an emerging zoonotic infectious disease caused by pathogenic Leptospira species. In recent years it has been recognized as a global public health problem because of rising morbidity and mortality in many countries. The disease is frequently misdiagnosed because its protean, non-specific presentation resembles that of many other febrile illnesses, notably viral haemorrhagic fevers such as dengue [1]. Its burden is almost certainly underestimated, owing to a lack of awareness and to under-recognition caused by inadequate use of diagnostic tools.

Humans usually acquire the infection through direct or indirect contact with the urine of infected animals, and the resulting disease is potentially lethal. A distinctive feature of the organism is its ability to parasitize a wide variety of wild and domestic animals [2]. Traditionally, two species were recognized: Leptospira interrogans, comprising the pathogenic leptospires, and L. biflexa, comprising the non-pathogenic ones. The serovar is the basic identifier, defined by serological criteria. To date, nearly 300 serovars have been identified under the species L. interrogans alone, distributed among 25 serogroups of antigenically similar serovars [3].

A classification system based on DNA-DNA hybridization studies was subsequently introduced and now comprises 17 Leptospira species [4–7]. Among these, seven species (L. interrogans, L. borgpetersenii, L. santarosai, L. noguchii, L. weilii, L. kirschneri and L. alexanderi) are considered the main agents of leptospirosis [5, 6]. The enormous inventory of serovars, based mainly on an ever-changing surface antigen repertoire, gives an artificial and unreliable picture of strain diversity. It is therefore difficult to track strains whose molecular identity keeps changing according to the host and the environmental niches they inhabit and pass through.

Besides serological methods, the molecular tools employed so far for sub-classifying and cataloguing leptospiral agents include restriction endonuclease assay (REA) [8, 9], pulsed field gel electrophoresis (PFGE) [10, 11], restriction fragment length polymorphism (RFLP) [12], arbitrarily primed PCR [13], variable number of tandem repeats (VNTR) analysis [14] and fluorescent amplified fragment length polymorphism (FAFLP) [15]. All these techniques, however, suffer from disadvantages that include the need for large quantities of pure, high-quality DNA, low discriminatory power, low reproducibility, ambiguous interpretation of data and difficulty in transferring data between laboratories [14].

Multilocus sequence typing (MLST) is a simple PCR-based technique that uses automated DNA sequencers to assign and characterize the alleles present at different target genes. The method generates unambiguous sequence data, at low to high throughput, that are suitable for epidemiological and population studies. The selected loci are generally housekeeping genes, which evolve very slowly over an evolutionary time-scale [16] and hence qualify as highly robust markers of ancient and modern ancestry. Sequencing multiple loci provides a balance between technical feasibility and resolution. MLST has been applied to many other bacterial species, including Neisseria meningitidis [17], Streptococcus pneumoniae [18], Yersinia species [19], Campylobacter jejuni [20] and Helicobacter pylori [21].
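The allele-assignment step described above can be illustrated with a minimal sketch. It assumes that sequences are already trimmed to the same fragment at each locus, that every distinct sequence at a locus receives the next integer in order of first appearance, and that each distinct allelic profile receives a sequence type (ST) number, which is the usual MLST convention. The names `LOCI`, `assign_alleles` and `assign_sequence_types` are hypothetical and do not come from the study.

```python
from typing import Dict, List, Tuple

# Locus names follow the six genes named in the text; the order is an assumption.
LOCI = ["adk", "icdA", "LipL32", "LipL41", "rrs2", "secY"]


def assign_alleles(sequences: Dict[str, Dict[str, str]]) -> Dict[str, Tuple[int, ...]]:
    """Map each strain to its allelic profile.

    `sequences` maps strain name -> {locus -> trimmed sequence}.
    Each distinct sequence at a locus gets the next integer, in order of first appearance.
    """
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in LOCI}
    profiles: Dict[str, Tuple[int, ...]] = {}
    for strain, loci in sequences.items():
        profile: List[int] = []
        for locus in LOCI:
            seq = loci[locus].upper()
            known = allele_ids[locus]
            if seq not in known:
                known[seq] = len(known) + 1  # new allele number for an unseen sequence
            profile.append(known[seq])
        profiles[strain] = tuple(profile)
    return profiles


def assign_sequence_types(profiles: Dict[str, Tuple[int, ...]]) -> Dict[str, int]:
    """Give each distinct allelic profile its own sequence type (ST) number."""
    st_ids: Dict[Tuple[int, ...], int] = {}
    result: Dict[str, int] = {}
    for strain, profile in profiles.items():
        if profile not in st_ids:
            st_ids[profile] = len(st_ids) + 1
        result[strain] = st_ids[profile]
    return result
```

Because a profile is just a tuple of integers, two laboratories that share the same allele numbering can compare strains without exchanging gels or raw traces.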

The present study is the first attempt to apply MLST to Leptospira; the scheme currently differentiates the species and examines intra- and interspecies relationships. In future, the method could be developed into a highly sophisticated genotyping system, based on integrated genome analysis approaches, that correctly identifies and tracks leptospiral strains. It is expected to greatly facilitate the epidemiology of leptospirosis and to help decipher the origins and evolution of leptospires on a global scale.

Methods

Bacterial strains

Bacterial strains (Table 1) were cultured by the WHO reference laboratory at the KIT Biomedical Research Centre, The Royal Tropical Institute, Amsterdam, The Netherlands (all isolates and reference strains labelled RK3); at the Veterinary Sciences Division (VSD), The Queen's University of Belfast, United Kingdom (reference strains labelled RB3); and at the WHO reference centre at Port Blair, India (labelled isol 15). A total of 120 strains, consisting of 79 isolates and 41 reference strains from different sources and geographical regions, were analyzed by MLST. The 41 reference strains belonged to six Leptospira species (L. interrogans, L. kirschneri, L. noguchii, L. borgpetersenii, L. santarosai and L. alexanderi).

Table 1 Details of leptospiral strains and isolates used for MLST based

Selection and validation of target genes for MLST

The candidate locus sequences were obtained from the Leptolist server for L. interrogans strains Fiocruz L1-130 and Lai 56601. Six genes were selected for MLST analysis (Table 2): adk (adenylate kinase), icdA (isocitrate dehydrogenase), LipL32 (outer membrane lipoprotein LipL32), rrs2 (16S rRNA), secY (pre-protein translocase SecY protein) and LipL41 (outer membrane lipoprotein LipL41). Many rrs2, LipL32 and LipL41 sequences are available in GenBank [2]; PCR primers for these genes were designed from GenBank records, in the conserved regions flanking the variable internal fragments of the target regions. PCR primers for adk, icdA and secY were based on the gene sequences of strains Fiocruz L1-130 and Lai 56601 [22, 23] (Table 2). The Primer 3 software [24] was used to design the PCR primers for amplification of the candidate loci. PCR amplification of the MLST target genes was performed with 1.5 mM MgCl2, 200 μM dNTPs (MBI Fermentas) and 25–50 ng template DNA on a GeneAmp 9700 PCR system (Applied Biosystems, Foster City, USA).

Table 2 Details of gene loci and the corresponding primer sequences used for MLST analysis

Amplification comprised an initial denaturation at 95°C for 5 min, followed by 35 cycles of denaturation (94°C for 30 sec), annealing (58°C for 30 sec) and primer extension (72°C for 1 min), and a final extension at 72°C for 7 min. All amplified fragments were checked on 1.5% or 2% agarose gels with ethidium bromide staining, and the amplicons were sequenced in both directions using the BigDye Terminator cycle sequencing kit (Applied Biosystems, Foster City, USA) on ABI 3100 DNA sequencers (Applied Biosystems, Foster City, USA).
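As a quick check on the cycling programme above, the total programmed hold time can be added up. This is a sketch only: it counts the hold steps stated in the text and ignores ramping between temperatures, which depends on the instrument. The constant and function names are hypothetical.

```python
# Hold times taken from the text, in seconds.
# Ramp times between temperatures are NOT included (assumption of this sketch).
INITIAL_DENATURATION = 5 * 60          # 95 C for 5 min
CYCLES = 35
CYCLE_STEPS = {
    "denaturation_94C": 30,
    "annealing_58C": 30,
    "extension_72C": 60,
}
FINAL_EXTENSION = 7 * 60               # 72 C for 7 min


def total_hold_seconds() -> int:
    """Sum of all programmed hold steps for one PCR run."""
    per_cycle = sum(CYCLE_STEPS.values())
    return INITIAL_DENATURATION + CYCLES * per_cycle + FINAL_EXTENSION


print(total_hold_seconds() / 60)  # 82.0 (minutes of hold time)
```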

MLST data analysis

The electropherograms were viewed in Chromas Lite version 2.01 (Technelysium Pty Ltd, Australia), and the DNA sequences from the forward and reverse reads were aligned using the SeqScape software (Applied Biosystems, Foster City, USA). Low-quality nucleotides were trimmed from the ends by comparison with the reference sequence of the Fiocruz strain, and all processed sequences were subsequently aligned with Clustal X [25]. The Sequence Type Analysis and Recombinational Tests (START) programme [26] was used to determine the guanine-cytosine content, the number of polymorphic sites and the ratio of non-synonymous to synonymous nucleotide substitutions (dN/dS). Phylogenetic analysis was performed in MEGA 3.1 [27] on the concatenated sequence (2980 bp) of each strain, assembled in the order adk, icdA, LipL32, LipL41, rrs2 and secY. The consensus tree was drawn from 1000 bootstrap replicates using the Kimura 2-parameter model.
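For readers unfamiliar with the Kimura 2-parameter model, the pairwise distance it defines can be sketched as follows. This is not the MEGA implementation; it is a minimal illustration of the standard formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. Skipping sites with gaps or ambiguous bases is an assumption of this sketch, and the function name `k2p_distance` is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences of equal length.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent for the model.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

A matrix of such distances over all strain pairs is what a neighbor-joining algorithm takes as input; bootstrapping repeats the calculation on resampled alignment columns.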

Results

Diversity among the candidate loci analyzed

The 5' parts of rrs2, LipL32 and LipL41 and the 3' part of secY were chosen for analysis because nucleotide substitutions are abundant in these regions. The fragments analyzed for the selected housekeeping genes ranged in size from 430 bp (adk) to 557 bp (icdA). The MLST loci are scattered throughout chromosome I of L. interrogans Fiocruz L1-130 (Table 2). The sequences of each locus were aligned separately with Clustal X, and no large insertions or deletions were observed in the selected regions. The rrs2 gene was highly conserved among all the isolates, with 4.42% variable sites. The other genes, LipL32, LipL41, icdA, adk and secY, were considerably more diverse, with 11.3%, 21.04%, 22.8%, 27.2% and 28.7% variable sites respectively. The locus with the highest allelic diversity was icdA, with 51 different alleles among the 120 isolates studied. For all loci except rrs2, the ratio of non-synonymous (dN) to synonymous (dS) substitutions was much less than 1.0, indicating that these genes are not under positive selection pressure (selection acts against amino acid change), whereas rrs2 showed a dN/dS ratio of 1.369, suggesting a high flexibility for amino acid changes. The G + C content of the loci ranged from 39.16% (secY) to 51.92% (rrs2) (Table 3). Synonymous substitutions, which play a role in the divergence of strains, were most frequent in icdA and secY, with 126 different synonymous sites. Non-synonymous substitutions were more frequent than synonymous substitutions in all the genes tested, with the highest numbers, 429 and 423, observed for icdA and secY respectively (Table 3).
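The two simplest per-locus statistics reported above, G + C content and the percentage of variable sites, can be computed from an alignment as in the following sketch. It is not the START programme; treating only A, C, G and T as informative and ignoring gaps and ambiguity codes are assumptions, and the function names are hypothetical.

```python
from typing import List


def gc_percent(sequences: List[str]) -> float:
    """G + C content across all sequences, as a percentage of unambiguous bases."""
    gc = total = 0
    for seq in sequences:
        for base in seq.upper():
            if base in "ACGT":
                total += 1
                if base in "GC":
                    gc += 1
    return 100.0 * gc / total


def variable_site_percent(alignment: List[str]) -> float:
    """Percentage of alignment columns that show more than one unambiguous base."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    variable = 0
    for column in zip(*alignment):
        bases = {b for b in (c.upper() for c in column) if b in "ACGT"}
        if len(bases) > 1:
            variable += 1
    return 100.0 * variable / length
```

For example, an alignment of `["ACGT", "ACGA", "ACGT"]` has one variable column out of four, so `variable_site_percent` returns 25.0.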

Table 3 Allelic diversity parameters observed for the six target genes used for MLST analysis of leptospires

Clustering analysis of Leptospires based on MLST

A neighbor-joining tree was constructed for representative isolates from a 'super locus' of 2980 bp comprising the concatenated sequences of all six loci, fused in the order adk, icdA, LipL32, LipL41, rrs2 and secY. The phylogenetic tree resolved five clusters, within which L. interrogans (56 samples), L. noguchii (4 samples), L. kirschneri (16 samples), L. santarosai (18 samples), L. alexanderi (1 sample) and L. borgpetersenii (26 samples) separated according to their genome species (Figure 1).
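Building the 'super locus' amounts to joining the per-locus fragments of a strain in a fixed order, as sketched below. The dictionary keys and the function name `build_super_locus` are hypothetical; only the locus order is taken from the text.

```python
from typing import Dict

# Concatenation order as stated in the text.
LOCUS_ORDER = ["adk", "icdA", "LipL32", "LipL41", "rrs2", "secY"]


def build_super_locus(loci: Dict[str, str]) -> str:
    """Concatenate one strain's trimmed locus sequences in the fixed order.

    Raises KeyError if any locus is missing, so that incomplete strains
    are not silently given a shorter concatenated sequence.
    """
    missing = [name for name in LOCUS_ORDER if name not in loci]
    if missing:
        raise KeyError(f"missing loci: {', '.join(missing)}")
    return "".join(loci[name] for name in LOCUS_ORDER)
```

Keeping the order fixed matters because every column of the concatenated alignment must correspond to the same position in the same gene for all strains.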

Figure 1 Genetic relatedness among Leptospira isolates based on the concatenated sequences of the six housekeeping and candidate gene loci analyzed (see Table 1 for detailed information on isolates/strains). * Unpublished presumptive serological classification.

MLST analysis also clearly identified each of the field isolates to the species level. In general, the resulting classification agreed with the previous taxonomic status of these isolates, determined either by serological criteria or by genomic methods such as FAFLP (data not shown). For two isolates the serological classification appeared to conflict with the MLST identification: INT 46, L. interrogans serovar Lyme, and SAN 18, L. santarosai serovar Copenhageni. In these cases the serovar designation rests on preliminary serological analysis, which may be incorrect. L. alexanderi was found to be genomically highly similar to L. santarosai and clustered accordingly; it could therefore be a subspecies of L. santarosai.

L. interrogans isolate SAN 17 from Costa Rica, indicated as a putative new serovar (Table 1), together with another L. interrogans member belonging to serovar Muelleri of serogroup Grippotyphosa, formed an isolated branch within the L. interrogans cluster. This argues for a separate taxonomic status, possibly another subspecies of L. interrogans.

Discussion

The present study was a first attempt to develop MLST for Leptospira species; the main objective was to select housekeeping and candidate genes that are species specific, stable and slowly evolving. The availability of the complete genome sequences of L. interrogans Lai 56601 and Fiocruz L1-130 helped us to select the candidate loci. A genetically diverse group of strains was used to evaluate sequence diversity among the tested genes. The six genes selected resolved a wide variety of genotypes among the isolates analyzed, which indicates significant heterogeneity and sequence variation at each locus (Table 3).

The six loci selected were found to be suitable for MLST: they could be amplified and sequenced in all the isolates irrespective of species, they are unlinked on chromosome I of L. interrogans, and they exhibit a modest degree of sequence diversity and resolution. A total of 585 polymorphic sites were observed in the 'super locus' of 2980 bp. Non-synonymous sites were more abundant than synonymous sites (Table 3), indicating that the amino acid sequence variability possibly represents acclimatization to the specific host and environmental restrictions [2].
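To put the polymorphic-site count in proportion, the fraction of variable positions in the concatenated sequence follows directly from the two figures quoted above; the rounded percentage is computed here and is not a value reported in the study.

```latex
\frac{585}{2980} \approx 0.196
\quad\Rightarrow\quad
\text{about } 19.6\% \text{ of the concatenated positions are polymorphic}
```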

The molecular tools described so far for the characterization of Leptospira have several drawbacks. Methods such as PFGE, RFLP and REA need large quantities of purified DNA, involve tedious methodology, have low discriminatory power, yield data that are hard to interpret, lack reproducibility, require specialized equipment such as contour-clamped homogeneous electric field electrophoresis systems and transfer poorly between laboratories. The VNTR or MLVA techniques described by Majed et al [14] and Slack et al [28] are more specific to L. interrogans. MLST overcomes these disadvantages: the technique is simple and easy to standardize on an automated DNA sequencer, an instrument available in most laboratories, and above all the sequence data generated are unambiguous, specific and explicit. The main advantage of MLST is that data can be shared and compared between laboratories easily through the Internet. To date, a large number of organisms have been typed by MLST, which has proved to be a highly discriminatory technique [29].

MLST analysis of Leptospira strains showed that similar serovars and serogroups belonging to different species do not cluster together (Figure 1). The method is better suited to identifying the species of leptospires, as indicated by the clustering patterns up to the species level (Figure 1). The tree also gives an idea of the phylogenetic organization of Leptospira. L. interrogans appears to form a clonal branch: its isolates are closely related and emerge from L. kirschneri, suggesting that they evolved from this species. L. interrogans and L. kirschneri in turn emerge from the L. noguchii branch, indicating a monophyletic group [2]. Because of the greater sequence diversity observed in all six genes except rrs2, the dendrogram could effectively differentiate L. interrogans, L. kirschneri, L. noguchii, L. santarosai and L. borgpetersenii.

Conclusion

We believe that MLST, through high-throughput genome profiling, can effectively address the issues raised by ever-increasing serotype diversity. It will help to establish the population genetic structure of this pathogen across its diverse host range and under different ecological conditions, and will provide scope for establishing genotype-phenotype correlations. Analyses based on the allelic profiles generated by our method may be used to gain insights into the evolution and phylogeographic affinities of leptospires, as has been done for many other organisms. Large-scale, global genotyping is therefore essential to the study of leptospirosis in different hosts at the population level. Such approaches generate valuable information that can be compiled into databases and searched for strain-specific markers for epidemiology, or used to reconstruct the evolutionary history of the strains in a particular epidemiological catchment area. This task is greatly simplified if the genotypic data are categorized, archived and made electronically portable, so that they can be accessed remotely, compared extensively and retrieved in sets.