Genotyping of Orientia tsutsugamushi circulating in and around Vellore (South India) using TSA 56 gene

The immunodominant TSA 56 gene of Orientia tsutsugamushi (the scrub typhus agent) has four variable regions (VD-I to VD-IV), making it useful for genotyping. To date, genotyping data from India are based on partial 56 kDa gene sequence analysis. The complete TSA 56 gene sequence is important for identifying the circulating strains and for designing region-specific diagnostics and vaccines. This study was undertaken to determine the Orientia tsutsugamushi genotypes circulating in and around Vellore using complete and partial TSA 56 gene sequences. Of the 379 whole blood samples from suspected scrub typhus patients, 162 were positive by 47 kDa qPCR. The Long protocol, which amplifies the complete TSA 56 gene (≈1605 bp), was performed on 21 samples. On the same 21 samples, partial gene sequences were also amplified using the Horinouchi (≈650 bp) and Furuya (≈480 bp) protocols. Using a combination of Sanger and Nanopore technology, the complete sequence was obtained for 9 samples and a near-complete sequence (1551 to 1596 bp) for 4. As the Furuya protocol gave multiple bands, we derived the 480 bp sequences from the 13 complete gene sequences by in silico analysis. In contrast, 650 bp sequences were obtained directly for 11 samples, while for the remaining two we derived the 650 bp sequences from the complete gene sequences (Long protocol). Phylogenetic analysis of the complete gene (Long protocol), which includes the VD-I to VD-IV regions, and of the partial gene (Horinouchi), which covers the VD-I to VD-III regions, identified identical genotypes: twelve sequences belonged to the TA763 genotype and one to the Karp genotype. The Furuya sequences (in silico) correctly identified the Karp genotype and 10 of the TA763 genotypes. Two TA763 genotypes (identified by complete and 650 bp partial gene analysis) were misidentified as Karp by Furuya sequence analysis.
This limited analysis showed that the commonest Orientia tsutsugamushi genotype circulating in and around Vellore is TA763 and that 650 bp (Sanger) sequencing could be a cost-effective method for identifying scrub typhus genotypes. However, these results need to be validated by larger prospective multi-centric studies.


Introduction
Scrub typhus is a vector-borne acute febrile illness caused by Orientia tsutsugamushi (formerly known as Rickettsia tsutsugamushi), and it is common in India (1). The etiological agent is transmitted to rodents or humans by the bite of infected chiggers, the larval stage of trombiculid mites (2).
Until recently, scrub typhus was thought to be endemic only in the tsutsugamushi triangle (3). There is now evidence of scrub typhus-like infections from the Middle East (Dubai), Africa (South Africa), Europe (France) and Chile in South America (4).
Orientia tsutsugamushi has many antigenic variants and >40 antigenic strains (5). Virulence has been attributed to regional strain differences, suggesting that the virulence of Orientia tsutsugamushi is related to genotype (6). Genotypic classification of Orientia tsutsugamushi is based on variations in the immunodominant outer membrane protein, the 56-kDa type-specific antigen (7). Though all individuals with scrub typhus have antibodies to the 56 kDa antigen, immunity is strain (genotype) specific. The immunogenicity of this antigen has made it a good diagnostic and vaccine candidate (8).
The common method used for genotyping in India is nested PCR amplification of a partial gene segment followed by sequencing, using the protocols described by Furuya and Horinouchi (6,30). In the Furuya protocol, the amplified segment has 418-453 nucleotides and covers two variable domains (VD II and VD III), whereas the Horinouchi protocol covers three variable domains (VD I to III) and amplifies ≈652 of the ≈1600 nucleotides of the gene, thus simplifying the process (6,18).
Partial sequences of the TSA 56 gene do not provide full information about the entire ORF. For accurate classification and analysis, sequencing of the entire ORF (including VD I to VD IV) of the 56 kDa gene is needed (9,19). To date, 90,716 partial gene sequences but only 609 complete gene sequences are available in GenBank (https://www.ncbi.nlm.nih.gov/nuccore?term=tsa56+complete+gene&cmd). The Indian genotype data are based on partial sequences of the TSA 56 gene, and no complete 56 kDa gene data are available to date (Prakash JA, personal communication). We undertook this study to amplify the complete TSA 56 gene in order to definitively determine the Orientia tsutsugamushi genotypes circulating in and around Vellore. Further, we compared the complete TSA 56 gene data with the partial gene sequences obtained using the Furuya and Horinouchi protocols.

Materials and methods:
Sample collection, processing and DNA extraction: Patients of either sex, above 1 year of age, with fever of more than 3 days and less than 10 days duration, with or without eschar/rash, were recruited for this study after obtaining informed consent (EC pmol of the probe. The PCR conditions included an initial denaturation at 95 °C for 5 minutes followed by 40 cycles of 95 °C for 30 seconds and 60 °C for 1 minute. A Ct value ≤ 38 was considered positive (13).

Conventional PCR
Each PCR amplification was performed in a 25 µl reaction volume using the HotStar Taq Plus Master Mix kit (Qiagen, Hilden, Germany). Details of the primers used in this study are given in Table 1. Further, we performed a nested PCR to amplify a 483 bp fragment encompassing the VD-II and VD-III segments of the same gene, using the primers described by Furuya, which were extensively evaluated by Kim et al. (6). As the PCR product obtained using the Furuya protocol showed multiple bands, we performed in silico analysis to determine the genotype.

Next Generation Sequencing:
A total of 18 amplicons (11 partial and 7 complete) were sequenced on a GridION X5 (Oxford Nanopore Technologies, Oxford, UK) using a SpotON flow cell R9.4 (FLO-MIN106) in a 48-hour sequencing protocol. The resulting raw data showed good sequencing coverage and high read depth. High-quality processed data were aligned against the reference sequence (Karp, M33004) and a consensus sequence was generated.
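
The text does not state how the consensus was called, so the following is only a minimal sketch of one common approach: a column-wise majority vote over reads that have already been aligned to the reference. The function name, the `min_depth` parameter and the toy reads are illustrative assumptions, not the pipeline used in this study.

```python
from collections import Counter


def majority_consensus(aligned_reads, min_depth=1, gap="-"):
    """Column-wise majority consensus of equal-length, reference-aligned reads.

    Hypothetical helper for illustration; positions covered by fewer than
    `min_depth` non-gap bases are reported as 'N'.
    """
    if not aligned_reads:
        raise ValueError("no reads supplied")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("aligned reads must all have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        bases = [base for base in column if base != gap]
        if len(bases) < min_depth:
            consensus.append("N")  # not enough coverage at this position
        else:
            consensus.append(Counter(bases).most_common(1)[0][0])
    return "".join(consensus)


# Toy reads (not study data): single-read errors are outvoted at each column.
toy_reads = ["ACGTAC", "ACGTTC", "ACGAAC", "AC-TAC"]
print(majority_consensus(toy_reads))  # ACGTAC
```

A majority vote of this kind is why high read depth matters for Nanopore data: individual reads carry errors, but errors at a given position are rarely shared by most reads.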

In-silico analysis of partial gene
To obtain the 483 bp fragment, the 9 complete and 4 near-complete sequences were aligned with the inner primers P10 and P11 (of the Furuya protocol) using the Clustal Omega sequence alignment tool (21). The region bounded by the inner primers, which covers VD II and VD III, was selected and subjected to phylogenetic analysis. The two samples (CMCOT1 and CMCOT7) that showed low amplification by the Horinouchi protocol were also subjected to in silico generation of fragments by aligning the respective complete sequences with the inner primers RTS 6 and RTS 7.
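
The idea of deriving a partial fragment from a complete sequence can be sketched as a simple in silico PCR. This is a simplified stand-in for the alignment-based approach described above: it assumes exact primer matches, whereas alignment tolerates mismatches. The template and primer sequences below are toy values, not the P10/P11 or RTS 6/RTS 7 primers.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_amplicon(template, fwd_primer, rev_primer):
    """Return the template region spanning both primer sites (inclusive).

    Hypothetical helper: requires exact matches and returns None if either
    primer site is missing. The reverse primer is given 5'->3' as ordered,
    so its binding site on the template strand is its reverse complement.
    """
    template = template.upper()
    fwd = fwd_primer.upper()
    rev_site = reverse_complement(rev_primer)

    start = template.find(fwd)
    if start == -1:
        return None
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Toy example (not study data).
toy_template = "TTGACCATGGCTAAGGTCCATTCGAGGATCCAA"
fragment = in_silico_amplicon(toy_template, "ACCATG", "TCCTCG")
print(fragment, len(fragment))  # ACCATGGCTAAGGTCCATTCGAGGA 25
```

Because the variable domains are prone to point mutations, an exact-match search like this can fail on real sequences; aligning the primers to the template, as done in this study, is the more robust choice.
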
Phylogenetic tree: Both Sanger and NGS sequences were subjected to BLAST analysis. Alignment was performed using Clustal Omega (21) with 37 reference sequences retrieved from GenBank (NCBI), and phylogenetic trees were constructed separately for the complete and partial gene sequences using IQ-TREE, a fast and effective stochastic algorithm for estimating maximum likelihood phylogenies (22).

Analysis of Open Reading Frame (ORF) of complete TSA 56 gene:
The ORF of the 13 sequences was predicted using NCBI Open Reading Frame Finder:
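
As a rough illustration of what ORF prediction involves (this is not NCBI ORFfinder's implementation), the sketch below scans all six reading frames for the longest stretch running from an ATG to the first in-frame stop codon. It assumes ATG as the only start codon, which is a simplification for bacterial genes; the function names and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def longest_orf(seq):
    """Return (strand, start, end, orf) for the longest ATG..stop ORF, or None.

    Coordinates are 0-based, end-exclusive, and on the minus strand they refer
    to the reverse-complemented sequence. ORFs without a stop are ignored.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    orf = s[start:i + 3]
                    if best is None or len(orf) > len(best[3]):
                        best = (strand, start, i + 3, orf)
                    start = None  # look for the next ORF in this frame
    return best


# Toy sequence (not study data).
print(longest_orf("CCATGAAACCCGGGTAGTT"))  # ('+', 2, 17, 'ATGAAACCCGGGTAG')
```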

Discussion
Identification of Orientia tsutsugamushi genotypes circulating in endemic areas is important for designing region-specific diagnostic kits and vaccines, because immunity is genotype specific (7,8). Genotype variation arises from variations in the four variable domains (VD I-IV) of the immunodominant type-specific TSA 56 gene. Of these, VD I-IV is the most variable and is the most important in determining the genotype (12). Knowledge of the circulating genotypes in a given area requires amplification of the complete gene followed by sequencing and phylogenetic analysis (6,10,30,31). In India, such data are not available; we present the first comprehensive genotyping data based on amplification and sequencing of the four variable domains (VD I-IV) of the immunodominant type-specific 56 kDa gene.
In this study, we amplified the complete 56 kDa gene (≈1600 bp) from 21 samples, of which only 13 could be successfully sequenced. Point mutations and recombination are quite common in the variable regions, and these adversely affect primer annealing and the sensitivity of PCR (24). We compared three protocols for 56 kDa gene amplification. The Long protocol covers four variable domains (VD-I to VD-IV), the Horinouchi protocol covers three variable domains (VD-I to VD-III), and the Furuya protocol covers the VD-II and VD-III regions (6,9,18). The complete gene was successfully amplified in 13 samples and fidelity of amplification was confirmed as more than 96% sequence homology was obtained with Based on complete and 650 bp partial gene phylogeny, CMCOT7 and CMCOT11 belong to the TA763 genotype, but 480 bp partial gene phylogeny places the two sequences in the Karp genotype. This discrepancy arises because the 480 bp partial gene sequence covers only two variable domains (VD II and VD III), whereas the 650 bp sequence covers three variable domains (VD I to VD III). Variable domain IV contributes little, as the variations in its sequence are mostly conserved or semi-conserved (7). Therefore, the 650 bp partial gene sequence (Horinouchi protocol) is sufficient to determine the genotype of Orientia tsutsugamushi, as it gives the same result as complete gene analysis. Amplicons derived from the Horinouchi protocol can be sequenced by Sanger sequencing, which has good fidelity for amplicons up to 800 bp (27,28).
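
The agreement between protocols reported above can be expressed as a simple concordance calculation against the complete gene calls. The sketch below encodes only the counts stated in the text (12 TA763 and 1 Karp by complete gene analysis, with CMCOT7 and CMCOT11 called Karp by the 480 bp analysis). Sample labels other than CMCOT7 and CMCOT11 are placeholders, since the text does not list which sample carries which call.

```python
def concordance(reference_calls, test_calls):
    """Return (n_agree, n_total, discordant_ids) over samples present in both."""
    shared = sorted(set(reference_calls) & set(test_calls))
    discordant = [s for s in shared if reference_calls[s] != test_calls[s]]
    return len(shared) - len(discordant), len(shared), discordant


# Genotype calls from complete gene phylogeny (placeholder IDs except CMCOT7/11).
complete = {"CMCOT7": "TA763", "CMCOT11": "TA763", "karp_sample": "Karp"}
complete.update({f"ta763_other_{i}": "TA763" for i in range(1, 11)})

# 650 bp (Horinouchi) calls were identical to the complete gene calls.
horinouchi = dict(complete)

# 480 bp (Furuya, in silico) calls differed for two samples.
furuya = dict(complete)
furuya["CMCOT7"] = "Karp"
furuya["CMCOT11"] = "Karp"

for name, calls in (("Horinouchi", horinouchi), ("Furuya", furuya)):
    agree, total, discordant = concordance(complete, calls)
    print(f"{name}: {agree}/{total} ({100 * agree / total:.1f}%) {discordant}")
# Horinouchi: 13/13 (100.0%) []
# Furuya: 11/13 (84.6%) ['CMCOT11', 'CMCOT7']
```

With only 13 sequences, these percentages are descriptive rather than a reliable estimate of each protocol's accuracy.
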
Our limited data suggest that TA763 is the predominant genotype circulating in Vellore. This genotype has been reported in South East Asia, including Thailand, and in other countries such as China, Australia and Taiwan (28,29). Only one sequence was found to be Karp. Globally, the Karp genotype is reported to account for about 39.5% and is found throughout the endemic region (31). The present study provides the first report on Orientia tsutsugamushi genotypes using the complete 56 kDa gene sequence. Multi-centric studies which include other genotypes are needed to validate this preliminary finding.