Background

Parasitic worm infections, including nematode infections, represent one of the most prevalent problems in human and veterinary medicine (Newton & Munn, 1999). Soil-transmitted helminths, commonly known as intestinal worms, are among the most common infections worldwide, affecting the most deprived communities, where infected people generally cannot afford treatment. More than 2 billion humans are infected with gastrointestinal or tissue nematodes and 3.5 billion are exposed to them, which results in tremendous health and economic problems (Chan, 1997; Hotez, 2008). Because these infections tend to be chronic, they are particularly damaging to severely infected children, giving rise to anemia, growth retardation, impaired cognitive function, reduced productivity, and lowered educational achievement (Cooper & Bundy, 1988; Guyatt, 2000; Nokes et al., 1992).

Every infection represents a competition between the parasite and the host (Playfair & Bancroft, 2008). The important difference between a free-living animal and a parasite of vertebrates is that the parasite must survive and reproduce against the host defense mechanisms (Wakelin, 1996; Wakelin & Walliker, 1996). Parasites produce a variety of molecules that interfere with the host’s defense system, which aims to eliminate the undesirable lodger (Nagaraj, Gasser, & Ranganathan, 2008). This interference is most likely caused by the release of soluble substances that bind, degrade, or otherwise interact with host immune cells (Hewitson, Grainger, & Maizels, 2009; Lightowlers & Rickard, 1988). Molecules generated and released by nematodes that may interfere with the host immune responses include proteases, protease inhibitors, antioxidants, and homologs of host mediators and their receptors (Bungiro & Cappello, 2004).

The genus Strongyloides includes about 50 species of intestinal parasitic worms of vertebrates such as mammals, birds, reptiles, and amphibians (Grove, 1989; Viney & Lok, 2007). S. stercoralis, the major human pathogen species, is an intestinal nematode that can tolerate host immune attack and survive within the human small intestine for decades; it infects at least 100 million people (Concha, Harrington Jr, & Rogers, 2005; Liu & Weller, 1993). This prevalence is likely underestimated because the diagnostic tests are insensitive, and accurate and sensitive methods still need to be developed (Kramme et al., 2011; Montes, Sawhney, & Barros, 2010).

S. ratti is a parasite of rats that is commonly used as a model nematode. The excretory/secretory proteins (ESP) from infective larvae (iL3), parasitic females (pF), and free-living stages (flS) of S. ratti were identified by LC-MS/MS (Soblik et al., 2011). Proteomic analysis of ESP revealed 79 proteins specific for parasitic females, including a prolyl oligopeptidase, small heat shock proteins, and a trypsin inhibitor-like protein (TIL). Sra-TIL peptides were detected in high concentration in the ESP of the parasitic female. Quantitative RT-PCR was done to confirm that Sra-til is a parasitic female-specific gene (Soblik et al., 2011). Previously, we characterized several interesting S. ratti ESP, including heat shock proteins (Sra-HSP10 and Sra-HSP17s) and a homolog of the human macrophage migration inhibitory factor (Sra-MIF), which interact with host immunity in varying ways (Tazir et al., 2009; Younis et al., 2011; Younis et al., 2012).

Sra-TIL was selected in this study for identification, isolation, and cloning of the full-length gene sequence and for prediction of its structure, function, and phylogeny, because it was among the 25 highest scoring proteins found only in pF ESP (Soblik et al., 2011). It is one of the most interesting ESP candidates and may be involved directly in the modulation of the host immune responses.

Methods

Parasite culture, infection, and harvesting of pF

The S. ratti life cycle has been maintained at the Bernhard Nocht Institute for Tropical Medicine (BNITM, Hamburg) since January 2006. The genetically trusted iL3 for the initial infection were kindly supplied by Prof. Dr. Gerd Pluschke from the Swiss Tropical Institute (STI), Department of Medical Parasitology and Infection Biology, Basel, Switzerland. The life cycle is maintained in accordance with the German animal protection law. The experimental protocols have been reviewed and approved by the responsible federal health authorities of the State of Hamburg (Hamburg, Behoerde fuer Gesundheit und Verbraucherschutz: permission number 89/09). Four- to 6-week-old Wistar rats (Rattus norvegicus) from Charles River were used to maintain the cycle by serial passage as described (Keiser, Thiemann, Endriss, & Utzinger, 2008; Viney & Lok, 2007). The different stages were collected using standard methods established in our laboratory (Soblik et al., 2011; Younis et al., 2011).

Computer-based cluster/EST analyses

Peptides related to Cluster SR02054 and its ESTs (expressed sequence tags) were identified and searched on www.nematode.net (Martin et al., 2009). The ESTs were screened for additional sequence information to obtain the longest nucleotide sequence, which represents a partial gene structure. The ESTs constituting this cluster were aligned (using the ClustalW program) and assembled (manually and using the Cap3 program) to obtain the longest possible DNA sequence, which was then translated (http://web.expasy.org/translate/) to obtain the deduced amino acid (aa) sequence.
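The translation step can be illustrated with a minimal Python sketch. It is not the ExPASy Translate tool used in this study; it only applies the standard genetic code in a chosen reading frame. The function name `translate` and the short test sequence are hypothetical and are not taken from the Sra-til ESTs.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate a nucleotide sequence in the given forward frame (0, 1 or 2).

    Codons containing characters other than A, C, G, T are reported as 'X'.
    """
    seq = dna.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    print(translate("ATGAGATTATTTTAA"))
```

Running the three forward frames (and the same on the reverse complement) and keeping the frame with the longest stretch free of `*` is one simple way to pick the deduced aa sequence from an assembled contig.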

RNA isolation and reverse transcription

Total RNA was isolated from freshly prepared pF 7 days after infection by the standard phenol/chloroform method as described before (Tazir et al., 2009). Subsequently, RNA was quantified spectrophotometrically. The quality and integrity of the total RNA were confirmed by mixing 5 μg RNA with loading buffer (Roti®-RNA, ROTH), following the manufacturer’s protocol, and running it on an ethidium bromide-stained gel containing formaldehyde. RNA samples were then treated with RNase-free DNase I (Qiagen, Hilden, Germany) and purified with an RNeasy MinElute spin column (Qiagen) following the manufacturer’s protocol.

A total of ~ 5 μg of purified parasite RNA was used to synthesize the first-strand cDNA using SuperScript® III RT and GeneRacer (Invitrogen). The manufacturer’s instructions were followed except for the antisense primer, oligo dT-T7I (Table 1), which was used at a final concentration of 10 μM.

Table 1 List of primers used for Sra-til gene amplification. All primers were ordered from Eurofins MWG/Operon (https://www.eurofinsgenomics.eu/en/home.aspx)

Polymerase chain reaction (PCR)

The longest partial sequence obtained from the EST alignment and assembly was used to design specific forward and reverse primers (Table 1). Primer quality was checked using the Oligo Calc software (http://biotools.nubic.northwestern.edu/OligoCalc.html).

To obtain the 3′-cDNA ends, the specific forward primer (tilf1) was used for amplification of Sra-til, with oligo dT-T7II as the reverse primer. The 5′-ends were obtained using the gene-specific reverse primer (tilr1) and, as forward primer, 5′ SL-1, which corresponds to the nematode spliced leader sequence (Hunter et al., 1997).

The PCR was done as described before (Sambrook, Fritsch, & Maniatis, 1989). In brief, the PCR conditions were 1× thermo buffer (NEB), forward primer (10–20 pmol), reverse primer (10–20 pmol), dNTP mix (Invitrogen; 10 mM), Taq polymerase (NEB; 0.3 μl), DNA 0.1–0.5 μg, and HPLC-H2O to 50 μl. The denaturation step was set to 4 min at 95 °C, and the elongation was performed at 72 °C for 45 s. The annealing temperatures (52–58 °C) are related to the primers’ properties and their melting temperatures (Tm), which were calculated using the following formula: Tm = (A + T) × 2 + (G + C) × 4. The annealing time was set to 45 s, and 30 cycles were repeated.
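The melting-temperature formula given above (often called the Wallace rule) is easy to check by hand or in a few lines of Python. The sketch below is only an illustration of that formula; the function name `wallace_tm` and the 18-nt example primer are hypothetical and do not correspond to any primer in Table 1.

```python
def wallace_tm(primer):
    """Return Tm in °C using Tm = (A + T) x 2 + (G + C) x 4.

    This rule of thumb is generally applied to short oligonucleotides;
    it ignores salt and primer concentration.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Hypothetical primer: 10 A/T and 8 G/C bases -> 2*10 + 4*8 = 52
    print(wallace_tm("ATGCATGCATGCATGCAT"))
```

An annealing temperature a few degrees below the lower Tm of a primer pair is a common starting point, which is consistent with the 52–58 °C range reported here.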

Agarose gel electrophoresis

The amplification products were checked on agarose gels. Agarose gel electrophoresis was performed as described (Sambrook et al., 1989) using TAE buffer and an agarose concentration of 1% (w/v). DNA was mixed with the DNA-loading buffer (Fermentas), and a DNA ladder was used to monitor fragment size. Five microliters of ethidium bromide per 100 ml agarose solution was added before pouring the gel. The running buffer was 1× TAE and the voltage was set to 100 V. For RNA analysis, the method previously described (Farrell, 1993) was used, with 0.8% (w/v) agarose prepared in 1× MOPS buffer containing formaldehyde at a final concentration of 0.7 M. RNA was mixed with Roti®-RNA loading buffer (Carl Roth, Karlsruhe, Germany), and an RNA ladder was used to monitor fragment size. Gels were run overnight at 25 V in 1× MOPS as running buffer. RNA or DNA bands were visualized using a UV transilluminator and immediately photographed.

Cloning and sequencing

The PCR products were purified with the QIAquick® PCR Purification Kit (Qiagen, Hilden) and then cloned into the pGEM-T Easy vector (Promega Corp., Madison, USA) according to the manufacturers’ protocols. The cloned fragments were subsequently transformed into competent Escherichia coli TOP10 cells (Invitrogen), and the recombinant plasmids were purified from the bacterial cultures with the QIAprep® Miniprep Kit (Qiagen, Hilden), according to the manufacturer’s protocol. The resulting products were sequenced by LGC Genomics GmbH, Germany.

Alignment of the various cloned fragments resulted in the full length sequence. Subsequently, primers including the 3′- and the 5′-ends were designed (see Table 1), which included the codon for the initiating methionine and a stop codon. The full-length open reading frame (ORF) of the Sra-til cDNA was captured by PCR using specific forward primer tilf0, and reverse primer tilr0.

The full-length cDNAs were purified, cloned into the pGEM-T Easy vector, and transformed into TOP10 cells. The results were confirmed by restriction digestion with EcoRI (NEB) according to the supplier’s protocol and then visualized on an ethidium bromide-stained gel. For further sequence confirmation, the fragments were sequenced using the M13 forward and gene-specific primers.

Extraction of gDNA and genomic organization

Genomic DNA was isolated from 100,000 iL3 using the standard phenol/chloroform method. In brief, the parasite material was digested for 3–4 h at 56 °C with proteinase K in ATL buffer (DNeasy kit, Qiagen) under constant agitation. gDNA was treated with RNase A, precipitated with 5.2 M ammonium acetate, and dissolved in 20–30 μl HPLC water. Concentrations were determined by spectrophotometry. PCRs were done for both cDNA and gDNA as described above using the specific forward (tilf0) and reverse (tilr0) primers. The resulting PCR products were compared on a gel, cloned, and sequenced as described above to investigate the genomic organization of the gene.

Sequence and phylogenetic analyses

Homology searches of the nucleotide and protein databases were carried out using the BLAST program with default settings at NCBI (http://www.ncbi.nlm.nih.gov/). Subsequently, for sequence comparison, pattern and profile deduction, function and structure prediction, and further bioinformatic analysis, tools available at the Expert Protein Analysis System (ExPASy) proteomics server (http://expasy.org/tools/) of the Swiss Institute of Bioinformatics were used, including the Simple Modular Architecture Research Tool (SMART). The SignalP program (http://www.cbs.dtu.dk/services/SignalP/) was used for signal peptide prediction. Thereafter, the full-length Sra-til sequence was aligned with the most homologous sequences in the database using the CLUSTALW multiple sequence alignment program. The phylogenetic trees were built using the online tool Phylogeny.fr (Dereeper et al., 2008).

Results

Identification of Sra-til full-length cDNA and gene structure

High-quality RNA from pF (Fig. 1) was used in the cDNA synthesis. Initially, the cluster SR02054 was identified by database searches, performed using the Applied Biosystems ProteinPilot™ search engine (Soblik et al., 2011), as a putative homolog of a scavenger receptor cysteine-rich protein of unknown function from Culex pipiens quinquefasciatus. In subsequent searches on nematode.net, SR02054 was found in Contig 703 and included four EST sequences (kw08g03.y1, kw02a07.y1, ku89b08.y1, and kt91a02.y1).

Fig. 1

Quality and integrity analysis of the total RNA extracted from S. ratti parasitic females. Five microliters of RNA was analyzed on an ethidium bromide-stained gel containing formaldehyde. The RNA quality was confirmed by the detection of distinct 18S and 28S ribosomal RNA bands.

The partial sequence resulting from alignment and assembly of the ESTs of this cluster was used to obtain the full-length Sra-til. After many PCRs with different primers (Table 1), the full-length Sra-til was amplified from both pF cDNA and gDNA and yielded bands of the same size (~1605 bp; Fig. 2), indicating the absence of introns in the Sra-til gene structure. For confirmation, all products were cloned, sequenced, and aligned. The sequences from pF cDNA and gDNA were 100% identical. The final Sra-til sequence has been submitted to the GenBank database under the accession no. KY441615.

Fig. 2

Analysis on an ethidium bromide-stained gel of the PCR products (slightly above 1500 bp) amplified from S. ratti pF cDNA and gDNA.

Sequence and phylogenetic analyses

BLAST search of the complete Sra-til nucleotide sequence showed 99% identity with a part of a sequence in GenBank under accession no. LN609529 (S. ratti genome assembly) at position 13120855–13122459 and shared 71% identity with parts of the S. stercoralis and S. papillosus genome assembly sequences (accession no. LL999048 and LM525578, respectively). In addition, Sra-til nucleotide sequence showed 69% similarity with parts of the genome assembly sequences of both S. venezuelensis (accession no. LM524970) and Parastrongyloides trichosuri (accession no. LM523174).
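As a quick consistency check, the length of the genomic hit can be computed from the reported coordinates and compared with the ~1605 bp amplicon. The sketch below assumes the coordinates are 1-based and inclusive, as is usual for BLAST reports; the function name `span_length` is hypothetical.

```python
def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates of the hit in the S. ratti genome assembly (LN609529),
# as reported in the text above.
SRA_TIL_HIT = (13120855, 13122459)

if __name__ == "__main__":
    print(span_length(*SRA_TIL_HIT))  # 1605
```

Under that assumption the hit spans 1605 bp, which matches the size of the full-length product amplified from cDNA and gDNA.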

However, BLAST analysis of the complete translated Sra-TIL aa sequence showed low similarities (31–43%) to homologous protein sequences from other organisms, including nematodes, in GenBank.

From the full-length cDNA sequence of Sra-til, a predicted amino acid (aa) sequence of 534 residues was deduced, with a theoretical molecular weight of ~57.8 kDa and a calculated pI of 8.18. The Simple Modular Architecture Research Tool (SMART) analysis predicted six repeats of the epidermal growth factor (EGF)-like domain in the translated aa sequence of Sra-TIL (accession no. ARQ31643), at aa positions 43–77, 229–266, 289–330, 357–394, 422–459, and 482–523 (Fig. 3). A signal peptide for secretion (MRLFISLLLILTITLSVVTS) was detected by the SignalP v4.0 program at positions 1–20. Only one region of low compositional complexity was detected by the SEG program, at positions 268–279.

Fig. 3

SMART analysis of the Sra-TIL protein sequence predicted six EGF domains and a TIL domain. A signal peptide for secretion was detected by the SignalP v4.0 program at positions 1–20. Only one region of low compositional complexity was detected by the SEG program, at positions 268–279.

In addition, a protein families database (Pfam) search showed a hit for a TIL domain at aa positions 154–208. The TIL domain is found in proteinase inhibitors and typically comprises ten cysteine residues that form five disulfide bonds. Comparison of the Sra-TIL domain sequence (55 aa) with related TIL domains from other parasitic nematodes and insects (Fig. 4) showed 12 conserved residues, with complete (gap-free) reactive site regions (inhibitory loops) lying between the fifth and sixth cysteines (Schechter & Berger, 1967) of the TIL domains analyzed in this figure.
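The predicted features reported above can be collected in a small table and checked for internal consistency (lengths, order along the protein, no overlaps). The following Python sketch uses only the coordinates stated in the text; the class `Feature`, the function `architecture`, and the feature labels are hypothetical names chosen for illustration, and the sketch does not reproduce any SMART, Pfam, SignalP, or SEG computation.

```python
from dataclasses import dataclass

PROTEIN_LENGTH = 534                      # deduced Sra-TIL length (aa)
SIGNAL_PEPTIDE = "MRLFISLLLILTITLSVVTS"  # reported SignalP prediction


@dataclass(frozen=True)
class Feature:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self):
        return self.end - self.start + 1


# Coordinates as reported in the text (SMART, Pfam, SignalP, SEG).
FEATURES = [
    Feature("signal", 1, 20),
    Feature("EGF", 43, 77),
    Feature("TIL", 154, 208),
    Feature("EGF", 229, 266),
    Feature("low_complexity", 268, 279),
    Feature("EGF", 289, 330),
    Feature("EGF", 357, 394),
    Feature("EGF", 422, 459),
    Feature("EGF", 482, 523),
]


def architecture(features):
    """Return feature names from N- to C-terminus, rejecting overlaps."""
    ordered = sorted(features, key=lambda f: f.start)
    for left, right in zip(ordered, ordered[1:]):
        if right.start <= left.end:
            raise ValueError(f"{left} overlaps {right}")
    return [f.name for f in ordered]


if __name__ == "__main__":
    for f in sorted(FEATURES, key=lambda f: f.start):
        print(f"{f.name:15s} {f.start:4d}-{f.end:<4d} {f.length:3d} aa")
    print(" - ".join(architecture(FEATURES)))
```

With these coordinates the TIL domain comes out at 55 aa and the signal peptide at 20 aa, in agreement with the values given in the text, and none of the predicted features overlap.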

Fig. 4

CLUSTAL-format multiple sequence alignment by MUSCLE (3.8) of 16 selected aa sequences of TIL domains from nematodes (including the Sra-TIL domain, which is highlighted in gray) and arthropods. Sequences are labeled with the GenBank accession number (left) and the species name (right). Symbols: identical residues (asterisks; highlighted in black) and absent residues (dashes). The active inhibitory site is marked with a red boundary.

The Sra-TIL domain differed from the other nematode and arthropod TIL domains, with low to moderate sequence identities (31–46%).
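Pairwise identity values such as these can be derived from two rows of a multiple sequence alignment. The sketch below shows one common convention, counting identical residues over columns in which neither sequence has a gap; alignment tools differ in the denominator they use, so this is an assumption rather than the exact calculation behind the figures above. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns without a gap in either row."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 5 ungapped columns, 4 identical.
    print(percent_identity("ACDE-G", "ACNEFG"))
```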

According to the Pfam entry of the TIL domain (PF01826: http://pfam.xfam.org/family/PF01826), the architectures of serine protease inhibitors identified from nematodes consist either of TIL domains alone (TIL × 1 – TIL × 8) or of TIL domains in combination with other domains, such as the WAP (whey acidic protein) domain in Trichuris muris and Trichinella spiralis, the chaperone DnaJ (Hsp40) domain in T. murrelli, and the DOMON (dopamine beta-monooxygenase N-terminal) domain in Caenorhabditis briggsae. Like one of the T. muris TILs, the predicted Sra-TIL has a single TIL domain, but it is combined with six EGF domains (predicted architecture: Sra-TIL = TIL, EGF × 6). Interestingly, for most nematodes, including S. ratti, the entry PF01826 lists more than one TIL protein, with different architectures and sizes, most of them annotated as uncharacterized proteins.

Based on DELTA-BLAST (Domain Enhanced Lookup Time Accelerated BLAST), phylogenetic analysis of the predicted full-length Sra-TIL aa sequence together with the most closely related nematode serine protease inhibitors (Fig. 5) showed two main clusters. The Sra-TIL studied here belongs to a sub-cluster with Trichinella sp. and C. remanei TILs. It is very closely related to a deduced TIL sequence from S. ratti that was previously submitted to GenBank under accession number CEF69530.

Fig. 5

Phylogenetic tree of 21 TIL orthologs from nematodes, retrieved by DELTA-BLAST (Domain Enhanced Lookup Time Accelerated BLAST) of the deduced Sra-TIL protein sequence and built using the Phylogeny.fr program (http://www.phylogeny.fr/). Bootstrap support values are shown on the branches. The compared proteins are labeled with the GenBank accession number and the species name.

Discussion

Nematode-derived protease inhibitors have been suggested to play multiple roles in the survival of the parasite by suppressing exogenous host proteases (Morris & Sakanari, 1994; Peanasky et al., 1984a, b) and by regulating endogenous proteases implicated in its development and reproduction (Ford et al., 2005; Lustigman, Brotman, Huima, Prince, & McKerrow, 1992). They are also believed to have multiple, specific roles in host immunomodulation, acting on various immune effector mechanisms (Hartmann & Lucius, 2003).

In our previous study (Soblik et al., 2011), peptides of Sra-TIL, a serine protease inhibitor, were found abundantly in the excretory/secretory products of the parasitic female only, and not in the ESP of the infective larvae (iL3) or of the free-living stages. In the same study, a qRT-PCR experiment also showed that Sra-til is a pF-specific gene.

This study agrees with the genomic study (Hunt et al., 2016) which reported that eight different gene families are likely to be important in nematode parasitism in S. ratti, including the family encoding trypsin inhibitor-like proteins.

In the present study, the full-length cDNA encoding a trypsin inhibitor-like protein from the intestinal nematode S. ratti was identified, cloned, and analyzed.

The full-length Sra-til was amplified from both pF cDNA and gDNA and yielded bands of the same size (~1605 bp). Unlike most filarial nematode TILs (Ford et al., 2005), the Sra-til gene structure contained no introns.

Different serine protease inhibitors have been identified from various parasitic nematodes, including Ascaris suum, A. lumbricoides (Gronenborn, Nilges, Peanasky, & Clore, 1990; Peanasky et al., 1984a, b), Anisakis simplex (Morris & Sakanari, 1994), Ancylostoma caninum, A. ceylanicum, A. duodenale (Mieszczanek, Harrison, & Cappello, 2004; Rios-Steiner, Murakami, Tulinsky, & Arni, 2007; Jin et al., 2011), Trichuris suis (Rhoads, Fetterer, & Hill, 2000), and Onchocerca volvulus (Ford et al., 2005; Lustigman et al., 1992), where they are believed to have a variety of functions, either protecting the nematode from intestinal serine proteases or participating in molting and development.

The inhibitory function of these proteins is attributed to the TIL (trypsin inhibitor-like) domains as identified by Pfam. The predicted Sra-TIL has a single TIL domain, combined with six EGF domains (predicted architecture: Sra-TIL = TIL, EGF × 6). The full deduced protein sequence of Sra-TIL showed a hit for a TIL domain of 55 aa; this domain typically comprises ten cysteine residues that form five disulfide bonds. The reactive site of the TIL domain determines the target enzyme, which may be trypsin, elastase, chymotrypsin, or cathepsin G. The Sra-TIL domain has a complete reactive site (inhibitory loop) between the fifth and sixth cysteines (Schechter & Berger, 1967), like some TILs investigated in O. volvulus (Ford et al., 2005).

Nineteen different genes containing TIL domains are found in the Caenorhabditis elegans genome (C. elegans Sequencing Consortium, 1998), while 37 C. elegans proteins matching the TIL domain are listed in the InterPro entry IPR002919, most of them annotated in databases as uncharacterized or hypothetical proteins of unknown function. Proteins with TIL domains have been detected in arthropods (Cierpicki, Bania, & Otlewski, 2000; Lung et al., 2002; Parkinson et al., 2004; Zhu & Li, 2002), where their functions have not yet been fully explained, although they have been reported to be implicated in determining lifespan or to be constituents of venom. TIL family proteins have also been identified in amphibians, where they are suggested to have antimicrobial effects (Ali et al., 2002). In mammals, TIL domains have been characterized in a wide range of proteins, such as von Willebrand factor and zonadhesin, which have been reported to play a part in blood clotting or in sperm gamete recognition and adhesion (Bonthron et al., 1986; Gao & Garbers, 1998).

EGFs belong to a common group of proteins that share a repeat pattern, including a number of conserved Cys residues. Growth factors are implicated in cell recognition and proliferation. The EGF motif is found frequently in nature, particularly in extracellular proteins (Campbell et al., 1990). EGF and the EGF receptor (EGFR) play a fundamental role in wound healing by stimulating the proliferation and migration of cells (Bodnar, 2013; Wenczak, Lynch, & Nanney, 1992). EGFR has also been identified in some parasites, and it has been proposed that EGF signaling is conserved in helminths and might be directly involved in host-parasite molecular interplay (Dissous, Khayath, Vicogne, & Capron, 2006; Vicogne et al., 2004).

The novel structural features of the Sra-TIL polypeptide secreted by this parasitic nematode probably reflect a range of biological functions, including protection against host proteases and host immunomodulation via induction of proliferation of damaged intestinal cells, as well as wound healing properties.

Molecular expression and functional assays are planned for a future independent study to examine the biological functions of recombinantly expressed Sra-TIL, including antigenic properties, cell binding, cytokine release, and enzymatic biochemical experiments.

Conclusions

The full-length Sra-til gene, which encodes an interesting ESP candidate secreted from pF of S. ratti, was identified, isolated, cloned, and sequenced. Both the nucleotide and amino acid sequences were analyzed with various bioinformatic tools, including alignment, BLAST search, structure prediction, and phylogenetic analysis. Sra-TIL is predicted to be secreted (a signal peptide was predicted) from a parasitic intestinal stage of the S. ratti nematode and combines putative host protease inhibition (TIL × 1) with putative cell proliferation/wound healing (EGF × 6) features. This novel predicted architecture probably reflects a range of biological functions, including protection against host proteases and host immunomodulation via induction of proliferation of damaged intestinal cells, as well as wound healing properties.