I. Introduction

A. Ixodes holocyclus

Ticks belong to the Class Arachnida, which includes other arthropods such as spiders and scorpions. Within the Subclass Acari, ticks comprise three families: Argasidae, Ixodidae and Nuttalliellidae [1]. Ixodes holocyclus is a tick species in the family Ixodidae, which comprises the hard-bodied ticks. Ticks are regarded as among the most ancient terrestrial arachnids and possibly the earliest organisms to have evolved blood-feeding capabilities [2].

I. holocyclus is a paralysis tick species native and unique to Australia that is commonly found in areas of high humidity and moderate temperature along the east coast [3, 4]. The primary hosts of I. holocyclus are native marsupials such as bandicoots, whereas livestock, companion animals and humans are considered secondary hosts. Figure 1 describes the lifecycle of the three-host tick I. holocyclus, which comprises four stages: egg, larva, nymph and adult. As with other Ixodidae (hard ticks), I. holocyclus can feed on hosts for a few days up to several weeks [4, 5].

I. holocyclus is also a vector of various diseases, for example Rickettsial Spotted Fever, and is able to transmit pathogens to hosts during infestation [6, 7]. I. holocyclus is capable of inducing paralysis in hosts upon engorgement, which eventually leads to death if left untreated [8]. In addition, I. holocyclus is recognised as causing the most severe and harmful form of tick paralysis compared with its counterparts such as Dermacentor andersoni and Dermacentor variabilis in North America and Rhipicephalus evertsi evertsi in Africa [6]. In contrast to these counterparts, paralysis symptoms caused by I. holocyclus infestations are often irreversible and deaths occur regardless of the immediate treatment given [8]. Between 1900 and 1945, approximately 30 fatal human cases caused by I. holocyclus were reported; the victims were mostly under 4 years of age [5, 9]. Domestic animals are the major victims of tick paralysis, with an estimated 100,000 animals affected by Australian tick paralysis each year [4, 10]. Five to ten percent of affected canines presenting to veterinary clinics are reported to die regardless of tick removal and antivenom administration [11, 12].

The development of paralysis syndrome is dependent on the duration of infestation and the composition of the salivary gland (SG). Generally, it takes 3-5 days for clinical symptoms to appear in victims of I. holocyclus. The clinical presentation includes loss of appetite, voice and coordination. Other severe indications are ascending flaccid paralysis, excessive salivation, asymmetric pupillary dilation and respiratory distress [13].

Figure 1 – The lifecycle of the three-host tick Ixodes holocyclus [5].

B. Tick saliva and sialome studies

The saliva of blood-feeding ticks plays an essential role in effective and successful feeding. A large array of saliva components is known to facilitate tick-host interactions and counteract the hosts’ defences [14, 15]. To provide a comprehensive understanding of tick saliva, next-generation sequencing, which generates high-throughput sequencing data and deep coverage of transcriptomes [16], has been used to generate the salivary gland transcriptome (sialome) of several tick genera [17-20]. Sialome studies revealed that different tick species have a set of novel salivary proteins or protein families, which are unique and beneficial to their own hematophagy (blood-feeding behaviour) [21, 22]. These novel salivary proteins include, for example, anti-haemostatic, anti-inflammatory and immune-modulatory components, which facilitate tick hematophagy [15]. Previous studies on Ixodes scapularis and Rhipicephalus appendiculatus suggested that the variation among tick sialomes is influenced by tick feeding conditions such as the vertebrate host(s) and/or developmental stages [23, 24]. However, of the 243 tick species in the genus Ixodes, only three sialomes have been described to date: I. scapularis [20, 24], Ixodes pacificus [25] and Ixodes ricinus [17, 26].

Two cDNA libraries were constructed in this study, yielding large numbers of transcripts from feeding female adult I. holocyclus ticks engorged on different hosts. This study aimed to provide a preliminary description of the transcriptomes of the Australian paralysis tick, I. holocyclus, by annotating the contigs obtained from the two cDNA libraries, and to understand the differential expression of I. holocyclus protein families when engorged on different hosts.

II. Materials and methods

A. Two cDNA samples

Two pooled RNA samples were prepared from feeding female adult ticks for cDNA library construction and sequencing. Fully engorged female adult I. holocyclus were collected from 2 groups of hosts – (i) paralysed cats and dogs, and (ii) bandicoots. The first sample (SG_FA_CD) and second sample (SG_FA_BA) were SG RNA obtained from female adult ticks collected from cats and dogs with paralysis symptoms and bandicoots, respectively.

B. Next generation sequencing

The cDNA libraries were sent to the Australian Genome Research Facility Ltd (AGRF) for next generation sequencing. The SG_FA_CD cDNA sample was sequenced using Illumina HiSeq. The Illumina sequenced reads were then assessed with FastQC [27] to avoid inherent biases associated with sequencing and the starting libraries. In order to standardize the sequencing reads and to eliminate sequencing errors prior to de novo assembly, the processed Illumina reads were subjected to Content Dependent Read Trimmer (ConDeTri) [28], which removes redundant and low scoring reads. Assemblies of Illumina sequences were performed using the Velvet short read assembler [29] for each odd k-mer value between 59 and 79, then merged using Velvet’s Oases [30]. The second sample, SG_FA_BA, was sequenced using Roche 454 FLX technology. The 454 pyrosequences were assembled with GS Assembler (Newbler) v2.5.3 [31], and the resulting final contig assemblies were examined in this study.
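The k-mer sweep described above can be pictured with a minimal sketch. It only enumerates the odd k-mer values from 59 to 79 and pairs each with an output directory; the directory naming scheme (velvet_k59 and so on) is a hypothetical convention for illustration and is not reported in this study.

```python
# Enumerate the odd k-mer values used for the per-k assemblies.
K_MIN, K_MAX = 59, 79  # range stated in the text


def odd_kmers(k_min: int, k_max: int) -> list:
    """Return every odd k between k_min and k_max, inclusive."""
    start = k_min if k_min % 2 == 1 else k_min + 1
    return list(range(start, k_max + 1, 2))


kmers = odd_kmers(K_MIN, K_MAX)

# Hypothetical directory names: one assembly per k, merged afterwards.
assembly_dirs = {k: f"velvet_k{k}" for k in kmers}

if __name__ == "__main__":
    for k, directory in assembly_dirs.items():
        print(k, directory)
```

Under these assumptions the sweep produces eleven separate assemblies, which are then combined in the merging step.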

C. Contig annotation and contig representation in each library

The contigs of the assembled ESTs (Expressed Sequence Tags) in the two individual libraries were searched using different BLAST programs (blastx, blastn or rpsblast) against various databases, including the non-redundant protein database (NR) of the National Center for Biotechnology Information (NCBI) [32], the Swiss-Prot database [33], the gene ontology (GO) fasta subset [34], the NCBI conserved domains database [35], KOG [36], KEGG [37], PFAM [38] and SMART [39] motifs for contig annotation. All BLAST searches were undertaken using Yabi [40] with the default parameters.
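As an illustration of how a best match per contig might be selected from such searches, the sketch below parses BLAST tabular output (the standard 12-column format) and keeps the highest-scoring hit for each contig. The study does not state the output format or any cut-off, so the E-value threshold and the example rows are assumptions made purely for this sketch.

```python
import csv
import io

# Column order of standard 12-column BLAST tabular output.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5):
    """Keep, for each contig, the hit with the highest bit score.

    max_evalue is a hypothetical cut-off chosen for this sketch.
    """
    best = {}
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        score = float(row["bitscore"])
        if evalue > max_evalue:
            continue  # not a significant hit under the assumed cut-off
        current = best.get(row["qseqid"])
        if current is None or score > current[1]:
            best[row["qseqid"]] = (row["sseqid"], score, evalue)
    return best


# Made-up example rows; contig and subject names are placeholders.
EXAMPLE = (
    "contig_1\tsubjA\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t180.0\n"
    "contig_1\tsubjB\t70.0\t150\t45\t0\t1\t450\t1\t150\t1e-20\t95.0\n"
    "contig_2\tsubjC\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t25.0\n"
)

hits = best_hits(io.StringIO(EXAMPLE))
```

In this sketch, a contig with no hit below the cut-off (contig_2 in the example) is absent from the result, which corresponds to the "no significant hit" case.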

Transcripts with no significant hit to any record in these databases were submitted to the SignalP server [41] to identify possible secreted proteins and to the TMHMM server [42] to detect possible trans-membrane helices. Finally, all transcripts in each individual library were functionally classified using the key words provided by the BLAST matches, considering their significant hits from the different searches. All annotated contigs for each individual library were compiled in a hyperlinked Excel spreadsheet. The number of contigs per functional group in each library was calculated to compare contig representation between the two I. holocyclus libraries.
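The keyword-based classification and per-group counting described above can be sketched as follows. The keyword list here is hypothetical (the study's actual key words are not given), and the "unclassified" label is a placeholder for descriptions that would need manual review.

```python
from collections import Counter

# Hypothetical keyword-to-family map; checked in order, first match wins.
KEYWORD_FAMILIES = [
    ("lipocalin", "lipocalin"),
    ("kunitz", "protease inhibitor"),
    ("serpin", "protease inhibitor"),
    ("cystatin", "protease inhibitor"),
    ("mucin", "mucin"),
    ("glycine rich", "glycine rich"),
    ("antigen 5", "antigen"),
    ("metalloprotease", "enzyme"),
    ("transposase", "transposable elements"),
    ("hypothetical", "hypothetical"),
]


def classify(description):
    """Return the family of the first keyword found in a hit description.

    Contigs without a description (no significant hit) are labelled novel.
    """
    if not description:
        return "novel"
    text = description.lower().replace("-", " ")
    for keyword, family in KEYWORD_FAMILIES:
        if keyword in text:
            return family
    return "unclassified"  # placeholder: would need manual review


def count_families(descriptions):
    """Count contigs per functional group for one library."""
    return Counter(classify(d) for d in descriptions)
```

For example, count_families(["Putative salivary lipocalin", "", "hypothetical protein"]) would assign one contig each to the lipocalin, novel and hypothetical groups.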

III. Results

A. Next generation sequencing and sequencing assembly

A total of 130,071,262 reads were generated for SG_FA_CD using Illumina technology. To improve read quality, the Illumina reads were filtered using FastQC and ConDeTri, which removed 19,968,632 poor-quality reads and 58,020,916 redundant reads, leaving 26,040,857 read pairs (52,081,714 reads) for the SG_FA_CD sample. The assembly of Illumina paired-end reads yielded 41,797 contigs with an average consensus size of 830 bp for the SG_FA_CD library. For SG_FA_BA, 454 pyrosequencing generated 286,047 reads with an average read length of 310 bp. The SG_FA_BA 454 pyrosequencing reads were assembled into 3,772 contigs with an average consensus size of 663 bp.
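The read counts reported for the Illumina library can be checked with simple arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Read counts reported for the SG_FA_CD Illumina library.
raw_reads = 130_071_262
poor_quality = 19_968_632   # removed as poor-quality reads
redundant = 58_020_916      # removed as redundant reads

# Reads remaining after both filtering steps.
retained = raw_reads - poor_quality - redundant

# Paired-end data: two reads per pair.
pairs, leftover = divmod(retained, 2)

if __name__ == "__main__":
    print(f"retained reads: {retained:,}")
    print(f"read pairs:     {pairs:,}")
    print(f"fraction kept:  {retained / raw_reads:.1%}")
```

The retained total and the number of pairs agree with the 52,081,714 reads and 26,040,857 pairs reported above.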

B. Functional annotation

All contigs from the two libraries were annotated and can be viewed in Supplementary files 1 and 2. Column A contains a specific identification code for each contig. The descriptions returned from different BLAST searches against various databases are summarized in column D. Columns E and F describe hits returned from NR and the NR score. Columns G-J contain hits returned with Uniprot and the Uniprot alignment score, Expected-value E, and sequence similarity. Columns K-P describe the Enzyme Commission number, KOG function, KOG general categories, KEGG pathway ID, KEGG pathways and KEGG general category, followed by hits returned from SMART, PFAM ISU, SSU, SignalP and TMHMM.

Based on the best matches, each protein-encoding contig in each I. holocyclus library was functionally annotated in the Excel spreadsheet using the classification of Schwarz et al. [33]. All contigs were functionally classified into 15 main protein families (column C) describing their predicted functions: housekeeping, hypothetical, pathogen-related, transposable elements, novel, putative secreted, saliva, mucin, glycine rich, enzyme, lipocalin, protease inhibitor, toxin like, immunity related and antigen. The 15 protein families were conceptually grouped into 6 categories (column D): housekeeping products, hypothetical proteins, pathogen-related sequences, transposable elements, novel products and secreted proteins.

Housekeeping contigs encoded proteins involved in organismal systems, environmental/cellular processes and signaling, information storage and processing, and metabolism. Hypothetical proteins are proteins labeled hypothetical or putative in NCBI databases, that is, predicted proteins that have not been well characterized. All ticks used for this transcriptome study were collected from the field (including from hosts) and may carry pathogens and/or symbionts. The pathogen-related category comprises the contigs in the two I. holocyclus libraries that originate from pathogens. A small number of contigs were reported to be transposable elements, including both class I (retrotransposons) and class II (transposons) transposable elements, reverse transcriptases and several transposases. Contigs that did not have significant similarity to protein or nucleotide sequences in current databases were classified as novel products and are predicted to be unique proteins in I. holocyclus. The remaining 10 protein families are classified under the category “Secreted”. Descriptions of the secreted protein families, as described in previous studies, are summarized in Table 5. Figures 2 and 3 depict the classification of the categories and protein families in the two I. holocyclus libraries.

B. 1. Functional annotations for Salivary Gland collected from Female Adults collected from paralysed Cats and Dogs (SG_FA_CD)

For the SG_FA_CD library, a total of 21,982 contigs (52.85%) were predicted to encode novel proteins (Figure 2). Hypothetical proteins accounted for 34.84% (n=14,657), while pathogen-related proteins and transposable elements represented 0.23% (n=98) and 0.24% (n=106), respectively, of all the SG_FA_CD contigs. Housekeeping products comprised 8.69% (n=3,746) of the contigs. There were 1,208 contigs (2.87%) encoding secreted proteins. The largest group of secreted proteins (26.82%) was enzymes.

Figure 2 – Functional classification of contig-encoded Ixodes holocyclus proteins in SG_FA_CD library. The overall contig proportion of different protein families/categories is shown, and all contigs categorized under “Secreted” were further divided into different subfamilies.

Table 5 - Brief descriptions and examples for secreted protein family

Figure 3 – Functional classification of contig-encoded Ixodes holocyclus proteins in SG_FA_BA library. The overall contig proportion of different protein families/categories is shown, and all contigs categorized under “Secreted” were further divided into different subfamilies.

B. 2. Functional annotations for Salivary Gland collected from Female Adults collected from BAndicoots (SG_FA_BA).

In contrast to the library assembled from Illumina reads of female adult ticks engorged on companion animals with paralysis symptoms, the largest group of contigs in SG_FA_BA encoded hypothetical proteins, which comprise 36.66% (n=1,383) of all the contigs in the SG_FA_BA library (Figure 3). Novel transcripts are the second most abundant group, making up 26.43% (n=997) of the total contigs, followed by housekeeping proteins (20.44%, n=771), secreted proteins (16.41%, n=619), and pathogen-related proteins (0.05%, n=2). Putative secreted proteins, which are predicted secreted proteins previously identified in other organisms, are the largest group of transcripts among the secreted proteins (23.26%, n=144) in the SG_FA_BA library.
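The percentages reported for the SG_FA_BA library follow directly from the contig counts. The sketch below recomputes them from the counts given in the text; the dictionary keys and function name are illustrative.

```python
# Contig counts per category reported for the SG_FA_BA library.
SG_FA_BA_COUNTS = {
    "hypothetical": 1383,
    "novel": 997,
    "housekeeping": 771,
    "secreted": 619,
    "pathogen-related": 2,
}


def percentages(counts):
    """Express each category as a percentage of all contigs (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


total_contigs = sum(SG_FA_BA_COUNTS.values())
category_pct = percentages(SG_FA_BA_COUNTS)

# Putative secreted proteins as a share of the secreted category only.
putative_secreted_pct = round(100 * 144 / SG_FA_BA_COUNTS["secreted"], 2)

if __name__ == "__main__":
    print(total_contigs)
    for name, pct in category_pct.items():
        print(f"{name}: {pct}%")
    print(f"putative secreted (of secreted): {putative_secreted_pct}%")
```

The category counts sum to the 3,772 contigs assembled for this library. Note that the putative secreted share is computed against the 619 secreted contigs, not against the whole library.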

B.3 Contig representation in the two Ixodes holocyclus next-generation libraries

A comparison was made to study the expression of different protein families in the salivary glands of I. holocyclus collected from different hosts: companion animals with paralysis symptoms (SG_FA_CD) and bandicoots (SG_FA_BA). To visualize the contig representation in each library, the percentages of the six protein family categories and of the secreted protein families were calculated and plotted as comparative graphs (Figures 4 and 5).

Figure 4 – Comparing the abundance of six categories of protein families between salivary gland samples collected from adult female ticks engorged on paralysed companion animals (SG_FA_CD) and bandicoots (SG_FA_BA). Red bars represent the SG_FA_CD library while green bars represent the SG_FA_BA library.

Figure 5 – Comparing the abundance of secreted protein families between salivary gland samples collected from adult female ticks engorged on paralysed companion animals (SG_FA_CD) and bandicoots (SG_FA_BA). Red bars represent the SG_FA_CD library while green bars represent the SG_FA_BA library.

B.4 Comparison of Salivary Gland collected from Female Adults collected from paralysed Cats and Dogs (SG_FA_CD) and Salivary Gland collected from Female Adults collected from BAndicoots (SG_FA_BA) libraries

The SG_FA_CD and SG_FA_BA libraries show marked differences in the secreted and novel categories (Figure 4). For the contigs representing secreted proteins, SG_FA_CD yielded only 2.87%, which is approximately six times less than SG_FA_BA, which yielded 16.41% secreted proteins. In contrast, SG_FA_CD yielded 52.85% of contigs coding for novel products, which is two times more than SG_FA_BA, which yielded 26.43% novel proteins. Figure 5 further demonstrates the differences in the yields of different secreted protein families, most notably for salivary, enzyme, lipocalin, protease inhibitor and immunity-related secreted proteins.
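The fold differences between the two libraries can be recomputed from the reported percentages, as in the minimal sketch below. It compares percentages of contigs only and is not a statistical test of differential expression; the names used are illustrative.

```python
# Reported percentage of contigs per category in each library.
PERCENT = {
    "secreted": {"SG_FA_CD": 2.87, "SG_FA_BA": 16.41},
    "novel": {"SG_FA_CD": 52.85, "SG_FA_BA": 26.43},
}


def fold_difference(a, b):
    """Ratio of the larger percentage to the smaller one."""
    return max(a, b) / min(a, b)


fold = {
    category: fold_difference(values["SG_FA_CD"], values["SG_FA_BA"])
    for category, values in PERCENT.items()
}

if __name__ == "__main__":
    for category, ratio in fold.items():
        print(f"{category}: {ratio:.1f}-fold difference between libraries")
```

Because the two libraries were produced with different sequencing technologies and assemblers, these ratios describe contig representation rather than measured expression levels.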

IV. Discussion

Classical Sanger sequencing was utilized for most previous tick sialome studies [17, 24, 25, 43-46] and only a few have used next-generation sequencing [26, 47]. A comparison between I. ricinus transcriptomes generated using Sanger and next-generation sequencing technologies discovered a higher and more comprehensive transcriptome coverage with next-generation sequencing methodologies, which is beneficial in providing in-depth data towards understanding SG transcript dynamics [17, 26]. The high coverage of next-generation transcriptome sequencing demonstrated a more accurate distribution of sequence reads, which reflects the patterns of gene expression during tick feeding. In addition, even in the absence of a reference genome, next generation sequencing has been proven to produce sufficient numbers of sequence reads and thus a reliable and robust transcriptome assembly for genome studies [48]. In this study, the sialome from I. holocyclus was studied for the first time in the absence of a reference genome. We employed both 454 and Illumina next-generation sequencing methodologies in this study. This is also the first paralysis tick sialome study, which provides a preliminary resource and pioneering step towards a comprehensive understanding of paralysis tick saliva composition.

In order to counteract the host defense system and achieve successful feeding, ticks have developed unique saliva compositions, which have been revealed in previous salivary gland transcriptome (sialome) studies [49, 50]. The transcriptome analysis conducted in this study corroborated previous reports through the identification of ten secreted protein families (Table 5) in the two I. holocyclus cDNA libraries. These protein families were classified as:

Antigen

Two types of antigen families were identified in I. holocyclus: antigen 5 and antigen P23. Members of the antigen 5 protein family are found in the venom of vespid wasps and snakes and were identified previously in many hematophagous tick transcriptomes [51, 52]. However, most of the antigen 5 proteins have no known function. Antigen P23 is a protein identified in I. scapularis that has demonstrated anti-coagulation activity [53].

Enzymes

Some of the enzymes identified in the saliva of hematophagous arthropods were shown to assist with blood feeding. For example, 5’ nucleotidase, which is also called apyrase, hydrolyses ATP and ADP released by damaged cells to prevent activation of platelets and neutrophils [15]. Metalloproteases are another large group of hematophagy-assisting enzymes, which inhibit the production of fibrin and fibrinogen [54]. Endonuclease has been found in the saliva of the mosquito Culex pipiens quinquefasciatus. Endonucleases are predicted to have the ability to lower the viscosity of the skin matrix and allow diffusion of pharmacological components through the host dermis [22]. Chitinases may be involved in anti-fungal activity or housekeeping functions associated with the cuticular structure [49]. Serine proteases may interrupt the fibrinolysis pathways in the host or exert a prophenoloxidase anti-pathogen activity [49]. Other minor enzymes identified in the I. holocyclus adult female SGs include inositol polyphosphate phosphatase, carboxypeptidase, sulfotransferase, dipeptidylpeptidase, leukotriene hydrolase and phospholipase.

Glycine rich

Glycine rich proteins have been identified in sialome studies of ticks, and some were predicted to be salivary cement proteins that are utilized during tick-host attachment [49]. Another group of glycine rich proteins are collagen-like proteins, which function as part of the extracellular matrix of the salivary glands and as part of the tracheolar system [49]. GGY peptides, which are rich in glycine and tyrosine, also belong to the glycine rich protein family. GGY peptides are predicted to have an antimicrobial function [55].

Immunity related

Immunity-related proteins are often present in the saliva of hematophagous insects and ticks [49]. There are two major groups of immunity-related peptides - antimicrobial peptides and pattern recognition proteins. Antimicrobial peptides prevent contamination in the feeding cavity and microbial growth in the ingested blood. Examples of antimicrobial peptides are lysozyme, defensin, microplusin and histidine-rich peptides [49]. Ficolin, ixoderin, peptidoglycan recognition proteins and ML domain containing proteins are examples of pattern recognition proteins. Ficolin and ixoderins activate the invertebrate complement system [56] while peptidoglycan recognition proteins are important for the activation of anti-microbial enzymes [49]. ML domains enable the recognition of lipids important in innate immunity and lipid metabolism [49].

Lipocalin

Lipocalins have a highly conserved barrel structure, which enables them to bind and carry hydrophobic ligands. Lipocalins are ubiquitously distributed and abundantly expressed in the salivary glands of ticks [24, 25, 57]. In ticks, lipocalins are described to be involved in histamine and serotonin binding [58], anti-haemostasis [59], anti-complement activity [60], immunoglobulin binding [61] and toxicity [62]. The lipocalin family is extensive: it is predicted to have more than 20 members and tends to grow in number as diverse functions are discovered in other ticks [62].

Mucin

Mucins are putative glycoproteins with O-galactosylation sites and a chitin-binding domain [63]. This group of proteins is predicted to serve as mechanical protection against pathogen invasion of ticks, by coating the chitinous feeding mouthparts or the feeding lesion [49].

Protease Inhibitor

There are a few protease inhibitor protein families that can be characterized by their protease inhibitor domains, such as the Kunitz domain, serpin domain and TIL (Trypsin Inhibitor Like) domain. Kunitz domain containing protease inhibitors are one of the largest protein families in tick saliva and they target proteases of the S1 family and enzyme complexes such as Xase and prothrombinase [49, 64]. Several members of the Kunitz protease inhibitors are involved in anti-clotting [49], anti-thrombin [65] and anti-platelet [66] activities. Serpins control the clotting systems of vertebrates by regulating serine protease cascades. Serpins of I. ricinus, which are called IRIS, are found to have immunosuppressive properties in cellular assays [67] and weak anti-hemostatic function [68]. Another serpin, RMS-3, found in Rhipicephalus microplus, is predicted to counteract immune responses during the host-parasite interaction [69]. The only characterized TIL domain containing protease inhibitor is ixodidin, which has antimicrobial, anti-trypsin and anti-elastase properties [70]. Cystatins (cysteine proteinase inhibitors) are also protease inhibitors, which mainly inhibit activities of peptidases of families C1 (papain family) and C13 (legumain family) [71]. Thyropins are another example of protease inhibitors. These members are found to contain Thyroglobulin type I repeats that may inactivate cysteine proteases. However, the function of the thyropin family of inhibitors is still unknown [49].

Putative secreted protein

A significant number of contigs found in the I. holocyclus cDNA libraries matched previously described tick proteins, mainly deduced from tick sialome studies. These putative secreted proteins were found to have an open reading frame in the previous studies but were not functionally classified. The consistent presence of these groups of proteins in tick saliva suggests that they have tick salivary gland specific functions yet to be determined.

Salivary proteins

Salp15 is an example of a salivary protein that was previously identified in I. scapularis. Salp15 was shown to inhibit CD4 T cells and to be essential for Lyme disease pathogen transmission [24]. Other salivary proteins are described as 8.9-kDa proteins, 14-kDa proteins, 18.7-kDa proteins and 22.5-kDa proteins, which were previously identified in other Ixodes ticks. These proteins have an open reading frame and a signal peptide; however, their significance and functions remain unknown.

Toxin like

Some proteins with two to six cysteine residues in the mature peptide are reported to be toxin-like. Some of these cysteine-rich proteins in I. holocyclus were previously identified in the I. scapularis sialome. Proteins that produced significant matches with HT-1 [72] are also labeled as toxin-like.

The abundances of the above ten secreted protein families differed between the two I. holocyclus transcriptome libraries, reinforcing the significant role of tick salivary dynamics in tick blood-feeding behavior as previously reported in other tick sialome studies [14]. A large proportion of transcripts (52.85% from SG_FA_CD and 26.43% from SG_FA_BA) encode novel proteins, which had no significant match with any domain or nucleotide and/or protein sequence. This result is consistent with previous sialome studies from several hard and soft ticks, including I. ricinus [17, 26], I. scapularis [24], I. pacificus [25], Rhipicephalus sanguineus [73], Amblyomma maculatum [47], Amblyomma variegatum [19] and Antricola delacruzi [44], which identified a high abundance of proteins and protein families of unknown function. These sialome studies suggest an intensive evolution of the salivary proteins as a consequence of host immune pressure. Hence, a large number of unique protein families and salivary proteins were identified in most of the tick sialomes that have been analyzed [2, 21].

V. Conclusions

This report describes for the first time the transcriptome data of the Australian paralysis tick, I. holocyclus, infesting different hosts, which were resistant and susceptible respectively. The protein families identified are an important contribution to understanding the interactions of I. holocyclus with its hosts. The differential contig representations of the two transcriptome libraries indicate that I. holocyclus feeding on different hosts regulates gene expression differently.

The contig counts used in this study would underestimate the real read coverage. Further analyses such as qPCR are needed to validate the abundances reported here. This transcriptomic study could be further improved by repeating the analysis of the samples with both 454 and Illumina technologies to provide replicate and more robust contig data. Currently, each of the two samples was processed using either 454 or Illumina technology, but not both. The differences observed may therefore be attributable to the inherent differences between the two technologies [74], which reduces the credibility of the results in this study.