Introduction

The therapeutic and prophylactic research efforts over the past 50 years, to address parasitic infection in human and veterinary medicine, have had profoundly different results. The therapeutic-based anthelmintics have been remarkably successful against a broad range of parasites and, although blighted by more recent reports of anthelmintic resistance (Brockwell et al. 2013; Ortiz et al. 2013; Elliott et al. 2015), stand in contrast to the failure of prophylactic research efforts to develop a viable product to effectively prevent trematode infection in both animals and humans.

Trematode parasites are eukaryotic organisms with a large protein repertoire (n > 10,000 predicted for various species) (Young et al. 2012, 2014; Cwiklinski et al. 2015; Haçarız et al. 2015). Diseases caused by these parasites are responsible for substantial production losses in the livestock industry (e.g. fasciolosis) and/or for health problems in humans, as over a million cases per year are reported for trematode infections worldwide (such as schistosomiasis, clonorchiasis and fasciolosis) (Toet et al. 2014; Cwiklinski et al. 2015; Walz et al. 2015; Qian et al. 2015). Previous approaches to developing vaccines against trematode parasites, largely adopted from virus and prokaryotic models, have failed to deliver an effective vaccine to prevent infection. Additionally, whilst recombinant antigens with various adjuvant combinations have been tested in experimental and field trials, none has shown full protection or been commercialised (Golden et al. 2010; Hotez et al. 2010; Toet et al. 2014; McNeilly and Nisbet 2014; van der Ree and Mutapi 2015; You and McManus 2015). Currently, genuine progress in novel anti-trematode drug and vaccine development appears static, with no published reports citing end-stage production of commercial products in human or veterinary medicine.

The application of omic technologies is widely regarded as a possible candidate tool to overcome this stasis (Toet et al. 2014; van der Ree and Mutapi 2015; Molina-Hernández et al. 2015). Adopting a whole proteome analysis approach to trematode parasitism is most useful to help pinpoint the essential parasite proteins relating to virulence, to investigate important parasite protein interactions (proteins that are correlated in expression or additive in their effect) and to identify the parasite’s unique immuno-modulatory mechanisms to prolong infection (Robinson et al. 2009, 2013; Hotez et al. 2010; Haçarız et al. 2012, 2014; van der Ree and Mutapi 2015; You and McManus 2015; Khan et al. 2015).

The integration of omic approaches into molecular studies of trematode parasites has been ongoing since the beginning of this century. Proteins with different molecular functions for various trematode parasites have been identified, and candidate molecules have been proposed as drug and vaccine targets (Wilson et al. 2011; Dalton et al. 2013; van der Ree and Mutapi 2015).

This review focuses on the progress in the omic approach to address trematode parasitism within the past 5 years, and aims to complement and build upon previously published review studies prior to 2010 (Toledo et al. 2011). Targeted at researchers from various fields (e.g. biology, medicine, veterinary medicine, parasitology), the text addresses the tools, applied approaches, advantages and disadvantages of omic studies in trematode research and, importantly, how to progress beyond candidate protein nomination to complete investigative viability studies.

Frequently used omic technologies in trematode research

Two high-throughput omic approaches, liquid chromatography-mass spectrometry (LC-MS) to identify proteins at peptide level, and next-generation sequencing (NGS) to identify genes at nucleotide level, are key tools used to characterise the signature profile of each trematode parasite. Sample preparation and correct experimental design are central to a meaningful outcome from omic studies, which is valid for both proteomics and genomics/transcriptomics research. A workflow integrating LC-MS with NGS is illustrated in Fig. 1.

Fig. 1

Workflow diagram describing LC-MS and NGS (RNA-Seq and WGS), as adopted (Haçarız et al. 2012, 2014, 2015; Haçarız and Baykal 2014). A protein database derived from NGS provides a parasite-specific database, leading to a significant increase in protein identification. ESI electrospray ionisation, qTOF quadrupole time-of-flight, MS mass spectrometry, DDA data-dependent acquisition, DIA data-independent acquisition, GPF gas phase fractionation, PLGS ProteinLynx Global Server, WGS whole genome sequencing, FASP filter-aided sample preparation, GO gene ontology, KEGG Kyoto Encyclopaedia of Genes and Genomes

LC-MS

Earlier proteomic studies commenced by analysing purified proteins of interest (mainly excised from polyacrylamide gels after two-dimensional electrophoresis) through peptide mass fingerprinting using matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry (MALDI-TOF), with an appropriate database search to reveal their identity (Thiede et al. 2005). This method, which is efficient for the identification of targeted proteins (such as antigenic proteins), has been used in trematode research (Acosta et al. 2008; Tian et al. 2013), but the isolation of hydrophobic proteins from the gels after electrophoresis has been reported to be challenging because of protein aggregation (Fischer et al. 2006).

In the recent past (no more than 10 years), researchers in the trematodology field have aimed to elucidate the whole proteome of trematode organisms in order to enhance our biological understanding of the parasites and to develop effective strategies for disease control. This whole proteome approach was favoured because the one-by-one molecule-based strategy (such as the use of a single antigen) failed to yield new, efficient prevention strategies (such as vaccination) in many experimental trials. As the gel electrophoresis-based separation of proteins can be difficult to perform, difficult to reproduce, time consuming and economically inefficient, particularly for a large number of identification targets, the use of separation techniques (such as liquid chromatography) with mass spectrometry took hold amongst the research community.

The LC-MS approach, which is increasingly used in many fields, has been particularly applied to analyse samples containing thousands of proteins and, therefore, can be used to identify large proteomes of trematode parasites. The principle of LC-MS is the chromatographic separation (such as reverse-phase separation based on molecule hydrophobicity) of digested peptides and deduction of amino acid sequences from mass-to-charge ratios (m/z) determined by mass spectrometry (de Hoffmann 1996; Chen and Pramanik 2009). The commonly used version of LC-MS, called LC-MS/MS, previously described in detail (de Hoffmann 1996; Chen and Pramanik 2009; Grebe and Singh 2011; Rauh 2012), typically includes technical stages such as peptide ionisation (by electrospray ionisation), peptide fragmentation (by collision energy), and mass detection of the resulting products (by time-of-flight-based detection), yielding spectral data of peptide and amino acid sequences. Finally, raw data obtained by this approach are searched against a protein database using in silico tools (such as Mascot (Matrix Science; http://www.matrixscience.com/) and ProteinLynx Global Server (Waters; http://www.waters.com/)) to identify proteins in the biological samples.
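As a rough illustration of the mass-to-charge principle described above, the following Python sketch computes the monoisotopic mass of an unmodified peptide and the m/z expected for its protonated ion at a given charge state. The function names (`peptide_mass`, `mz`) are hypothetical, the residue masses are standard monoisotopic values rounded to five decimals, and modifications are ignored. It is a simplified sketch and not a description of what search engines such as Mascot or ProteinLynx Global Server implement.

```python
# Monoisotopic residue masses in daltons, rounded to five decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water per peptide (free N- and C-termini)
PROTON = 1.00728   # mass of a proton


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide (hypothetical helper)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence, charge):
    """m/z of the peptide ion carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge
```

For a given peptide, higher charge states appear at lower m/z, which is why the charge state has to be assigned before a mass can be deduced from a spectrum.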

In addition to protein identification, LC-MS/MS is also useful for protein quantitation using labelled (e.g. isotopes) or label-free methods. In label-free methods, which are now regarded as the more practical and effective choice (Sandin et al. 2015), the relative abundance of proteins can be estimated using an internal standard of known concentration (such as enolase peptides of Saccharomyces cerevisiae) (Haçarız et al. 2012; Haçarız and Baykal 2014) or using a more advanced approach such as peptide ion abundance (Haçarız et al. 2014). Other than protein detection, LC-MS (and also other technologies such as proton nuclear magnetic resonance spectroscopy (NMR)) can also be helpful to detect metabolites, which are small molecules derived from enzymatic reactions in biological systems (Becker et al. 2012).
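The internal-standard idea can be sketched in a few lines of Python. The example assumes that signal intensity scales linearly with the amount of protein loaded, which is a simplification. The function name `estimate_amount` and the choice of femtomoles as the unit are hypothetical and are not taken from the cited studies.

```python
def estimate_amount(protein_intensity, standard_intensity, standard_fmol):
    """Estimate protein amount from a spiked standard of known amount.

    Assumes a linear response: equal intensity means equal amount.
    """
    if standard_intensity <= 0:
        raise ValueError("standard intensity must be positive")
    return protein_intensity / standard_intensity * standard_fmol


# A protein giving half the signal of a 50 fmol standard.
amount = estimate_amount(5000.0, 10000.0, 50.0)
```

Under the linear-response assumption, `amount` is 25 fmol; in practice the intensities would be aggregated over several peptides per protein before such a ratio is taken.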

Overall, LC-MS technology is mainly useful for (a) the identification and quantitation of proteins (including isoforms and splice variants) at peptide level, including protein-protein interactions and ancillary comparative and correlative analysis to PCR-based RNA expression levels (Haçarız et al. 2012, 2014; Schaefer et al. 2012; Larance and Lamond 2015), (b) assessment of post-translational modifications, as previously described (Gupta et al. 2007), (c) understanding microRNA (miRNA) biology, which influences protein expression levels (Huang et al. 2013a) and (d) detection of metabolites which indicate metabolic pathways (Becker et al. 2012).

Important issues for LC-MS

Protein identification

For proteomic studies involving trematodes, careful isolation, recovery and maintenance of the parasite is important for the deduction of meaningful proteomic and genomic data. Procedures to achieve this for trematodes are outlined by Haçarız et al. (2011, 2012, 2014, 2015); however, variations may need to be applied depending on the experimental purpose.

The sample could be prepared using homogenisation (such as beads and homogeniser) and/or detergents (such as nonidet P40, for enriching proteins of surface layers in particular) (Wilson et al. 2011; Haçarız et al. 2011, 2014). Excretory/secretory (ES) proteins could be obtained in a cell culture environment and concentrated with methanol/chloroform precipitation (Robinson et al. 2009). However, enrichment protocols do not guarantee pure isolation of the experimental fractions. A small concentration of surface proteins can be found in ES fractions due to shedding of the tegument in vitro (Morphew et al. 2007) and a quantity of ES proteins can potentially remain in surface fractions (Wilson et al. 2011; Morphew et al. 2013). In addition, the proteome fractions can be influenced by the duration of incubation, as the in vitro conditions can cause damage to the parasites (Wilson et al. 2011). In the case of trematodes, the addition of protease inhibitor cocktails to prevent protein sample degradation is important, as parasitic trematodes release various proteases (Haçarız et al. 2014). Following this, an efficient protein extraction technique, such as filter-aided sample preparation (FASP), is required (Wiśniewski et al. 2009).

In LC-MS-based proteomic studies, proteomic grade trypsin is generally used to digest proteins into tryptic peptides, but using other endoproteases (e.g. Glu C and Asp N) that cleave proteins at different amino acid points could be useful to identify other proteins that have not been identified with the trypsin protocol (Wiśniewski et al. 2009; Swaney et al. 2010). One of the major obstacles of LC-MS/MS technology is that peptides with high relative concentrations may mask the identification of other peptides expressed at lower concentrations (Gingras et al. 2005). This ‘masking effect’ is due to the measurement of protein concentration in starting material, which mostly relates to abundant proteins, whilst the concentrations of some proteins that are relatively low become undetectable for LC-MS (Liu et al. 2006). This problem could be solved by fractionation of proteins/peptides based on their chemical properties, such as isoelectric point and molecular weight, prior to the LC-MS/MS analysis. The integration of different data acquisition approaches (e.g. data-dependent acquisition, data-independent acquisition, gas phase fractionation) could be important to identify more proteins (Haçarız et al. 2014). In addition, the quality and amount of data depend on the performance of the device used (e.g. the resolution capacity of the mass spectrometer).
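The digestion step can be mimicked in silico. The Python sketch below applies the commonly used simplified trypsin rule (cleavage after lysine or arginine, except when the next residue is proline); this rule is an assumption introduced for illustration and is not taken from the studies cited above. The function name `tryptic_digest` is hypothetical, and missed cleavages are not modelled.

```python
def tryptic_digest(protein, min_length=1):
    """Split a protein sequence into tryptic peptides.

    Simplified rule: cleave C-terminal to K or R unless followed by P.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Trailing peptide that does not end in K or R.
        peptides.append(protein[start:])
    return [p for p in peptides if len(p) >= min_length]
```

Replacing the residue set `"KR"` with the specificity of another endoprotease would yield a different peptide population from the same protein, which is the rationale for the complementary digests mentioned above.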

Protein identification is highly dependent on the size of the database provided, as LC-MS/MS requires a database to identify proteins. To date, LC-MS/MS has been applied to a number of trematode parasites, and a number of proteins have been identified by comparing the raw proteomic data with protein data of non-specific organisms and other datasets derived from EST resources. However, in many cases, findings have been limited because of the absence of well-established protein databases due to the lack of transcriptome/genome data for trematode parasites. Since 2010, NGS-based studies have provided a number of genomes/transcriptomes of various trematodes and this trend is likely to continue for others where genomes/transcriptomes are not yet deciphered. The integration of current advanced LC-MS/MS systems with protein databases derived from NGS data (for both genomic and transcriptomic components) would allow researchers to visualise the whole proteome of the trematode parasite. Technical parameters such as sequence coverage, the number of peptides identified, parent and fragment ion error tolerances, and false positive rate (<4 % is accepted in general) are important in the proteomic analysis (Gingras et al. 2005; Wiśniewski et al. 2009; Haçarız et al. 2014). The integration of two main protein identification approaches, called de novo peptide sequencing and database searching, is suggested to increase the identification coverage and accuracy (Wang and Wilson 2013). As LC-MS/MS has been widely used for whole profiling omic studies in trematode research, other useful mass spectrometry techniques (e.g. MALDI-TOF) will not be reviewed here.
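Two of the technical parameters named above, sequence coverage and parent ion error tolerance, can be expressed as simple calculations. The following Python sketch uses hypothetical function names (`sequence_coverage`, `ppm_error`, `within_tolerance`) and assumes that mass error is reported in parts per million; the tolerance value itself would depend on the instrument and is left as a parameter.

```python
def sequence_coverage(protein, peptides):
    """Fraction of protein residues covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        pos = protein.find(pep)
        while pos != -1:
            for j in range(pos, pos + len(pep)):
                covered[j] = True
            pos = protein.find(pep, pos + 1)
    return sum(covered) / len(protein)


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm):
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm
```

Overlapping peptides are counted once per residue, so coverage cannot exceed 1.0 even when the same region is identified repeatedly.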

Metabolite identification

In metabolome-based studies, sample preparation is an important issue and an extraction procedure for metabolome profiling of helminths has been proposed (Saric et al. 2012). After spectral data are obtained from the sample run, the general experimental procedure of the metabolome analysis involves spectral processing (including baseline correction, noise filtering, peak detection and alignment, normalisation and deconvolution), data analysis [including various methods such as univariate, multivariate (supervised/unsupervised) and multi-way] and metabolite identification (for technical details, see Alonso et al. 2015). An organism-specific metabolome database is an important tool to increase the number of identifications (Jewison et al. 2012). Currently, fully annotated metabolome databases are limited for many infectious organisms (Preidis and Hotez 2015), but a comprehensive metabolome database is available for other comprehensively studied organisms such as human (Wishart et al. 2013).
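Three of the spectral processing steps listed above (baseline correction, normalisation and peak detection) can be illustrated with a deliberately naive Python sketch. The methods used here (minimum subtraction, total-intensity scaling, local maxima above a threshold) are stand-ins chosen for brevity and are assumptions; dedicated metabolomics software uses considerably more elaborate algorithms. All function names are hypothetical.

```python
def subtract_baseline(intensities):
    """Crude baseline correction: subtract the minimum intensity."""
    base = min(intensities)
    return [x - base for x in intensities]


def normalise_total(intensities):
    """Scale intensities so that they sum to 1 (total-intensity scaling)."""
    total = sum(intensities)
    if total == 0:
        return list(intensities)
    return [x / total for x in intensities]


def detect_peaks(intensities, threshold):
    """Indices of local maxima whose intensity exceeds the threshold."""
    return [
        i for i in range(1, len(intensities) - 1)
        if intensities[i] > threshold
        and intensities[i] > intensities[i - 1]
        and intensities[i] >= intensities[i + 1]
    ]


spectrum = [2, 2, 6, 2, 2, 10, 2]
processed = normalise_total(subtract_baseline(spectrum))
peaks = detect_peaks(processed, threshold=0.1)
```

In this toy spectrum the two signals at positions 2 and 5 are retained as peaks after the constant offset is removed.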

Metabolome analyses have mainly focused on the study of prokaryotic organisms, yielding useful results. Metabolome analysis of Staphylococcus aureus led to discovery of the target (pyruvate dehydrogenase) of an antibacterial product (triphenylbismuth dichloride) (Birkenstock et al. 2012) in multiresistant bacterial pathogens, and fatty acid metabolism has been found to be altered in rifampicin-resistant Mycobacterium tuberculosis (du Preez and Loots 2012).

NGS

The other important tool in high-throughput sequencing is NGS, which is based on the fragmentation of genetic material with various techniques such as heat, enzymatic cleavage or sonication, followed by attachment and amplification of the labelled gDNA or complementary DNA (cDNA) fragments on an appropriate surface (e.g. chip, flow cell), and deduction of sequences of the attached fragments (∼100–400 bases) using optical signals (such as in 454 sequencing, Roche; HiSeq 2000/2500, Illumina) or pH change (Ion Torrent, Life Technologies) (Cantacessi et al. 2012a; Mardis 2013). The base information of each sequence is called a ‘read’ or ‘sequence read’. These technologies enable extensive parallel sequencing, which results in large volumes of data (at gigabase levels for trematodes).

In general, NGS of whole cDNA (reverse-transcribed from mRNA) and gDNA are referred to as RNA-seq (or whole transcriptome sequencing) and whole genome sequencing (WGS), respectively (Ng and Kirkness 2010; Smith et al. 2013). Millions of reads can be assembled to get contiguous sequences (called ‘contigs’) using various software applications (called de novo assembly) (such as Velvet, Oases) or can be mapped to a reference genome sequence, if available in silico (such as Bowtie, TopHat) (Zerbino and Birney 2008; Wang et al. 2009; Trapnell et al. 2009, 2012; Ng and Kirkness 2010; Langmead and Salzberg 2012; Schulz et al. 2012; Smith et al. 2013; Kim et al. 2013). With WGS, intronic and exonic DNA regions of trematodes can be sequenced, as previously described (Smith et al. 2013; Ng and Kirkness 2010; Young et al. 2014). RNA-seq data can be aligned to reference trematode genomes in order to identify splice junctions between exons (using the appropriate software such as TopHat, Bowtie), and gene expression levels can be analysed based on read abundance using the reference trematode genome (e.g. TopHat, Cufflinks) (Langmead and Salzberg 2012; Trapnell et al. 2012; Kim et al. 2013; Cwiklinski et al. 2015), or even in the absence of the reference genome (Li and Dewey 2011). Some genome assemblies of trematodes (to use as reference genome for NGS applications) are publicly available (e.g. WormBase, http://www.wormbase.org/, WormBase Parasite, http://parasite.wormbase.org/). In NGS-based studies involving de novo assembly, correct frames of transcript sequences can be detected by ‘BLAST’ comparisons (Altschul et al. 1990) using previously known protein data, and these sequences can be conceptually translated into protein sequences to provide a parasite specific protein database for LC-MS/MS applications (Robinson et al. 2009; Wilson et al. 2011; Haçarız et al. 2014, 2015). 
To date, genomes and transcriptomes of a number of trematodes have been successfully sequenced with current NGS technologies (Berriman et al. 2009; The Schistosoma japonicum Genome Sequencing and Functional Analysis Consortium 2009; Logan-Klumpler et al. 2012; Zerlotini et al. 2013; Young et al. 2012, 2014; Cwiklinski et al. 2015; Haçarız et al. 2015).
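The conceptual translation of assembled transcripts into protein sequences, mentioned above as the route to a parasite-specific protein database, can be sketched in Python as follows. The BLAST-based frame selection described in the text is not reproduced here; as a crude stand-in, and purely as an assumption for illustration, the longest open reading frame starting with methionine across all six frames is taken. The standard genetic code is used and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate in frame 0; unknown codons become 'X', stops '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Protein translations for the three forward and three reverse frames."""
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[(strand, offset)] = translate(s[offset:])
    return frames


def longest_orf(seq):
    """Longest stretch from M to the next stop (or sequence end) in any frame."""
    best = ""
    for protein in six_frames(seq).values():
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best
```

Applied to every contig of a de novo assembly, a routine of this kind would yield candidate protein sequences that could then be checked against known proteins before being used as a search database.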

Important issues for NGS

As for peptide analysis, careful isolation, recovery and maintenance of the parasite are important. Extraction of DNA and RNA is most commonly completed using commercially available kits, typically membrane and spin column-based, which facilitate a reliable and practical isolation procedure for genetic material. For the wet laboratory stage of NGS, quality and quantity of genetic material are critical parameters (Haçarız and Sayers 2013). Purity of material is checked by measuring the absorbance of the nucleic acid at 260, 230 and 280 nm using a spectrophotometer. For purity, the ratio of A260/A280 is suggested to be approximately 1.8 for DNA and approximately 2.0 for total RNA, and the ratio of A260/A230 is proposed to be 2.0–2.2 for both DNA and total RNA (T009-Technical bulletin, NanoDrop, ThermoScientific, USA). Integrity of DNA or total RNA could be checked with gel-based methods, but current approaches using microfluidics capillary-based electrophoresis would give a more accurate assessment of intactness, particularly for total RNA (Haçarız and Sayers 2013). Fluorescence-based quantification and quantitative PCR (qPCR) applications are recommended for starting material and DNA library quantification, respectively (Haçarız and Sayers 2013; Haçarız et al. 2015).
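The purity check can be written as a short Python routine. The ratios are computed in the conventional form, A260/A280 and A260/A230, with the target values quoted above (approximately 1.8 for DNA, approximately 2.0 for total RNA, and 2.0–2.2 for A260/A230). The width of the window around the "approximate" targets (`tolerance=0.2`) is a hypothetical choice made only for this sketch, as are the function names.

```python
def purity_ratios(a230, a260, a280):
    """Absorbance ratios used to judge nucleic acid purity."""
    return {"A260/A280": a260 / a280, "A260/A230": a260 / a230}


def passes_purity(a230, a260, a280, nucleic_acid="DNA", tolerance=0.2):
    """Check both ratios against the suggested values.

    `tolerance` (hypothetical) defines how far A260/A280 may deviate
    from the target of 1.8 (DNA) or 2.0 (total RNA).
    """
    target = {"DNA": 1.8, "RNA": 2.0}[nucleic_acid]
    ratios = purity_ratios(a230, a260, a280)
    ok_280 = abs(ratios["A260/A280"] - target) <= tolerance
    ok_230 = 2.0 <= ratios["A260/A230"] <= 2.2
    return ok_280 and ok_230
```

A sample failing either ratio would normally be re-purified or re-extracted before library preparation.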

The dry lab stage of NGS currently necessitates the use of Linux operating systems (RedHat in particular) on a computer with a large amount of RAM (∼128 gigabytes for RNA-seq; 1 terabyte is usually enough for general purposes) for the assembly of sequence reads. Read quality analysis and other necessary processes (e.g. removal of undesired sequences) need to be completed using bioinformatic software (e.g. FASTQC; http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ and FASTX-Toolkit; http://hannonlab.cshl.edu/fastx_toolkit/index.html) before sequence assembly is carried out (Young et al. 2010a,b; Haçarız et al. 2015). Accuracy at both base and sequence level, which can be determined by the Phred quality score (Q score) (a value of ≥ 33 is usually acceptable), adequate sequencing coverage (≥30× is preferred in general) and a low ratio of sequence duplication are important components of NGS (Ewing and Green 1998; Sims et al. 2014; Haçarız et al. 2015).
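The two quantities named above can be made concrete with a short Python sketch. The Phred score is defined as Q = −10 log10(P), where P is the probability that a base call is wrong, and mean coverage is estimated as the number of reads multiplied by the read length and divided by the genome size. The quality decoding assumes the Phred+33 ASCII encoding used in many FASTQ files; this encoding offset is an assumption of the sketch and is unrelated to any quality threshold. Function names are hypothetical.

```python
import math


def phred_to_error(q):
    """Probability of an incorrect base call for Phred score q."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Phred score corresponding to error probability p."""
    return -10 * math.log10(p)


def decode_quality(quality_string, offset=33):
    """Per-base Phred scores from a FASTQ quality line (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def mean_coverage(n_reads, read_length, genome_size):
    """Average sequencing depth: total sequenced bases / genome size."""
    return n_reads * read_length / genome_size
```

For example, three million reads of 100 bases over a 10 megabase target correspond to a mean coverage of 30×, although real coverage is uneven and duplicated reads inflate this estimate.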

Knowledge about the trematode of interest is essential for effective evaluation of data. For example, a good understanding of the general biology of the trematode will maximise results from in silico work. In some cases, coordinators of studies may not be familiar with in silico data and related applications, and may therefore miss important findings and/or interpret the data incorrectly. Effective communication between the coordinator and the experts carrying out the omic-scale experiment is necessary to prevent this outcome.

Whole transcriptome profiling with RNA-Seq is an important approach to prepare protein databases for LC-MS/MS. In turn, the RNA-Seq procedure is highly dependent on the quality of extracted total RNA. Poor quality of starting RNA material would decrease the read number, leading to less coverage of the transcriptome and hence a limited protein database for LC-MS/MS. The integrity of extracted total RNA is routinely examined by microfluidics capillary-based electrophoresis, e.g. Bioanalyzer (Agilent, USA) and, as part of this analysis, a heat-denaturation step is recommended to prevent loop formation of nucleic acid strands. However, the 28S ribosomal RNA of trematode parasites (e.g. F. hepatica, S. japonicum) is ruptured at the middle point of the RNA molecule by this heat-denaturation step [similar to other organisms such as insects (Ishikawa and Newburgh 1972; Winnebeck et al. 2010)], which prevents accurate assessment of quality using the RNA integrity number (RIN) (Haçarız and Sayers 2013; Gobert et al. 2009). Therefore, total RNA of trematode parasites should not be exposed to heat-denaturation before RNA quality assessment (Haçarız and Sayers 2013). This approach has been successfully applied within a transcriptome study (Haçarız et al. 2015).

For more detailed and specific guidelines to these instruments (both LC-MS and NGS), readers are referred to relevant books (McMaster 2005; Alzate 2010 for LC-MS and Valencia et al. 2013; Wong 2013 for NGS) and technical protocols (from commercial providers such as Waters, www.waters.com, for LC-MS and Illumina, www.illumina.com, for NGS) in the area.

Current status of omic approaches in trematode research

Trematodes belong to the subclass Digenea of the class Trematoda, which branches off from the phylum Platyhelminthes (flatworms). The taxonomical status of the flatworms, as shown in Fig. 2, could be a useful guide to researchers for comparative purposes in omic-based trematode research. The diversity amongst the parasites stems from the taxonomic levels of order, superfamily or family. In general, the life cycle of trematode parasites includes an intermediate host (gastropods (snails)), an additional intermediate host in some cases (e.g. ants for Dicrocoelium sp., crustaceans for Paragonimus sp., and fish for Clonorchis sp. and Opisthorchis sp.), and the main (definitive) host (mammals) (Toledo et al. 2011). The family difference in the same suborder (e.g. Clonorchis sp. (liver fluke) versus Metagonimus sp. (intestinal fluke); Fasciola sp. (liver fluke) versus Hypoderaeum sp. (intestinal fluke)) and the genus difference in the same family (e.g. Fasciola sp. (liver fluke) versus Fasciolopsis sp. (intestinal fluke)) could be associated with disparity of the infected organ in the definitive host.

Fig. 2

The taxonomic classification of known trematode parasites (http://www.ncbi.nlm.nih.gov/taxonomy/). Based on current knowledge, a total of 20 different genera (from 12 families and 4 orders of the subclass Digenea, class Trematoda) appear to be responsible for various trematode infections. The relationship between family and genus is indicated with the slash symbol (/)

The current status of protein/nucleotide sequences in trematode parasites and the specialised data resources are shown in Tables 1 and 2, respectively. In this section, omic-based studies (particularly profiling studies after approximately 2010) for common parasitic trematodes are outlined based on organ tropism under four categories (blood, liver, intestinal and other flukes).

Table 1 Current status of the availability of trematode protein/nucleotide sequences at two global data resources (UniProt and NCBI)
Table 2 Current and publicly available important data resources for trematode research at omic level

Blood flukes

One of the most studied trematode genera at the omic scale is Schistosoma, including the species S. mansoni, S. japonicum and S. haematobium, which cause pathologies in various organs (e.g. liver, bladder, lungs) in mammals by entering the host circulatory system. There have been numerous proteomic studies, particularly for S. mansoni, since 2000, to which readers are referred for further information (DeMarco and Verjovski-Almeida 2009; Toledo et al. 2011; Gobert et al. 2014; Mickum et al. 2014; van der Ree and Mutapi 2015). Previous studies have identified antigenic proteins, some of which are now proposed as candidates for both vaccines and diagnostics (Toledo et al. 2011; van der Ree and Mutapi 2015; Ludolf et al. 2014). In large-scale omic studies, the proteome profiles of teguments of S. mansoni and S. japonicum (Braschi et al. 2006a, b; Mulvenna et al. 2010a) and ES proteins of S. japonicum (Liu et al. 2009) have been investigated, an integrated immunoproteomic and bioinformatic approach has been applied to identify S. japonicum tegument proteins (Chen et al. 2014) and a differential proteomic analysis of different stages (schistosomula and adult) of S. japonicum has been described (Hong et al. 2013). In addition, tegumental and ES proteins of another Schistosoma species with veterinary importance, S. bovis, have been identified by LC-MS-based studies (Pérez-Sánchez et al. 2006, 2008; Ramajo-Hernández et al. 2007; Higón et al. 2011; de la Torre Escudero et al. 2011, 2013) and a recombinant version of S. bovis 22.6 antigen has been suggested for use in diagnostics (de la Torre-Escudero et al. 2012).

Significant advances have been made recently using omic technologies to enhance the biological understanding of Schistosoma spp. (Toledo et al. 2011; van der Ree and Mutapi 2015). NGS technology made it possible to sequence the genomes of three Schistosoma species (S. mansoni, S. japonicum and S. haematobium) (Young et al. 2012; Berriman et al. 2009; The Schistosoma japonicum Genome Sequencing and Functional Analysis Consortium 2009). In addition, a well-established annotation database, GeneDB (Logan-Klumpler et al. 2012) and an updated genome resource, SchistoDB (Zerlotini et al. 2013), as well as data from the Chinese National Human Genome Center at Shanghai, the Schistosoma japonicum Genome Project (The Schistosoma japonicum Genome Sequencing and Functional Analysis Consortium 2009), are now publicly available. Genome data for S. bovis are available at the Sequence Read Archive (SRA) of NCBI, but the annotated genome has not yet been published. Although a number of proteins have been identified at the peptide level by previous efforts, the integration of all current data with further proteomic studies (LC-MS/MS-based) is still thought to be useful to confirm previously non-reported proteins of all Schistosoma species at peptide level, thereby further improving the effectiveness of current strategies (van der Ree and Mutapi 2015).

Despite the genome of Schistosoma sp. being well studied, there is as yet no report about the comprehensive metabolome profiling of any species of this genus, or any other trematode genera, according to our search. However, there have been a number of studies investigating metabolite changes in hosts with schistosomiasis, which have been reviewed previously (Wang et al. 2010; O’Sullivan et al. 2013). In brief, alterations in metabolites related to energy metabolism (such as down regulation of tricarboxylic acid cycle) and gut microbiota are observed, which have been proposed to be important in the diagnosis of infection (Wang et al. 2010; Li et al. 2011; O’Sullivan et al. 2013).

Liver flukes

The Opisthorchiidae family includes two important genera, Clonorchis and Opisthorchis. C. sinensis, O. viverrini and O. felineus have two intermediate hosts (snail and fish) and infect the liver of humans. These parasites are known to be significant risk factors for cholangiocarcinoma in infected patients. Many studies have focused on this aspect of the parasites and several proteins have been suggested for serological diagnosis of infection.

Publications on omic-scale studies for C. sinensis and O. viverrini have been steadily increasing since 2010. Studies investigating ES proteins of C. sinensis have been published and antigenic proteins have been detected (such as fructose-1,6-bisphosphatase, methionine aminopeptidase 2 and acid phosphatase) (Zheng et al. 2011, 2013). However, in these studies, the raw proteomic data were searched against Caenorhabditis elegans or Schistosoma databases, which may limit the protein identification data obtained. An organised expressed sequence tag (EST) database of C. sinensis (ClonorESTdb; n = 13,305, consisting of 6497 contigs and 6808 singletons) is publicly available (http://pathod.cdc.go.kr/clonorestdb/) (Kim et al. 2014). Furthermore, the draft genome and transcriptome of the parasite have been published (Young et al. 2010a; Wang et al. 2011; Huang et al. 2013a). The NGS-derived data could also be useful for further LC-MS/MS-based studies to identify surface or ES proteins that have not been detected by previous efforts.

The secreted and surface proteins of the adult stage of O. viverrini have been previously investigated using an EST database. Immunomodulators such as annexins and orthologs of the schistosomiasis vaccine antigens Sm29 and tetraspanin-2 have been suggested for the tegument-host interface in this parasite (Mulvenna et al. 2010b). More recently, a draft genome of the parasite has been published by Young et al. (2014). This study also provides protein sequences (n = 16,379; predicted from nucleotide sequences) of the parasite. Therefore, it would be helpful to conduct further LC-MS/MS integrated studies to identify other proteins (at peptide level) which could be important for vaccine development. Another important Opisthorchis species is O. felineus; however, neither proteomic nor genomic data for this species are publicly available.

The other family, Fasciolidae, includes Fasciola hepatica and F. gigantica, which cause liver infections in various mammals (e.g. ruminants and humans). So far, a number of omic-based studies have been carried out for F. hepatica. Initial attempts focused on the identification of ES proteins of the parasites (Jefferies et al. 2001; Gourbal et al. 2008). Robinson et al. (2009) studied the secretome of the parasite with an LC-MS/MS proteomic technique and Hernández-González et al. (2010) studied the newly excysted juveniles of F. hepatica using the same technique. However, both searched the raw proteomic data against a preliminary EST database (published by the Sanger Institute); therefore, the findings were limited. This was also the case for other LC-MS/MS-based studies for more specific purposes, such as comparison of the effects of different conditions (in vivo/in vitro) on F. hepatica (Morphew et al. 2007) and analysing the response of adult F. hepatica to triclabendazole (Chemale et al. 2010).

Large omic-scale studies began with initial nucleotide sequencing of the parasite on an NGS platform (454, Roche) (Young et al. 2010b). In 2011, the tegument of F. hepatica was studied with an LC-MS/MS approach in which the database from the 454 sequencer (Young et al. 2010b) and an in-house EST database were used (Wilson et al. 2011). In another study, surface and internal fractions of the parasite were studied with LC-MS/MS, and compared against a number of databases (Haçarız et al. 2012). A more detailed analysis of these fractions was performed using LC-MS/MS integrated with different acquisition methods and an in silico database search against various protein databases and a new assembly of publicly available EST data (Haçarız et al. 2014), yielding a total of 776 proteins. Currently, the total number of proteins of the parasite detected at peptide level is approximately 1000. Recently, the draft genome of the parasite (Cwiklinski et al. 2015) and a detailed transcriptome study (Haçarız et al. 2015) have been published. Using combined approaches including the available nucleotide database and fractionation, coupled with the use of different acquisition methods, would be useful in revealing the whole proteome of this parasite.

To our knowledge, F. hepatica is the only trematode to which a profiling metabolome analysis has been applied. A total of 142 metabolites of the parasite have been identified (Saric et al. 2012); however, this number appears to represent a small percentage of the parasite’s entire metabolome, considering the current Escherichia coli and Saccharomyces cerevisiae metabolome databases (each containing more than 2000 metabolite entries) and the human metabolome database (containing more than 40,000 metabolite entries) (Jewison et al. 2012; Guo et al. 2013; Wishart et al. 2013).

In addition to the biological analyses of the parasite itself, variations in serum protein profiles during a F. hepatica infection in sheep have been demonstrated, reflecting the ability of a trematode to influence the host serum profile (Rioux et al. 2008).

The other important Fasciola species is F. gigantica; however, omic-based studies for this species are rare in comparison to F. hepatica. Young et al. (2011) studied the transcriptome of this parasite using NGS (Genome analyzer II, Illumina). Proteomic studies for this parasite have been completed for specific purposes, such as investigations of the cathepsin L protease family and the glutathione transferase superfamily (Morphew et al. 2011, 2012), but not whole profiling studies. The Fasciolidae family also contains other species, such as Fascioloides magna (generally detected in wild ruminants, cervids and bovids). The transcriptome and ES proteome of adult F. magna have been studied (Cantacessi et al. 2012b).

The other trematode species that infects the liver of the definitive host is Dicrocoelium dendriticum (belonging to the Dicrocoeliidae family). An EST database of this parasite has been previously generated and surface/ES proteins of the parasite have been analysed by the LC-MS/MS approach (Martínez-Ibeas et al. 2013a, b; Bernal et al. 2014). Whole proteome profiling of this parasite remains to be studied, as a protein database reflecting the whole transcriptome of the parasite has not been used in previous studies.

Intestinal flukes

Other trematode species, which are intestinal inhabitants of various mammalian and non-mammalian hosts, are members of the family Diplostomidae (including Alaria sp., Fibricola sp.), of the family Gymnophallidae (including Gymnophalloides sp., Parvatrema sp., Gymnophallus sp., and Bartolius sp.), of the family Heterophyidae (such as Metagonimus sp.) and of the family Echinostomatidae (including Echinostoma sp., Hypoderaeum sp.) (Toledo et al. 2011).

The transcriptome of E. caproni has been described by Garg et al. (2013). Predictably, some of the proteins may not have been identified by the proteomic studies available before 2013 (Guillou et al. 2007; Sotillo et al. 2010) due to the lack of NGS data. Other studies, which have focused on metabolite changes in hosts infected with E. caproni, have been reviewed by O’Sullivan et al. (2013). Various metabolite alterations in plasma and urine have been detected at 1 and 8 days post-infection in mice, respectively (Saric et al. 2008). Another species of the Echinostomatidae family, Hypoderaeum conoideum, which is found in the intestines of birds, has led to significant economic losses to the poultry industry, particularly in some Asian countries (Yang et al. 2015a). Although the mitochondrial genome of this parasite has been studied (Yang et al. 2015a), no omic studies have yet been completed.

Two other genera, Fasciolopsis sp. and Watsonius sp., can cause intestinal infection. Although Fasciolopsis sp. is taxonomically classified under the Fasciolidae family (mostly including liver flukes), a species of this genus, Fasciolopsis buski, is an intestinal fluke of humans, and only one omic-based study has been published (Biswal et al. 2013). Watsonius watsoni (belonging to the Paramphistomatidae family) is a trematode parasite that attaches to the intestinal wall of primates and humans, causing diarrhoea (Mas-Coma et al. 2005), but this species has not been well studied.

Other flukes

Paramphistomum species such as P. cervi infect the rumen of ruminants (such as cattle and sheep), causing diarrhoea, anaemia, and lethargy (Toledo et al. 2011). Therefore, this parasite is important for animal health, with associated economic costs. According to our literature review, there has not been a detailed omic-scale study for this genus. However, transcriptome data of P. cervi (isolated from various main hosts) from NGS studies have been deposited at SRA. Fischoederius elongatus is observed in the rumen of ruminants, causing significant economic losses to the livestock industry (Yang et al. 2015b); however, there is no omic-scale study for this parasite to date.

A comprehensive omic-based analysis has been conducted for Paragonimus kellicotti, a neglected trematode that causes inflammation in the lungs of humans (McNulty et al. 2014). In this study, LC-MS/MS was integrated with the parasite’s transcriptome to identify immuno-dominant antigens. Similar studies could be carried out for other Paragonimus species (P. westermani, P. miyazakii), for which a large number of nucleotide sequences are available from SRA.

From ‘many to few’—identifying the proteins of interest

Identifying vaccine/drug targets for trematode parasites by determining those proteins that play specific roles in establishment and maintenance of infection, so-called virulence-related proteins, is a difficult but essential step. Most of the parasite proteins identified through omic studies will relate to physiological pathways of the parasite and are not directly associated with virulence. To eliminate these ‘house-keeping’ proteins, in silico comparative approaches are useful. Recently, a large number of nucleotide sequences of free-living flatworms (Dugesiidae family including Dugesia sp. and Schmidtea sp.) have been made publicly available at the DNA Data Bank of Japan and a genome database of S. mediterranea has been deposited at SmedGD (Robb et al. 2008, 2015). Nucleotide and protein data of another well-studied organism, C. elegans, are freely accessible from WormBase (Harris et al. 2014). Trematodes and nematodes share many similar proteins; therefore, the C. elegans database may serve as an important resource to observe the degree of similarity between these species (Haçarız et al. 2015).
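As an illustration of such an in silico comparative step, the following minimal Python sketch retains only those parasite proteins that lack a significant similarity hit against a database of a non-pathogenic organism. It assumes that the search results are available in the standard 12-column BLAST tabular format (query identifier in the first column, e-value in the eleventh); the function name, the protein identifiers and the e-value cut-off of 1e-5 are hypothetical choices made for this example and are not taken from the studies cited above.

```python
import csv
import io


def proteins_without_significant_hit(query_ids, blast_tabular, max_evalue=1e-5):
    """Return the query IDs that have no hit at or below max_evalue.

    blast_tabular is text in the 12-column BLAST tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    with_hit = set()
    reader = csv.reader(io.StringIO(blast_tabular), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query_id, evalue = row[0], float(row[10])
        if evalue <= max_evalue:
            with_hit.add(query_id)
    # Keep the input order; proteins absent from the table had no hit at all.
    return [q for q in query_ids if q not in with_hit]


# Hypothetical example: three parasite proteins searched against a free-living flatworm.
example_table = (
    "Fh_0001\tSmed_123\t78.5\t200\t43\t0\t1\t200\t5\t204\t1e-80\t290\n"
    "Fh_0002\tSmed_456\t31.0\t60\t40\t1\t10\t69\t3\t62\t0.5\t28.1\n"
)
candidates = proteins_without_significant_hit(
    ["Fh_0001", "Fh_0002", "Fh_0003"], example_table
)
print(candidates)
```

In this toy input, the first protein has a strong hit and is treated as a likely ‘house-keeping’ protein, whereas the other two are retained for further inspection. The absence of a hit does not by itself indicate a role in virulence; it only narrows the list for the downstream analyses.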

Comparative analysis of the parasite proteins against databases containing sequence data of these non-pathogenic organisms may be a first step to simplify the search for potential virulence proteins, prior to the use of pathogen-related databases such as the Vaccine Investigation and Online Information Network (He et al. 2014) and the Helminth Secretome Database (Garg and Ranganathan 2011, 2012). Another data resource, Helminth.net (including Trematode.net), which provides collective information for trematodes, could be useful for comparative studies (Martin et al. 2015). In addition, evolutionary changes of trematode proteins under positive selection pressure can be investigated. For this purpose, orthologous sequences (parasite vs. non-parasite) can be subjected to a nonsynonymous/synonymous (Ka/Ks) substitution rate analysis, where a Ka/Ks value greater than 1.0 is taken to indicate positive selection and is considered here as parasitism-related diversity (Hurst 2002).
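The interpretation of the Ka/Ks criterion can be sketched as follows. This minimal Python example assumes that Ka and Ks have already been estimated for an orthologous pair by a dedicated tool; it does not estimate the substitution rates itself. The function name and the numerical values are hypothetical and serve only to show how the threshold of 1.0 is applied.

```python
def classify_ka_ks(ka, ks):
    """Return (ratio, label) for pre-computed Ka and Ks estimates.

    ka: nonsynonymous substitutions per nonsynonymous site
    ks: synonymous substitutions per synonymous site
    """
    if ka < 0 or ks < 0:
        raise ValueError("Ka and Ks must be non-negative")
    if ks == 0:
        # The ratio is undefined when no synonymous change is observed.
        return None, "undefined"
    ratio = ka / ks
    if ratio > 1.0:
        label = "positive selection"
    elif ratio < 1.0:
        label = "purifying selection"
    else:
        label = "neutral"
    return ratio, label


# Hypothetical ortholog pairs: (Ka, Ks)
pairs = {"ortholog_1": (0.30, 0.20), "ortholog_2": (0.02, 0.40), "ortholog_3": (0.05, 0.0)}
for name, (ka, ks) in pairs.items():
    print(name, classify_ka_ks(ka, ks))
```

Pairs for which Ks is zero, or very small, are reported separately here because the ratio becomes undefined or unstable; how such cases are handled in practice is a design choice that would depend on the estimation method used.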

Protein sequences of interest could be subjected to a series of bioinformatic analyses such as motif search (e.g. family, domain, conserved site) using, for example, InterProScan (Jones et al. 2014). Biological pathways could be predicted by KEGG (Kyoto Encyclopedia of Genes and Genomes) (Ogata et al. 1999) and gene ontology (GO) information (Ashburner et al. 2000; Gene Ontology Consortium 2015) could also be investigated. However, some protein motifs of trematode parasites (e.g. domain of unknown function DUF1968 of F. hepatica, Haçarız et al. 2015) do not have GO information in the current GO database, so protein motifs can be inspected manually online (http://www.ebi.ac.uk/interpro/). Although this would be a time-consuming approach, it could provide significant results that could be missed by automated database searches.

The subcellular localisation of virulence-related proteins, which interact directly with the host, could be predicted using bioinformatic software such as WoLF PSORT (Horton et al. 2007). In addition, protein interactions, commonly referred to as the protein ‘interactome’ (Kandpal et al. 2009), reflecting interactions both within a parasite and between the parasite and its host, could be investigated using bioinformatic approaches and molecular techniques. For example, co-immunoprecipitation, fluorescence resonance energy transfer (FRET) and molecular imaging can be used to define those parasite proteins of interest (Kandpal et al. 2009; Lahiri et al. 2014; Jia et al. 2014).

Analysis of differential gene expression levels between different trematode organisms (e.g. infectious versus non-infectious, infectious versus infectious), different conditions (in vivo/in vitro) and/or different stages (newly excysted juvenile versus adult) of the same trematode parasite could be important to identify virulence-related genes. These could be investigated by NGS (e.g. using Cufflinks in RNA-Seq) (Trapnell et al. 2012) and then confirmed by real-time PCR (for chosen genes) (Haçarız et al. 2009a). However, confirmation of differentially expressed proteins (across these comparisons) with quantitative LC-MS/MS is recommended (Haçarız et al. 2014).
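The comparison of expression levels between two stages can be illustrated with a simple log2 fold-change calculation. The sketch below is not a substitute for the statistical testing performed by dedicated RNA-Seq software, as it ignores replicates and variance; it only shows the underlying arithmetic. The gene names, expression values, pseudocount and the two-fold threshold are hypothetical.

```python
import math


def log2_fold_change(expr_a, expr_b, pseudocount=1.0):
    """log2 of (expr_b / expr_a), with a pseudocount to avoid division by zero."""
    return math.log2((expr_b + pseudocount) / (expr_a + pseudocount))


def flag_candidates(expression, min_abs_log2fc=1.0):
    """Return genes whose absolute log2 fold change meets the threshold.

    expression maps a gene name to (juvenile_value, adult_value).
    """
    flagged = {}
    for gene, (juvenile, adult) in expression.items():
        lfc = log2_fold_change(juvenile, adult)
        if abs(lfc) >= min_abs_log2fc:
            flagged[gene] = lfc
    return flagged


# Hypothetical normalised expression values: (newly excysted juvenile, adult)
expression = {
    "gene_A": (3.0, 31.0),
    "gene_B": (10.0, 12.0),
    "gene_C": (63.0, 7.0),
}
flagged = flag_candidates(expression)
print(flagged)
```

A positive value indicates higher expression in the adult stage and a negative value indicates higher expression in the juvenile stage. Genes flagged in this way would still need the confirmation steps described above.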

To enhance the detection of antigenic virulence-related proteins of trematodes, two-dimensional gel-based techniques, including the use of positive host serum taken at appropriate time point(s) of infection (Haçarız et al. 2011) can be integrated with LC-MS/MS (McNulty et al. 2014). Possible B or T cell epitopes of predicted virulence-related proteins could be investigated using epitope databases (Immune Epitope Database and Analysis Resource; http://www.iedb.org/) (Vita et al. 2015), which would be important for vaccine studies, in particular.

Proteins identified by LC-MS/MS can be searched against drug target databases such as Potential Drug Target Database (http://www.dddc.ac.cn/pdtd/) (Li et al. 2006; Gao et al. 2008), Therapeutic Target Database (http://bidd.nus.edu.sg/group/ttd/ttd.asp) (Zhu et al. 2012) and DrugBank 4.0 (Law et al. 2014) for drug development studies. Information from these data resources could help find the related drug category and drug structure, which may aid researchers in choosing drugs as new therapeutics or assist with novel drug design.

Apart from the investigation of novel molecules that are potentially important in combatting fluke infections, omic approaches are also useful to enhance molecular understanding of anthelmintic drugs (e.g. triclabendazole), their mechanism of action and the rise in trematode resistance to these drugs (Chemale et al. 2010). As current omic technologies allow ‘global’ data screening of genes/proteins of organisms, it is possible to detect significant differences in the genes/proteins (of importance) that may be missed by less advanced technologies. With current omic approaches, differences in genes/proteins in comparative experimental models (such as the comparison of parasites which are susceptible or resistant to anthelmintics) could be detected, even down to the level of a single nucleotide/amino acid change, highlighting potential issues for drug metabolism/resistance studies (Chemale et al. 2010; Haçarız et al. 2012, 2014, 2015). This information could be used to re-synthesise the existing drug with the necessary chemical alterations to accommodate these molecular changes, at least for a period of time, as resistance could develop against any drug design. In addition, omic approaches could also be utilised to detect important variations (e.g. drug resistance) in genes/proteins amongst parasite populations from different geographical locations (Hodgkinson et al. 2013).

The identification of novel trematode-based immuno-modulatory molecules is another potentially important outcome of omic studies. It is well known that parasitic helminths modulate the host immune system (Flynn et al. 2007; Robinson et al. 2013; Khan et al. 2015). As an example, the use of porcine whipworm (a nematode helminth) eggs (Trichuris suis ova) has been suggested in the treatment of ulcerative colitis (Summers et al. 2005). Recent studies have indicated that proteins secreted by trematode parasites (such as cathepsin L1, peroxiredoxin and helminth defence molecule of F. hepatica) are potentially important in dealing with immune-related diseases (Robinson et al. 2009, 2013). A recent NGS-based study has suggested that some of the predicted proteins of F. hepatica imitate host cytokine and cytokine receptors suggesting a possible therapeutic approach to dealing with immune-related diseases (Haçarız et al. 2015). It is thought that trematodes release specific molecules (such as cytokine-like) which only interact with unique targets of the host, hence inducing a series of unique pathways (Robinson et al. 2013; Wammes et al. 2014; Haçarız et al. 2015; Helmby 2015). Overall, the integration of LC-MS/MS strategies with growing nucleotide data will help to identify trematodes’ ‘molecules of interest’, at peptide level, which could be utilised to treat immune-related diseases (e.g. auto-immune diseases).

Production of important parasite proteins for further experimental studies

In general terms, LC-MS/MS and RNA-Seq do not provide full sequences of identified proteins or encoding genes, respectively, as molecule identification is based on the detected sequences even if incomplete but of sufficient length. If the gene sequence of the protein of interest is partially known, the full length of that sequence (derived from RNA-Seq studies) can be obtained using RACE methods or deduced from genome data of the parasite if it is available (Japa et al. 2015). Gene sequences can be amplified by PCR and, if genetic material cannot be obtained, the gene sequence can be synthesised artificially (Hughes et al. 2011). In general, the ‘his-tag’ sequence encoding at least six histidine residues is incorporated into the gene sequence of interest and the his-tagged protein (either at the N or C terminus) is purified using affinity chromatography (Pedersen et al. 1999; Cass et al. 2005). If necessary, the his-tag peptide could be removed after the protein is purified (Pedersen et al. 1999; Andberg et al. 2007).
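The incorporation of a his-tag at the C terminus can be sketched at the sequence level as follows. This minimal Python example inserts histidine codons (CAT and CAC) immediately before the stop codon of a coding sequence. The function name, the alternation of the two codons, the choice of TAA when no stop codon is present and the toy coding sequence are all assumptions made for illustration; a real construct design would also consider the expression vector, linker or cleavage sites and codon usage of the host.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_his_tag(cds, n_his=6):
    """Insert n_his histidine codons before the stop codon of a coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if n_his < 6:
        raise ValueError("a his-tag is assumed here to need at least six histidines")
    # Alternate the two histidine codons to avoid a long single-codon repeat.
    tag = "".join("CAT" if i % 2 == 0 else "CAC" for i in range(n_his))
    last_codon = cds[-3:]
    if last_codon in STOP_CODONS:
        return cds[:-3] + tag + last_codon
    # No stop codon present: append the tag and a stop codon (TAA assumed).
    return cds + tag + "TAA"


# Toy coding sequence: Met-Ala-Lys followed by a stop codon.
tagged = add_c_terminal_his_tag("ATGGCTAAATAA")
print(tagged)
```

An N-terminal tag would instead be placed after the start codon; the C-terminal case is shown only because it is the simpler of the two to express in a few lines.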

In order to investigate a protein as a therapeutic or prophylactic target, a recombinant form of the identified parasite protein (partial or full) will need to be produced, commonly using other organisms’ cells. In general, expression vectors (such as virus or plasmid) are used to deliver the gene encoding the desired protein into protein-expressing cells (Pedersen et al. 1999; Cass et al. 2005). In doing so, it is important to examine the chemical properties (e.g. hydrophobicity) of the protein of interest and to determine whether these properties are appropriate for the planned protein production protocol. A number of bioinformatic tools in on-line resources (e.g. ExPASy; http://www.expasy.org/) are helpful to predict such important chemical properties (Artimo et al. 2012).
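One commonly used summary of hydrophobicity is the grand average of hydropathy (GRAVY), i.e. the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The following minimal Python sketch computes this value so that the principle is explicit; it is offered as an illustration and not as a description of how any particular on-line tool is implemented. The function name and the example peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy of a protein sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


# Hypothetical peptides: one hydrophobic, one charged and hydrophilic.
print(round(gravy("AILV"), 3))
print(round(gravy("DEKR"), 3))
```

Positive values indicate an overall hydrophobic sequence and negative values an overall hydrophilic one. A strongly hydrophobic candidate may, for example, call for a different production or purification protocol than a soluble one.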

As trematode parasites are eukaryotic organisms, accurate post-translational modifications of the parasite proteins will, most likely, be required. Production of trematode proteins in prokaryotic cells (such as E. coli) may lead to differences in three-dimensional structure (e.g. protein folding) and post-translational modifications (e.g. phosphorylation) relative to the original. Although prokaryotic cell systems could initially be tested (Khow and Suntrarachun 2012), adopting yeast (such as S. cerevisiae), insect cell systems (baculovirus + insect cell) or mammalian cell systems may be required to obtain better results in terms of similarity to the original protein (Endo and Sawasaki 2006). However, protein expression in heterologous systems (even eukaryotic ones) may not always guarantee the production of a parasite protein that is identical in all components, for example in its carbohydrate motifs.

The primary structure (peptide sequence) and purity of the produced protein can be investigated with LC-MS/MS or MALDI-TOF systems (Chen and Pramanik 2009; Webster and Oxley 2012). Endotoxin levels can be analysed with the Limulus Amebocyte Lysate assay (LAL assay) and possible endotoxin residues could be removed using a phase separation approach (Haçarız et al. 2011). Defining the primary structure (peptide sequence) of a homogeneously produced protein may be satisfactory in some cases. However, further structural analysis may be of benefit for more promising protein targets. The secondary structure (alpha helices and beta sheets) of the produced protein can be detected with circular dichroism (Kelly and Price 2000). Tertiary or quaternary structures of the produced protein can be determined by NMR or X-ray crystallography (Littlechild et al. 1987; Spencer et al. 2015).

After proteins are expressed in vitro, comparative vaccine trials will assess the protective nature of the protein. Monitoring the effect of adjuvants is also important for trematodes, as some adjuvants themselves (such as Quil A) may influence parasitic burden/egg number, as described for F. hepatica (Haçarız et al. 2009b), masking the specific effect of the antigen.

For the development of anthelmintics, in silico approaches could be utilised to identify inhibitors of trematode molecules, as has been shown for cathepsin L3 of F. hepatica (Hernández Alvarez et al. 2015). If the number of molecules to be tested is high, high-throughput in vitro models can be designed before using in vivo models (Tsai et al. 2013).

Apart from developing strategies to deal with infection, recombinant trematode molecules could be suitable for diagnostic tests (Gonzales Santana et al. 2013; Teimoori et al. 2015) and other in vitro/in vivo disease models in terms of therapeutic applications of immune-related diseases (Dalton et al. 2013; Robinson et al. 2013; Hasby et al. 2015; Heylen et al. 2015).

Finally, for commercialisation of novel biological molecules, GLP (Good Laboratory Practice) and GMP (Good Manufacturing Practice) procedures (Leblanc et al. 2014) need to be adopted.

Translation of omic research into reality—commercialisation of novel trematode-derived molecules

Whilst the omic approach to researching parasitic mechanisms must now be favoured as the best current approach to unlocking control mechanisms, any newly designed control measure for trematode parasitism in human or veterinary medicine, prophylactic or therapeutic, will be benchmarked against the practical benefits of anthelmintics. These commonly used products have been hugely successful in the veterinary field for specific reasons: the low price relative to the cost of production losses, the wide spectrum of activity including a number of seasonal co-infecting trematodes (and nematodes), the ease of integration into farm management practices, and the ease of storage. Vaccine-based concepts carry the additional criterion of requiring the induction of memory that is effective for at least one season. Stricter food residue safety concerns, animal welfare and environmental regulations will place additional requirements on bringing any new product to market in the field of veterinary medicine, in particular.

The omic approach has been successfully applied to vaccine development and commercialisation. The sequence-based ‘reverse vaccinology’ approach has been utilised to develop an innovative commercialised vaccine against Neisseria meningitidis serogroup B (Heinson et al. 2015), with a similar omic approach being applied to Streptococcus agalactiae, S. pyogenes, S. pneumoniae and pathogenic E. coli (Seib et al. 2012).

The status of the omic-based approach for many pathogens, including the causative agents of tick-borne diseases, malaria and hookworm disease, has been previously reviewed, with the conclusion that the approach contributes to a better understanding of the parasites’ infection mechanisms and to the identification of the parasites’ important molecules for control strategies, but that further efforts (wet lab-based research in particular) are needed to utilise the omic-based findings for the development of practical tools (such as vaccines) (Loukas et al. 2011; Marcelino et al. 2012; Nóbrega de Sousa et al. 2013).

For trematode parasites, the application of omic technologies to developing commercialised vaccines and drug targets is at an early stage; however, examples are emerging. A 28 kDa glutathione S-transferase of S. haematobium has been detected (Higón et al. 2011; van der Ree and Mutapi 2015), and a recombinant version of this molecule (rSh28GST), along with an aluminium hydroxide (alum) adjuvant, has been tested in a phase I trial for its safety, tolerability and immunogenicity for the purpose of reducing egg production and subsequent pathology (Riveau et al. 2012). It is currently thought that the beneficial effect against schistosomiasis could be improved by considering other important antigens detected by proteomic studies (Ludolf et al. 2014; van der Ree and Mutapi 2015). Protein kinase C and extracellular signal-regulated kinase, proteins that were identified through omic analysis, have recently been suggested as potential targets for chemotherapeutic treatments against human schistosomiasis (Ressurreição et al. 2014; Walker et al. 2014). The identification of an insulin receptor of S. japonicum (The Schistosoma japonicum Genome Sequencing and Functional Analysis Consortium 2009) has led to an experimental study in which the use of the ligand domain of S. japonicum insulin receptor 2 (SjLD2) (produced in E. coli) provided a substantial decrease (75 %) in the number of mature intestinal eggs in a mouse challenge model (You et al. 2012; You and McManus 2015).

Following efforts on omic-scale work for another trematode parasite, F. hepatica (Young et al. 2010b; Wilson et al. 2011), further research work has indicated that there are different FhCD59-like sequences which play distinct roles in the development of the parasite (Shi et al. 2014) and FhCD59-like proteins are now considered vaccine candidates (Toet et al. 2014). Additionally, the identification of potential S. mansoni vaccine candidates with computational vaccinology has been reviewed (Pinheiro et al. 2011).

The omic research approach may be useful to identify essential virulence genes that can be targeted by RNA interference (RNAi), thereby silencing the phenotypic effect of these genes and negatively affecting the survival of the parasite (McGonigle et al. 2008; McVeigh et al. 2014; Hagen et al. 2015). Pereira et al. (2008) have shown that injection of small interfering RNAs (siRNAs) (designed and produced against the HGPRTase gene of the parasite) through the tail vein of S. mansoni-infected mice leads to a reduction in worm burden (approximately 27 %, compared to control) and proposed that the treatment could be improved by optimising molecule delivery and siRNA dose. Recently, Cao et al. (2014) showed that injection of siRNAs against the lethal giant larvae (Lgl) gene of S. japonicum through the tail vein results in a reduction of up to 85 % in hatching rates of eggs in parasite-infected mice. Apart from their gene silencing effect, siRNAs may also be used to stimulate the host immune system to enhance resistance to infectious organisms (Marques and Williams 2005; Gantier 2013). Overall, the RNAi approach is a promising tool to deal with infection by trematode parasites. However, more studies are necessary to develop commercial products which are effective and safe in trematode infections.

In addition to vaccine, drug or RNAi development, omic approaches show promise in the development/improvement of diagnostic kits via the identification of species-specific antigens to address cross-reactivity (van der Ree and Mutapi 2015). Fast and correct diagnosis of infection is important for rapid targeted treatment, reduced pathology and decreased prevalence of the parasite through reduced infectious potential.

As the expansion of omic techniques and technologies provides larger volumes of information, channelling this information from omic studies into a commercialised product requires several steps from a number of disciplines, including research, development and commercialisation experts. Whilst individual researchers, or ad hoc research groups, may not have the resources or time to enter into the commercialisation of products, it is the responsibility of first-stage researchers to ensure that research into strategies for parasitic control is carried out mindful of the practical applications required of such a strategy. Such focused goals are needed to address the mounting challenges in animal and human health (Hood et al. 2012) and the application of technical advances, such as omic technologies, should be done appropriately (McShane et al. 2013). Integration of industrial, commercial and discovery research partners, through intellectual property rights, must be regarded as the most likely approach to drug and vaccine discovery, which should be reflected in project plans from the outset (Guerrero et al. 2012).

Challenges and perspectives for omic-scale studies in trematode research

One of the major difficulties in omic-based trematode research is the identification capacity of LC-MS/MS. Recent LC-MS/MS systems without sample pre-fractionation cannot reach a protein identification number greater than 10,000 (Thakur et al. 2011; Nagaraj et al. 2012; Branca et al. 2014). Advances in LC-MS/MS technology, for the identification of genome-size proteomes of higher eukaryotes including trematodes, without a pre-fractionation step will be an important future development in simplifying the wet lab procedure.

To date, in general terms, omic-based strategies have been applied to a single trematode organism isolated from a chosen location. However, genetic/proteomic heterogeneity has been shown to influence drug resistance (such as the case for F. hepatica) (Chemale et al. 2010). In the future, omic-based strategies should focus on deciphering genetic differences among the variants within a trematode species, as the heterogeneity could define the effectiveness of treatment/prevention strategies. These strategies will be significant in the development of multi-valent vaccines for the variants within a species.

The other difficulty is the annotation of some of the newly found trematode sequences from omic-based studies. Comparative BLAST analysis of the sequences using public databases does not always provide significant hits. Therefore, these sequences need to be defined further. Additionally, a significant number of trematode sequences are labelled ‘unknown protein’ in the databases and remain to be studied further for functional annotation. Developments in bioinformatic approaches may help to solve this problem.

The existing difficulties in the development of preventive vaccines against the trematode parasites could be summarised as follows: lack of full protective effect, undesired inflammation caused by the antigen/adjuvant combinations, and individual diversity in both parasite and host in terms of magnitude of immune responses, leading to difficulty in assessing the effectiveness of the vaccine (Haçarız et al. 2009a, b; McNeilly and Nisbet 2014). Omic-based studies will eventually be helpful to enhance understanding of the biological characteristics of both parasite and host, at individual level and at the level of host-pathogen interaction, which will guide the development of effective preventive vaccines (Molina-Hernández et al. 2015). This will be addressed over a longer time period. Nevertheless, the omic-based results would contribute to the development of other control approaches, such as the use of therapeutics, within a shorter timescale.

Each promising experimental vaccine is generally proposed for use as a preventative strategy for a single disease in the field of parasitology. As the feasibility of this ‘one vaccine - one disease’ strategy comes under increasing financial scrutiny, the use of omic-based studies, which can identify important proteins for the development of combination vaccines, could incentivise the commercialisation of such broad spectrum vaccines.

Conclusions

Five years ago, a large number of genes/proteins from omic-based studies were only reported for Schistosoma spp., particularly S. mansoni, in the field of trematodology. The omic-based strategy has led to the discovery of molecules (e.g. 28 kDa glutathione S-transferase, protein kinase C, extracellular signal-regulated kinase, SjLDs fusion proteins) which are now proposed to be promising for the immunotherapy of schistosomiasis and/or the control of transmission of the disease (Knudsen et al. 2005; Riveau et al. 2012; You et al. 2012; You and McManus 2015; Ressurreição et al. 2014; Walker et al. 2014; van der Ree and Mutapi 2015). We now have a large volume of new data for many other trematode species. However, results from the omic-based studies for these genera are relatively new compared to S. mansoni, and much newer in comparison with most infectious microorganisms. With well-designed experiments, LC-MS/MS will identify essential parasite proteins at peptide level and open avenues to better understand parasite biology, including virulence mechanisms, parasite protein interactions (proteins that are correlated in expression or additive in their effect) and interactions between the parasite and its host, and to identify better protective strategies, such as vaccines, and therapeutic approaches, such as drugs or RNAi.

Integrated approaches, with bioinformatic tools in particular, will be helpful to strategically identify particular proteins of interest from a whole trematode proteome, which can then be used as a protective commercialised product, as in the case of the meningococcal vaccine (Heinson et al. 2015).

To date, metabolome profiling of bacteria has led to an enhanced understanding of antibiotic resistance and to the discovery of novel drug targets (Birkenstock et al. 2012; du Preez and Loots 2012; Milshteyn et al. 2014). Compared to genomic and proteomic profiling of trematodes, omic-scale metabolome profiling of these organisms is still in its infancy. Understanding anthelmintic resistance to commonly used drugs and finding alternative drug targets would be the first goals in applying metabolome analyses to trematode research. We foresee that integration of metabolome information with genome and proteome data will be possible in the future and will enlighten our understanding of trematode biology.

Overall, it is reasonable to assume that omic approaches are key to providing insights into the biological characteristics of the parasite, and will contribute to the development of innovative and effective strategies against trematodes, with the possibility of helminth-based therapeutic strategies for autoimmune-related diseases. The development of rapid, effective and accurate procedures to allow researchers to pinpoint proteins of interest from the trematode proteome pool will be key to maximising the benefits of this omic era. Based on the previous success of omic-based studies in different fields, we expect that results from the relevant studies will turn into useful products to deal with trematode infections in the foreseeable future.