A Multi-Omics Database for Parasitic Nematodes and Trematodes

  • John Martin
  • Rahul Tyagi
  • Bruce A. Rosa
  • Makedonka Mitreva
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1757)

Abstract

Helminth.net (www.helminth.net) is a web-based resource that was launched in 2000, simply as “Nematode.net,” to host and investigate gene sequences from nematode genomes. Over the years the name has come to cover a collection of two databases: Nematode.net and Trematode.net. These databases host information for 73 nematode (roundworm) and 17 trematode (flatworm) species and serve as the backbone for a number of tools that allow users to query slices of the data for multifactorial combinations of species and omics properties. Recent work has focused on the inclusion of gene and protein expression data, population genomics, and cross-kingdom interactions (metagenomics datasets). This chapter describes the website, the available tools, and some of the new features.

Key words

Nematodes, Trematodes, Database, Search, Genome browser, BLAST, Functional annotation, Transcriptome, Proteome, Metagenome

Notes

Acknowledgments

We sincerely thank all past and present members of the Mitreva lab for their contributions to the database over the past 17 years (nematode.net/staff.html). We also thank the numerous collaborators in the helminth community (nematode.net/collaborators.html and trematode.net/collaborators.html) for providing invaluable worm material and for their involvement in data generation and analysis, and the dedicated members of the production group at The McDonnell Genome Institute (http://genome.wustl.edu/) for library construction and sequencing. Helminth.net is funded by the National Institutes of Health [AI081803 and GM097435] and NIFA [2013-01109].

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • John Martin (1)
  • Rahul Tyagi (1)
  • Bruce A. Rosa (1)
  • Makedonka Mitreva (1, 2)

  1. McDonnell Genome Institute, Washington University in St. Louis, St. Louis, USA
  2. Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, USA
