Using WormBase ParaSite: An Integrated Platform for Exploring Helminth Genomic Data

  • Bruce J. Bolt
  • Faye H. Rodgers
  • Myriam Shafie
  • Paul J. Kersey
  • Matthew Berriman
  • Kevin L. Howe
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1757)

Abstract

WormBase ParaSite (parasite.wormbase.org) is a comprehensive resource for the genomes of parasitic nematodes and flatworms (helminths). It currently includes genomic data for over 100 helminth species, adding value by way of consistent functional annotation, gene comparative analysis and gene expression analysis. We provide several ways of exploring the data including a choice of genome browsers, genome and gene summary pages, text and sequence searching, a query wizard, bulk downloads, and programmatic interfaces. WormBase ParaSite is released three to six times per year, and is developed in collaboration with WormBase (www.wormbase.org) and Ensembl Genomes (www.ensemblgenomes.org).

Key words

Genome browser, Comparative genomics, Functional genomics, Helminths, WormBase, Ensembl, Parasitology

Acknowledgments

WormBase ParaSite is funded by the UK Biotechnology and Biological Sciences Research Council [BB/K020080]. We are thankful to Alessandra Traini, Eleanor Stanley, and Jane Lomax for their previous work on WormBase ParaSite. We would also like to thank members of the WormBase and WTSI Parasite genomics groups for useful advice and feedback.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Bruce J. Bolt (1)
  • Faye H. Rodgers (2)
  • Myriam Shafie (2)
  • Paul J. Kersey (1)
  • Matthew Berriman (2)
  • Kevin L. Howe (1)

  1. European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
  2. Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
