Using WormBase: A Genome Biology Resource for Caenorhabditis elegans and Related Nematodes

  • Christian Grove
  • Scott Cain
  • Wen J. Chen
  • Paul Davis
  • Todd Harris
  • Kevin L. Howe
  • Ranjana Kishore
  • Raymond Lee
  • Michael Paulini
  • Daniela Raciti
  • Mary Ann Tuli
  • Kimberly Van Auken
  • Gary Williams
  • The WormBase Consortium
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1757)

Abstract

WormBase (www.wormbase.org) provides the nematode research community with a centralized database of information on nematode genes and genomes. As more nematode genome sequences become available and richer data sets are published, WormBase strives to keep its information, displays, and services up to date so that users can efficiently access and understand the knowledge in the published nematode genetics literature. This chapter explains how to use the basic features of WormBase, its new features, and some commonly used tools and data queries. It describes the curated data and gives step-by-step instructions for accessing them through the WormBase website and the available data mining tools.

Key words

  • Data mining
  • Nematodes
  • Genomics
  • Genetics
  • Caenorhabditis elegans
  • Model organism database
  • Ontologies
  • User guide

Notes

Acknowledgments

WormBase is supported by grant #U41 HG002223 from the National Human Genome Research Institute at the US National Institutes of Health, the UK Medical Research Council and the UK Biotechnology and Biological Sciences Research Council. At the time of writing, the WormBase Consortium included Paul W. Sternberg, Paul Kersey, Matthew Berriman, Lincoln Stein, Tim Schedl, Todd Harris, Scott Cain, Sibyl Gao, Paulo Nuin, Adam Wright, Kevin Howe, Bruce Bolt, Paul Davis, Michael Paulini, Faye Rodgers, Matthew Russell, Myriam Shafie, Gary Williams, Juancarlos Chan, Wen J. Chen, Christian Grove, Ranjana Kishore, Raymond Lee, Hans-Michael Müller, Cecilia Nakamura, Daniela Raciti, Gary Schindelman, Mary Ann Tuli, Kimberly Van Auken, Daniel Wang, and Karen Yook.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Christian Grove (1)
  • Scott Cain (2)
  • Wen J. Chen (1)
  • Paul Davis (3)
  • Todd Harris (2)
  • Kevin L. Howe (3)
  • Ranjana Kishore (1)
  • Raymond Lee (1)
  • Michael Paulini (3)
  • Daniela Raciti (1)
  • Mary Ann Tuli (1)
  • Kimberly Van Auken (1)
  • Gary Williams (3)
  • The WormBase Consortium
  1. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, USA
  2. Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, Canada
  3. European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
