SoyBase: A Comprehensive Database for Soybean Genetic and Genomic Data

  • David Grant
  • Rex T. Nelson
Part of the Compendium of Plant Genomes book series (CPG)


SoyBase, the USDA-ARS soybean genetics and genomics database, provides a comprehensive collection of data, analysis tools, and links to external resources of interest to soybean researchers. The SoyBase home page contains the SoyBase Toolbox, which provides quick access to a search of the SoyBase database or the SoyCyc metabolic pathways database, access to the data download page, a genome sequence BLAST tool, and direct links to the genetic and sequence maps. An extensive navigation menu and site description provide easy access to all sections of SoyBase. Comprehensive information for a number of data types is available at SoyBase, including the current genetic maps, the soybean reference genome sequence with tracks covering genetic markers, genome organization, gene annotation and expression, and gene knockout mutants. SoyBase includes an extensive RNA-Seq gene atlas and innovative tools for identifying fast neutron-induced mutants. SoyBase is an actively curated database, with new data regularly being incorporated, including additions to the controlled vocabularies (ontologies) for soybean growth, development and phenotypic traits, soybean genes, and quantitative trait loci. New “omics” tools enable sophisticated queries on lists of genes. These features are all accessed using intuitive interfaces and are linked together wherever possible.



The SoyBase Development Group includes David Grant, Rex T. Nelson, Kevin Feeley, Nathan Weeks, Jacqueline D. Campbell, Victoria Carollo Blake, Wei Huang, and Steven B. Cannon.

Data in SoyBase were generously provided by many cooperators or were collected from the literature by SoyBase staff.

Funding: This work was supported by the US Department of Agriculture, Agricultural Research Service (USDA-ARS) CRIS Project 3625-21000-062-00D.

Conflict of interest: No conflicts of interest declared.


This digital data set publication was prepared by an agency of the US Government. Neither the US Government nor any agency thereof, nor any of their employees, makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or misuse of the data, or for damage, transmission of viruses, or computer contamination through the distribution of these data sets or for the usefulness of any information, apparatus, product, or process disclosed in this report, or represents that its use would not infringe privately owned rights. Reference therein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the US Government or any agency thereof. Any views and opinions of authors expressed herein do not necessarily state or reflect those of the US Government or any agency thereof.

The US Department of Agriculture (USDA) prohibits discrimination in all its programs and activities on the basis of race, color, national origin, age, disability, and where applicable, sex, marital status, familial status, parental status, religion, sexual orientation, genetic information, political beliefs, reprisal, or because all or a part of an individual’s income is derived from any public assistance program. (Not all prohibited bases apply to all programs.) Persons with disabilities who require alternative means for communication of program information (Braille, large print, audiotape, etc.) should contact USDA’s TARGET Center at (202) 720-2600 (voice and TDD). To file a complaint of discrimination, write to USDA, Director, Office of Civil Rights, 1400 Independence Avenue, SW., Washington, DC 20250–9410, or call (800) 795-3272 (voice) or (202) 720-6382 (TDD). USDA is an equal opportunity provider and employer.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. USDA-ARS, Corn Insects and Crop Genetics Research Unit, Iowa State University, Ames, USA
  2. Department of Agronomy, Iowa State University, Ames, USA
