Abstract
Comparative genomics uses genomic data from multiple species to understand how genomes and species evolve. The increasing availability of genomic resources (genomes, transcriptomes, epigenomes, proteomes, etc.) creates opportunities to explore species relationships through comparative genomics, which is most commonly used to determine structural and functional variation between genomes. Traditional approaches that study genomes in isolation limit both the kinds of questions that can be answered and the transferability of knowledge between species. Herein, we address recent advances in comparative genomics research, specifically in legumes, and how this wealth of knowledge can further expand our understanding of biological diversity. Comparative genomics can be performed at the genic or genomic level, for which numerous workflows are available, including gene prediction and annotation, inference of orthologous gene relationships, construction of gene and species phylogenetic trees, synteny analysis, identification of lineage-specific genes, and pan-genomic analyses.
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Masonbrink, R.E., Severin, A.J., Seetharam, A.S. (2017). Comparative Genomics of Soybean and Other Legumes. In: Nguyen, H., Bhattacharyya, M. (eds) The Soybean Genome. Compendium of Plant Genomes. Springer, Cham. https://doi.org/10.1007/978-3-319-64198-0_6
DOI: https://doi.org/10.1007/978-3-319-64198-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64196-6
Online ISBN: 978-3-319-64198-0
eBook Packages: Biomedical and Life Sciences (R0)