De novo assembly of a Chinese soybean genome

Abstract

Soybean was domesticated in China and has become one of the most important oilseed crops. Because of bottlenecks during their introduction and dissemination, soybeans from different geographic areas exhibit extensive genetic diversity. Asia is the largest soybean market; a high-quality soybean reference genome from this region is therefore critical for soybean research and breeding. Here, we report the de novo assembly and sequence analysis of the genome of the Chinese soybean cultivar “Zhonghuang 13”, generated from a combination of SMRT, Hi-C and optical mapping data. The assembled genome is 1.025 Gb in size, with a contig N50 of 3.46 Mb and a scaffold N50 of 51.87 Mb. Comparison of this genome with the previously reported reference genome (cv. Williams 82) uncovered more than 250,000 structural variations. A total of 52,051 protein-coding genes and 36,429 transposable elements were annotated in this genome, and a gene co-expression network comprising 39,967 genes was also established. This high-quality Chinese soybean genome and its sequence analysis will provide valuable information for future soybean improvement.
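
The contig and scaffold N50 values quoted in the abstract follow the usual definition: the length L such that sequences of length L or longer together make up at least half of the total assembly. The sketch below only illustrates that definition. It is not the authors' pipeline, the function name `n50` is hypothetical, and the lengths in the usage line are toy values rather than data from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical, not taken from the Zhonghuang 13 assembly).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

Applying the same function to contig lengths and to scaffold lengths gives the two statistics separately, which is why scaffolding with Hi-C and optical maps can raise the scaffold N50 well above the contig N50.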

Acknowledgements

This work was supported by the National Natural Science Foundation of China (91531304, 31525018, 31370266, and 31788103), the “Strategic Priority Research Program” of the Chinese Academy of Sciences (XDA08000000), and the State Key Laboratory of Plant Cell and Chromosome Engineering (PCCE–KF–2017–03).

Author information

Corresponding authors

Correspondence to Jianchang Du, Shisong Ma, or Zhixi Tian.

Cite this article

Shen, Y., Liu, J., Geng, H. et al. De novo assembly of a Chinese soybean genome. Sci. China Life Sci. 61, 871–884 (2018). https://doi.org/10.1007/s11427-018-9360-0

Keywords

  • de novo soybean genome
  • Zhonghuang 13
  • Gmax_ZH13
  • structure variation
  • gene co-expression network