Large-scale sequencing of normalized full-length cDNA library of soybean seed at different developmental stages and analysis of the gene expression profiles based on ESTs

Sha, Ai-Hua; Li, Chen; Yan, Xiao-Hong; Shan, Zhi-Hui; Zhou, Xin-An; Jiang, Mu-Lan; Mao, Han; Chen, Bo; Wan, Xia; Wei, Wen-Hui

doi:10.1007/s11033-011-1046-1

Published: 12 June 2011

Molecular Biology Reports, volume 39, pages 2867–2874 (2012)

Abstract

Although GenBank now holds over 1,400,000 expressed sequence tags (ESTs) from soybean, most publicly available ESTs were derived from tissues or environmental conditions other than developing seeds. Annotating the molecular mechanisms of soybean seed development requires a thorough analysis of the gene expression profiles of immature seeds at various stages. Here we constructed a full-length-enriched cDNA library comprising 45,408 cDNA clones that cover various stages of soybean seed development. Sequencing these clones from their 5′ ends yielded 36,656 ESTs. These ESTs were grouped into 27,982 unigenes, comprising 22,867 contigs and 5,115 singletons, of which 27,931 could be mapped onto the sequences of the 20 soybean chromosomes. Comparative genomic analysis with other plants revealed that these unigenes include many candidate genes specific to dicots, legumes and soybean. Approximately 1,789 of these unigenes currently show no homology to known soybean sequences, suggesting that many represent mRNAs specifically expressed in seeds. Novel abundant genes involved in oil synthesis were also found in this study and may serve as a valuable resource for soybean seed improvement.
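
The counts reported in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the published analysis: the variable names, the `pct` helper, and the reading of ESTs per clone as a "sequencing yield" are assumptions made here for illustration. Only the seven counts come from the abstract.

```python
# Counts as reported in the abstract; everything else is illustrative.
CLONES = 45_408        # cDNA clones in the library
ESTS = 36_656          # ESTs obtained from 5' end sequencing
UNIGENES = 27_982      # unigenes after clustering
CONTIGS = 22_867       # unigenes assembled from more than one EST
SINGLETONS = 5_115     # unigenes represented by a single EST
MAPPED = 27_931        # unigenes mapped to the 20 chromosomes
NO_HOMOLOGY = 1_789    # unigenes without homology to known soybean sequences


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def summarize() -> dict:
    """Derive simple ratios from the reported counts."""
    return {
        # Contigs plus singletons should account for every unigene.
        "unigenes_consistent": CONTIGS + SINGLETONS == UNIGENES,
        # Assumed interpretation: share of clones that produced an EST.
        "sequencing_yield_pct": pct(ESTS, CLONES),
        "mapped_pct": pct(MAPPED, UNIGENES),
        "no_homology_pct": pct(NO_HOMOLOGY, UNIGENES),
        "unmapped_unigenes": UNIGENES - MAPPED,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Under these assumptions, the contig and singleton counts sum exactly to the reported unigene total, and all but 51 unigenes map to the chromosome sequences.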

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant 30671312), the Major S&T Projects on the Cultivation of New Varieties of Genetically Modified Organisms (Grants 2008ZX08004-005, 2009ZX08009-120B and 2011ZX08004-005), and the National Nonprofit Institute Research Grant of CATAS-ITBB (Grant 20075049).

Author information

Authors and Affiliations

Institute of Oil Crops, Key Laboratory of Oil Crop Biology of the Ministry of Agriculture, Chinese Academy of Agricultural Sciences, Wuhan, 430062, China
Ai-Hua Sha, Chen Li, Xiao-Hong Yan, Zhi-Hui Shan, Xin-An Zhou, Mu-Lan Jiang, Han Mao, Bo Chen, Xia Wan & Wen-Hui Wei

Corresponding authors

Correspondence to Xin-An Zhou, Mu-Lan Jiang or Wen-Hui Wei.

Additional information

Ai-Hua Sha, Chen Li and Xiao-Hong Yan contributed equally to this work.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 51 kb)

Supplementary material 2 (DOC 85 kb)

Supplementary material 3 (DOC 29 kb)

Supplementary material 4 (DOC 33173 kb)

Supplementary material 5 (DOC 34 kb)

Supplementary material 6 (XLS 4532 kb)

Supplementary material 7 (XLS 4671 kb)

Supplementary material 8 (XLS 3728 kb)

Supplementary material 9 (XLS 2733 kb)

Supplementary material 10 (XLS 2674 kb)

Supplementary material 11 (XLS 2029 kb)

Supplementary material 12 (XLS 1393 kb)

Supplementary material 13 (XLS 971 kb)

Supplementary material 14 (XLS 989 kb)

About this article

Cite this article

Sha, AH., Li, C., Yan, XH. et al. Large-scale sequencing of normalized full-length cDNA library of soybean seed at different developmental stages and analysis of the gene expression profiles based on ESTs. Mol Biol Rep 39, 2867–2874 (2012). https://doi.org/10.1007/s11033-011-1046-1

Received: 29 November 2010
Accepted: 04 June 2011
Published: 12 June 2011
Issue Date: March 2012
DOI: https://doi.org/10.1007/s11033-011-1046-1
