Functional & Integrative Genomics, Volume 8, Issue 1, pp 69–78

Genome-wide analysis of intronless genes in rice and Arabidopsis

  • Mukesh Jain
  • Paramjit Khurana
  • Akhilesh K. Tyagi
  • Jitendra P. Khurana
Original Paper


Intronless genes, a characteristic feature of prokaryotes, constitute a significant portion of eukaryotic genomes. Our analysis revealed the presence of 11,109 (19.9%) and 5,846 (21.7%) intronless genes in the rice and Arabidopsis genomes, respectively, belonging to different cellular role and gene ontology categories. The distribution and conservation of rice and Arabidopsis intronless genes among different taxonomic groups have been analyzed. A total of 301 and 296 intronless genes from rice and Arabidopsis, respectively, are conserved among organisms representing the three major domains of life, i.e., archaea, bacteria, and eukaryotes. These evolutionarily conserved proteins are predicted to be involved in housekeeping cellular functions. Interestingly, among the 68% of rice and 77% of Arabidopsis intronless genes present only in eukaryotic genomes, approximately 51% and 57% of the genes, respectively, have orthologs only in plants and thus may represent plant-specific genes. Furthermore, 831 and 144 intronless genes of rice and Arabidopsis, respectively, referred to as ORFans, do not exhibit homology to any of the genes in the database and may perform species-specific functions. These data can serve as a resource for further comparative, evolutionary, and functional analysis of intronless genes in plants and other organisms.
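
The conservation categories named in the abstract (conserved across all three domains, eukaryote-only, plant-specific, ORFan) can be illustrated with a short sketch. This is not the authors' pipeline: the group labels, the `classify` function, and the input format (the set of taxonomic groups in which a homolog was detected, excluding the query species itself) are assumptions made purely for illustration.

```python
# Hypothetical group labels; the paper's actual taxonomic grouping may differ.
PROKARYOTIC = {"archaea", "bacteria"}
EUKARYOTIC = {"plants", "fungi", "animals", "protists"}


def classify(hits):
    """Assign a conservation category to one plant intronless gene.

    `hits` is the set of taxonomic groups (other than the query species)
    in which a homolog was detected. The query gene itself is eukaryotic,
    so hits in both archaea and bacteria mean all three domains are covered.
    """
    hits = set(hits)
    unknown = hits - PROKARYOTIC - EUKARYOTIC
    if unknown:
        raise ValueError(f"unknown taxonomic groups: {sorted(unknown)}")
    if not hits:
        return "ORFan"  # no detectable homolog anywhere
    prokaryotic_hits = hits & PROKARYOTIC
    if prokaryotic_hits == PROKARYOTIC:
        return "all three domains"
    if not prokaryotic_hits:
        # Only eukaryotic hits remain; plant-only hits are a subset of these.
        return "plant-specific" if hits == {"plants"} else "eukaryote-only"
    return "other"  # homologs in exactly one prokaryotic domain


if __name__ == "__main__":
    # Made-up locus names and hit sets, for demonstration only.
    examples = {
        "gene_a": set(),
        "gene_b": {"plants"},
        "gene_c": {"plants", "animals"},
        "gene_d": {"archaea", "bacteria", "fungi"},
    }
    for locus, groups in examples.items():
        print(locus, classify(groups))
```

In this sketch "plant-specific" is reported separately from "eukaryote-only", whereas the abstract counts plant-specific genes as a subset of the eukaryote-only genes; the two labels would be summed to recover the eukaryote-only total.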


Keywords: Rice, Arabidopsis, Intronless genes, Evolution



We are thankful to Rashmi Jain for technical assistance. This work was supported financially by the Department of Biotechnology, Government of India, and the University Grants Commission, New Delhi. MJ acknowledges the Council of Scientific and Industrial Research, New Delhi, for the award of a Senior Research Fellowship.

Supplementary material

10142_2007_52_MOESM1_ESM.pdf (258 kb)
Supplemental data file 1 Predicted intronless genes in rice. (PDF 263 kb)
10142_2007_52_MOESM2_ESM.pdf (180 kb)
Supplemental data file 2 Predicted intronless genes in Arabidopsis. (PDF 184 kb)
10142_2007_52_MOESM3_ESM.pdf (271 kb)
Supplemental data file 3 Cellular role and GO category of rice intronless genes predicted by ProtFun. (PDF 277 kb)
10142_2007_52_MOESM4_ESM.pdf (171 kb)
Supplemental data file 4 Cellular role and GO category of Arabidopsis intronless genes predicted by ProtFun. (PDF 174 kb)
10142_2007_52_MOESM5_ESM.pdf (166 kb)
Supplemental data file 5 Locus IDs of rice intronless genes present in archaea, bacteria, and/or eukaryotes. (PDF 169 kb)
10142_2007_52_MOESM6_ESM.pdf (144 kb)
Supplemental data file 6 Locus IDs of Arabidopsis intronless genes present in archaea, bacteria, and/or eukaryotes. (PDF 147 kb)
10142_2007_52_MOESM7_ESM.xls (410 kb)
Supplemental data file 7 Locus IDs of rice intronless genes present specifically in different taxonomic groups. (XLS 419 kb)
10142_2007_52_MOESM8_ESM.xls (300 kb)
Supplemental data file 8 Locus IDs of Arabidopsis intronless genes present specifically in different taxonomic groups. (XLS 306 kb)
10142_2007_52_MOESM9_ESM.pdf (58 kb)
Supplemental data file 9 Locus IDs of rice and Arabidopsis intronless ORFans. (PDF 59 kb)



Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  • Mukesh Jain (1)
  • Paramjit Khurana (1)
  • Akhilesh K. Tyagi (1)
  • Jitendra P. Khurana (1, corresponding author)
  1. Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi South Campus, New Delhi, India
