Molecular Genetics and Genomics, Volume 276, Issue 1, pp 1–12

Analysis of papaya BAC end sequences reveals first insights into the organization of a fruit tree genome

  • Chun Wan J. Lai
  • Qingyi Yu
  • Shaobin Hou
  • Rachel L. Skelton
  • Meghan R. Jones
  • Kanako L. T. Lewis
  • Jan Murray
  • Moriah Eustice
  • Peizhu Guan
  • Ricelle Agbayani
  • Paul H. Moore
  • Ray Ming
  • Gernot G. Presting
Original Paper

Abstract

Papaya (Carica papaya L.) is a major tree fruit crop of tropical and subtropical regions with an estimated genome size of 372 Mbp. We present the analysis of 4.7% of the papaya genome based on BAC end sequences (BESs) representing 17 million high-quality bases. Microsatellites discovered in 5,452 BESs, together with flanking primer sequences, are available to papaya breeding programs at http://www.genomics.hawaii.edu/papaya/BES. Sixteen percent of BESs contain plant repeat elements, the vast majority (83.3%) of which are class I retrotransposons. Several novel papaya-specific repeats were identified. Approximately 19.1% of the BESs have homology to Arabidopsis cDNA. Increasing numbers of completely sequenced plant genomes and BES projects enable novel approaches to comparative plant genomics. Paired BESs of Carica, Arabidopsis, Populus, Brassica and Lycopersicon were mapped onto the completed genomes of Arabidopsis and Populus. In general, the level of microsynteny was highest between closely related organisms. However, papaya revealed a higher degree of apparent synteny with the more distantly related poplar than with the more closely related Arabidopsis. This, as well as significant colinearity observed between peach and poplar genome sequences, supports recent observations of frequent genome rearrangements in the Arabidopsis lineage and suggests that the poplar genome sequence may be more useful for elucidating the papaya and other rosid genomes. These insights will play a critical role in selecting species and sequencing strategies that will optimally represent crop genomes in sequence databases.
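
As a rough illustration of how microsatellites (simple sequence repeats) might be located in a single BES, the sketch below scans a sequence for perfect tandem repeats with a regular expression. It is not the pipeline used in the paper: the function name find_ssrs, the unit-length range (2 to 6 nt) and the minimum of four repeats are all assumptions chosen only for the example.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    # A lazily matched unit followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        # Skip homopolymer runs such as AAAAAAAA reported as an "AA" repeat.
        if len(set(unit)) == 1:
            continue
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Hypothetical read fragment containing a (GA)5 microsatellite.
print(find_ssrs("TTGAGAGAGAGACC"))
```

In practice, each hit would be passed with its flanking sequence to a primer design tool; that step is not shown here.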

Keywords

Bacterial artificial chromosome, Carica papaya, Comparative genomics, Microsatellite, Genome mapping

Abbreviations

BAC: Bacterial artificial chromosome

BES: BAC end sequence

kb: Kilobase

Mbp: Megabase pairs

MYA: Million years ago

nt: Nucleotide

SSR: Simple sequence repeat

Supplementary material

438_2006_122_MOESM1_ESM.pdf (205 kb)

Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  • Chun Wan J. Lai (1)
  • Qingyi Yu (2)
  • Shaobin Hou (3)
  • Rachel L. Skelton (2)
  • Meghan R. Jones (2)
  • Kanako L. T. Lewis (3)
  • Jan Murray (2)
  • Moriah Eustice (1, 2)
  • Peizhu Guan (1, 2)
  • Ricelle Agbayani (1, 2)
  • Paul H. Moore (4)
  • Ray Ming (2, 5)
  • Gernot G. Presting (1)
  1. Department of Molecular Biosciences and Bioengineering, University of Hawai‘i, Honolulu, USA
  2. Hawaii Agriculture Research Center, Aiea, USA
  3. Center for Genomics, Proteomics and Bioinformatics Research Initiative, University of Hawai‘i, Honolulu, USA
  4. USDA-ARS, Pacific Basin Agricultural Research Center, Hilo, USA
  5. Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, USA
