Tree Genetics & Genomes, Volume 7, Issue 5, pp 933–940

Genome-wide BAC-end sequencing of Musa acuminata DH Pahang reveals further insights into the genome organization of banana

  • Rafael E. Arango
  • Roberto C. Togawa
  • Sebastien C. Carpentier
  • Nicolas Roux
  • Bas L. Hekkert
  • Gert H. J. Kema
  • Manoel T. Souza Jr.
Original Paper


Banana and plantain (Musa spp.) are grown in more than 120 countries in tropical and subtropical regions and constitute an important staple food for millions of people. A Musa acuminata ssp. malaccensis DH Pahang bacterial artificial chromosome (BAC) library (MAMB) was submitted for BAC-end sequencing. MAMB consists of 23,040 clones with an average insert size of 140 kbp, giving fivefold coverage of the banana genome. A total of 46,080 reads were generated, and 42,750 (92.8%) high-quality sequences were obtained after trimming for vector and quality. Analysis of these data shows a GC content of 41.39%, whereas interspersed repeats comprise 32.3%. The most common repeated sequences show homology to ribosomal RNA genes, particularly 18S rRNA, while the Ty3/gypsy-type monkey retrotransposon is the most common retroelement. The sequence data were used to generate a banana-specific repeat library containing 54 new repetitive elements, which accounted for 11.86% of the total nucleotides. Simple sequence repeats represent 0.7% of the sequence data and allowed the identification of 2,455 potentially useful marker sites. Functional annotation identified 2,705 sequences that could code for proteins of known function. Microsynteny analysis shows more co-linear matches to Oryza sativa than to Arabidopsis thaliana. This database of BAC-end sequences is useful for the assembly of the complete banana genome sequence and is important for identification in functional genomics experiments.
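
The library figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it uses the clone count, average insert size, read counts, and fivefold coverage stated in the abstract, and the function names are hypothetical. The genome size it prints is merely the value implied by those figures, not a measurement reported here.

```python
# Figures taken from the abstract; everything else is illustrative.
CLONES = 23_040            # clones in the MAMB library
AVG_INSERT_KBP = 140       # average insert size, in kbp
STATED_COVERAGE = 5        # "fivefold coverage" of the genome
READS_TOTAL = 46_080       # BAC-end reads generated
READS_HIGH_QUALITY = 42_750  # reads kept after vector/quality trimming


def library_span_mbp(clones: int, avg_insert_kbp: float) -> float:
    """Total cloned DNA in the library, in Mbp."""
    return clones * avg_insert_kbp / 1000


def implied_genome_size_mbp(clones: int, avg_insert_kbp: float, coverage: float) -> float:
    """Genome size implied by the library span and the stated coverage (assumption)."""
    return library_span_mbp(clones, avg_insert_kbp) / coverage


def pass_rate_percent(high_quality: int, total: int) -> float:
    """Share of reads that survived trimming, in percent."""
    return 100 * high_quality / total


if __name__ == "__main__":
    print("reads per clone:", READS_TOTAL / CLONES)
    print("library span (Mbp):", library_span_mbp(CLONES, AVG_INSERT_KBP))
    print("implied genome size (Mbp):",
          implied_genome_size_mbp(CLONES, AVG_INSERT_KBP, STATED_COVERAGE))
    print("pass rate (%):", round(pass_rate_percent(READS_HIGH_QUALITY, READS_TOTAL), 1))
```

Two reads per clone is what one would expect when both ends of each BAC insert are sequenced, and the pass rate rounds to the 92.8% given in the abstract.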


Keywords: BAC library; Musa acuminata ssp. malaccensis; GMGC; Genomics



This project was funded through a grant from the Stichting Het Groene Woudt and was partially supported by the Dioraphte Foundation. We thank Dr. Jane Grimwood (HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA) for the BAC-end sequencing and Stefaan Vandamme and Dr. Kris Laukens (CEPROMA, Antwerp, Belgium) for technical assistance with the protein searches.

Supplementary material

  • ESM 1 (PDF, 1467 kb): 11295_2011_385_MOESM1_ESM.pdf
  • Table 1a (XLS, 37 kb): 11295_2011_385_MOESM2_ESM.xls
  • Table 1b (XLS, 22 kb): 11295_2011_385_MOESM3_ESM.xls



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Rafael E. Arango (1, 2)
  • Roberto C. Togawa (3)
  • Sebastien C. Carpentier (4)
  • Nicolas Roux (5)
  • Bas L. Hekkert (6)
  • Gert H. J. Kema (6)
  • Manoel T. Souza Jr. (7)

  1. Unidad de Biotecnología Vegetal UNALMED-CIB, Corporación para Investigaciones Biológicas (CIB), Medellín, Colombia
  2. Escuela de Biociencias, Facultad de Ciencias, Universidad Nacional, Medellín, Colombia
  3. Embrapa Genetic Resources & Biotechnology, Brasília, Brazil
  4. Division of Crop Biosystems, K.U.Leuven, Leuven, Belgium
  5. Global Musa Genomics Consortium, Bioversity International, Montpellier, France
  6. Plant Research International, Wageningen, The Netherlands
  7. Embrapa LABEX Europe, Wageningen, The Netherlands
