Using BUSCO to Assess Insect Genomic Resources

  • Robert M. Waterhouse
  • Mathieu Seppey
  • Felipe A. Simão
  • Evgeny M. Zdobnov
Part of the Methods in Molecular Biology book series (MIMB, volume 1858)


The increasing affordability of sequencing technologies offers many new and exciting opportunities to address a diverse array of biological questions. This is evidenced in entomological research by numerous genomics and transcriptomics studies that attempt to decipher the often complex relationships among different species or orders and to build “omics” resources to drive advancement of the molecular understanding of insect biology. Being able to gauge the quality of the sequencing data is of critical importance to understanding the potential limitations on the types of questions that these data can be reliably used to address. This chapter details the use of the Benchmarking Universal Single-Copy Orthologue (BUSCO) assessment tool to estimate the completeness of transcriptomes, genome assemblies, and annotated gene sets in terms of their expected gene content.

Key words

  • Genomics
  • Transcriptomics
  • Genome annotation
  • Completeness assessment
  • Single-copy orthologues



R.M.W. was supported by Swiss National Science Foundation award PP00P3_170664.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Robert M. Waterhouse (1, corresponding author)
  • Mathieu Seppey (2)
  • Felipe A. Simão (2)
  • Evgeny M. Zdobnov (2)
  1. Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, Switzerland
  2. University of Geneva and Swiss Institute of Bioinformatics, Geneva, Switzerland
