BUSCO: Assessing Genome Assembly and Annotation Completeness
Genomics drives the current progress in molecular biology, generating unprecedented volumes of data. The scientific value of these sequences depends on the ability to evaluate their completeness using a biologically meaningful approach. Here, we describe the use of the BUSCO tool suite to assess the completeness of genomes, gene sets, and transcriptomes, using their gene content as a complementary method to common technical metrics. The chapter introduces the concept of universal single-copy genes, which underlies the BUSCO methodology, covers the basic requirements to set up the tool, and provides guidelines to properly design the analyses, run the assessments, and interpret and utilize the results.
Key words: BUSCO, Orthologs, Genome completeness, Quality assessment, Gene content, Phylogenomics
We would like to thank all members of the Zdobnov group, in particular Felipe Simão and Christopher Rands for their useful comments. This work was partly supported by the Swiss Institute of Bioinformatics SER funding and the Swiss National Science Foundation funding 31003A_166483 to E.Z.