Machine learning improves genome quality prediction across the microbial tree of life

  • Research Briefing

From Nature Methods

CheckM2 is a tool that applies machine learning to evaluate the quality of genomes from metagenomic data. CheckM2 is faster and more accurate than existing methods, and it outperforms them when applied to novel lineages and lineages with reduced genome sizes, such as Patescibacteria and the DPANN superphylum.

Fig. 1: Genome quality predictions of CheckM2 compared to CheckM1.

References

  1. Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649–662 (2019). This article provides an example of the scale of MAG recovery from metagenomic data.

  2. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021). This article presents the groundbreaking application of machine learning to address the protein folding problem.

  3. Tang, B. et al. Recent advances of deep learning in bioinformatics and computational biology. Front. Genet. 10, 214 (2019). This review explains the nature of machine learning and how it is relevant to diverse biological problems.

  4. Parks, D. H. et al. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015). This paper presents CheckM1 — the basis for designing CheckM2 and one of the most popular tools used to assess genome quality.

  5. Simão, F. A. et al. BUSCO: Assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015). This article describes BUSCO, a highly popular alternative tool used to assess genome quality.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This is a summary of: Chklovski, A., Parks, D. H., Woodcroft, B. J. & Tyson, G. W. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nat. Methods https://doi.org/10.1038/s41592-023-01940-w (2023).

About this article

Cite this article

Machine learning improves genome quality prediction across the microbial tree of life. Nat Methods 20, 1137–1138 (2023). https://doi.org/10.1038/s41592-023-01941-9

  • DOI: https://doi.org/10.1038/s41592-023-01941-9

  • Springer Nature America, Inc.
