Metagenome-assembled genomes: concepts, analogies, and challenges

  • Letter to the Editor
  • Biophysical Reviews

Abstract

Metagenome-assembled genomes (MAGs) are microbial genomes reconstructed from metagenome data. In the last few years, many thousands of MAGs have been reported in the literature, for a variety of environments and host-associated microbiota, including those of humans. MAGs have helped us better understand microbial populations and their interactions with the environments in which they live; moreover, most MAGs belong to novel species and therefore help to reduce the so-called microbial dark matter. However, questions about the biological reality of MAGs have not, in general, been properly addressed. In this review, I define the notions of hypothetical MAGs and conserved hypothetical MAGs. These notions should help with the understanding of the biological reality of MAGs, their worldwide occurrence, and the efforts to improve MAG recovery processes.

Fig. 1

Funding

The author was funded in part by a CNPq Senior Researcher Fellowship.

Author information

Corresponding author

Correspondence to João C. Setubal.

Ethics declarations

Conflict of interest

The author declares no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Genome completeness

The completeness of MAGs and draft isolate genomes can be estimated by determining the fraction of certain marker genes present in the genome for the particular prokaryotic clade to which the MAG or the isolate belongs. These marker genes are assumed to be required in all members of the clade.
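The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular tool: the function name `estimate_completeness` and the marker gene list are hypothetical, chosen only to show the fraction being computed.

```python
def estimate_completeness(expected_markers, observed_genes):
    """Return the percentage of expected marker genes found in a genome.

    expected_markers: marker genes assumed to be present in every member
                      of the clade (hypothetical list, for illustration).
    observed_genes:   genes actually identified in the MAG or draft genome.
    """
    expected = set(expected_markers)
    if not expected:
        raise ValueError("expected_markers must not be empty")
    found = expected & set(observed_genes)
    return 100.0 * len(found) / len(expected)


# Hypothetical marker set for a clade and the genes found in one MAG.
clade_markers = ["rpoB", "gyrB", "recA", "rpsC", "rplB"]
mag_genes = ["rpoB", "recA", "rplB", "dnaK"]

completeness = estimate_completeness(clade_markers, mag_genes)
print(f"Estimated completeness: {completeness:.1f}%")
```

Here three of the five assumed markers are present, so the estimate is 60%. Genes outside the marker set (such as `dnaK` in this example) do not affect the result.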

Genome contamination

For a given isolate genome or MAG sequence, the percentage of the sequence that is estimated to belong to a different species.
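As a rough sketch of this definition, the following Python snippet computes the percentage of base pairs that sit in contigs assigned to a species other than the expected one. It assumes that each contig already carries a species label; how such labels are obtained is outside the scope of the sketch, and practical tools may estimate contamination by other means. The function name and the contig data are hypothetical.

```python
def estimate_contamination(contigs, target_species):
    """Return the percentage of sequence assigned to other species.

    contigs: list of (species_label, length_in_bp) pairs; the labels are
             assumed to come from some prior taxonomic assignment step.
    target_species: the species the genome is supposed to represent.
    """
    total = sum(length for _, length in contigs)
    if total == 0:
        raise ValueError("contigs must contain at least one base pair")
    foreign = sum(length for species, length in contigs
                  if species != target_species)
    return 100.0 * foreign / total


# Hypothetical MAG with three contigs, one of them from another species.
mag_contigs = [
    ("Escherichia coli", 4_500_000),
    ("Escherichia coli", 300_000),
    ("Bacillus subtilis", 200_000),
]

contamination = estimate_contamination(mag_contigs, "Escherichia coli")
print(f"Estimated contamination: {contamination:.1f}%")
```

In this example 200,000 of 5,000,000 bp are assigned to a different species, giving an estimate of 4%.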

Genome

The set of all DNA molecules in a cell.

Genome alignment

This is a particular case of DNA sequence alignment. A pairwise alignment algorithm seeks to establish a correspondence between positions in one sequence and positions in the other, in order to maximize the number of matching positions. When two sequences have 95% identity, this means that matches were found at 95% of the positions participating in the alignment. Because prokaryotic genomes usually have more than a million base pairs, and in some cases surpass ten million base pairs, their alignment requires special programs, different from those employed to align shorter sequences. One popular program to align genomes is MUMmer (Kurtz et al. 2004).
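The notion of percent identity can be illustrated with a short Python sketch. It takes two sequences that have already been aligned (same length, with `-` marking gaps) and counts matches among the aligned positions. Two assumptions are made here that the text does not specify: the gap symbol is `-`, and columns containing a gap are not counted as participating positions. This is only a toy calculation on short strings, not a genome aligner.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return percent identity over the aligned (non-gap) columns.

    Both inputs must be rows of the same pairwise alignment, so they
    must have equal length. Columns where either row has a gap are
    skipped (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical alignment: one gap column and one mismatch.
row_a = "ACGT-ACGTAC"
row_b = "ACGTTACGAAC"

identity = percent_identity(row_a, row_b)
print(f"Identity: {identity:.1f}%")
```

Of the 10 columns without a gap, 9 match, so the identity is 90%. Counting gap columns as mismatches instead would lower the value, which is why the convention used should always be stated.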

Homology and orthology

Two DNA sequences (in particular, two gene sequences) are homologous if they share a common ancestor. Homology is therefore a biological concept. In practice, one has to resort to sequence similarity in order to infer homology. This has led to widespread misleading statements in the literature, where it is easy to find expressions such as “sequences X and Y have 55% homology”; what the authors of such statements mean is that sequences X and Y, when aligned, display 55% sequence identity. When a homology relationship can be inferred between two DNA sequences in the absence of the complicating factor of duplications, the term orthology can be used. The expression “ortholog MAGs” is not standard and has been used in the spirit of the analogy, proposed in the text, between the annotation of protein-coding genes and MAG similarity relationships.

Reads

The output of a DNA sequencing machine. The length of a read can vary from 50 bp to thousands of kbp, depending on the sequencing technology.

About this article

Cite this article

Setubal, J.C. Metagenome-assembled genomes: concepts, analogies, and challenges. Biophys Rev 13, 905–909 (2021). https://doi.org/10.1007/s12551-021-00865-y

  • DOI: https://doi.org/10.1007/s12551-021-00865-y