Clustering-Based HMP Sequence Comparison
Comparison of sequences from human microbiome projects using sequence clustering methods
Sequence clustering is a computational method that groups similar sequences into families. When sequences pooled from multiple samples of human microbiome projects or other metagenomic projects are clustered together, the distribution of each sample's sequences across the resulting clusters provides an effective basis for comparing the samples.
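As an illustration of this idea, the following is a minimal sketch rather than the method of any particular tool. It assumes a greedy incremental scheme in which each read joins the first cluster whose representative is similar enough, and it uses k-mer Jaccard similarity as a crude stand-in for alignment-based sequence identity. The function names, the threshold, and the value of k are hypothetical choices made for the example.

```python
from collections import Counter, defaultdict


def kmers(seq, k=4):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=4):
    """Jaccard similarity of k-mer sets (assumed stand-in for alignment identity)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def greedy_cluster(reads, threshold=0.5, k=4):
    """Greedy incremental clustering of pooled reads.

    reads: list of (sample_id, sequence) tuples pooled from all samples.
    Reads are processed longest first; each read joins the first cluster
    whose representative is similar enough, otherwise it founds a new cluster.
    """
    clusters = []
    for sample_id, seq in sorted(reads, key=lambda r: len(r[1]), reverse=True):
        for cluster in clusters:
            if similarity(seq, cluster["rep"], k) >= threshold:
                cluster["members"].append((sample_id, seq))
                break
        else:
            # No existing representative matched: start a new cluster.
            clusters.append({"rep": seq, "members": [(sample_id, seq)]})
    return clusters


def abundance_table(clusters):
    """For each cluster index, count how many reads each sample contributes."""
    table = defaultdict(Counter)
    for idx, cluster in enumerate(clusters):
        for sample_id, _ in cluster["members"]:
            table[idx][sample_id] += 1
    return {idx: dict(counts) for idx, counts in table.items()}
```

Calling `abundance_table(greedy_cluster(reads))` on reads pooled from several samples yields a cluster-by-sample count table. Clusters populated by only one sample point to sequences specific to that sample, while shared clusters can be compared by their relative counts, without reference to any database.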
Numerous human microbiome projects and other metagenomic projects have sequenced large numbers of microbiome samples using high-throughput sequencing platforms. One of the key goals of these projects is to compare samples, or groups of samples, according to their composition and abundance profiles by taxon, gene, function, and pathway. These profiles are often calculated by comparing the sequences against various reference databases. However, reference-based methods cannot analyze the large number of novel sequences that are frequently found in metagenomic samples, because such sequences have no close match in the databases.
Clustering analysis is a data mining and classification method that assigns similar...