
Identification of the clustering structure in microbiome data by density clustering on the Manhattan distance



Clustering groups data points so that each cluster contains points that are similar to one another. In real datasets such as microbiome data, each data point is represented as a profile or probability distribution. Such data points form the periphery of a cluster, which makes the real clustering structure difficult to identify. In this study, we applied density clustering under several distance measures to overcome this difficulty. Experiments on a real dataset indicated that the Manhattan distance is an appropriate distance measure for clustering analysis of microbiome data.
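To make the idea concrete, the following is a minimal sketch of density clustering on the Manhattan distance, written in the style of density-peak clustering (Rodriguez and Laio, 2014). The abstract does not specify the exact algorithm or parameters, so everything here is an assumption for illustration: the function names `manhattan` and `density_peaks`, the `cutoff` value, the way centers are picked, and the toy profiles, which are not data from the study.

```python
def manhattan(p, q):
    """Manhattan (L1) distance between two equal-length profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def density_peaks(profiles, cutoff, n_clusters):
    """Illustrative density-peak clustering on the Manhattan distance.

    cutoff and n_clusters are hypothetical tuning parameters chosen by hand.
    Returns one integer cluster label per profile.
    """
    n = len(profiles)
    dist = [[manhattan(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]

    # Local density: how many other profiles lie within the cutoff.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]

    # Visit points from highest to lowest density (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta: distance to the nearest point that comes earlier in the order.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # convention for the densest point
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers: the densest point, plus the points with the largest rho * delta.
    rest = sorted(order[1:], key=lambda i: (-(rho[i] * delta[i]), i))
    centers = [order[0]] + rest[:n_clusters - 1]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Every other point inherits the label of its denser nearest neighbour,
    # which has already been labelled because we walk in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Toy relative-abundance profiles (each row sums to 1); not real data.
profiles = [
    [0.70, 0.20, 0.10], [0.68, 0.22, 0.10], [0.72, 0.18, 0.10],
    [0.10, 0.20, 0.70], [0.12, 0.18, 0.70], [0.10, 0.22, 0.68],
]
labels = density_peaks(profiles, cutoff=0.1, n_clusters=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because each profile sums to 1, the Manhattan distance between two profiles is bounded by 2, which makes a fixed cutoff easier to reason about than it would be for unnormalized counts. Swapping `manhattan` for another distance function is the only change needed to compare distance measures in this sketch.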




Author information

Correspondence to Tingting He.


About this article


Cite this article

Jiang, X., Hu, X. & He, T. Identification of the clustering structure in microbiome data by density clustering on the Manhattan distance. Sci. China Inf. Sci. 59, 070104 (2016). https://doi.org/10.1007/s11432-016-5587-8



  • microbiome
  • information distance
  • data visualization
  • density clustering
  • microbial community