Clustering

  • Ron Wehrens
Chapter
Part of the Use R book series (USE R)

Abstract

As we saw earlier in the visualizations provided by methods like PCA and SOM, it is often interesting to look for structure, or groupings, in the data. However, these methods do not explicitly define clusters; that is left to the pattern recognition capabilities of the scientist studying the plot. In many cases, however, it is useful to rely on somewhat more formal methods, and this is where clustering methods come in. They are usually based on objectwise similarities or distances, and since the late nineties have become hugely popular in the area of high-throughput measurement techniques in biology, such as DNA microarrays. There, the activities of tens of thousands of genes are measured, often as a function of a specific treatment, or as a time series. Of course, the question is which genes show the same activity pattern: if an unknown gene has much the same behaviour as another gene of which it is known that it is involved in a process like cell differentiation, one can hypothesise that the unknown gene is somehow related to this process as well.

Keywords

Hierarchical Cluster; Bayesian Information Criterion; Cluster Center; Partitional Method; Adjusted Rand Index
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Italy
