As we saw earlier in the visualizations provided by methods like PCA and SOM, it is often interesting to look for structure, or groupings, in the data. These methods do not explicitly define clusters, however; that is left to the pattern recognition capabilities of the scientist studying the plot. In many cases it is useful to rely on somewhat more formal methods, and this is where clustering methods come in. They are usually based on objectwise similarities or distances, and since the late nineties they have become hugely popular in the area of high-throughput measurement techniques in biology, such as DNA microarrays. There, the activities of tens of thousands of genes are measured, often as a function of a specific treatment, or as a time series. The obvious question is which genes show the same activity pattern: if an unknown gene behaves much like another gene that is known to be involved in a process such as cell differentiation, one can hypothesise that the unknown gene is somehow related to this process as well.
Keywords: Hierarchical Cluster; Bayesian Information Criterion; Cluster Center; Partitional Method; Adjusted Rand Index