Stability-Based Model Order Selection in Clustering with Applications to Gene Expression Data
The concept of cluster stability is introduced to assess the validity of data partitionings found by clustering algorithms. It allows us to explicitly quantify the quality of a clustering solution without depending on external information. The principle of maximizing cluster stability can be interpreted as choosing the most self-consistent data partitioning. We present an empirical estimator, based on resampling, for the theoretically derived stability index. Experiments are conducted on well-known gene expression data sets, re-analyzing the work by Alon et al. and by Spellman et al.
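The abstract does not spell out the estimator, so the following is only a rough sketch of how a resampling-based stability score could be computed, not the authors' procedure. It assumes k-means as the clustering algorithm, a nearest-centroid rule to transfer one half's solution to the other half, brute-force label matching (practical only for small k), and normalization by the disagreement obtained with random labels. All function names (`kmeans`, `instability`, `three_blobs`, and so on) are hypothetical, and the data are synthetic rather than gene expression measurements.

```python
import itertools
import numpy as np


def farthest_first_init(X, k, rng):
    # Assumed initialization: random first center, then repeatedly the point
    # farthest from all centers chosen so far.
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    return np.array(centers)


def nearest_centroid(X, centroids):
    # Label each point with the index of its closest centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(X, k, rng, n_iter=50):
    # Plain Lloyd iterations; an empty cluster keeps its previous centroid.
    centroids = farthest_first_init(X, k, rng)
    for _ in range(n_iter):
        labels = nearest_centroid(X, centroids)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids


def disagreement(a, b, k):
    # Fraction of points labeled differently, minimized over all
    # relabelings of the k clusters (brute force over permutations).
    C = np.zeros((k, k), dtype=int)
    np.add.at(C, (a, b), 1)
    best = max(sum(C[i, p[i]] for i in range(k))
               for p in itertools.permutations(range(k)))
    return 1.0 - best / len(a)


def instability(X, k, n_splits=20, seed=0):
    # Split the data in halves A and B, cluster each, transfer A's solution
    # to B, and compare with B's own clustering. Lower values mean more stable.
    rng = np.random.default_rng(seed)
    observed, random_ref = [], []
    for _ in range(n_splits):
        idx = rng.permutation(len(X))
        A, B = X[idx[: len(X) // 2]], X[idx[len(X) // 2:]]
        cent_A, cent_B = kmeans(A, k, rng), kmeans(B, k, rng)
        own = nearest_centroid(B, cent_B)
        transferred = nearest_centroid(B, cent_A)
        observed.append(disagreement(own, transferred, k))
        # Assumed normalization: disagreement against uniformly random labels.
        random_ref.append(disagreement(own, rng.integers(k, size=len(B)), k))
    return float(np.mean(observed) / np.mean(random_ref))


def three_blobs(n_per_blob=50, seed=1):
    # Synthetic stand-in data: three well-separated Gaussian blobs in 2-D.
    rng = np.random.default_rng(seed)
    means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    return np.vstack([m + 0.5 * rng.standard_normal((n_per_blob, 2)) for m in means])


if __name__ == "__main__":
    X = three_blobs()
    for k in range(2, 6):
        print(k, round(instability(X, k), 3))
```

Under these assumptions, model order selection amounts to choosing the k with the lowest normalized instability. The normalization is included because raw disagreement values are not directly comparable across different numbers of clusters.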