A Betweenness Centrality Guided Clustering Algorithm and Its Applications to Cancer Diagnosis
Clustering has become an important data analysis technique for cancer discovery. Numerous clustering approaches have been proposed in recent years. However, handling high-dimensional cancer gene expression datasets remains an open challenge for clustering algorithms. In this paper, we present an improved graph-based clustering algorithm that applies the edge betweenness criterion to a spanning subgraph. We carry out an empirical analysis on artificial datasets and five cancer gene expression datasets. The results show that the proposed algorithm effectively discovers cancerous tissues and performs better than two recent graph-based clustering algorithms in terms of both cluster quality and modularity index.
Keywords: Clustering, Cancer diagnosis, Betweenness, Spanning subgraph, Minimum spanning tree
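
To make the general idea concrete, the following is a minimal sketch of betweenness-guided clustering on a spanning subgraph. It is not the authors' algorithm, whose details the abstract does not give. It assumes that the spanning subgraph is a Euclidean minimum spanning tree, that edge betweenness is computed on the unweighted graph with Brandes' algorithm, and that clusters are obtained by repeatedly removing the edge with the highest betweenness. The function names `minimum_spanning_tree`, `edge_betweenness`, and `cluster` are hypothetical.

```python
import math
from collections import defaultdict, deque


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (u, v) edges."""
    # best[v] = (distance to the nearest tree node, that tree node)
    best = {v: (math.dist(points[0], points[v]), 0) for v in range(1, len(points))}
    edges = []
    while best:
        v = min(best, key=lambda x: best[x][0])
        _, u = best.pop(v)
        edges.append((u, v))
        for w in best:
            d = math.dist(points[v], points[w])
            if d < best[w][0]:
                best[w] = (d, v)
    return edges


def edge_betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = defaultdict(float)
    for s in adj:
        order, preds = [], defaultdict(list)
        sigma = defaultdict(float)  # number of shortest paths from s
        sigma[s] = 1.0
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Every unordered pair of nodes was counted from both endpoints.
    return {edge: value / 2.0 for edge, value in score.items()}


def cluster(points, k):
    """Assumed scheme: cut the k - 1 highest-betweenness edges of the MST."""
    n = len(points)
    adj = {i: set() for i in range(n)}
    for u, v in minimum_spanning_tree(points):
        adj[u].add(v)
        adj[v].add(u)
    for _ in range(k - 1):
        scores = edge_betweenness(adj)  # recomputed after each removal
        u, v = max(scores, key=scores.get)
        adj[u].remove(v)
        adj[v].remove(u)
    # Label connected components in order of their smallest node index.
    labels, current = [-1] * n, 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if labels[y] == -1:
                    labels[y] = current
                    stack.append(y)
        current += 1
    return labels
```

For example, `cluster([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], 2)` separates the two groups of three points, because the single tree edge joining them carries every shortest path between the groups. On a tree, betweenness favors edges that split the nodes evenly, so this simplified scheme suits clusters of comparable size; a denser spanning subgraph or weighted betweenness would behave differently.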