Abstract
In this article, we propose a cluster validity index, the Gaussian Fuzzy Index (GFI), based on fuzzy set theory, for validating the clusters obtained by a clustering algorithm. The index is then used to identify genes whose expression patterns change significantly from the normal stage to the diseased stage. This allows us to predict possible disease-mediating genes from microarray gene expression data. The methodology is demonstrated on a gene expression data set for human lung cancer. The performance of GFI is compared with that of eight existing cluster validity indices. The results are validated using biochemical pathways. We have also implemented several other cluster validity indices to demonstrate the superior capability of GFI.
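The abstract does not give the definition of GFI, so the sketch below does not attempt to implement it. As a hedged illustration of what a fuzzy cluster validity index computes, it evaluates two well-known indices, Bezdek's partition coefficient and the Xie-Beni index (with fuzzifier m = 2), on a toy membership matrix. The function names, the toy points, and the membership matrices are assumptions made for this example and are not taken from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def partition_coefficient(u):
    """Bezdek's partition coefficient: mean of squared memberships.

    u[i][k] is the membership of point k in cluster i, and every column
    is assumed to sum to 1. Values range from 1/c (maximally fuzzy)
    to 1 (crisp), so higher is better.
    """
    n = len(u[0])
    return sum(m * m for row in u for m in row) / n


def xie_beni(points, centers, u):
    """Xie-Beni index with fuzzifier m = 2 (lower is better).

    Ratio of the membership-weighted within-cluster scatter to
    n times the smallest squared distance between two centers.
    """
    n = len(points)
    c = len(centers)
    compactness = sum(
        u[i][k] ** 2 * dist(points[k], centers[i]) ** 2
        for i in range(c)
        for k in range(n)
    )
    separation = min(
        dist(centers[i], centers[j]) ** 2
        for i in range(c)
        for j in range(i + 1, c)
    )
    return compactness / (n * separation)


# Hypothetical toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
centers = [(0.0, 0.5), (5.0, 5.5)]
crisp = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]  # clear partition
fuzzy = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]  # uninformative partition

pc_crisp = partition_coefficient(crisp)
pc_fuzzy = partition_coefficient(fuzzy)
xb_crisp = xie_beni(points, centers, crisp)
xb_fuzzy = xie_beni(points, centers, fuzzy)
```

In this toy case both indices prefer the crisp partition over the uninformative one. A typical use of such an index is to run the clustering algorithm for several candidate numbers of clusters and keep the partition that the index scores best.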
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ghosh, A., De, R.K. (2013). Gaussian Fuzzy Index (GFI) for Cluster Validation: Identification of High Quality Biologically Enriched Clusters of Genes and Selection of Some Possible Genes Mediating Lung Cancer. In: Maji, P., Ghosh, A., Murty, M.N., Ghosh, K., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2013. Lecture Notes in Computer Science, vol 8251. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45062-4_96
DOI: https://doi.org/10.1007/978-3-642-45062-4_96
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45061-7
Online ISBN: 978-3-642-45062-4
eBook Packages: Computer Science, Computer Science (R0)