Multiobjective Clustering Using Support Vector Machine: Application to Microarray Cancer Data

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 243)

Abstract

Microarray technology makes it possible to analyze and interpret the expression profiles of a large number of genes across different experimental conditions or tissue samples simultaneously. In this paper, a clustering technique is applied to microarray cancer data using a multiobjective genetic algorithm with the non-dominated sorting GA (MOGA-NSGA-II). The two objective functions of this multiobjective clustering optimize cluster compactness and cluster separation. The multiobjective technique is first used to produce a set of non-dominated solutions. High-confidence points are then identified from this non-dominated set using a fuzzy voting technique. An SVM classifier is trained on the selected high-confidence points, and the remaining points are classified by the trained SVM. The performance of the proposed multiobjective clustering method is compared with that of other microarray clustering algorithms on two publicly available cancer data sets, viz. ovarian and colon cancer data, to establish its superiority.
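The first two stages of the pipeline described in the abstract (keeping non-dominated solutions, then voting to find high-confidence points) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that both objectives are expressed as values to minimize (separation is negated), that cluster labels are already aligned across solutions, and it substitutes a crisp majority vote for the paper's fuzzy voting. The function names `dominates`, `non_dominated`, and `high_confidence_points`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from collections import Counter


def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (all minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(objectives):
    """Return indices of objective vectors that no other vector dominates."""
    return [
        i
        for i, a in enumerate(objectives)
        if not any(dominates(b, a) for j, b in enumerate(objectives) if j != i)
    ]


def high_confidence_points(labelings, threshold=1.0):
    """Majority vote over aligned labelings (assumed simplification).

    Returns {point_index: label} for points whose winning label receives
    at least `threshold` of the votes.
    """
    n_solutions = len(labelings)
    selected = {}
    for point, votes in enumerate(zip(*labelings)):
        label, count = Counter(votes).most_common(1)[0]
        if count / n_solutions >= threshold:
            selected[point] = label
    return selected


# Toy objective vectors: (compactness, -separation), both to be minimized.
objectives = [(1.0, -5.0), (2.0, -6.0), (2.5, -5.5), (3.0, -4.0)]
front = non_dominated(objectives)

# Toy cluster labelings from three solutions over five points.
labelings = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 0, 1],
]
training_points = high_confidence_points(labelings, threshold=1.0)
```

In the method the abstract outlines, points selected this way would serve as the training set for an SVM classifier, and the points left out would be assigned by the trained classifier; that step is omitted here.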

Keywords

Fuzzy clustering, Multiobjective genetic algorithm, Pareto optimality, Support vector machine

Copyright information

© Springer India 2014

Authors and Affiliations

  1. Department of Computer Science and Engineering, NIT Rourkela, India
