Unsupervised Machine Learning Approach for Gene Expression Microarray Data Using Soft Computing Technique

  • Madhurima Rana
  • Prachi Vijayeeta
  • Utsav Kar
  • Madhabananda Das
  • B. S. P. Mishra
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 43)


Machine learning is a rapidly growing field concerned with extracting knowledge from large volumes of data. It is closely tied to optimization and artificial intelligence, and it contributes theory, methods, and application domains to statistics and computer science. Machine learning tasks are broadly classified into two groups: supervised learning and unsupervised learning. Analyzing unlabeled data is computationally demanding and relies on clustering algorithms. In this work, microarray gene expression data are clustered to separate regulating genes from non-regulating genes. An optimization technique, Cat Swarm Optimization (CSO), is used to minimize the number of clusters by evaluating the Euclidean distance among the centroids. A comparative study is carried out by clustering the regulating genes before and after optimization. Principal component analysis (PCA) is applied to reduce the dimensionality of the large dataset and so support the quality of the cluster analysis.
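The idea of reducing the number of clusters by looking at Euclidean distances among centroids can be illustrated with a minimal sketch. The code below is not the authors' method and does not implement Cat Swarm Optimization: it swaps in a simple greedy merge as a stand-in, purely to show how centroid distances might be evaluated. The function names, the toy data, and the `threshold` parameter are all assumptions made for illustration.

```python
import math
from itertools import combinations


def centroid(points):
    """Return the component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def closest_centroid_pair(clusters):
    """Return (distance, i, j) for the two clusters with the nearest centroids."""
    cents = [centroid(c) for c in clusters]
    return min(
        (math.dist(cents[i], cents[j]), i, j)
        for i, j in combinations(range(len(cents)), 2)
    )


def merge_close_clusters(clusters, threshold):
    """Greedily merge clusters while the nearest centroid pair is closer
    than `threshold` (a hypothetical parameter).

    This is a simple stand-in for illustration, NOT Cat Swarm Optimization.
    """
    clusters = [list(c) for c in clusters]  # copy so the input is not mutated
    while len(clusters) > 1:
        d, i, j = closest_centroid_pair(clusters)
        if d >= threshold:
            break
        clusters[i].extend(clusters[j])  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters


# Hypothetical toy data: three clusters of 2-D points, standing in for
# genes that have already been projected onto two principal components.
toy_clusters = [
    [[0.0, 0.0], [0.0, 2.0]],
    [[1.0, 1.0], [1.0, 3.0]],
    [[10.0, 10.0], [12.0, 10.0]],
]
merged = merge_close_clusters(toy_clusters, threshold=2.0)
print(len(toy_clusters), "clusters before,", len(merged), "after merging")
```

In this toy case the first two clusters have nearby centroids and are merged, while the third stays separate. A swarm-based optimizer such as CSO would search over candidate centroid positions instead of merging greedily, so the sketch only conveys the distance criterion, not the search strategy.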


Keywords: Gene expression; Microarray data; Principal component analysis (PCA); Hierarchical clustering (HC); Cat swarm optimization (CSO)


Copyright information

© Springer India 2016

Authors and Affiliations

  • Madhurima Rana (1)
  • Prachi Vijayeeta (1)
  • Utsav Kar (1)
  • Madhabananda Das (1)
  • B. S. P. Mishra (1)
  1. KIIT University, Bhubaneswar, India
