Bi-clustering of microarray data using a symmetry-based multi-objective optimization framework

  • Methodologies and Application
Soft Computing

Abstract

High-throughput technologies such as DNA microarrays allow the expression levels of thousands of genes to be monitored simultaneously during important biological processes and across a collection of experimental conditions. Automatically uncovering functionally related genes is a basic building block for solving various problems in functional genomics. However, a subset of genes may not behave similarly across all conditions in a dataset. This has made bi-clustering popular: it automatically identifies subsets of genes together with the corresponding subsets of conditions under which those genes are most similar. In the current study, we pose this problem in a multi-objective optimization (MOO) framework in which different bi-cluster quality measures are optimized simultaneously. The search capability of a simulated annealing-based MOO technique, AMOSA, is used for the simultaneous optimization of these measures. A case study on the suitability of different distance measures for solving the bi-clustering problem is also conducted. The effectiveness of the proposed multi-objective bi-clustering approach is demonstrated on three benchmark datasets. The obtained results are further validated using statistical and biological significance tests.
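
To make the idea of simultaneously optimized bi-cluster quality measures concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not name the measures used, so the sketch assumes two objectives that are common in the bi-clustering literature: the mean squared residue of Cheng and Church (2000), to be minimized, and the bi-cluster volume (genes × conditions), to be maximized. The function names `mean_squared_residue` and `dominates` and the toy matrix are hypothetical.

```python
import numpy as np

def mean_squared_residue(data, rows, cols):
    """Mean squared residue of the sub-matrix data[rows, cols].

    Assumed objective (Cheng and Church, 2000); lower means more coherent.
    """
    sub = data[np.ix_(rows, cols)]
    row_mean = sub.mean(axis=1, keepdims=True)
    col_mean = sub.mean(axis=0, keepdims=True)
    residue = sub - row_mean - col_mean + sub.mean()
    return float((residue ** 2).mean())

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return bool(np.all(a <= b) and np.any(a < b))

# Toy expression matrix (hypothetical): rows are genes, columns are conditions.
data = np.array([
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [3.0, 4.0, 5.0, 7.0],
    [8.0, 1.0, 6.0, 2.0],
])

def objectives(rows, cols):
    # Volume is negated so that both objectives are minimized.
    return (mean_squared_residue(data, rows, cols), -len(rows) * len(cols))

small = objectives([0, 1], [0, 1])              # coherent but small
medium = objectives([0, 1, 2], [0, 1, 2])       # coherent and larger
full = objectives([0, 1, 2, 3], [0, 1, 2, 3])   # largest but incoherent

print(dominates(medium, small))  # medium is at least as good on both objectives
print(dominates(medium, full))   # trade-off: neither dominates the other
```

In this toy case the 3 × 3 bi-cluster dominates the 2 × 2 one, while the 3 × 3 bi-cluster and the full matrix are mutually non-dominated. A multi-objective search such as AMOSA maintains a set of such non-dominated trade-off solutions rather than a single optimum.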

Notes

  1. http://www.cplusplus.com/reference/cstdlib/rand/

  2. http://promodel.com/onlinehelp/promodel/80/C-14%20-%20Rand().htm

Acknowledgements

The first author sincerely thanks Tata Consultancy Services (TCS) for providing funding to conduct this research.

Author information

Corresponding author

Correspondence to Sudipta Acharya.

Ethics declarations

Conflict of interest

Authors Sudipta Acharya, Sriparna Saha and Pracheta Sahoo declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Not applicable.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The first two authors contributed equally.

About this article

Cite this article

Acharya, S., Saha, S. & Sahoo, P. Bi-clustering of microarray data using a symmetry-based multi-objective optimization framework. Soft Comput 23, 5693–5714 (2019). https://doi.org/10.1007/s00500-018-3227-5

  • DOI: https://doi.org/10.1007/s00500-018-3227-5
