Bi-clustering continuous data with self-organizing map

  • Original Article
Neural Computing and Applications

Abstract

In this paper, we present a new SOM-based bi-clustering approach for continuous data, called Bi-SOM (Bi-clustering based on Self-Organizing Map). The goal of bi-clustering is to group the rows and columns of a given data matrix simultaneously. We also address three issues related to this task: (1) the topological visualization of bi-clusters with respect to their neighborhood relation, (2) the optimization of these bi-clusters into macro-blocks and (3) dimensionality reduction by iteratively eliminating noise blocks. Finally, experiments on several data sets validate our approach in comparison with other bi-clustering methods.
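
To make the stated goal concrete, the following minimal sketch shows how a joint partition of rows and columns splits a data matrix into blocks, each summarized here by its mean. It is not the Bi-SOM algorithm: the row and column labels are assumed to be given in advance, and the function name block_means and the toy data are hypothetical.

```python
from collections import defaultdict


def block_means(matrix, row_labels, col_labels):
    """Average the entries of each (row cluster, column cluster) block.

    row_labels[i] and col_labels[j] are hypothetical cluster assignments
    supplied by the caller; how they are found is outside this sketch.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            sums[key] += value
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy 4x4 matrix with two row groups and two column groups.
data = [
    [5.0, 5.2, 0.1, 0.0],
    [4.8, 5.0, 0.0, 0.2],
    [0.1, 0.0, 3.0, 3.2],
    [0.0, 0.1, 2.8, 3.0],
]
row_labels = [0, 0, 1, 1]  # assumed row clusters
col_labels = [0, 0, 1, 1]  # assumed column clusters

means = block_means(data, row_labels, col_labels)
for key in sorted(means):
    print(key, round(means[key], 3))
```

In this toy matrix the two diagonal blocks have high means and the off-diagonal blocks are close to zero, which is the kind of block structure a bi-clustering method looks for.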

Author information

Corresponding author

Correspondence to Khalid Benabdeslem.

About this article

Cite this article

Benabdeslem, K., Allab, K. Bi-clustering continuous data with self-organizing map. Neural Comput & Applic 22, 1551–1562 (2013). https://doi.org/10.1007/s00521-012-1047-6

  • DOI: https://doi.org/10.1007/s00521-012-1047-6
