Bi-clustering via MDL-Based Matrix Factorization

Ramírez, Ignacio; Tepper, Mariano

doi:10.1007/978-3-642-41822-8_29


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8258)

Included in the following conference series:

Iberoamerican Congress on Pattern Recognition


Abstract

Bi-clustering, or co-clustering, refers to the task of finding sub-matrices (indexed by a group of columns and a group of rows) within a matrix such that the elements of each sub-matrix are related in some way, for example, by being similar under some metric. As in traditional clustering, a crucial parameter in bi-clustering methods is the number of groups that one expects to find in the data, which is not always known or easy to guess. The present paper proposes a novel method for performing bi-clustering based on the concept of low-rank sparse non-negative matrix factorization (S-NMF), with the additional benefit that the optimum rank k is chosen automatically using a minimum description length (MDL) selection procedure that favors models that represent the data with fewer bits. This MDL procedure is tested in combination with three different S-NMF algorithms, two of which are novel, on a simulated example in order to assess the validity of the procedure.
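The rank-selection idea in the abstract can be illustrated with a rough sketch. Everything below is an assumption made for illustration and is not the procedure of the chapter: plain multiplicative-update NMF (Lee and Seung) stands in for the S-NMF solvers, sparsity is imposed only by thresholding small coefficients, and the codelength is a crude two-part approximation (a Gaussian cost for the quantized residual, plus a fixed number of bits per nonzero coefficient and an enumerative cost for the support). The names `nmf`, `codelength_bits`, `select_rank`, and `toy_matrix`, and the parameters `thresh`, `bits_per_coef`, and `delta`, are hypothetical.

```python
import math
import numpy as np


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Plain multiplicative-update NMF: X ~ W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def log2_binom(n, s):
    """log2 of the binomial coefficient C(n, s), via log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(s + 1)
            - math.lgamma(n - s + 1)) / math.log(2)


def codelength_bits(X, W, H, thresh=1e-3, bits_per_coef=16, delta=1e-2):
    """Crude two-part description length in bits (illustrative assumption)."""
    # Sparsify the factors by zeroing small coefficients.
    Wq = np.where(W > thresh, W, 0.0)
    Hq = np.where(H > thresh, H, 0.0)
    # Data part: residual quantized to step `delta`, Gaussian model.
    var = max(float((X - Wq @ Hq).var()), 1e-12)
    per_sample = 0.5 * math.log2(2 * math.pi * math.e * var / delta ** 2)
    data_bits = X.size * max(per_sample, 0.0)
    # Model part: support positions (enumerative cost) plus coefficient values.
    model_bits = 0.0
    for F in (Wq, Hq):
        nnz = int(np.count_nonzero(F))
        model_bits += log2_binom(F.size, nnz) + nnz * bits_per_coef
    return data_bits + model_bits


def select_rank(X, ranks):
    """Return the rank with the smallest codelength and all codelengths."""
    bits = {}
    for k in ranks:
        W, H = nmf(X, k)
        bits[k] = codelength_bits(X, W, H)
    best_k = min(bits, key=bits.get)
    return best_k, bits


def toy_matrix(seed=0):
    """30 x 40 matrix with three planted blocks plus small positive noise."""
    rng = np.random.default_rng(seed)
    X = 0.05 * rng.random((30, 40))
    X[0:10, 0:15] += 1.0
    X[10:20, 15:30] += 1.0
    X[20:30, 30:40] += 1.0
    return X


X = toy_matrix()
best_k, bits = select_rank(X, range(1, 6))
print("selected rank:", best_k)
for k, b in bits.items():
    print(f"k={k}: {b:.0f} bits")
```

In this sketch, increasing k lowers the residual cost but raises the cost of describing the factors, so the total codelength trades fit against model complexity. Rows and columns with nonzero entries in the j-th column of W and the j-th row of H can then be read as the j-th bi-cluster.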



Author information

Authors and Affiliations

Universidad de la República, Uruguay
Ignacio Ramírez
Duke University, USA
Mariano Tepper


Editor information

Editors and Affiliations

Advanced Technologies Application Center (CENATAV), 7a A#21406 esq. 214 y 216, Rpto. Siboney, Playa., C.P. 12200, La Habana, Cuba
José Ruiz-Shulcloper
National Research Council (CNR), Institute of Cybernetics “E. Caianiello”, Via Campi Flegrei 34, 80078, Pozzuoli, Naples, Italy
Gabriella Sanniti di Baja


About this paper

Cite this paper

Ramírez, I., Tepper, M. (2013). Bi-clustering via MDL-Based Matrix Factorization. In: Ruiz-Shulcloper, J., Sanniti di Baja, G. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2013. Lecture Notes in Computer Science, vol 8258. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41822-8_29


DOI: https://doi.org/10.1007/978-3-642-41822-8_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41821-1
Online ISBN: 978-3-642-41822-8
eBook Packages: Computer Science, Computer Science (R0)


Societies and partnerships

The International Association for Pattern Recognition
