
Biclustering via structured regularized matrix decomposition


Biclustering is a machine learning problem that deals with the simultaneous clustering of the rows and columns of a data matrix. Complex structures in the data matrix, such as overlapping biclusters, have challenged existing methods. In this paper, we first provide a unified formulation of biclustering that uses structured regularized matrix decomposition and synthesizes various existing methods, and then develop a new biclustering method called BCEL based on this formulation. The biclustering problem is formulated as a penalized least-squares problem that approximates the data matrix \(\mathbf {X}\) by a multiplicative matrix decomposition \(\mathbf {U}\mathbf {V}^T\) with sparse columns in both \(\mathbf {U}\) and \(\mathbf {V}\). The squared \(\ell _{1,2}\)-norm penalty, also called the exclusive Lasso penalty, is applied to both \(\mathbf {U}\) and \(\mathbf {V}\) to help identify the rows and columns included in the biclusters. The penalized least-squares problem is solved by a novel computational algorithm that combines alternating minimization and the proximal gradient method. A subsampling-based procedure called stability selection is developed to select the tuning parameters and determine the bicluster membership. BCEL is shown to be competitive with existing methods in simulation studies and in an application to a real-world single-cell RNA sequencing dataset.
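As a rough illustration of the penalized least-squares formulation described above, the following minimal sketch evaluates a squared-error fit term plus a squared \(\ell _{1,2}\)-norm penalty on \(\mathbf {U}\) and \(\mathbf {V}\). Several details are assumptions rather than the paper's specification: each row of a factor matrix is treated as one group, the fit term carries a factor of 1/2, the two penalties get separate weights `lam_u` and `lam_v`, and the function names are hypothetical. It only evaluates the objective; it does not implement the alternating minimization or proximal gradient steps.

```python
def exclusive_lasso_penalty(M):
    """Squared l_{1,2} norm: sum over groups of (l1 norm of the group)^2.

    Assumption: each row of M (a list of rows) forms one group.
    """
    return sum(sum(abs(x) for x in row) ** 2 for row in M)


def reconstruct(U, V):
    """Return U V^T for U (n x K) and V (p x K), both given as lists of rows."""
    return [[sum(u_k * v_k for u_k, v_k in zip(u_row, v_row)) for v_row in V]
            for u_row in U]


def bcel_style_objective(X, U, V, lam_u, lam_v):
    """Penalized least-squares value.

    The 1/2 factor and the separate weights lam_u, lam_v are assumptions
    made for this sketch.
    """
    approx = reconstruct(U, V)
    fit = 0.5 * sum((x - a) ** 2
                    for x_row, a_row in zip(X, approx)
                    for x, a in zip(x_row, a_row))
    return (fit
            + lam_u * exclusive_lasso_penalty(U)
            + lam_v * exclusive_lasso_penalty(V))


# Toy data: two non-overlapping biclusters in a 4 x 4 matrix.
U = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
V = [[2.0, 0.0], [2.0, 0.0], [0.0, 3.0], [0.0, 3.0]]
X = reconstruct(U, V)  # exact rank-2 data, so the fit term is zero

value = bcel_style_objective(X, U, V, lam_u=0.1, lam_v=0.1)
print(value)
```

Because the toy `X` is built exactly from `U` and `V`, the printed value reflects only the two penalty terms, which makes it easy to see how the weights trade off sparsity against fit.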






Part of the work of Jianhua Z. Huang was done when he was with Texas A&M University and was partly supported by NSF Grants No. 1956219 and 1900990. Huang was also partly supported by funding from the Pengcheng Peacock Program of Shenzhen.

Author information


Corresponding author

Correspondence to Yan Zhong.


About this article


Cite this article

Zhong, Y., Huang, J.Z. Biclustering via structured regularized matrix decomposition. Stat Comput 32, 37 (2022).


Keywords

  • Biclustering
  • Squared \(\ell _{1,2}\)-norm
  • Structured sparsity
  • Stability selection