Parameterized low-rank binary matrix approximation

Fomin, Fedor V.; Golovach, Petr A.; Panolan, Fahad

doi:10.1007/s10618-019-00669-5

Published: 02 January 2020

Volume 34, pages 478–532, (2020)

Data Mining and Knowledge Discovery

Abstract

Low-rank binary matrix approximation is a generic problem where one seeks a good approximation of a binary matrix by another binary matrix with some specific properties. A good approximation means that the difference between the two matrices in some matrix norm is small. The properties of the approximation binary matrix could be: a small number of different columns, a small binary rank or a small Boolean rank. Unfortunately, most variants of these problems are NP-hard. Due to this, we initiate the systematic algorithmic study of low-rank binary matrix approximation from the perspective of parameterized complexity. We show in which cases and under what conditions the problem is fixed-parameter tractable, admits a polynomial kernel and can be solved in parameterized subexponential time.
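The three target properties named in the abstract can be illustrated with a small sketch. Everything below is an assumption for illustration, not the authors' algorithm: the helper names are hypothetical, "binary rank" is read as rank over GF(2), and the distance is taken to be the number of differing entries (Hamming distance). A rank-r approximation is represented as a product of an m×r and an r×n binary matrix, evaluated either over GF(2) or with Boolean (OR/AND) arithmetic.

```python
from typing import List

Matrix = List[List[int]]  # rows of 0/1 entries


def gf2_product(u: Matrix, v: Matrix) -> Matrix:
    """Product of u (m x r) and v (r x n) with arithmetic modulo 2."""
    r, n = len(v), len(v[0])
    return [[sum(row[t] & v[t][j] for t in range(r)) % 2 for j in range(n)]
            for row in u]


def boolean_product(u: Matrix, v: Matrix) -> Matrix:
    """Product of u and v where addition is logical OR (so 1 + 1 = 1)."""
    r, n = len(v), len(v[0])
    return [[int(any(row[t] & v[t][j] for t in range(r))) for j in range(n)]
            for row in u]


def hamming_distance(a: Matrix, b: Matrix) -> int:
    """Number of entries in which a and b differ (assumed error measure)."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def distinct_columns(a: Matrix) -> int:
    """Number of pairwise different columns of a."""
    return len(set(zip(*a)))


# Hypothetical input matrix and a candidate rank-2 factorization.
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
U = [[1, 0],
     [1, 1],
     [0, 1]]
V = [[1, 1, 0],
     [0, 1, 1]]

gf2_error = hamming_distance(A, gf2_product(U, V))
boolean_error = hamming_distance(A, boolean_product(U, V))
```

In this example the same pair of factors reproduces A exactly under Boolean arithmetic but misses one entry over GF(2), which shows why the two rank notions lead to different approximation problems.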

Notes

We are grateful to the anonymous reviewer who pointed out to us that the running time of our algorithm can be improved from the original \( 2^{{\mathcal {O}}(r\sqrt{k\log {(k+r)}})}\cdot nm\) to \( 2^{{\mathcal {O}}( \sqrt{rk\log {(k+r)}\log r})}\cdot nm\).

Acknowledgements

We thank Daniel Lokshtanov, Syed Mohammad Meesum and Saket Saurabh for helpful discussions on the topic of the paper. We are also very grateful to the anonymous reviewers, whose suggestions helped us to improve our results.

Funding

The research leading to these results has been supported by the Research Council of Norway via the projects “CLASSIS” (grant 249994) and “MULTIVAL” (grant 263317).

Author information

Authors and Affiliations

Department of Informatics, University of Bergen, PB 7803, 5020, Bergen, Norway
Fedor V. Fomin & Petr A. Golovach
Department of Computer Science and Engineering, IIT Hyderabad, Kandi, Sangareddy, Telangana, 502285, India
Fahad Panolan

Authors

Fedor V. Fomin
Petr A. Golovach
Fahad Panolan

Corresponding author

Correspondence to Petr A. Golovach.

Additional information

Responsible editor: Pauli Miettinen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work was done within the CEDAS center in Bergen. A preliminary version of this paper appeared as an extended abstract in the proceedings of ICALP 2018.

About this article

Cite this article

Fomin, F.V., Golovach, P.A. & Panolan, F. Parameterized low-rank binary matrix approximation. Data Min Knowl Disc 34, 478–532 (2020). https://doi.org/10.1007/s10618-019-00669-5

Received: 21 March 2019
Accepted: 09 December 2019
Published: 02 January 2020
Issue Date: March 2020
DOI: https://doi.org/10.1007/s10618-019-00669-5
