The Boolean column and column-row matrix decompositions
Matrix decompositions are used for many data mining purposes. One of these purposes is to find a concise but interpretable representation of a given data matrix. Different decomposition formulations have been proposed for this task, many of which assume a certain property of the input data (e.g., nonnegativity) and aim to preserve that property in the decomposition. In this paper we propose new decomposition formulations for binary matrices, namely the Boolean CX and CUR decompositions. They are natural combinations of two previously presented decomposition formulations. We also consider two subproblems of these decompositions and present a rigorous theoretical study of them. We give algorithms for the decompositions and for the subproblems, and study their performance via extensive experimental evaluation. We show that even simple algorithms can give accurate and intuitive decompositions of real data, thus demonstrating the power and usefulness of the proposed decompositions.
Keywords: Matrix decompositions, Approximation, CX decomposition, CUR decomposition, Boolean decompositions
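To make the idea of a Boolean CX decomposition concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes the usual reading of the terms: C consists of columns selected from the binary input matrix A, X is a binary mixing matrix, and the product is taken over the Boolean semiring (OR of ANDs), with the error counted as the number of disagreeing entries. The function names (`boolean_product`, `reconstruction_error`, `greedy_mixing_matrix`), the greedy heuristic, and the toy matrix are all hypothetical and chosen only for illustration.

```python
import numpy as np

def boolean_product(C, X):
    # Boolean matrix product: entry (i, j) is OR over l of (C[i, l] AND X[l, j]).
    return (C.astype(int) @ X.astype(int)) > 0

def reconstruction_error(A, C, X):
    # Number of entries where A and the Boolean product of C and X disagree.
    return int(np.sum(A.astype(bool) != boolean_product(C, X)))

def greedy_mixing_matrix(A, C):
    # Illustrative heuristic (an assumption, not the paper's method):
    # for each column of A, repeatedly add the column of C that covers
    # the most still-uncovered 1s minus the 0s it would wrongly cover.
    A = A.astype(bool)
    C = C.astype(bool)
    k, n = C.shape[1], A.shape[1]
    X = np.zeros((k, n), dtype=bool)
    for j in range(n):
        target = A[:, j]
        covered = np.zeros_like(target)
        while True:
            best_gain, best_l = 0, None
            for l in range(k):
                if X[l, j]:
                    continue
                new = C[:, l] & ~covered
                gain = int(np.sum(new & target)) - int(np.sum(new & ~target))
                if gain > best_gain:
                    best_gain, best_l = gain, l
            if best_l is None:
                break
            X[best_l, j] = True
            covered |= C[:, best_l]
    return X

# Toy binary matrix: the middle column is the Boolean OR of the outer two.
A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 1],
              [0, 1, 1]])
C = A[:, [0, 2]]                # C is made of actual columns of A
X = greedy_mixing_matrix(A, C)  # binary mixing matrix
error = reconstruction_error(A, C, X)
```

Because C is built from columns of the data itself, each factor can be read directly as an observed column, which is the kind of interpretability the abstract refers to. A Boolean CUR decomposition would additionally select rows of A, which this sketch does not attempt.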