Data Mining and Knowledge Discovery, Volume 29, Issue 5, pp 1343–1373

Clustering Boolean tensors

DOI: 10.1007/s10618-015-0420-3

Cite this article as:
Metzler, S. & Miettinen, P. Data Min Knowl Disc (2015) 29: 1343. doi:10.1007/s10618-015-0420-3

Abstract

Graphs that evolve over time, such as friendship networks, are an example of data that are naturally represented as binary tensors. Just as we can analyse the adjacency matrix of a graph using a matrix factorization, we can analyse the tensor by factorizing it. Unfortunately, tensor factorizations are computationally hard problems and, in particular, are often significantly harder than their matrix counterparts. In the case of Boolean tensor factorizations, where the input tensor and all the factors are required to be binary and we use Boolean algebra, much of that hardness comes from the possibility of overlapping components. Yet, in many applications we are perfectly happy to partition at least one of the modes. For instance, in the aforementioned time-evolving friendship networks, groups of friends might overlap, but the time points at which the network was captured are always distinct. In this paper we investigate what consequences this partitioning has on the computational complexity of Boolean tensor factorizations and present a new algorithm for the resulting clustering problem. This algorithm can alternatively be seen as a particularly regularized clustering algorithm that can handle extremely high-dimensional observations. We analyse our algorithm with the goal of maximizing the similarity and argue that this is more meaningful than minimizing the dissimilarity. As a by-product we obtain a PTAS and an efficient 0.828-approximation algorithm for rank-1 binary factorizations. Our algorithm for Boolean tensor clustering achieves high scalability, high similarity, and good generalization to unseen data with both synthetic and real-world data sets.
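
The setting described in the abstract can be made concrete with a small sketch. The code below is not the authors' algorithm; it only illustrates, under assumed names (`boolean_cp`, `is_partition`, `similarity`, and the toy factors `A`, `B`, `C` are all hypothetical), what a Boolean factorization of a binary 3-way tensor looks like when one mode is partitioned, and how similarity (number of agreeing entries) relates to dissimilarity (number of disagreeing entries).

```python
from itertools import product


def boolean_cp(A, B, C):
    """Rebuild a binary 3-way tensor from binary factors under Boolean algebra.

    Entry (i, j, k) is 1 if at least one component r has
    A[i][r] = B[j][r] = C[k][r] = 1 (Boolean OR of rank-1 tensors).
    """
    rank = len(A[0])
    return [[[int(any(A[i][r] and B[j][r] and C[k][r] for r in range(rank)))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


def is_partition(C):
    """True if each row of C has exactly one 1, i.e. each index joins one cluster."""
    return all(sum(row) == 1 for row in C)


def similarity(X, Y):
    """Count the entries on which two equally sized binary tensors agree."""
    dims = (range(len(X)), range(len(X[0])), range(len(X[0][0])))
    return sum(X[i][j][k] == Y[i][j][k] for i, j, k in product(*dims))


# Toy data (an assumption for illustration): 3 people, 4 time points, 2 components.
A = [[1, 0], [1, 1], [0, 1]]          # person 1 belongs to both groups (overlap)
B = [[1, 0], [1, 1], [0, 1]]
C = [[1, 0], [1, 0], [0, 1], [0, 1]]  # each time point belongs to exactly one cluster

T = boolean_cp(A, B, C)

# Simulate one noisy observation by flipping a single entry of a copy of T.
X = [[row[:] for row in slab] for slab in T]
X[0][2][0] ^= 1

size = len(T) * len(T[0]) * len(T[0][0])
sim = similarity(T, X)
dissim = size - sim  # the two measures always sum to the tensor size
```

Here `A` and `B` may contain overlapping groups, while every row of `C` contains a single 1, so the third mode (time) is clustered rather than freely factorized. Maximizing `sim` and minimizing `dissim` share the same optimum, but approximation guarantees for the two objectives differ, which is why the choice of objective matters in the analysis.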

Keywords

Tensors; Clustering; Boolean algebra; Approximation; Decomposition

Copyright information

© The Author(s) 2015

Authors and Affiliations

  1. Max-Planck-Institut für Informatik, Saarbrücken, Germany
