Complexity of Searching for 2 by 2 Submatrices in Boolean Matrices
Abstract
We study the problem of finding a given \(2\times 2\) matrix as a submatrix of a given Boolean matrix. Three variants are considered: search for a matching submatrix of any area, of minimum area, or of maximum area. The problem relates to 2D pattern matching, and to fields such as data mining, where the search for submatrices plays an important role. Besides these connections, the problem itself is very natural and its investigation helps to demonstrate differences between search tasks in one-dimensional and multidimensional topologies.
Our results reveal that the problem variants are of different complexities. First, we show that given an \(m\times n\) Boolean matrix, the any variant can be solved in \({\widetilde{O}(mn)}\) time for any given \(2\times 2\) matrix, but requires various strategies for different \(2\times 2\) matrices. This contrasts with the complexity of the task over matrices with entries from the set \(\{0,1,2\}\), where the problem is Triangle Finding-hard and hence no algorithm with similar running time is known for it. Then, we show that the minimization variant in the case of Boolean matrices can also be solved in \({\widetilde{O}(mn)}\) time. Finally, in contrast, we prove Triangle Finding-hardness for the maximization variant and show that there is a rectangular matrix multiplication-based algorithm solving it in \(O\left( mn (\min \{m,n\})^{0.5302}\right) \) time.
Keywords
Boolean matrix · Submatrices · Two-dimensional pattern matching · Local picture language · Triangle Finding-hard problem · Fast matrix multiplication

1 Introduction
1. In the input Boolean matrix \(\mathbf {M}\), search for any \(2\times 2\) submatrix of \(\mathbf {M}\) that matches \(\mathbf {B}\) (we abbreviate this task as the any variant)
2. Search for a submatrix that matches \(\mathbf {B}\) and encloses the minimum area of \(\mathbf {M}\) (the min variant)
3. Search for a submatrix that matches \(\mathbf {B}\) and encloses the maximum area of \(\mathbf {M}\) (the max variant).
In general, the problem of finding a specific submatrix in a larger matrix is of importance in several computer science disciplines. For example, Boolean matrices, and their associated submatrices of 1’s, play a central role in data mining problems such as frequent itemset mining [13]. Moreover, finding a submatrix of 1’s in the adjacency matrix of a graph G corresponds to finding a biclique of G [13]. As the maximum edge biclique problem is NP-complete [10], the complexity of searching for a \(k\times k\) submatrix is expected to grow as k grows. In this paper, we deal with the simplest case when \(k=2\). An example of its use is as follows. Given m respondents answering n yes/no questions in a questionnaire, are there two respondents who both answered yes to two of the same questions?
The above any, min and max tasks can also be viewed as two-dimensional pattern matching: we search for any/min/max rectangular block of a matrix that matches a given template. In one dimension, similar pattern matching problems can be described using regular languages [2]. In this case, all the any/min/max tasks are solvable by a finite-state automaton-based algorithm in time linear in the input length [8]. In two dimensions, these problems are easily definable via the notion of local picture languages [5]. This is a formalism defining sets of two-dimensional arrays (so-called pictures) for which the membership problem can be decided by looking at a window of size \(2\times 2\). These picture languages are a straightforward generalization of the well-known local (string) languages [12], which form a proper subset of the family of regular languages.
We introduced in [8] a general algorithm solving two-dimensional pattern matching against local picture languages in time \(O(mn\min \{m,n\})\) for \(m\times n\) input matrices. Further, for a specific local picture language, we investigated the pattern matching problem which is precisely Open image in new window and showed it to be solvable in time linear in the input matrix area. Here our goal is to propose more efficient algorithms for a specialized subclass of local picture language pattern matching problems over Boolean matrices called Four Corner Problems. In particular, we show that the problem Open image in new window is solvable in \(\widetilde{O}(mn)\) time for any \(a,b,c,d\in \{0,1\}\) (Theorem 1). This result is surprising because it was proven in [8] that searching for a submatrix matching Open image in new window in an \(n\times n\) matrix over \(\{0,1,2\}\) is Triangle Finding-hard. In other words, the proof introduced a fine-grained reduction [15] from Triangle Finding to the search problem for Open image in new window over \(\{0,1,2\}\), suggesting that Four Corner Problems are harder over larger alphabets.
The Triangle Finding problem is to decide whether a given undirected graph \(G=(V,E)\) is triangle-free or not. It is a classic algorithmic problem which can be reduced to Boolean Matrix Multiplication (see [6]) and solved in time \(O(n^{\omega })\), where \(n=|V|\) and \(\omega < 2.373\) denotes the matrix multiplication exponent [14]. However, it is currently unknown whether Triangle Finding can be solved in time \(\widetilde{O}(n^2)\). Note that conditional lower bounds based on Triangle Finding are known for several problems (see, e.g., [1, 7, 9, 11]).
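For intuition, the reduction to Boolean matrix multiplication can be sketched as follows. This is a minimal Python sketch with a schoolbook \(O(n^3)\) product standing in for the fast \(O(n^{\omega })\) multiplication; the graph is assumed to be given as a 0/1 adjacency matrix, and the function name is illustrative only.

```python
def has_triangle(adj):
    """G has a triangle iff the Boolean product A*A and A share a
    nonzero entry, i.e. some edge {i, j} has a common neighbour k."""
    n = len(adj)
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                # (A^2)_{ij} = OR over k of (A_{ik} AND A_{kj})
                if any(adj[i][k] and adj[k][j] for k in range(n)):
                    return True
    return False
```

Replacing the inner loop by a fast matrix multiplication routine yields the \(O(n^{\omega })\) bound quoted above.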
We further investigate the minimization and maximization variants of the search problem over Boolean matrices. For the min variant, we improve on Theorem 1 by showing that the problem Open image in new window is solvable in \(\widetilde{O}(mn)\) time for any \(a,b,c,d\in \{0,1\}\) (Theorem 4). For the max variant, we prove that Open image in new window is Triangle Finding-hard for any \(a,b,c,d\in \{0,1\}\) (Theorem 5). Also, we present an algorithm that solves Open image in new window in \(O\left( mn (\min \{m,n\})^{0.5302}\right) \) time (Theorem 6). This algorithm is based on computing a minimum witness for Boolean matrix multiplication [4]. However, it is likely impractical because it uses a fast rectangular matrix multiplication algorithm that involves a large constant factor.
The paper is structured as follows. Section 2 establishes some required notions. Then, Sects. 3, 4 and 5 gradually present results for the problems Open image in new window, Open image in new window and Open image in new window.
2 Preliminaries
\(\mathbb {N}=\{0,1,2,\ldots \}\) is the set of natural numbers and \(\mathbb {N}^{+}=\{1,2,3,\ldots \}\) is the set of positive integers. For functions \(f,g:\mathbb {N}\times \mathbb {N}\rightarrow \mathbb {N}\), we write \(f(m,n)=\widetilde{O}(g(m,n))\) if and only if there are numbers \(p,q \in \mathbb {N}\) such that \(f(m,n)=O\left( g(m,n) \log ^p (m) \log ^q (n) \right) \).
For \(a,b,c,d\in \{0,1\}\), we define the following search problems (also known as Four Corner Problems) for an input Boolean matrix \(\mathbf {M}\): find a block \(B\in \mathcal {B}_{\mathbf {M}}\) whose four corners \(\varkappa (B)\) match the given \(2\times 2\) pattern; in the minimization and maximization variants, B is additionally required to enclose the minimum or maximum area, respectively.
3 Searching for Any Matching Submatrix
This section presents algorithms for Open image in new window that run in nearly linear time in the input matrix area, for every a, b, c, and d. In some cases an efficient algorithm is achieved by using properties of the minimum matching submatrix, so these algorithms also solve the corresponding Open image in new window problem (see Lemmas 2 and 3).
Out of all Open image in new window problems, Open image in new window and Open image in new window are the easiest to solve. It has already been shown in [8] that Open image in new window reduces to finding a four-cycle in a bipartite graph. Here we give a more straightforward algorithm.
Lemma 1
Open image in new window is solvable in time \(O(m n)\) for m by n Boolean matrices.
Proof
Let an \(m\times n\) Boolean matrix \(\mathbf {M}\) be given. Without loss of generality, suppose that \(m \ge n\). The algorithm is as follows. We create a set S of pairs of column indexes. Initially, the set is empty. The matrix is traversed row by row. For each row i, we find the set \(C_i\) of all column indexes j such that \(\mathbf {M}_{i,j}=1\). Then, for every pair \(\{c_1,c_2\}\in {C_i \atopwithdelims ()2}\), we check whether \(\{c_1,c_2\}\) is in S. If not, it is added to S. Otherwise, a desired submatrix has been found.
The algorithm takes \(O(mn + n^2)\) time because it visits each entry from \(\mathbf {M}\) at most once and it adds at most \({n \atopwithdelims ()2}\) pairs of column indexes into S. Because \(m \ge n\), the total runtime is \(O(m n)\). \(\square \)
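The strategy of the proof above — finding two rows that share a pair of columns holding 1’s — can be sketched as follows. This is a minimal Python sketch with 0-based indices; the function name and the output format are illustrative only.

```python
def find_all_ones_2x2(M):
    """Return ((r1, r2), (c1, c2)) whose four crossings are all 1,
    or None.  For each pair of columns with 1's in the same row we
    remember the first such row; a repeated pair yields the answer."""
    seen = {}  # (c1, c2) -> first row index having 1's in both columns
    for i, row in enumerate(M):
        ones = [j for j, x in enumerate(row) if x == 1]
        for a in range(len(ones)):
            for b in range(a + 1, len(ones)):
                key = (ones[a], ones[b])
                if key in seen:
                    return ((seen[key], i), key)
                seen[key] = i
    return None
```

Every enumerated pair is either new (and stored) or triggers the early return, so at most \({n \atopwithdelims ()2}+1\) pairs are ever touched, matching the bound in the proof.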
Lemma 2
Open image in new window is solvable in time \(O(m n)\) for m by n Boolean matrices.
Proof
Let \(\mathbf {M}\) be an \(m\times n\) Boolean matrix. The algorithm is based on the following claim: If \(\mathbf {M}\) contains a block \(B=\mathbf {M}[r,c;k,\ell ]\) such that Open image in new window, then it contains a block \(B'=\mathbf {M}[r',c';k',\ell ']\) such that Open image in new window, \(B'_{i,1}=0\) for all \(i=2,\ldots , k'-1\) and \(B'_{k',j}=0\) for all \(j=2,\ldots , \ell '-1\) (i.e., the left and bottom edges of \(B'\), excluding the corners, contain only 0 entries).
To see this, suppose without loss of generality that \(B_{i,1}=1\) for some \(1< i <k\). Let \(B_1=\mathbf {M}[r,c;i,\ell ]\) and \(B_2=\mathbf {M}[r+i-1,c;k-i+1,\ell ]\). Then, either Open image in new window (if \(B_{i,\ell }=1\)) or Open image in new window (if \(B_{i,\ell }=0\)). Since \(B_1\) and \(B_2\) are proper sub-blocks of B, we have found a smaller block containing Open image in new window as a submatrix.
Now, we present the algorithm. It creates a map \(\sigma \) where a key is a pair (i, j) such that \(\mathbf {M}_{i,j} = 1\). The value associated with (i, j) is a pair \((i^{\prime },j^{\prime })\) such that \(i^{\prime }\) is the largest row index less than i such that \(\mathbf {M}_{i^{\prime },j} = 1\) (i.e., \(i'\) is the row index of the nearest entry 1 located upwards from the position (i, j)) and \(j^{\prime }\) is the smallest column index greater than j such that \(\mathbf {M}_{i,j^{\prime }} = 1\) (i.e., the column index of the nearest entry 1 rightwards). Note that the value of \(i'\) or \(j'\) might be undefined if there is no such row index or column index, respectively.
It is possible to build \(\sigma \) in \(O(m n)\) time by making two passes over \(\mathbf {M}\). The first pass computes the \(i^{\prime }\)’s. The matrix \(\mathbf {M}\) is scanned column by column, each column j from top to bottom. Whenever a 1 entry is detected at a position \((i',j)\), then \(i'\) becomes the first component of \(\sigma (i,j)\) for the next 1 entry detected at a position (i, j) in the same column. Analogously, the second pass, scanning \(\mathbf {M}\) row by row, computes the \(j^{\prime }\)’s.
Now, for each key (i, j) in the map \(\sigma \), the algorithm takes its value \((i^{\prime }, j^{\prime })\) and checks if rows \(i, i^{\prime }\) and columns \(j, j^{\prime }\) form a desired submatrix matching Open image in new window. By doing this, every existing block with 0 entries on the left and bottom edges is checked. Among these blocks, a minimum-area block B such that Open image in new window is returned as the result.
Assuming constant time map operations, the algorithm runs in \(O(m n)\) time (note that the map \(\sigma \) can be implemented by using an \(m\times n\) array). \(\square \)
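The two-pass construction of the map \(\sigma \) can be sketched as follows. This is a minimal Python sketch with 0-based indices; the second pass scans each row right to left, which is equivalent to the row-by-row scan described above, and `None` stands in for an undefined \(i'\) or \(j'\).

```python
def build_sigma(M):
    """For every 1-entry (i, j), sigma[(i, j)] = (i_up, j_right), where
    i_up is the row of the nearest 1 above in column j and j_right is
    the column of the nearest 1 to the right in row i (None if absent)."""
    m, n = len(M), len(M[0])
    up = {}
    # Pass 1: each column top to bottom, remembering the last 1 seen.
    for j in range(n):
        last = None
        for i in range(m):
            if M[i][j] == 1:
                up[(i, j)] = last
                last = i
    sigma = {}
    # Pass 2: each row right to left, remembering the next 1 rightwards.
    for i in range(m):
        nxt = None
        for j in range(n - 1, -1, -1):
            if M[i][j] == 1:
                sigma[(i, j)] = (up[(i, j)], nxt)
                nxt = j
    return sigma
```

Each entry of \(\mathbf {M}\) is visited twice, giving the \(O(mn)\) construction time claimed in the proof.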
Lemma 3
Open image in new window is solvable in time \({\widetilde{O}(mn)}\) for m by n Boolean matrices.
Proof
Without loss of generality, let us deal only with the border between \(\mathbf {M}_{\mathrm {top}}\) and \(\mathbf {M}_{\mathrm {bottom}}\). We claim: if \(B=\mathbf {M}[r,c;k,\ell ]\) is a minimum-area block of \(\mathbf {M}\) such that Open image in new window, then \(B_{i,1} = B_{i,\ell }\) for all \(i=2,\ldots , k-1\). Indeed, \(B_{i,1} \ne B_{i,\ell }\) would clearly contradict the minimality of B.
Based on the claim, we create maps \(\sigma _{\mathrm {top}}\) and \(\sigma _{\mathrm {bottom}}\) such that \(\sigma _{\mathrm {top}}(\{i,j\})\) is the largest row index such that columns i and j differ in \(\mathbf {M}_{\mathrm {top}}\), and, analogously, \(\sigma _{\mathrm {bottom}}(\{i, j\})\) is the smallest row index such that columns i and j differ in \(\mathbf {M}_{\mathrm {bottom}}\). Once we have constructed \(\sigma _{\mathrm {top}}\) and \(\sigma _{\mathrm {bottom}}\) we go through each pair of column indexes \(\{i, j\}\) and check if rows \(\sigma _{\mathrm {top}}(\{i, j\})\), \(\sigma _{\mathrm {bottom}}(\{i, j\})\) and columns i, j together create a desired submatrix of \(\mathbf {M}\). A minimum submatrix among the detected submatrices is the candidate for the resulting submatrix returned by the procedure.
It remains to explain how we obtain the maps. Let us first give a construction for \(\sigma _{\mathrm {top}}\). We create a set X of pairwise disjoint sets of column indexes. Initially, X contains one set containing all column indexes. We repeat the following process for each row of \(\mathbf {M}_{\mathrm {top}}\), starting at the bottommost one and proceeding upwards: Create two disjoint sets \(A_0\) and \(A_1\) where \(A_0\) contains all column indexes that are 0’s and \(A_1\) contains all column indexes that are 1’s in the current row. For each set S in X, split S into two disjoint subsets \(S_0 = S \cap A_0\) and \(S_1 = S \cap A_1\). For every \(\{i, j\}\) such that \(i \in S_0\) and \(j \in S_1\), set \(\sigma _{\mathrm {top}}(\{i, j\})\) to the current row index. Then, update X by replacing S with \(S_0\) and \(S_1\). Throw out any sets from X that have less than two elements. Finish when X is empty or every row of \(\mathbf {M}_{\mathrm {top}}\) has been processed.
We similarly build \(\sigma _{\mathrm {bottom}}\), but we start at the top row of \(\mathbf {M}_{\mathrm {bottom}}\) going one row down at a time. It only takes \(O(n^2)\) time to construct \(\sigma _{\mathrm {top}}\) and \(\sigma _{\mathrm {bottom}}\) because we do O(n) work per row plus an additional constant amount of work for each pair of columns.
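The partition-refinement construction of \(\sigma _{\mathrm {top}}\) can be sketched as follows. This is a minimal Python sketch with 0-based indices; the map is keyed by unordered column pairs, and the function name is illustrative only.

```python
def sigma_top(M):
    """Process rows bottom-up, refining a partition of the columns.
    The row that first separates columns i and j is the largest row
    index where the two columns differ; pairs never separated stay
    undefined.  Returns {frozenset({i, j}): row_index}."""
    m, n = len(M), len(M[0])
    sigma = {}
    groups = [list(range(n))]  # X: pairwise disjoint column-index sets
    for r in range(m - 1, -1, -1):
        new_groups = []
        for g in groups:
            zeros = [j for j in g if M[r][j] == 0]
            ones = [j for j in g if M[r][j] == 1]
            for i in zeros:        # these pairs split apart here,
                for j in ones:     # so row r is where they differ
                    sigma[frozenset((i, j))] = r
            # keep only sets with at least two elements
            new_groups += [h for h in (zeros, ones) if len(h) >= 2]
        groups = new_groups
        if not groups:
            break
    return sigma
```

Per row the splitting costs \(O(n)\) plus \(O(1)\) per newly separated pair, which gives the \(O(n^2)\) total claimed above for square matrices.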
Case II (rectangular matrices): Let an \(m\times n\) Boolean matrix \(\mathbf {M}\) be given. Assume without loss of generality that \(m > n\).
We perform horizontal splits to divide \(\mathbf {M}\) into \(d = \lceil \frac{m}{n}\rceil \) smaller matrices \(\{\mathbf {M}_k\}_{k\in [d]}\) such that for each \(k \in [d - 1]\), \(\mathbf {M}_k\) is n by n, and \(\mathbf {M}_{d}\) is c by n for some \(c \le n\). A desired minimum submatrix is either in \(\mathbf {M}_k\) for some \(k \in [d]\) or it crosses the border between \(\mathbf {M}_k\) and \(\mathbf {M}_{k+1}\) for some \(k \in [d - 1]\). Then, the former cases in total take \(O\left( \frac{m}{n} \cdot t(n) \right) \) time. We claim that the latter cases take O(mn) time. For each \(k \in [d - 1]\), we construct maps \(\sigma _{k, \mathrm {top}}\) and \(\sigma _{k, \mathrm {bottom}}\) such that \(\sigma _{k, \mathrm {top}}(\{i, j\})\) is the smallest row index such that columns i and j differ in \(\mathbf {M}_{k}\), and \(\sigma _{k, \mathrm {bottom}}(\{i, j\})\) is the largest row index such that columns i and j differ in \(\mathbf {M}_{k}\). Following the same approach as for the square matrix case, we can construct all maps in total time O(mn). Then, for each pair of column indexes i, j we have up to d cases to check. This results in total time O(mn). Note that if a map is not defined at \(\{i, j\}\), then we try the next map and combine the cases together since this means a submatrix might span across multiple horizontal splits. In total, our algorithm takes \(O\left( \frac{m}{n} \cdot t(n) + m n\right) = O(m n \log (n))\) time. \(\square \)
Lemma 4
Open image in new window is solvable in time \({\widetilde{O}(mn)}\) for m by n Boolean matrices.
Proof
Let an \(m\times n\) Boolean matrix \(\mathbf {M}\) be given.
Case I (tall matrices): We consider the case when \(m \ge n\). We proceed in a similar manner as in the proof of Lemma 1. We create a set S of pairs of column indexes. Initially, the set is empty. The matrix is traversed row by row. For each row, we do the following. We create a set R. Initially, R is empty, but we will add column indexes to R. We scan entries from left to right in the row. When we encounter a 1 entry at column index i, we add i to R. When we encounter a 0 entry at column index j, we go through each column index i from R. If (i, j) is in S, then we found a desired submatrix. Otherwise, we add (i, j) to S. Since \(m \ge n\), this takes \(O(m n + n^2) = O(m n)\) time.
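The Case I scan can be sketched as follows. This is a minimal Python sketch with 0-based indices; following the pairing in the proof, it reports two rows that agree on a 1 in a common column i and a 0 in a common later column j, and the function name is illustrative only.

```python
def find_case1(M):
    """Return ((r1, r2), (c1, c2)) such that both rows carry a 1 at
    column c1 and a 0 at column c2 > c1, or None."""
    S = {}  # (c1, c2) -> first row realizing (1 at c1, 0 at c2)
    for r, row in enumerate(M):
        R = []  # column indexes of 1's seen so far in this row
        for j, x in enumerate(row):
            if x == 1:
                R.append(j)
            else:
                for i in R:
                    if (i, j) in S:
                        return ((S[(i, j)], r), (i, j))
                    S[(i, j)] = r
    return None
```

As in Lemma 1, every enumerated pair is either stored or triggers the early return, so the pair enumeration is bounded by \(O(n^2)\) overall.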
Case II (short matrices): We consider the case when \(m < n\). We perform vertical splits to divide \(\mathbf {M}\) into \(d = \lceil \frac{n}{m}\rceil \) smaller matrices \(\{\mathbf {M}_k\}_{k\in [d]}\) such that for each \(k \in [d - 1]\), \(\mathbf {M}_k\) is m by m, and \(\mathbf {M}_{d}\) is m by c for some \(c \le m\). The matrix \(\mathbf {M}\) contains a desired submatrix if and only if some \(\mathbf {M}_k\) contains a minimal submatrix for some \(k \in [d]\) or there is a minimal submatrix that crosses the border between \(\mathbf {M}_k\) and \(\mathbf {M}_{k+1}\) for some \(k \in [d - 1]\).
Consider the former condition. Checking if a given \(\mathbf {M}_k\) matrix contains a minimal desired submatrix takes \(O(m^2)\) time by applying the approach from the first case. Checking all of the matrices in \(\{\mathbf {M}_k\}_{k\in [d]}\) takes \(O(d \cdot m^2) = O\left( \frac{n}{m} \cdot m^2\right) = O(m n)\) time.
Now, we focus on checking the latter condition. For each \(k \in [d - 1]\), we construct maps \(\sigma _{k, \mathrm {left}}\) and \(\sigma _{k, \mathrm {right}}\) such that \(\sigma _{k, \mathrm {left}}(\{i, j\})\) is the smallest column index such that rows i and j are equal in \(\mathbf {M}_{k}\), and \(\sigma _{k, \mathrm {right}}(\{i, j\})\) is the largest column index such that rows i and j are equal in \(\mathbf {M}_{k}\). Once we have constructed these maps, we consider each pair of rows i and j. We have up to \(d - 1\) cases to check where each case considers the border between \(\mathbf {M}_k\) and \(\mathbf {M}_{k+1}\) for some \(k \in [d - 1]\). We check each case by seeing if rows i and j along with columns \(\sigma _{k, \mathrm {right}}(\{i, j\})\) and \(\sigma _{k + 1, \mathrm {left}}(\{i, j\})\) form a desired submatrix. It is sufficient to check these submatrices because we are only concerned with desired submatrices crossing the border that are minimal. Note if a map is not defined at \(\{i, j\}\), then we try the next map and combine the cases together since this means a submatrix might span across multiple vertical splits. This takes \(O(d \cdot m^2) = O\left( \frac{n}{m} \cdot m^2\right) = O(m n)\) time. It remains to describe how the maps are constructed. We claim that the maps can be constructed in \(O(m n \log (m))\) time. Therefore, the total runtime is \(O(m n \log (m))\).
Given \(k \in [d  1]\), we describe how to construct \(\sigma _{k, \mathrm {left}}\) for the matrix \(\mathbf {M}_{k}\). For each \(\ell \in [\log (m)]\), we construct a matrix \(\mathbf {M}_{k, \ell }\). The matrix \(\mathbf {M}_{k, \ell }\) is obtained from \(\mathbf {M}_{k}\) by negating all bits in each row i such that i’s binary expansion has a 1 at position \(\ell \). Next, in a similar manner as described in the proof of Lemma 3, we construct a map \(\sigma _{\ell }\) such that \(\sigma _{\ell }(\{i, j\})\) is the smallest column index where rows i and j differ in \(\mathbf {M}_{k, \ell }\). Now, we use these \(\log (m)\) maps to construct \(\sigma _{k, \mathrm {left}}\). For each pair of rows i and j, there is some position \(\ell \) in i and j’s binary expansions where they differ. The smallest column index where rows i and j differ in \(\mathbf {M}_{k, \ell }\) is exactly the same as the smallest column index where rows i and j are equal in \(\mathbf {M}_{k}\). Hence, we make \(\sigma _{k, \mathrm {left}}(\{i, j\}) = \sigma _{\ell }(\{i, j\})\). It takes \(O(m^2 \log (m))\) time to construct \(\sigma _{k, \mathrm {left}}\). The map \(\sigma _{k, \mathrm {right}}\) can be constructed in a similar manner. In total, it takes \(O(d \cdot m^2 \log (m)) = O\left( \frac{n}{m} \cdot m^2 \log (m)\right) = O(m n \log (m))\) time to construct all of the maps. \(\square \)
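The row-negation trick can be illustrated as follows. This is a minimal Python sketch with 0-based row indices; `negated_copy` is an illustrative helper that builds \(\mathbf {M}_{k,\ell }\) for one bit position \(\ell \).

```python
def negated_copy(M, ell):
    """Build M_ell: negate every row i whose binary expansion has a 1
    at bit position ell.  For two rows differing at bit ell, exactly
    one of them is negated, so 'columns where the rows are equal in M'
    become 'columns where the rows differ in M_ell'."""
    return [[1 - x for x in row] if (i >> ell) & 1 else list(row)
            for i, row in enumerate(M)]
```

Hence a map of smallest (or largest) differing columns for \(\mathbf {M}_{k,\ell }\), built as in Lemma 3, directly yields the smallest (or largest) equal columns for every pair of rows whose indices differ at bit \(\ell \).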
Theorem 1
Problem Open image in new window is solvable in time \({\widetilde{O}(mn)}\) for m by n Boolean matrices and any \(a,b,c,d\in \{0,1\}\).
Proof
Consider the set \(S\) of the four \(2\times 2\) Boolean matrices treated in Lemmas 1, 2, 3, and 4. Every \(2\times 2\) Boolean matrix \(\mathbf {A}\) is similar to a Boolean matrix \(\mathbf {B} \in S\) in the sense that \(\mathbf {B} = U(\mathbf {A})\) for an operation U that combines a rotation with an optional negation of all bits. Further, for every Boolean matrix \(\mathbf {M}\), the matrix \(\mathbf {M}\) contains \(\mathbf {A}\) as a submatrix if and only if \(U(\mathbf {M})\) contains \(\mathbf {B}\) as a submatrix. Applying Lemmas 1, 2, 3, and 4, we can determine if \(\mathbf {M}\) has \(\mathbf {A}\) as a submatrix in \({\widetilde{O}(mn)}\) time. \(\square \)
4 Searching for a Minimum 2 by 2 Submatrix of 1’s
In the previous section, we presented fast algorithms for minimization problems Open image in new window and Open image in new window. Here, we use preceding results for Open image in new window and Open image in new window to also obtain fast algorithms for Open image in new window and Open image in new window.
First, we introduce an algorithm for Open image in new window. The technique we apply requires several preparatory steps: a characterization of Boolean matrices that do not have Open image in new window as a submatrix (Lemma 5), an algorithm solving Open image in new window whose complexity depends on the number of pairs of 1’s within the same rows (Lemma 6), and a fast algorithm solving Open image in new window approximately (Lemma 8). Then, we can apply a similar approach to solve Open image in new window.
Lemma 5
Let \(\mathbf {A}\) be an m by n Boolean matrix. Let \(a_i\) denote the number of 1’s in the ith row of \(\mathbf {A}\). If \(\varSigma _{i=1}^m {a_i \atopwithdelims ()2} > {n \atopwithdelims ()2}\), then \(\mathbf {A}\) must contain a block whose corners are 1’s.
Proof
\(\varSigma _{i=1}^m {a_i \atopwithdelims ()2}\) is the size of the set \(T=\{(i,\{j,k\}) \mid j\ne k \,\wedge \, \mathbf {A}_{i,j}=\mathbf {A}_{i,k} = 1\}\). If \(|T|>{n \atopwithdelims ()2}\), then there are \((i_1,\{j,k\}),(i_2,\{j,k\})\in T\), where \(i_1\ne i_2\). This means that rows \(i_1, i_2\) and columns j, k form a submatrix Open image in new window. \(\square \)
Lemma 6
Let \(\mathbf {M}\) be an m by n Boolean matrix and \(T(\mathbf {M})=\{(i,\{j,k\}) \mid j\ne k \,\wedge \, \mathbf {M}_{i,j}=\mathbf {M}_{i,k}=1\}\). There is an algorithm solving Open image in new window in \(O\left( mn + |T(\mathbf {M})|\right) \) time.
Proof
The algorithm uses a map \(\sigma \) with keys \(\{j,k\}\), where \(j\ne k\) are column indexes. The value of \(\sigma (\{j,k\})\) is a row index. Initially, the map is empty.
The input Boolean matrix \(\mathbf {M}\) is processed row by row. In the ith row, the following actions are performed for each \((i,\{j,k\})\in T(\mathbf {M})\). First, it is checked whether \(\sigma (\{j,k\})\) is defined. If it is not, then \(\sigma (\{j,k\})\) is set to i. Otherwise, the algorithm finds out whether the rectangle formed by rows i, \(\sigma (\{j,k\})\) and columns j, k is the minimum one so far. Then, \(\sigma (\{j,k\})\) is updated to be i. \(\square \)
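The procedure of Lemma 6 can be sketched as follows. This is a minimal Python sketch with 0-based indices; the area of a candidate block spanned by two rows and two columns is computed as height times width, and the function name and output format are illustrative only.

```python
def min_area_block(M):
    """For each pair of columns {j, k}, keep the most recent row having
    1's in both; each element of T(M) is charged O(1) work.  Returns
    (area, (r1, r2), (j, k)) of a minimum-area block found, or None."""
    best = None
    sigma = {}  # (j, k) -> latest row index with 1's in columns j and k
    for i, row in enumerate(M):
        ones = [j for j, x in enumerate(row) if x == 1]
        for a in range(len(ones)):
            for b in range(a + 1, len(ones)):
                key = (ones[a], ones[b])
                if key in sigma:
                    area = (i - sigma[key] + 1) * (key[1] - key[0] + 1)
                    if best is None or area < best[0]:
                        best = (area, (sigma[key], i), key)
                sigma[key] = i  # keep only the most recent occurrence
    return best
```

Keeping only the most recent row per column pair suffices because, for a fixed pair of columns, a minimum-area block is formed by two consecutive occurrences of the pair.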
For convenience, for each Boolean matrix \(\mathbf {M}\) considered now until the end of this section, assume that the number of rows and the number of columns of \(\mathbf {M}\) are powers of 2. Since any matrix of a general size \(m\times n\) can be extended to a \(2^{\lceil \log _2 m \rceil }\times 2^{\lceil \log _2 n \rceil }\) matrix (with the added entries set to “undefined” value), the assumption will not have any impact on the generality and asymptotic time complexity of the presented algorithms.
For \(p\in \mathbb {N}^{+}\), let \(\mathcal {S}(p)=\{2^i \mid i=1,2,\ldots ,\lfloor \log _2 p \rfloor \}\) be the set of powers of two greater than 1 and not greater than p. Let \(\mathbf {M}\) be an \(m\times n\) Boolean matrix. For \(k\in \mathcal {S}(m)\) and \(\ell \in \mathcal {S}(n)\), let \(\mathcal {R}_{\mathbf {M}}(k,\ell )\) denote the set of all \(k\times \ell \) blocks of \(\mathbf {M}\) whose top left corner is located in \(\mathbf {M}\) at a position \((1+a\cdot \frac{k}{2},1+b\cdot \frac{\ell }{2})\) for some \(a,b\in \mathbb {N}\). Let \(\mathcal {R}_{\mathbf {M}}=\bigcup _{k\in \mathcal {S}(m),\ell \in \mathcal {S}(n)} \mathcal {R}_{\mathbf {M}}(k,\ell )\).
Lemma 7
Let B be a p by q block of \(\mathbf {M}\). There are powers of 2, denoted by k and \(\ell \), such that \(k< 4p\), \(\ell < 4q\) and B is included in a block from \(\mathcal {R}_{\mathbf {M}}(k,\ell )\).
Proof
Quite analogously, the definition of b ensures that \(1+b\cdot \frac{\ell }{2} \le c\) and it can be proved that \(c+q-1 \le b \cdot \frac{\ell }{2} + \ell \). \(\square \)
Lemma 7 and the defined set of blocks \(\mathcal {R}_{\mathbf {M}}\) provide a basis for designing a fast algorithm that solves Open image in new window approximately.
Lemma 8
There is an algorithm that, for any m by n Boolean matrix \(\mathbf {M}\), finds in \(O(mn\log m\log n)\) time a block B of \(\mathbf {M}\) such that Open image in new window and \(\mathrm {a}(B) < 16 \cdot \mathrm {a}(B_{\mathrm {min}})\), where \(B_{\mathrm {min}}\) is a minimum-area block of \(\mathbf {M}\) fulfilling Open image in new window.
Proof
The algorithm works as follows. For each block \(B\in \mathcal {R}_{\mathbf {M}}\), it uses the algorithm of Lemma 1 to search inside B for a submatrix matching Open image in new window. Among all the detected submatrices, it outputs a minimal one.
By Lemma 7, \(B_{\mathrm {min}}\) is a part of a block \(B'\in \mathcal {R}_{\mathbf {M}}\) whose area is less than 16 times the area of \(B_{\mathrm {min}}\), hence the algorithm of Lemma 1 running on \(B'\) finds a block of \(\mathbf {M}\) fulfilling the lemma requirement.
For each \((k,\ell )\in \mathcal {S}(m)\times \mathcal {S}(n)\), the sum of the areas of the blocks in \(\mathcal {R}_{\mathbf {M}}(k,\ell )\) is \(O(mn)\), hence all these blocks are processed by the algorithm of Lemma 1 cumulatively in O(mn) time. Finally, since \(|\mathcal {S}(m)\times \mathcal {S}(n)|=O(\log m \log n)\), the proposed algorithm runs in \(O(mn\log m\log n)\) time. \(\square \)
Theorem 2
Open image in new window is solvable in time \(\widetilde{O}(mn)\) for m by n Boolean matrices.
Proof
Let \(\mathbf {M}\) be an input \(m\times n\) Boolean matrix. Assume that the algorithm of Lemma 8 finds in \(\mathbf {M}\) a block of an area S. The minimum area of a block of \(\mathbf {M}\) containing Open image in new window as a submatrix is in the range \((\frac{S}{16}, S]\).
Claim III: Every \(k\times \ell \) block B in \(\mathcal {R}'_{\mathbf {M}}\) fulfills Open image in new window (see Lemma 6 for the definition of T(B)). To show this, assume without loss of generality that \(k\ge \ell \) and \(k\ge 256\). Consider B to be split horizontally into 256 subblocks \(B_i\) of size \(\frac{k}{256} \times \ell \). Hence, \(\mathrm {a}(B_i)=\frac{k\ell }{256}\). By Claim I, it holds that \(k\ell <16\cdot S\), and hence \(\mathrm {a}(B_i)<\frac{16\cdot S}{256}=\frac{S}{16}\). This means that \(B_i\) does not contain Open image in new window as a submatrix, and hence Lemma 5 implies that \(|T(B_i)|=O(\ell ^2)\). Finally, we derive \(|T(B)|=\sum _{i=1}^{256}|T(B_i)|=O(\ell ^2)\).
Algorithm: We now have all prerequisites for describing the intended algorithm and deriving its time complexity. It works as follows. Call the algorithm of Lemma 8 to obtain S. For each \(B\in \mathcal {R}'_{\mathbf {M}}\) of a size \(k\times \ell \), call the algorithm of Lemma 6 either for B (if \(k\ge \ell \)) or for the transpose of B (otherwise) to find a minimum-area block within B containing Open image in new window as a submatrix in time \(O((\min \{k,\ell \})^2)\). A minimum-area block among all found blocks is returned as the final output.
Theorem 3
Open image in new window is solvable in time \(\widetilde{O}(mn)\) for m by n Boolean matrices.
Theorem 4
Open image in new window is solvable in time \({\widetilde{O}(mn)}\) for m by n Boolean matrices and any \(a,b,c,d\in \{0,1\}\).
5 Searching for a Maximum Matching Submatrix
We first prove that the problem Open image in new window is Triangle Finding-hard for any \(a,b,c,d\in \{0,1\}\) (Theorem 5). Then, we show how Open image in new window can be solved using rectangular matrix multiplication (Theorem 6).
Theorem 5
Open image in new window is Triangle Finding-hard for any \(a,b,c,d\in \{0,1\}\).
Proof
By the same reasoning given in the proof of Theorem 1, it suffices to prove Triangle Finding-hardness for problems Open image in new window, Open image in new window, Open image in new window, and Open image in new window. We first present a fine-grained reduction from Triangle Finding to Open image in new window. We then adapt the reduction to the other three problems.
Let a graph \(G=(V,E)\) be given. Let the set of vertices be \(V=\{v_i \mid i\in \{1,\ldots ,n\}\}\). Let \(\mathbf {A}\) be an \(n\times n\) lower triangular Boolean matrix derived from the adjacency matrix of G as follows: \(\mathbf {A}_{i,j}=1\) if and only if \(i>j\) and \(\{v_i, v_j\}\in E\). Observe that \(\{v_i, v_j, v_k\}\), where \(i<j<k\), is a triangle in G if and only if \(\mathbf {A}_{j,i}= \mathbf {A}_{k,i}= \mathbf {A}_{k,j}=1\).
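For illustration, the derived matrix \(\mathbf {A}\) and the triangle correspondence can be sketched as follows. This is a minimal Python sketch using 0-based vertex indices (so \(\mathbf {A}_{i,j}=1\) iff \(i>j\) and \(\{v_{i+1}, v_{j+1}\}\in E\)); the function names are illustrative only.

```python
def lower_triangular(n, edges):
    """Strictly lower triangular matrix derived from the adjacency
    matrix: A[i][j] = 1 iff i > j and {i, j} is an edge."""
    A = [[0] * n for _ in range(n)]
    for (u, v) in edges:
        i, j = max(u, v), min(u, v)
        A[i][j] = 1
    return A

def triangles(A):
    """Triples i < j < k with A[j][i] = A[k][i] = A[k][j] = 1,
    i.e. the triangles of the underlying graph."""
    n = len(A)
    return [(i, j, k) for i in range(n) for j in range(i + 1, n)
            for k in range(j + 1, n) if A[j][i] and A[k][i] and A[k][j]]
```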
Triangle Finding-hardness of Open image in new window is implied by the following property.
Claim: G has a triangle if and only if there is a block B of \(\mathbf {M}\) such that Open image in new window and \(\mathrm {a}(B)\ge 3n^2\).
A block B included in one of the matrices \(\mathbf {A}_{i}\), \(i\in \{1,2,3\}\). Its area is not greater than \(n^2\).
A block B with two corners in \(\mathbf {A}_2\) and the other two corners in either \(\mathbf {A}_1\) or \(\mathbf {A}_3\). Assume e.g. that the leftmost column of such a block is the kth column of \(\mathbf {M}_1\) and the rightmost column is the \(\ell \)th column of \(\mathbf {M}_1\), where \(\ell >2n\). The height of B is at most \(n-k\) and it holds that \(\ell < 3n\). Hence, \(\mathrm {a}(B)\) is upper bounded by \((\ell - k + 1)(n-k)\le (3n-k)(n-k)<3n^2\).
A block B that has one corner in each of the matrices \(\mathbf {A}_1\), \(\mathbf {A}_2\), \(\mathbf {A}_3\), \(\mathbf {I}\). Let the top left corner of B be in \(\mathbf {M}_1\) at a position \((k,\ell )\), and the bottom right corner of B be at a position (s, t). Properties of the \(\mathbf {A}_i\)’s and \(\mathbf {I}\) ensure that \(\ell -k<n\), \(t=2n+k\), and \(s>2n + \ell \). Hence, \(\mathrm {a}(B)= (s-k+1)(t-\ell +1) >(2n+\ell -k) (2n+k-\ell )= 4n^2-(\ell -k)^2\ge 3n^2\).
It is not difficult to verify that there is a onetoone correspondence between blocks of the third type and triples \(i<j<k\) such that \(\mathbf {A}_{j,i}= \mathbf {A}_{k,i}= \mathbf {A}_{k,j}=1\), hence representing triangles \(\{v_i, v_j, v_k\}\) of G.
One can again verify that G has a triangle if and only if each of the constructed matrices contains a block B such that \(\mathrm {a}(B) \ge 3n^2\) and \(\varkappa (B)\) matches the desired \(2\times 2\) matrix. \(\square \)
Now, let us focus on approaches for solving the maximization problems. Given an \(m\times n\) Boolean matrix \(\mathbf {M}\) and \(p,q\in \{0,1\}\), let \(\sigma ^{\mathbf {M}}_{p,q}\) denote the map whose keys are pairs (i, j), where i, j are row indexes of \(\mathbf {M}\) such that \(i<j\). For a key (i, j), the map value is defined as the smallest column index c such that \(\mathbf {M}_{i,c}=p\) and \(\mathbf {M}_{j,c}=q\).
Let \(\mathbf {M}'\) denote the matrix \(\mathbf {M}\) flipped left to right. It is easy to see that the maximization problem for any \(2\times 2\) matrix \(\left( \begin{smallmatrix} a &{} b \\ c &{} d \end{smallmatrix}\right) \), where \(a,b,c,d\in \{0,1\}\), can be solved based on the maps \(\sigma ^{\mathbf {M}}_{a,c}\) and \(\sigma ^{\mathbf {M}'}_{b,d}\). Conversely, since the maximization variant is Triangle Finding-hard, these maps are Triangle Finding-hard to build.
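To make this reduction concrete, here is a self-contained naive sketch (hypothetical helper names `sigma` and `max_area`; the running times quoted below instead rely on the fast map construction of Lemma 9). For a fixed pair of rows \(i < j\), the widest matching block spans from the leftmost column realizing \((a, c)\) in \(\mathbf {M}\) to the rightmost column realizing \((b, d)\), the latter being read off \(\mathbf {M}'\):

```python
def sigma(M, p, q):
    # Smallest column c with M[i][c] == p and M[j][c] == q, for each row pair i < j.
    m, n = len(M), len(M[0])
    out = {}
    for i in range(m):
        for j in range(i + 1, m):
            for c in range(n):
                if M[i][c] == p and M[j][c] == q:
                    out[(i, j)] = c
                    break
    return out

def max_area(M, a, b, c, d):
    """Largest area of a block whose corners read a, b (top row) and
    c, d (bottom row); returns 0 if no such block exists."""
    m, n = len(M), len(M[0])
    Mf = [row[::-1] for row in M]        # M' = M flipped left to right
    left = sigma(M, a, c)                # leftmost (a, c) column per row pair
    right = sigma(Mf, b, d)              # leftmost (b, d) column of M'
    best = 0
    for (i, j), lc in left.items():
        if (i, j) in right:
            rc = n - 1 - right[(i, j)]   # translate back to a column of M
            if rc > lc:                  # corners must lie in distinct columns
                best = max(best, (j - i + 1) * (rc - lc + 1))
    return best
```

For instance, `max_area([[1, 0], [0, 1]], 1, 0, 0, 1)` returns 4, i.e., the whole matrix matches.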
The maps can be computed via minimum witnesses for Boolean matrix multiplication [4], and hence the time complexity of solving the maximization problem in this way coincides with the time complexity achieved in [4] for the minimum witness problem.
Lemma 9
There is an algorithm that, for any \(m\times n\) Boolean matrix \(\mathbf {M}\) with \(m\le n\), and any \(p,q\in \{0,1\}\), builds \(\sigma ^{\mathbf {M}}_{p,q}\) in time \(O\left( mn\cdot m^{0.5302}\right) \).
Theorem 6
For any \(a,b,c,d\in \{0,1\}\), there is an algorithm solving the maximization problem for \(\left( \begin{smallmatrix} a &{} b \\ c &{} d \end{smallmatrix}\right) \) in time \(O\left( mn\cdot ({\min \{m,n\}})^{0.5302}\right) \) for \(m\times n\) Boolean matrices.
6 Conclusion
We investigated the complexity of Four Corner Problems over Boolean matrices. A Four Corner Problem is concerned with searching for a given \(2\times 2\) submatrix in a given Boolean matrix. We demonstrated that minimum-area Four Corner Problems over Boolean matrices are solvable in nearly linear time in the input matrix area (Theorem 4) and maximum-area Four Corner Problems over Boolean matrices are Triangle Finding-hard (Theorem 5). The algorithms that we presented for the former problems might lead to efficient implementations, while the results achieved for the latter problems give rise to an interesting unresolved theoretical question: Are the maximum-area Four Corner Problems harder than the Triangle Finding problem? Going further, we suggest that a possible future direction is to investigate the complexity of Four Corner Problems over matrices with entries from larger alphabets.
Acknowledgment
We greatly appreciate all of the help and suggestions that we received. We would especially like to thank Joseph Swernofsky for helping us obtain some preliminary results, for contributing to a preliminary version of this work, and for providing valuable feedback. We also thank the Czech Science Foundation for supporting the first author (grant no. 19-09967S).
References
 1.Abboud, A., Backurs, A., Williams, V.V.: If the current clique algorithms are optimal, so is Valiant's parser. In: Guruswami, V. (ed.) IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS 2015), pp. 98–117. IEEE Computer Society (2015)
 2.Aho, A.V.: Algorithms for finding patterns in strings. In: van Leeuwen, J. (ed.) Algorithms and Complexity, Handbook of Theoretical Computer Science, vol. A, pp. 255–300. The MIT Press, Cambridge (1990)
 3.Bentley, J.L., Haken, D., Saxe, J.B.: A general method for solving divide-and-conquer recurrences. SIGACT News 12(3), 36–44 (1980)
 4.Cohen, K., Yuster, R.: On minimum witnesses for Boolean matrix multiplication. Algorithmica 69(2), 431–442 (2014)
 5.Giammarresi, D., Restivo, A.: Two-dimensional languages. In: Rozenberg, G., Salomaa, A. (eds.) Handbook of Formal Languages, pp. 215–267. Springer, Heidelberg (1997). https://doi.org/10.1007/978-3-642-59126-6_4
 6.Itai, A., Rodeh, M.: Finding a minimum circuit in a graph. In: 9th Annual ACM Symposium on Theory of Computing (STOC 1977), pp. 1–10. ACM, New York (1977)
 7.Lee, L.: Fast context-free grammar parsing requires fast Boolean matrix multiplication. J. ACM 49(1), 1–15 (2002)
 8.Mráz, F., Průša, D., Wehar, M.: Two-dimensional pattern matching against basic picture languages. In: Hospodár, M., Jirásková, G. (eds.) CIAA 2019. LNCS, vol. 11601, pp. 209–221. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-23679-3_17
 9.de Oliveira Oliveira, M., Wehar, M.: Intersection non-emptiness and hardness within polynomial time. In: Hoshi, M., Seki, S. (eds.) DLT 2018. LNCS, vol. 11088, pp. 282–290. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-98654-8_23
 10.Peeters, R.: The maximum edge biclique problem is NP-complete. Discret. Appl. Math. 131(3), 651–654 (2003)
 11.Potechin, A., Shallit, J.: Lengths of words accepted by nondeterministic finite automata. CoRR abs/1802.04708 (2018)
 12.Salomaa, A.: Jewels of Formal Language Theory. Computer Science Press, Rockville (1981)
 13.Sun, X., Nobel, A.B.: On the size and recovery of submatrices of ones in a random binary matrix. J. Mach. Learn. Res. 9(Nov), 2431–2453 (2008)
 14.Williams, V.V.: Multiplying matrices faster than Coppersmith-Winograd. In: 44th Annual ACM Symposium on Theory of Computing (STOC 2012), pp. 887–898. ACM, New York (2012)
 15.Williams, V.V.: Hardness of easy problems: basing hardness on popular conjectures such as the strong exponential time hypothesis (invited talk). In: Husfeldt, T., Kanj, I.A. (eds.) 10th International Symposium on Parameterized and Exact Computation (IPEC 2015). LIPIcs, vol. 43, pp. 17–29. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2015)