
Algorithmica


Extreme Witnesses and Their Applications

  • Andrzej Lingas
  • Mia Persson

Abstract

We study the problem of computing the so-called minimum and maximum witnesses for Boolean vector convolution. We also consider a generalization of the problem which is to determine, for each positive value at a coordinate of the convolution vector, q smallest (largest) witnesses, where q is the minimum of a parameter k and the number of witnesses for this coordinate. We term this problem the smallest k-witness problem or the largest k-witness problem, respectively. We also study the corresponding smallest and largest k-witness problems for Boolean matrix product. First, we present an \(\tilde{O}(n^{1.5}k^{0.5})\)-time algorithm for the smallest or largest k-witness problem for the Boolean convolution of two n-dimensional vectors, where the notation \(\tilde{O}(\ )\) suppresses factors polylogarithmic in n. In consequence, we obtain new upper time bounds on reporting positions of mismatches in potential string alignments and on computing restricted cases of the \((\min , +)\) vector convolution. Next, we present a fast (substantially subcubic in n and linear in k) algorithm for the smallest or largest k-witness problem for the Boolean matrix product of two \(n\times n\) Boolean matrices. It yields fast algorithms for reporting k lightest (heaviest) triangles in a vertex-weighted graph.

Keywords

Boolean vector convolution · Boolean matrix product · String matching · Witnesses · Minimum and maximum witnesses · Lightest triangles · Time complexity

1 Introduction

For a potential alignment of a pattern string with a text string over the same alphabet, a position in the alignment where the pattern symbol differs from the text symbol is a witness to a symbol mismatch, while a position where the pattern and text symbols are equal is a witness to a symbol match.

Similarly, if A and B are two \(n\times n\) Boolean matrices and C is their Boolean matrix product then for any entry \(C[i,j]=1\) of C,  a witness is an index m such that \(A[i,m]\wedge B[m,j]=1.\) The smallest (or, largest) possible witness is called the minimum witness (or, maximum witness, respectively).

The problems of finding “witnesses” have been extensively studied for several decades, initially independently within stringology and within graph algorithms relying on matrix computations. In string matching, witnesses for symbol mismatches or matches in potential alignments of two strings are sought [4, 9, 17], while in graph algorithms, witnesses for the Boolean matrix product are typically sought, originally in order to solve shortest path problems in graphs [2, 3]. In both cases, highly non-trivial efficient algorithmic solutions have been presented [2, 3, 4, 17].

Also in both areas, useful generalizations and/or specializations of the problems of finding witnesses have been studied. A natural generalization introduced for string matching in [17] is to request up to k witnesses instead of a single one. It has been efficiently solved by using concepts from group testing in [4] and carried over to Boolean matrix product in [4, 14]. A natural specialization is to request minimum or maximum witnesses. This specialization was introduced and efficiently solved in [10] in the context of finding lowest common ancestors in directed acyclic graphs, and it has found many other applications since then (cf. [8, 18, 21]).

In analogy to witnesses for Boolean matrix product, if a and b are two n-dimensional Boolean vectors and c is their Boolean convolution then for any coordinate \(c_i=1\) of c, a witness is an index l such that \(a_l\wedge b_{i-l}=1\). In contrast to string matching and Boolean matrix product, the problem of computing the witnesses of Boolean vector convolution does not seem to have been explicitly studied in the literature. On the other hand, Boolean vector convolution is very much related to string matching [12], and hence the algorithms for reporting a witness or, more generally, up to k witnesses can easily be carried over from stringology to Boolean vector convolution (see Proposition 3.1).

In this paper, we study the problem of computing minimum and maximum witnesses for Boolean vector convolution. We also consider a generalization of the problem which is to determine, for each positive (value at a)\(^{1}\) coordinate of the convolution vector, q smallest (largest) witnesses, where q is the minimum of a parameter k and the number of witnesses for this coordinate. We term this problem the smallest k-witness problem or the largest k-witness problem, respectively. We also study the corresponding generalization for Boolean matrix product.

Let \(\omega (1,r,1)\) denote the exponent of fast arithmetic multiplication of an \(n \times n^r\) matrix by an \(n^r \times n\) matrix. In particular, \(\omega (1,1,1)\), denoted by \(\omega \), is known not to exceed 2.373 [15, 22]. Next, let the notation \(\tilde{O}(\ )\) suppress factors polylogarithmic in n. Our main contributions are as follows:
  • an \(\tilde{O}(n^{1.5})\)-time algorithm for reporting minimum and maximum witnesses for the Boolean convolution of two n-dimensional vectors, and more generally, an \(\tilde{O}(n^{1.5}k^{0.5})\)-time algorithm for the smallest or largest k-witness problem for the convolution;

  • as corollaries, \(\tilde{O}(n^{1.5}k^{0.5})\) time bounds for the smallest or largest k-witness problems in string matching;

  • in part as corollaries, several upper time bounds on computing the \((\min ,+)\) integer vector convolution in restricted cases, summarized in Table 1;

  • an \(O(n^{2 + \lambda }k)\)-time algorithm for the smallest or largest k-witness problem for the Boolean matrix product of two \(n\times n\) Boolean matrices, where \(\lambda \) is a solution to the equation \(\omega (1, \lambda , 1) = 1 + 2 \, \lambda + \log _n k\);

  • as a corollary, an \(O(n^{2 + \lambda }k)\) time bound for the problem of reporting for each edge of a vertex-weighted graph k lightest (heaviest) triangles containing it, where \(\lambda \) satisfies the aforementioned equation; also, an \(O(\min \{n^{\omega }k+n^{2+o(1)}k, n^{2 + \lambda }k\})\) time bound for the problem of reporting k lightest (heaviest) triangles in the input vertex-weighted graph.

Table 1

Our upper time bounds for computing the \((\min ,+)\) convolution of two n-dimensional integer vectors either with coordinates having a bounded number of different values, or decomposable into a number of non-decreasing or non-increasing subsequences, or just monotone subsequences

| Vector a / vector b | \(c_b\) different values | \(m_b\) non-decreasing subsequences | \(m_b\) non-increasing subsequences |
| --- | --- | --- | --- |
| \(c_a\) different values | \(\tilde{O}(c_ac_bn)\) | \(\tilde{O}(c_am_bn^{1.5})\) | \(\tilde{O}(c_am_bn^{1.5})\) |
| \(m_a\) non-decreasing subsequences | \(\tilde{O}(m_ac_bn^{1.5})\) | \(\tilde{O}(m_am_bn^{1.5})\) | ? |
| \(m_a\) non-increasing subsequences | \(\tilde{O}(m_ac_bn^{1.5})\) | ? | \(\tilde{O}(m_am_bn^{1.5})\) |
| \(m_a\) monotone subsequences | \(\tilde{O}(m_ac_bn^{1.5})\) | ? | ? |
| arbitrary | \(\tilde{O}(c_bn^{1.844})\) | ? | ? |

2 Preliminaries

For two n-dimensional vectors \(a=(a_0,\ldots ,a_{n-1})\) and \(b=(b_0,\ldots ,b_{n-1})\) over a semi-ring \((\mathbb {U},\oplus , \odot )\), their convolution over the semi-ring is a vector \(c=(c_0,\ldots ,c_{2n-2})\), where \(c_i=\bigoplus _{l=\max \{i-n+1,0\}}^{\min \{ i,n-1\}}a_l\odot b_{i-l}\) for \(i=0,\ldots ,2n-2.\) Similarly, for a \(p\times q\) matrix A and a \(q\times r\) matrix B over the semi-ring, their matrix product over the semi-ring is a \(p\times r\) matrix C such that \(C[i,j]=\bigoplus _{m=1}^{q}A[i,m]\odot B[m,j]\) for \(1\le i\le p\) and \(1\le j\le r.\) In particular, for the semi-rings \((\mathbb {Z},+,\times ),\) \((\mathbb {Z},\min ,+),\) \((\mathbb {Z},\max ,+),\) and \((\{0,1\},\vee , \wedge )\), we obtain the arithmetic, \((\min ,+),\) \((\max ,+),\) and the Boolean convolutions or matrix products, respectively.
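For concreteness, the following naive Python reference implementation follows the definition above directly. It runs in quadratic time, in contrast to the fast algorithms discussed below, and the function and parameter names are ours, not taken from the paper.

```python
def semiring_convolution(a, b, oplus, odot, zero):
    # Naive O(n^2) convolution over a semi-ring (U, oplus, odot) with additive
    # identity `zero`, directly following the definition above.
    n = len(a)
    c = [zero] * (2 * n - 1)
    for i in range(2 * n - 1):
        for l in range(max(i - n + 1, 0), min(i, n - 1) + 1):
            c[i] = oplus(c[i], odot(a[l], b[i - l]))
    return c

# Boolean convolution over ({0,1}, or, and) and (min,+) convolution over (Z, min, +).
boolean_conv_naive = lambda a, b: semiring_convolution(a, b, lambda x, y: x | y, lambda x, y: x & y, 0)
minplus_conv_naive = lambda a, b: semiring_convolution(a, b, min, lambda x, y: x + y, float("inf"))
```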

We shall use the unit-cost RAM computational model [1] with a computer word of length logarithmic in the maximum of the input size and the value of the largest input integer.

The following fact is well known (cf. [12]).

Fact 2.1

Let p and q be two n-dimensional integer vectors. The arithmetic convolution of p and q can be computed in \(\tilde{O}(n)\) time. Hence, also the Boolean convolution of two n-dimensional vectors can be computed in \(\tilde{O}(n)\) time.
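For illustration, here is a minimal Python sketch of Fact 2.1, assuming FFT-based polynomial multiplication as the underlying \(\tilde{O}(n)\) routine. numpy's floating-point FFT is used merely as a stand-in; for very large inputs an exact integer FFT/NTT would be preferable because of rounding. The helper names are ours.

```python
import numpy as np

def arithmetic_convolution(p, q):
    # Arithmetic convolution of two integer vectors in O(n log n) via FFT.
    n = len(p) + len(q) - 1
    m = 1 << max(1, (n - 1).bit_length())        # FFT length: next power of two >= n
    fp, fq = np.fft.rfft(p, m), np.fft.rfft(q, m)
    return np.rint(np.fft.irfft(fp * fq, m)[:n]).astype(np.int64)

def boolean_convolution(a, b):
    # Boolean convolution of 0-1 vectors: a positive count means "some witness exists".
    return (arithmetic_convolution(a, b) > 0).astype(np.int8)
```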

For a sequence S of integers, we shall denote the minimum number of monotone subsequences into which S can be decomposed by mon(S).

Fact 2.2

[13, 23] A sequence S of n integers can be decomposed into \(O(mon(S)\log n)\) monotone subsequences in \(O(n^{1.5}\log n)\) time.

Fact 2.3

(see Theorem 10 in [5]) The problem of computing the convolution of two n-dimensional vectors over a semi-ring can be reduced to computing \(O(\sqrt{n})\) products of two \(O(\sqrt{n}) \times O(\sqrt{n})\) matrices over the semi-ring. Importantly, the matrices can be constructed in \(O(n^{1.5})\) time in total and their entries are appropriately filled with the coordinates of the vectors.

Fact 2.4

(Theorem 3.2 in [7]) Let A and B be two \(n\times n\) integer matrices where the entries of A range over at most c different integers. The \((\min ,+)\) matrix product of A and B can be computed in \(O(cn^{2.688})\) time.

Fact 2.5

[11] A lightest (heaviest) triangle in an undirected vertex-weighted graph on n vertices can be found in \(O(n^{\omega }+n^{2+o(1)})\) time.

3 Extreme Witnesses for Boolean Convolution

Let \(c=(c_0,\ldots ,c_{2n-2})\) be the Boolean convolution of two n-dimensional Boolean vectors a and b. A witness of \(c_i=1\) is any \(l\in [ \max \{i-n+1,0\}, \min \{ i,n-1\}]\) such that \(a_l\wedge b_{i-l}=1.\) A minimum witness (or maximum witness) of \(c_i=1\) is the smallest (or, the largest, respectively) witness of \(c_i.\) The witnesses problem (or minimum witness problem, or maximum witness problem) for the Boolean convolution of two n-dimensional Boolean vectors is to determine witnesses (or, the minimum witnesses or the maximum witnesses, respectively) for all non-zero coordinates of the Boolean convolution of the vectors. The k-witness problem (or, the smallest k-witness problem or the largest k-witness problem) for the Boolean convolution of two n-dimensional Boolean vectors is to determine for each non-zero coordinate of the convolution q witnesses (or, q smallest witnesses or q largest witnesses, respectively), where q is the minimum of k and the number of witnesses for this coordinate.

The Boolean vector convolution is very much related to string matching problems [12]. The corresponding problems of reporting a symbol mismatch or match, or up to k such mismatches or matches, for each potential alignment of the pattern with the text have been studied in so-called non-standard stringology [4, 17]. Moreover, the focus of this paper is on extreme witnesses. For these reasons, and for the sake of completeness, we merely state a proposition and its generalization on standard witnesses for Boolean vector convolution; they can be obtained analogously to the well-known corresponding facts on string matching or Boolean matrix product.

Proposition 3.1

(Analogous to [3]) The witnesses problem for Boolean convolution of two n-dimensional vectors can be solved in \(\tilde{O}(n)\) time.

Proof (sketch)

The witnesses for the Boolean convolution c of two n-dimensional vectors a and b can be computed analogously to the witnesses for the Boolean matrix product [3]. The first observation is that for all coordinates of c that have a single witness, their witnesses can be obtained by computing the arithmetic convolution of a with the vector \(b'\) resulting from replacing each 1 in b with the number of the respective coordinate. The next idea is to gradually dilute the vector b so that the number of witnesses for each positive coordinate of c eventually decreases to zero, in most cases passing through 1 first. For instance, if \(c_i\) has l witnesses and in each phase each coordinate of b is set to 0 with probability \(\frac{1}{2}\), then after a logarithmic number of such phases there is a positive probability that exactly one witness will remain. By iterating the process a logarithmic number of times, witnesses for all positive coordinates of c can be determined with high probability.

In order to remove the randomness, we can use small c-wise \(\epsilon \)-bias sample spaces analogously to Alon and Naor in their deterministic algorithm for witnesses of Boolean matrix product [3].

The algorithm, its analysis and derandomization are completely analogous to those of the algorithm of Alon and Naor for witnesses of Boolean matrix product [3]. We refer the reader to their paper for the technical details. It is sufficient to replace matrices with vectors, entries with coordinates, and Boolean matrix product with Boolean vector convolution in their proof. \(\square \)

Following [4] and [14], one can also generalize Proposition 3.1 to include an algorithmic solution to the k-witness problem for Boolean convolution of two n-dimensional vectors in \(\tilde{O}(nk)\) time.

With a moderate technical effort, the minimum or maximum witness problem for Boolean convolution could be solved by combining the known \(O(n^{2.575})\)-time algorithm for the corresponding problem of minimum or maximum witnesses of Boolean matrix product [10] with the known reduction of vector convolution over an arbitrary semi-ring to matrix product over the semi-ring described in Fact 2.3 [5]. The combination results in an \(O(n^{1.787})\)-time solution to the extreme witness problem for Boolean convolution. We shall show that a substantially more efficient solution can be obtained directly.

Theorem 3.2

The minimum witness problem (maximum witness problem, respectively) for Boolean convolution of two n-dimensional vectors can be solved in \(\tilde{O}(n^{1.5})\) time.

Proof

Let a and b be two n-dimensional vectors. Let r be an integer parameter between 1 and n. For \(p=1,\ldots ,\lceil n/r \rceil ,\) let \(a^p\) be the Boolean n-dimensional vector resulting from setting to zero all coordinates of a with indices not exceeding \((p-1)r\) and those with indices greater than pr. We compute, for each \(p=1,\ldots ,\lceil n/r \rceil ,\) the Boolean convolution \(c^p\) of \(a^p\) and b. Next, for each \(i=0,\ldots ,2n-2,\) we determine the smallest p such that \(c^p_i=1.\) Then, if such a p exists, we determine the interval of the implicants \(a_l\wedge b_{i-l}\) of \(c^{p}_i\) that potentially can have a non-zero value, i.e., where \(l\in ((p-1)r, pr],\) and return the smallest l in the interval for which \(a_l\wedge b_{i-l}=1.\) The \(\lceil n/r \rceil \) computations of the Boolean convolutions \(c^p\) take \(\tilde{O}(n^2/r)\) time. The total time taken by the determination of the smallest p is \(O(n\times n/r)\). Determining the smallest l for a given i and p requires examining O(r) implicants, and hence it takes O(nr) time in total. By setting \(r=\lceil \sqrt{n} \rceil ,\) we obtain the claimed time complexity. \(\square \)
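The following Python sketch mirrors the proof above for minimum witnesses, reusing the boolean_convolution helper sketched after Fact 2.1; blocks are 0-indexed here, and maximum witnesses are obtained by the symmetric index-reversal trick. The code is illustrative rather than the authors' pseudocode.

```python
import math

def minimum_witnesses(a, b):
    # min_wit[i] = smallest witness of c_i for the Boolean convolution c of a and b,
    # or -1 if c_i = 0; runs in O~(n^{1.5}) with r ~ sqrt(n) blocks of a.
    n = len(a)
    r = math.isqrt(n) + 1
    num_blocks = (n + r - 1) // r
    c_blocks = []
    for p in range(num_blocks):                  # block p keeps indices [p*r, (p+1)*r) of a
        a_p = [a[l] if p * r <= l < (p + 1) * r else 0 for l in range(n)]
        c_blocks.append(boolean_convolution(a_p, b))
    min_wit = [-1] * (2 * n - 1)
    for i in range(2 * n - 1):
        for p in range(num_blocks):
            if c_blocks[p][i]:                   # first block containing a witness of c_i
                for l in range(p * r, min((p + 1) * r, n)):
                    if 0 <= i - l < n and a[l] and b[i - l]:
                        min_wit[i] = l
                        break
                break
    return min_wit

def maximum_witnesses(a, b):
    # Maximum witnesses via reversal: witness l of c_i corresponds to witness
    # n-1-l of coordinate 2n-2-i for the reversed vectors.
    n = len(a)
    rev = minimum_witnesses(list(a)[::-1], list(b)[::-1])
    return [n - 1 - rev[2 * n - 2 - i] if rev[2 * n - 2 - i] >= 0 else -1
            for i in range(2 * n - 1)]
```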

The method of Theorem 3.2 can be generalized to include the smallest k-witness problem and the largest k-witness problem.

Theorem 3.3

The smallest k-witness problem as well as the largest k-witness problem for Boolean convolution of two n-dimensional vectors can be solved in \(\tilde{O}(n^{1.5}k^{0.5})\) time.

Proof

Let a and b be two input n-dimensional vectors. Let r be an integer parameter between 1 and n. Analogously to the proof of Theorem 3.2, for \(p=1,\ldots ,\lceil n/r \rceil ,\) we let \(a^p\) denote the Boolean n-dimensional vector resulting from setting to zero all coordinates of a with indices not exceeding \((p-1)r\) and those with indices greater than pr. Next, we compute for each \(p=1,\ldots ,\lceil n/r \rceil ,\) the arithmetic convolution \(w^p\) of \(a^p\) and b by interpreting these vectors as \(0-1\) ones. The arithmetic convolutions provide us with the number of witnesses in each interval \(((p-1)r,pr]\) for each coordinate \(c_i\) of the Boolean convolution c of a and b. Their coordinate-wise sum provides us with the total number of witnesses for each coordinate of c. In order to solve the smallest k-witness problem, for \(p=1,\ldots ,\lceil n/r \rceil ,\) and for \(i=0,\ldots ,2n-2,\) whenever \(w^p_i>0\) and the number of witnesses for \(c_i\) found so far is less than the minimum of k and the number of witnesses of \(c_i,\) we search through the interval \(((p-1)r,pr]\) from the left to the right for further witnesses. For details, see the algorithm depicted in Fig. 1. In the worst case, for each \(i=0,\ldots ,2n-2,\) we need to search through k such intervals. The total cost of the searches becomes \(O(n\times \frac{n}{r} +n\times k\times r),\) see lines 15–19 in the algorithm depicted in Fig. 1. On the other hand, the \(\lceil n/r \rceil \) computations of the arithmetic convolutions \(w^p\) take \(\tilde{O}(n^2/r)\) time. By setting \(r=\lceil \sqrt{\frac{n}{k}} \rceil ,\) we obtain the claimed time complexity for the smallest k-witness problem.

The largest k-witness problem can be solved analogously in the same asymptotic time by considering the intervals in the opposite order and searching them from the right to the left instead. \(\square \)

Fig. 1

An algorithm for the smallest k-witness problem for the Boolean convolution of two n-dimensional vectors a and b
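Since the pseudocode of Fig. 1 is not reproduced here, the following Python sketch follows the description in the proof of Theorem 3.3 instead; it reuses the arithmetic_convolution helper from Sect. 2 and uses 0-indexed blocks. The names are illustrative.

```python
import math

def smallest_k_witnesses(a, b, k):
    # Up to k smallest witnesses for every coordinate of the Boolean convolution
    # of a and b, with blocks of width r ~ sqrt(n/k) as in the proof of Theorem 3.3.
    n = len(a)
    r = max(1, math.isqrt(max(1, n // k)))
    num_blocks = (n + r - 1) // r
    w_blocks = []
    for p in range(num_blocks):                  # witness counts of each block, per coordinate
        a_p = [a[l] if p * r <= l < (p + 1) * r else 0 for l in range(n)]
        w_blocks.append(arithmetic_convolution(a_p, b))
    witnesses = [[] for _ in range(2 * n - 1)]
    for i in range(2 * n - 1):
        for p in range(num_blocks):
            if len(witnesses[i]) >= k:
                break
            if w_blocks[p][i] > 0:               # block p contains at least one witness of c_i
                for l in range(p * r, min((p + 1) * r, n)):
                    if 0 <= i - l < n and a[l] and b[i - l]:
                        witnesses[i].append(l)
                        if len(witnesses[i]) >= k:
                            break
    return witnesses
```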

3.1 String Matching

Fischer and Paterson showed already in 1974 [12] that several string matching problems can be efficiently reduced to Boolean vector convolution.

Suppose we are given two strings \(\tau =\tau _{m-1}\tau _{m-2}...\tau _{0}\) and \(\rho =\rho _0\rho _1...\rho _{n-1},\) where \(m<n,\) over a finite alphabet \(\varSigma .\) Following [12], for \(\gamma \in \varSigma ,\) let \(H_{\gamma }(\ )\) be a function from \(\varSigma \) to \(\{\) true, false \(\}\) such that \(H_{\gamma }(x )=true\) if and only if \(x=\gamma .\) If \(i+m\le n,\) the question of whether \(\tau _{m-1}\tau _{m-2}...\tau _{0}\) matches \(\rho _i\rho _{i+1}...\rho _{i+m-1}\) is equivalent to a conjunction of the negations of terms \(\bigvee ^{m-1}_{l=0} H_{\alpha }(\rho _{i+l})\wedge H_{\beta }(\tau _{m-1-l}),\) where \(\alpha ,\beta \in \varSigma \) and \(\alpha \ne \beta .\) Note that whenever such a term is true, the matching cannot take place as at some position \(\alpha \) clashes with \(\beta .\) In this way, the standard string matching problem for \(\tau \) and \(\rho \) easily reduces to \(O(|\varSigma |^2)\) Boolean convolutions of two Boolean vectors of length at most n.

Observe now that witnesses for the aforementioned Boolean convolutions yield positions of the clashes, in other words, symbol mismatches. If we modify the terms to \(\bigvee ^{m-1}_{l=0} H_{\alpha }(\rho _{i+l})\wedge H_{\alpha }(\tau _{m-1-l}),\) for \(\alpha \in \varSigma ,\) the witnesses for the \(O(|\varSigma |)\) Boolean convolutions yield positions of two-sided matches with \(\alpha \in \varSigma .\) Hence, we obtain the following theorem as a corollary from Theorem 3.3.

Theorem 3.4

Consider the string matching problem for a text string of length n and a pattern string of length \(m<n,\) both over a finite alphabet. For each alignment of the pattern with the text, we can provide locations of the k earliest symbol mismatches and/or the k earliest symbol matches as well as locations of the k latest symbol mismatches and the k latest symbol matches in the alignments in \(\tilde{O}(n^{1.5}k^{0.5})\) time in total. In particular, we can also provide positions of the earliest and/or latest two-sided symbol matches with a given alphabet symbol (cf. the ones problem in [17]) in the alignments in \(\tilde{O}(n^{1.5}k^{0.5})\) time in total.
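To make the reduction concrete, the following Python sketch reports up to k earliest mismatch offsets for every alignment, building on the smallest_k_witnesses sketch above. Our indexing uses a pattern P[0..m-1] rather than the reversed \(\tau \)-notation, and the reversed indicator of P plays the role of the \(H_{\beta }\) terms; alignments absent from the returned dictionary have no mismatch at all.

```python
from collections import defaultdict

def k_earliest_mismatches(T, P, k):
    # For each alignment i of P with T (0 <= i <= n-m), the k earliest offsets l
    # with T[i+l] != P[l], via O(|Sigma|^2) smallest-k-witness computations.
    n, m = len(T), len(P)
    sigma = set(T) | set(P)
    per_alignment = defaultdict(list)
    for alpha in sigma:
        A = [1 if ch == alpha else 0 for ch in T]
        for beta in sigma:
            if alpha == beta:
                continue
            # reversed indicator of beta in P, zero-padded to the text length
            B = [1 if P[m - 1 - j] == beta else 0 for j in range(m)] + [0] * (n - m)
            wit = smallest_k_witnesses(A, B, k)
            for i in range(n - m + 1):           # alignment i corresponds to coordinate i+m-1
                for j in wit[i + m - 1]:
                    per_alignment[i].append(j - i)
    return {i: sorted(offs)[:k] for i, offs in per_alignment.items()}
```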

3.2 \((\min , +)\) Convolution

Our original motivation has been an extension of the \(O(n^{1.859})\)-time algorithm due to Chan and Lewenstein for the \((\min ,+)\) convolution of two n-dimensional vectors with integer coordinates of size O(n) forming monotone sequences [6] to include the case where the vectors are decomposable into relatively few monotone subsequences. The major difficulty here is that a completion of the subsequences to full monotone sequences can affect the result. Roughly, we can avoid this difficulty when the coordinates of each of the vectors range over relatively few different values or all the subsequences are simultaneously either non-decreasing or non-increasing (see Table 1). The idea is to use our algorithm for minimum and maximum witnesses of Boolean convolution.
Fig. 2

An algorithm for computing the \((\min ,+)\) convolution c of two n-dimensional integer vectors a and b,  where the coordinates of a range over \(c_a\) different values and the sequence of consecutive coordinates of b is decomposable into \(m_b\) monotone subsequences

The correctness of the algorithm depicted in Fig. 2 relies on the following straightforward lemma.

Lemma 3.5

In the algorithm depicted in Fig. 2, the following equivalence holds: \(d_k\ne 0\) in line 13 if and only if \(\min \{a_l+b_m|l+m=k \wedge a_l\in a^i \wedge b_m\in b^j \}\) is equal to the first argument of the minimum in this line.

Theorem 3.6

Let a and b be two n-dimensional integer vectors such that the coordinates of a range over at most \(c_a\) different values while the sequence of the consecutive coordinates of b can be decomposed into \(m_b\) monotone subsequences. The algorithm depicted in Fig. 2 computes their \((\min ,+)\) convolution in \(\tilde{O}(c_am_bn^{1.5})\) steps.

Proof

By Lemma 3.5 and line 13 in the algorithm, none of the coordinates of the output vector has a lower value than the corresponding coordinate of the \((\min ,+)\) convolution of a and b. Conversely, if the k-th coordinate of the \((\min ,+)\) convolution of a and b equals \(a_l+b_m\), where \(l+m=k,\) then there exist i, j such that \(a_l\in a^i \) and \(b_m\in b^j .\) Hence, again by Lemma 3.5 and line 13 in the algorithm, the k-th coordinate of the output vector has a value not larger than the k-th coordinate of the \((\min ,+)\) convolution of a and b.
Fig. 3

An algorithm for computing the \((\min ,+)\) convolution c of two n-dimensional integer vectors a and b given with their decompositions into \(m_a\) and \(m_b\) subsequences that are either all non-decreasing or all non-increasing

The decomposition of the vector a into \(c_a\) constant subsequences in line 1 trivially takes O(n) time. Next, the decomposition of the vector b into \(\tilde{O}(m_b)\) monotone subsequences in line 5 takes \(O(n^{1.5}\log n)\) time by Fact 2.2. The forming of the vectors \(char(a^i)\) in lines 2–3 and \(char(b^j)\) in lines 6–7 takes \(\tilde{O}(c_an+m_bn)\) time in total. The \(\tilde{O}(c_am_b)\) computations of the minimum and maximum witnesses of the Boolean convolution d in lines 9–11 take \(\tilde{O}(c_am_bn^{1.5})\) time in total by Theorem 3.2. Finally, line 13 is executed \(\tilde{O}(c_am_bn)\) times. The bound \(\tilde{O}(c_am_bn^{1.5})\) follows. \(\square \)
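The following Python sketch implements the approach of Theorem 3.6 under our own conventions (the pseudocode of Fig. 2 is not reproduced here). It reuses the minimum_witnesses/maximum_witnesses helpers sketched above, takes the witness index on the b-side so that, within a pair, the witness directly points to the cheapest coordinate of b, and splits b greedily into maximal monotone runs for simplicity; Fact 2.2 would be used instead to obtain the \(\tilde{O}(m_b)\) decomposition behind the stated bound.

```python
from collections import defaultdict

def minplus_few_values_times_monotone(a, b):
    # (min,+) convolution when a has few distinct values; b is split into monotone runs.
    n = len(a)
    c = [float("inf")] * (2 * n - 1)
    groups_a = defaultdict(list)                 # constant subsequences of a, one per value
    for l, v in enumerate(a):
        groups_a[v].append(l)
    runs, start = [], 0                          # greedy maximal monotone runs of b
    while start < n:
        end = start
        if end + 1 < n and b[end + 1] >= b[end]:
            while end + 1 < n and b[end + 1] >= b[end]:
                end += 1
            direction = "nondecr"
        else:
            while end + 1 < n and b[end + 1] <= b[end]:
                end += 1
            direction = "nonincr"
        runs.append((set(range(start, end + 1)), direction))
        start = end + 1
    for v, pos_a in groups_a.items():
        sa = set(pos_a)
        char_a = [1 if l in sa else 0 for l in range(n)]
        for pos_b, direction in runs:
            char_b = [1 if m in pos_b else 0 for m in range(n)]
            # witness index refers to char_b (first argument): for a non-decreasing
            # run the smallest b-position is cheapest, for a non-increasing run the largest
            wit = (minimum_witnesses if direction == "nondecr" else maximum_witnesses)(char_b, char_a)
            for s in range(2 * n - 1):
                m = wit[s]
                if m >= 0:
                    c[s] = min(c[s], v + b[m])
    return c
```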

If we are given decompositions of the two input n-dimensional vectors a and b into monotone subsequences that are either all non-decreasing or all non-increasing, then we can use the algorithm depicted in Fig. 3, which is analogous to that depicted in Fig. 2, in order to compute the \((\min ,+)\) convolution of a and b. Thus, first, for each subsequence \(a^i\) of a and each subsequence \(b^j\) of b, we compute the Boolean vectors \(char(a^i)\) and \(char(b^j)\) indicating with ones the coordinates of a or b covered by \(a^i\) or \(b^j,\) respectively. Next, depending on whether the subsequences are non-decreasing or non-increasing, for each pair of such subsequences \(a^i\) and \(b^j\), we compute the minimum witnesses of the Boolean convolution of \(char(a^i)\) and \(char(b^j)\) or the maximum witnesses of this convolution, respectively. We use the extreme witnesses to update the current coordinates of the computed \((\min ,+)\) convolution analogously as in the algorithm depicted in Fig. 2. Hence, we obtain the following theorem.

Theorem 3.7

Let a and b be two n-dimensional integer vectors given with the decompositions of the sequences of their consecutive coordinates into \(m_a\) and \(m_b\) monotone subsequences respectively such that all the subsequences are either non-decreasing or non-increasing. The algorithm depicted in Fig. 3 computes the \((\min ,+)\) convolution of a and b in \(\tilde{O}(m_am_bn^{1.5})\) time.

Proof

The proof of the correctness of the algorithm depicted in Fig. 3 is analogous to that of the correctness of the algorithm depicted in Fig. 2. The time complexity analysis of the former algorithm is also similar to that of the latter algorithm. The main difference is that the decompositions of a and b into subsequences are given and that the \(O(n^{1.5})\)-time algorithm for minimum or maximum witnesses of Boolean convolution is run \(m_am_b\) times instead of \(c_am_b\) times. \(\square \)

Fig. 4

An algorithm for computing the \((\min ,+)\) convolution c of two n-dimensional integer vectors a and b whose coordinates range over at most \(c_a\) or \(c_b\) different values, respectively

By combining Fact 2.3 with Fact 2.4, we also obtain the following bound.

Theorem 3.8

Let a and b be two n-dimensional integer vectors such that the coordinates of a range over at most \(c_a\) different values. The \((\min ,+)\) convolution of a and b can be computed in \(\tilde{O}(c_an^{1.844})\) time.

We can also consider the problem of computing the \((\min ,+)\) integer vector convolution of the input vectors a and b, when their coordinates range over \(c_a\) and \(c_b\) different integers, respectively. We can use the algorithm depicted in Fig. 4, analogous to that depicted in Fig. 2. The first difference is that the subsequences \(b^j\) on the side of b are also constant. It follows that for any pair of such constant subsequences \(a^i\) and \(b^j,\) the value of the sum of any element from \(a^i\) with any element from \(b^j\) is constant and it can be trivially computed as \(a^i_1+b^j_1\) a priori. For this reason, it is sufficient to compute the Boolean convolution d of \(char(a^i)\) and \(char(b^j)\) for each pair \(a^i\) and \(b^j\). Then, for any non-zero coordinate of d, we need to update the corresponding coordinate of the computed \((\min ,+)\) convolution of a and b by taking the minimum of the coordinate and \(a^i_1+b^j_1.\) By Fact 2.1, we obtain the following theorem.

Theorem 3.9

Let a and b be two n-dimensional integer vectors such that their coordinates range over at most \(c_a\) or \(c_b\) different values, respectively. The algorithm depicted in Fig. 4 computes the \((\min ,+)\) convolution of a and b in \(\tilde{O}(c_ac_bn)\) time.

Proof

The algorithm depicted in Fig. 4 can be easily implemented in \(\tilde{O} (c_ac_bn)\) time by running \(c_ac_b\) times the known \(\tilde{O}(n)\)-time algorithm for Boolean convolution of two n-dimensional Boolean vectors, see Fact 2.1. \(\square \)
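A minimal Python sketch of the approach of Theorem 3.9 follows (the pseudocode of Fig. 4 is not reproduced here); it reuses the boolean_convolution helper from Sect. 2, and the function names are ours.

```python
from collections import defaultdict

def minplus_few_values_both(a, b):
    # (min,+) convolution when a and b range over few distinct values: one Boolean
    # convolution per pair of values, no witnesses needed.
    n = len(a)
    c = [float("inf")] * (2 * n - 1)
    groups_a, groups_b = defaultdict(list), defaultdict(list)
    for l, v in enumerate(a):
        groups_a[v].append(l)
    for m, w in enumerate(b):
        groups_b[w].append(m)
    for v, pos_a in groups_a.items():
        char_a = [0] * n
        for l in pos_a:
            char_a[l] = 1
        for w, pos_b in groups_b.items():
            char_b = [0] * n
            for m in pos_b:
                char_b[m] = 1
            d = boolean_convolution(char_a, char_b)
            for s in range(2 * n - 1):
                if d[s]:                         # some a_l = v and b_{s-l} = w with l + (s-l) = s
                    c[s] = min(c[s], v + w)
    return c
```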

4 Extreme Witnesses for Boolean Matrix Product

For two \(n\times n\) Boolean matrices A and B, a witness of an entry C[i, j] of the Boolean matrix product C of A and B is any index m such that \(A[i,m]\wedge B[m,j]=1.\) Next, the minimum witness and maximum witness for an entry of C as well as the witness problem, the minimum and maximum witness problems, the k-witness problem, and the smallest k-witness and largest k-witness problems for Boolean matrix product of A and B are defined analogously as those for Boolean vector convolution.

In this section, we shall present a generalization of the algorithm for minimum and maximum witnesses for Boolean matrix product from [10] to include the smallest and largest k-witness problems.

Let \(\ell \) be a positive integer smaller than n. We may assume w.l.o.g. that n is divisible by \(\ell \). Partition the matrix A into \(n \times \ell \) sub-matrices \(A_p\) and the matrix B into \(\ell \times n\) sub-matrices \(B_p\), such that \(1 \le p \le n/\ell \), and the sub-matrix \(A_p\) covers the columns \((p-1) \, \ell + 1\) through \(p \, \ell \) of A whereas the sub-matrix \(B_p\) covers the rows \((p-1) \, \ell + 1\) through \(p \, \ell \) of B.

For \(p = 1, \ldots , n/\ell \), let \(W_p\) be the arithmetic product of \(A_p\) and \(B_p\) treated as \(0-1\) matrices. On the other hand, let C denote the Boolean matrix product of A and B. Then, \(W_p[i,j] =q\) if and only if there are exactly q witnesses of C[i, j] in the interval \(((p-1) \, \ell , p \, \ell ]\). Consequently, the total number of witnesses of C[i, j] is given by \(\sum _{p=1}^{n/\ell } W_p[i,j]\). Hence, we obtain the following lemma.

Lemma 4.1

Suppose that an entry C[i, j] of the Boolean product C of A and B is positive. Let q be the minimum of k and the total number of witnesses of C[i, j]. Next, let \(p'\) be the minimum value of p such that \(\sum _{u=1}^{p} W_u[i,j]\) is not less than q. The smallest q witnesses of C[i, j] belong to the interval \([1, p' \, \ell ]\).

Fig. 5

An algorithm for the smallest k-witness problem for the Boolean matrix product of two \(n\times n\) Boolean matrices A and B

By this lemma, after computing all the matrix products \(W_p = A_p \cdot B_p\), \(1 \le p\le n/\ell \), we need \(O(n/\ell + k\ell )\) time per positive entry of C to find up to k smallest witnesses: \(O(n/\ell )\) time to determine \(p'\) and then \(O(k\ell )\) time to locate the up to k smallest witnesses. See Fig. 5 for our algorithm for the smallest k-witness problem.
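The following Python sketch follows the block scheme described above (the pseudocode of Fig. 5 is not reproduced here). The block products \(W_p\) are formed with numpy's ordinary matrix multiplication, which is only a stand-in: the bound of Theorem 4.2 requires fast rectangular matrix multiplication for this step.

```python
import numpy as np

def smallest_k_witnesses_matrix(A, B, k, ell):
    # Up to k smallest witnesses for every entry of the Boolean product of A and B,
    # using column/row blocks of width ell.
    A, B = np.asarray(A, dtype=np.int64), np.asarray(B, dtype=np.int64)
    n = A.shape[0]
    starts = range(0, n, ell)
    # W[p][i, j] = number of witnesses of C[i, j] among the indices of block p
    W = [A[:, s:s + ell] @ B[s:s + ell, :] for s in starts]
    witnesses = [[[] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for p, s in enumerate(starts):
                if len(witnesses[i][j]) >= k:
                    break
                if W[p][i, j] > 0:               # block p contributes at least one witness
                    for m in range(s, min(s + ell, n)):
                        if A[i, m] and B[m, j]:
                            witnesses[i][j].append(m)
                            if len(witnesses[i][j]) >= k:
                                break
    return witnesses
```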

Recall that \(\omega (1,r,1)\) denotes the exponent of the multiplication of an \(n \times n^r\) matrix by an \(n^r \times n\) matrix. It follows that the total time taken by our algorithm for the smallest k-witness problem is
$$\begin{aligned} O((n/\ell ) \cdot n^{\omega (1, \log _n\ell , 1)}+ n^3/\ell + n^2 \, k\ell ) . \end{aligned}$$
By setting r to \(\log _n\ell \) and z to \(\log _n k\), our upper bound transforms to \(O(n^{1-r+ \omega (1, r , 1)}+ n^{3-r} + n^{2+r+z})\). Note that by assuming \(r \ge \frac{1}{2} -\frac{z}{2}\), we can get rid of the additive \(n^{3-r}\) term. See Fig. 5 in [24] for the graph of the function \(1 - r + \omega (1,r, 1).\) By solving the equation \(1 - \lambda + \omega (1, \lambda , 1) = 2 + z+ \lambda \), which implies \(\lambda \ge \frac{1}{2} -\frac{z}{2}\) since \(\omega (1, \lambda , 1) \ge 2\), we obtain our main result.

Theorem 4.2

Let \(\lambda \) be such that \(\omega (1, \lambda , 1) = 1 + 2 \, \lambda + \log _n k\). The smallest k-witness problem as well as the largest k-witness problem for the Boolean matrix product of two \(n \times n\) Boolean matrices can be solved in \(O(n^{2 + \lambda }k)\) time.

Le Gall has recently substantially improved the upper time bounds on rectangular matrix multiplication in [16]. In consequence, he could show that the solution \(\mu \) of the equation \(\omega (1, \mu , 1) = 1 + 2 \, \mu \) satisfies \(\mu < 0.5302.\) This in particular improves the upper time bound for the minimum and maximum witness problems from \(O(n^{2.575})\) to \(O(n^{2.5302})\). It follows that for \(k\gg 1,\) \(\lambda \) in Theorem 4.2 is substantially smaller than 0.5302.

4.1 Lightest Triangles

By generalizing the reduction from [21] of the problem of reporting, for each edge of a vertex-weighted graph, a heaviest triangle containing it to the maximum witness problem for Boolean matrix product, so that it also covers reporting k heaviest triangles and the largest k-witness problem, we obtain the following theorem as a corollary of Theorem 4.2.

Theorem 4.3

Let G be an undirected vertex-weighted graph on n vertices and let k be a natural number not exceeding n. Next, let \(\lambda \) be such that \(\omega (1, \lambda , 1) = 1 + 2 \, \lambda + \log _n k\). We can list, for each edge \(\{u,v\}\) of G, \(q_e\) lightest (heaviest) triangles \(\{u,v,w\}\) in G, where \(q_e\) is the minimum of k and the number of triangles of the form \(\{u,v,w\}\) in G, in \(O(n^{2 + \lambda }k)\) time.

Proof

Number the vertices of G in non-decreasing vertex-weight order. Next, solve the smallest (largest) k-witness problem for the Boolean matrix product C of the adjacency matrix of G with itself. For each edge \(e=\{i,j\}\) of G, the up to k smallest (or, largest) witnesses of C[i, j] yield the \(q_e\) lightest (or, heaviest, respectively) triangles in G including e. Theorem 4.2 yields the claimed upper bound. \(\square \)
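For illustration, a Python sketch of this reduction follows, reusing the smallest_k_witnesses_matrix sketch above; adj is assumed to be a 0-1 adjacency matrix without self-loops, and the block width ell is a tunable parameter, not the optimized choice behind Theorem 4.2.

```python
import math

def k_lightest_triangles_per_edge(adj, weights, k, ell=None):
    # For each edge {u, v}, up to k lightest triangles {u, v, w}, via the k smallest
    # witnesses after renumbering vertices by non-decreasing weight.
    n = len(weights)
    order = sorted(range(n), key=lambda v: weights[v])   # new index -> original vertex
    M = [[adj[order[i]][order[j]] for j in range(n)] for i in range(n)]
    ell = ell or max(1, math.isqrt(n))
    wit = smallest_k_witnesses_matrix(M, M, k, ell)
    triangles = {}
    for i in range(n):
        for j in range(i + 1, n):
            if M[i][j]:
                # each witness m closes a triangle; smaller witnesses mean lighter third vertices
                triangles[(order[i], order[j])] = [order[m] for m in wit[i][j]]
    return triangles
```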

As for the problem of finding k lightest (heaviest) triangles in a vertex-weighted graph, iterating the \(O(n^{\omega }+n^{2+o(1)})\)-time algorithm for finding a lightest or heaviest triangle described in Fact 2.5 seems to be a better choice for up to moderate values of k. Before each next iteration, we remove the three vertices of the most recently reported triangle. After k iterations, we stop and find, among the reported triangles and no more than \(3(k-1)n^2\) other triangles incident to the removed vertices, the k lightest (heaviest) triangles if possible. The method takes \(O(n^{\omega }k+n^{2+o(1)}k+n^2k),\) i.e., \(O(n^{\omega }k+n^{2+o(1)}k)\) time.
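A Python sketch of this iteration follows; the naive cubic-time search inside it merely stands in for the algorithm of Fact 2.5, which is what gives the per-iteration cost stated in Theorem 4.4.

```python
def k_lightest_triangles(adj, weights, k):
    # Repeatedly find a lightest triangle, remove its vertices, and finally pick the
    # k lightest among the reported triangles and all triangles touching removed vertices.
    n = len(weights)

    def lightest(active):
        best, verts = None, sorted(active)       # naive stand-in for Fact 2.5
        for x in range(len(verts)):
            for y in range(x + 1, len(verts)):
                u, v = verts[x], verts[y]
                if not adj[u][v]:
                    continue
                for z in range(y + 1, len(verts)):
                    w = verts[z]
                    if adj[u][w] and adj[v][w]:
                        cand = (weights[u] + weights[v] + weights[w], (u, v, w))
                        best = cand if best is None or cand < best else best
        return best

    active, candidates = set(range(n)), []
    for _ in range(k):
        t = lightest(active)
        if t is None:
            break
        candidates.append(t)
        active -= set(t[1])                      # remove the triangle's three vertices
    for v in set(range(n)) - active:             # triangles incident to a removed vertex
        for u in range(n):
            if u != v and adj[u][v]:
                for w in range(u + 1, n):
                    if w != v and adj[u][w] and adj[v][w]:
                        tri = tuple(sorted((u, v, w)))
                        candidates.append((sum(weights[x] for x in tri), tri))
    best_by_tri = {tri: wt for wt, tri in candidates}
    return sorted((wt, tri) for tri, wt in best_by_tri.items())[:k]
```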

Theorem 4.4

Let G be an undirected vertex-weighted graph on n vertices and let k be a natural number not exceeding n. Next, let \(\lambda \) be such that \(\omega (1, \lambda , 1) = 1 + 2 \, \lambda + \log _n k\). We can list q lightest (heaviest) triangles in G, where q is the minimum of k and the number of triangles in G, in \(O(\min \{n^{\omega }k+n^{2+o(1)}k, n^{2 + \lambda }k\})\) time.

Finding or detecting triangles of extreme weight in vertex-weighted graphs has a number of applications. First of all, it can be used to solve the corresponding general problem of finding or detecting subgraphs or induced subgraphs of extreme weight [11, 20, 21]. Vassilevska and Williams also list two other applications in [19]: a general variant of the 3-SUM problem and a general buyer-seller problem in computational economics.

5 Final Remarks

It is an interesting open problem whether any of our upper time bounds on minimum and maximum witnesses for Boolean vector convolution and on the extreme k-witness problems, both for Boolean vector convolution and Boolean matrix product, can be substantially improved. Note here that the \(O(n^{2+\lambda })\) time bound (where \(\omega (1, \lambda , 1) = 1 + 2\lambda \)) on minimum and maximum witnesses of Boolean matrix product, established one decade ago [10], has so far not been improved (see also [8]).

The problems of Boolean vector convolution and Boolean matrix product seem to be similar but there are some substantial differences between them. The former problem admits an algorithm that is almost linear in the input size, while for the latter problem the current upper time bound is substantially super-linear [15, 22]. There is a moderately efficient reduction of vector convolution to matrix product described in Fact 2.3, while such a reverse reduction is not known. Our upper time bounds for minimum and maximum witnesses of Boolean vector convolution show that a direct approach to Boolean vector convolution can yield better upper time bounds than those obtained by transferring, via Fact 2.3, the known upper time bounds for the witness problems for Boolean matrix product to the corresponding problems for Boolean vector convolution.

The extreme k-witness problems for Boolean matrix product presumably admit several other applications, often corresponding to generalizations of the applications of minimum and maximum witnesses of Boolean matrix product [18, 21] and/or of the k-witness problem for Boolean matrix product [4], e.g., the all-pairs k-bottleneck paths.

Finally, a potentially interesting direction for further research is to consider approximation variants of the extreme witnesses problems and the \((\min , +)\) vector convolution.

Footnotes

  1. For brevity, we shall identify the i-th coordinate of a vector v with its value \(v_i\) in the continuation.


Acknowledgements

We thank Mirosław Kowaluk and anonymous reviewers of the preliminary version of this paper for valuable comments. This research has been supported in part by Swedish Research Council Grant 621-2011-6179.

References

  1. Aho, A.V., Hopcroft, J.E., Ullman, J.D.: The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading (1974)
  2. Alon, N., Galil, Z., Margalit, O., Naor, M.: Witnesses for Boolean matrix multiplication and for shortest paths. In: Proceedings of the 33rd Symposium on Foundations of Computer Science (FOCS), pp. 417–426 (1992)
  3. Alon, N., Naor, M.: Derandomization, witnesses for Boolean matrix multiplication and construction of perfect hash functions. Algorithmica 16, 434–449 (1996)
  4. Aumann, Y., Lewenstein, M., Lewenstein, N., Tsur, D.: Finding witnesses by peeling. In: Proceedings of Combinatorial Pattern Matching (CPM). LNCS, vol. 4580, pp. 28–39. Springer (2007)
  5. Bremner, D., Chan, T.M., Demaine, E.D., Erickson, J., Hurtado, F., Iacono, J., Langerman, S., Patrascu, M., Taslakian, P.: Necklaces, convolutions and X + Y. Algorithmica 69, 294–314 (2014)
  6. Chan, T.M., Lewenstein, M.: Clustered integer 3SUM via additive combinatorics. In: Proceedings of the 47th ACM Symposium on Theory of Computing (STOC 2015)
  7. Chan, T.M.: More algorithms for all-pairs shortest paths in weighted graphs. SIAM J. Comput. 39(5), 2025–2089 (2010)
  8. Cohen, K., Yuster, R.: On minimum witnesses for Boolean matrix multiplication. Algorithmica 69(2), 431–442 (2014)
  9. Crochemore, M., Rytter, W.: Text Algorithms. Oxford University Press, New York (1994)
  10. Czumaj, A., Kowaluk, M., Lingas, A.: Faster algorithms for finding lowest common ancestors in directed acyclic graphs. Theoretical Computer Science 380(1–2), 37–46 (2007), special ICALP 2005 issue
  11. Czumaj, A., Lingas, A.: Finding a heaviest vertex-weighted triangle is not harder than matrix multiplication. SIAM J. Comput. 39(2), 431–444 (2009)
  12. Fischer, M.J., Paterson, M.S.: String-matching and other products. In: Proceedings of the 7th SIAM-AMS Complexity of Computation, pp. 113–125 (1974)
  13. Fomin, F.V., Kratsch, D., Novelli, J.: Approximating minimum cocolorings. Inf. Process. Lett. 84, 285–290 (2002)
  14. Gasieniec, L., Kowaluk, M., Lingas, A.: Faster multi-witnesses for Boolean matrix product. Inf. Process. Lett. 109, 242–247 (2009)
  15. Le Gall, F.: Powers of tensors and fast matrix multiplication. In: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, pp. 296–303 (2014)
  16. Le Gall, F.: Faster algorithms for rectangular matrix multiplication. In: Proceedings of the 53rd Symposium on Foundations of Computer Science (FOCS), pp. 514–523 (2012)
  17. Muthukrishnan, S.: New results and open problems related to non-standard stringology. In: Proceedings of the 6th Combinatorial Pattern Matching (CPM). LNCS, vol. 937, pp. 298–317. Springer (1995)
  18. Shapira, A., Yuster, R., Zwick, U.: All-pairs bottleneck paths in vertex weighted graphs. Algorithmica 59, 621–633 (2011)
  19. Vassilevska, V., Williams, R.: Finding a maximum weight triangle in \(n^{3 - \delta }\) time, with applications. In: Proceedings of the 38th Annual ACM Symposium on Theory of Computing (STOC 2006), pp. 225–231. ACM (2006)
  20. Vassilevska, V., Williams, R.: Finding, minimizing, and counting weighted subgraphs. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC 2009), pp. 455–463. ACM (2009)
  21. Vassilevska, V., Williams, R., Yuster, R.: Finding heaviest H-subgraphs in real weighted graphs, with applications. ACM Trans. Algorithms 6(3), 44:1–44:23 (2010)
  22. Vassilevska Williams, V.: Multiplying matrices faster than Coppersmith–Winograd. In: Proceedings of the 44th Annual ACM Symposium on Theory of Computing (STOC), pp. 887–898 (2012)
  23. Yang, B., Chen, J., Lu, E., Zheng, S.Q.: A comparative study of efficient algorithms for partitioning a sequence into monotone subsequences. In: Proceedings of the 4th Theory and Applications of Models of Computation (TAMC). LNCS, vol. 4484, pp. 46–57. Springer (2007)
  24. Zwick, U.: All pairs shortest paths using bridging sets and rectangular matrix multiplication. J. ACM 49(3), 289–317 (2002)

Copyright information

© The Author(s) 2018

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Department of Computer Science, Lund University, Lund, Sweden
  2. Department of Computer Science, Malmö University, Malmö, Sweden
