1 Introduction

Set \([n]=\left\{ 1,\ldots ,n\right\} \). Let \(S_n\) denote the symmetric group on [n], and define \(S_{n,k}\) as the set of all ordered tuples \((x_1,\ldots ,x_k)\in [n]^k\) whose entries \(x_j\) are distinct. Following [14], a PSCA of strength k and multiplicity \(\lambda \ge 1\) on n symbols is a matrix \(A\in [n]^{(\lambda k!)\times n}\) with distinct symbols per row such that each sequence \(s\in S_{n,k}\) is contained as a subsequence in exactly \(\lambda \) rows (s is said to be covered \(\lambda \) times by A). Denote the class of all such matrices A by \({\text {PSCA}}(n,k,\lambda )\). Let \(g(n,k)\) be the minimum multiplicity \(\lambda \) such that \({\text {PSCA}}(n,k,\lambda )\) is non-empty, and call \(g(n,k)\) the perfect sequence covering array number (in analogy to [2]). A PSCA can be seen as a \((k,\lambda )\)-directed design of blocksize n and order n (cf. [4]). PSCAs are particular sequence covering arrays (SCAs), i.e., matrices with permutations as rows that cover each \(s\in S_{n,k}\) at least once [2]. SCAs were introduced in [8] for the purpose of event sequence testing (cf. also [2, 4]).
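The definition can be checked by brute force for small parameters. The following is a minimal sketch, not taken from [14]; the function names `coverage` and `psca_multiplicity` are illustrative assumptions. It counts, for every \(s\in S_{n,k}\), the rows containing s as a subsequence.

```python
from collections import Counter
from itertools import combinations, permutations
from math import factorial


def coverage(rows, k):
    """Count, for each ordered k-tuple, the rows containing it as a subsequence."""
    counts = Counter()
    for row in rows:
        # combinations() preserves the left-to-right order of the row, so it
        # enumerates exactly the length-k subsequences of that row.
        for sub in combinations(row, k):
            counts[sub] += 1
    return counts


def psca_multiplicity(rows, n, k):
    """Return lambda if rows form a PSCA(n, k, lambda), otherwise None."""
    symbols = tuple(range(1, n + 1))
    if any(sorted(row) != list(symbols) for row in rows):
        return None  # every row must be a permutation of [n]
    lam, remainder = divmod(len(rows), factorial(k))
    if remainder != 0 or lam == 0:
        return None  # the number of rows must equal lambda * k!
    counts = coverage(rows, k)
    if all(counts[s] == lam for s in permutations(symbols, k)):
        return lam
    return None
```

For instance, the two rows (1,2,3) and (3,2,1) cover each ordered pair exactly once, so `psca_multiplicity([(1, 2, 3), (3, 2, 1)], 3, 2)` returns 1, whereas the rows (1,2,3) and (2,1,3) cover the pair (1,3) twice and are rejected.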

Recently, the following two results were obtained by Yuster.

Theorem 1

[14] If k/2 is a prime, then for all \(n\ge k\) we have

$$\begin{aligned} g(n,k)k! \ge \left( {\begin{array}{c}n\\ k/2\end{array}}\right) -\left( {\begin{array}{c}n\\ k/2-1\end{array}}\right) . \end{aligned}$$
(1)

For arbitrary k, provided \(n\gg k\), the bound \(g(n,k) > n^{k/2-o_k(1)}\) applies.

Theorem 2

[14] Let \(n\ge 3\). There exists a constant \(C>0\) such that

$$\begin{aligned} n/6 \le g(n,3) \le C n \log _2\left( n\right) ^{\log _2\left( 7\right) }. \end{aligned}$$

We now point to an isomorphic structure that has been studied in the setting of min-wise independent permutations [3]. We index a family \(\mathcal {F}\subset S_n\) by an index set \([d]\), \(d\in \mathbb {N}\). Elements are allowed to occur repeatedly, and hence the cardinality of \(\mathcal {F}\) refers to the cardinality of the index set. Before we get to the actual definition of interest, we need the concept of the rank of an element.

Definition 1

For a subset \(X\subset [n]\) and arbitrary \(x\in X\), we denote the rank of x in X as \({\text {rank}}(x,X) := |\left\{ y\in X:y < x\right\} |\).

Definition 2

A non-empty family \(\mathcal {F}\subset S_n\) is called k-rankwise independent [6, 13], if for each set \(X=\left\{ x_1,\ldots ,x_k\right\} \) of k distinct elements of \([n]\) and each choice of k distinct values \(r_1,\ldots ,r_k\in \left\{ 0,\ldots ,k-1\right\} \), we have

$$\begin{aligned} {\text {Pr}}\left[ \bigwedge _{i=1}^k {\text {rank}}(\pi (x_i), \left\{ \pi (x_1),\ldots ,\pi (x_k)\right\} ) = r_i\right] =\frac{1}{k!}, \end{aligned}$$
(2)

when \(\pi \) is drawn uniformly at random from \(\mathcal {F}\).

It is clear that (2) corresponds to the following condition: For each sequence \((x_1,\ldots ,x_k)\in S_{n,k}\), a permutation \(\pi \) drawn uniformly at random from \(\mathcal {F}\) satisfies

$$\begin{aligned} {\text {Pr}}\left[ \pi ( x_{1})< \pi (x_{2})< \cdots < \pi (x_{k})\right] = \frac{1}{k!}. \end{aligned}$$
(3)
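Definition 2 can be tested directly on small families. The sketch below is an illustration under stated assumptions: a permutation \(\pi \) is stored as a tuple `pi` with `pi[x - 1]` equal to \(\pi (x)\), and the names `rank` and `is_rankwise_independent` are chosen here for convenience. It tabulates the rank profiles of Definition 1 for every k-subset and compares their frequencies with 1/k!.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def rank(x, X):
    """Definition 1: the number of elements of X that are smaller than x."""
    return sum(1 for y in X if y < x)


def is_rankwise_independent(family, n, k):
    """Check condition (2) for every k-subset X and every rank assignment."""
    d = len(family)
    if d == 0:
        return False
    target = Fraction(1, factorial(k))
    for X in combinations(range(1, n + 1), k):
        profiles = Counter()
        for pi in family:
            images = [pi[x - 1] for x in X]
            profiles[tuple(rank(v, images) for v in images)] += 1
        # Each assignment of distinct ranks 0..k-1 must have probability 1/k!.
        if any(Fraction(profiles[r], d) != target for r in permutations(range(k))):
            return False
    return True
```

With this convention, the family consisting of the identity and the reversal in \(S_3\) passes the test for \(k=2\) but fails it for \(k=3\), since two permutations cannot realize six rank profiles.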

Next, we give a proposition that illuminates the isomorphy of PSCAs and rankwise independent families. Before that, we need an appropriate mapping and an auxiliary observation from which the proposition follows. For the sake of completeness we include a proof of the following Lemma 3 which was already noticed in [4, Lemma 1.1].

In the following, consider for a family \(\mathcal {F}=(\pi _1,\ldots ,\pi _d)\subset S_n\) the mapping

$$\begin{aligned} A: \mathcal {F}\mapsto A(\mathcal {F}) := \begin{bmatrix}\pi _1^{-1}(1),&{}\ldots ,&{}\pi _1^{-1}(n) \\ \vdots &{}\ldots ,&{} \vdots \\ \pi _d^{-1}(1),&{}\ldots ,&{}\pi _d^{-1}(n)\end{bmatrix}\in [n]^{d\times n}. \end{aligned}$$
(4)

Lemma 3

Let \(\mathcal {F}\subset S_n\) be a family of cardinality \(d\ge 1\). Consider for fixed \(i\in [d]\) the i-th element \(\pi := \pi _i\in \mathcal {F}\) and the i-th row \(r\in S_n\) of \(A(\mathcal {F})\). Let \((x_1,\ldots ,x_k)\in S_{n,k}\). We have \(\pi (x_1)<\pi (x_2)<\cdots <\pi (x_k)\) if and only if \((x_1,\ldots ,x_k)\) is a subsequence of r.

Proof

First, we show that monotonicity implies containment as a subsequence: If \(\pi (x_1)<\cdots <\pi (x_k)\), then the row \(r=(\pi ^{-1}(1),\ldots ,\pi ^{-1}(n))\) holds the symbol \(x_j\) at position \(\pi (x_j)\), and these positions are increasing in j. Hence r contains \((\pi ^{-1}(\pi (x_1)),\pi ^{-1}(\pi (x_2)),\ldots ,\pi ^{-1}(\pi (x_k)))=(x_1,\ldots ,x_k)\) as a subsequence.

For the other direction, if \(\pi \) does not satisfy \(\pi (x_1)<\pi (x_2)<\cdots <\pi (x_k)\), then there must be a permutation \(\psi \in S_k\setminus \left\{ {\text {id}}\right\} \) such that \(\pi (x_{\psi (1)})<\cdots <\pi (x_{\psi (k)})\), which implies (as before) that r contains \((x_{\psi (1)},\ldots ,x_{\psi (k)})\) as a subsequence. Since the entries of r are distinct, r contains the symbols \(x_1,\ldots ,x_k\) in exactly one order. Consequently \((x_1,\ldots ,x_k)\) is not a subsequence of r. \(\square \)
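The mapping (4) and the statement of Lemma 3 can be illustrated as follows. This is a sketch under the assumption that a permutation \(\pi \) is stored in one-line notation as a tuple `pi` with `pi[x - 1]` equal to \(\pi (x)\); all function names are illustrative.

```python
from itertools import permutations


def inverse(pi):
    """Inverse of a permutation in one-line notation (pi[x - 1] = pi(x))."""
    inv = [0] * len(pi)
    for x, image in enumerate(pi, start=1):
        inv[image - 1] = x
    return tuple(inv)


def matrix_of_family(family):
    """The mapping (4): row i lists pi_i^{-1}(1), ..., pi_i^{-1}(n)."""
    return [inverse(pi) for pi in family]


def is_monotone(pi, x):
    """True if pi(x_1) < pi(x_2) < ... < pi(x_k)."""
    return all(pi[a - 1] < pi[b - 1] for a, b in zip(x, x[1:]))


def is_subsequence(x, row):
    """True if x occurs in row as a (not necessarily contiguous) subsequence."""
    it = iter(row)
    return all(symbol in it for symbol in x)


def lemma3_holds(pi, k):
    """Compare both sides of Lemma 3 for every x in S_{n,k}."""
    n = len(pi)
    row = inverse(pi)
    return all(
        is_monotone(pi, x) == is_subsequence(x, row)
        for x in permutations(range(1, n + 1), k)
    )
```

For example, \(\pi =(2,3,1)\) yields the row (3,1,2), and `lemma3_holds` can be evaluated exhaustively over \(S_4\) for \(k\in \{2,3\}\) as a sanity check of the equivalence.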

Proposition 4

Let \(n\ge k\). \(\mathcal {F}\subset S_n\) is k-rankwise independent if and only if there exists \(\lambda \in \mathbb {N}\setminus \left\{ 0\right\} \) such that \(A(\mathcal {F})\in {\text {PSCA}}(n,k,\lambda )\). Therefore, \(g(n,k)k!\) determines the minimum cardinality of a k-rankwise independent family of permutations of \([n]\).

Proof

Define \(C_x := \left\{ \pi \in S_n: \pi (x_1)<\cdots <\pi (x_k)\right\} \), for \(x=(x_1,\ldots ,x_k)\in S_{n,k}\).

If \(\mathcal {F}\) is k-rankwise independent, then for \(x\in S_{n,k}\) we can find \(|\mathcal {F}|/k!\) permutations in \(\mathcal {F}\) which lie in \(C_x\). By Lemma 3, this means that x is covered by precisely \(|\mathcal {F}|/k!\) rows of \(A(\mathcal {F})\). As a consequence, by choosing \(\lambda :=|\mathcal {F}|/k!\) we obtain \(A(\mathcal {F})\in {\text {PSCA}}(n,k,\lambda )\).

Conversely, if there is \(\lambda \ge 1\) such that \(A(\mathcal {F})\in {\text {PSCA}}(n,k,\lambda )\), then for \(x\in S_{n,k}\) there are exactly \(\lambda \) rows in \(A(\mathcal {F})\) that cover x. By Lemma 3, \(\mathcal {F}\) must contain exactly \(\lambda \) permutations which lie in \(C_x\) (here permutations are counted with respect to multiplicity). Since \(A(\mathcal {F})\) has \(|\mathcal {F}|=\lambda k!\) rows, a permutation \(\pi \) chosen uniformly at random from \(\mathcal {F}\) satisfies \(\pi (x_1)<\cdots <\pi (x_k)\) with probability \(\lambda /(\lambda k!)=1/k!\). \(\square \)

The inverse operation of \(A(\cdot )\) in Proposition 4, i.e., the conversion of a PSCA to a rankwise independent family, is determined by interpreting the rows of the PSCA as permutations and successively storing their inverses one by one in a family of permutations.
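The conversion from a PSCA to a family can be sketched as below. The array `EXAMPLE_ROWS` is a hand-picked illustration (an assumption of this sketch, not an array quoted from the cited works); the code itself confirms that the resulting family satisfies condition (3) for \(k=3\). Function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def invert(row):
    """Interpret a row as a permutation (position -> symbol) and invert it."""
    inv = [0] * len(row)
    for position, symbol in enumerate(row, start=1):
        inv[symbol - 1] = position
    return tuple(inv)


def family_from_psca(rows):
    """Inverse of the mapping (4): store the inverse of each row, one by one."""
    return [invert(row) for row in rows]


def order_probability(family, x):
    """Pr[pi(x_1) < ... < pi(x_k)] for pi drawn uniformly from the family."""
    hits = sum(
        all(pi[a - 1] < pi[b - 1] for a, b in zip(x, x[1:])) for pi in family
    )
    return Fraction(hits, len(family))


def satisfies_condition_3(family, n, k):
    """Check condition (3) for every sequence in S_{n,k}."""
    target = Fraction(1, factorial(k))
    return all(
        order_probability(family, x) == target
        for x in permutations(range(1, n + 1), k)
    )


# Illustrative 6 x 4 array; each ordered triple occurs in exactly one row.
EXAMPLE_ROWS = [
    (1, 2, 3, 4), (2, 1, 4, 3), (3, 1, 4, 2),
    (3, 2, 4, 1), (4, 1, 3, 2), (4, 2, 3, 1),
]
EXAMPLE_FAMILY = family_from_psca(EXAMPLE_ROWS)
```

Here `satisfies_condition_3(EXAMPLE_FAMILY, 4, 3)` evaluates to `True`, in line with Proposition 4 for \(\lambda =1\).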

In [6] it is already remarked that k-rankwise independence implies \(\ell \)-rankwise independence, for all \(\ell \in [k]\). In other words, a nesting property analogous to the one for PSCAs holds (if \(A\in {\text {PSCA}}(n,k,\lambda )\), then also \(A\in {\text {PSCA}}(n,\ell ,\lambda k!/\ell !)\) [14]). We highlight that k-rankwise independent families can be considered as completely scrambling families (introduced by Spencer [12]) with an additional requirement of regularity. Furthermore, k-rankwise independence is a special case of k-restricted min-wise independence (cf. Definition 3). It is known (cf. [6, p. 139]) that these two notions coincide for \(k=3\) and that k-rankwise independence is strictly more specific when \(k>3\). In [3], k-restricted min-wise independent families were introduced to efficiently estimate the resemblance of two documents. The latter is motivated by practice, where the aim is to reduce the computational cost of searching for near-duplicate documents on the World Wide Web [3].

Definition 3

[3] A non-empty family \(\mathcal {F}\subset S_n\) is k-restricted min-wise independent if for any non-empty set \(X\subset [n]\) with \(|X|\le k\) and arbitrary \(x\in X\), a permutation \(\pi \) drawn uniformly at random from \(\mathcal {F}\) satisfies

$$ {\text {Pr}}\left[ \pi (x)=\min _{y\in X}\pi (y)\right] =\frac{1}{|X|}. $$
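Definition 3 also lends itself to a brute-force check. The sketch below assumes that a permutation is stored as a tuple `pi` with `pi[x - 1]` equal to \(\pi (x)\); the function name and the six-element example family in \(S_4\) are illustrative assumptions, and the property is verified by the code rather than taken from the literature.

```python
from fractions import Fraction
from itertools import combinations


def is_restricted_minwise_independent(family, n, k):
    """Check Definition 3 for every non-empty X with |X| <= k and every x in X."""
    d = len(family)
    if d == 0:
        return False
    for size in range(1, k + 1):
        for X in combinations(range(1, n + 1), size):
            for x in X:
                # Count the permutations that map x to the minimum over X.
                hits = sum(
                    pi[x - 1] == min(pi[y - 1] for y in X) for pi in family
                )
                if Fraction(hits, d) != Fraction(1, size):
                    return False
    return True


# Illustrative family of six permutations of [4] in one-line notation.
EXAMPLE_FAMILY = [
    (1, 2, 3, 4), (2, 1, 4, 3), (2, 4, 1, 3),
    (4, 2, 1, 3), (2, 4, 3, 1), (4, 2, 3, 1),
]
```

For this family the check succeeds for \(k=3\) and fails for \(k=4\); the latter is forced, because six permutations cannot assign probability 1/4 to each element of a 4-set.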

Remark 1

Given a matrix in \({\text {PSCA}}(n,k,\lambda )\) with \(n\ge k+1\), we can utilize it to construct a matrix in \({\text {PSCA}}(n-1,k,\lambda )\) by dropping the symbol n in each row (see [14]). Consequently, we also know how to construct a k-rankwise independent family \(\mathcal {F}'\subset S_{n-1}\) (with \(|\mathcal {F}'|=|\mathcal {F}|\)) from a k-rankwise independent family \(\mathcal {F}\subset S_n\) (this was already noticed and applied in [6] without referring to PSCAs).
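The reduction in Remark 1 amounts to a one-line operation on the rows. The following sketch uses a hand-picked 6 × 4 array as an assumed example (the code checks its covering property rather than relying on a citation); `drop_largest_symbol` and `multiplicities` are illustrative names.

```python
from collections import Counter
from itertools import combinations, permutations


def drop_largest_symbol(rows):
    """Remove the symbol n from every row of an array on n symbols."""
    n = len(rows[0])
    return [tuple(s for s in row if s != n) for row in rows]


def multiplicities(rows, k):
    """Set of coverage counts taken over all ordered k-tuples of distinct symbols."""
    counts = Counter(sub for row in rows for sub in combinations(row, k))
    symbols = sorted(rows[0])
    return {counts[s] for s in permutations(symbols, k)}


# Illustrative array on 4 symbols covering each ordered triple exactly once.
EXAMPLE_ROWS = [
    (1, 2, 3, 4), (2, 1, 4, 3), (3, 1, 4, 2),
    (3, 2, 4, 1), (4, 1, 3, 2), (4, 2, 3, 1),
]
REDUCED_ROWS = drop_largest_symbol(EXAMPLE_ROWS)
```

Both `multiplicities(EXAMPLE_ROWS, 3)` and `multiplicities(REDUCED_ROWS, 3)` equal `{1}`, and the number of rows is unchanged, matching \(|\mathcal {F}'|=|\mathcal {F}|\).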

We can now transfer bounds from rankwise independent families to PSCAs (and vice versa).

2 Lower and upper bounds

In this section asymptotic results for rankwise independent permutations will be stated jointly with their implications (cf. Proposition 4) for the number \(g(n,k)\).

The following result applies in particular when \(h=k/2\) is a prime. Comparing it with Theorem 1 then shows that it improves the lower bound asymptotically by the factor !h, the subfactorial of h (the number of derangements of h elements), which is superexponential in h but independent of n.

Theorem 5

[1] Let \(\mathcal {F}\subset S_n\) be a k-rankwise independent family. If \(k=2h\) (\(h\in \mathbb {N}\)), then we have

$$\begin{aligned} |\mathcal {F}| \ge g(n,k)k!\ge \sum _{i=0}^{h} !i\left( {\begin{array}{c}n\\ i\end{array}}\right) =\frac{!h}{h!}n^h(1+o(1)). \end{aligned}$$
(5)

Otherwise, if \(k=2h+1\) (\(h\in \mathbb {N}\)), then

$$\begin{aligned} |\mathcal {F}| \ge g(n,k)k! \ge \sum _{i=0}^{h} !i \left( {\begin{array}{c}n\\ i\end{array}}\right) + !(h+1)\left( {\begin{array}{c}n-1\\ h\end{array}}\right) =\frac{!h+!(h+1)}{h!}n^h(1+o(1)). \end{aligned}$$
(6)
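The exact sums in (5) and (6) are easy to evaluate. The sketch below is an illustration with assumed function names; `yuster_bound_rows` evaluates the right-hand side of (1) and is only meaningful for comparison when k/2 is a prime.

```python
from math import comb


def subfactorial(i):
    """Number of derangements !i, via !i = (i - 1) * (!(i - 1) + !(i - 2))."""
    previous, current = 1, 0  # !0 and !1
    if i == 0:
        return previous
    for j in range(2, i + 1):
        previous, current = current, (j - 1) * (previous + current)
    return current


def lower_bound_rows(n, k):
    """Right-hand sum of (5) for even k and of (6) for odd k."""
    h, odd = divmod(k, 2)
    total = sum(subfactorial(i) * comb(n, i) for i in range(h + 1))
    if odd:
        total += subfactorial(h + 1) * comb(n - 1, h)
    return total


def yuster_bound_rows(n, k):
    """Right-hand side of (1), stated in Theorem 1 for k/2 prime."""
    h = k // 2
    return comb(n, h) - comb(n, h - 1)
```

For \(n=10\) and \(k=6\), the sum in (5) evaluates to 286 rows, compared with 75 rows from (1). For \(k=3\), the sum in (6) equals \(1+(n-1)=n\), which agrees with the lower bound \(n/6\le g(n,3)\) of Theorem 2.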

The next Theorem 6 follows from the proof of [6, Theorem 3.2]. It positively answers the question of Yuster on the existence of non-trivial upper bounds for \(g(n,k)\) for fixed k. In fact, the estimate is polynomial in n.

Theorem 6

(variant of [6, Theorem 3.2]) Let \(p\ge n\ge k\) such that p is a prime and \(n\ge (k-1)!\). Furthermore, let \(p_1<\cdots <p_m\) be the sequence of all primes not exceeding \(k-1\), i.e., \(p_m\le k-1\). Let \((e_1,\ldots ,e_m)\in (\mathbb {N}\setminus \left\{ 0\right\} )^m\) be a minimizer of \(Q=\prod _{i=1}^m p_i^{e_i}\) under the side constraints that \((k-1)!\) divides Q, and \(p_i^{e_i}>p\), for \(i=1,\ldots ,m\). Then, there exists a k-rankwise independent family \(\mathcal {P}\) of permutations of \([p]\) such that \(|\mathcal {P}|\le (p^k-p)Q^{\lfloor k/2 \rfloor }\). Consequently, cf. Remark 1, there exists a family \(\mathcal {P}'\subset S_n\) satisfying

$$\begin{aligned} g(n,k)k!\le |\mathcal {P}'| = |\mathcal {P}| = n^{O(k^2/\ln k)}. \end{aligned}$$
(7)
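The quantity Q and the resulting bound on \(|\mathcal {P}|\) can be computed explicitly. The sketch below rests on one reading of the theorem statement: the constraints decouple across the primes \(p_i\), so each exponent \(e_i\) can be minimized separately. Function names are illustrative, and the hypotheses of Theorem 6 (p prime, \(p\ge n\ge (k-1)!\)) are not checked by the code.

```python
from math import factorial


def primes_up_to(m):
    """All primes not exceeding m, by trial division (adequate for small m)."""
    return [
        q for q in range(2, m + 1)
        if all(q % r for r in range(2, int(q ** 0.5) + 1))
    ]


def valuation(m, q):
    """Largest v such that q**v divides m."""
    v = 0
    while m % q == 0:
        m //= q
        v += 1
    return v


def minimal_Q(k, p):
    """Smallest Q = prod p_i^{e_i}, e_i >= 1, with (k-1)! | Q and p_i^{e_i} > p."""
    fact = factorial(k - 1)
    Q = 1
    for q in primes_up_to(k - 1):
        e = max(1, valuation(fact, q))  # ensures divisibility by (k-1)!
        while q ** e <= p:              # ensures p_i^{e_i} > p
            e += 1
        Q *= q ** e
    return Q


def family_size_bound(k, p):
    """The bound (p^k - p) * Q^{floor(k/2)} from Theorem 6."""
    return (p ** k - p) * minimal_Q(k, p) ** (k // 2)
```

For \(k=4\) and \(p=7\), the primes not exceeding 3 are 2 and 3, the minimal prime powers exceeding 7 are 8 and 9, hence \(Q=72\) and the bound equals \((7^4-7)\cdot 72^2\).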

Remark 2

For the case that \((k-1)!>n\) (not covered by Theorem 6), it is established in [6] that a k-rankwise independent family with cardinality of order \(e^{O(k^3)}\) exists.

Remark 3

The proof of Theorem 6 presented in [6] is constructive: For a prime p and the finite field \(\mathbb {F}_p\) of p elements (\(\mathbb {F}_s\) is to be understood accordingly, when s is a prime power), the set of univariate polynomials

$$ \mathbb {F}_p(\xi ,[a,b]) := \left\{ r\in \mathbb {F}_p[\xi ]: a\le \deg r \le b\right\} $$

is used as base for the construction. Under the assumptions of Theorem 6, setting

$$\begin{aligned} E := \mathbb {F}_{{p_1}^{e_1}}(\xi , [0,\lfloor k/2 \rfloor -1])\times \cdots \times \mathbb {F}_{{p_m}^{e_m}}(\xi , [0,\lfloor k/2 \rfloor -1]), \end{aligned}$$

a tuple \((b_y)_{y\in \mathbb {F}_p}\in \left\{ 0,\ldots ,Q-1\right\} ^p\) is generated by a manipulation which depends on a fixed parameter \(h=(h_1,\ldots ,h_m)\in E\). The tuple \((b_y)_{y}\) is then used to carry out the so-called “\(\lfloor k/2 \rfloor \)-tie breaking scheme” [6], which serves to identify each of the \(p^k-p\) elements of \(\mathbb {F}_p(\xi ,[1,k-1])\) with a permutation of \(\{0,\ldots ,p-1\}\). Deriving such a collection of permutations from each tuple in E (which has cardinality \(|E| = (p_1^{e_1})^{\lfloor k/2 \rfloor } \cdots (p_m^{e_m})^{\lfloor k/2 \rfloor }=Q^{\lfloor k/2 \rfloor }\)) yields a total of \((p^k-p)Q^{\lfloor k/2\rfloor }\) permutations. Collectively, these permutations fulfill k-rankwise independence. For a more detailed discussion of the proof of Theorem 6 (and of the aforementioned scheme) we point to [6].

We state more specific results for \(k\in \left\{ 3,4\right\} \). Originally, the subsequent result concerning 3-rankwise independence was established for the equivalent property of 3-restricted min-wise independence.

Theorem 7

([13]) Let \(n\ge 4\). Then, there exist a 3-rankwise independent family \(\mathcal {E}\subset S_n\) and a 4-rankwise independent family \(\mathcal {F}\subset S_n\), with cardinalities

$$\begin{aligned} g(n,3)\cdot 3!&\le |\mathcal {E}|&\le 12\sqrt{e}(1+o(1))n\log _2\left( n\right) ^2, \end{aligned}$$
(8)
$$\begin{aligned} g(n,4)\cdot 4!&\le |\mathcal {F}|&\le 15e(1 + o(1))n^3\log _2\left( n\right) ^6. \end{aligned}$$
(9)

Remark 4

The latter result is obtained by a recursive construction using methods from affine/projective finite geometry. The bound (8) improves on the bound in Theorem 2 by a factor of approximately \(\log _2\left( n\right) ^{0.81}\). The bound (9) is obtained by estimating the members of a recursive sequence which describes cardinalities of 4-rankwise independent families of permutations of \([n]\), for \(n=2^{2q}+1\). The involved recursion is homogeneous in the cardinality of the family employed as base case. For the construction it is required to start with a 4-rankwise independent family \(\mathcal {F}\subset S_5\). In [13], the authors choose \(\mathcal {F}=S_5\), such that \(|\mathcal {F}|=120\).

We can improve this base case by making use of a construction due to Levenshtein [9], which yields a \({\text {PSCA}}(n+1,n,1)\). It provides us with a \({\text {PSCA}}(5,4,1)\), which we can map via the isomorphism in Proposition 4 to a 4-rankwise independent family \(\tilde{\mathcal {F}}\) satisfying \(|\tilde{\mathcal {F}}|=24\). Hence, we obtain an improvement of the bound (9), which reads

$$\begin{aligned} g(n,4)\cdot 4!\le |\mathcal {F}| \le 3e(1 + o(1))n^3\log _2\left( n\right) ^6. \end{aligned}$$
(10)
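As a brief check of the constant, and assuming (as stated in Remark 4) that the recursion is homogeneous in the cardinality of the base case, the leading constant of (9) scales with the ratio of the two base cardinalities:

$$\begin{aligned} 15e\cdot \frac{|\tilde{\mathcal {F}}|}{|S_5|} = 15e\cdot \frac{24}{120} = 3e. \end{aligned}$$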

The optimized recursion leading to (10) permits us to improve on some recently obtained results in [5] (by taking the exact values of the recursion). We state the bounds found in this way in Table 1.

Remark 5

Recently in [7], analogously to Remark 4, a similar recursion designed for the case \(k=3\) [13, Section 3.4] has been started from a 3-rankwise independent family of cardinality 6 as base case, instead of the entire symmetric group \(S_4\) (such a family exists due to \({\text {PSCA}}(4,3,1)\ne \emptyset \), cf. [9]). The bound

$$\begin{aligned} g(n,3)\cdot 3! \le 3\sqrt{e}(1+o(1))n\log _2\left( n\right) ^2, \end{aligned}$$
(11)

slightly improving upon (8), is obtained. By evaluation of the respective recursion, bounds for \(g(15,3)\) and \(g(16,3)\) have been improved (see Table 2).

Table 1 Upper bounds for g(n, 4)
Table 2 Upper bounds for g(n, 3)

3 Conclusion and open questions

In our discussion, we have found that the theory of PSCAs and the theory of min-wise independent permutations can benefit considerably from each other. Despite the isomorphy of many concepts in these theories, strong interconnections seemed not to be exploited in the past (perhaps because one theory consistently uses probabilistic language). By combining the latter theories, we have improved several bounds and established polynomial boundedness for \(g(n,k)\), which Yuster asked for in [14]. Furthermore, we achieved progress in another question appearing in [14] which asks to find the right order of magnitude of \(g(n,3)\): The quasi-linear upper bound (established in [14]) is tightened by a factor of \(\log _2\left( n\right) ^{0.81}\) (cf. (8)). It remains open by how much lower and upper bounds for \(g(n,3)\) can still be improved.

The great difficulty of determining the existence of PSCAs (on a certain number of rows), already for \(k\in \left\{ 3,4\right\} \) and small n, manifests itself in some works which come up with computationally intensive search procedures [5, 11]. It would be highly interesting to find out whether the regularity that PSCAs have compared to SCAs facilitates the determination of the complexity class of the problem of calculating \(g(n,k)k!\), the minimum number of rows able to host a PSCA. For SCAs the respective problem concerning the minimal row count is still open (even if in [2] NP-completeness has been established for an altered problem).

Aiming to calculate further exact values of \(g(n,k)\), we experimentally searched, for n being a divisor of \(\lambda k!\), for small PSCAs by restricting the search space to a class of matrices A (having permutations as rows) that satisfy the following “reflection symmetry” property: For any column indices \(j_1<j_2\), the \((j_1,j_2)\)-submatrix of A satisfies that if h rows coincide with \((a,b)\in [n]^2\), then also h rows coincide with \((b,a)\). The motivation behind this additional assumption is that recently PSCAs being unions of cosets of \(S_n\) have successfully been found (cf. [11]) for several parameter constellations \((n,k,\lambda )\). We suspect that our proposed kind of symmetry induces a suitable balancing of symbols that is advantageous for finding PSCAs, too.
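The reflection symmetry property can be tested column pair by column pair. The following is a minimal sketch with an assumed function name; it does not reproduce the recursive search itself, only the predicate that restricts the search space.

```python
from collections import Counter
from itertools import combinations, permutations


def is_reflection_symmetric(rows):
    """For all columns j1 < j2, (a, b) and (b, a) must occur equally often."""
    n_cols = len(rows[0])
    for j1, j2 in combinations(range(n_cols), 2):
        pairs = Counter((row[j1], row[j2]) for row in rows)
        # Looking up a missing key in a Counter yields 0 and adds no entry.
        if any(pairs[(b, a)] != count for (a, b), count in pairs.items()):
            return False
    return True
```

As a simple example, the array whose rows are all six permutations of [3] is reflection symmetric, since every ordered pair of distinct symbols occurs exactly once in each pair of columns, whereas the single row (1,2,3) is not.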

It appears that within the class of matrices having uniform columnwise distribution of symbols (cf. [5]), this latter constraint still allows one to find representatives of PSCAs. Indeed, by a recursive search for strength \(k=3\), appending column to column, we find that such special representatives exist for the parameter constellations \((n,k,\lambda )\in \left\{ (3,3,1),(6,3,2)\right\} \). Existence for the constellation (9, 3, 3) could no longer be determined due to computational limitations, and we would find it interesting to investigate this case with more dedicated computational effort. The representative of \({\text {PSCA}}(6,4,1)\) presented in [10, Proposition 2.22] is reflection symmetric, too. For strength \(k=4\), the constellation \((n,k,\lambda )=(8,4,3)\) is next, for which it is still open whether it possesses such a special representative (since it was shown in [5] that \({\text {PSCA}}(8,4,2)=\emptyset \)).

Combinatorial aspects (e.g. enumeration for small n and efficient constructions) of the class of matrices satisfying the latter reflection symmetry might be of independent interest.