1 Introduction

1.1 Universal Cycles

Universal cycles were introduced by Chung, Diaconis, and Graham [5]. They are closely related to both de Bruijn cycles and Gray codes. A universal cycle is a cyclic sequence which contains, as a subsequence of consecutive terms, a representation of every element of some collection of “combinatorial objects” exactly once. For example, a de Bruijn cycle is a binary cyclic sequence in which (for a fixed k) every binary sequence of length k appears as a subsequence of k consecutive terms exactly once. The problems of existence and construction of universal cycles for many combinatorial objects such as strings, subsets, multisets, permutations, partitions, lattice paths, vector spaces, weak orders, etc. have been considered by many authors (see [35, 15, 16, 18, 19, 21, 23, 25]). Some variants of these problems, where the subsequences representing combinatorial objects are not necessarily contiguous or do not overlap completely, were investigated too (see [1, 6, 8, 9, 13, 14, 22]).

In this paper we deal with universal cycles for k-subsets of an n-set, that is, sequences in which every k-subset of an n-set appears exactly once as a subsequence of k consecutive terms. More precisely, for positive integers k and n with \(k\le n\), a sequence \((a_0,a_1,\ldots ,a_{m-1})\) of elements of the set \([n]=\{ 1,2,\ldots ,n\}\) is a universal cycle if every k-subset of [n] appears in this sequence exactly once as a subsequence \((a_{i+1},\ldots ,a_{i+k})\) of k consecutive terms, where subscripts are taken modulo m. For example, when \(n=5\) and \(k=2\), the cyclic sequence (1, 3, 2, 5, 4, 2, 1, 5, 3, 4) is a universal cycle because every 2-subset of the set [5] appears in it exactly once as a subsequence of 2 consecutive terms.
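The definition can be checked mechanically. The following Python sketch (the function name `is_universal_cycle` is ours and purely illustrative) slides a window of length k around the cyclic sequence and tests whether the windows are exactly the k-subsets of [n], each seen once.

```python
from itertools import combinations


def is_universal_cycle(seq, n, k):
    """Return True if the cyclic sequence seq is a universal cycle for k-subsets of [n]."""
    m = len(seq)
    windows = []
    for i in range(m):
        # Subscripts are taken modulo m, as in the definition.
        window = frozenset(seq[(i + j) % m] for j in range(k))
        if len(window) != k:  # a repeated element: the window is not a k-subset
            return False
        windows.append(window)
    all_subsets = {frozenset(c) for c in combinations(range(1, n + 1), k)}
    # Every k-subset must occur, and no window may occur twice.
    return len(windows) == len(all_subsets) and set(windows) == all_subsets


# The example sequence for n = 5 and k = 2 given in the text.
print(is_universal_cycle((1, 3, 2, 5, 4, 2, 1, 5, 3, 4), 5, 2))  # True
```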

The research on universal cycles for k-subsets of an n-set has been concentrated around the following conjecture of Chung, Diaconis, and Graham [5].

Conjecture 1

Let \(k\ge 1\). There is an integer \(n_0(k)\) such that for \(n\ge n_0(k)\), there exists a universal cycle for k-subsets of [n] if and only if \({n-1\atopwithdelims ()k-1}\equiv 0\ (\mathrm{mod}\ k)\).

The condition \({n-1\atopwithdelims ()k-1}\equiv 0\ (\mathrm{mod}\ k)\) is an obvious necessary condition for the existence of a universal cycle (see [5]). Its sufficiency for large n is far from being settled.
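One way to see the necessity: a fixed element of [n] lies in \({n-1\atopwithdelims ()k-1}\) k-subsets, and each of its occurrences in a universal cycle lies in exactly k windows, so k must divide that binomial coefficient. The sketch below (the helper name `satisfies_divisibility` is our own) lists the values of n that pass this test for \(k=4\).

```python
from math import comb


def satisfies_divisibility(n, k):
    """Necessary condition for a universal cycle: k divides C(n-1, k-1)."""
    return comb(n - 1, k - 1) % k == 0


# Values of n in 4..20 that pass the test for k = 4.
admissible = [n for n in range(4, 21) if satisfies_divisibility(n, 4)]
print(admissible)  # [5, 7, 9, 10, 11, 13, 15, 17, 18, 19]
```

For \(k=4\) the test selects the odd values of n together with \(n\equiv 2\ (\mathrm{mod}\ 8)\) in this range.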

It is easy to see that Conjecture 1 is true for \(k=1,2\). Jackson [19] proved it for \(k=3\) and constructed universal cycles for 4-subsets of an n-set for all odd \(n\ge 9\). Hurlbert [16] showed that for \(k=3,4,6\) and n sufficiently large there exist universal cycles for k-subsets of an n-set whenever n and k are relatively prime. It is reported in [21] that Jackson [20] settled Conjecture 1 for \(k=4,5\) by constructing, with some aid of a computer, universal cycles for 4-subsets when \(n\equiv 2\ (\mathrm{mod}\ 8)\) and \(n\ge 10\) and for 5-subsets when n is not divisible by 5 and \(n\ge 8\); the former of these results has also been proved by Rudoy [27] via an inductive construction. For \(k\ge 7\), and for \(k=6\) when n and k are not relatively prime, Conjecture 1 remains open.

1.2 Near-Universal Cycles

In view of the apparent difficulty of Conjecture 1, it is natural to consider a relaxed problem, and there are two possible relaxations: to look for shortest possible cyclic sequences in which every k-subset of an n-set appears at least once as a subsequence of consecutive terms (the covering variant), and to look for longest possible cyclic sequences in which every k-subset of an n-set appears at most once as a subsequence of consecutive terms (the packing variant). Formally, by an (n, k)-Ucycle packing (resp. (n, k)-Ucycle covering) we mean a sequence \((a_0,a_1,\ldots , a_{m-1})\) of elements of [n] such that every k-subset of [n] appears in this sequence at most (resp. at least) once as a subsequence \((a_{i+1},\ldots ,a_{i+k})\) of k consecutive terms, where the index addition is performed modulo m. We denote by \(p_{n,k}\) (resp. \(c_{n,k}\)) the length of a longest (n, k)-Ucycle packing (resp. a shortest (n, k)-Ucycle covering). Obviously, \(p_{n,k}\le {n\atopwithdelims ()k}\le c_{n,k}\).
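Both relaxations can be tested with one window count. In the sketch below the names `window_counts`, `is_packing` and `is_covering` are ours, and we assume (our reading of the definition) that a window with a repeated element represents no k-subset and is simply ignored.

```python
from collections import Counter
from itertools import combinations


def window_counts(seq, k):
    """Count how often each k-subset occurs as k consecutive terms of the cyclic sequence."""
    m = len(seq)
    counts = Counter()
    for i in range(m):
        window = frozenset(seq[(i + j) % m] for j in range(k))
        if len(window) == k:  # windows with repeated elements represent no k-subset
            counts[window] += 1
    return counts


def is_packing(seq, k):
    """Every k-subset appears at most once."""
    return all(c <= 1 for c in window_counts(seq, k).values())


def is_covering(seq, n, k):
    """Every k-subset of [n] appears at least once."""
    counts = window_counts(seq, k)
    return all(counts[frozenset(s)] >= 1 for s in combinations(range(1, n + 1), k))


print(is_packing((1, 2, 3, 4), 2), is_covering((1, 2, 3, 4), 4, 2))
print(is_packing((1, 2, 3, 4, 1, 3, 2, 4), 2), is_covering((1, 2, 3, 4, 1, 3, 2, 4), 4, 2))
```

The first sequence is a packing but not a covering for \(n=4\), \(k=2\); the second is a covering but not a packing.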

The problem of constructing a longest (n, k)-Ucycle packing has been considered by Curtis et al. [7] and Stevens et al. [28]. In [7] the authors, for a fixed k, prove the asymptotic formula \(p_{n,k}={n\atopwithdelims ()k}(1-o(1))\). It is shown in [28] that \(p_{n,n-2}=n\); an immediate consequence of this result is that for \(k=n-2\) and \(n\ge 4\) no universal cycle exists.

Lonc et al. [26] considered, in connection with database file organization, the problem of constructing a shortest (non-cyclic) sequence in which every k-subset of an n-set appears as a subsequence of k consecutive terms. Translating one of the results in [26] to the language of the present paper, they proved that, for a fixed k, \(c_{n,k}={n\atopwithdelims ()k}+O(n^{\lfloor k/2\rfloor })\). A simple non-constructive proof of a weaker asymptotic result \(c_{n,k}={n\atopwithdelims ()k}(1+o(1))\) (where k is fixed) was given by Blackburn [2], who considered this problem in the context of so-called k-radius sequences.

A slightly different but related covering problem has been studied by Hurlbert [17] and Stevens et al. [28]. They consider the problem of existence of so-called t-cover Ucycles, i.e. cyclic sequences of \(t{n\atopwithdelims ()k}\) integers in which every k-subset of [n] appears exactly t times as a subsequence of consecutive terms. Blanca and Godbole [3] deal with another variant of the problem. They represent a k-subset of [n] as a binary n-term sequence with exactly k ones and define a universal covering to be a binary cyclic sequence in which such a representation of every k-set of [n] appears at least once as a subsequence of n consecutive terms. The authors in [3] construct a binary sequence that contains exactly one representation of each k-set and each \((k-1)\)-set; it follows that if \(k=o(n)\), then there exists a universal covering, in the sense defined in this paragraph, of length \({n\atopwithdelims ()k}(1+o(1))\).

1.3 Our Contributions

In this paper we prove new asymptotic formulas for both the packing number \(p_{n,k}\) and the covering number \(c_{n,k}\). Our main achievement is showing that \(p_{n,k}\) and \(c_{n,k}\) are asymptotically equal to \({n \atopwithdelims ()k}\) for a wider range of values of k. Previous results apply only in the case when k is a constant and n goes to infinity, whereas we allow k to grow with n. We prove (in Sect. 3) a theorem that gives the following corollary.

Corollary 1

Let \(\alpha , 0< \alpha \le 1/3\), be a fixed real number. If \(k=k(n)\le n^\alpha \), then

$$\begin{aligned} p_{n,k}={n\atopwithdelims ()k}-o\left( {n\atopwithdelims ()k}^\beta \right) \quad \mathrm{and}\quad c_{n,k}={n\atopwithdelims ()k}+o\left( {n\atopwithdelims ()k}^\beta \right) , \end{aligned}$$
(1)

where \(\frac{1}{2}<\beta =\frac{1+\alpha }{2(1-\alpha )}\le 1\).

For the packing version we give an asymptotic result for k growing even faster with n, but with a larger error term (the proof is given in Sect. 4).

Theorem 1

If \(k=o(n)\), then

$$\begin{aligned} p_{n,k}={n\atopwithdelims ()k}(1-o(1)). \end{aligned}$$

We also improve the error term in the packing version from \(O(n^{k-1})\) to \(O(n^{\lfloor k/2\rfloor })\) in the case when k is a constant (the proof is given in Sect. 3).

Theorem 2

For every fixed positive integer k,

$$\begin{aligned} p_{n,k}={n\atopwithdelims ()k}-O(n^{\lfloor k/2\rfloor })\quad \mathrm{and}\quad c_{n,k}={n\atopwithdelims ()k}+O(n^{\lfloor k/2\rfloor }). \end{aligned}$$

Note that the important part of Theorem 2 is the formula for \(p_{n,k}\), as the formula for \(c_{n,k}\) was known before. Nevertheless, it is included in the theorem because it is obtained as a simple byproduct of the proof.

The major (and also novel) part of the proofs is estimating the number of awesome compositions of numbers (the term is explained in Sect. 2). We do not use Schur’s theorem, on which the proof in [7] relies (this is the reason why their argument works only for constant k); instead, we apply direct arguments (in Sects. 2 and 3) and the probabilistic method (in Sect. 4).

2 Good and Awesome Subsets and Compositions

For a k-subset \(X=\{x_1,x_2, \ldots ,x_{k}\}\) of \([n]=\{ 1,2,\ldots ,n\}\), where \(x_1<x_2<\ldots <x_{k}\), we define the sequence of differences \(\mathbf{d}(X)=(d_1,d_2,\ldots , d_{k})\), where \(d_j=x_{j}-x_{j-1}\), for \(j=2,3,\ldots ,k\), and \(d_{1}=n+x_1-x_{k}\). Let us divide a circle of length n with n points into n arcs of length 1 and label these points with \(1,2,\ldots ,n\). If we represent the elements of the k-set X as points \(x_1,x_2,\ldots ,x_{k}\) on this circle, then the numbers \(d_i\) are just the lengths of arcs of the circle bounded by the consecutive points representing the elements of X. Following the terminology used in Curtis et al. [7], we say that a k-subset X of an n-set [n] is good if there is a term (called a unique term) in \(\mathbf{d}(X)\) which is different from all the remaining terms. For example, the 5-subset \(\{ 2,4,6,8,9\}\) of the set [12] is good because 1 and 5 are unique terms in its sequence of differences (5, 2, 2, 2, 1), and the 5-subset \(\{ 3,5,8,10,12\}\) of the set [12] is not good because its sequence of differences (3, 2, 3, 2, 2) has no unique terms. We say that a good k-subset X of [n] is awesome if \(\mathbf{d}(X)\) has a unique term greater than 1.
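The definitions of the sequence of differences and of good and awesome subsets translate directly into code. The following sketch (all function names are ours) reproduces the two examples for the set [12] and adds one subset that is good but not awesome.

```python
from collections import Counter


def differences(X, n):
    """Sequence of differences d(X) = (d_1, ..., d_k) of a k-subset X of [n]."""
    xs = sorted(X)
    d = [n + xs[0] - xs[-1]]                               # d_1 = n + x_1 - x_k
    d += [xs[j] - xs[j - 1] for j in range(1, len(xs))]    # d_j = x_j - x_{j-1}
    return tuple(d)


def unique_terms(d):
    """Terms of d that differ from all the remaining terms."""
    return sorted(t for t, c in Counter(d).items() if c == 1)


def is_good(X, n):
    return len(unique_terms(differences(X, n))) > 0


def is_awesome(X, n):
    return any(t > 1 for t in unique_terms(differences(X, n)))


print(differences({2, 4, 6, 8, 9}, 12))    # (5, 2, 2, 2, 1)
print(differences({3, 5, 8, 10, 12}, 12))  # (3, 2, 3, 2, 2)
print(is_good({1, 2, 4, 6}, 7), is_awesome({1, 2, 4, 6}, 7))  # True False
```

The last subset has differences (2, 1, 2, 2), whose only unique term is 1.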

In [7] the authors construct an (n, k)-Ucycle packing \((a_0,a_1,\ldots ,a_{m-1})\) in which the k-subsets of [n] that appear as subsequences of k consecutive terms are precisely the awesome k-subsets. Thus, the number m of awesome k-subsets of [n] is a lower bound for the packing number \(p_{n,k}\). Let \(s_{n,k}={n\atopwithdelims ()k}-m\) be the number of k-subsets of [n] which are not awesome. Obviously,

$$\begin{aligned} p_{n,k}\ge {n\atopwithdelims ()k}-s_{n,k}. \end{aligned}$$
(2)

Moreover, we observe that in the linear (non-cyclic) sequence \((a_0,a_1,\ldots ,a_{m-1},a_0,a_1,\ldots ,a_{k-2})\) every awesome k-subset of [n] appears as a subsequence of k consecutive terms. We extend this sequence by concatenating \(s_{n,k}\) sequences of k terms each, which are (arbitrary) permutations of the non-awesome k-subsets of [n]. In the resulting sequence of length \(m+k-1+ks_{n,k}\le {n\atopwithdelims ()k}+(k-1)(s_{n,k}+1)\) every k-subset of [n] appears at least once as a subsequence of k consecutive terms. This brute force construction yields the inequality

$$\begin{aligned} c_{n,k}\le {n\atopwithdelims ()k}+(k-1)(s_{n,k}+1). \end{aligned}$$
(3)
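The length bound used for (3) follows by substituting \(m={n\atopwithdelims ()k}-s_{n,k}\); the short computation below spells this out and shows that the bound in fact holds with equality.

$$\begin{aligned} m+k-1+ks_{n,k}&={n\atopwithdelims ()k}-s_{n,k}+(k-1)+ks_{n,k}\\&={n\atopwithdelims ()k}+(k-1)s_{n,k}+(k-1)={n\atopwithdelims ()k}+(k-1)(s_{n,k}+1). \end{aligned}$$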

Let \((d_1,d_2,\ldots ,d_{k})\) be a composition of a positive integer n into k parts, i.e. a sequence of positive integers such that \(d_1+d_2+\ldots + d_{k}=n\). We say that a composition \((d_1,d_2,\ldots ,d_{k})\) is good if it has a part (called a unique part) which is different from all the remaining parts. More precisely, there exists \(i\in \{1,2,\ldots ,k\}\) such that \(d_i\not =d_j\) for all \(j\not =i\). If a good composition has a unique part which is larger than 1, then we call it awesome. We denote by \(g_{n,k}\) the number of compositions of n into k parts which are not good.

The next lemma explains the relationship between the numbers of non-awesome k-subsets of [n] and non-awesome compositions of n into k parts. Let \(a_{n,k}\) be the number of compositions of n into k parts which are not awesome.

Lemma 1

For every \(n\ge k\ge 1\),

$$\begin{aligned} s_{n,k} = \frac{n}{k}a_{n,k}. \end{aligned}$$

Proof

We will double-count the number of pairs \((x,S)\), where S is a non-awesome k-subset of [n] and \(x\in S\). Clearly, this number is equal to \(ks_{n,k}\).

On the other hand, fix \(x\in [n]\) and consider a function \(\varphi \) that assigns to every non-awesome composition \((p_1,p_2,\ldots ,p_{k})\) of n into k parts the k-subset \(\{x,x+p_1,x+p_1+p_2,\ldots ,x+p_1+p_2+\cdots +p_{k-1}\}\) of [n], where addition is performed modulo n. One can easily verify that \(\varphi \) is a bijection from the set of all non-awesome compositions of n into k parts onto the set of all non-awesome k-subsets of [n] containing x. Since there are n choices of x, it follows that the number of pairs equals \(na_{n,k}\), so the lemma holds. \(\square \)
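Lemma 1 can be confirmed by exhaustive enumeration for small parameters. The sketch below (helper names are ours) counts the non-awesome compositions \(a_{n,k}\) and the non-awesome subsets \(s_{n,k}\) independently and checks the identity \(ks_{n,k}=na_{n,k}\).

```python
from collections import Counter
from itertools import combinations


def is_awesome_sequence(parts):
    """True if some part occurs exactly once and is greater than 1."""
    return any(c == 1 and p > 1 for p, c in Counter(parts).items())


def compositions(n, k):
    """Yield all compositions of n into k positive parts."""
    # Choose k-1 cut points among the n-1 gaps between unit segments.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        yield tuple(bounds[i + 1] - bounds[i] for i in range(k))


def differences(X, n):
    xs = sorted(X)
    return (n + xs[0] - xs[-1],) + tuple(xs[j] - xs[j - 1] for j in range(1, len(xs)))


def count_non_awesome(n, k):
    """Return (a_{n,k}, s_{n,k}) by brute force."""
    a = sum(1 for c in compositions(n, k) if not is_awesome_sequence(c))
    s = sum(1 for X in combinations(range(1, n + 1), k)
            if not is_awesome_sequence(differences(X, n)))
    return a, s


for n in range(1, 11):
    for k in range(1, n + 1):
        a, s = count_non_awesome(n, k)
        assert k * s == n * a  # Lemma 1

print(count_non_awesome(4, 2))  # (1, 2)
```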

Lemma 2

For every \(n\ge k\ge 2\),

$$\begin{aligned} s_{n,k} \le \frac{n}{k}(g_{n,k}+kg_{n-1,k-1}). \end{aligned}$$

Proof

The set of non-awesome compositions of n into k parts is a union of two disjoint subsets: the set of non-good compositions and the set of good compositions which are non-awesome. The cardinality of the former set is \(g_{n,k}\). The cardinality of the latter set is not larger than \(kg_{n-1,k-1}\) because every good non-awesome composition of n into k parts can be obtained in a unique way from a non-good composition of \(n-1\) into \(k-1\) parts which has no part equal to 1 by inserting a unit part in one of k possible positions. Thus, \(a_{n,k}\le g_{n,k}+kg_{n-1,k-1}\) and we are done by Lemma 1. \(\square \)
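The counting step in this proof can be replayed on small cases. The sketch below (names are ours) enumerates the good non-awesome compositions, removes their single unit part, checks that what remains is a non-good composition without a part equal to 1, and returns the quantities appearing in the bound \(a_{n,k}\le g_{n,k}+kg_{n-1,k-1}\).

```python
from collections import Counter
from itertools import combinations


def compositions(n, k):
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        yield tuple(bounds[i + 1] - bounds[i] for i in range(k))


def unique_parts(parts):
    return [p for p, c in Counter(parts).items() if c == 1]


def is_good(parts):
    return len(unique_parts(parts)) > 0


def is_awesome(parts):
    return any(p > 1 for p in unique_parts(parts))


def check_lemma2_step(n, k):
    """Return (a_{n,k}, g_{n,k}, g_{n-1,k-1}) for n >= k >= 2, verifying the removal step."""
    g_nk = sum(1 for c in compositions(n, k) if not is_good(c))
    g_prev = sum(1 for c in compositions(n - 1, k - 1) if not is_good(c))
    good_not_awesome = [c for c in compositions(n, k) if is_good(c) and not is_awesome(c)]
    for c in good_not_awesome:
        i = c.index(1)            # the only unique part is a single part equal to 1
        rest = c[:i] + c[i + 1:]
        assert not is_good(rest) and 1 not in rest
    return g_nk + len(good_not_awesome), g_nk, g_prev


for n in range(2, 12):
    for k in range(2, n + 1):
        a_nk, g_nk, g_prev = check_lemma2_step(n, k)
        assert a_nk <= g_nk + k * g_prev

print(check_lemma2_step(5, 3))  # (3, 0, 1)
```

For \(n=5\), \(k=3\) the bound is attained: the three arrangements of (1, 2, 2) all come from the single non-good composition (2, 2).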

Lemma 3

For \(n\ge k\ge 4\),

$$\begin{aligned} g_{n,k}\le (k-1)\sum _{d=1}^{\left\lfloor \frac{n-k+2}{2}\right\rfloor }g_{n-2d,k-2} + {k-1\atopwithdelims ()2}\sum _{d=1}^{\left\lfloor \frac{n-k+3}{3}\right\rfloor }g_{n-3d,k-3}. \end{aligned}$$

Proof

We denote by \(G_{n,k}\) the set of all non-good compositions of n into k parts. Let \(A_{n,k}\) (respectively \(B_{n,k}\)) be the set of non-good compositions \((d_1,d_2,\ldots ,d_{k})\) of n such that \(|\{ j:\ d_j=d_1\}|\not =3\) (resp. \(|\{ j:\ d_j=d_1\}|=3\)). Moreover, we denote by \(A_{n,k}(d),d=1,2,\ldots ,\left\lfloor \frac{n-k+2}{2}\right\rfloor \), (resp. \(B_{n,k}(d),d=1,2,\ldots ,\left\lfloor \frac{n-k+3}{3}\right\rfloor \)) the set of non-good compositions in \(A_{n,k}\) (resp. \(B_{n,k}\)) such that \(d_1=d\).

Clearly, the sets \(A_{n,k}(1),\ldots ,A_{n,k}(\left\lfloor \frac{n-k+2}{2}\right\rfloor ), B_{n,k}(1),\ldots ,B_{n,k}(\left\lfloor \frac{n-k+3}{3}\right\rfloor )\) are pairwise disjoint and

$$\begin{aligned} G_{n,k}=A_{n,k}\cup B_{n,k}=\bigcup _{d=1}^{\left\lfloor \frac{n-k+2}{2}\right\rfloor }A_{n,k}(d)\cup \bigcup _{d=1}^{\left\lfloor \frac{n-k+3}{3}\right\rfloor } B_{n,k}(d). \end{aligned}$$
(4)

To estimate \(|A_{n,k}(d)|\), we define \(C_i\), \(i=2,\ldots ,k\), to be the set of members \((d_1,d_2,\ldots ,d_{k})\) of \(A_{n,k}(d)\) such that \(d_i=d_1=d\). Clearly, \(A_{n,k}(d)=C_2\cup \ldots \cup C_{k}\) because the compositions in \(A_{n,k}\) are not good. Moreover, if we remove from \((d_1,d_2,\ldots ,d_{k})\in C_i\) the terms \(d_1\) and \(d_i\), then we get a non-good composition of \(n-2d\) into \(k-2\) parts: by the definition of \(A_{n,k}\), d occurs in \((d_1,d_2,\ldots ,d_{k})\) either exactly 2 or at least 4 times, so it occurs in the shortened composition either 0 or at least 2 times. Conversely, inserting two parts equal to d at positions 1 and i into a non-good composition of \(n-2d\) into \(k-2\) parts gives a member of \(C_i\). Hence, \(|C_i|=g_{n-2d,k-2}\) and, consequently,

$$\begin{aligned} |A_{n,k}(d)|\le |C_2|+\cdots +|C_{k}|=(k-1)g_{n-2d,k-2}. \end{aligned}$$
(5)

We estimate \(|B_{n,k}(d)|\) similarly. Let \(D_{i,j},2\le i<j\le k\), be the set of members \((d_1,d_2,\ldots ,d_{k})\) of \(B_{n,k}(d)\) such that \(d_i=d_j=d_1=d\). Obviously, \(B_{n,k}(d)=\cup _{2\le i<j\le k}D_{i,j}\). It follows from the definition of \(B_{n,k}\) that after removing from \((d_1,d_2,\ldots ,d_{k})\in D_{i,j}\) the terms \(d_1, d_i\) and \(d_j\) we get a non-good composition of \(n-3d\) into \(k-3\) parts in which no part is equal to d. Hence, \(|D_{i,j}|\le g_{n-3d,k-3}\) and, consequently,

$$\begin{aligned} |B_{n,k}(d)|= \sum _{2\le i<j\le k}|D_{i,j}|\le {k-1\atopwithdelims ()2}g_{n-3d,k-3}. \end{aligned}$$
(6)

The assertion follows by the equality (4) and the inequalities (5) and (6). \(\square \)
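The recursive inequality of Lemma 3 can be compared with exact values of \(g_{n,k}\) for small parameters. In the sketch below, `g` counts non-good compositions by brute force and `lemma3_rhs` evaluates the right-hand side of the lemma; both names are ours.

```python
from collections import Counter
from functools import lru_cache
from itertools import combinations
from math import comb


@lru_cache(maxsize=None)
def g(n, k):
    """Number of non-good compositions of n into k parts, by exhaustive enumeration."""
    if k < 1 or n < k:
        return 0
    total = 0
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        parts = [bounds[i + 1] - bounds[i] for i in range(k)]
        if 1 not in Counter(parts).values():  # no part occurs exactly once
            total += 1
    return total


def lemma3_rhs(n, k):
    """Right-hand side of Lemma 3 (stated for n >= k >= 4)."""
    first = sum(g(n - 2 * d, k - 2) for d in range(1, (n - k + 2) // 2 + 1))
    second = sum(g(n - 3 * d, k - 3) for d in range(1, (n - k + 3) // 3 + 1))
    return (k - 1) * first + comb(k - 1, 2) * second


for n in range(4, 13):
    for k in range(4, n + 1):
        assert g(n, k) <= lemma3_rhs(n, k)

print(g(6, 4), lemma3_rhs(6, 4))  # 6 6
```

For \(n=6\), \(k=4\) the non-good compositions are the six arrangements of (1, 1, 2, 2), and the bound is tight.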

One can readily verify that, for \(k=2,3\), \(g_{n,k}=1\) if k is a divisor of n and \(g_{n,k}=0\) otherwise. Moreover, \(g_{n,1}=0\).

Lemma 4

For \(n\ge k\ge 1\),

(i) \(g_{n,k}\le k!n^{\lfloor k/2\rfloor -1}\),

(ii) \(g_{n,k}\le (kn)^{k/2 - 1}\).

Proof

(i) We proceed by induction on k. Obviously, the lemma holds for \(k=1,2,3\) so let us assume that \(k\ge 4\). By Lemma 3 and the induction hypothesis we get

$$\begin{aligned} g_{n,k}\le & {} (k-1)\sum _{d=1}^{\left\lfloor \frac{n-k+2}{2}\right\rfloor }(k-2)!(n-2d)^{\lfloor (k-2)/2\rfloor -1}\\&+ {k-1\atopwithdelims ()2}\sum _{d=1}^{\left\lfloor \frac{n-k+3}{3}\right\rfloor }(k-3)!(n-3d)^{\lfloor (k-3)/2\rfloor -1}\\\le & {} (k-1)!n^{\lfloor (k-2)/2\rfloor } + (k-1)!n^{\lfloor (k-3)/2\rfloor } \le k!n^{\lfloor k/2\rfloor -1}. \end{aligned}$$

(ii) We proceed by induction on k again. The lemma holds for \(k=1,2,3\) so let us assume that \(k\ge 4\). By Lemma 3 and the induction hypothesis, similarly as in the proof of (i), we get

$$\begin{aligned} g_{n,k}\le & {} (k-1)\sum _{d=1}^{\left\lfloor \frac{n-k+2}{2}\right\rfloor }[(k-2)(n-2d)]^{(k-2)/2-1}\\&+ {k-1\atopwithdelims ()2}\sum _{d=1}^{\left\lfloor \frac{n-k+3}{3}\right\rfloor }[(k-3)(n-3d)]^{(k-3)/2-1}\\\le & {} \frac{1}{2}k^{(k-2)/2}n^{(k-2)/2} + \frac{1}{6}k^{(k-1)/2}n^{(k-3)/2} = (kn)^{k/2-1}\left( \frac{1}{2}+\frac{1}{6}\left( \frac{k}{n}\right) ^{1/2}\right) \\\le & {} (kn)^{k/2 - 1}. \end{aligned}$$

\(\square \)
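As a sanity check of Lemma 4, the sketch below compares brute-force values of \(g_{n,k}\) with the two bounds for all \(1\le k\le n\le 12\). The function names are ours, and bound (ii) is evaluated in floating point, which is adequate for these small values.

```python
from collections import Counter
from itertools import combinations
from math import factorial


def g(n, k):
    """Number of non-good compositions of n into k parts, by exhaustive enumeration."""
    count = 0
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        parts = [bounds[i + 1] - bounds[i] for i in range(k)]
        if 1 not in Counter(parts).values():  # no part occurs exactly once
            count += 1
    return count


def lemma4_bounds(n, k):
    bound_i = factorial(k) * n ** (k // 2 - 1)  # k! n^{floor(k/2) - 1}
    bound_ii = (k * n) ** (k / 2 - 1)           # (kn)^{k/2 - 1}
    return bound_i, bound_ii


for n in range(1, 13):
    for k in range(1, n + 1):
        bound_i, bound_ii = lemma4_bounds(n, k)
        assert g(n, k) <= bound_i and g(n, k) <= bound_ii
```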

3 Packing and Covering Number Asymptotic Formulas for a Fixed k and for \(k\le n^{1/3}\)

We are ready now to prove asymptotic bounds on the numbers \(p_{n,k}\) and \(c_{n,k}\) when k is fixed.

Proof

(of Theorem 2) As the theorem obviously holds for \(k=1\), let us assume that \(k\ge 2\). By the inequalities (2) and (3), to prove the theorem, it suffices to show that \(s_{n,k}\le cn^{\lfloor k/2\rfloor }\), for some constant \(c=c(k)\).

By Lemmas 2 and 4(i),

$$\begin{aligned} s_{n,k}&\le \frac{n}{k}(g_{n,k}+kg_{n-1,k-1}) \le n\left( (k-1)!n^{\lfloor k/2\rfloor -1}+(k-1)!(n-1)^{\lfloor (k-1)/2\rfloor -1}\right) \nonumber \\&\le 2(k-1)!n^{\lfloor k/2\rfloor }, \end{aligned}$$

which completes the proof of the theorem. \(\square \)

We shall apply now our results on compositions shown in Sect. 2 to prove asymptotic formulas for \(p_{n,k}\) and \(c_{n,k}\) in the case when \(k=k(n)\) is any function of n such that \(k\le n^{1/3}\).

Theorem 3

Let \(\alpha ,0< \alpha \le 1/3\), be a fixed real number. If \(k=k(n)\le n^\alpha \), then

$$\begin{aligned} p_{n,k}={n\atopwithdelims ()k}-O\left( \frac{1}{2^{k/2}}{n\atopwithdelims ()k}^\beta \right) \quad \mathrm{and}\quad c_{n,k}={n\atopwithdelims ()k}+O\left( \frac{1}{2^{k/2}}{n\atopwithdelims ()k}^\beta \right) , \end{aligned}$$

where \(\frac{1}{2}<\beta =\frac{1+\alpha }{2(1-\alpha )}\le 1\).

Proof

We observe that the function \(f(x)=(1+x)^{\frac{1}{x}}\) is decreasing in the interval (0, 1], so \((1+x)^{\frac{1}{x}}\ge 2\), for \(x\in (0,1]\). Moreover, \(0<\frac{k}{n-k}\le 1\), for \(n\ge 2k\), so

$$\begin{aligned} \left( \frac{n}{n-k}\right) ^{n-k}=\left[ \left( 1+\frac{k}{n-k}\right) ^{\frac{n-k}{k}}\right] ^k\ge 2^k. \end{aligned}$$

Thus, using the Stirling formula for the factorial approximation one can easily observe that there is a constant \(d>0\) such that for all positive integers n and k, where \(n\ge 2k\),

$$\begin{aligned} {n\atopwithdelims ()k}\ge d\sqrt{\frac{n}{k(n-k)}}\frac{n^n}{k^k(n-k)^{n-k}}\ge dk^{-\frac{1}{2}}\left( \frac{n}{k}\right) ^k2^k. \end{aligned}$$
(7)

By Lemmas 2 and 4(ii) and the inequality \(k\le n^\alpha \),

$$\begin{aligned} k(s_{n,k}+1)\le & {} n(g_{n,k}+kg_{n-1,k-1})+k \le n[(kn)^{\frac{k}{2}-1}+k((k-1)(n-1))^{\frac{k-1}{2}-1}]+k\nonumber \\\le & {} k^{\frac{k-1}{2}}n^{\frac{k}{2}}(k^{-\frac{1}{2}}+n^{-\frac{1}{2}})+k \le k^{\frac{k-1}{2}}n^{\frac{k}{2}}\cdot 4k^{-\frac{1}{2}}\le 4n^{\frac{k}{2}(\alpha +1)-\alpha }. \end{aligned}$$
(8)

For \(n\ge 2k\), by the inequality \(\frac{k}{2}(\alpha +1)-\alpha \le \beta (k(1-\alpha )-\frac{\alpha }{2})\), true for \(\alpha \le \frac{1}{3}\), and the inequalities (8), \(k\le n^\alpha \), (7) and \(\beta >\frac{1}{2}\), we get

$$\begin{aligned} k(s_{n,k}+1)\le & {} 4n^{\beta (k(1-\alpha )-\frac{\alpha }{2})}=4\left( n^{-\frac{\alpha }{2}}\left( \frac{n}{n^\alpha }\right) ^k\right) ^\beta \le 4\left( k^{-\frac{1}{2}}\left( \frac{n}{k}\right) ^k\right) ^\beta \nonumber \\\le & {} 4\left( \frac{1}{d2^k}{n\atopwithdelims ()k}\right) ^\beta \le \frac{4}{d^{\beta }}\frac{1}{2^{k/2}}{n\atopwithdelims ()k}^\beta . \end{aligned}$$
(9)

As \(n\ge 2k\) for \(n\ge 3\) (because \(k\le n^{1/3}\)), the theorem follows now from the inequalities (9), (2) and (3). \(\square \)

Proof

(of Corollary 1). By Theorem 3, if \(k\rightarrow \infty \) as \(n\rightarrow \infty \), then the equalities (1) hold. Since \(\left\lfloor \frac{k}{2}\right\rfloor < \beta k\), it follows from Theorem 2 that if k is a fixed positive integer, then the equalities (1) hold too. One can readily verify that these two statements imply that the equalities (1) are true for every function \(k=k(n)\le n^\alpha \). \(\square \)

4 A Packing Number Asymptotic Formula for \(k=o(n)\)

Let us pass on now to a more general case when \(k=o(n)\). Using a probabilistic method, we will prove in this case that \(p_{n,k}={n\atopwithdelims ()k}(1-o(1))\). To show this result, we will prove first that if \(k=o(n)\) and \(k\rightarrow \infty \) as \(n\rightarrow \infty \), then almost every composition of n into k parts is good. More precisely, we will show (see Lemma 6) that if we select a composition of n into k parts uniformly at random, then with high probability its largest part is unique. We will call the random composition model described in the preceding sentence a uniform model and denote it by \(\varPi _{n,k}\) because of its similarity to the uniform random graph model. Some related results on the issues concerning the largest parts and multiplicities of parts in random compositions can be found in Knopfmacher and Robbins [24], Hitczenko and Louchard [11], Hitczenko and Savage [12] and the book by Heubach and Mansour [10]. In these results, however, the number of parts is not a parameter, so it seems they are not useful in our considerations.

We can imagine that a positive integer n is represented as an interval of length n which is divided into n segments of length 1 by \(n-1\) dashes. Note that a composition \((d_1,d_2,\ldots ,d_{k})\) of n into k parts can be thought of as a selection of \(k-1\) dashes that divide our interval (into segments of lengths \(d_1,d_2,\ldots ,d_{k}\)). We will say that a part (of a given composition) starts at position i if the i-th dash is selected (the part \(d_1\) starts at position 0). The length of this part is the distance from the i-th dash to the next selected dash (or to the position n if no dash at a position larger than i has been selected).

Let us define the binomial model, denoted by \(\varPi _{n,p}\), so that each composition of n into k parts is chosen with probability \(p^{k-1}(1-p)^{n-k}\). This is equivalent to saying that in the “interval” representation described above we pick dashes to form our composition independently, each with probability p. Note that the expected number of parts in \(\varPi _{n,p}\) is \(p(n-1)+1\), so one may hope that it would be equivalent to \(\varPi _{n,k}\) when \(k\approx np\).
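The binomial model is easy to simulate from the “interval” representation. The sketch below (the function name and the explicit random generator argument are our choices) selects each of the \(n-1\) dashes independently with probability p and reads off the parts.

```python
import random


def sample_binomial_composition(n, p, rng):
    """Sample a composition of n from the binomial model with dash probability p."""
    # Dash i (1 <= i <= n-1) is selected independently with probability p.
    selected = [i for i in range(1, n) if rng.random() < p]
    bounds = [0] + selected + [n]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


rng = random.Random(0)
parts = sample_binomial_composition(100, 0.1, rng)
print(len(parts), sum(parts))  # the number of parts varies; the sum is always 100
```

Averaged over many samples, the number of parts should be close to \(p(n-1)+1\), which equals 10.9 for these parameters.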

Here is a sketch of our reasoning in the remaining part of this section. We start by proving that, with high probability, the largest part in \(\varPi _{n,p}\) is unique. We set p to be slightly smaller than \(\frac{k}{n}\), so, with high probability, the number of parts in \(\varPi _{n,p}\) is smaller than k. To obtain a composition with exactly k parts we take a random composition from \(\varPi _{n,p}\) and randomly add the missing dashes. If the number of added dashes is small enough, we can show that, with high probability, it does not affect our unique part. This procedure is a \(\varPi _{n,k}\) model in disguise, which completes the argument.

Lemma 5

Let \(p=p(n),0<p<1\), be a function of n such that \(p=o(1)\) and \(np\rightarrow \infty \) as \(n\rightarrow \infty \). We define \(\ell _0=\frac{\ln (np)-\ln \omega }{-\ln (1-p)}\) and \(\ell _1=\frac{\ln (np)+\ln \omega }{-\ln (1-p)}\), where \(\omega =\omega (n)=\min (np,p^{-\frac{1}{3}})\). Then, with high probability,

(i) \(\varPi _{n,p}\) does not have a part \(d>\ell _1\);

(ii) \(\varPi _{n,p}\) has a part \(d>\ell _0\);

(iii) every part \(d\in (\ell _0,\ell _1]\) in \(\varPi _{n,p}\) is unique;

(iv) the largest part in \(\varPi _{n,p}\) is unique.

Proof

Let \(X_d\) be the number of parts equal to d in \(\varPi _{n,p}\) and define \(X_{\ge d}=\sum _{i=d}^nX_i\). Clearly, \(X_{\ge d}=\sum _{i=0}^{n-d}I_i\), where \(I_i\) is the 0-1 random variable indicating that a part of length at least d starts at position i. We observe that for \(1\le i\le n-d\), \(\mathbb {E}[I_i]=Pr[I_i=1]=p(1-p)^{d-1}\) (we must select a dash at position i with probability p and skip the next \(d-1\) dashes) and \(\mathbb {E}[I_0]=(1-p)^{d-1}\), so

$$\begin{aligned} \mathbb {E}[X_{\ge d}]=\sum _{i=0}^{n-d}\mathbb {E}[I_i]=(n-d)p(1-p)^{d-1}+(1-p)^{d-1}\le (np+1)(1-p)^{d-1}.\nonumber \\ \end{aligned}$$
(10)
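The first equality in (10) can be verified exactly for small n by summing over all \(2^{n-1}\) dash selections. The sketch below uses rational arithmetic so that the comparison is exact; the function names are ours.

```python
from fractions import Fraction
from itertools import product


def expected_parts_at_least(n, d, p):
    """Exact E[X_{>=d}] in the binomial model, summing over all 2^(n-1) dash selections."""
    total = Fraction(0)
    for choice in product((0, 1), repeat=n - 1):
        bounds = [0] + [i + 1 for i, c in enumerate(choice) if c] + [n]
        parts = [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]
        k = len(parts)
        prob = p ** (k - 1) * (1 - p) ** (n - k)  # probability of this composition
        total += prob * sum(1 for part in parts if part >= d)
    return total


def formula_10(n, d, p):
    """Closed form (n-d) p (1-p)^(d-1) + (1-p)^(d-1) from (10)."""
    return (n - d) * p * (1 - p) ** (d - 1) + (1 - p) ** (d - 1)


p = Fraction(1, 3)
for n in range(1, 9):
    for d in range(1, n + 1):
        assert expected_parts_at_least(n, d, p) == formula_10(n, d, p)
```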

(i) We need to show that, with high probability, \(X_{\ge \lfloor \ell _1\rfloor +1}=0\). One can readily verify that \((1-p)^{\ell _1}=\frac{1}{np\omega }\), so

$$\begin{aligned} \mathbb {E}[X_{\ge \lfloor \ell _1\rfloor +1}]\le (np+1)(1-p)^{\lfloor \ell _1\rfloor }\le (np+1)(1-p)^{\ell _1-1}=\frac{1+\frac{1}{np}}{\omega (1-p)}=o(1), \end{aligned}$$

because \(p=o(1)\) and \(np\rightarrow \infty \) as \(n\rightarrow \infty \), so \(\omega \rightarrow \infty \). As \(Pr[X\ge 1]\le \mathbb {E}[X]\) for any nonnegative integer valued random variable X, we get (i).

(ii) We shall use the second moment method. First we observe that for sufficiently large n, \(0\le \ell _0 = \frac{\ln (np)-\ln \omega }{-\ln (1-p)}\le \frac{\ln (np)}{p}=o(n)\), because \(p\le -\ln (1-p)\) and \(np\rightarrow \infty \) as \(n\rightarrow \infty \). Moreover,

$$\begin{aligned} (1-p)^{\ell _0}=\frac{\omega }{np}. \end{aligned}$$
(11)

Let \(I_i\) be the 0-1 random variable indicating that a part of length at least \(\lfloor \ell _0\rfloor +1\) starts at position i. By the first equality in (10), we have

$$\begin{aligned} \mathbb {E}[X_{\ge \lfloor \ell _0\rfloor +1}]= & {} \sum _{i=0}^{n-\lfloor \ell _0\rfloor -1}\mathbb {E}[I_i]\ge (n-\lfloor \ell _0\rfloor )p(1-p)^{\lfloor \ell _0\rfloor } \ge (n-\ell _0)p(1-p)^{\ell _0}\nonumber \\= & {} np(1-p)^{\ell _0}(1-o(1))= np\frac{\omega }{np}(1-o(1))=\omega (1-o(1)). \end{aligned}$$
(12)

As for the second moment, we have

$$\begin{aligned} \mathbb {E}[X^2_{\ge \lfloor \ell _0\rfloor +1}]=\mathbb {E}\left[ \sum _{i=0}^{n-\lfloor \ell _0\rfloor -1}\sum _{j=0}^{n-\lfloor \ell _0\rfloor -1}I_iI_j\right] \le \sum _{i=0}^{n-1}\sum _{j=0}^{n-1}Pr[I_i=I_j=1]. \end{aligned}$$
(13)

Note that

$$\begin{aligned} Pr[I_i=I_j=1]\le \left\{ \begin{array}{ll} p^2(1-p)^{2\lfloor \ell _0\rfloor }&{} \text{ for } 0\not =i\not =j\not =0\\ p(1-p)^{2\lfloor \ell _0\rfloor }&{} \text{ for } \text{( }i=0 \text{ or } j=0\text{) } \text{ and } i\not =j\\ p(1-p)^{\lfloor \ell _0\rfloor }&{} \text{ for } i=j\not =0\\ (1-p)^{\lfloor \ell _0\rfloor }&{} \text{ for } i=j=0\\ \end{array}. \right. \end{aligned}$$
(14)

Thus, applying (13), (14) and (11), we get

$$\begin{aligned} \mathbb {E}[X^2_{\ge \lfloor \ell _0\rfloor +1}]\le & {} n^2p^2(1-p)^{2\lfloor \ell _0\rfloor }+2np(1-p)^{2\lfloor \ell _0\rfloor }+np(1-p)^{\lfloor \ell _0\rfloor }+(1-p)^{\lfloor \ell _0\rfloor }\nonumber \\\le & {} (n^2p^2+2np)(1-p)^{2\ell _0-2}+(np+1)(1-p)^{\ell _0-1}\nonumber \\= & {} \omega ^2\frac{1+\frac{2}{np}}{(1-p)^2}+\omega \frac{1+\frac{1}{np}}{1-p}\nonumber \\= & {} (\omega ^2+\omega )(1+o(1)). \end{aligned}$$
(15)

Applying (12) and (15), by the second moment method, we have

$$\begin{aligned} Pr[X_{\ge \lfloor \ell _0\rfloor +1}>0]\ge & {} 2-\frac{\mathbb {E}[X^2_{\ge \lfloor \ell _0\rfloor +1}]}{(\mathbb {E}[X_{\ge \lfloor \ell _0\rfloor +1}])^2} \ge 2-\frac{(\omega ^2+\omega )(1+o(1))}{\omega ^2(1-o(1))}\\= & {} 2-(\frac{1}{\omega }+1)(1+o(1))=1 - o(1), \end{aligned}$$

because \(\omega \rightarrow \infty \) as \(n\rightarrow \infty \), which proves (ii).
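For completeness, the form of the second moment bound used in (ii) follows from Chebyshev's inequality applied to a nonnegative random variable X with \(\mathbb {E}[X]>0\):

$$\begin{aligned} Pr[X=0]\le Pr\big [|X-\mathbb {E}[X]|\ge \mathbb {E}[X]\big ]\le \frac{\mathbb {E}[X^2]-(\mathbb {E}[X])^2}{(\mathbb {E}[X])^2}=\frac{\mathbb {E}[X^2]}{(\mathbb {E}[X])^2}-1,\quad \text{so}\quad Pr[X>0]\ge 2-\frac{\mathbb {E}[X^2]}{(\mathbb {E}[X])^2}. \end{aligned}$$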

(iii) For any \(d > \ell _0\), let us estimate the probability of the event \(C_d\) that a part d occurs at least twice in a composition \(\varPi _{n,p}\). We denote by \(A_{i,j}\), \(0\le i<j\le n-d\), the event that parts of length d in \(\varPi _{n,p}\) start at positions i and j. For all pairs \((i,j)\) such that \(j>i+d\) and \(i>0\), \(A_{i,j}\) holds whenever we select dashes at positions \(i,i+d,j\) and \(j+d\) (all with probability p) and skip the \(2d-2\) positions between i and \(i+d\) and between j and \(j+d\) (all with probability \(1-p\)). Thus, in this case \(Pr[A_{i,j}]=p^4(1-p)^{2d-2}\). If \(i=0\) and \(j>d\) or \(i>0\) and \(j=i+d\) then, by a similar reasoning, we get \(Pr[A_{i,j}]=p^3(1-p)^{2d-2}\). If \(i=0\) and \(j=d\), then \(Pr[A_{i,j}]=p^2(1-p)^{2d-2}\). Finally, if \(i<j<i+d\), then \(Pr[A_{i,j}]=0\). The number of pairs \((i,j)\), \(0<i<j\le n-d\), such that \(j>i+d\) is equal to \(\frac{1}{2}(n-2d)(n-2d-1)\le \frac{1}{2}n^2\) and the number of pairs \((i,j)\) such that \(i=0\) and \(j>d\) or \(i>0\) and \(j=i+d\) is not larger than 2n. Thus, applying the assumption \(np\rightarrow \infty \) as \(n\rightarrow \infty \), the equality (11) and the definition of \(\omega \), for sufficiently large n, we get

$$\begin{aligned} Pr[C_{d+1}]= & {} Pr\left[ \bigcup _{0\le i<j\le n-d-1}A_{i,j}\right] \le \sum _{0\le i<j\le n-d-1}Pr[A_{i,j}]\nonumber \\\le & {} \frac{1}{2}n^2p^4(1-p)^{2d} + 2np^3(1-p)^{2d}+p^2(1-p)^{2d}\nonumber \\= & {} \frac{1}{2}n^2p^4(1-p)^{2d}\left( 1+\frac{4}{np}+\frac{2}{n^2p^2}\right) \le n^2p^4(1-p)^{2d}\nonumber \\= & {} n^2p^4\frac{\omega ^2}{n^2p^2}=p^2\omega ^2. \end{aligned}$$
(16)

Let D be the event that there is a part in \(\varPi _{n,p}\) from the interval \((\ell _0,\ell _1]\) which is not unique. Clearly, \( D\subseteq \bigcup _{d=\lfloor \ell _0\rfloor }^{\lfloor \ell _1\rfloor -1}C_{d+1}.\) Thus, by the inequalities (16), \(p\le -\ln (1-p)\), the definition of \(\omega \) and the assumption \(p=o(1)\), we have

$$\begin{aligned} Pr[D]\le & {} \sum _{d=\lfloor \ell _0\rfloor }^{\lfloor \ell _1\rfloor -1}Pr[C_{d+1}] \le (\lfloor \ell _1\rfloor -\lfloor \ell _0\rfloor )p^2\omega ^2\le \left( \frac{2\ln \omega }{-\ln (1-p)}+1\right) \cdot p^2\omega ^2\\\le & {} \left( \frac{2\ln p^{-\frac{1}{3}}}{p}+1\right) \cdot p^2\cdot p^{-\frac{2}{3}}= p^{\frac{4}{3}}-\frac{2}{3}p^\frac{1}{3}\ln p=o(1). \end{aligned}$$

Hence, with high probability, every part \(d\in (\ell _0,\ell _1]\) of \(\varPi _{n,p}\) is unique.

The statement (iv) follows immediately from (i)-(iii). \(\square \)

Lemma 6

If \(k=o(n)\) and \(k\rightarrow \infty \) as \(n\rightarrow \infty \), then, with high probability, the largest part in \(\varPi _{n,k}\) is unique.

Proof

We consider the following procedure. We take a random composition \(\overline{\pi }\) from \(\varPi _{n,\overline{p}}\), where \(\overline{p}=\frac{k-k^{\frac{3}{4}}}{n-1}\). Let \(\overline{k}\) be the number of parts in \(\overline{\pi }\). Now, in the “interval representation” of \(\overline{\pi }\), we add \(k-\overline{k}\) dashes (or delete \(\overline{k}-k\) dashes, if \(\overline{k}>k\)), selected uniformly at random (meaning the uniform distribution over all \({n-\overline{k} \atopwithdelims ()k-\overline{k}}\) sets of \(k-\overline{k}\) dashes), to obtain a composition \(\pi \) with exactly k parts. Clearly, this is the same as generating \(\pi \) by selecting \(k-1\) dashes uniformly at random, so the procedure is equivalent to the \(\varPi _{n,k}\) model.
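The two-stage procedure can be sketched in a few lines. The function name, the use of Python's `random.Random`, and the restriction to \(n\ge 2\) and \(1\le k\le n\) are our assumptions; the value of \(\overline{p}\) is the one from the proof.

```python
import random


def sample_via_binomial(n, k, rng):
    """Sample a composition of n into exactly k parts (n >= 2, 1 <= k <= n):
    first draw from the binomial model, then add or delete dashes uniformly at random."""
    p_bar = (k - k ** 0.75) / (n - 1)
    dashes = {i for i in range(1, n) if rng.random() < p_bar}
    missing = (k - 1) - len(dashes)
    if missing >= 0:
        # Add the missing dashes, chosen uniformly among the unselected ones.
        free = [i for i in range(1, n) if i not in dashes]
        dashes.update(rng.sample(free, missing))
    else:
        # Too many parts: delete the surplus dashes uniformly at random.
        dashes.difference_update(rng.sample(sorted(dashes), -missing))
    bounds = [0] + sorted(dashes) + [n]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


rng = random.Random(0)
parts = sample_via_binomial(1000, 30, rng)
print(len(parts), sum(parts))  # 30 1000
```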

Note that, with high probability, we have \(k-2k^{\frac{3}{4}}\le \overline{k}\le k\), as \(\overline{k} - 1\) is a random variable with the expectation \((n-1)\overline{p}\) and the variance \((n-1)\overline{p}(1-\overline{p})\). In order to complete the proof we need to show that, with high probability, the largest part in \(\overline{\pi }\) is unique and that the process of adding dashes will not destroy its uniqueness, so that, with high probability, the largest part in \(\pi \) is unique.

As \(p=\overline{p}\) satisfies the assumptions of Lemma 5, with high probability, the largest part, say d, in \(\overline{\pi }\) is unique and \(d\le \ell _1=\frac{\ln (n\overline{p})+\ln \omega }{-\ln (1-\overline{p})}\le \frac{2\ln (n\overline{p})}{\overline{p}}\). Note that this part will remain the unique largest part in \(\pi \), unless we select (in our procedure of adding dashes) at least one of the \(d-1\) dashes inside it. The probability that such a bad event does not happen is equal to

$$\begin{aligned} \frac{{n-\overline{k}-(d-1)\atopwithdelims ()k-\overline{k}}}{{n-\overline{k}\atopwithdelims ()k-\overline{k}}}= & {} \prod _{i=0}^{k-\overline{k}-1}\frac{n-\overline{k}-(d-1)-i}{n-\overline{k}-i}\\\ge & {} \left( 1-\frac{d-1}{n-k}\right) ^{k-\overline{k}}\ge \left( 1-\frac{\ell _1}{n-k}\right) ^{2k^{\frac{3}{4}}}, \end{aligned}$$

because \(d\le \ell _1\) and \(k-\overline{k} \le 2k^{\frac{3}{4}}\). Note that, as \(n\overline{p}\le k\) and \(\overline{p}\ge \frac{1}{2}\frac{k}{n}\), for sufficiently large n, we have \(\ell _1 \le \frac{2\ln (n\overline{p})}{\overline{p}} \le 4n\frac{\ln (k)}{k}\). Hence, \(2k^{\frac{3}{4}}\cdot \frac{\ell _1}{n-k}=o(1)\), so \( \left( 1-\frac{\ell _1}{n-k}\right) ^{2k^{\frac{3}{4}}}=1-o(1).\) We proved that the largest part in \(\overline{\pi }\) will survive in \(\pi \) with high probability, which completes the proof of the lemma. \(\square \)
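For completeness, the last estimate in the proof above can be spelled out as follows. This is only a worked expansion of the step stated there, using the bound \(\ell _1\le 4n\frac{\ln k}{k}\) and the assumptions \(k=o(n)\) and \(k\rightarrow \infty \):

$$\begin{aligned} 2k^{\frac{3}{4}}\cdot \frac{\ell _1}{n-k}\le & {} 2k^{\frac{3}{4}}\cdot \frac{4n\ln k}{k(n-k)}=\frac{8\ln k}{k^{\frac{1}{4}}}\cdot \frac{n}{n-k}=o(1), \end{aligned}$$

because \(\frac{\ln k}{k^{1/4}}\rightarrow 0\) as \(k\rightarrow \infty \) and \(\frac{n}{n-k}=1+o(1)\) for \(k=o(n)\). Bernoulli's inequality \((1-x)^r\ge 1-rx\), valid for \(0\le x\le 1\) and real \(r\ge 1\), then gives

$$\begin{aligned} \left( 1-\frac{\ell _1}{n-k}\right) ^{2k^{\frac{3}{4}}}\ge 1-2k^{\frac{3}{4}}\cdot \frac{\ell _1}{n-k}=1-o(1). \end{aligned}$$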

Corollary 2

If \(k=o(n)\) and \(k\rightarrow \infty \) as \(n\rightarrow \infty \), then

$$\begin{aligned} s_{n,k}={n\atopwithdelims ()k}o(1). \end{aligned}$$

Proof

Clearly, the number of compositions of n into k parts is equal to \({n-1\atopwithdelims ()k-1}\). It follows from the definition of an awesome composition that if the largest part in a composition is unique, then the composition is awesome. Thus, by Lemma 6, \(a_{n,k}={n-1\atopwithdelims ()k-1}o(1)\). The corollary follows now from Lemma 1. \(\square \)
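For small parameters the counting behind Lemma 6 and Corollary 2 can be explored by brute force. The sketch below is an illustration only. It enumerates all compositions of n into k parts and counts those whose largest part is unique. It does not test the "awesome" property itself, only the sufficient condition used in the proof above. The function names are hypothetical.

```python
from itertools import combinations
from math import comb

def compositions(n, k):
    """Yield every composition of n into k positive parts."""
    # A composition corresponds to a choice of k-1 dashes among the n-1 gaps.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        yield tuple(b - a for a, b in zip(bounds, bounds[1:]))

def count_unique_largest(n, k):
    """Return (number of compositions with a unique largest part, total number)."""
    unique = 0
    total = 0
    for parts in compositions(n, k):
        total += 1
        if parts.count(max(parts)) == 1:
            unique += 1
    return unique, total
```

The second returned value always equals `comb(n - 1, k - 1)`, in agreement with the first sentence of the proof. The enumeration is exponential in size, so it is practical only for small n.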

Proof

(of Theorem 1) The inequality (2) and Corollary 2 imply that if \(k=o(n)\) and \(k\rightarrow \infty \) as \(n\rightarrow \infty \), then \(p_{n,k}={n\atopwithdelims ()k}(1-o(1))\). Obviously, by Theorem 2, the equality \(p_{n,k}={n\atopwithdelims ()k}(1-o(1))\) holds for any fixed positive integer k too. One can readily verify that these two statements imply the theorem. \(\square \)

5 Final Remarks and Open Problems

In this paper we proved several asymptotic results for the length of a longest \((n,k)\)-Ucycle packing and a shortest \((n,k)\)-Ucycle covering. There are several natural problems which still remain open. For a fixed k we proved that \(p_{n,k}={n\atopwithdelims ()k}-O(n^{\lfloor k/2\rfloor })\) and \(c_{n,k}={n\atopwithdelims ()k}+O(n^{\lfloor k/2\rfloor })\). It would be good to improve these asymptotic formulas. In particular, the following problem seems to be interesting.

Problem 1

Is there an absolute constant t such that, for every positive integer k, \(p_{n,k}={n\atopwithdelims ()k}-O(n^t)\) and \(c_{n,k}={n\atopwithdelims ()k}+O(n^{t})\)?

Solving this problem would require a new construction of long \((n,k)\)-Ucycle packings and short \((n,k)\)-Ucycle coverings. The constructions considered in this paper are not suitable because it is easy to show that for a fixed k there are \(\varOmega (n^{\lfloor k/2\rfloor })\) k-subsets of [n] which are not good.

In the present paper we also proved asymptotic formulas for the numbers \(p_{n,k}\) and \(c_{n,k}\) when k is not a constant but is a function of n. The results for the packing number are, however, stronger than the results for the covering number. In particular the covering analog of Theorem 1 remains open.

Problem 2

Is it true that if \(k=o(n)\), then \(c_{n,k}={n\atopwithdelims ()k}(1+o(1))\)?

We believe the answer to Problem 2 is positive.

We know very little about the existence of universal cycles in the case when k is very large. Clearly, the case of \(k=n-1\) is trivial. Stevens et al. [28] showed that for \(k=n-2\) universal cycles do not exist and that \(p_{n,n-2}=n\). They also considered linear packings instead of cyclic packings of \((n-2)\)-subsets of an n-set and proved that in this case the corresponding packing number is \(3n-6\) (far from the trivial upper bound \({n\atopwithdelims ()n-2}+n-3\)). Using a method similar to the one described in [26], we are able to show that \(c_{n,n-2}\le 2{n\atopwithdelims ()n-2}-n\), but we do not know whether or not the actual value of \(c_{n,n-2}\) is equal to this upper bound. For \(k=n-s\), where \(s>2\), we do not know any nontrivial general results. Therefore we state the following problem.

Problem 3

For a fixed \(s>2\), find good asymptotic formulas for \(p_{n,n-s}\) and \(c_{n,n-s}\).
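The trivial case \(k=n-1\) mentioned above can be checked mechanically. The following sketch is an illustration under the definition of a universal cycle given in the introduction. It tests whether a cyclic sequence contains every k-subset of [n] exactly once as k consecutive terms. The function name is hypothetical.

```python
from itertools import combinations

def is_universal_cycle(seq, n, k):
    """Check that every k-subset of [n] occurs exactly once as a cyclic window of seq."""
    m = len(seq)
    # One window of k consecutive terms per starting position, indices taken modulo m.
    windows = [frozenset(seq[(i + j) % m] for j in range(k)) for i in range(m)]
    all_subsets = {frozenset(c) for c in combinations(range(1, n + 1), k)}
    # Equal counts plus equal sets means each k-subset appears exactly once.
    return len(windows) == len(all_subsets) and set(windows) == all_subsets
```

For example, `is_universal_cycle(list(range(1, n + 1)), n, n - 1)` holds because each of the n cyclic windows omits a different element, and the sequence (1, 3, 2, 5, 4, 2, 1, 5, 3, 4) from the introduction passes the check with \(n=5\) and \(k=2\).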

It is not difficult to see that if \(k=n-s\), where \(s>1\) is a constant, then almost all k-subsets of [n] are not good. On the other hand, we proved in this paper (see Corollary 2) that for \(k=o(n)\) almost all k-subsets of [n] are good. In view of these facts it is interesting to ask what happens when k grows linearly with n. Therefore we state our next problem.

Problem 4

Let c be a constant real number such that \(0<c<1\) and let \(k\sim cn\). Are almost all k-subsets of [n] good?

Applying an argument similar to the one in the proof of Lemma 1, one can easily prove that the number of non-good k-subsets of [n] is equal to \(\frac{n}{k}g_{n,k}\). Therefore the question formulated in Problem 4 is equivalent to asking if almost all compositions of n into k parts are good.

Some computer experiments done by P. Rzążewski suggest that for \(c\ge \frac{1}{2}\) the answers to both questions are negative.