Abstract
We study the random walk on the symmetric group \(S_n\) generated by the conjugacy class of cycles of length k. We show that the convergence to uniform measure of this walk has a cut-off in total variation distance after \(\frac{n}{k}\log n\) steps, uniformly in \(k = o(n)\) as \(n \rightarrow \infty \). The analysis follows from a new asymptotic estimation of the characters of the symmetric group evaluated at cycles.
1 Introduction
A well-known conjecture in the theory of random walks on groups asserts that for the random walk on the symmetric group generated by all permutations of a common cycle structure, the mixing time of the walk depends only on the number of fixed points. Given a measure \(\mu \) on \(S_n\) and an integer \(t \ge 1\), denote by \(\mu ^{*t}\) its t-fold convolution, which is the law of the random walk started from the identity in \(S_n\) after t steps distributed according to \(\mu \). We measure convergence to uniformity in the total variation metric, which for probabilities \(\mu \) and \(\nu \) on \(S_n\) is given by
$$\begin{aligned} \left\| \mu - \nu \right\| _{TV} = \sup _{A \subset S_n} \left| \mu (A) - \nu (A)\right| = \frac{1}{2}\sum _{\sigma \in S_n} \left| \mu (\sigma ) - \nu (\sigma )\right| . \end{aligned}$$
A precise statement of the conjecture is as follows.
Conjecture 1
For each \(n > 1\) let \(C_n\) be a conjugacy class of the symmetric group \(S_n\) having \(n-k_n< n\) fixed points, and let \(\mu _{C_n}\) denote the uniform probability measure on \(C_n\). Let \(t_2, t_3, t_4, \ldots \) be a sequence of positive integer steps and for each \(n \ge 2\) let \(U_n\) be uniform measure on the coset of the alternating group \(A_n\) supporting the measure \(\mu _{C_n}^{*t_n}\). For any \(\varepsilon > 0\), if eventually \(t_n \ge (1 +\varepsilon ) \frac{n}{k_n} \log n\), then
$$\begin{aligned} \lim _{n \rightarrow \infty } \left\| \mu _{C_n}^{*t_n} - U_n\right\| _{TV} = 0. \end{aligned}$$
If eventually \(t_n \le (1-\varepsilon ) \frac{n}{k_n} \log n\), then
$$\begin{aligned} \lim _{n \rightarrow \infty } \left\| \mu _{C_n}^{*t_n} - U_n\right\| _{TV} = 1. \end{aligned}$$
This conjecture aims to generalize the mixing time analysis of Diaconis and Shahshahani [4] for the random transposition walk. The formal conjecture seems to have first appeared in [14].
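For very small n the quantity in Conjecture 1 can be computed exactly by brute force, which may help fix ideas. The following is a minimal sketch, not part of the argument of the paper: it assumes permutations are stored as tuples of images, and the helper names (`compose`, `cycle_type`, `sign`, `tv_after_steps`) are illustrative choices of ours. It convolves the uniform measure on k cycles with itself and measures the total variation distance to the uniform measure on the coset of \(A_n\) that supports the walk.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p after q, both given as tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))

def cycle_type(p):
    """Cycle lengths of p in non-increasing order."""
    seen, lengths = set(), []
    for i in range(len(p)):
        if i not in seen:
            length, j = 0, i
            while j not in seen:
                seen.add(j)
                j = p[j]
                length += 1
            lengths.append(length)
    return sorted(lengths, reverse=True)

def sign(p):
    """Sign of p: every cycle of even length contributes a factor -1."""
    return (-1) ** sum(1 for c in cycle_type(p) if c % 2 == 0)

def tv_after_steps(n, k, t_max):
    """Exact TV distance between the t-step k-cycle walk and the uniform
    measure on the supporting coset of A_n, for t = 1, ..., t_max."""
    group = list(permutations(range(n)))
    signs = {g: sign(g) for g in group}
    k_cycles = [g for g in group if cycle_type(g) == [k] + [1] * (n - k)]
    step_sign = signs[k_cycles[0]]
    law = {tuple(range(n)): 1.0}          # walk starts at the identity
    distances = []
    for t in range(1, t_max + 1):
        new_law = {}
        for g, mass in law.items():
            for c in k_cycles:
                h = compose(c, g)
                new_law[h] = new_law.get(h, 0.0) + mass / len(k_cycles)
        law = new_law
        parity = step_sign ** t           # sign of every point in the support
        coset = [g for g in group if signs[g] == parity]
        u = 1.0 / len(coset)
        distances.append(0.5 * sum(abs(law.get(g, 0.0) - u) for g in coset))
    return distances
```

For instance, `tv_after_steps(5, 3, 6)` starts at 2/3, because after one step the walk is uniform on the 20 three-cycles inside \(A_5\), which has 60 elements. Such small cases say nothing about the asymptotic cut-off, of course; they only illustrate the definitions.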
When k grows with n like a constant times n the conjecture is known to be false because the walk mixes too rapidly. This is the work of a number of authors, beginning with [12]; for later results see [10] and references therein. Whenever \(k = o(n)\) the conjecture is expected to hold, however, in part because the proposed lower bound on mixing time follows from standard techniques in the field. We give the second moment method proof in Appendix A.
Since the initial work of Diaconis and Shahshahani, Conjecture 1 has received quite a bit of attention; see [5, 11, 14–17, 20] for cases of conjugacy classes with finite numbers of non-fixed points. A discussion of ongoing work of Schlage-Puchta towards the general conjecture is contained in [18]. See [2] and [19] for a broader perspective.
Recently Berestycki et al. [1] have established the conjecture for any set of k cycles with k fixed as \(n \rightarrow \infty \). Moreover, they assert that their analysis will go through to treat the case of any conjugacy class having bounded total length of non-trivial cycles, and may cover the case when the total cycle length grows like \(o\left( \sqrt{n}\right) \), although they state that they have not checked carefully the uniformity in k with respect to n.
The purpose of this article is to prove Conjecture 1 for the random k cycle walk in the full range \(k = o(n)\).
Theorem 2
Conjecture 1 holds when C is the conjugacy class of all k cycles with k permitted to be any function of n that satisfies \(k = o(n)\) as \(n \rightarrow \infty \). If \(k = o\left( \frac{n}{\log n}\right) \) then the conclusion of Conjecture 1 remains valid when \(\varepsilon = \varepsilon (n)\) is any function satisfying \(\varepsilon (n) \log n \rightarrow \infty \).
Remark
When \(k = o\left( \frac{n}{\log n}\right) \) the window \(\varepsilon (n)\log n \rightarrow \infty \) is essentially best possible.
As remarked above, the lower bound was already known. Prior to [1], which is purely combinatorial, all approaches to Conjecture 1 have followed the work of Diaconis and Shahshahani in passing through bounds for characters on the symmetric group. We return to this previous approach; in particular, we give an asymptotic evaluation of character ratios at a k cycle for many representations, in the range \(k = O\left( n^{\frac{1}{2}-\epsilon }\right) \). The asymptotic formula appears to be new when \(k > 6\), see [9] for the smaller cases. A precise statement of our result appears in Sect. 3, after we introduce the necessary notation.
The basis for our argument is an old formula of Frobenius [6], which gives the value of a character of the symmetric group evaluated at a k cycle as a contour integral of a certain rational function, characterized by the cycle and the representation. In the special case of a transposition this formula was already used by Diaconis and Shahshahani. Previous authors in attempting to extend the result of [4] had also used Frobenius’ formula, but they had attempted to estimate the sum of residues of the function directly, which entails significant difficulties since nearby residues of the function are unstable once the cycle length k becomes somewhat large. We avoid these difficulties by estimating the integral itself rather than the residues in most situations. In doing so, a certain regularity of the character values becomes evident. For instance, while nearby residues appear irregular, by grouping clumps of poles inside a common integral we are able to show that the ‘amortized’ contribution of any pole is slowly varying.
It may be initially surprising that our analysis becomes greatly simplified as k grows. For instance, once k is larger than a sufficiently large constant times \(\log n\), essentially trivial bounds for the contour integral suffice, and the greatest part of the analysis goes into showing that our method can handle the handful of small cases which had been treated previously using the character approach. A similar feature occurs in the related paper [8] where a contour integral is also used to bound character ratios at rotations on the orthogonal group. It seems that the contour method works best when there is significant oscillation in the sum of residues, but is difficult to use in the cases where the residues are generally positive.
In Appendix C we show that Frobenius’ formula has a natural generalization to other conjugacy classes on the symmetric group, but with increasing complexity, since one contour integration enters for each non-trivial cycle. The analysis here applies more broadly than to just the class of k cycles, but we have not pushed the method to its limit because it seems that some new ideas are needed to obtain the full Conjecture 1 when the number of small cycles in the cycle decomposition becomes large. From our point of view, the classes containing \(\ge n^{1-\epsilon }\) 2-cycles would appear to pose the greatest difficulty.
We remark that, as is typical in the character ratio approach, our upper bound is a consequence of a corresponding upper bound for the walk in \(L^2\), so that our result gives a broader set of Markov chains on \(S_n\) to which one can apply the comparison techniques of [3].
Regarding notation, it will occasionally be convenient to use the Vinogradov notation \(A \ll B\), with the same meaning as \(A = O(B)\). The implicit constants in notation of both types should be assumed to vary from line to line.
2 Character theory and mixing times
We recall some basic facts regarding the character theory of a finite group. A good reference for these is [2].
A conjugation-invariant measure \(\mu \) on a finite group G is a class function, which means that it has a ‘Fourier expansion’ expressing \(\mu \) as a linear combination of the irreducible characters X(G) of G. It will be convenient to normalize the Fourier coefficients by setting
By orthogonality of characters, the Fourier expansion takes the form
since, writing \(C_x\) for the conjugacy class of \(x \in G\),
In this setting, the Plancherel identity is
Note the somewhat non-standard factor of the inverse of the dimension \(\chi (1)^{-1}\) in the Fourier coefficients. The advantage of this choice is that the Fourier map satisfies the familiar property of carrying convolution to pointwise multiplication: for conjugation-invariant measures \(\mu _1\) and \(\mu _2\)
This is because the regular representation in \(L^2(G)\) splits as the direct sum over irreducible representations,
and convolution by \(\mu _i\) acts as a scalar multiple of the identity on each representation space \(M_{\rho }\), the scalar of proportionality being \(\hat{\mu }_i(\chi _\rho )\).
When \(G = S_n\) is the symmetric group on n letters, the irreducible representations are indexed by partitions of n, with the partition (n) corresponding to the trivial representation, and partition \((1^n)\) corresponding to the sign. Given a conjugacy class C on \(S_n\) and integer \(t \ge 1\), the measure \(\mu _C^{*t}\) is conjugation invariant, and supported on permutations of a fixed sign, odd if both C and t are odd, and otherwise even. We let \(U_t\) be the uniform measure on permutations of this sign:
with Fourier coefficients
The total variation distance between \(\mu _C^{*t}\) and \(U_t\) is equal to
Thus Cauchy–Schwarz, Plancherel, and the convolution identity give the upper bound
The main ingredient in the proof of the upper bound of Theorem 2 will thus be the following estimate for character ratios.
Proposition 3
Let C be the class of k cycles on \(S_n\). There exists a constant \(\delta > 0\) and a constant \(c_1> 0\) such that uniformly in n, partition \(\lambda \vdash n\) and all \(2 \le k < \delta n\)
For the deduction of Theorem 2 we will use the following technical result on the dimensions of the irreducible representations of \(S_n\).
Proposition 4
For a sufficiently large fixed constant \(c_2>0\),
with the estimate uniform in n.
Deduction of mixing time upper bound of Theorem 2
We use the fact that, apart from the trivial and sign representations, the lowest dimensional irreducible representation of \(S_n\) has dimension \(n-1\) (so \(S_n\) is essentially quasi-random). Let \(c_1, c_2\) be as above, and let \(C>0\). Then for \(t > \frac{n}{k_n}(\log n + c_1)\left( 1 + \frac{c_2 + C}{2\log n}\right) \) the application of Cauchy–Schwarz above gives
\(\square \)
The proof of Proposition 3 will require some more detailed information regarding the characters of \(S_n\) evaluated at a cycle. We discuss this in the next section. Proposition 4 is deduced from a very useful approximate dimension formula of Larsen–Shalev [10]. The rather technical proof of this proposition is given in Appendix B.
3 Character theory of \(S_n\)
The irreducible representations of \(S_n\) are indexed by partitions of n. Given a partition
the dual partition \(\lambda '\) is found by reflecting the diagram of \(\lambda \) along its diagonal. We write \(\chi ^\lambda \) for the character of representation \(\rho ^\lambda \), and \(f^\lambda = \chi ^\lambda (1)\) for the dimension. Let \(\mu = \lambda + (n-1, n-2, \ldots , 0)\). Then the dimension is given by (see e.g. [13])
Setting apart those terms that pertain to \(\lambda _1\), we find
The product is bounded by 1, and \(\sum _{\lambda \vdash n} (f^\lambda )^2 = n!\), so that we have the bound employed by Diaconis–Shahshahani
The trivial representation is \(\rho ^n\) while the sign representation is \(\rho ^{1^n}\), both one-dimensional. \(\rho ^{n-1,1}\) corresponds to the standard representation, which is the irreducible \((n-1)\)-dimensional sub-representation of the representation \(\rho \) in \({\mathbb {R}}^n\) given by
This representation and its dual are the lowest dimensional non-trivial irreducible representations of \(S_n\).
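The dimension facts used in this section are easy to check by machine for small n. The sketch below is an illustration only. It assumes the classical determinantal form of the dimension formula, \(f^\lambda = n! \prod _{i<j}(\mu _i - \mu _j) / \prod _i \mu _i!\) with \(\lambda \) padded by zeros to n parts, which we believe is the formula referred to above; the function names are hypothetical.

```python
from math import factorial, prod

def partitions(n, largest=None):
    """Yield all partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def dimension(lam):
    """f^lambda computed from mu = lambda + (n-1, n-2, ..., 0).

    Assumption: lambda is padded with zeros to exactly n parts.
    """
    n = sum(lam)
    padded = list(lam) + [0] * (n - len(lam))
    mu = [padded[i] + (n - 1 - i) for i in range(n)]
    vandermonde = prod(mu[i] - mu[j] for i in range(n) for j in range(i + 1, n))
    # The quotient is an integer, so floor division is exact here.
    return factorial(n) * vandermonde // prod(factorial(m) for m in mu)
```

With these helpers one can confirm, for each small n, that the squares of the dimensions sum to n!, and that for \(n \ge 5\) the smallest dimension apart from the trivial and sign representations is \(n-1\), attained by \((n-1,1)\) and its dual. The restriction \(n \ge 5\) in this check is ours: for \(n = 4\) the partition (2, 2) has dimension 2.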
Given the character \(\chi ^\lambda \), the character of the dual representation is given by
The characters corresponding to partitions with long first piece have relatively simple interpretations. For instance, if we write \(i_1\) for the number of fixed points and \(i_2\) for the number of 2-cycles in permutation \(\sigma \) then [7, ex. 4.15]
It is now immediate that
Also,
We use these formulas in the proof of the lower bound of Conjecture 1.
Many properties of partitions are most readily evident in Frobenius notation, and we will use this notation to state our main technical theorem. In Frobenius notation we identify the partition \(\lambda \vdash n\) by drawing the diagonal, say of length m, and measuring the legs that extend horizontally to the right and vertically below the diagonal:
Notice that
In Appendix B we consider the quantity \(\Delta \), which is the number of boxes contained neither in the square formed by the diagonal nor in the first row. The notation used here is summarized in Fig. 1.
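To make the notation concrete, here is a short sketch converting a partition to Frobenius coordinates. It assumes the half-integer convention \(a_i = \lambda _i - i + \frac{1}{2}\), \(b_i = \lambda _i' - i + \frac{1}{2}\), which is consistent with the relations \(r = n - \lambda _1 = n - a_1 - \frac{1}{2}\) and \(\sum _i a_i + \sum _i b_i = n\) used later in the paper; the function names are ours.

```python
from fractions import Fraction

def conjugate(lam):
    """Dual partition, obtained by reflecting the diagram along its diagonal."""
    if not lam:
        return ()
    return tuple(sum(1 for part in lam if part > i) for i in range(lam[0]))

def frobenius(lam):
    """Return (a, b), the half-integer Frobenius coordinates of lam.

    Assumed convention: a_i = lam_i - i + 1/2 and b_i = lam'_i - i + 1/2
    for i = 1, ..., m, where m is the length of the diagonal.
    """
    half = Fraction(1, 2)
    dual = conjugate(lam)
    m = sum(1 for i, part in enumerate(lam) if part > i)  # diagonal length
    a = [lam[i] - i - half for i in range(m)]
    b = [dual[i] - i - half for i in range(m)]
    return a, b
```

For example, the partition (5, 2) of 7 has diagonal length 2 and coordinates \(\left( \frac{9}{2}, \frac{1}{2} \,\big |\, \frac{3}{2}, \frac{1}{2}\right) \), whose entries sum to 7.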
We now state our asymptotic evaluation of the character ratio.
Theorem 5
Let n be large, let \(2 \le k \le n\), and let C be the class of k cycles on \(S_n\). Let \(\lambda \vdash n\) be a partition of n with Frobenius notation \(\lambda = (a_1, \ldots , a_m|b_1, \ldots , b_m)\).
(a) (Long first row) Let \(0 < \epsilon < \frac{1}{2}\), let \(r = n-\lambda _1\) and suppose that \(r \,+\, k \,+\, 1 < \left( \frac{1}{2} - \epsilon \right) n\). Then
$$\begin{aligned} \frac{\chi ^\lambda (C)}{f^\lambda }&= \frac{(a_1 - \frac{1}{2})^{\underline{k}}}{n^{\underline{k}}} \prod _{j=2}^m \frac{a_1 -a_j - k}{a_1 - a_j} \prod _{j=1}^m \frac{a_1 + b_j}{a_1 + b_j-k} \\&\quad + O_\epsilon \left( \exp \left( k \left[ \log \frac{(1 + \epsilon )(k+1+r)}{n-k} + O_\epsilon \left( r^{-\frac{1}{2}}\right) \right] \right) \right) . \end{aligned} \tag{9}$$
If \(r < k\) then the error term is actually 0.
(b) (Large k, short first row and column) Let \(\theta > \frac{2}{3}\). There exists \(\epsilon (\theta )>0\) such that, for all n sufficiently large, for all k with \(6\log n \le k \le \epsilon n\) and for all \(\lambda \vdash n\) such that \(b_1 \le a_1 \le e^{-\theta }n\),
$$\begin{aligned} \left| \frac{\chi ^\lambda (C)}{f^\lambda }\right| \le e^{\frac{-k}{2}}. \end{aligned}$$
(c) (Asymptotic expansion) Let \(0<\epsilon < \frac{1}{2}\) and suppose now that \(k < n^{\frac{1}{2}-\epsilon }\). We have the approximate formula
$$\begin{aligned} \frac{\chi ^\lambda (C)}{f^\lambda }&= \sum _{a_i > kn^{\frac{1}{2}}} \frac{a_i^k}{n^k}\left( 1 + O_\epsilon \left( \frac{kn^{\frac{3}{4}+\epsilon }}{a_i}\right) \right) \\&\quad + (-1)^{k-1}\sum _{ b_i > kn^{\frac{1}{2} }} \frac{b_i^k}{n^k}\left( 1 + O_\epsilon \left( \frac{kn^{\frac{3}{4}+\epsilon }}{b_i}\right) \right) \\&\quad + O_\epsilon \left( n^{\frac{1}{2}}(\log n)^2 \left( \frac{k \log ^2 n}{\sqrt{n}}\right) ^k\right) . \end{aligned} \tag{10}$$
This theorem is the main technical result of the paper. In fact it gives more than we will need: we apply the detailed statement of part (c) only when the cycle length k is relatively short, \(k \le 6 \log n\). The cruder bounds of parts (a) and (b) suffice when the cycle length k is larger.
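Part (a) asserts an exact formula when \(r < k\), which can be tested independently in small cases. The sketch below is a sanity check under stated assumptions, not the method of this paper. It computes \(\chi ^\lambda (C)\) by the Murnaghan–Nakayama rule (removing one rim hook of length k, encoded through beta-numbers) and compares the ratio with the main term of (9). The half-integer Frobenius convention and all function names are assumptions made for the illustration.

```python
from fractions import Fraction
from math import factorial, prod

HALF = Fraction(1, 2)

def beta_set(lam):
    """Beta-numbers lam_i + (L - i), i = 1..L, with L the number of parts."""
    return [part + (len(lam) - 1 - i) for i, part in enumerate(lam)]

def dim_from_beta(beta):
    """Dimension of the partition encoded by a beta-set (zero parts allowed)."""
    b = sorted(beta, reverse=True)
    size = len(b)
    n = sum(b) - size * (size - 1) // 2
    vandermonde = prod(b[i] - b[j] for i in range(size) for j in range(i + 1, size))
    return factorial(n) * vandermonde // prod(factorial(x) for x in b)

def character_at_k_cycle(lam, k):
    """chi^lam at a k-cycle: strip one rim hook of length k, then take dimensions."""
    beta = beta_set(lam)
    total = 0
    for x in beta:
        y = x - k
        if y >= 0 and y not in beta:
            leg = sum(1 for z in beta if y < z < x)  # leg length of the hook
            remaining = [z for z in beta if z != x] + [y]
            total += (-1) ** leg * dim_from_beta(remaining)
    return total

def character_ratio(lam, k):
    """Exact value of chi^lam(C) / f^lam for the class C of k-cycles."""
    return Fraction(character_at_k_cycle(lam, k), dim_from_beta(beta_set(lam)))

def frobenius(lam):
    """Half-integer Frobenius coordinates (assumed convention, see lead-in)."""
    m = sum(1 for i, part in enumerate(lam) if part > i)
    dual = [sum(1 for part in lam if part > i) for i in range(m)]
    a = [lam[i] - i - HALF for i in range(m)]
    b = [dual[i] - i - HALF for i in range(m)]
    return a, b

def falling(z, k):
    """Falling power z(z-1)...(z-k+1)."""
    return prod(z - i for i in range(k))

def main_term(lam, k):
    """Main term of (9); requires a_1 + b_j != k for every j."""
    n = sum(lam)
    a, b = frobenius(lam)
    value = Fraction(falling(a[0] - HALF, k)) / falling(n, k)
    for aj in a[1:]:
        value *= (a[0] - aj - k) / (a[0] - aj)
    for bj in b:
        value *= (a[0] + bj) / (a[0] + bj - k)
    return value
```

For \(\lambda = (5,2)\) and \(k = 3\), both `character_ratio` and `main_term` return 1/7, in agreement with the classical values \(\chi ^\lambda (C) = 2\) and \(f^\lambda = 14\).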
4 Proof of Theorem 5
When C is the class of k cycles, Frobenius [6] (see [7, p. 52 ex. 4.17 b]) proved a famous formula that expresses the character ratio of a given representation at C as the ‘residue at \(\infty \)’ of a meromorphic function depending upon the representation. In Frobenius notation, and using the falling power
$$\begin{aligned} z^{\underline{k}} = z(z-1)\cdots (z-k+1), \end{aligned}$$
the formula is
where the integration has winding number 1 around each (finite) pole of the integrand. [Note that our \((a_i, b_i)\) correspond to \(\left( b_{m-i+1} + \frac{1}{2}, a_{m-i + 1} + \frac{1}{2}\right) \) of [7], and replace y there with our \(z = y - \frac{k-1}{2}\).]
Our proof of Theorem 5 is an asymptotic estimation of this integral. We first record several properties of \(F_k^{a,b}\) in the following lemma.
Lemma 6
Let \(\lambda = (a_1, \ldots , a_m|b_1, \ldots , b_m)\) be a partition of n in Frobenius notation.
(1) Each of \(F_k^{a}(z)\) and \(F_k^b(z)\) has at most \(\sqrt{n}\) poles.
(2) If \(k > n\) then \(F_k^{a,b}(z)\) is holomorphic.
Denote by \(\lambda ' = \lambda {\setminus }\lambda _1 = (a_1', \ldots , a_{m'}'|b_1', \ldots , b_{m'}')\) the partition of \(n-\lambda _1\) found by deleting the first row of \(\lambda \).
(3) We have
$$\begin{aligned} F_k^{a,b}(z) \frac{z- a_1 +\frac{k}{2}}{z-a_1 - \frac{k}{2}} = F_k^{a',b'}(z+1). \end{aligned}$$
(4) If \(a_1 > n-k - \frac{1}{2}\) and if \(n \ge 2k\) then \(F_k^{a,b}(z)\) has a single simple pole at \(z= a_1 - \frac{k}{2}\).
Proof
The first item follows from the bound for the diagonal \(m \le \sqrt{n}\), since both \(F_k^a\) and \(F_k^b\) are products of m terms.
For (2), first observe that \(a_i\) and \(b_i\) are strictly decreasing so that the poles of \(F_k^a(z)\) are all simple, as are the poles of \(F_k^b(z)\). Since \(k > n\), \(a_i + b_j - k < 0\), so \(a_i - \frac{k}{2} \ne \frac{k}{2} - b_j\). Thus the poles of \(F_k^a(z)F_k^b(z)\) are all simple, and are all cancelled by the factor of \(\left( z + \frac{k-1}{2}\right) ^{\underline{k}}\).
For (3), observe that deleting the first row of the diagram for \(\lambda \) shifts the diagonal down one square. Thus, if \(b_m = \frac{1}{2}\) then \(m' = m-1\) and
In this case, accounting for the lost factor from \(b_m\),
If instead \(b_m > \frac{1}{2}\) then \(a_m = \frac{1}{2}\), which implies \(m' = m\), and
Thus (12) still holds, the new factor accounting for the introduction of \(a_{m}' = \frac{1}{2}\). Thus, in either case,
For (4), the condition \(a_1 > n-k - \frac{1}{2}\) and \(n \ge 2k\) implies that \(a_1 > k -\frac{1}{2}\). Thus, for all i, \(a_1 + b_i > k\), so that \(a_1 - \frac{k}{2} \ne \frac{k}{2} - b_i\), and \(a_1 -\frac{k}{2}\) is a simple pole of \(F_k^a(z)F_k^b(z)\). Furthermore, it is not cancelled by \(\left( z+\frac{k-1}{2}\right) ^{\underline{k}}\), so that \(a_1 - \frac{k}{2}\) is a simple pole of \(F_k^{a,b}(z)\). It is the only pole, since \(F_k^{a',b'}(z)\) has \(n-\lambda _1 < k\), hence is holomorphic. \(\square \)
Parts (a) and (b) of Theorem 5 are more easily proven. We give these proofs immediately, and then prove several lemmas before proving part (c).
Proof of Theorem 5 part (a)
In this part, the first row \(\lambda _1\) of partition \(\lambda \) is significantly larger than the remainder of the partition, of size r. We extract a residue contribution from the pole corresponding to \(\lambda _1\), and bound the remainder of the character ratio by approximating it with a character ratio on \(S_r\).
Observe that \(\sum _{i>1} a_i + \sum _i b_i =n-a_1 = r + \frac{1}{2}\). Recall that we assume \(r + k + 1 \le (\frac{1}{2}-\epsilon )n\), and that this part of the theorem is the asymptotic
with the error equal to 0 if \(r <k\).
The poles of \(F_k^{a,b}\) are among the points \(a_i - \frac{k}{2}\), \(\frac{k}{2} - b_i\), \(i = 1, \ldots , m\), some of which may be cancelled by the numerator. The condition \(k+ 1 + r \le \left( \frac{1}{2}-\epsilon \right) n\) guarantees that
Since \(a_i, b_j \le r+\frac{1}{2}\) for \(i \ge 2\), \(j \ge 1\), it follows that \(a_1 - \frac{k}{2}\) is the furthest pole from zero of the function \(F^a_k(z)F^b_k(z)\), and that this is a simple pole of \(F_k^{a,b}(z)\).
Set \(R = (1 + \epsilon )\left( r + \frac{k+1}{2}\right) \) and notice
so that \(a_1 - \frac{k}{2}\) is outside the loop \(|z| = R\), while all other poles are inside. Thus
The residue term is equal to the main term of the theorem, so it remains to bound the integral.
Recall the notation \(\left( a_1', \ldots , a_{m'}'|b_1', \ldots , b_{m'}'\right) = \lambda {\setminus } \lambda _1\), and that
Thus we express the integral of (13) as
When \(k > r\), \(F_k^{a',b'}(z+1)\) is holomorphic, and so (14) is zero, which proves the latter claim of the theorem. Thus we may now assume that \(r \ge k\).
On the contour \(|z| = R\),
so that \(\left| \frac{z- a_1 - \frac{k}{2}}{z-a_1 + \frac{k}{2}} -1\right| = O_\epsilon (\frac{k}{n}).\) Thus (14) is equal to (\(C'\) is the k cycle class on \(S_r\))
We bound the first term by \(\frac{r^{\underline{k}}}{n^{\underline{k}}}\), since all character ratios are bounded by 1.
To bound the integral, note that \(F^{a'}_k(z)\) and \(F^{b'}_k(z)\) each have at most \(\sqrt{r}\) poles. On the contour \(|z| = R\),
and the terms in \(F^{b'}_k(z)\) are bounded similarly. It follows that on \(|z| = R\),
Meanwhile, also on \(|z| = R\),
Putting these bounds together, we deduce a bound for the second term of (15) of
To complete the proof of the theorem, use \(n^{\underline{k}} \ge (n-k)^k\) and note that the term \(\frac{r^{\underline{k}}}{n^{\underline{k}}}\) from the character ratio in (15) is trivially absorbed into this error term. \(\square \)
Proof of Theorem 5 part (b)
Choose \(\theta _1\) with \(\frac{2}{3} < \theta _1 < \theta \). Since the poles of \(F^{a,b}_k\) are among \(a_i - \frac{k}{2}\) and \( \frac{k}{2}- b_i\), \(i = 1, 2, 3, \ldots \), choosing \(\epsilon = \epsilon (\theta )>0 \) sufficiently small, \(a_i, b_i <e^{-\theta }n\) and \(k \le \epsilon n\) guarantees that the contour \(|z| =R = e^{-\theta _1} n\) contains all poles of \(F^{a,b}_k(z)\). Thus
Write \( \frac{1}{k n^{\underline{k}}} F^{a,b}_k(z) =\frac{\left( z + \frac{k-1}{2}\right) ^{\underline{k}}}{k n^{\underline{k}}} F^a_k(z)F^b_k(z)\). Since \(k \le \epsilon n\), for \(|z| = R\),
with o(1) indicating a quantity that may be made arbitrarily small with a sufficiently small choice of \(\epsilon \). Also, \(F^a_k(z)\) and \(F^b_k(z)\) are each composed of at most \( \sqrt{n}\) factors, each of size at most \(1 + O_{\theta ,\theta _1,\epsilon }\left( \frac{k}{n}\right) \). We deduce that for \(|z| = R\),
the error term requiring that first \(\epsilon \) be small, and then that n be large. The length of the contour is \(O(n)=O \left( \exp (\frac{k}{6})\right) \). Since \(\theta _1 > \frac{1}{2} + \frac{1}{6}\), it follows that the bound \(\exp \left( -\frac{k}{2}\right) \) holds for the character ratio for all n sufficiently large. \(\square \)
We now turn to part (c) of Theorem 5. We will again estimate the integral
but the idea now will be to evaluate clumps of poles together. With an appropriate choice of contour, at any given point the poles that are sufficiently far away make an essentially constant contribution to the integrand, and so may be safely removed, simplifying the integral.
Denote by
the multi-set (that is, set counted with multiplicities) of poles of \(F_k^{a}(z)\) and \(F_k^b(z)\). One trivial fact concerning the distribution of poles in \({\mathcal {P}}\) is the following bound.
Lemma 7
For any \(x > 0\),
Indeed, this follows from the fact that
A useful consequence is that for x real, \(|x| > k\sqrt{n}\), we can always find a real point y nearby x with large distance from all poles in \({\mathcal {P}}\).
Definition 1
Let \(x \in {\mathbb {R}}{\setminus } \{0\}\) and let \(L>0\) be a parameter. We say that a real number y is L-well-spaced for x with respect to \({\mathcal {P}}\) if the following bound is satisfied
The following lemma says that if |x| is sufficiently large, then there are always many points y nearby x that are well-spaced for x with respect to \({\mathcal {P}}\).
Lemma 8
Let \(n > e^5\) and let \(1<k < \sqrt{n}\). Let x be real with \(|x| > k \sqrt{n}\log n\). Then there exists a real y with \(|y-x| < \sqrt{n}\) which is \(4 \sqrt{n} \log n\)-well-spaced for x with respect to \({\mathcal {P}}\).
Proof
Let I be the interval \(I = \{y: |y-x| < \sqrt{n}\}\). By Lemma 7, the interval I contains at most
poles of \({\mathcal {P}}\). Deleting the segment of radius k around each pole in \({\mathcal {P}}\cap I\) removes a set of total length at most
leaving \(I'\) with \(|I'|\ge \frac{|I|}{2} = \sqrt{n}\). Now
Applying the cardinality bound of Lemma 7 a second time (recall \(|I'|\ge \sqrt{n}\)) we obtain a bound for (18) of
It follows that a typical point \(y \in I'\) is \(4 \sqrt{n}\log n\)-well-spaced for x. \(\square \)
Recall that \(k \le n^{\frac{1}{2}-\epsilon }\). We now assume, as we may, that n is sufficiently large to guarantee \(k \le \frac{\sqrt{n}}{(\log n)^2}\) and we choose a sequence of well-spaced points at which we partition our integral.
To find our sequence of points, first set
and
for those j such that \(x_j \le n + 4 \sqrt{n}\). In each interval
apply Lemma 8 to find \(y_j^+\), a \(4\sqrt{n}\log n\)-well-spaced point for \(x_j\). Also find a point \(y_j^-\) in each interval \(\left( -x_j-\sqrt{n}, - x_j + \sqrt{n}\right] \) which is \(4\sqrt{n}\log n\)-well-spaced for \(-x_j\). Thus we have the disjoint intervals
which together cover
Let \(q_j^+\) (resp. \(q_j^-, q_0\)) be the number of poles of \({\mathcal {P}}\) contained in \(I_j^+\), (resp. \(I_j^-, I_0\)).
Around each interval \(I_j^+,\) \(j \ge 1\) we draw a rectangular box
and similarly around \(I_j^-\). If \(q_j^\pm = 0\) then the box may be discarded. We also draw a large box \({\mathcal {B}}_0\) containing the origin, with endpoints at \(y_0^- \pm i k\sqrt{n}, y_0^+ \pm i k\sqrt{n}\). A permissible contour with which to apply integral formula (16) is given by
each box being positively oriented, see Fig. 2. We use a somewhat shortened version of this contour.
In what follows we treat only the integrals around the boxes \({\mathcal {B}}_j^+\) and \({\mathcal {B}}_0\), and we drop superscripts (so we write e.g. \(q_j\) for \(q_j^+\), \(I_j\) for \(I_j^+\) etc). The argument may be carried out symmetrically for \({\mathcal {B}}_j^-\). The main proposition is as follows.
Proposition 9
Let \(j \ge 1\). There exists a piecewise linear contour \({\mathcal {C}}_j\), having total length
such that \({\mathcal {C}}_j\) has winding number 1 around each pole \(p \in {\mathcal {P}} \cap {\mathcal {B}}_j\) and winding number 0 around all poles \(p' \in {\mathcal {P}}{\setminus } {\mathcal {B}}_j\). For \(z\in {\mathcal {C}}_j\), \(F_k^{a,b}(z)\) satisfies
Proof
Recall that \({\mathcal {B}}_j\) is a box containing \(q_j\) poles and surrounding interval \(I_j = [y_{j-1}, y_j)\), with corners at
Recall also that \(y_j \in \left[ x_j - \sqrt{n}, x_j+\sqrt{n}\right) \) and \(x_j - x_{j-1} = 2\sqrt{n}\), so that \(|y_{j} - y_{j-1}| \le 4 \sqrt{n}\) (Fig. 2).
To form \({\mathcal {C}}_j\) we shorten the contour \(\partial {\mathcal {B}}_j\). At each pole \(p \in {\mathcal {B}}_j\) consider the segment
If there exists a subinterval J of interval \(I_j\) that is not covered by \(\cup _{p \in {\mathcal {P}}\cap {\mathcal {B}}_j}S_p\) then such J may be discarded from the box \({\mathcal {B}}_j\) by drawing vertical segments at the endpoints of J and deleting the parts of \({\mathcal {B}}_j\) vertically above and below J. Deleting subintervals in this way if it reduces the total perimeter, or allowing them to remain when the perimeter is increased, we arrive at a contour \({\mathcal {C}}_j\) which is the union of some rectangles, and has total length at most
the first term bounding the result of doing nothing, and the second bounding the length of a contour with an individual box around every pole, see Fig. 3 for a schematic of the construction of \({\mathcal {C}}_j\). Using \(\min (a,b) \le \sqrt{ab}\), we have the bound
We now prove the formula (19). Observe that by Lemma 7, the number of poles \(q_j\) in \({\mathcal {P}} \cap {\mathcal {B}}_j\) satisfies \(q_j \ll \frac{n}{x_j} \ll \frac{\sqrt{n}}{k}\), so that each point of \({\mathcal {C}}_j\) has distance \(O\left( \sqrt{n}\right) \) from \(x_j\). Thus
In particular, it follows that
We prove
and
Inserted in (20), these combine to prove the formula (19) (use \(\left( x_j + \frac{k-1}{2}\right) ^{\underline{k}} \le x_j^k\)).
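The inequality quoted in parentheses holds because the k factors of the shifted falling power are \(x_j + s\) with s running over \(-\frac{k-1}{2}, \ldots , \frac{k-1}{2}\), and pairing s with \(-s\) gives \((x_j+s)(x_j-s) = x_j^2 - s^2 \le x_j^2\) once \(x_j \ge \frac{k-1}{2}\). A small exact check, assuming the standard definition of the falling power and with function names of our choosing:

```python
from fractions import Fraction
from math import prod

def falling_power(z, k):
    """z(z-1)...(z-k+1): the falling power with k factors."""
    return prod(z - i for i in range(k))

def shifted_falling_power(x, k):
    """(x + (k-1)/2) to the falling power k, i.e. the product of x + s over
    s = -(k-1)/2, ..., (k-1)/2 in unit steps."""
    return falling_power(x + Fraction(k - 1, 2), k)

def inequality_holds(x, k):
    """Check the bound used in the text for one pair (x, k) with x >= (k-1)/2."""
    return shifted_falling_power(x, k) <= Fraction(x) ** k
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.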
We first consider (21). This estimate is equivalent to
the sign determined according as p is a pole from \(F_k^a\) or \(F_k^b\). Say a pole \(p \in {\mathcal {P}} {\setminus } {\mathcal {B}}_j\) is ‘good’ if \(|p-x_j| > \frac{x_j}{2}\) and ‘bad’ otherwise. For \(z \in {\mathcal {C}}_j\),
if n is sufficiently large, which implies that for good p, \(|z-p| \ge \frac{x_j}{6}\). Since the total number of poles is \(O\left( \sqrt{n}\right) \), the product over good poles evidently satisfies the bound
Since it is not contained in the box \({\mathcal {B}}_j\), a bad pole p necessarily lives in the complement \({\mathbb {R}}{\setminus } [y_{j-1},y_j]\), and either satisfies \(p > y_j\) and \(|p- x_j|\le \frac{x_j}{2}\) or else \(p < y_{j-1}\), which entails \(\frac{x_j}{2} \le p \le x_j\). Since \(\frac{x_{j-1}}{2} < \frac{x_j}{2}\), and \(\frac{3 x_{j-1}}{2} > x_j\) (at least if n is sufficiently large), it follows that in the second case, \(|p - x_{j-1}| \le \frac{x_{j-1}}{2}\). Thus, for \(z \in {\mathcal {C}}_j\), the fact that \(y_{j-1} \le \mathfrak {R}(z) \le y_j\) implies
and
by invoking the \(4 \sqrt{n}\log n\) well-spaced property of \(y_{j-1}\) and \(y_j\). It follows
proving (21).
To prove (22) we consider two cases. When either \(\mathfrak {R}(z) = y_{j-1}\) or \(\mathfrak {R}(z) = y_j\), observe that for each \(p \in {\mathcal {P}} \cap {\mathcal {B}}_j\),
Also, since \(x_j-x_{j-1} = 2\sqrt{n}\),
Thus, invoking the well-spaced property of \(y_{j-1}\) and \(y_j\),
so that if \(\mathfrak {R}(z) = y_{j-1}\),
and similarly when \(\mathfrak {R}(z) = y_j\).
When \(\mathfrak {R}(z) \not \in \{y_{j-1}, y_j\}\) then \(|z-p| \ge k q_j\) for each \(p \in {\mathcal {P}}\cap {\mathcal {B}}_j\), so that in this case the bound
is immediate from \(|{\mathcal {P}} \cap {\mathcal {B}}_j| = q_j\). Thus (22) holds in either case. \(\square \)
We now bound integration around the box \({\mathcal {B}}_0\).
Lemma 10
Assume \(n > e^5\). We have
Proof
Since \({\mathcal {B}}_0\) is a box with corners at \(y_0^{\pm } \pm ik\sqrt{n}\), the contour of integration has length \(O\left( k\sqrt{n}(\log n)^2\right) \), so it will suffice to prove
Recall that
Recall also that \(\left| y_0^{\pm } \mp x_0\right| \le \sqrt{n}\) and \(x_0 = \frac{k}{2}\sqrt{n}(\log n)^2\). Thus, on \(\partial {\mathcal {B}}_0\) we have \(|\mathfrak {R}z| \le \frac{k}{2}\sqrt{n} (\log n)^2 + \sqrt{n}\). Hence, using \(k < \sqrt{n}\),
the last bound requiring \(k+2 \le \frac{k}{2}(\log n)^2\), which plainly holds for \(n > e^5\). Since \(n^{\underline{k}} \gg n^k\) when \(k < \sqrt{n}\), we obtain the required bound for \(\sup _{z \in \partial {\mathcal {B}}_0} \left| \frac{F_{k}^{a,b}(z)}{kn^{\underline{k}}}\right| \) by checking that
To do so, for each \(p \in {\mathcal {P}}\) write the corresponding factor of \(F^a_k(z)\) or \(F^b_k(z)\) as
For a given \(z \in \partial {\mathcal {B}}_0\), say the pole \(p \in {\mathcal {P}}\) is ‘good’ for z if \(|z-p| \ge k \sqrt{n}\). Since the total number of poles in \({\mathcal {P}}\) is \(O\left( \sqrt{n}\right) \), the part of the product in \(F^a_k(z)F^b_k(z)\) contributed by good poles is bounded by O(1), so we may consider only the bad poles.
Bad poles exist only if z is on one of the two vertical sides of the box, that is \(\mathfrak {R}(z) = y_0^{\pm }\). Notice that, e.g.
(similarly if p is near \(y_0^-\)), and so bad poles satisfy \(\min (|p + x_0|, |p - x_0|) < \frac{x_0}{2}\).
By the well-spaced property of \(y_0^\pm \), when \(\mathfrak {R}(z) = y_0^\pm \) we have
It follows that
\(\square \)
We now have all of the estimates that we need to complete the proof of Theorem 5. We use the following integral formula.
Lemma 11
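Let \(p_1, p_2, \ldots , p_s \in {\mathbb {C}}\) be a sequence of poles and let d be an arbitrary complex number. Then
$$\begin{aligned} I(p_1, \ldots , p_s, d) := \frac{1}{2\pi i}\oint \prod _{j=1}^s \frac{z - p_j + d}{z - p_j} \, dz = sd, \end{aligned}$$
where the contour is such that it has winding number one about each pole.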
Proof
The integral is independent of the appropriate contour, so take the contour to be a large loop containing the poles. We may write the integrand as \(\prod _{j=1}^s \left( 1 + \frac{d}{z-p_j}\right) .\) Differentiating the integrand with respect to \(p_j\) results in a rational function having total degree \(-2\). It follows that \(\frac{\partial }{\partial p_j} I(p_1, \ldots , p_s, d) = 0\) for each j, since the integral can be made arbitrarily small by taking the contour to be sufficiently large. Taking \(p_j = 0\) for each j, we find that the residue at 0 is sd. \(\square \)
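The evaluation in Lemma 11 is easy to confirm numerically. The sketch below is an illustration only: it approximates \(\frac{1}{2\pi i}\oint \prod _{j=1}^s \left( 1 + \frac{d}{z-p_j}\right) dz\) over a circle \(|z| = R\) enclosing all the poles by the trapezoid rule, which converges very rapidly for a periodic analytic integrand. The function name, the sample poles and the radius are arbitrary choices of ours.

```python
import cmath
from math import pi

def normalized_contour_integral(poles, d, radius, samples=512):
    """Approximate (1 / (2 pi i)) * integral over |z| = radius of
    prod_j (1 + d / (z - p_j)) dz by the trapezoid rule.

    With z = radius * exp(i theta) we have dz = i z dtheta, so the
    normalized integral is the average of integrand(z) * z over the circle.
    """
    total = 0j
    for m in range(samples):
        z = radius * cmath.exp(2j * pi * m / samples)
        integrand = 1 + 0j
        for p in poles:
            integrand *= 1 + d / (z - p)
        total += integrand * z
    return total / samples
```

With four poles (one of them repeated) and \(d = 0.7 - 0.2i\), the computed value agrees with \(sd = 4d\) to within rounding error, independently of where the poles sit inside the circle.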
Proof of Theorem 5 part (c)
Write
The contribution from \(\partial {\mathcal {B}}_0\) is bounded in Lemma 10 by
which may be absorbed into the error term of the theorem.
For each \(j \ge 1\), write the contribution from the \(q_j^+\) poles inside \({\mathcal {C}}_j^+\) as
For the main term of this integral, Lemma 11 gives the evaluation
the error resulting since each of the k terms of \(\left( x+\frac{k-1}{2}\right) ^{\underline{k}}\) is equal to \(a_i + O(\sqrt{n})\)—we use that \(a_i = \Omega \left( k\sqrt{n}\right) \) here. Had we considered \({\mathcal {C}}_j^-\) rather than \({\mathcal {C}}_j^+\) this main term would involve a sum over the \(b_i\), with an appropriate sign factor.
The integral of the error term over \({\mathcal {C}}_j\) is bounded trivially by using \(\left| {\mathcal {C}}_j\right| = O\left( k q_j n^{\frac{1}{4}}\right) \), which gives
The ratio of the error from \({\mathcal {C}}_j\) to the corresponding main term is
and since each \(a_i\) in the sum above is within constants of \(x_j\), this proves the Theorem.
Note that in the claimed formula of the Theorem, the sums over \(a_i\) and \(b_j\) contain terms for which \(k\sqrt{n} < a_i \le y_0^+ + \frac{k}{2}\) and \(k \sqrt{n} < b_j \le -y_0^- + \frac{k}{2}\), which do not appear in our evaluation. However, only a bound, rather than an asymptotic, is claimed in this range, so that the theorem is valid with the extra terms discarded. \(\square \)
5 Proof of upper bound for Theorem 2
We now prove the upper bound on mixing time from Theorem 2 by proving Proposition 3. Recall that this proposition is a bound for character ratios at the class C of k-cycles,
uniformly for k less than a fixed constant times n and for all \(\lambda \vdash n\).
Before proving the estimate, we collect together several observations. First note that the character ratio bound is trivial for the one-dimensional representations \(\lambda = (n), (1^n)\). Also, exchanging \(\lambda \) with its dual \(\lambda '\) leaves the dimension \(f^\lambda \) unchanged, and at most changes the sign of \(\chi ^\lambda (C)\), so we will assume without loss of generality that \(\lambda \) has \(a_1 \ge b_1\). We set \(r = n-\lambda _1 = n-a_1 - \frac{1}{2}\).
Lemma 12
(Criterion lemma) To prove the character ratio bound (23), it suffices to prove that, for all n larger than the maximum of k and a fixed constant, for a sufficiently large \(c > 0\), and for all non-trivial \(\lambda \) with \(a_1 \ge b_1\),
Proof
By the bound (6) for the dimension, we have (use \(\log r! \ge r \log \frac{r}{e}\))
so that a bound of
is sufficient.
On the other hand, for all \(\lambda \), \(f^\lambda \le \sqrt{n!}\), so that a bound of
also suffices. \(\square \)
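Two elementary facts used in this proof are easy to confirm by computer for small parameters: the inequality \(\log r! \ge r \log \frac{r}{e}\), and the bound \(f^\lambda \le \sqrt{n!}\), which follows from \(\sum _{\lambda \vdash n} (f^\lambda )^2 = n!\). The sketch below computes \(f^\lambda \) with the hook length formula, which is a standard substitute chosen here for convenience and is not the dimension bound (6) of the text.

```python
from math import factorial, lgamma, log

def partitions(n, largest=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def dimension(shape):
    """Dimension f^lambda via the hook length formula (an assumption of this sketch)."""
    n = sum(shape)
    # conjugate[j] = number of rows of length greater than j
    conjugate = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1
            leg = conjugate[j] - i - 1
            hooks *= arm + leg + 1
    return factorial(n) // hooks

# Sum of squared dimensions equals n!, hence every f^lambda is at most sqrt(n!).
sum_of_squares_ok = all(
    sum(dimension(p) ** 2 for p in partitions(n)) == factorial(n)
    for n in range(1, 13)
)
sqrt_bound_ok = all(
    dimension(p) ** 2 <= factorial(n)
    for n in range(1, 13)
    for p in partitions(n)
)
# log r! >= r log(r/e), with log r! computed as lgamma(r + 1).
stirling_ok = all(lgamma(r + 1) >= r * (log(r) - 1) for r in range(1, 2001))
print(sum_of_squares_ok, sqrt_bound_ok, stirling_ok)
```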
Our proof splits into two cases depending upon whether \(k \ge 6\log n\). The essential tool in both cases will be the evaluation of the character ratio from Theorem 5. For large k we will use only parts (a) and (b) of that theorem. Recall that part (a) had a main term equal to
Lemma 13
Assume \(k + r+1 < \frac{n}{2}\). We have the bound \( {\text {MT}}\le \exp \left( \frac{-kr}{n}\right) . \)
Proof
Recall \(a_1 = n-r-\frac{1}{2}\). We estimate
\(\square \)
Proof of Proposition 3 when \(6 \log n \le k \le \delta n\)
When \(r > 0.49 n\) we have \(a_1 < 0.51 n\). Let \(\theta = 0.67\), and note that \(e^{-\theta } > 0.511\). Thus \(a_1 < e^{-\theta }n\) so that part (b) of Theorem 5 guarantees that there exists \(\delta =\epsilon (0.67)>0\), such that, for n larger than a fixed constant, for all \(6 \log n \le k \le \delta n\), \(\left| \frac{\chi ^\lambda (C)}{f^\lambda }\right| \le e^{\frac{-k}{2}}.\) Thus the second condition of Lemma 12 is satisfied.
So we may suppose that \(r \le 0.49 n\) and appeal to part (a) of Theorem 5. Suppose that \(\delta \) is sufficiently small so that \(r + k + 1 < \frac{n}{2}\). If \(r < k\) then the error term of part (a) of Theorem 5 is zero so that the previous lemma implies
Thus the first criterion of Lemma 12 is satisfied with \(c = 0\).
Assume now that \(r \ge k\). Now let \(\delta \) be sufficiently small so that we may choose \(\epsilon = \frac{1}{200}\) in part (a), that is, \(r + k + 1 \le \left( \frac{1}{2} - \frac{1}{200}\right) n\), and also assume that \(\frac{\left( 1 + \epsilon \right) \left( k + r+1\right) }{n-k} < 0.5 - \eta \) for some fixed \(\eta > 0\). Then part (a) gives \(\left| \frac{\chi ^\lambda (C)}{f^\lambda }\right| \le {\text {MT}}+ {\text {ET}}\) with
Since \(r < \frac{n}{2}\) we deduce that
Since \(k \ge 6 \log n\) and \(6(\log 2 - .5) >1.15\) we deduce
so that the first criterion of Lemma 12 is satisfied. \(\square \)
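The numerical constants quoted in this proof, namely \(e^{-\theta } > 0.511\) for \(\theta = 0.67\) and \(6(\log 2 - 0.5) > 1.15\), can be confirmed directly. This is only an arithmetic check of the stated inequalities; it says nothing about the estimates they feed into.

```python
from math import exp, log

theta = 0.67  # the value of theta chosen in the proof

checks = {
    # a_1 < 0.51 n must imply a_1 < exp(-theta) n
    "exp(-theta) > 0.511": exp(-theta) > 0.511,
    # constant used together with k >= 6 log n
    "6 * (log 2 - 0.5) > 1.15": 6 * (log(2) - 0.5) > 1.15,
}

for name, holds in checks.items():
    print(name, holds)
```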
When \(k \le 6\log n\) we make essential use of the asymptotic evaluation of the character ratio proved in part (c) of Theorem 5. The next lemma shows that we may restrict attention to only the main term of that evaluation.
Lemma 14
(Small k criterion lemma) Let \(2 \le k \le 6 \log n\). We have the bound
In particular, if \(r > n^{\frac{5}{6}}\) and n is sufficiently large, and if
then after changing constants, the same estimate holds for \(\log \left| \frac{\chi ^\lambda (C)}{f^\lambda }\right| \), so that the condition of Lemma 12 is satisfied.
Proof
To deduce the second statement from the first, note that
which, for \(r > n^{\frac{5}{6}}\) may be absorbed into the RHS of the second statement by increasing the value of c. Regarding the error of \(O\left( \frac{e^{-k}(\log n)^4}{n^{\frac{1}{4}}}\right) \), it is no loss of generality to assume that
so that this error term has relative size \(1 + O \left( \frac{(\log n)^4e^{-\frac{k}{2}}}{n^{\frac{1}{4}}}\right) \). Again, the logarithm of this error may be absorbed by increasing c.
To prove the first statement, part (c) of Theorem 5 gives
For \(2 \le k \le 6 \log n\), the last error term is plainly \(O\left( \frac{e^{-k}(\log n)^4}{\sqrt{n}}\right) \). Split the sum over \(a_i\) according to \(a_i \ge \frac{n}{e^2}\). Thus the sum over \(a_i\) is bounded by
In the last sum, \(k \left( \frac{ e a_i}{n}\right) ^{k-1}\) is minimized at \(k=2\). Thus
Handling the sum over \(b_i\) in the same way proves the lemma. \(\square \)
Recall that we require \(a_1 \ge b_1\) and that \(a_1 = n-r-\frac{1}{2}\). Thinking of r as fixed, set \(\delta := \frac{r}{n}\). Since \(\sum a_i + \sum b_i = n\), an upper bound for \(\sum \frac{a_i^k}{n^k} + \sum \frac{b_i^k}{n^k}\) is given by the solution to the following optimization problem.
Let \(x_1, x_2, x_3, \ldots \) be real variables and let \(k \ge 2\).
Let \(\ell = \left\lfloor (1-\delta )^{-1} \right\rfloor .\) It is easily checked by varying parameters that an optimal solution of this problem has \(x_1 = \cdots = x_\ell = 1-\delta \), and \(x_{\ell +1} = 1 - \ell (1-\delta )\), \(x_i = 0\) for \(i > \ell +1\), which yields the maximum
\((1-\delta )^{k-1}\) being the solution of the continuous analogue of the optimization problem
which is less constrained.
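The claimed optimum can be explored on a rational grid. The sketch below assumes, as the surrounding text suggests, that the problem is to maximize \(\sum x_i^k\) subject to \(x_i \ge 0\), \(x_i \le 1-\delta \) and \(\sum x_i = 1\); this reading of the constraints is an assumption. It compares the stated optimum with a brute-force maximum over points whose coordinates are multiples of \(\frac{1}{N}\), and with the continuous bound \((1-\delta )^{k-1}\). The grid size N is an arbitrary illustrative choice.

```python
from fractions import Fraction
from math import floor

def claimed_maximum(delta, k):
    """Value at x_1 = ... = x_ell = 1 - delta, x_{ell+1} = 1 - ell * (1 - delta)."""
    cap = 1 - delta
    ell = floor(1 / cap)
    remainder = 1 - ell * cap
    return ell * cap ** k + remainder ** k

def bounded_partitions(total, cap):
    """Yield partitions of the integer `total` into parts of size at most `cap`."""
    if total == 0:
        yield ()
        return
    for first in range(min(total, cap), 0, -1):
        for rest in bounded_partitions(total - first, first):
            yield (first,) + rest

def grid_maximum(units, cap_units, k):
    """Brute-force maximum of sum x_i^k over grid points x_i = part / units."""
    return max(
        sum(Fraction(part, units) ** k for part in parts)
        for parts in bounded_partitions(units, cap_units)
    )

N = 12  # hypothetical grid resolution
optimum_matches_grid = all(
    grid_maximum(N, N - j, k) == claimed_maximum(Fraction(j, N), k)
    for j in range(N)
    for k in range(2, 6)
)
below_continuous_bound = all(
    claimed_maximum(Fraction(j, 40), k) <= (1 - Fraction(j, 40)) ** (k - 1)
    for j in range(40)
    for k in range(2, 9)
)
print(optimum_matches_grid, below_continuous_bound)
```

Exact rational arithmetic is used so that the equality test against the grid maximum is not disturbed by rounding.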
Lemma 15
Let \(k \ge 2\) and let \(r = n-a_1 -\frac{1}{2}\). Assume \(a_1 \ge b_1\). There exists a \(c_0 > 0\) such that if \(r \le c_0 n\) then
For all \(r \le n\) we have
Proof
Set \(\delta =\frac{r}{n} \le c_0\). We may assume that \(c_0 \le \frac{1}{2}\). Then the first bound for the maximum reduces to \(\delta ^k + (1-\delta )^k\) and we require the statement
Write the left hand side as \((1-\delta ) \left( 1 + (\frac{\delta }{1-\delta })^k\right) ^{\frac{1}{k}}\). Then it is readily checked with calculus that this is decreasing in k, hence maximized at \(k = 2\). Since
it follows that the left is bounded by the right for \(\delta \) sufficiently small.
For the second statement, we use the second bound of the maximum, so that we need to show \((1-\delta )^{k-1} \le e^{-\frac{k\delta }{2}}\). The worst case is \(k = 2\), which reduces to the true inequality \((1-\delta ) \le e^{-\delta }\) for \(0 \le \delta \le 1\). \(\square \)
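Both elementary facts used in this proof can be spot-checked numerically: that \(\left( \delta ^k + (1-\delta )^k\right) ^{\frac{1}{k}}\) is non-increasing in k, and that \((1-\delta )^{k-1} \le e^{-\frac{k\delta }{2}}\) for \(k \ge 2\) and \(0 \le \delta < 1\). The following is a rough grid check under arbitrary choices of grid and of the range of k, with a small tolerance for floating point rounding; it is not a proof.

```python
from math import exp

def lk_norm(delta, k):
    """(delta^k + (1 - delta)^k)^(1/k), the quantity studied in the first bound."""
    return (delta ** k + (1 - delta) ** k) ** (1.0 / k)

deltas = [i / 200 for i in range(200)]  # grid on 0 <= delta < 1
ks = range(2, 40)
tolerance = 1e-12  # allowance for floating point rounding

# Second statement: (1 - delta)^(k-1) <= exp(-k * delta / 2).
second_bound_ok = all(
    (1 - d) ** (k - 1) <= exp(-k * d / 2) + tolerance
    for d in deltas
    for k in ks
)

# First statement: the expression is non-increasing in k (checked for delta <= 1/2).
monotone_ok = all(
    lk_norm(d, k + 1) <= lk_norm(d, k) + tolerance
    for d in deltas
    if d <= 0.5
    for k in ks
)
print(second_bound_ok, monotone_ok)
```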
We can now prove the case \(2 \le k \le 6\log n\) of Proposition 3.
Proof of Proposition 3 for \(k \le 6\log n\)
As usual, assume \(a_1 \ge b_1\). As in the proof of the Proposition in the case \(k > 6 \log n\), we write \(\lambda _1 = a_1 + \frac{1}{2} = n-r\).
For \(r \le n^{\frac{5}{6}}\) we appeal to part (a) of Theorem 5, with the main term \({\text {MT}}\) from Lemma 13. This gives
In this range, \(\exp \left( -\frac{kr}{n}\right) = 1 - \frac{kr}{n} + O\left( \left( \frac{kr}{n}\right) ^2\right) = 1-o(1)\), so that, since \(k \ge 2\), the error term is negligible, and the condition of the first criterion lemma, Lemma 12, is satisfied.
Now suppose \(r > n^{\frac{5}{6}}\). Let \(c_0\) be the constant from Lemma 15. Let \(c_1\) be a constant such that, for \(r \le c_1 n\), \(\frac{r}{\log r} \le \frac{n}{2\log n}\). Multiplying both sides by \(\frac{r \log r}{n^2}\), this implies \(\frac{r^2}{n^2} \le \frac{r \log r}{2n \log n}\). For \(r \le \min (c_0, c_1) n\), Lemma 15 implies that
so that this quantity is bounded by the first term in the maximum of the Small k criterion lemma, Lemma 14. Thus we may assume that \(r > \min (c_0, c_1)n\). In this case, using that \(\log r = \log n + O(1)\), the second bound of Lemma 15 implies that for a sufficiently large constant c
Thus again this is bounded by the first term in the maximum of Lemma 14. \(\square \)
References
Berestycki, N., Schramm, O., Zeitouni, O.: Mixing times for random \(k\)-cycles and coalescence-fragmentation chains. Ann. Probab. 39(5), 1815–1843 (2011)
Diaconis, P.: Group representations in probability and statistics. Institute of Mathematical Statistics Lecture Notes-Monograph Series, vol. 11. Institute of Mathematical Statistics, Hayward (1988)
Diaconis, P., Saloff-Coste, L.: Comparison techniques for random walk on finite groups. Ann. Probab. 21(4), 2131–2156 (1993)
Diaconis, P., Shahshahani, M.: Generating a random permutation with random transpositions. Z. Wahrsch. Verw. Gebiete 57(2), 159–179 (1981)
Flatto, L., Odlyzko, A.M., Wales, D.B.: Random shuffles and group representations. Ann. Probab. 13(1), 154–178 (1985)
Frobenius, F.G.: Über die Charaktere der Symmetrischen Gruppe. Preussische Akademie der Wissenschaften Berlin: Sitzungsberichte der Preußischen Akademie der Wissenschaften zu Berlin. Reichsdr (1900)
Fulton, W., Harris, J.: Representation Theory: A First Course, Graduate Texts in Mathematics, Readings in Mathematics, vol. 129. Springer, New York (1991)
Hough, B., Jiang, Y.: Mixing times of random walks on the orthogonal group generated by rotations of arbitrary angle. arXiv:1211.2031 (2012)
Ingram, R.E.: Some characters of the symmetric group. Proc. Am. Math. Soc. 1, 358–369 (1950)
Larsen, M., Shalev, A.: Characters of symmetric groups: sharp bounds and applications. Invent. Math. 174(3), 645–687 (2008)
Lulov, N.: Random walks on symmetric groups generated by conjugacy classes. PhD thesis, Harvard University (1996)
Lulov, N., Pak, I.: Rapidly mixing random walks and bounds on characters of the symmetric group. J. Algebr. Combin. 16(2), 151–163 (2002)
Macdonald, I.G.: Symmetric Functions and Hall Polynomials. Oxford Mathematical Monographs. The Clarendon Press, Oxford University Press, New York (1979)
Roichman, Y.: Characters of the symmetric group: formulas, estimates, and applications. In: Friedman, J., Gutzwiller, M.C., Odlyzko, A.M. (eds.) Emerging applications of number theory (1996)
Roichman, Y.: Upper bound on the characters of the symmetric groups. Invent. Math. 125(3), 451–485 (1996)
Roussel, S.: Marches aléatoires sur les groupes symétrique. Ph.D. thesis, Toulouse (1999)
Roussel, S.: Phénomène de cut-off pour certaines marches aléatoires sur le groupe symétrique. Colloq. Math. 86(1), 111–135 (2000)
Saloff-Coste, L., Zúñiga, J.: Refined estimates for some basic random walks on the symmetric and alternating groups. ALEA Lat. Am. J. Probab. Math. Stat. 4, 359–392 (2008)
Saloff-Coste, L.: Random walks on finite groups. In: Probability on Discrete Structures, Encyclopaedia Math. Sci., vol. 110, pp. 263–346. Springer, Berlin (2004)
Vershik, A.M., Kerov, S.V.: Asymptotic theory of the characters of a symmetric group. Funct. Anal. Appl. 15(4), 246–255 (1981)
Acknowledgments
The author is grateful to Persi Diaconis and John Jiang for stimulating discussions and to K. Soundararajan for drawing his attention to the paper [10].
Additional information
B. Hough is grateful for financial support from a Ric Weiland Graduate Research Fellowship. The final preparation of this manuscript was completed with support from ERC Research Grant 279438, Approximate Algebraic Structure and Applications.
Appendices
Appendix A: Proof of the lower bound in Conjecture 1
In this appendix we give a proof of the lower bound in Conjecture 1 for any conjugacy class C of \(S_n\) having \(k = k_n = o(n)\) non-fixed points.
Let the number of 2-cycles in C be \(j\le \frac{k}{2}\). The proof goes by comparing the distribution of the number of fixed points in a randomly chosen permutation, chosen either according to uniform measure or to \(\mu _C^{*t}\).
Note that expectation against uniform measure at step t is the same as calculating the \(L^2\) inner product with \(\chi ^n +\mathrm {sgn}(C)^t \chi ^{1^n}\). By the formulas in (7), \(\chi ^{n-1,1}(\sigma )\) is one less than the number of fixed points in \(\sigma \), and
Thus according to uniform measure \(\chi ^{n-1,1}\) has mean 0 and variance 1.
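The claim that \(\chi ^{n-1,1}\), that is, the number of fixed points minus one, has mean 0 and variance 1 under uniform measure can be verified by enumeration for small n. The sketch below treats the even and the odd permutations separately, since the uniform measure in Conjecture 1 lives on a coset of \(A_n\). The range \(4 \le n \le 7\) is an illustrative choice; for \(n = 3\) the variance on a coset differs from 1.

```python
from fractions import Fraction
from itertools import permutations

def parity(perm):
    """Return 0 for an even permutation and 1 for an odd one."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return (len(perm) - cycles) % 2

def coset_moments(n, wanted_parity):
    """Exact mean and variance of (fixed points - 1) over one coset of A_n."""
    values = [
        sum(1 for i, image in enumerate(perm) if i == image) - 1
        for perm in permutations(range(n))
        if parity(perm) == wanted_parity
    ]
    mean = Fraction(sum(values), len(values))
    second_moment = Fraction(sum(v * v for v in values), len(values))
    return mean, second_moment - mean ** 2

results = {
    (n, p): coset_moments(n, p)
    for n in range(4, 8)
    for p in (0, 1)
}
for key, (mean, variance) in sorted(results.items()):
    print(key, mean, variance)
```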
With respect to \(\mu _C^{*t}\) we readily check that for any \(\lambda \vdash n\), [see (1) and (3)]
Thus,
and similarly the second moment generates contributions from \(\chi ^n\), \(\chi ^{n-1,1}\), \(\chi ^{n-2,2}\) and \(\chi ^{n-2,1,1}\),
Obviously
Also, since \(2j \le k\),
For \(t = (1-\varepsilon )\frac{n}{k}\log n\), \(\varepsilon > 0\) we deduce
Let
Applying Chebyshev’s inequality, first with measure \(U_t\) and then with measure \(\mu ^{*t}_C\), we deduce that
which completes the proof. Note that if \(k = o\left( \frac{n}{\log n}\right) \) then we may take \(\varepsilon \) on the scale of \(\frac{1}{\log n}\).
Appendix B: The bound for dimension
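Recall the dimension formula (\(\mu _i = \lambda _i + n - i\))
\[ f^\lambda = \frac{n!}{\mu _1! \mu _2! \cdots \mu _n!} \prod _{1 \le i < j \le n} (\mu _i - \mu _j). \]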
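For concreteness, the dimension formula in its standard form \(f^\lambda = \frac{n!}{\prod _i \mu _i!} \prod _{i<j} (\mu _i - \mu _j)\), with \(\lambda \) padded by zeros to n parts, can be evaluated directly. The sketch below assumes this standard form and checks it against a few well-known dimensions, the symmetry \(f^\lambda = f^{\lambda '}\), and the identity \(\sum _{\lambda \vdash n} (f^\lambda )^2 = n!\).

```python
from math import factorial, prod

def dimension(shape):
    """f^lambda from mu_i = lambda_i + n - i (1-indexed), lambda padded to n parts."""
    n = sum(shape)
    padded = list(shape) + [0] * (n - len(shape))
    mu = [padded[i] + n - (i + 1) for i in range(n)]
    numerator = factorial(n) * prod(
        mu[i] - mu[j] for i in range(n) for j in range(i + 1, n)
    )
    denominator = prod(factorial(m) for m in mu)
    return numerator // denominator

def conjugate(shape):
    """Transpose of the Young diagram of `shape`."""
    return tuple(sum(1 for row in shape if row > j) for j in range(shape[0]))

def partitions(n, largest=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

known_values_ok = (
    dimension((2, 1)) == 2
    and dimension((3, 2)) == 5
    and all(dimension((n - 1, 1)) == n - 1 for n in range(2, 10))
)
conjugate_symmetry_ok = all(
    dimension(p) == dimension(conjugate(p)) for p in partitions(8)
)
sum_of_squares_ok = all(
    sum(dimension(p) ** 2 for p in partitions(n)) == factorial(n)
    for n in range(1, 9)
)
print(known_values_ok, conjugate_symmetry_ok, sum_of_squares_ok)
```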
In our character ratio bounds we set apart the terms pertaining to \(\lambda _1\) to find
Of course, one could extract more rows from the partition \(\lambda \), and, since \(f^\lambda = f^{\lambda '}\), removing columns is also possible. This makes plausible the following useful result of Larsen–Shalev concerning dimensions.
Theorem 16
[10, Theorem 2.2] Write \(\lambda \vdash n\) in Frobenius notation with diagonal of length m, and set \(a_i' = \lambda _i - i = a_i - \frac{1}{2}\), \(b_i' =\lambda _i'-i = b_i - \frac{1}{2}\). Then
where
The error term holds as \(n \rightarrow \infty \) and is independent of the partition \(\lambda \vdash n\).
Using this theorem, we now prove Proposition 4.
Proof of Proposition 4
In this proof we suppress the \(\prime \) on \(a_i\) and \(b_i\). Also, we write \({\mathbb {N}}^{\underline{m}}\) to indicate strictly decreasing length m vectors of non-negative natural numbers, and given such a vector \(\underline{a}\) we write \(|\underline{a}| = \sum a_i\).
Recall that we intend to show that for a sufficiently large fixed \(c > 0\), uniformly in n we have the estimate
In view of Theorem 16, it suffices to prove instead that, for a possibly larger value of c,
By Stirling’s approximation, we reduce to showing that (the \(\prime \) on the sum indicates that if either \(a_m = 0\) or \(b_m = 0\), its contribution is excluded)
Use that \(\sum \left( a_i + \frac{1}{2}\right) + \sum \left( b_i + \frac{1}{2}\right) = n\) to cancel the \(-n\) term. Thus we seek a bound of \(O(\exp (cn))\) for
By symmetry we may assume that \(a_1 \ge b_1\). Set \(a_1+1 = \lambda _1 = M\), and \(\Delta = n-m^2 -(M-m)\). Thus \(\Delta \) is the number of boxes of partition \(\lambda \) that are neither contained in the first row, nor contained in the square determined by the diagonal. We consider separately the cases \(\Delta \le n^{\frac{3}{4}}\) and \(\Delta > n^{\frac{3}{4}}\).
First consider \(\Delta \le n^{\frac{3}{4}}\). Evidently \(\max (b_1, a_2) \le m + \Delta \) so that, for fixed \(\underline{a}, \underline{b}\) satisfying \(\Delta \le n^{\frac{3}{4}}\), the sum inside the exponential of (24) is bounded by
Thinking of \(a_1\) as fixed, we wish to bound the number of possible strings \(a_2, \ldots , a_m\), \(b_1, \ldots , b_m\). To do this, observe that the numbers \(\lambda _2 - m, \lambda _3-m, \ldots , \lambda _m - m\) together with \(\lambda _1' -m, \lambda _2'-m, \ldots , \lambda _m'-m \) are a pair of partitions of combined length \(\Delta \). From the well-known bound for p(L) the number of partitions of an integer L, it follows that the number of possible choices for \(\lambda _2, \ldots , \lambda _m\), \(\lambda _1', \ldots , \lambda _m'\) (or, equivalently \(a_2, \ldots , a_m, b_1, \ldots , b_m\)) is at most
We can now bound the contribution to (24) from \(\Delta \le n^{\frac{3}{4}}\). It is bounded by
Exchanging the order of summation, this has the bound
When \(\Delta > n^{\frac{3}{4}}\) we bound the sum inside the exponential of (24) crudely. Notice that \(\max (a_i, b_i) \le a_1 < n-\Delta \), so that \(\log a_i, \log b_i \le \log \left( n- n^{\frac{3}{4}}\right) \). Thus the sum inside the exponential of (24) is bounded by
The total number of such possible \(a_i, b_i\) is at most the number of partitions of n, which is \(\exp \left( O(\sqrt{n})\right) \). It follows that the contribution of \(\Delta > n^{\frac{3}{4}}\) to (24) is at most
Combining this bound with the bound for small \(\Delta \) proves the proposition. \(\square \)
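The counting steps above rely on the number p(L) of partitions of L growing like \(\exp \left( O(\sqrt{L})\right) \). As an illustration, the sketch below computes p(L) by dynamic programming and compares it with the classical bound \(p(L) \le \exp \left( \pi \sqrt{2L/3}\right) \). Using this particular form of the bound is an assumption made here; the text only needs the order of growth.

```python
from math import log, pi, sqrt

def partition_counts(limit):
    """Return [p(0), p(1), ..., p(limit)] via the standard coin-change recurrence."""
    counts = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            counts[total] += counts[total - part]
    return counts

counts = partition_counts(300)

# Compare on the logarithmic scale to avoid overflow in exp.
bound_ok = all(
    log(counts[n]) <= pi * sqrt(2 * n / 3)
    for n in range(1, 301)
)
print(counts[10], counts[100], bound_ok)
```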
Appendix C: Contour formula for characters of \(S_n\)
We conclude by deriving the extension of Frobenius’ character formula to character ratios of arbitrary conjugacy classes on \(S_n\). This derivation is modeled upon the derivation of Frobenius’ formula from [13, p. 118].
In this appendix we write partitions in the standard row notation, so for \(\lambda \vdash n\), \(\lambda = (\lambda _1, \lambda _2, \ldots , \lambda _n)\) counts the number of boxes in each row. Also \(\delta = (n-1, n-2, \ldots , 0)\) and \(\mu = \lambda + \delta = (\lambda _1 + n-1, \lambda _2 + n-2, \ldots , \lambda _n),\) and \(\mu ! = \mu _1!\mu _2!\ldots \mu _n!\).
Set \({\mathbf {e}}_i\) for the standard basis vector on \({\mathbb {R}}^n\). Let \({\mathbf {x}}= (x_1, \ldots , x_n)\) and denote the Vandermonde of n variables by
Let \({\mathbf {\lambda }}, {\mathbf {\rho }}\vdash n\) be two partitions of n with \({\mathbf {\lambda }}\) indexing an \(f^{\mathbf {\lambda }}\) dimensional representation of \(S_n\), and \({\mathbf {\rho }}\) indexing a conjugacy class. Following [13], we start from the classical fact that the character \(\chi _\rho ^\lambda \) is equal to the coefficient of the monomial \({\mathbf {x}}^\mu = x_1^{\mu _1}\ldots x_n^{\mu _n}\) in
The key calculation of [13] is that, for any \(m \ge 1\) and for any \(\mu \) as above,
The calculation is as follows. Expanding \(\Delta \),
In particular, this gives the dimension formula,
and, when \(\rho = (k, 1^{n-k})\) is the class of a single k-cycle,
Here, a term i with \(k > \mu _i\) is to be omitted. Thus
Expanding the Vandermonde, one may check by inspection that the sum of (26) is equal to the sum of the (finite) residues of the meromorphic function
This last fact is the version of Frobenius’ formula derived in [13]. The equivalence between this formulation from [13] and the integral formula used throughout the present paper,
follows from the identity
see e.g. [7, pp. 51–52].
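The classical fact underlying this appendix, that \(\chi ^\lambda _\rho \) is the coefficient of \({\mathbf {x}}^\mu \) in the product of the Vandermonde determinant with the power sums \(p_{\rho _i}({\mathbf {x}})\), can be tested by brute-force polynomial multiplication for small n. The sketch below uses a dictionary representation of polynomials, which is an illustrative choice, and takes \(\Delta ({\mathbf {x}}) = \prod _{i<j}(x_i - x_j)\) with \(\mu = \lambda + \delta \) as in the text. It is far too slow for large n and is meant only as a check against known character tables.

```python
def multiply(p, q):
    """Multiply polynomials stored as {exponent tuple: integer coefficient}."""
    result = {}
    for exp_p, coeff_p in p.items():
        for exp_q, coeff_q in q.items():
            key = tuple(a + b for a, b in zip(exp_p, exp_q))
            result[key] = result.get(key, 0) + coeff_p * coeff_q
    return {key: value for key, value in result.items() if value != 0}

def unit(n, i, power=1):
    """Exponent vector of x_i^power in n variables."""
    return tuple(power if t == i else 0 for t in range(n))

def vandermonde(n):
    """Delta(x) = prod_{i<j} (x_i - x_j)."""
    poly = {(0,) * n: 1}
    for i in range(n):
        for j in range(i + 1, n):
            poly = multiply(poly, {unit(n, i): 1, unit(n, j): -1})
    return poly

def power_sum(n, m):
    """p_m(x) = x_1^m + ... + x_n^m."""
    return {unit(n, i, m): 1 for i in range(n)}

def character(shape, cycle_type):
    """Coefficient of x^mu in Delta(x) * prod_i p_{rho_i}(x), with mu = lambda + delta.
    `cycle_type` must list every cycle length, including the fixed points."""
    n = sum(shape)
    padded = list(shape) + [0] * (n - len(shape))
    mu = tuple(padded[i] + (n - 1 - i) for i in range(n))
    poly = vandermonde(n)
    for m in cycle_type:
        poly = multiply(poly, power_sum(n, m))
    return poly.get(mu, 0)

s3_classes = [(1, 1, 1), (2, 1), (3,)]
s4_classes = [(1, 1, 1, 1), (2, 1, 1), (2, 2), (3, 1), (4,)]
s3_row = [character((2, 1), rho) for rho in s3_classes]
s4_row = [character((3, 1), rho) for rho in s4_classes]
print(s3_row, s4_row)
```

The two rows computed at the end are those of the standard representations of \(S_3\) and \(S_4\), whose values are the number of fixed points minus one.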
Now consider a conjugacy class with non-trivial cycles
so that \(\rho = (k_1, \ldots , k_r, 1^{n-k})\). The appropriate generalizations of (25) and (26) are
again with the convention that a term with negative coordinate is to be omitted. It follows
The formula (27) follows, as does (25), by expanding
We now check that (28) may be evaluated by a multiple contour integral.
Theorem 17
Let \(\lambda \vdash n\) index an irrep. of \(S_n\), and let \({\mathbf {k}}= k_1 \ge k_2 \ge \cdots \ge k_r\), \(k = \sum k_i \le n\) denote the non-trivial cycles in conjugacy class C. Then
with
and each \(G_{k_i}^\mu \) given by
The contours \({\mathcal {C}}_1, \ldots , {\mathcal {C}}_r\) are to be chosen such that \({\mathcal {C}}_1\) has winding number 1 about each pole of \(G_{k_1}^\mu (z_1)\), and in general, \({\mathcal {C}}_i\) has winding number 1 about each pole of \(G_{k_i}^\mu (z_i)\) and also, viewed as loops in a single complex plane, encloses \(({\mathcal {C}}_j + k_i) \cup ({\mathcal {C}}_j - k_j)\) for all \(1 \le j < i\).
Proof
We check that
holds for arbitrary \(\mu \in {\mathbb {C}}^n\), which immediately gives the Theorem.
The case \(r = 1\) is Frobenius’ formula. Suppose that \(r \ge 2\) and that the formula holds in case \(r-1\). Let \(\hat{{\mathbf {k}}}_1\) denote \(k_2, \ldots , k_{r}\), that is, \({\mathbf {k}}\) with \(k_1\) omitted, and define \(\hat{{\mathbf {z}}}_1\) similarly. Write
Invoking the inductive assumption, the RHS becomes
Now write
and observe that
Since the poles of \(G_{\mu }^{k_1}(z)\) are at \(\mu _1, \ldots , \mu _n\), for any fixed \(z_2, \ldots , z_r\), for any choice of contour \({\mathcal {C}}_1\) having winding number 1 about the poles of \(G_\mu ^{k_1}(z_1)\) and winding number zero about \(\{z_j - k_j, z_j + k_1\}_{2 \le j \le r}\),
In view of the restriction on \({\mathcal {C}}_1, \ldots , {\mathcal {C}}_r\), the identity follows. \(\square \)
Cite this article
Hough, B. The random k cycle walk on the symmetric group. Probab. Theory Relat. Fields 165, 447–482 (2016). https://doi.org/10.1007/s00440-015-0636-6
DOI: https://doi.org/10.1007/s00440-015-0636-6