BPS states, conserved charges and centres of symmetric group algebras

In N = 4 SYM with U(N) gauge symmetry, the multiplicity of half-BPS states with fixed dimension can be labelled by Young diagrams and can be distinguished using conserved charges corresponding to Casimirs of U(N). The information theoretic study of LLM geometries and superstars in the dual AdS 5 × S 5 background has raised a number of questions about the distinguishability of Young diagrams when a finite set of Casimirs is known. Using Schur-Weyl duality relations between unitary groups and symmetric groups, these questions translate into structural questions about the centres of symmetric group algebras. We obtain algebraic and computational results about these structural properties and related Shannon entropies, and generate associated number sequences. A characterization of Young diagrams in terms of content distribution functions relates these number sequences to diophantine equations. These content distribution functions can be visualized as connected, segmented, open strings in content space.


JHEP01(2020)146
3 Z(C(S n )): the centre of C(S n )
3.1 A linear basis for Z(C(S n )) from conjugacy classes
3.2 Proving that T 2 , T 3 , · · · T n generate the centre of Z(C(S n ))
3.3 Generating sets for Z(C(S n )) from irreducible representations

Introduction
One of the best understood instances of the AdS/CFT correspondence [1] is the duality between N = 4 SYM with U(N) gauge group and type IIB superstring theory on AdS 5 × S 5 with N units of five-form flux. In particular, in the half-BPS sector, giant gravitons [2] were identified as important non-perturbative objects in the string theory which demonstrate remarkable sensitivity to finite N effects, notably the stringy exclusion principle [3], in their classical properties. Sub-determinant operators in the CFT were identified as duals for an interesting class of giant gravitons [4]. The construction of CFT duals of general giant gravitons was obtained by using Young diagrams to organize a finite N orthogonal basis of CFT operators [5]. An underlying free-fermion description of this sector was identified [5,6].
In this paper, motivated by the discussion in [12] and the subsequent developments in the mathematics of the space of gauge invariant operators (particularly the relevance of the structure theory of permutation algebras), we initiate a systematic study of the quantitative characterisation of the uncertainty in the determination of Young diagram operators in the half-BPS sector, when a finite set of Casimirs is specified. In section 2 we review some key elements of the connections between BPS operators of dimension n, Casimir operators of U(N), and the symmetric group S n of all permutations of n distinct objects. The group algebra C(S n ) of formal linear combinations of S n group elements with complex coefficients plays an important role, along with the subspace of this group algebra which commutes with all of C(S n ). This subspace is a commutative sub-algebra called the centre of C(S n ), or the central algebra, and denoted Z(C(S n )). The eigenvalues of Casimirs of U(N) are related to the normalized characters of central elements in Z(C(S n )).
In section 3, we consider two linear bases for Z(C(S n )): one corresponds to conjugacy classes of S n and another to irreducible representations. As is well known, the conjugacy classes correspond to cycle structures of permutations. Thus Z(C(S n )) is a vector space of dimension equal to the number of partitions of n, denoted p(n), with a commutative and associative product. A distinguished set of conjugacy classes corresponds to permutations having a single non-trivial cycle of length k: the corresponding central element is denoted T k . We prove that for any n, the set G n = {T 2 , T 3 , · · · , T n } forms a generating set of the central algebra. This means that by taking linear combinations of these elements and their products, we can get any element of Z(C(S n )). In fact, for a fixed n, we generically only need a subset of G n to generate the central algebra. The connection between cycle structures and irreps, which may be viewed as a Fourier transform, leads to a formulation of the distinguishability of Young diagrams in terms of minimal generating subsets of Z(C(S n )). A simple inspection of normalized characters of the cycle operators in irreps shows, for example, that for n up to 5 and n = 7, but not n = 6, T 2 alone suffices to generate the centre: in other words, T 2 and its powers form a linear basis for Z(C(S n )). This is demonstrated directly by writing out the powers of T 2 in terms of linear combinations of central elements corresponding to the conjugacy classes.
In section 4, we investigate the dimensions of the subspaces of Z(C(S n )) generated by T 2 , by T 3 and by the pair {T 2 , T 3 }. These dimensions, as shown in section 3, are given respectively by the number of distinct normalized characters χ R (T 2 )/d R , the number of distinct χ R (T 3 )/d R , and the number of distinct pairs (χ R (T 2 )/d R , χ R (T 3 )/d R ) as R runs over the set of Young diagrams. In each case, for small enough n, there are no degeneracies as R runs over all the Young diagrams. However as n increases, two or more R can give the same normalized character, or list of normalized characters. The distribution of degeneracies can be used to define a probability distribution over the space of possible normalized characters. For each fixed value or list of values, the Shannon entropy, which is the logarithm of the multiplicity, gives a measure of the uncertainty associated with having knowledge of the value or value sets but not the exact identity of the Young diagram. Depending on a choice of probability distribution over the spectrum of values we can get an expectation value for this uncertainty associated with multiplicities. We study two natural ways of averaging this entropy and discuss the data measuring these entropy averages. This involves developing an interesting AdS/CFT-based information theoretic physical perspective on mathematical data of fundamental interest, namely normalized characters of specified sets of conjugacy classes in S n .
In section 5 we define and study a number sequence n * (k): for a given k, n * (k) is the smallest n where the normalised characters of {T 2 , · · · , T k }, or equivalently the Casimirs C 2 , · · · , C k , fail to distinguish all the Young diagrams. The mathematics literature [43,44] contains elegant formulae for the normalised characters in terms of content polynomials, which are explained in this section. The transformations between Casimirs, normalized characters, and content polynomials have a useful triangularity property which allows us to compute n * (k) in terms of content polynomials, which are efficiently programmable in Mathematica. For k = 6, we find that n * (6) = 80. Our present computational approach becomes prohibitively inefficient beyond n = 80, so the determination of n * (7) is an interesting computational challenge. As a first step in the direction of developing analytic approaches to the determination of n * (k) for large k, we introduce a notion of content distribution functions which are shown to uniquely characterise Young diagrams: the content polynomials are moments of the content distribution functions. We express our earlier result about G n forming a generating set in terms of the content distribution functions and observe that at n = n * (k), a set of vanishing moment equations is satisfied by the differences of content distribution functions. These content distribution functions can be visualized as segmented, connected, open strings in content space, which may be useful in the future as a tool to develop new techniques to determine the properties of n * (k).
We conclude with a summary and discussion of future research directions.

Casimirs, charges and matrix invariants
In this section we recall the definition of the Schur polynomial basis for the half-BPS sector [5], where half-BPS operators are labelled by Young diagrams of U(N) and are linear combinations of multi-traces of one complex matrix Z. Multi-traces with scaling dimension n, where each Z has dimension 1, are parametrized by permutations which control the contraction of U(N) indices. We review how the action of the U(N) Casimirs on the multi-traces can be expressed in terms of central elements of Z(C(S n )) acting on the permutation labels [5,27]. We explain a diagrammatic algorithm for finding the map between the Casimirs and the central elements. We then show that knowledge of the Casimirs {C 2 , C 3 , · · · , C k } is equivalent to knowing the normalized characters for {T 2 , T 3 , · · · , T k }.

The map from Casimirs to central elements for 1-matrix problem
The half-BPS operators in N = 4 SYM with U(N) gauge group are gauge invariant functions of one complex matrix Z transforming in the adjoint of the gauge group. Z is a quantum field with scaling dimension one. The gauge invariant functions are traces and products of traces. By the operator-state correspondence of CFT, these correspond to quantum states in the CFT and hence quantum states in AdS. For scaling dimension n ≤ N, the linearly independent gauge invariants correspond to partitions of n. The scaling dimension corresponds to the energy operator for translation along the time direction of global coordinates in AdS [28]. For example at n = 3, we have the following basis for gauge invariants
$$ \mathrm{tr}(Z^3), \qquad \mathrm{tr}(Z^2)\,\mathrm{tr}(Z), \qquad (\mathrm{tr}\,Z)^3 . $$
General multi-trace operators of degree n can be parametrized by permutations σ in S n , the symmetric group of all permutations of {1, 2, · · · , n}:
$$ \mathcal{O}_\sigma(Z) = Z^{i_1}_{i_{\sigma(1)}} Z^{i_2}_{i_{\sigma(2)}} \cdots Z^{i_n}_{i_{\sigma(n)}} = \mathrm{tr}_{V_N^{\otimes n}} \left( Z^{\otimes n} \sigma \right) . $$
In the second expression Z ⊗n and σ are both being viewed as linear operators on the n-fold tensor product V ⊗n N of the fundamental representation V N of U(N ).
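The correspondence between permutations and multi-traces can be made concrete with a small sketch. It assumes the standard fact that the multi-trace labelled by σ depends only on the cycle type of σ, each cycle of length k contributing one factor tr(Z^k); the function names `cycle_type`, `multitrace_label` and `multitrace_counts` are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import permutations


def cycle_type(perm):
    """Cycle type of a permutation given as a tuple of images of 0..n-1."""
    n = len(perm)
    seen = [False] * n
    lengths = []
    for start in range(n):
        if not seen[start]:
            length = 0
            j = start
            while not seen[j]:  # walk once around the cycle containing `start`
                seen[j] = True
                j = perm[j]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def multitrace_label(ctype):
    """Human-readable multi-trace for a cycle type, one trace per cycle."""
    return " ".join(f"tr(Z^{k})" if k > 1 else "tr(Z)" for k in ctype)


def multitrace_counts(n):
    """Count how many permutations in S_n give each multi-trace."""
    return Counter(multitrace_label(cycle_type(p)) for p in permutations(range(n)))


if __name__ == "__main__":
    for label, count in multitrace_counts(3).items():
        print(f"{label}: {count} permutation(s)")
```

For n = 3 the six permutations collapse onto three multi-traces, one for each partition of 3, in line with the statement that the independent gauge invariants correspond to partitions of n when n ≤ N.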
obey the commutation relations of the gl(N ) Lie algebra.
Appropriate anti-hermitian linear combinations generate the u(N) Lie algebra. The Casimirs C k generate the centre of the enveloping algebra U(u(N)).
The lower q index is left invariant, while the upper index transforms as the fundamental representation V N . The commutator of E i k with a product is
The upper indices {p 1 , p 2 , · · · , p n } transform as V ⊗n N . These equations can be used to show that the Casimirs (2.6) act on the operators (2.3) through left multiplication by elements Ĉ k in the central sub-algebra Z(C(S n )).

This is explained in more detail in section 3.1 and appendix A.1 of the paper [27], where the Casimirs are related to Noether charges for an enhanced U(N )×U(N ) symmetry in the free field limit of N = 4 SYM. It is shown how the quadratic Casimir of U(N ), expressed as a second order matrix differential operator, relates to T 2 , the central element of Z(C(S n )) which is related to permutations having one non-trivial cycle of length 2.
This map between C k , viewed as operators on V ⊗n N , and central elements of the group algebra C(S n ) has been studied systematically in the context of 2d Yang Mills theory [45,46]. For example, we have the results
Note the appearance of T (2,2) in C 5 . For C 6 , the central elements T (2,2) and T (3,2) will appear. An orthogonal basis in the free field inner product is parametrised by Young diagrams [5]
These are referred to as Schur Polynomial operators of the half-BPS sector. The commutation relations of the Casimirs with the gauge invariant operators can be read off from the Casimir to central elements transformations. For example
In the above examples, we see that knowing the degree n and the normalized character of T 2 is equivalent to knowing C 2 . Knowing n along with the normalized characters of T 2 , T 3 is equivalent to knowing C 2 , C 3 . We will prove the following theorem.
Theorem 1. The Casimir operators C 2 , · · · , C k in V ⊗n N can be expressed in terms of T 2 , T 3 , · · · , T k . This relies on the form of the transformations between the Casimirs and the central elements.
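To prove this theorem, express the Casimir C k , acting on V ⊗n N , in the form
$$ C_k = \sum_{r_1, r_2, \cdots, r_k = 1}^{n} \rho_{r_1}(E^{j_1}_{j_2})\, \rho_{r_2}(E^{j_2}_{j_3}) \cdots \rho_{r_k}(E^{j_k}_{j_1}) $$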

where the r i label the different factors of V ⊗n N , and ρ r i (E j 1 j 2 ) is the linear operator of E j 1 j 2 acting on the r i 'th factor. There is a diagrammatic algorithm for converting the generating Casimirs to central class operators [45,46]. We will review this algorithm and use it to prove the theorem. An immediate corollary, given the above discussion, is that the eigenvalues of the Casimir operators C 2 (R), · · · , C k (R) on the Young diagram operators O R are determined in terms of the normalised characters χ R (T 2 )/d R , · · · , χ R (T k )/d R .
We now describe the diagrammatic algorithm for C k . Draw a circle with k crosses, labeled 1 to k, with orientation as shown in figure 1. The sum over the r indices can be separated into different coincidences between the {r 1 , r 2 , r 3 , · · · , r k }. This is a sum over set partitions: partitions of the set {1, 2, 3, · · · , k} into collections of subsets [47]. Thus, a particular contribution to C k will be a partition of the set {1, 2, 3, · · · , k} into p subsets, where p ≤ k. The total number of set partitions of k elements into p subsets is given by the Stirling number of the second kind [48]. All the crosses labeled by the elements in a particular subset are joined with a chord or a line. To each of the lines, apply the following procedure. Thicken the line, let the crosses with the smaller labels disappear, and let the cross with the largest label slide along the graph in the direction of the orientation to join the edge of the thickening (this will be illustrated in examples shortly). It is a diagrammatic translation of the multiplication of E operators acting on the same V N : two crosses correspond to two E operators, and their multiplication produces a single E and a δ function which results in a reconnection of index lines. After this operation is applied to all the chords, the graph separates into a set of loops.
In general a graph will separate into a loop with k 1 crosses, a loop with k 2 crosses and so on, which we denote by D k 1 ,k 2 ,··· . The central element is obtained from D k 1 ,k 2 ,··· by retaining all the k i > 1. Relabel the k i > 1 as k' i and drop the k i which are equal to 0 or 1. We thus obtain T (k' 1 ,k' 2 ,··· ) . Let n 0 be the number of loops with zero crosses and let n 1 be the number of loops with one cross. There is a factor of N^(n 0) and a multiplicative factor given by (2.14). There is one last symmetry factor generated. Consider T (k' 1 ,k' 2 ,··· ) . If m µ (µ > 1) is the number of k' i values equal to µ, we obtain a multiplicative factor, given by (2.15), which accounts for the cyclic variations and permutations of all the r i leaving the overall contribution to C k invariant. Thus C k takes the form of a sum over set partitions of central elements weighted by these factors.
For k = 2, there are two set partitions of {1, 2}: the first is a partition of the set into one subset and the second is a partition of the set into two subsets. The first corresponds to the case where r 1 = r 2 . Here crosses 1 and 2 are joined by a single chord. The second partition corresponds to the case where r 1 ≠ r 2 and the crosses are not joined. Thicken the chord in the first graph, erase the cross labelled 1 and slide the cross labelled 2 along the graph in the direction of the orientation. The result is two separate loops, one with k 1 = 1 and the other with k 2 = 0. Thus, we get a D 1,0 . Furthermore, n 0 = 1 and n 1 = 1 for this graph. If none of the k i are bigger than 1, which is the case here, we write the central element as T 1 , labeled by the identity. Since n 0 = 1, there is a factor of N. Since n 1 = 1 and no k i exceeds 1, we obtain the factor n. The overall result of this graph is nN T 1 . This is shown in figure 2. The second graph gives a D 2 , where k 1 = 2. Here n 0 = n 1 = 0, meaning that we have a single loop with two crosses and no factors of N or n. Since k 1 = 2, we relabel to k' 1 = 2 and we finally obtain 2T 2 . The factor of 2 comes from applying formula (2.15).
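Collecting the two contributions found in the k = 2 computation above gives the quadratic Casimir as a central element. The second line below is a sketch of the consequence for eigenvalues, assuming (as used throughout the paper) that a central element acts on the Young diagram operator O R by its normalized character, with d R the dimension of the S n irrep R and T 1 acting as the identity.

```latex
\begin{align}
C_2 &= n N \, T_1 + 2 \, T_2 , \\
C_2(R) &= n N + 2 \, \frac{\chi_R(T_2)}{d_R} .
\end{align}
```

At fixed N, knowing n and the normalized character of T 2 is therefore the same information as knowing the eigenvalue C 2 (R).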
For k = 5, there are 52 set partitions in total to sum over, given by the Bell number B 5 . One example is {1|234|5}. This corresponds to r 1 ≠ r 2 = r 3 = r 4 ≠ r 5 . Crosses 2, 3 and 4 are joined by a single chord. After thickening the chord, erasing crosses 2 and 3 and sliding 4 along the graph toward 5, the graph splits into 3 loops where k 1 = 3, k 2 = 0 and k 3 = 0. Thus, we get D 3,0,0 , which leads to T (3) . Here, n 0 = 2 and n 1 = 0. This contributes a factor of N^2 . We finally obtain a contribution of 3N^2 T 3 to C 5 from this partition. The factor of 3 is again obtained from (2.15). This example is illustrated in figure 3.
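The counting of set partitions quoted above can be checked with a short enumeration. This is a minimal sketch, not the paper's Mathematica code; `set_partitions` and `count_by_blocks` are illustrative names. It lists every partition of {1, ..., k} into non-empty subsets and groups them by the number p of subsets, which reproduces the Stirling numbers of the second kind and the Bell number.

```python
def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Either add `first` to one of the existing blocks ...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ... or let `first` form a new block on its own.
        yield [[first]] + partial


def count_by_blocks(k):
    """Map p -> number of set partitions of {1..k} into exactly p blocks."""
    counts = {}
    for partition in set_partitions(list(range(1, k + 1))):
        p = len(partition)
        counts[p] = counts.get(p, 0) + 1
    return counts


if __name__ == "__main__":
    counts = count_by_blocks(5)
    print("Stirling numbers S(5, p):", dict(sorted(counts.items())))
    print("Bell number B_5:", sum(counts.values()))
```

Each enumerated partition corresponds to one pattern of coincidences among r 1 , · · · , r k ; the example {1|234|5} is one of the 25 partitions of {1, ..., 5} into three subsets.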
The key fact that we make use of is the following. When we sum over the different set partitions of {1, 2, · · · , k}, the case where all the r i are distinct, r 1 ≠ r 2 ≠ · · · ≠ r k , leads to kT k as described by the diagrammatic rules. This has branching number k − 1, where branching number is defined in equation (3.3). Any other set partition produces something of lower branching number. Suppose the set {r 1 , · · · , r k } is divided into p disjoint subsets. Within each subset we have r's which coincide. Setting aside the case where we get kT k , we have p < k. After we multiply out the E's within a subset, we get a single E. The total number of E's left is p, which is also equal to the sum of all the k i values. Some of the k i could equal 1, which corresponds to an E i i , which is the identity. These are not included when writing the T operators. The conjugacy class we get is generally of the form T k 1 ,k 2 ,··· ,k l where k 1 , · · · , k l are positive integers larger than one. The remaining copies of E are collected into l cyclic collections. We thus have k 1 + k 2 + · · · + k l ≤ p. The branching number is k 1 + k 2 + · · · + k l − l ≤ p − l < k − 1. But T 2 , · · · , T k−1 generate all the T operators for branching number less than k − 1, a result (theorem 2) we prove in section 3.2. Thus C k can be expressed in terms of T k along with products involving T i for i ≤ k − 1. This means that knowing the normalized characters of T 2 , · · · , T k for any Young diagram R is equivalent to knowing the Casimir eigenvalues of C 2 , · · · , C k for the Young diagram R.
Figure 2. Computing C 2 . The above graphs correspond to the two possible set partitions for {1, 2}. The first graph depicts the case when r 1 and r 2 coincide. Thus, these two crosses are joined. After thickening the chord and erasing the cross with the smaller label, the graph splits into a loop with one cross labeled 2 and a loop with no crosses. The second graph depicts the case when r 1 ≠ r 2 . Here, there is no joining of crosses and the graph just remains as is.
Figure 3. Computing the contribution of r 1 ≠ r 2 = r 3 = r 4 ≠ r 5 to C 5 . Crosses 2, 3 and 4 are joined as shown on the left. After thickening the chord joining these crosses, erasing the smaller labels 2 and 3 and sliding 4 along toward 5, this diagram splits into three disconnected pieces. Following the recipe converts this term in the Casimir sum into 3N^2 T (3) .

3 Z(C(S n )): the centre of C(S n )
In this section, we consider the centre of the group algebra C(S n ), denoted as Z(C(S n )). First, we identify a basis for Z(C(S n )) from conjugacy classes labeled by partitions of n. Next, we show that a certain subset G n of these basis elements is capable of generating Z(C(S n )). However, at a given n not all of the elements of G n are needed to generate Z(C(S n )). Another useful basis for the centre comes from projectors associated with irreducible representations (irreps) of S n . In the irrep basis we develop criteria for when elements of G n generate Z(C(S n )). Lastly, we present an explicit non-trivial example where a single element of G n generates the centre.

A linear basis for Z(C(S n )) from conjugacy classes
It is well known that the partitions of n label the conjugacy classes in S n . In particular λ ⊢ n labels the conjugacy class of permutations with cycle structure λ. Identify T λ with the formal sum over all elements of the conjugacy class λ with equal coefficient. The elements T λ form a basis for the centre Z(C(S n )). Consider cycle structures of the form [k, 1^(n−k)], with one cycle of length k and remaining cycles of length 1. Denote the sum of permutations, in the group algebra C(S n ), with this cycle structure as T k . So, for example
$$ T_2 = \sum_{1 \le i < j \le n} (i\,j) . $$
These cycle operators play an important role in this paper.
3.2 Proving that T 2 , T 3 , · · · T n generate the centre of Z(C(S n ))
In this section, we prove that the central elements {T 2 , · · · , T n } generate the centre of the group algebra C(S n ). It is convenient to first define the branching number for a permutation σ with modified cycle type λ. Say σ ∈ S n has cycle type ρ = (ρ 1 , ρ 2 , · · · , ρ k ), i.e. k cycles of lengths ρ 1 , ρ 2 , · · · , ρ k such that
$$ \rho_1 + \rho_2 + \cdots + \rho_k = n , $$
which is denoted as ρ ⊢ n. The modified cycle type of ρ is defined as λ = (ρ 1 − 1, ρ 2 − 1, · · · , ρ k − 1) [54]. Then the branching number is defined by
$$ B = |\lambda| = \sum_{i=1}^{k} (\rho_i - 1) = n - k . $$

Consider the set of all permutations ω ∈ S n whose modified cycle type is the partition λ. For each partition λ, let C λ denote the sum of all ω in this set. For example, take C (2) . We have
$$ C_{(2)} = \sum_{\substack{\omega \in S_n \\ \text{modified cycle type } (2)}} \omega . $$
So, if n = 10, C (2) is the sum of all permutations in S 10 with cycle type [3, 1^7]. Thus C (2) is equal to T 3 ; but it is convenient here to work with a notation that uses the modified cycle type. The set {T 2 , · · · , T n } has branching numbers B = 1, · · · , n − 1. Thus, the branching number of T i can be read off from the label of the corresponding C (i−1) .
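The bookkeeping between cycle types, modified cycle types and branching numbers can be sketched as follows. The sketch assumes, as the identification of C (2) with T 3 suggests, that parts equal to zero are dropped from the modified cycle type; the function names are illustrative.

```python
def modified_cycle_type(rho):
    """Subtract 1 from every cycle length and drop the resulting zeros."""
    return tuple(r - 1 for r in sorted(rho, reverse=True) if r > 1)


def branching_number(rho):
    """Branching number: the size |lambda| of the modified cycle type."""
    return sum(modified_cycle_type(rho))


def single_cycle_type(k, n):
    """Cycle type [k, 1^(n-k)] of the permutations summed in T_k."""
    return (k,) + (1,) * (n - k)


if __name__ == "__main__":
    n = 10
    for k in range(2, n + 1):
        rho = single_cycle_type(k, n)
        print(f"T_{k}: modified cycle type {modified_cycle_type(rho)}, "
              f"B = {branching_number(rho)}")
```

For any cycle type the branching number computed this way equals n minus the number of cycles, so T k has B = k − 1 independently of n.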
Theorem 2. Given the set of central elements in Z(C(S n )), G k = {C (1) , C (2) , · · · , C (k) }, any C λ , where λ is any partition such that |λ| ≤ k, can be written in terms of linear combinations of products of elements in the set.
The statement that {T 2 , · · · , T n } generate Z(C(S n )) is an immediate corollary. We make use of the following result from [54] about the product of central elements
$$ C_\lambda \cdot C_\mu = \sum_{\nu} a^{\nu}_{\lambda\mu}\, C_\nu , \tag{3.5} $$
where the coefficients a ν λµ = 0 unless |ν| ≤ |λ| + |µ| (see footnote 1). In what follows, we frequently consider the case C λ · C (r) , with r ≥ 1 and |λ| + r = |ν|. We have the conditions [54]
$$ a^{\nu}_{\lambda (r)} = 0 \ \text{ unless } \ \nu \ge \lambda \cup (r) , \tag{3.6} $$
$$ a^{\lambda \cup (r)}_{\lambda (r)} > 0 . \tag{3.7} $$
In the first point, the so-called natural ordering of partitions is being used. A partition λ can be described by the sequence of its parts, listed in weakly decreasing order (λ 1 , λ 2 , · · · , λ r ):
$$ \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r . $$
Given two partitions λ = (λ 1 , λ 2 , · · · , λ r ) and µ = (µ 1 , µ 2 , · · · , µ s ), the natural ordering is defined by saying that µ ≥ λ if
$$ \mu_1 + \mu_2 + \cdots + \mu_i \ \ge\ \lambda_1 + \lambda_2 + \cdots + \lambda_i \quad \text{for all } i \ge 1 . $$
In this definition partitions are extended by zero parts if necessary. This is also called dominance order. Taking the transpose of partitions reverses the dominance order. In other words, if µ ≥ λ then λ T ≥ µ T . We now present examples for small values of k:
1 Note that the interpretation of permutations in terms of branched covers, which plays an important role in the string theory of 2D Yang Mills [33], allows a physical interpretation of this inequality. µ and λ describe the branching over two branch points. If we let the two branch points collide to have a single branching described by ν, the change in branching number due to the collision, |λ| + |µ| − |ν|, must be non-negative since this is accounted for by the formation of a number, positive or zero, of collapsed handle singularities as a result of the collision. In the process of collision the Euler characteristic of the covering surface does not change, but contributions to the Riemann-Hurwitz formula from branch points are traded for contributions from collapsed handles.
• For k = 2, we consider the product C (1) · C (1) and expand it using equation (3.5). The terms for which |ν| < 2 are already generated by G 1 , which is contained in G 2 .
When |ν| = 2, we need to sum over the ν for which ν ≥ λ ∪ (r). Here λ = (1) and (r) = (1). So λ ∪ (r) = (1, 1). We have (2) ≥ (1, 1) and (1, 1) ≥ (1, 1), where the possible ν's are on the left hand side. Thus C (1) · C (1) is a linear combination of C (2) , C (1,1) and terms with |ν| < 2. Checking this explicitly for n = 10,
$$ C_{(1)} \cdot C_{(1)} = 45 + 3\, C_{(2)} + 2\, C_{(1,1)} . $$
The coefficient of C (1,1) is always non-zero from equation (3.7). This can also be seen from the diagrammatic algorithm described earlier. The algorithm for converting C k into T k 1 ,k 2 ··· can also be used to multiply the T's. The term C (1,1) results, in the diagrammatic algorithm, from the diagram with zero lines joining crosses from one C (1) to the other.
• For k = 3, the partitions of 3 are (3), (2, 1) and (1, 1, 1). We need to check that C labeled by each of these partitions is generated by G 3 . All terms for which |ν| ≤ 2 are generated by G 2 , which is contained in G 3 . Now, C (3) ∈ G 3 already. According to natural ordering
$$ (3) > (2,1) > (1,1,1) . \tag{3.17} $$
Thus, the next largest from (3) is (2, 1). From equations (3.5)-(3.7), we consider C (2) · C (1) . Both C (1) and C (2) are contained in G 2 . We see that C (2) · C (1) will contain C (3) and C (2,1) . The only other terms will be C ν such that |ν| ≤ 2, which are generated by G 2 . The next largest partition is (1, 1, 1). Thus, we consider C (1,1) · C (1) . Again, both C (1) and C (1,1) are generated by G 2 . From the ordering in (3.17) and from equations (3.5)-(3.7), C (1,1) · C (1) will contain C (1,1,1) and may contain C (3) , C (2,1) along with C ν for which |ν| ≤ 2. For n = 10,
• For k = 4, we have G 4 = {C (1) , C (2) , C (3) , C (4) }. The natural ordering for partitions of 4 is a total ordering:
$$ (4) > (3,1) > (2,2) > (2,1,1) > (1,1,1,1) . \tag{3.19} $$
Again, we need to check that C labeled by each of these partitions is generated by G 4 . All terms for which |ν| ≤ 3 are generated by G 3 , which is contained in G 4 . Now, C (4) is already contained in our generating set. We proceed to the next largest partition (3, 1). The product C (3) · C (1) will contain C (3,1) and may contain C (4) as well as C ν for which |ν| ≤ 3. For n = 10,
We continue in this way down the order in (3.19), generating the C operators labeled by each of these partitions using C operators generated by G 3 . For n = 10,
• For k = 5, natural ordering is still a total ordering,
$$ (5) > (4,1) > (3,2) > (3,1,1) > (2,2,1) > (2,1,1,1) > (1,1,1,1,1) . $$
For n = 10, we have
• For k = 6, natural ordering is no longer a total ordering. We have
$$ (6) > (5,1) > (4,2) > \{(4,1,1),\,(3,3)\} > (3,2,1) > \{(3,1^3),\,(2^3)\} > (2,2,1,1) > (2,1^4) > (1^6) , $$
where the two partitions inside each pair of braces are not comparable with each other. The partitions (4, 1, 1) and (3, 3) are incomparable according to natural ordering. Thus, (3, 3) will not appear in the product C (4,1) · C (1) , and (4, 1, 1) will not appear in the product C (3) · C (3) . Similarly for the incomparable (3, 1^3) and (2^3), which are conjugates of (4, 1, 1) and (3, 3).
Calculating these products for n = 10, we find
• Assume that all C labeled by partitions λ for which |λ| ≤ k − 1 can be generated by G k−1 .
• Now consider the generating set G k and all partitions λ such that |λ| = k. We show that all C labeled by partitions of k can be generated by G k . According to natural ordering, the partitions of k are arranged as in (3.38), starting from (k) > (k − 1, 1) > (k − 2, 2) > (k − 2, 1, 1) and ending at (1^k). We consider C λ · C (r) where both C λ and C (r) have |λ| ≤ k − 1 and r ≤ k − 1 and are thus generated by G k−1 . According to equations (3.5)-(3.7), we sum over partitions ν such that ν ≥ λ ∪ (r), and the coefficient of C λ∪(r) is strictly positive. Now, C (k) is already contained in G k . To generate C (k−1,1) , we multiply C (k−1) · C (1) . The term C (k−1,1) will appear with nonzero coefficient and all partitions larger than (k − 1, 1) may also appear. This only includes C (k) , which is already contained in G k . Next, to generate C (k−2,2) , we multiply C (k−2) · C (2) . The term C (k−2,2) will appear, and partitions larger than (k − 2, 2), which are (k) and (k − 1, 1), may also appear. However, C (k) is already contained in G k , and C (k−1,1) is generated by G k as we have seen above. To generate C (k−2,1,1) , we multiply C (k−2,1) · C (1) . The term C (k−2,1,1) is sure to appear, while the partitions larger than (k − 2, 1, 1), i.e. (k), (k − 1, 1) and (k − 2, 2), may also appear. However, each one of these has already been shown to be generated by G k . We may continue in this way, proceeding one by one down the chain of partitions in (3.38). When we arrive at the smallest partition (1^k), we compute C (1^(k−1)) · C (1) . The term C (1^k) will be generated, and terms labeled by larger partitions may also appear. But each of these has, in turn, been shown to be generated by G k in the same way as described above.
A small comment is in order. In proceeding down the list of partitions, we will arrive at sets of partitions that are mutually, or pairwise, incomparable according to natural ordering. To generate any one of these partitions, we still form a product of the form (3.5). Since the sum runs only over partitions that are larger than or equal to the one in question, none of the partitions incomparable with it will appear in the result. As an example, see k = 6.
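The statements about natural (dominance) ordering used in the proof, namely that it is a total ordering for k ≤ 5 and first fails to be one at k = 6, can be checked with a short sketch. The names `partitions`, `dominates` and `incomparable_pairs` are illustrative, and the comparison implements the partial-sum definition of dominance order given above.

```python
from itertools import combinations


def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dominates(mu, lam):
    """True if mu >= lam in natural ordering (mu and lam of equal size)."""
    length = max(len(mu), len(lam))
    mu = mu + (0,) * (length - len(mu))      # extend by zero parts
    lam = lam + (0,) * (length - len(lam))
    sum_mu = sum_lam = 0
    for a, b in zip(mu, lam):
        sum_mu += a
        sum_lam += b
        if sum_mu < sum_lam:                 # a partial sum of mu falls short
            return False
    return True


def incomparable_pairs(k):
    """All unordered pairs of partitions of k that natural ordering cannot compare."""
    return [(a, b) for a, b in combinations(partitions(k), 2)
            if not dominates(a, b) and not dominates(b, a)]


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, incomparable_pairs(k))
```

For k = 6 the only incomparable pairs are (4, 1, 1) with (3, 3) and their conjugates (3, 1, 1, 1) with (2, 2, 2), matching the example discussed in the text.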

We have successfully shown that G k can generate any C labeled by a partition λ with |λ| ≤ k. In terms of the T operators, this result means that {T 2 , · · · , T k+1 } is capable of generating any T labeled by a partition λ with branching number k or less. This means that {T 2 , · · · , T n } generates any central element T λ with branching number n − 1 or less. However, the T λ ∈ Z(C(S n )) with the largest branching number is T n with B = n − 1, which is already contained in our generating set. Thus, {T 2 , · · · , T n } can generate the centre of C(S n ).
Remark 1. Lastly, it is worth noting that T n can be expressed in terms of {T 2 , · · · , T n−1 }. Using (3.5), the calculation of C (n−2) · C (1) yields C (n−1) (which is T n ), C (n−2,1) (which is T (n−1,2) ) and then T λ with a lower branching number than n − 1, which can all be generated by {T 2 , · · · , T n−1 }. However, we exclude T (n−1,2) since it is labeled by a partition of n + 1. Thus, T n can be expressed in terms of the set {T 2 , · · · , T n−1 }.

Generating sets for Z(C(S n )) from irreducible representations
A basis for the centre Z(C(S n )) is given by projectors (orthogonal idempotents)
$$ P_R = \frac{d_R}{n!} \sum_{\sigma \in S_n} \chi_R(\sigma)\, \sigma , $$
where χ R (σ) is the character in the irreducible representation R of the permutation σ and d R is the dimension of R. The R correspond to Young diagrams. This is a general fact about group algebras of finite groups, see e.g. [34]. These obey
$$ P_R P_S = \delta_{RS} P_R , \qquad \sum_R P_R = 1 , $$
where 1 is the identity permutation and the identity in the associative algebra C(S n ). The number of these projectors is p(n), the number of partitions of n. Taking the trace in an irrep S, and using orthogonality of characters, we have
$$ \chi_S(P_R) = \delta_{RS}\, d_R . $$
We make use of the following fact (lemma 2.1 in [35]):
To see this, we observe that
which follows by expanding the right hand side and using the projector equations. This is essentially a fact about the algebra of diagonal matrices: the algebra of diagonal matrices is generated by any diagonal matrix with distinct entries.
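The remark about diagonal matrices can be spelled out with a standard Lagrange interpolation argument. The following is a sketch of one way to see it, not a reproduction of lemma 2.1 in [35]; the coefficients a R are a notation introduced here for the expansion of a central element T in the projector basis, and a S inside the product is understood as a multiple of the identity.

```latex
\begin{align}
T = \sum_R a_R \, P_R
\quad &\Longrightarrow \quad
T^m = \sum_R a_R^m \, P_R , \\
a_R \neq a_S \ \text{for all } R \neq S
\quad &\Longrightarrow \quad
P_R = \prod_{S \neq R} \frac{T - a_S}{a_R - a_S} .
\end{align}
```

The second line follows by expanding the product in the projector basis: the coefficient of P Q is the product over S ≠ R of (a Q − a S )/(a R − a S ), which is 1 for Q = R and vanishes otherwise because of the factor with S = Q. Every projector is then a polynomial in T, so T generates the centre.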

Generating sets of cycle structures from lists of normalised characters
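We can expand any central element such as T 2 in terms of these projectors
$$ T_2 = \sum_R a_R \, P_R . $$
Take a trace in the representation S on both sides
$$ \chi_S(T_2) = a_S \, d_S . $$
So we have
$$ T_2 = \sum_R \frac{\chi_R(T_2)}{d_R} \, P_R . $$
If the list of normalized characters χ R (T 2 )/d R has no repetitions, then T 2 generates the centre.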
If we have a list of irreps R where the normalized characters of T 2 are equal, then we can use another central element such as T 3 . Within this block, we apply the lemma again. So to find out whether T 2 , T 3 generate the centre, we just need to look at the 2 × p(n) matrix of their normalized characters. If no two Young diagrams R give the same pair of entries, i.e. no two columns of this matrix are identical, then T 2 , T 3 generate the centre. We may apply the lemma iteratively. More generally, we would like to consider the k × p(n) matrix. The central elements {T 2 , T 3 , · · · , T k } may each be expanded in terms of the projectors P R , with their respective normalized characters being the expansion coefficients. If this list of normalized characters is distinct for each R, then no two columns of the matrix are identical. According to the lemma, the {T 2 , T 3 , · · · , T k } will generate the centre. For example, if for R 1 and R 2 the lists of normalized characters for {T 2 , T 3 , · · · , T k−1 } are identical, then this list of T i and their respective powers are no longer linearly independent. They no longer generate the subspace of the centre spanned by P R 1 and P R 2 . We now include one more central element T k . If its normalized character is distinct for R 1 and R 2 , then the set {T 2 , T 3 , · · · , T k } is linearly independent and, according to the lemma, this list generates Z(C(S n )). It is interesting to study the sequence of the smallest n values where the {T 2 , T 3 , · · · , T k } fail to generate the centre of C(S n ). Denote this sequence by n * (k). The problem of finding n * (k) is a matter of understanding the degeneracies in the characters of S n , using nice formulae for these characters available from the mathematics literature [43,44]. Using the discussion in section 2, n * (k) is also the smallest value of n where knowledge of all the Casimir eigenvalues C 2 , C 3 , · · · , C k does not suffice to distinguish all the Young diagrams.
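A brute-force search for n * (k) can be sketched as follows. This is a hedged illustration rather than the paper's Mathematica implementation: it assumes the triangularity property mentioned in the introduction, so that knowing the normalized characters of T 2 , · · · , T k at fixed n is equivalent to knowing the power sums of the contents (column index minus row index of each box) up to power k − 1. The names `content_power_sums`, `distinguishes` and `n_star`, and the cutoff `n_max`, are illustrative.

```python
def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def content_power_sums(shape, max_power):
    """Power sums of the contents (col - row) of the boxes, powers 1..max_power."""
    contents = [col - row for row, length in enumerate(shape) for col in range(length)]
    return tuple(sum(c ** m for c in contents) for m in range(1, max_power + 1))


def distinguishes(n, k):
    """True if the first k-1 content power sums separate all Young diagrams with n boxes."""
    signatures = [content_power_sums(shape, k - 1) for shape in partitions(n)]
    return len(set(signatures)) == len(signatures)


def n_star(k, n_max=30):
    """Smallest n <= n_max at which the signatures fail to distinguish, else None."""
    for n in range(1, n_max + 1):
        if not distinguishes(n, k):
            return n
    return None


if __name__ == "__main__":
    for k in (2, 3):
        print(f"n*({k}) =", n_star(k))
```

The number of partitions grows quickly with n, so a plain enumeration of this kind is only practical for small k.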

3.5 Generating the centre for n = 5 with $T_2$

The list of normalized characters for $T_2$ is {10, 5, 2, 0, −2, −5, −10}. In this list, there are no degeneracies. Hence by the above argument, we expect $T_2$ to generate the centre. Below, we show this explicitly.

We generate a system of equations in Mathematica by taking successive powers of $T_2$ and expanding each power in terms of the sums $T_\mu$ over the conjugacy classes. For example, after computing $(T_2)^2$, we count that the identity element $T_1$ appears ten times, $T_3$ appears three times and $T_{(2,2)}$, the formal sum of permutations consisting of two disjoint two-cycles, appears twice:

$$ (T_2)^2 = 10 \, T_1 + 3 \, T_3 + 2 \, T_{(2,2)} . $$

We may now invert this system of equations to solve for the other $T_\mu$. We first solve for $T_4$ and $T_{(3,2)}$ in terms of $T_2$. Next, we solve for $T_1 = 1$, $T_3$, $T_5$ and $T_{(2,2)}$ in terms of $T_2$; these expressions, ending with equation (3.52), are polynomials in $T_2$ of degree at most six. This shows that for $n = 5$ each $T_\mu$, corresponding to a cycle type $\mu$, may be written in terms of $T_2$. Thus, $T_2$ generates the centre $Z(C(S_5))$.
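The counting quoted for $(T_2)^2$ can be checked by brute force. The following Python sketch is not the Mathematica code used for the paper; the function names are hypothetical and chosen only for this illustration. It multiplies all ordered pairs of transpositions in $S_n$, tallies the cycle types of the products, and divides by the size of each conjugacy class to read off the coefficient of each class sum.

```python
from collections import Counter
from itertools import combinations, permutations


def cycle_type(perm):
    """Cycle type of a permutation of {0,...,n-1}: cycle lengths > 1, largest first."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        if length > 1:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def transpositions(n):
    """All two-cycles of S_n, each written as a tuple of images."""
    result = []
    for a, b in combinations(range(n), 2):
        images = list(range(n))
        images[a], images[b] = images[b], images[a]
        result.append(tuple(images))
    return result


def t2_squared_coefficients(n):
    """Coefficients of the class sums T_mu in (T_2)^2, keyed by cycle type mu."""
    swaps = transpositions(n)
    # Multiply every ordered pair of transpositions and tally the cycle types.
    product_counts = Counter(
        cycle_type(tuple(s[t[i]] for i in range(n))) for s in swaps for t in swaps
    )
    # Conjugacy class sizes, obtained by enumerating all of S_n.
    class_sizes = Counter(cycle_type(p) for p in permutations(range(n)))
    return {mu: count // class_sizes[mu] for mu, count in product_counts.items()}


print(t2_squared_coefficients(5))
```

Here the empty tuple labels the identity class, `(3,)` the three-cycles and `(2, 2)` the pairs of disjoint two-cycles. The enumeration of all of $S_n$ is only practical for small $n$.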

Dimensions and entropies associated with low order cycle structures
From the previous section, we concluded that not all elements in $G_n$ are needed to generate $Z(C(S_n))$. Indeed for some $n$, all we need is $T_2$. It is interesting to study this operator's ability to generate the centre as a function of $n$. In this section we present some data concerning $T_2$ as well as the combination of $T_2$ and $T_3$. We first discuss the codimension for the subspaces generated by these two central elements, and then we discuss the entropy related to the degeneracies of their normalized characters.

Co-dimensions as measures of uncertainty
We have proved that $T_2$ generates the centre of $C(S_n)$ if there are no repetitions or degeneracies in the list of its normalized characters. In this section, we refer to the normalized characters frequently and thus we define

$$ \widehat{\chi}^R_k \equiv \frac{\chi_R(T_k)}{d_R} . $$

For a given $n$, if there are degeneracies in $\widehat{\chi}^R_2$ then we include the normalized character of $T_3$. If the list $\{\widehat{\chi}^R_2, \widehat{\chi}^R_3\}$ has no repetitions then $T_2$ and $T_3$ generate the centre for that particular $n$.

For $n = 1, 2, 3, 4, 5$, there are no repetitions in $\widehat{\chi}^R_2$, so $T_2$ does indeed generate $Z(C(S_n))$. For $n = 6$, we encounter our first degeneracy. The list of normalized characters for $T_2$ is

{15, 9, 5, 3, 3, 0, −3, −3, −5, −9, −15}

There are two repetitions in this list. Thus, $T_2$ no longer generates $Z(C(S_n))$; it only generates a subspace of $Z(C(S_n))$. The codimension of a subspace is the difference between the dimension of the full space and the dimension of the subspace. We define $\mathrm{codim}\, T_k(n)$ to be the difference between the dimension of $Z(C(S_n))$ and the dimension of the subspace generated by $T_k$. The codimension of a subspace generated by a $T_k$, or by a collection of $T_k$'s, is a measure of how close the central element or collection of central elements is to generating the centre. For $T_2$ at $n = 6$, the number of distinct $\widehat{\chi}^R_2$ is 9, which is 2 less than $p(6) = 11$, so the codimension of the subspace generated is equal to two. Interestingly, $T_2$ once again generates $Z(C(S_n))$ for $n = 7$, producing a codimension of zero. From $n = 8$ onwards, however, $T_2$ fails to generate $Z(C(S_n))$. The codimension data for $T_2$ for $n = 2$ to $n = 70$ is shown in table 1. These co-dimensions can be viewed as a measure of the uncertainty in the determination of Young diagrams with $n$ boxes when we only know the normalized character of $T_2$, or equivalently when we only know the second Casimir. From this data, we can calculate the relative dimension for $T_2$,

$$ R_{T_2}(n) = \frac{p(n) - \mathrm{codim}\, T_2(n)}{p(n)} , $$

where $p(n)$ is the dimension of $Z(C(S_n))$. The plot for this is shown in figure 4.

We now discuss the codimension of the subspace generated by both $T_2$ and $T_3$. For $n = 6$, the list of pairs of normalized characters of $T_2$ and $T_3$ has no repetitions (i.e. each pair of numbers is unique in the list), so these two central elements generate $Z(C(S_n))$. Thus, the codimension for this subspace is zero. We find a zero codimension for all $n$ below 15. For $n = 15$ the subspace generated by $\{T_2, T_3\}$ has a codimension of 3. See table 1 for data for $n = 2$ to $n = 70$.

Table 1. Codimension data for $T_2$, $T_3$ and $\{T_2, T_3\}$.

n     T2 codim   T3 codim   {T2, T3} codim
2     0          1          0
3     0          1          0
4     0          2          0
5     0          3          0
6     2          5          0
7     0          7          0
8     3          12         0
9     5          17         0
10    11         24         0
11    9          33         0
12    32         49         0
13    26         64         0
14    56         90         0
15    89         120        3
16    122        164        4
17    156        214        4
18    244        285        …
19    305        367        …
20    434        485        …
21    571        619        …
22    755        801        …
23    964        1013       …
24    1280       1298       …
25    1613       1637       …
26    2059       2052       …
27    2599       2578       …
28    3277       3214       …
29    4064       3978       …
30    5097       4945       …
31    6267       6110       …
32    7742       7492       …
33    9488       …          …
…

The co-dimensions for $\{T_2, T_3\}$ can again be viewed as a measure of the uncertainty in the determination of Young diagrams with $n$ boxes when we only know the normalized characters of $T_2$ and $T_3$, or equivalently when we only know the second and third Casimirs. Figure 4 also shows the relative dimension for $\{T_2, T_3\}$. Note that the codimension of the space generated by $\{T_2, \cdots, T_n\}$ is zero for all $n$. Furthermore, the relative dimension plot for $\{T_2, \cdots, T_n\}$ would just be a straight line at $R_{\{T_2, \cdots, T_n\}} = 1$.
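The codimensions in table 1 can be recomputed with a few lines of code. The sketch below is in Python rather than the Mathematica of appendix A, and its function names are hypothetical. It relies on the content formulae of section 5: at fixed $n$, the normalized character of $T_2$ equals the sum of contents of the Young diagram, and that of $T_3$ equals the sum of squared contents minus $n(n-1)/2$, so counting distinct normalized characters is the same as counting distinct content power sums.

```python
def partitions(n, max_part=None):
    """Yield the partitions of n as tuples of non-increasing row lengths."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def content_power_sum(partition, k):
    """Sum of (j - i)**k over the boxes in row i, column j of the Young diagram."""
    return sum(
        (j - i) ** k
        for i, row_length in enumerate(partition)
        for j in range(row_length)
    )


def codimension(n, powers):
    """p(n) minus the number of distinct vectors of content power sums.

    powers=(1,) corresponds to T_2, powers=(2,) to T_3 and powers=(1, 2)
    to the pair {T_2, T_3}.
    """
    vectors = [
        tuple(content_power_sum(R, k) for k in powers) for R in partitions(n)
    ]
    return len(vectors) - len(set(vectors))


# Print rows in the layout of table 1 for small n.
for n in range(2, 11):
    print(n, codimension(n, (1,)), codimension(n, (2,)), codimension(n, (1, 2)))
```

The recursion over partitions is adequate for the small values of $n$ printed here; reaching $n = 70$ would call for a more economical enumeration.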

It is natural to expect that the uncertainty in identifying the Young diagram is smaller when we know both $\widehat{\chi}^R_2$ and $\widehat{\chi}^R_3$ than when we only know one of these quantities. Furthermore, the higher the codimension of $T_k$ (or of some subset of $T$'s), the higher this uncertainty becomes. Thus, we expect

$$ \mathrm{codim}\, \{T_2, T_3\}(n) \le \mathrm{codim}\, T_2(n) , \qquad \mathrm{codim}\, \{T_2, T_3\}(n) \le \mathrm{codim}\, T_3(n) . \qquad (4.5) $$

This is indeed reflected in the data of table 1 for all values of $n$ listed there. The relative sizes of $\mathrm{codim}\, T_2$ and $\mathrm{codim}\, T_3$ may also be studied. From the data, we see that

$$ \mathrm{codim}\, T_2 < \mathrm{codim}\, T_3 \ \ \text{for} \ n \le 25 , \qquad \mathrm{codim}\, T_3 < \mathrm{codim}\, T_2 \ \ \text{for} \ 26 \le n \le 70 . $$

The behaviour of these codimensions for $n$ larger than 70 is an interesting problem. It is natural to conjecture that $\mathrm{codim}\, T_3 < \mathrm{codim}\, T_2$ persists for all $n \ge 26$.

Note that AdS/CFT motivates the study of finite $N$ versions of this codimension problem. There is a finite $N$ truncation of $Z(C(S_n))$, where we set to zero all the projectors $P_R$ whose height $l(R)$ satisfies $l(R) > N$. This subspace, which we denote $Z_N(C(S_n))$, is a proper subspace when $N < n$ and forms a sub-algebra. Consider the set of generators $\{T_2, \cdots, T_k\}$ for some $k < n$. The case $k \sim N^{1/4}$, $n \sim N^2$ in the limit of large $N$ is of particular interest in connection with the information loss discussion of [12]. Here $n \sim N^2$ is the dimension of CFT operators which produce non-trivial deformations of the AdS space-time, while $k \sim N^{1/4}$ corresponds to the Planck scale cutoff. Calculating the codimensions in this regime of $k, n$ is a very interesting problem for the future.

Average entropies for uniform probability distribution over values of charges
Consider measuring the normalized character for the operator $T_2$. Given the discussion in section 2, in particular equation (2.10), this is equivalent to knowing the quadratic Casimir charge. There will be a list of normalized characters generated over the Young diagrams. A value $v_2$ contained in this list occurs with multiplicity $M(v_2)$. Assuming that we have no knowledge about the half-BPS state beyond the dimension $n$ and the quadratic Casimir, this means that a total of $M(v_2)$ Young diagrams are equally likely. We thus have a uniform distribution over this subset of Young diagrams. We also have

$$ \sum_{v_2} M(v_2) = p(n) . $$

The Shannon entropy associated with this value of $v_2$, and the uniform distribution, is

$$ S_{T_2}(v_2) = \log M(v_2) . \qquad (4.8) $$

This may be viewed as a measure of the uncertainty in our knowledge of the state $R$ when we only know $\widehat{\chi}^R_2 = v_2$. We may also take an average of these entropy values,

$$ S^{\mathrm{ave}}_{T_2} = \frac{1}{N_2} \sum_{v_2} \log M(v_2) , \qquad (4.9) $$

where the sum is over all distinct values of $\widehat{\chi}^R_2 = v_2$ and $N_2$ is the total number of distinct $\widehat{\chi}^R_2$ values. This quantity may be viewed as a measure of the average uncertainty of the Young diagram when we only know the values of the $T_2$ normalized characters. We present data for $S^{\mathrm{ave}}_{T_2}$ in table 2 for $n = 2$ to $n = 70$. We also present a plot of this data in figure 5. Similar data for $T_3$ is presented there as well. Note that the average entropy (4.9) can also be viewed as an expectation value in a probability distribution over the values $v_2$ in which all these values are equally probable, in other words, the uniform distribution over $v_2$.
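As an illustration of the average entropy (4.9), the following Python sketch tallies the multiplicities $M(v_2)$ and averages their logarithms over the distinct values. It is not the code used to produce table 2. The function names are hypothetical, natural logarithms are an assumption (the base only rescales the result), and the normalized characters are replaced by content power sums, which have the same multiplicities at fixed $n$.

```python
import math
from collections import Counter


def partitions(n, max_part=None):
    """Yield the partitions of n as tuples of non-increasing row lengths."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def content_power_sum(partition, k):
    """Sum of (j - i)**k over the boxes in row i, column j of the Young diagram."""
    return sum(
        (j - i) ** k
        for i, row_length in enumerate(partition)
        for j in range(row_length)
    )


def average_entropy(n, powers=(1,)):
    """Average of log M(v) over the distinct charge vectors v, as in (4.9).

    powers=(1,) stands for T_2, powers=(2,) for T_3, powers=(1, 2) for the pair.
    """
    multiplicities = Counter(
        tuple(content_power_sum(R, k) for k in powers) for R in partitions(n)
    )
    return sum(math.log(m) for m in multiplicities.values()) / len(multiplicities)


print(average_entropy(6))          # T_2 at n = 6
print(average_entropy(6, (1, 2)))  # {T_2, T_3} at n = 6
```

At $n = 6$ the nine distinct values of the $T_2$ character include two with multiplicity 2, so the first number is $(2 \log 2)/9$, while the second vanishes because the pair of charges identifies every Young diagram.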

We may also consider the list of values $\{v_2, v_3\}$ for $\widehat{\chi}^R_2$ and $\widehat{\chi}^R_3$. Denote the multiplicity of the pair by $M(v_2, v_3)$. Then

$$ M(v_2) = \sum_{v_3} M(v_2, v_3) , \qquad M(v_3) = \sum_{v_2} M(v_2, v_3) , $$

where we sum over distinct values of $\widehat{\chi}^R_3 = v_3$ in the first equation and over distinct values of $\widehat{\chi}^R_2 = v_2$ in the second. $\log M(v_2, v_3)$ is the entropy associated with the values $(v_2, v_3)$ and the uniform distribution over the subset of $R$ corresponding to these values for $(\widehat{\chi}^R_2, \widehat{\chi}^R_3)$. Again we may compute the average of this entropy,

$$ S^{\mathrm{ave}}_{\{T_2, T_3\}} = \frac{1}{N_{(2,3)}} \sum_{(v_2, v_3)} \log M(v_2, v_3) , $$

where $N_{(2,3)}$ is the total number of distinct values for $\{\widehat{\chi}^R_2, \widehat{\chi}^R_3\}$. This entropy may be viewed as a measure of the uncertainty in identifying the exact Young diagram when we only know the values of the $T_2$ and $T_3$ normalized characters.

We expect that there should be a smaller uncertainty when we know both $\{\widehat{\chi}^R_2, \widehat{\chi}^R_3\}$ compared to knowing just one of the normalized characters. In other words, we expect these average entropies to obey

$$ S^{\mathrm{ave}}_{\{T_2, T_3\}} \le S^{\mathrm{ave}}_{T_2} , \qquad S^{\mathrm{ave}}_{\{T_2, T_3\}} \le S^{\mathrm{ave}}_{T_3} . $$

This is indeed compatible with the results in table 5 and is also the trend we see in the comparison of the codimensions in (4.5). It is not a priori clear which of $S^{\mathrm{ave}}_{T_2}$ and $S^{\mathrm{ave}}_{T_3}$ should be larger. The data shows that

$$ S^{\mathrm{ave}}_{T_3} < S^{\mathrm{ave}}_{T_2} \ \ \text{for} \ 32 \le n \le 70 . $$

It is natural to conjecture, and would be interesting to prove, that the trend visible for $32 \le n \le 70$ extends to all $n \ge 32$.

We can define finite $N$ versions of these entropies by considering only Young diagrams with height no larger than $N$. In these finite $N$ ensembles of Young diagrams, we can define multiplicities of Young diagrams and derive entropies for specified values of $n$. It will be interesting to obtain estimates of the finite $N$ entropies for $n \sim N^2$, since this corresponds to classical solutions of supergravity. Also of particular interest, given the discussion of [12], are the average entropies for the sets $\{T_2, \cdots, T_k\}$ with $k \sim N^{1/4}$.

Average entropies for multiplicity-weighted probability distributions over values of charges
In the discussion of the average entropy over a set of known charges, we used a uniform distribution over the spectrum of charges. We found that the average entropies thus calculated satisfied inequalities similar to those satisfied by the co-dimensions, lending support to the idea that both the co-dimensions and the average entropies are sensible measures of the information available from knowing a set of charges. In particular, the inequalities reflect the fact that knowing more charges reduces the uncertainty. The comparison of the information available from knowing just one charge, e.g. $T_2$ versus $T_3$, depends on how one measures the information, whether through codimensions or through average entropies. The data is compatible with the conjecture that at large $n$ a definite pattern emerges: there is more information in knowing $T_3$ rather than $T_2$, whether this is measured by looking at the codimensions or at the average entropies.

Given a set of charges, and an associated multiplicity of Young diagram states, there is yet another interesting way to measure the information, or conversely the uncertainty, associated with that probability distribution. Suppose we take the set of values of $\widehat{\chi}^R_2$ at a given $n$. Let these values form the set $V_2$. The multiplicity of a given value $v_2 \in V_2$ is

$$ M(v_2; n) = \big| \{ R \vdash n : \widehat{\chi}^R_2 = v_2 \} \big| . \qquad (4.14) $$

Let $N_2$ be the size of the set $V_2$. The probability of having value $v_2$ is

$$ P(v_2) = \frac{M(v_2; n)}{p(n)} , $$

since the total number of Young diagrams is $p(n)$.
Now consider the Shannon entropy for this probability distribution,

$$ S(T_2; n) = - \sum_{v_2 \in V_2} \frac{M(v_2; n)}{p(n)} \log \frac{M(v_2; n)}{p(n)} = \log p(n) - \sum_{v_2 \in V_2} \frac{M(v_2; n)}{p(n)} \log M(v_2; n) . \qquad (4.17) $$

This entropy has an interesting interpretation in a quantum information setting involving quantum measurement and classical communication. Suppose we have a density matrix, given in (4.18), for the Hilbert space of Young diagram states, where the diagonal part of the density matrix is that of a uniform probability distribution over Young diagram states with energies between $n_0$ and $n_1$. The states $|R\rangle$ have unit norm. Suppose observer A measures the energy $n$ and the set of charges $\{T_2, \cdots, T_n\}$ to determine the exact $R$. Given the form of (4.18) we have a uniform probability distribution over $R$. Observer A communicates to observer B the energy $n$, but to observer C the more detailed information of $n$ along with the eigenvalue of $C_2$, or equivalently the normalized character of $T_2$. The first term in (4.17) is a measure of the uncertainty open to B, who knows that the Young diagram is one of $p(n)$ but has no further information. The second term is a measure of the uncertainty $\log M(v_2; n)$ for each $v_2$ value, averaged over the different $v_2$ values according to the probability with which $v_2$ occurs in the measurements of A. The difference is a measure of the reduction in uncertainty due to the additional information available to C compared to B. Equation (4.19) provides the relation between the entropy $S(T_2; n)$ above and the entropy $S_{T_2}(v_2)$ defined in equation (4.8) in section 4.2:

$$ S(T_2; n) = \log p(n) - \sum_{v_2 \in V_2} \frac{M(v_2; n)}{p(n)} \, S_{T_2}(v_2) . \qquad (4.19) $$

We can view the sum in (4.19) as an expectation value of $S_{T_2}(v_2)$ taken over the probabilities $M(v_2; n)/p(n)$. Thus,

$$ S(T_2; n) = \log p(n) - \langle S_{T_2}(v_2) \rangle . $$

It is interesting to study the entropy (4.17) as a function of $n$ using data we already have, motivated by questions such as: what is this entropy as a function of $n$, and how does it behave at large $n$? This data is presented in table 3.

Next consider $T_2, T_3, \cdots, T_k$. We want to think about the values $\{v_2, v_3, \cdots, v_k\}$ of the normalized characters $\widehat{\chi}^R_2, \widehat{\chi}^R_3, \cdots, \widehat{\chi}^R_k$. This vector of normalized characters lives in the set $V_{\{2,3,\cdots,k\}}$. The size of this set is $N_{(2,3,\cdots,k)}$. The multiplicity of a given value-set is

$$ M(v_2, \cdots, v_k; n) = \big| \{ R \vdash n : \widehat{\chi}^R_i = v_i \ \text{for} \ i = 2, \cdots, k \} \big| . $$

Now we can define an entropy for this set of generators,

$$ S(k; n) = S(T_2, \cdots, T_k; n) = - \sum_{(v_2, \cdots, v_k)} \frac{M(v_2, \cdots, v_k; n)}{p(n)} \log \frac{M(v_2, \cdots, v_k; n)}{p(n)} . $$

Entropy data for $\{T_2, T_3\}$ is also found in table 3. In addition, the entropy data in this table is plotted in figure 6. These entropies demonstrate the following behaviours. For all values of $n$ listed in table 3,

$$ S_{T_2} \le S_{\{T_2, T_3\}} , \qquad S_{T_3} \le S_{\{T_2, T_3\}} . \qquad (4.23) $$

The only $n$ values for which $S_{T_2} = S_{\{T_2, T_3\}}$ are the ones where $T_2$ generates the centre: $n = 2, 3, 4, 5$ and 7. This is exactly as expected from our interpretation of these entropies in terms of the information gained by knowing, in addition to the energy $n$, the specified charges. We generically gain more information from knowing more charges, unless the more limited set of charges is sufficient to determine the Young diagram entirely. When analyzing the relative sizes of $S_{T_2}$ and $S_{T_3}$, the data indicates

$$ S_{T_3} < S_{T_2} \ \ \text{for} \ n \le 23 , \qquad S_{T_2} < S_{T_3} \ \ \text{for} \ 24 \le n \le 70 . \qquad (4.24) $$
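The multiplicity-weighted entropy can be sketched in the same way as the codimensions. The Python code below is only an illustration under stated assumptions: the function names are hypothetical, natural logarithms are used, and content power sums stand in for the normalized characters since both have the same multiplicities at fixed $n$. It evaluates the Shannon entropy of the distribution $M/p(n)$ over the distinct charge vectors.

```python
import math
from collections import Counter


def partitions(n, max_part=None):
    """Yield the partitions of n as tuples of non-increasing row lengths."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def content_power_sum(partition, k):
    """Sum of (j - i)**k over the boxes in row i, column j of the Young diagram."""
    return sum(
        (j - i) ** k
        for i, row_length in enumerate(partition)
        for j in range(row_length)
    )


def charge_entropy(n, powers):
    """Shannon entropy of the probabilities M(v; n) / p(n) over charge vectors v.

    powers=(1,) stands for T_2, powers=(2,) for T_3, powers=(1, 2) for the pair.
    """
    multiplicities = Counter(
        tuple(content_power_sum(R, k) for k in powers) for R in partitions(n)
    )
    total = sum(multiplicities.values())  # equals p(n)
    return -sum(
        (m / total) * math.log(m / total) for m in multiplicities.values()
    )


for label, powers in (("T2", (1,)), ("T3", (2,)), ("{T2,T3}", (1, 2))):
    print(label, charge_entropy(6, powers))
```

At $n = 6$ this gives $\log 11 - \tfrac{4}{11} \log 2$ for $T_2$, $\log 11 - \tfrac{10}{11} \log 2$ for $T_3$ and $\log 11$ for the pair, in line with the orderings stated for small $n$.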

Again, it is natural to conjecture that $S_{T_2} < S_{T_3}$ for all $n \ge 24$. If this conjecture, along with the corresponding conjectures in sections 4.1 and 4.2, is true, it would support the plausible conclusion that different measures of uncertainty, namely codimensions and variations in the choice of entropy function, give the same ranking of the information provided by different conserved charges in the limit of large $n$.

The above entropies are relevant to AdS/CFT questions when $N > n$. We can also define finite $N$ entropies, where $n > N$, motivated by the discussion of [12]. In this case, we are interested in Young diagrams with no more than $N$ rows, which we express as $l(R) \le N$, using $l(R)$ to refer to the vertical length of the Young diagram.

The associated entropies are

$$ S(k; n, N) = - \sum_{(v_2, \cdots, v_k)} \frac{M(v_2, \cdots, v_k; n, N)}{p(n, N)} \log \frac{M(v_2, \cdots, v_k; n, N)}{p(n, N)} , $$

where the multiplicities $M(v_2, \cdots, v_k; n, N)$ count only Young diagrams $R$ with $n$ boxes and $l(R) \le N$. Here $p(n, N)$ is the number of Young diagrams with $n$ boxes and no more than $N$ rows. Of particular interest, from the discussion in [12], is the large $N$ behaviour of

$$ S(k = N^{1/4}, n = N^2, N) . \qquad (4.27) $$

Here $k = N^{1/4}$ corresponds to the Planck scale, while $N^2$ is the dimension of CFT operators which cause a significant backreaction in the geometry. We leave a systematic computation and discussion of these finite $N$ versions of the co-dimensions and entropies for the future.

Content distribution functions and diophantine equations
In section 3 we considered the problem of generating $Z(C(S_n))$ using central elements $\{T_2, T_3, \cdots, T_k\}$. We used results in the representation theory of $C(S_n)$ to show that this question can be answered by inspection of the normalized characters of the $T_i$. This allowed us to generate interesting data on the subspaces generated by $T_2$, by $T_3$ and by the pair $\{T_2, T_3\}$. In this section, we once again rephrase the problem, this time in terms of the so-called content polynomials, which have been used to produce elegant expressions for normalized characters in [43,44]. We show that the normalized characters of $\{T_2, T_3, \cdots, T_k\}$ can be expressed in terms of the first $k$ content polynomials. One can then reformulate the problem of generating the centre, and distinguishing all Young diagrams, in terms of these polynomials. This new formulation is used to write simple code in Mathematica to determine the values of $n$ for which the first $k$ polynomials distinguish all Young diagrams (see appendix A).

We then define the content distribution function (CDF) for a Young diagram. We prove that each Young diagram can be uniquely specified by its corresponding CDF. The content polynomials can be expressed as moments of the distribution functions, analogous to moments of a probability distribution. We show that knowledge of all $n$ moments for a Young diagram uniquely determines its CDF. We show that the event of two diagrams having $k$ degenerate moments ($k < n$) translates to a set of $k$ vanishing moment equations in the difference of the two respective CDFs. Lastly we provide some examples of CDFs and CDF plots for degenerate diagrams at $n = 6, 15, 24$. We explain a visualization of the CDF plots in terms of segmented open strings with Dirichlet boundary conditions.

Content polynomials and normalized characters
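An important result in [44] (lemma 3.1) expresses the normalized characters in terms of the quantities $c_k(R)$, through a sum over partitions $\lambda = (\lambda_1, \lambda_2, \cdots, \lambda_r)$ where $\lambda_1 + \lambda_2 + \cdots + \lambda_r = l \le (k - 2)$. The $c_k(R)$ are defined as

$$ c_k(R) = \sum_{(i,j) \in R} (j - i)^k , $$

where we are summing over the coordinates of the boxes of a Young diagram: $i$ is the row number and $j$ is the column number. We write $c(R)$ for $c_1(R)$. An example Young diagram, specified by its row lengths, is displayed in equation (5.3). For the lowest cycle lengths the result gives

$$ \widehat{\chi}^R_2 = c(R) , \qquad \widehat{\chi}^R_3 = c_2(R) - \frac{n(n-1)}{2} , \qquad \widehat{\chi}^R_4 = c_3(R) - (2n - 3) \, c(R) . $$

It is immediately apparent that degeneracies in the normalized characters $\widehat{\chi}^R_2$, $\widehat{\chi}^R_3$ and $\widehat{\chi}^R_4$ translate to degeneracies in the content polynomials $c(R)$, $c_2(R)$ and $c_3(R)$.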
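A short worked example may help fix the conventions. It is chosen here for illustration and is not the example of equation (5.3). Take $R = [4, 1, 1]$ with $n = 6$, and use the standard expressions of the normalized characters of $T_2$, $T_3$, $T_4$ as $c_1$, $c_2 - n(n-1)/2$ and $c_3 - (2n-3) c_1$. The boxes of $R$ have contents $0, 1, 2, 3$ in the first row and $-1$, $-2$ in the first column, so

$$
c_1(R) = 0 + 1 + 2 + 3 - 1 - 2 = 3 , \qquad
c_2(R) = 0 + 1 + 4 + 9 + 1 + 4 = 19 , \qquad
c_3(R) = 0 + 1 + 8 + 27 - 1 - 8 = 27 ,
$$

$$
\widehat{\chi}^R_2 = 3 , \qquad
\widehat{\chi}^R_3 = 19 - \frac{6 \cdot 5}{2} = 4 , \qquad
\widehat{\chi}^R_4 = 27 - 9 \cdot 3 = 0 .
$$

For $R' = [3, 3]$ the contents are $0, 1, 2$ and $-1, 0, 1$, giving $c_1(R') = 3$ and $c_2(R') = 7$. The two diagrams therefore agree on the $T_2$ character but are separated by $T_3$, whose normalized characters are $4$ and $-8$ respectively.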

Computing with content polynomials
If we are given $T_2, \cdots, T_k$, what is the smallest $n$ where these fail to generate the centre of $C(S_n)$? This question can be reformulated in terms of the content polynomials. Given $\{c(R), c_2(R), \cdots, c_k(R)\}$, what is the smallest $n$ such that the sequence of these lists, as $R$ runs over Young diagrams with $n$ boxes, has degeneracies, i.e. multiple $R$ have the same list? The experimental answer, for $k$ starting from 1, is displayed in table 4. Up to $n = 5$, the first content polynomial $c(R)$, where $R$ labels the Young diagram, is able to distinguish all $R \vdash n$. The first time $c(R)$ fails to distinguish all Young diagrams is at $n = 6$. However, the set $\{c(R), c_2(R)\}$ is unique to each $R \vdash 6$. These two polynomials together are then able to distinguish all Young diagrams for $n = 6$ up to $n = 14$.

k     first n
1     6
2     15
3     24
4     42
5     80

Table 4. Table showing the smallest values of $n$ for which the first $k$ content polynomials fail to distinguish all Young diagrams, equivalently for which $T_2, \cdots, T_{k+1}$ fail to generate the centre of $C(S_n)$. We give the $n$ values for $k$ from 1 to 5. This table was generated by computing the content polynomials for each irrep $R$ at fixed $n$. Degeneracies in the content polynomials translate to degeneracies in the normalized characters.

The degeneracies for $k = 5$ at $n = 80, 81, 82, 83, 84, 85$ are

3, 0, 2, 2, 11, 12 (5.8)

It is interesting that the degeneracies start off very low. It is useful to look at the degenerate Young diagrams which share the same set of Casimirs when we are at these thresholds of distinguishability.
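The search behind table 4 can be sketched as follows. This Python code is an illustration and not the Mathematica code of appendix A; the function names and the cutoff parameter `n_max` are hypothetical. For each $n$ it forms the vector of the first $k$ content power sums of every partition and reports the first $n$ at which two partitions share a vector.

```python
def partitions(n, max_part=None):
    """Yield the partitions of n as tuples of non-increasing row lengths."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def content_power_sums(partition, k):
    """Vector (c_1, ..., c_k) of content power sums of the Young diagram."""
    contents = [
        j - i for i, row_length in enumerate(partition) for j in range(row_length)
    ]
    return tuple(sum(c ** power for c in contents) for power in range(1, k + 1))


def first_degenerate_n(k, n_max):
    """Smallest n <= n_max where two partitions share (c_1, ..., c_k), else None."""
    for n in range(1, n_max + 1):
        seen = set()
        for R in partitions(n):
            vector = content_power_sums(R, k)
            if vector in seen:
                return n
            seen.add(vector)
    return None


print(first_degenerate_n(1, 10))
print(first_degenerate_n(2, 20))
```

The same call with larger $k$ and `n_max` can be compared against the remaining rows of table 4, although the plain recursion over partitions becomes slow well before $n = 80$.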

Content distribution functions
Every box in a Young diagram has a content $c$ given by $j - i$. For example at $n = 3$, we have 3 Young diagrams, which can be described in terms of their row lengths $[3]$, $[2,1]$ and $[1,1,1]$; their boxes have contents $\{0, 1, 2\}$, $\{0, 1, -1\}$ and $\{0, -1, -2\}$ respectively. The content distribution function (CDF) of a Young diagram assigns to each content $c$ the number of boxes with that content. We now prove that the CDF uniquely determines the Young diagram.

We begin this proof by noting that a Young diagram has a depth $d$. This is the number of boxes along the diagonal at 45 degrees to the horizontal. All the boxes along the diagonal have content 0. Alongside $d$, there is a set of parameters

$$ k^+_1, \cdots, k^+_p , \quad l^+_1, \cdots, l^+_{p-1} , \quad l^-_1, \cdots, l^-_q , \quad k^-_1, \cdots, k^-_{q-1} . $$

These parameters are illustrated in figure 7. Going up from the corner of the deepest box with content 0, we have $k^+_1$ boxes before we reach a corner, then we go $l^+_1$ steps horizontally to get to the next corner, then $k^+_2$ steps vertically to the next corner, then $l^+_2$ steps horizontally. This continues until we have $k^+_p$ steps up for some positive $p$. Similarly, going to the left from the corner of the deepest box of content zero, we have $l^-_1$ steps to the next corner, going down from there we have $k^-_1$ steps to the next corner and so forth, until we get to $l^-_q$ steps left for some positive $q$. These parameters uniquely specify the Young diagram.

Note that the content distributions consist of segments of slope 0, −1, 1. Define three functions of the content $c$, with parameters $k, a, b$. Here $a, b$ are integers with $b \ge a$, and $k$ is a positive integer equal to the value of the function at $a$, so that $\Theta(k, a, b; c = a) = k$. The three functions have slopes 0, 1 and −1 on the interval $a \le c \le b$; the latter two are denoted $\Theta_+$ and $\Theta_-$.

Below: a typical content distribution plot. The content of the boxes along the main diagonal is $c = 0$ and its multiplicity is $d$. The content of the box at the top right most corner of the diagram is $k^+_1 + l^+_1 + \cdots + k^+_p - 1$ and its multiplicity is 1. Similarly, the content of the box at the bottom left most corner is $-(l^-_1 + k^-_1 + \cdots + l^-_q) + 1$, also with multiplicity 1. At the end points $n$ and $-n$, the CDF open string is at zero for all Young diagrams.
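The statement that a CDF determines its Young diagram can be checked directly for small $n$. The Python sketch below uses hypothetical function names and a reconstruction that is one straightforward choice, not necessarily the route taken in the proof. It uses the fact that the boxes of content $c$ fill a diagonal starting in row $\max(0, -c)$ (rows counted from zero) with no gaps, so the multiplicities fix the set of boxes.

```python
from collections import Counter


def partitions(n, max_part=None):
    """Yield the partitions of n as tuples of non-increasing row lengths."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def content_distribution(partition):
    """Map each content c = j - i to the number of boxes with that content."""
    return Counter(
        j - i for i, row_length in enumerate(partition) for j in range(row_length)
    )


def partition_from_cdf(cdf):
    """Rebuild the row lengths of a Young diagram from its content distribution."""
    row_lengths = Counter()
    for c, multiplicity in cdf.items():
        # The diagonal of content c starts in row 0 if c >= 0 and in row -c otherwise,
        # and its boxes occupy consecutive rows.
        first_row = max(0, -c)
        for step in range(multiplicity):
            row_lengths[first_row + step] += 1
    return tuple(row_lengths[i] for i in range(len(row_lengths)))


print(content_distribution((3, 3)))
print(partition_from_cdf(content_distribution((3, 3))))
```

Running the round trip over all partitions of a given $n$ confirms, for that $n$, that no two Young diagrams share a CDF.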

The content distribution function is easily written in terms of these, as the sum of a function $f_+(c)$ and a function $f_-(c)$, each built from the functions $\Theta$ on successive intervals of content.

The Young diagram in equation (5.3) illustrates an example of the counting in $f_+(c)$ and $f_-(c)$. Here, $k^+_1 = 3$, $k^+_2 = 2$, $k^+_3 = 3$, $l^+_1 = 5$, $l^+_2 = 5$, and $l^-_1 = 2$. The function $\Theta_-$ will count the content multiplicity of the shaded regions to the right of, and including, the main diagonal (whose content is zero). It counts content multiplicity from $c = 0$ to $k^+_1 - 1$ (i.e. from $c = 0$ to $c = 2$). Then it counts the multiplicity from $k^+_1 + l^+_1$ to $k^+_1 + l^+_1 + k^+_2 - 1$ (i.e. from $c = 8$ to $c = 9$). Lastly, $\Theta_-$ evaluates the multiplicity for contents $k^+_1 + l^+_1 + k^+_2 + l^+_2$ to $k^+_1 + l^+_1 + k^+_2 + l^+_2 + k^+_3 - 1$ (i.e. from $c = 15$ to $c = 17$). The shaded regions below the main diagonal are handled by the function $\Theta_+$. For this example, it will first count the contents $-1$ to $-l^-_1$ (i.e. from $c = -1$ to $c = -2$), and then continue in the same way through the remaining segments.

Consider now the partitions $[4, 1, 1]$ and $[3, 3]$ of $n = 6$, which have the same $c(R)$. The CDF plots for these two partitions are displayed in figure 8. Define the difference $\Delta$ between the two CDFs for $[4, 1, 1]$ and $[3, 3]$. This is a six dimensional vector, with one entry $\Delta_i$ for each content $i$ from $-2$ to $3$. Obviously it satisfies $\sum_i \Delta_i = 0$ and $\sum_i i \, \Delta_i = 0$. Another degenerate pair of diagrams at $n = 6$ which satisfy the same equations is $[3, 1, 1, 1]$, $[2, 2, 2]$. These are conjugates of the $[4, 1, 1]$, $[3, 3]$ degenerate pair. From this simple example, it seems useful to study the data in terms of these content distribution functions and their differences. Given any two content distribution functions $f_i$ and $f'_i$, define

$$ \Delta_i = f_i - f'_i . $$

These are positive or negative integers, or zero. Since both diagrams have $n$ boxes, these differences have the property that

$$ \sum_i \Delta_i = 0 . $$

Existence of degenerate moments implies that the following equations have non-trivial solutions:

$$ \sum_i i^m \, \Delta_i = 0 , \qquad m = 1, \cdots, k . $$

This looks like a function with a set of vanishing moments up to $k$. For low values of $n$, there are no solutions for $\Delta_i$ to these equations. Functions with vanishing moments up to a certain maximum are studied in the literature on wavelets (see for example [51][52][53]). Discrete wavelets are also an active area of research. So this could be a way to approach an answer to our question.

The problem of extending the sequence $n_*(k)$ displayed in table 4 to larger values of $k$ is very interesting. It is indeed plausible that the CDFs could play a major role in this regard. The degeneracies that arise at $n_*(k)$ capture the failure of the first $k$ content polynomials to distinguish the Young diagrams. These degeneracies are related to the existence of vanishing moment equations for differences of CDFs. Some of these solutions are exhibited using the computer generated data in section 5.4. The CDFs may lead to new approaches for determining $n_*(k)$, drawing on number theory (diophantine equations) and probability theory.
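The vanishing moment conditions can be checked on the $n = 6$ example. The Python sketch below uses hypothetical function names and adopts one sign convention, the CDF of the first diagram minus that of the second, which is an assumption. It computes the difference of two CDFs over the full range of contents present and evaluates its moments.

```python
from collections import Counter


def content_distribution(partition):
    """Map each content c = j - i to the number of boxes with that content."""
    return Counter(
        j - i for i, row_length in enumerate(partition) for j in range(row_length)
    )


def cdf_difference(first, second):
    """Difference of the two CDFs, as a dict from content to integer, contents ascending."""
    f1, f2 = content_distribution(first), content_distribution(second)
    lowest = min(min(f1), min(f2))
    highest = max(max(f1), max(f2))
    # A Counter returns 0 for contents that do not occur in a diagram.
    return {c: f1[c] - f2[c] for c in range(lowest, highest + 1)}


def moment(delta, m):
    """The m-th moment, i.e. the sum over contents c of c**m * delta[c]."""
    return sum(c ** m * value for c, value in delta.items())


delta = cdf_difference((4, 1, 1), (3, 3))
print(list(delta.values()))
print([moment(delta, m) for m in range(3)])
```

For this pair the zeroth and first moments vanish while the second does not, which is the CDF form of the statement that the two diagrams share the $T_2$ character and are separated by $T_3$.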

Summary and outlook
We have proved that central elements $T_2, T_3, \cdots, T_n$ associated with reduced cycle structures $C^{(1)}, C^{(2)}, \cdots, C^{(n-1)}$ generate the centre $Z(C(S_n))$. We then showed that the subset $\{T_2, \cdots, T_k\}$ generates the centre for all $n$ below $n_*(k)$. We used computational methods to determine $n_*(k)$ for $k$ up to 6. For the classes $T_2$, $T_3$ and the collection $\{T_2, T_3\}$, we computed the dimensions of the subspaces generated. We showed that these dimensions are directly encoded in the normalized characters of these conjugacy classes in Young diagrams $R$ with $n$ boxes, which are in turn related to Casimir eigenvalues for the U(N) representation associated with $R$. The multiplicities of the normalized characters can be used to quantify the amount of information available with specified sets such as $\{T_2\}$, $\{T_3\}$ or $\{T_2, T_3\}$, using Shannon entropy functions. These entropies were calculated and led to the conclusion that the dimensions as well as the entropies give sensible measures of the amount of information available with the normalized characters, or equivalently with the specified Casimir charges. We presented some conjectures on the large $n$ behaviours of relative dimensions and entropies for $T_2$ and $T_3$, based on the plausible expectation that at large $n$ the different measures should give the same ranking.

We have observed that the power-sums of contents can be viewed as moments of a content distribution function. This is simply a discrete function on a finite set of points in the range $[-n + 1, n - 1]$, which was shown to uniquely determine a Young diagram. Some initial steps have been made in the direction of using CDFs to understand the degeneracies between normalized characters that occur for $n$ just above $n_*(k)$. We have observed that differences of Young diagram CDFs obey some diophantine vanishing moment conditions above $n_*(k)$. Using computational work, we obtained some instances of these solutions to diophantine equations. It is reasonable to expect that a combination of techniques from combinatorics and number theory will, in the future, allow a general analytic treatment of these diophantine equations for general $k$ and provide further information on $n_*(k)$ as $k$ increases.

The centre $Z(C(S_n))$ is one of a class of interesting permutation centralizer algebras which are relevant to multi-matrix and tensor invariants [27,29]. There is a 2-parameter algebra $A(m, n)$ relevant to the 2-matrix system with U(N) gauge symmetry, with structure closely related to Littlewood-Richardson coefficients. There is a 1-parameter algebra $K(n)$ relevant to 3-index tensor systems, which is closely related to Kronecker coefficients. The algebras $A(m, n)$ were recently used to derive identities involving contents of Young diagrams, which have applications in quantum information processing tasks [55]: the present paper is another link between permutation algebras and information theoretic perspectives, directly motivated in the present case by information theoretic questions in AdS/CFT. Another connection between the 2-matrix system and small black holes in AdS/CFT is proposed in [56]. It is evident that we are only beginning to scratch the surface of the story linking AdS/CFT, information and permutation algebras. Analogous algebras play a role in matrix/tensor systems with O(N)/Sp(N) symmetry [57][58][59][60][61][62]. The results of this paper, developing the connection between Casimirs and the structure of permutation algebras, should admit a generalization to these cases. It will be fascinating to explore these systems using the combination of analytic and computational techniques we have used here, to generate sequences analogous to $n_*(k)$ for these cases.

A Mathematica code
We begin by writing code to calculate the content polynomial, or content power sum, for a partition specified by $P$. The integer $k$ specifies the power of the terms in the sum, while $P$ specifies the actual partition. For example, taking partitions of 3 and $k = 2$, the above code computes, for the three partitions (3), (2, 1) and (1, 1, 1), the values 5, 2 and 5.

The definition in the second line of code, "ListContentPowerSums", gives a list of these power sums that runs over all the partitions of $n$. After running the above code for $k = 1$ and $n = 6$, we find:

In[3]:= ListContentPowerSums[1, 6]
Out[3]= {15, 9, 5, 3, 3, 0, -3, -3, -5, -9, -15}

Now we wish to compare the lists of the content power sums for different values of $k$. Below, $S$ will be a set of positive integers specifying the powers of contents to be summed over; this function will produce the list of vectors of content power sums for the partitions of $n$, with the powers specified by $S$. We can see that the list $\{c(R), c_2(R)\}$ contains no degeneracies at $n = 6$.

The code to compute codimension data for $T_2$, then $T_3$ and then for $\{T_2, T_3\}$ respectively is found below. The idea here is simply to generate the list of content polynomials, and to count the degeneracies by subtracting from the length of the list its length after duplicates have been deleted. The above code generates codimension data for $n$ from $n = 2$ to $n = 70$.
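For readers without Mathematica, the functions described in this appendix can be approximated in Python. The following is a sketch under stated assumptions, not a transcription of the original code: the function names are Python renderings chosen here, and the partitions are generated in reverse lexicographic order so that the output lines up with the session shown above.

```python
def partitions(n, max_part=None):
    """Yield the partitions of n in reverse lexicographic order."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def content_power_sum(k, partition):
    """Sum of (j - i)**k over the boxes in row i, column j of the partition."""
    return sum(
        (j - i) ** k
        for i, row_length in enumerate(partition)
        for j in range(row_length)
    )


def list_content_power_sums(k, n):
    """Content power sums of degree k for all partitions of n."""
    return [content_power_sum(k, P) for P in partitions(n)]


def list_content_power_sum_vectors(powers, n):
    """For each partition of n, the vector of power sums with the given powers."""
    return [tuple(content_power_sum(k, P) for k in powers) for P in partitions(n)]


def codimension(powers, n):
    """Length of the list of vectors minus its length after deleting duplicates."""
    vectors = list_content_power_sum_vectors(powers, n)
    return len(vectors) - len(set(vectors))


print(list_content_power_sums(2, 3))
print(list_content_power_sums(1, 6))
print(codimension((1,), 6), codimension((2,), 6), codimension((1, 2), 6))
```

The last line prints the $n = 6$ codimensions for $T_2$, $T_3$ and $\{T_2, T_3\}$, which can be compared with the corresponding row of table 1.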
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.