Symmetry breaking to Majorana Brown-Susskind metric

In parts I [1] and II [2] of our earlier work, we studied how metrics $g_{ij}$ on $\mathfrak{su}(n)$ may spontaneously break symmetry and crystallize into a form which is kaq, i.e. knows about qubits. We did this for $n = 2^N$ and then away from powers of 2. Here we address the Fermionic version and find kam metrics, which know about Majoranas. That is, there is a basis of principal axes $\{H_k\}$, each of which is of homogeneous Majorana degree. In part I, we searched unsuccessfully for functional minima representing crystallized metrics exhibiting the Brown-Susskind penalty schedule, motivated by their study of black hole scrambling time. Here, by segueing to the Fermionic setting, we find, to good approximation, kam metrics adhering to this schedule on both $\mathfrak{su}(4)$ and $\mathfrak{su}(8)$. Thus, with this preliminary finding, our toy model exhibits two of the three features required for the spontaneous emergence of spatial structure: (1) localized degrees of freedom and (2) a preference for low body-number (or low Majorana number) interactions. The final feature, (3) constraints on who may interact with whom, i.e. a neighborhood structure, must await an effective analytic technique, being entirely beyond what we can approach with classical numerics.


Introduction
The most symmetrical metric on su(n) is the Killing form $\langle H, H \rangle = -2n\,\mathrm{tr}(H^2)$. This metric is adjoint-invariant and induces a left-invariant metric on SU(n) of diameter π. Less symmetric metrics $g_{ij}$ on su(n) are motivated by quantum compiling and black hole physics. Both contexts suggest metrics diagonal in the Pauli-word basis [3][4][5][6][7]. In this paper $n = 2^N$ and a Pauli word is an N-fold tensor product with a 1, X, Y, or Z in each slot. The weight w of a word is its number of non-identity letters, e.g. a word with four non-identity letters among its eight slots is a word of weight w = 4 in su($2^8$). Low weight directions are both the practical directions along which to evolve a quantum state in a quantum computer and, according to the SYK model [8] (in its Bosonic version), the principal directions of black hole evolution [6,7]. These constraints direct our attention to the Brown-Susskind exponential penalty metrics:

$$ g_{ij} = \delta_{ij}\, e^{\mathrm{const.}\,\cdot\,\mathrm{weight}(i)}. \qquad (1.1) $$
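As a small illustration of eq. (1.1), the weight of a Pauli word and the corresponding diagonal penalty can be sketched in Python. The string encoding of words, the function names, and the value of the constant `c` are assumptions made for this example only; they are not taken from [3-7].

```python
import itertools
import math

def pauli_weight(word):
    """Number of non-identity letters in a Pauli word such as '1X1Y'."""
    return sum(1 for letter in word if letter != "1")

def brown_susskind_diagonal(N, c):
    """Diagonal entries exp(c * weight) for all non-identity Pauli words on N qubits.

    The constant c is a free (hypothetical) parameter of this sketch.
    """
    words = ("".join(w) for w in itertools.product("1XYZ", repeat=N))
    return {w: math.exp(c * pauli_weight(w)) for w in words if pauli_weight(w) > 0}

# su(4): N = 2 qubits, 4**2 - 1 = 15 basis directions.
penalties = brown_susskind_diagonal(N=2, c=1.0)
```

Each key of `penalties` labels one basis direction of su(4), and the value is the corresponding diagonal entry of the metric.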

JHEP04(2022)041
In parts I and II [1,2], we studied functionals on the space of metrics, recalled briefly in section 2. The chief finding was that a surprising number of local minima create and respect a tensor product structure on the underlying Hilbert space. We called these metrics kaq for knows about qubits and the process of falling into such a minimum a metric crystallization in analogy to the formation of crystals through spontaneous symmetry breaking. However, in part I, we did not find crystallized metrics of Brown-Susskind type eq. (1.1), either from random initialization or from seeds of exactly that form. Indeed, initializing in metrics obeying eq. (1.1) always led through gradient descent to unrelated minima.
But the SYK model appears most useful in Fermionic form, where 2-body interactions appear as quartic Majorana terms with random couplings,

$$ H = \sum_{i<j<k<l} J_{ijkl}\, \gamma_i \gamma_j \gamma_k \gamma_l. \qquad (1.2) $$

Correspondingly, a Fermionic analogue of the Brown-Susskind metric would have a first (random) term like eq. (1.2) and continue with higher order interactions with exponentially decaying coefficients. We observe local minima of this form on both su(4) and su(8). It is true that finding Fermionic Brown-Susskind minima requires careful seeding, but similar care on the qubit side failed to reach metrics of the form eq. (1.1). This suggests that Fermionic Brown-Susskind metrics arise from spontaneous symmetry breaking, and makes them natural candidates for $H_{\mathrm{initial}}$, the initial Hamiltonian of the universe, at least within the toy model under discussion. In [9], the concept of a universal critical metric is developed axiomatically. Our data here may reflect critical metrics with a roughly exponential structure on both su(4) and su(8). In both cases we see a brief dip in eigenvalues prior to their exponential growth. What stands out is the preference for low degree Majorana monomials rather than the precise exponential form.
We have already described the Pauli word basis; here, in more detail, is the Majorana basis for SU($2^N$). Just as in [1,2], where there was a variable isomorphism $J : (\mathbb{C}^2)^{\otimes N} \xrightarrow{\cong} \mathbb{C}^{2^N}$ (and the induced $j : \mathrm{Her}(2)^{\otimes N} \xrightarrow{\cong} \mathrm{Her}(2^N)$) in the background, here there is also the same choice. We may conjugate by any $U \in \mathrm{SU}(2^N)$ to transform the coordinates on su($2^N$). To have a precise *-isomorphism, we need to identify the complexified real Clifford (Majorana) algebra with $\mathfrak{u}(2^N) \otimes_{\mathbb{R}} \mathbb{C}$, represented as follows:
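One standard way to realize such an identification is the Jordan-Wigner construction. The sketch below assumes that convention, which may differ from the representation used in this paper by a change of basis, and builds the $n^2 - 1$ Hermitian Majorana monomials of degree 1 through 2N for $n = 2^N$. The function names and the normalization $\mathrm{tr}(AB) = \delta_{AB}$ are choices made for this illustration.

```python
import itertools
import numpy as np

PAULI = {
    "1": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def word(letters):
    """Tensor product of Pauli letters, e.g. word('ZX1')."""
    out = np.eye(1, dtype=complex)
    for c in letters:
        out = np.kron(out, PAULI[c])
    return out

def majorana_generators(N):
    """2N Hermitian matrices with gamma_a gamma_b + gamma_b gamma_a = 2 delta_ab.

    Jordan-Wigner convention (an assumption of this sketch).
    """
    gammas = []
    for k in range(N):
        for letter in "XY":
            gammas.append(word("Z" * k + letter + "1" * (N - k - 1)))
    return gammas

def majorana_basis(N):
    """Map degree q -> list of Hermitian monomials, normalized so tr(A B) = delta_AB."""
    n = 2 ** N
    gammas = majorana_generators(N)
    basis = {}
    for q in range(1, 2 * N + 1):
        # A product of q anticommuting Hermitian generators satisfies
        # M^dagger = (-1)^(q(q-1)/2) M; this power of sqrt(-1) makes it Hermitian.
        phase = (1, 1j, -1, -1j)[(q * (q - 1) // 2) % 4]
        basis[q] = []
        for idx in itertools.combinations(range(2 * N), q):
            m = np.eye(n, dtype=complex)
            for a in idx:
                m = m @ gammas[a]
            basis[q].append(phase * m / np.sqrt(n))
    return basis

basis = majorana_basis(2)  # su(4): degrees 1..4
```

For N = 2 the degree batches have sizes (4, 6, 4, 1), matching the Majorana degree degeneracy pattern used for su(4).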

Note. In many applications, operators of odd Majorana degree are not physical because their application would violate the Fermionic parity super-selection rule. There is no similar issue here; we are simply using Γ-matrices to write out a basis for a Lie algebra. Both even and odd degree Majorana operators (after judicious insertion of powers of $\sqrt{-1}$) are legitimate basis elements of su($2^N$).
Although we hope to study metric crystallization analytically in the future (e.g. the link [10] between SU(n) and 2D Hamiltonian systems might provide an avenue), this paper is numerical, and we must allow some tolerance around the ideal definition. If, from a random seed, we find a metric $g_{ij}$ such that a related conjugate metric $g'_{ij}$ has 80% or more of a principal axes basis, each with at least 95% of its $L^2$-norm concentrated in a single homogeneous degree subspace, we consider the minimum to be kam.

In [1], we did dimension counting to demonstrate the rarity of kaq metrics. To make a similar case for the rarity of kam or kaq metrics, up to an exponential tolerance, one ideally would estimate the phase space volume satisfying our acceptance rule. Although undoubtedly tiny, a rigorous estimate would require a feat of algebraic geometry. We instead adopt an expedient. To complement the roughly 100 gradient descents from fully random or random Majorana diagonal seeds carried out for this study, we randomly generate a similar number of metrics $g_{ij}$ to be used as a control group. These $g_{ij}$ are random except for the specification of the principal axis degeneracy pattern, which we chose to mimic Majorana degree degeneracies, e.g. (4,6,4,1) in the case of su(4) and (6,15,20,15,6,1) in the case of su(8). Then we search the possible conjugate metrics $g'_{ij}$, as above, for accidental kam structure. In no case was our experimental criterion close to being met.

Before giving the details of our search methods and the results, we should explain that the search is done from three qualitatively different types of initial metrics $g_{ij}$ (see [1, section 3] for a fully detailed list):

1. Fully random seed ([1, GenPerturbId]). Here the metric is selected from a Gaussian centered at the ad-invariant Killing metric, which we call $1_n$, since it appears as the identity when written in either a Pauli-word-weight or a Majorana-degree basis. For reasons of numerical stability we choose a Gaussian of small variance and generally use a slow learning rate to avoid inadvertently jumping over nearby local minima.
2. Random Majorana diagonal seed. Here the initial metric is diagonal in the Majorana basis.
3. Batched diagonal seed ([1, BatchedDiagPerturbId]). Here the initial metric is diagonal in the Majorana basis, with a common eigenvalue for all monomials of the same Majorana degree.

In case (1), $g_{ij}$ has no initial structure, so the emergence of kaq or kam metrics is most surprising. Cases (2) and (3) increasingly "stack the deck", making it easier to locate local minima of interest. The functionals we study (section 2) are on spaces of metrics having hundreds of dimensions (2015 dimensions for su(8)) and many local minima, and require such initialization to fully explore. In cases (2) and (3), what we are looking for is strictly unforced behavior. In case (2), this would be the formation of eigenvalue degeneracies associated to Majorana degree and perhaps sub-Lie-algebra structures (although these were not found from Majorana initializations). In case (3), the independent variables are the eigenvalues, or squared lengths, of the principal axes. The finding highlighted above of Fermionic Brown-Susskind metrics was the result of a type (3) initialization.
It is natural to inquire whether this finding could be due to chance. Although we do not have enough data for a careful statistical study, for comparison, for each of the 7 functionals analyzed, we generated 10,000 random functions f of {1, . . . , 6} to represent possible eigenvalues at the local minimum for the batches of degree d eigenvalues, 1 ≤ d ≤ 6, corresponding to the Majorana basis for su(8). Each value was selected uniformly between the smallest and largest eigenvalues seen in our actual runs. For each function, the loss for the best $L^2$-fit to the exponential form was evaluated and compared to the mean loss of our actual runs. One functional particularly stands out as always giving approximately a Fermionic Brown-Susskind structure in its local minima, while others struggle to do so. We refer to section 4.3 for more details.

Review of functional
We review the perturbed Gaussian integral (inspired by [11]) used to define the functionals in [1], which we denote $F_k$. The real and imaginary parts of $F_k$ are the quantities of interest; their explicit expressions are given in [1, section 2.1].

From the two functionals above, we derive two functionals called $F_{26}(c, g, k)$ and $F_{24}(c, g, k)$. The subscripts denote how far the perturbative expansion is computed. For $F_{26}$, we compute the 2 and 6 vertex diagrams, and for $F_{24}$, the 2 and 4 vertex diagrams. The details are discussed in [1, section 2.1] and also reviewed in [2, section 3].
To find the local minima of these functionals, we need to fix a volume for g, i.e. set det g = 1. To enforce this condition, it is numerically more stable to take a Lagrangian approach instead of normalizing by det(g) = 1 [1], with parameters $r_1 \geq 1$ and $r_2$ whose values are listed in [1, tables 1-2]. Gradient descent on these two functionals yields the solution metrics we analyze for kamness. Our numerics always work through the Feynman-Penrose asymptotic expansion; the integral itself is oscillating and approaching it through Riemann sums would not be fruitful. As mentioned before, these solutions generally have highly degenerate eigenspaces. For N = 2, 3 we refer to [1, tables 1-2 (GenPerturbId)] for the values chosen for k, $r_1$, $r_2$. We simply note that for N = 2 we always choose k = 100, 200 and for N = 3 we choose k = 500, 1000. Here, as in Chern-Simons theory [11], 1/k serves as an expansion parameter, as it controls the relative weights of the quadratic and cubic terms. The ability to pick 1/k small stabilizes the numerics.

Kam loss function
We want to design a loss function whose global minimum is 0 if and only if the solution g is kam.

Identifying the parameters of the loss function
Following definition 1.1, there are two sources for the parameters of such a loss function, exactly as for the kaq loss function defined in [2, section 3.2]. The first set of parameters describes the conjugation of the eigenbasis by some U ∈ U(n). Next, note that the choice of basis for a degenerate eigenspace is not unique, so a degenerate eigenspace of dimension d affords an independent change of basis. This leads to the second source of parameters of our loss function, which describe an orthogonal matrix V ∈ O(d). The total number of parameters is $\dim U(n) + \sum_i \dim O(d_i) = n^2 + \sum_i d_i(d_i - 1)/2$, where $(d_1, \ldots, d_m)$ is the degeneracy pattern of g. We use θ to denote all these parameters.

Computing projection to homogeneous Majorana spaces
After the above two transformations on the eigenstates, abusing notation, let the new eigenstates be $\{iH_1, \ldots, iH_{n^2-1}\}$. Then we compute $v_{p,q}$, the squared norm of the projection of $H_p$, $1 \leq p \leq n^2 - 1$, onto the homogeneous Majorana space of degree q. We compute this as we would for the projection of a vector onto a subspace given by an orthonormal basis. Here, the subspace is given by the orthonormal basis $\{\gamma_{i_1} \cdots \gamma_{i_q}\}_{1 \leq i_1 < \ldots < i_q \leq 2N}$.
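The projection step can be sketched as follows. The Majorana monomials are built with a Jordan-Wigner convention and normalized so that $\mathrm{tr}(AB) = \delta_{AB}$; both choices, and all function names, are assumptions of this example rather than the implementation used for the results reported here.

```python
import itertools
import numpy as np

PAULI = {
    "1": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def word(letters):
    out = np.eye(1, dtype=complex)
    for c in letters:
        out = np.kron(out, PAULI[c])
    return out

def majorana_basis(N):
    """Map degree q -> orthonormal Hermitian Majorana monomials (assumed convention)."""
    n = 2 ** N
    gammas = [word("Z" * k + s + "1" * (N - k - 1)) for k in range(N) for s in "XY"]
    basis = {}
    for q in range(1, 2 * N + 1):
        phase = (1, 1j, -1, -1j)[(q * (q - 1) // 2) % 4]  # makes the monomial Hermitian
        basis[q] = []
        for idx in itertools.combinations(range(2 * N), q):
            m = np.eye(n, dtype=complex)
            for a in idx:
                m = m @ gammas[a]
            basis[q].append(phase * m / np.sqrt(n))
    return basis

def projection_norms(H, basis):
    """v_q = squared norm of the projection of H onto the degree-q subspace."""
    return {q: float(sum(abs(np.trace(b.conj().T @ H)) ** 2 for b in mats))
            for q, mats in basis.items()}

basis = majorana_basis(2)
rng = np.random.default_rng(0)

# A unit-norm axis lying entirely in degree 2: v_2 = 1 and all other v_q = 0.
coeffs = rng.standard_normal(len(basis[2]))
coeffs /= np.linalg.norm(coeffs)
H_pure = sum(c * b for c, b in zip(coeffs, basis[2]))
v_pure = projection_norms(H_pure, basis)

# A generic traceless Hermitian unit vector: the v_q are spread out but sum to 1.
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H_mixed = (A + A.conj().T) / 2
H_mixed -= np.trace(H_mixed) / 4 * np.eye(4)
H_mixed /= np.linalg.norm(H_mixed)
v_mixed = projection_norms(H_mixed, basis)
```

Here `v_pure` illustrates a principal axis of homogeneous Majorana degree, while `v_mixed` illustrates the generic situation in which the norm is shared among several degrees.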

Formula for the loss
For each p, since the homogeneous Majorana spaces of degree q span the whole Hermitian matrix space, we have $\sum_q v_{p,q} = \|H_p\|_2^2 = 1$. Clearly, we would like one of the projection norms to be one, and thus the rest to be zero. To have a loss function $L_\theta(g)$ whose minimum is described by such a configuration, we define a loss $L_\theta(H_p)$ for each principal axis and combine these into $L_\theta(g)$. It is not hard to see that $L_\theta(g) = 0$ for some parameters θ iff $\forall p : L_\theta(H_p) = 0$ iff $\forall p\, \exists q : v_{p,q} = 1$, i.e. g is kam. There are other possible designs for $L_\theta$, like the sum of $(1 - v_{p,q})^2$ for $L_\theta(H_p)$, which, we note, changes the global minimum of $L_\theta$ when g is kam. We tried these other formulae and they did not give us any other kam solutions.

Kam solutions tables
This section closely follows [2, section 4.2]: the local minima found in [1] for N = 2, 3 through a GenPerturbId search are listed in tables 1-4 by their degeneracy patterns and their kamness.

Remarks on the results
1. When a solution is declared to be kam, the value of $L_\theta$ is very low, smaller than $10^{-3}$, and as a result all $L(H_p)$ are smaller than $10^{-4}$. On the other hand, in our experience, there has been a clear line between kam and non-kam solutions: for the latter, $L_\theta$ is at least 1 (or in most cases, especially for N = 3, much larger than 1).

Remark 3.1. Compared to the kaqness results in [2], we see that it is much easier to find a kaq solution than a kam one. Nevertheless, both are rare, as shown in the next section.

Random kams
As discussed before, we take 100 randomly generated metrics $g_{ij}$ with a Majorana degree degeneracy pattern. We do so by first randomly generating a diagonal metric with such a pattern, and then conjugating it by a random orthogonal matrix. As a result, the random metric has the Majorana degree degeneracy pattern, but whether it is kam or not depends on the random orthogonal matrix. After running gradient descent for each of the 100 randomly generated metrics, we found no instance of kam:
• For su(4): the vast majority, 97/100, had L(g) > 2, and three had loss ∼ 1.5. Even for those three random metrics, none of the $L(H_p)$ were smaller than 0.002, meaning no eigenstate met our criterion ($10^{-4}$) for lying in a homogeneous Majorana degree subspace at the minimum for L(g). This stands in stark contrast with the non-kam pattern found in table 1 for k = 200.
• For su(8): the lowest loss was L(g) ∼ 12, with the least $L(H_p)$ being 0.01. Again, this is in contrast with the non-kam pattern found in table 4 for k = 200.
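The construction of such a control metric can be sketched as below. The eigenvalue distribution (uniform on a hypothetical interval `[low, high]`), the rescaling to det g = 1, and the function names are assumptions of this illustration; the text above only specifies a random diagonal metric with the given pattern conjugated by a random orthogonal matrix.

```python
import numpy as np

def random_orthogonal(dim, rng):
    """Orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Fixing the signs of the diagonal of r makes the distribution Haar on O(dim).
    return q * np.sign(np.diag(r))

def random_control_metric(pattern, rng, low=0.5, high=2.0):
    """Random symmetric positive metric whose eigenvalue multiplicities follow `pattern`."""
    values = rng.uniform(low, high, size=len(pattern))  # one eigenvalue per batch (assumed)
    diag = np.repeat(values, pattern)
    diag = diag / np.exp(np.mean(np.log(diag)))         # rescale so that det g = 1
    O = random_orthogonal(len(diag), rng)
    return O @ np.diag(diag) @ O.T

rng = np.random.default_rng(0)
g = random_control_metric((4, 6, 4, 1), rng)  # su(4) Majorana degree pattern
```

The resulting `g` has the degeneracy pattern (4, 6, 4, 1) by construction, while its principal axes are in generic position relative to the Majorana basis.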

Random kaqs
Similarly, we do random simulations to search for kaq patterns. For notation, we refer to [2, section 3.2]:
• For su(4): the degeneracy pattern is (6,9). For the vast majority, $L_{\mathrm{kaq}}(g) > 2$, with a few ∼ 1. The lowest entropy $s_j$ was 0.002, meaning no eigenstate $H_j$ could be factored into a tensor product (we have a $10^{-4}$ criterion, similar to $L(H_p)$ for kamness). This is also in contrast with the non-kaq solutions found in [2], which had the vast majority of their eigenstates factorized.
• For su(8): note that we searched for partial-kaq, i.e. a $\mathbb{C}^4 \otimes \mathbb{C}^2$ decomposition, which is more likely to occur than a kaq decomposition $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2$. The typical loss was ∼ 35, but if two eigenspaces out of the three (9, 27, 27) were merged, thus giving more degrees of freedom to find a kaq configuration, then the loss dropped to 12, with no $H_p$ being factorized. Again, this is in contrast with the non-kaq pattern found in [2].

The search setting
To search for Fermionic Brown-Susskind (FBS) metrics, as discussed previously in item 3, we use the BatchedDiagPerturbId method. Furthermore, we make our batched diagonal initialization on a random FBS metric determined by a weight w, i.e. $g_{ij} = \delta_{ij}\, w^{\mathrm{weight}(i)}$, where weight(i) is the Majorana degree of basis element i. Although we proved in [1] that gradient descent preserves the degeneracy pattern, it is in no way bound to preserve the FBS nature of the metric, nor the ascending order of the eigenvalues, as illustrated later. Indeed, the parameters of the gradient descent are the weights $w_d$ for degree d Majorana monomials, and not w (although initially $w_d = w^d$). Thus, of the solutions obtained, some exhibit approximately an FBS structure, while some do not. Of those that do, the data show a small dip prior to the exponential rise; we have speculated that this could be a signature of the critical metric proposed in [9].

In the case of su(4), the 12 searches we did all turned up the same local minimum. To see how rare this single exponential fit loss is, we can randomly simulate 10,000 functions as in section 4.3, but with the additional constraints that $0.85\,\max \leq \max_{1 \leq j \leq 2N} f(j) \leq \max$ and that the exponent b > 0. The loss of this solution is then higher than only 0.84% of the random losses. Note that these additional constraints make the comparison fairer, as a higher maximum eigenvalue generally means a higher $L^2$-loss (since the loss is not scale-invariant), and requiring b > 0 ensures we look at samples that have an overall increasing set of eigenvalues, as ours does.
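A minimal sketch of such an FBS seed, restricted to its diagonal in a Majorana monomial basis, is given below. The function name, the optional rescaling to det g = 1, and the sample value of w are assumptions of this example, and no perturbation of the seed is applied.

```python
import math
import numpy as np

def fbs_seed(N, w, normalize=True):
    """Diagonal of g_ij = delta_ij * w**degree(i) on su(2^N), batched by Majorana degree."""
    # There are C(2N, d) Majorana monomials of degree d, for d = 1..2N.
    degrees = np.concatenate(
        [np.full(math.comb(2 * N, d), d) for d in range(1, 2 * N + 1)]
    )
    diag = float(w) ** degrees
    if normalize:
        diag = diag / np.exp(np.mean(np.log(diag)))  # rescale so that det g = 1
    return degrees, diag

degrees, diag = fbs_seed(N=3, w=1.3)  # su(8); w = 1.3 is a hypothetical choice
```

Successive degree batches differ by the constant factor w, so the eigenvalues grow exponentially in the Majorana degree at initialization; gradient descent then updates one value per batch.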

FBS solutions graph
In figure 1 and figure 2, we plot the solutions that exhibited an approximate FBS structure. We favor these plots since they have some of the highest maximum eigenvalues compared to other plots, and yet achieve relatively low loss. In addition to scatter plotting the eigenvalues, we have plotted the best exponential fit.

Not-FBS solutions graph
Below, in figure 3 and figure 4, we also reproduce the solutions that were far away from an exponential fit.

Null hypothesis: searching for random FBS
To see how rare the exponential fit loss of the aforementioned graphs is, we review the process mentioned in the introduction: we generate 10,000 random functions f of {1, . . . , 2N} representing possible eigenvalues at the local minimum for the batches of degree d eigenvalues, 1 ≤ d ≤ 2N, corresponding to the Majorana basis for su($2^N$).
The value f(i) is selected uniformly from the interval [min, max] between the smallest (min) and largest (max) eigenvalues seen in our previous runs for each functional. In total, there were 7 functionals: four for su(4), namely $L_{24}$ and $L_{26}$ with k = 100, 200, and three for su(8), namely $L_{24}$ (k = 1000) and $L_{26}$ with k = 500, 1000. For each of these 7 functionals, we have a [min, max] interval, giving 7 times 10,000 random functions in total.

Figure caption: The top left and right graphs are for the same solution, and the right graph shows only the eigenvalues and the fitted curve in order to better illustrate the bad fit. As a comparison, the fit loss of the top graph was larger than 70% of the respective random losses found in section 4.3.

Finally, we pick our values so that the condition det = 1 is enforced. To do so, we simply take the logs of f(j) and turn the sampling problem into a convex body sampling problem, for which there are many available methods and packages such as PyMC3 in Python. We should note that uniform sampling from the set of f(j) satisfying those constraints is not exactly equivalent to uniform sampling from the convex body formed by the log(f(j)), as taking the logarithm is a change of coordinates. To the best of our knowledge, packages such as PyMC3 can only be rigorously applied to convex body sampling. Nevertheless, from this study and side-experiments, we believe that the occurrences of local minima with exponential-like (Fermionic Brown-Susskind) growth are not random events but reflect a genuine propensity. Our experimental design does not enable us to claim this result with a precise confidence interval, for example the gold standard of 5 sigma, but we regard it as trustworthy.

For each random function, the loss for the best $L^2$-fit to the exponential form, with const. ≥ 0, was evaluated. How do we compare these numbers with those of the actual runs? Assume the actual runs for a functional gave r distinct solutions (e.g. r = 5 for $L_{24}$ (k = 1000)), with mean loss $l_{\mathrm{mean}}$. We estimate the distribution of the mean loss of r choices, i.e. $\frac{1}{r}\sum_{i=1}^{r} l_{j_i}$, where $1 \leq j_1 < \ldots < j_r \leq 10^4$, from the random losses $\{l_i\}_{i=1}^{10^4}$. Then, we can see whether our r actual losses are in general a rare r-sample of the random losses.
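The comparison just described can be sketched as follows. Several ingredients are assumptions of this illustration and are not taken from the text: the two-parameter form a·exp(b·d) with b ≥ 0, the grid search over b, the interval [0.5, 3.0], the placeholder value used for the actual mean loss, and the reduced sample sizes. The det = 1 constraint on the random functions is ignored here.

```python
import numpy as np

def exp_fit_loss(values, b_grid=np.linspace(0.0, 5.0, 2001)):
    """Smallest squared L2 error of a fit a*exp(b*d), b >= 0, at d = 1..len(values).

    The fitted form and the grid over b are assumptions of this sketch.
    """
    values = np.asarray(values, dtype=float)
    d = np.arange(1, len(values) + 1)
    E = np.exp(np.outer(b_grid, d))               # one row of exp(b*d) per candidate b
    a = (E @ values) / np.sum(E * E, axis=1)      # least-squares amplitude for each b
    residual = a[:, None] * E - values[None, :]
    return float(np.min(np.sum(residual ** 2, axis=1)))

def random_losses(num_degrees, lo, hi, num_samples, rng):
    """Fit losses of random functions f: {1..num_degrees} -> [lo, hi], uniform entries."""
    samples = rng.uniform(lo, hi, size=(num_samples, num_degrees))
    return np.array([exp_fit_loss(f) for f in samples])

def subset_mean_percentile(losses, r, actual_mean, num_subsets, rng):
    """Percentage of random r-subsets of `losses` whose mean lies below `actual_mean`."""
    below = 0
    for _ in range(num_subsets):
        subset = rng.choice(losses, size=r, replace=False)
        below += int(subset.mean() < actual_mean)
    return 100.0 * below / num_subsets

rng = np.random.default_rng(0)
losses = random_losses(num_degrees=6, lo=0.5, hi=3.0, num_samples=1000, rng=rng)
placeholder_mean = float(np.median(losses))       # stands in for an actual l_mean
pct = subset_mean_percentile(losses, r=5, actual_mean=placeholder_mean,
                             num_subsets=2000, rng=rng)
```

A small value of `pct` for a measured mean loss would indicate that the r actual losses form a rare r-sample of the random losses.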
In figure 5 and figure 6, we show how many of the random losses are less than $l_{\mathrm{mean}}$, which is different from the r-sampling problem just discussed. In the plots, the orange (respectively black) color shows the percentage of the random losses that are less than the mean loss $l_{\mathrm{mean}}$ (respectively the minimum loss) of the actual runs for that functional. We include the mean loss for completeness, but we believe the number most relevant to the argument we are making is the smallest loss, since it is not all local minima, but rather some O(1) fraction of them, which exhibit a close FBS structure.

We sample 100 million mean losses of subsets of size r from the 10,000 losses for each of the 7 functionals. For each functional, we list the percentage of the samples that were lower than the actual $l_{\mathrm{mean}}$ and the number of distinct solutions found.

In summary, we conclude that the graphs in figure 1 and figure 2, derived for su(4) from local minima of the functional $L_{24}$ (k = 200) and for su(8) from local minima of the functional $L_{24}$ (k = 1000), exhibit a close Fermionic Brown-Susskind structure. We see a clear contrast between the $L_{24}$ and $L_{26}$ type functionals in this case, with only the former having local minima exhibiting a close FBS structure. This is the clearest distinction yet observed (referring back to [1,2]) between the functional types $L_{24}$ (defined with an "imaginary time" exponential $e^{-k\ldots}$) and $L_{26}$ (defined with a "real time" exponential $e^{ik\ldots}$), and provides important feedback on the class of symmetry breaking functionals to be considered in future work.