A Spectral Metric for Collider Geometry

By quantifying the distance between two collider events, one can triangulate a metric space and reframe collider data analysis as computational geometry. One popular geometric approach is to first represent events as an energy flow on an idealized celestial sphere and then define the metric in terms of optimal transport in two dimensions. In this paper, we advocate for representing events in terms of a spectral function that encodes pairwise particle angles and products of particle energies, which enables a metric distance defined in terms of one-dimensional optimal transport. This approach has the advantage of automatically incorporating obvious isometries of the data, like rotations about the colliding beam axis. It also facilitates first-principles calculations, since there are simple closed-form expressions for optimal transport in one dimension. Up to isometries and event sets of measure zero, the spectral representation is unique, so the metric on the space of spectral functions is a metric on the space of events. At lowest order in perturbation theory in electron-positron collisions, our metric is simply the summed squared invariant masses of the two event hemispheres. Going to higher orders, we present predictions for the distribution of metric distances between jets in fixed-order and resummed perturbation theory as well as in parton-shower generators. Finally, we speculate on whether the spectral approach could furnish a useful metric on the space of quantum field theories.


Introduction
Events recorded in high-energy collider experiments populate an intricate and fascinating data manifold. Naïvely, this manifold has dimension $3N-4$, corresponding to the dimensionality of relativistic phase space for $N$ observed on-shell final-state particles. For proton-proton collisions at the Large Hadron Collider (LHC), phase space can occupy hundreds of dimensions; for collisions of heavy ions, the phase space dimensionality can easily be in the thousands or even tens of thousands. Such a space is much too large to be interpreted and visualized directly, so most collider physics analyses involve some kind of data reduction. Machine learning techniques have become increasingly popular in this context, since they can efficiently identify event structures according to the particular problem one wishes to solve; see e.g. Refs. [1][2][3][4][5][6][7][8][9] for recent reviews.
At the same time, there has been a growing interest in studying the manifold of collider data as a geometric object in its own right. Considering the space of collider events as an abstract manifold means that one can evaluate quantities that encode properties of that manifold, such as its topology or local geometry. One approach to rigorously define a metric on the space of collider events is based on the energy mover's distance (EMD) [10], which quantifies the cost of rearranging the set of particles in one event to match the set of particles in another. Computing the EMD involves solving an optimal transport problem for moving energy, analogous to the historical earth mover's distance that quantifies the minimal cost for rearranging piles of dirt [11] through the Wasserstein metric [12][13][14]. The EMD has been used to define new collider observables [15][16][17] and study the dimensionality of jets in CMS Open Data [18,19]. Variations on the EMD have been proposed for collider physics that simplify computations [20,21], enable new physics searches [22][23][24], mitigate jet contamination [25], improve computational efficiency [26,27], and directly establish the Riemannian metric on phase space [28].
In this paper, we propose an alternative metric for the space of collider events based on spectral functions. With the original EMD, events are treated as distributions of energy on the celestial sphere where experimental calorimeters are located. Because the celestial sphere is two dimensional, one has to solve a two-dimensional optimal transport problem, for which there are efficient algorithms [29][30][31] but no closed-form expressions. Here, we introduce the spectral EMD, where events are treated as one-dimensional spectral functions that encode the distribution of pairwise particle angles. One-dimensional optimal transport has a simple expression in terms of quantile matching, and therefore our spectral EMD is more easily amenable to certain first-principles calculations. Both the original EMD and the spectral EMD are infrared-and-collinear (IRC) safe, enabling studies of their properties in perturbative quantum chromodynamics (QCD).
As with many geometry problems, the choice of metric is a choice, which depends on the features one wants to expose about the data. For example, the original EMD has the advantage of connecting to many well-known observables in collider physics, such as event shapes and jet substructure observables [15]. It is, however, cumbersome to evaluate analytically in all but the most symmetric situations. The spectral EMD has the advantage of automatically incorporating basic isometries of collider data, such as azimuthal rotations around the beam line. Not every spectral function corresponds to a physical collider event, though, so geodesics in spectral EMD space are more difficult to interpret. We perform a few side-by-side comparisons of the original EMD and spectral EMD to emphasize these kinds of differences, using as our testbed the space of collimated sprays of particles called jets.
We emphasize that the spectral EMD is not just a one-dimensional projection of the original EMD. Each one-dimensional projection of two-dimensional data involves some kind of information loss, whereas the spectral function preserves the complete event information, up to isometries and sets of measure zero. By taking multiple one-dimensional projections, one can compute the sliced Wasserstein distance [32,33], but this quantity behaves more like the original EMD than the spectral version. In some ways, the spectral EMD behaves like the tangent earth mover's distance [34] in that both respect isometries. Still, the original EMD and its variants have units of energy, whereas the spectral EMD has units of energy squared.
The rest of this paper is organized as follows. In Sec. 2, we review the spectral function and use it to define a metric distance between two events, up to isometries. In Sec. 3, we evaluate the spectral metric in closed form between two jets with up to four particles in total, corresponding to a perturbative calculation through order $\alpha_s^2$. Using these results, we present the simplest, double-logarithmically accurate calculation of the distribution of metric distances in Sec. 4. We find that the distance between quark/gluon jets is controlled by the sum of the corresponding quadratic color Casimirs, which was also observed in Ref. [19] for the original EMD. In Sec. 5, we numerically evaluate the distribution of the spectral EMD between jets at next-to-leading order in electron-positron collisions using the program EVENT2 [35] and note similarities of the results with non-global logarithms in the calculation of the light hemisphere mass [36]. We generate quark and gluon jets using a parton shower in Sec. 6, and find that the size of non-perturbative effects is smaller than what one might have expected from the analogy to angularities. In Sec. 7, we present an analytical comparison between the spectral EMD and the original EMD for jets with low multiplicity, and present the first (to our knowledge) closed-form expressions for the EMD between two jets with up to two particles in them. The spectral function philosophy enables the construction of metrics for more general spaces, and in Sec. 8 we construct a metric on the space of quantum field theories that shares some features with the Zamolodchikov metric [37,38]. We conclude in Sec. 9, and look forward to numerous ways that further investigations into the spectral EMD could illuminate collider physics.

The Spectral Metric
In this section, we introduce and define the spectral EMD on the space of jets. This requires defining the spectral function, which has long been studied in the collider literature [39][40][41][42][43]. While we focus our discussion on applications to jets and their substructure, these definitions easily extend to complete sets of particles in an event and to weighted point clouds more generally.

Review of the Original EMD
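To put the spectral EMD in context, it is useful to review the formulation of the original EMD [10]. Consider a jet consisting of $N$ particles labeled by $i$, with energies $E_i$ and directions $\hat n_i$. The energy flow of the jet is given by:
$$\mathcal{E}(\hat n) = \sum_{i=1}^{N} E_i\, \delta^{(2)}(\hat n - \hat n_i), \tag{2.1}$$
which can be interpreted as the density of energy over an idealized detector at infinity with geometric coordinates $\hat n$. Because of the inclusive sum over particles, the energy flow exhibits manifest invariance under the permutation group $S_N$. The energy flow is normalized as:
$$\int d^2\hat n\; \mathcal{E}(\hat n) = \sum_{i=1}^{N} E_i = E_{\text{tot}}, \tag{2.2}$$
where $E_{\text{tot}}$ is the total energy of the jet. For hadron collider applications, one would typically replace energy with transverse momentum ($p_T$), but we stick with energy for our discussion for notational simplicity. Given two energy flows $\mathcal{E}_A$ and $\mathcal{E}_B$, one can compute the optimal transportation cost between them [10]:
$$\text{EMD}_{\beta,R}(\mathcal{E}_A, \mathcal{E}_B) = \min_{\{f_{ab}\}} \sum_{a \in A} \sum_{b \in B} f_{ab} \left(\frac{\Omega(\hat n_a, \hat n_b)}{R}\right)^{\beta} + \left|E_{\text{tot},A} - E_{\text{tot},B}\right|. \tag{2.3}$$
The subscripts $a$ and $b$ denote particles in jets $A$ and $B$, respectively, $\Omega(\hat n_a, \hat n_b)$ is a pairwise angular distance between particles, $\beta \geq 1$ is an angular exponent, and $R$ is a fixed angular scale. The energy transportation plan $f_{ab}$ satisfies the following constraints:
$$f_{ab} \geq 0, \qquad \sum_{b} f_{ab} \leq E_a, \qquad \sum_{a} f_{ab} \leq E_b, \qquad \sum_{a,b} f_{ab} = \min\left(E_{\text{tot},A}, E_{\text{tot},B}\right). \tag{2.4}$$
As long as $R$ is larger than half the maximum distance between particles, this is a (modified) metric that satisfies the (modified) triangle inequality:
$$0 \leq \text{EMD}_{\beta,R}(\mathcal{E}_A, \mathcal{E}_B)^{1/\beta} \leq \text{EMD}_{\beta,R}(\mathcal{E}_A, \mathcal{E}_C)^{1/\beta} + \text{EMD}_{\beta,R}(\mathcal{E}_B, \mathcal{E}_C)^{1/\beta}. \tag{2.5}$$
If $\mathcal{E}_A$ and $\mathcal{E}_B$ have the same total energy, then $\text{EMD}_{\beta,R}(\mathcal{E}_A, \mathcal{E}_B)^{1/\beta}$ is equivalent to the $p$-Wasserstein metric with $p = \beta$. For $N$ particles per jet, a generic EMD solver requires $\mathcal{O}(N^3 \log N)$ runtime.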
The angular distance $\Omega(\hat n_a, \hat n_b)$ is also known as the ground metric, which has to satisfy its own triangle inequality:
$$0 \leq \Omega(\hat n_a, \hat n_b) \leq \Omega(\hat n_a, \hat n_c) + \Omega(\hat n_b, \hat n_c). \tag{2.6}$$
For our calculations in $e^+e^-$ collisions, we focus on $\beta = 2$, $R = 1$, and
$$\Omega(\hat n_a, \hat n_b) = \sqrt{2\,(1 - \cos\theta_{ab})}, \tag{2.7}$$
where $\theta_{ab}$ is the opening angle between particles $a$ and $b$. Note that with this normalization, $\Omega \in [0, 2]$. As discussed in Ref. [17], the EMD "faithfully" lifts the ground metric, meaning that if $\mathcal{E}_A$ and $\mathcal{E}_B$ are related by a translation of size $\Omega_0$, then the EMD is equal to $\Omega_0^\beta / R^\beta$.

The Spectral Function and its Properties
In this paper, we focus on an alternative way to represent a jet of $N$ particles via its spectral function:
$$s(\omega) = \sum_{i=1}^{N} \sum_{j=1}^{N} E_i E_j\, \delta\big(\omega - \omega(\hat n_i, \hat n_j)\big). \tag{2.8}$$
Here, $\omega(\hat n_i, \hat n_j)$ is a pairwise angular distance between particles, which may or may not be related to $\Omega(\hat n_a, \hat n_b)$ above. Because of the inclusive double sum over particles, the spectral function exhibits manifest invariance under the permutation group $S_N$. Because the spectral function depends on pairwise distances, it is invariant to all isometries respected by $\omega$. The spectral function is normalized as
$$\int d\omega\; s(\omega) = E_{\text{tot}}^2. \tag{2.9}$$
Like the energy flow in Eq. (2.1), the spectral function in Eq. (2.8) is IRC safe since it exhibits (multi-)linear energy weighting, an inclusive sum over all particles, and only angular dependence inside the $\delta$-function.
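As a minimal sketch of how a spectral function could be assembled from a list of particles, the following Python represents $s(\omega)$ as a sorted list of (position, weight) pairs, one per $\delta$-function peak. The function names are hypothetical, and the angular measure $\omega = \tfrac{1}{2}\,(2(1-\cos\theta))^{\beta/2}$ is an assumption chosen to reproduce $\omega = 1 - \cos\theta$ at $\beta = 2$.

```python
import math
from collections import defaultdict

def angular_measure(n_i, n_j, beta=2.0):
    """Assumed form: omega = 0.5 * (2 * (1 - cos(theta)))**(beta / 2)."""
    dot = sum(a * b for a, b in zip(n_i, n_j))
    norm = math.sqrt(sum(a * a for a in n_i)) * math.sqrt(sum(b * b for b in n_j))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return 0.5 * (2.0 * (1.0 - cos_theta)) ** (beta / 2.0)

def spectral_function(energies, directions, beta=2.0):
    """Return s(omega) as a sorted list of (omega, weight) delta-function peaks."""
    peaks = defaultdict(float)
    n = len(energies)
    for i in range(n):
        peaks[0.0] += energies[i] ** 2           # i == j terms sit at omega = 0
        for j in range(i + 1, n):
            w = angular_measure(directions[i], directions[j], beta)
            peaks[w] += 2.0 * energies[i] * energies[j]  # (i, j) and (j, i)
    return sorted(peaks.items())

# Two particles at right angles, energies 3 and 1.
peaks = spectral_function([3.0, 1.0], [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)])
print(peaks)
```

The weights always sum to the squared total energy, which gives a quick sanity check on any implementation.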
The spectral function has a long history in collider and jet physics, starting with the energy-energy correlator [39]. It has been used to represent the elements of a complete basis of IRC-safe observables [40], to define observables for jet classification [41], to form the foundation of higher-point energy correlators [42], and to encode a jet's information for machine learning applications [43]. To the best of our knowledge, the spectral function has not yet been used to define a metric distance between two jets.
It is worth clarifying the difference between the angular distance $\omega$ in the spectral function and the angular distance $\Omega$ in the original EMD. In both cases, we would like the pairwise distance to respect the isometries of the detector, which is $O(3)$ for the case of $e^+e^-$ colliders with spherical geometry. The function $\omega$ in the spectral function is the pairwise distance between particles in the same event. With an appropriate choice of $\omega$, the spectral function of an individual jet is automatically invariant to isometries, as is any quantity defined in terms of spectral functions, including the spectral EMD defined below. In particular, the spectral EMD between two jets that differ only by isometries is zero, which is often a desirable feature. By contrast, the function $\Omega$ in the original EMD is the pairwise distance between particles in different events. The energy flow of an individual jet is not invariant to isometries, but with appropriate choice of $\Omega$, the original EMD between pairs of jets will respect isometries. Crucially, the original EMD between two jets that differ only by isometries is not zero.
For our studies, we use the angular measure:
$$\omega(\hat n_i, \hat n_j) = \frac{1}{2}\Big(2\,(1 - \cos\theta_{ij})\Big)^{\beta/2}. \tag{2.10}$$
This normalization has been chosen such that $\omega_{ij} = 1 - \cos\theta_{ij}$ for $\beta = 2$, which is commonly used in the spectral function literature. Note, though, that the meaning of $\beta$ is different between the original EMD and the spectral function. For the original EMD, $\beta$ changes the optimal transport problem between two jets. For the spectral function, $\beta$ changes the representation of an individual jet, independent of optimal transport. In general, $\omega$ need not be a ground metric satisfying a triangle inequality, though it typically will be, since isometries are defined as distance preserving maps between metric spaces. Despite the potential for confusion, we will use the same symbol $\beta$ since this ensures that the original EMD and the spectral EMD will behave similarly in simple limits, as discussed further in Sec. 7.
Because the spectral function is invariant to isometries, one cannot reconstruct an event uniquely from its spectral representation. As shown in App. A, though, the spectral function does determine an event uniquely up to isometries and pathological cases of measure zero. Theorem 2.6 of Ref. [44] proves that two point clouds in $\mathbb{R}^k$ are equal, up to the action of an arbitrary isometry group, if their distributions of pairwise distances are identical. This proof can be lifted to weighted point clouds apart from special configurations with degenerate distances, which occupy a space of measure zero in phase space. While the uniqueness of the spectral representation will not be needed for our analysis, we find it satisfying that the metric on the space of spectral functions furnishes a metric on the space of events, modulo isometries and measure zero regions of phase space.

Introducing the Spectral EMD
Following the same logic as in Sec. 2.1, we can define the distance between two spectral functions $s_A(\omega)$ and $s_B(\omega)$ as the minimal cost to rearrange the spectral functions to be identical. Like with the original EMD, we have to choose a ground metric for $\omega$, which we always take to be the Euclidean distance $|\omega - \omega'|$. Assuming a Euclidean ground metric, we can leverage the closed form expression for the 1-Wasserstein distance in one dimension.
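The spectral EMD is defined as:
$$\text{SEMD}_{\beta}(s_A, s_B) = \int_0^{\omega_{\max}} d\omega\, \big|S_A(\omega) - S_B(\omega)\big|. \tag{2.11}$$
The subscript $\beta$ refers to the angular measure in Eq. (2.10), which implies the maximum angular value $\omega_{\max} = 2^{\beta-1}$. The spectral EMD depends on the cumulative spectral function $S(\omega_{\text{upper}})$:
$$S(\omega_{\text{upper}}) = \int_0^{\omega_{\text{upper}}} d\omega\; s(\omega) = \sum_{i=1}^{N} \sum_{j=1}^{N} E_i E_j\, \Theta\big(\omega_{\text{upper}} - \omega(\hat n_i, \hat n_j)\big), \tag{2.12}$$
where $\Theta$ is the Heaviside function. For $N$ particles per jet, computing the spectral EMD is dominated by computing the spectral function and sorting its entries to find the cumulative distribution, which takes $\mathcal{O}(N^2 \log N)$ runtime (cf. $\mathcal{O}(N^3 \log N)$ for the original EMD).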
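The sort-and-sweep evaluation described above can be sketched as follows. This illustration assumes the spectral EMD is the area between the two cumulative spectral functions on $[0, \omega_{\max}]$ and that each spectral function is given as a list of (omega, weight) peaks; the function name and data layout are hypothetical.

```python
def spectral_emd(peaks_a, peaks_b, omega_max):
    """Area between two cumulative step functions on [0, omega_max].

    Each argument is a list of (omega, weight) delta-function peaks.
    Sorting the merged peaks dominates the cost: O(M log M) for M peaks,
    i.e. O(N^2 log N) for N particles per jet.
    """
    events = [(x, w, 0.0) for x, w in peaks_a] + [(x, 0.0, w) for x, w in peaks_b]
    events.sort()
    total = cum_a = cum_b = prev = 0.0
    for x, w_a, w_b in events:
        if x > omega_max:
            raise ValueError("peak lies beyond omega_max")
        total += abs(cum_a - cum_b) * (x - prev)  # strip between breakpoints
        cum_a += w_a
        cum_b += w_b
        prev = x
    total += abs(cum_a - cum_b) * (omega_max - prev)  # unbalanced remainder
    return total

# One-particle jets with energies 2 and 1: cost is omega_max * |4 - 1|.
print(spectral_emd([(0.0, 4.0)], [(0.0, 1.0)], omega_max=2.0))
```

The final strip out to `omega_max` is what charges for any mismatch in total squared energy, so no explicit balancing step is needed in this form.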
Because $S_A(\omega_{\max}) = E_A^2$ and $S_B(\omega_{\max}) = E_B^2$ could be different, this is an example of an unbalanced transport problem. It is straightforward to map this to a balanced transport problem. Letting $E_A^2 > E_B^2$ without loss of generality, we introduce a modified spectral function with a "reservoir" at $\omega_{\max}$:
$$s_B^{\text{mod}}(\omega) = s_B(\omega) + \left(E_A^2 - E_B^2\right) \delta(\omega - \omega_{\max}), \tag{2.13}$$
such that $s_A$ and $s_B^{\text{mod}}$ have the same total weight. This modification leaves Eq. (2.11) unchanged:
$$\text{SEMD}_{\beta}(s_A, s_B) = \int_0^{\omega_{\max}} d\omega\, \big|S_A(\omega) - S_B^{\text{mod}}(\omega)\big|, \tag{2.14}$$
which is identical to the 1-Wasserstein metric, up to an overall normalization factor of $E_{\text{tot},A}^2$. In the body of this paper, we restrict our attention to the 1-Wasserstein metric. More generally, after using Eq. (2.13) to make this a balanced transport problem with total weight $E_{\text{tot}}^2$, one could consider the ($p$-th power of the) $p$-Wasserstein metric:
$$\text{SEMD}_{\beta,p}(s_A, s_B) = \int_0^{E_{\text{tot}}^2} dE_{\text{upper}}^2\, \Big|S_A^{-1}(E_{\text{upper}}^2) - S_B^{-1}(E_{\text{upper}}^2)\Big|^p. \tag{2.15}$$
This expression depends on the inverse of the cumulative spectral function $S^{-1}(E_{\text{upper}}^2)$, which yields the value of $\omega_{\text{upper}}$ that encloses $E_{\text{upper}}^2$ of spectral weight. For the special case of $p = 1$, Eq. (2.15) is equivalent to Eq. (2.11), since both are expressions for the area between the two cumulative spectral functions. In App. B, we present some results for $p = 2$.
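The reservoir trick and the quantile form of the $p$-Wasserstein cost can be sketched together. The code below is an illustration under stated assumptions: spectral functions are lists of (omega, weight) peaks, the lighter one receives a reservoir peak at $\omega_{\max}$, and quantiles are matched by walking both sorted lists in parallel. All names are hypothetical.

```python
def spectral_emd_p(peaks_a, peaks_b, omega_max, p=1.0):
    """p-th power of the 1D p-Wasserstein cost via quantile matching."""
    a, b = sorted(peaks_a), sorted(peaks_b)
    total_a = sum(w for _, w in a)
    total_b = sum(w for _, w in b)
    # Balance the problem with a reservoir peak at omega_max.
    if total_a > total_b:
        b.append((omega_max, total_a - total_b))
    elif total_b > total_a:
        a.append((omega_max, total_b - total_a))
    i = j = 0
    left_a, left_b = a[0][1], b[0][1]
    cost = 0.0
    while i < len(a) and j < len(b):
        moved = min(left_a, left_b)          # weight shared by both quantile bins
        cost += moved * abs(a[i][0] - b[j][0]) ** p
        left_a -= moved
        left_b -= moved
        if left_a <= 0.0:
            i += 1
            left_a = a[i][1] if i < len(a) else 0.0
        if left_b <= 0.0:
            j += 1
            left_b = b[j][1] if j < len(b) else 0.0
    return cost

print(spectral_emd_p([(0.0, 4.0)], [(0.0, 1.0)], omega_max=2.0, p=1.0))
print(spectral_emd_p([(0.0, 4.0)], [(0.0, 1.0)], omega_max=2.0, p=2.0))
```

For $p = 1$ this reproduces the area between the cumulative spectral functions; for other $p$ the two forms differ, which is why the quantile walk is the more general starting point.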
On normalized probability distributions, the Wasserstein distance satisfies the properties of a metric: identity of indiscernibles, symmetry, and the triangle inequality. Thus, after doing the weight balancing trick, our spectral EMD is indeed a metric on the space of spectral functions, $\{s(\omega)\}$. As argued in App. A, the spectral function uniquely defines a jet up to isometries and sets of measure zero. Therefore, the spectral EMD is also a metric distance on the space of jets, $\{J\}$. That is, if the Wasserstein distance between two spectral functions $s_A(\omega)$ and $s_B(\omega)$ is 0, then the two jets $A$ and $B$ are identical, up to isometries and pathological configurations.

Example Optimal Transport Plan
To gain some intuition for the spectral EMD, it is useful to consider a low multiplicity example. We will do a more systematic study in Sec. 3, but here we identify some features of the optimal transport plan for spectral functions.
The optimal transport plan can be viewed as a "geodesic" between two spectral functions $s_A(\omega)$ and $s_B(\omega)$. Introducing a "time" parameter $t \in [0, 1]$, we can envision continuously transforming one spectral function into another with a minimal cost at each time step. In general, this transportation plan is not unique, but there is a convenient constant speed geodesic for one-dimensional distributions:
$$S_{\text{OTP}}(\omega; t) = \Big[(1-t)\, S_A^{-1} + t\, S_B^{-1}\Big]^{-1}(\omega), \tag{2.16}$$
where $S^{-1}$ is the functional inverse of the cumulative spectral function. By construction, $S_{\text{OTP}}(\omega; 0) = S_A(\omega)$ and $S_{\text{OTP}}(\omega; 1) = S_B(\omega)$. To determine the optimal transport plan spectral function, we simply differentiate Eq. (2.16) with respect to $\omega$:
$$s_{\text{OTP}}(\omega; t) = \frac{d}{d\omega} S_{\text{OTP}}(\omega; t). \tag{2.17}$$
As an example, consider the optimal transport plan between two jets that each contain two particles and have the same total energy $E$, as shown in Fig. 1:
• Jet A consists of odd-numbered particles {1, 3}; and
• Jet B consists of even-numbered particles {2, 4}.
Their cumulative spectral functions are:
$$S_A(\omega) = \left(E_1^2 + E_3^2\right) \Theta(\omega) + 2 E_1 E_3\, \Theta(\omega - \omega_{13}), \tag{2.18}$$
$$S_B(\omega) = \left(E_2^2 + E_4^2\right) \Theta(\omega) + 2 E_2 E_4\, \Theta(\omega - \omega_{24}). \tag{2.19}$$
Figure 2: Snapshots of the optimal transport plan between two spectral functions.
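The inverse cumulative spectral functions are then:
$$S_A^{-1}(E_{\text{upper}}^2) = \omega_{13}\, \Theta\big(E_{\text{upper}}^2 - E_1^2 - E_3^2\big), \qquad S_B^{-1}(E_{\text{upper}}^2) = \omega_{24}\, \Theta\big(E_{\text{upper}}^2 - E_2^2 - E_4^2\big).$$
The optimal transportation plan is simple enough that the inverses in Eq. (2.16) can be taken analytically, as can the derivative in Eq. (2.17). The final result for the optimal transport plan spectral function is:
$$s_{\text{OTP}}(\omega; t) = \min\left(E_1^2 + E_3^2,\, E_2^2 + E_4^2\right) \delta(\omega) + 2 \min\left(E_1 E_3, E_2 E_4\right) \delta\big(\omega - (1-t)\,\omega_{13} - t\,\omega_{24}\big) + 2 \left|E_1 E_3 - E_2 E_4\right| \Big[\Theta(E_1 E_3 - E_2 E_4)\, \delta\big(\omega - (1-t)\,\omega_{13}\big) + \Theta(E_2 E_4 - E_1 E_3)\, \delta\big(\omega - t\,\omega_{24}\big)\Big].$$
As required, this spectral function reduces to $s_A(\omega)$ and $s_B(\omega)$ at $t = 0$ and $t = 1$, respectively. An example of this optimal transport plan is illustrated in Fig. 2. Note that the intermediate spectral functions at times $t \in (0, 1)$ have two peaks away from $\omega = 0$. A collection of $n$ particles on the plane has $\binom{n}{2}$ pairwise angles and of course there is no integer $n$ for which $\binom{n}{2} = 2$. Thus, these intermediate spectral functions cannot be mapped to a generic collection of particles on the plane. In this way, the optimal transport plan between spectral functions does not generically correspond to a rearrangement of the particles.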
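To make the two-peak structure concrete, the following sketch builds the intermediate spectral function by interpolating matched quantiles, which is one way to realize the constant speed geodesic for point-like spectral functions of equal total weight. The function name and the numerical inputs are hypothetical, chosen only for illustration.

```python
def otp_peaks(peaks_a, peaks_b, t):
    """Displacement interpolation between two balanced 1D point measures.

    Matched quantile chunks of weight m sitting at x_a and x_b are placed
    at (1 - t) * x_a + t * x_b. Assumes equal total weight on both sides.
    """
    a, b = sorted(peaks_a), sorted(peaks_b)
    i = j = 0
    left_a, left_b = a[0][1], b[0][1]
    out = {}
    while i < len(a) and j < len(b):
        moved = min(left_a, left_b)
        if moved > 0.0:
            x = (1.0 - t) * a[i][0] + t * b[j][0]
            out[x] = out.get(x, 0.0) + moved
        left_a -= moved
        left_b -= moved
        if left_a <= 0.0:
            i += 1
            left_a = a[i][1] if i < len(a) else 0.0
        if left_b <= 0.0:
            j += 1
            left_b = b[j][1] if j < len(b) else 0.0
    return sorted(out.items())

# Unit-energy jets: A has E1 = E3 = 0.5, omega_13 = 0.4; B has E2 = 0.8, E4 = 0.2, omega_24 = 0.1.
jet_a = [(0.0, 0.5), (0.4, 0.5)]
jet_b = [(0.0, 0.68), (0.1, 0.32)]
for t in (0.0, 0.5, 1.0):
    print(t, otp_peaks(jet_a, jet_b, t))
```

At the endpoints the output collapses back to two peaks, while for intermediate times there are three peaks in total, two of them away from the origin.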
This behavior is distinct from the original EMD behavior in Ref. [10], where the optimal transportation plan directly measures the cost of rearranging particles on the plane. Since the spectral function is blind to isometries, it is not surprising that the spectral optimal transportation plan does not have a particle interpretation. In our view, this is not a problem but simply a choice. What it does imply, though, is that the space of jets as defined by their spectral functions will have a different structure than the space of jets defined by particles. We start to compare these two spaces on low-multiplicity jets in Sec. 7, but leave a detailed study to future work.

Spectral Metric Between Low-Multiplicity Jets
In this section, we explicitly calculate the spectral metric between two jets with low constituent multiplicities. This will concretely illustrate what information is encoded in the metric and explicitly demonstrate its IRC safety. We arrange our analysis perturbatively in the strong coupling $\alpha_s$, and consider jets with up to three particles, corresponding to a relative $\mathcal{O}(\alpha_s^2)$ compared to leading order. The following expressions will be used to compute analytic distributions of the spectral metric in Secs. 4 and 5.

$\mathcal{O}(\alpha_s^0)$: Jets with One Particle
At lowest order in an $\alpha_s$ expansion, the two jets being compared each consist of a single particle. Hence, their spectral functions are:
$$s_A^{(0)}(\omega) = E_A^2\, \delta(\omega), \qquad s_B^{(0)}(\omega) = E_B^2\, \delta(\omega), \tag{3.1}$$
where $E_A$ and $E_B$ are the respective energies of the two jets, and the superscript denotes the order in $\alpha_s$. Using Eq. (2.11), their spectral metric distance is
$$\text{SEMD}_{\beta}^{(0,0)} = \omega_{\max} \left|E_A^2 - E_B^2\right|, \tag{3.2}$$
where the $(k, \ell)$ superscript means that we are working to order $\alpha_s^k$ for jet $A$ and order $\alpha_s^\ell$ for jet $B$. Framed as an optimal transport problem, this distance corresponds to the minimal energy-squared that must be eliminated to render the spectral functions identical. Eliminating energy corresponds to transporting it "out" of the jet, from $\omega = 0$ to immediately beyond $\omega = \omega_{\max}$, where the extra energy-squared can be dumped. The distance that this extra energy must be carried is $\omega_{\max}$ in units of $\omega$ in this framework, and so the total cost of removing this energy is the distance carried times the amount of excess energy.
As long as $\omega_{\max}$ is larger than the maximum $\omega$ in either spectral function, Eq. (2.11) defines a metric distance. For two jets with radius $R$, it is natural, though not necessary, to set $\omega_{\max} \simeq R^\beta / 2$, such that the excess energy would be proportional to $|E_A^2 - E_B^2|\, R^\beta$. We prefer to leave $\omega_{\max}$ as a free parameter, since it enables a meaningful comparison of jets that differ both in total energy and in jet radius used to define them.

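$\mathcal{O}(\alpha_s^1)$: Jets with Up to Two Particles
The contribution to the spectral metric at $\mathcal{O}(\alpha_s^1)$ comes from two sources:
• Jet A consists of two particles and jet B consists of only one; or
• Vice versa.
For simplicity, we assume that all final state particles are massless. For jet $A$ consisting of two particles {1, 3}, its spectral function is:
$$s_A^{(1)}(\omega) = \left(E_1^2 + E_3^2\right) \delta(\omega) + 2 E_1 E_3\, \delta(\omega - \omega_{13}), \tag{3.3}$$
where $E_1 + E_3 = E_A$, the total energy of jet $A$. This yields a contribution to the spectral metric between jets $A$ and $B$ of:
$$\text{SEMD}_{\beta}^{(1,0)} = \omega_{13} \left|E_1^2 + E_3^2 - E_B^2\right| + \left(\omega_{\max} - \omega_{13}\right) \left|E_A^2 - E_B^2\right|, \tag{3.4}$$
where $\omega_{13}$ is the angular distance between particles 1 and 3. The complete metric distance at this order in perturbation theory is the sum of this result with the corresponding configuration when jet $B$ consists of two particles and jet $A$ only has a single particle, i.e. $\text{SEMD}_{\beta}^{(0,1)}$. To interpret this result a bit more clearly, let us assume that the jet energies are identical, $E_A = E_B \equiv E$, and use the canonical choice of $\beta = 2$. Then, Eq. (3.4) simplifies to
$$\text{SEMD}_{\beta=2}^{(1,0)} = 2 E_1 E_3 \left(1 - \cos\theta_{13}\right) = m_A^2, \tag{3.5}$$
which is just the squared mass of jet $A$. The metric distance through this order is therefore:
$$\text{SEMD}_{\beta=2}^{(1,0)} + \text{SEMD}_{\beta=2}^{(0,1)} = m_A^2 + m_B^2. \tag{3.6}$$
Because the jet mass is itself an IRC-safe observable, the metric distance is too, at least through $\mathcal{O}(\alpha_s^1)$.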
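The statement that the equal-energy, $\beta = 2$ distance equals the squared jet mass can be checked numerically. The sketch below compares the transport cost, taken as the weight $2E_1E_3$ moved a distance $1 - \cos\theta_{13}$, with the invariant mass squared of two massless particles; the function names and the kinematic values are hypothetical.

```python
import math

def massless_momentum(energy, theta, phi):
    """Four-momentum (E, px, py, pz) of a massless particle."""
    return (energy,
            energy * math.sin(theta) * math.cos(phi),
            energy * math.sin(theta) * math.sin(phi),
            energy * math.cos(theta))

def invariant_mass_squared(momenta):
    """Squared invariant mass of the summed four-momenta."""
    e, px, py, pz = (sum(component) for component in zip(*momenta))
    return e * e - px * px - py * py - pz * pz

def semd_two_vs_one(e1, e3, cos_theta_13):
    """Equal-energy cost at beta = 2: weight 2*E1*E3 carried from omega_13 to 0."""
    return 2.0 * e1 * e3 * (1.0 - cos_theta_13)

p1 = massless_momentum(30.0, 0.10, 0.0)
p3 = massless_momentum(70.0, 0.25, 2.0)
cos_13 = sum(a * b for a, b in zip(p1[1:], p3[1:])) / (p1[0] * p3[0])
print(semd_two_vs_one(30.0, 70.0, cos_13), invariant_mass_squared([p1, p3]))
```

The two printed numbers agree up to floating-point rounding, reflecting the identity $m^2 = 2 E_1 E_3 (1 - \cos\theta_{13})$ for massless constituents.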

$\mathcal{O}(\alpha_s^2)$: Jets with Up to Three Particles
To simplify the analysis and illustrate the unique features of the spectral metric at $\mathcal{O}(\alpha_s^2)$, we assume that jets $A$ and $B$ have the same energy $E_A = E_B \equiv E$. Now, there are three possible configurations that must be considered:
• Jet A consists of three particles ($\mathcal{O}(\alpha_s^2)$) and jet B has a single particle ($\mathcal{O}(\alpha_s^0)$);
• Vice versa; or
• Jets A and B each consist of two particles ($\mathcal{O}(\alpha_s^1)$ each).
We continue to use the convention that jet $A$ ($B$) consists of odd-numbered (even-numbered) particles.
1 Strictly speaking, one cannot simply add the contributions, since different orders appear in different parts of the calculation. In this case, though, because the mass is zero at lowest order, we can use the same observable for $\text{SEMD}^{(1,0)}$ and $\text{SEMD}^{(0,1)}$.
Figure 3: Illustration of the optimal transportation plan between two jets at $\mathcal{O}(\alpha_s^2)$. The heights of the peaks in the spectral functions are representative and sum to the same total value, and the locations of the peaks satisfy the Pythagorean theorem. (a) Transport of three $\delta$-functions of the jet $A$ spectral function (solid) to render it identical to the jet $B$ spectral function (dashed), isolated to the origin. (b) Transport between the spectral functions of two jets (solid and dashed), each with two constituent particles. Part of the rightmost peak is transported to the origin and part is transported to the location of the peak in the other spectral function.
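We start with the first two configurations. Considering jet $A$ with particles {1, 3, 5}, its spectral function is:
$$s_A^{(2)}(\omega) = \left(E_1^2 + E_3^2 + E_5^2\right) \delta(\omega) + 2 E_1 E_3\, \delta(\omega - \omega_{13}) + 2 E_1 E_5\, \delta(\omega - \omega_{15}) + 2 E_3 E_5\, \delta(\omega - \omega_{35}).$$
While it is straightforward to calculate the spectral distance from its integral representation, it is convenient to think in terms of the optimal transportation plan shown in Fig. 3a. To make $s_A^{(2)}$ identical to $s_B^{(0)}$, we must transport each of the $\delta$-functions at $\omega > 0$ to $\omega = 0$. The cost of making these moves is the distance times the squared energy weight for each $\delta$-function, yielding a distance between spectral functions of:
$$\text{SEMD}_{\beta}^{(2,0)} = 2 E_1 E_3\, \omega_{13} + 2 E_1 E_5\, \omega_{15} + 2 E_3 E_5\, \omega_{35}. \tag{3.9}$$
For the special case of $\beta = 2$, this is just the total squared mass of jet $A$. Symmetrizing over the two jets, the $\beta = 2$ spectral distance to this order at least consists of
$$\text{SEMD}_{\beta=2}^{(2,0)} + \text{SEMD}_{\beta=2}^{(0,2)} = m_A^2 + m_B^2. \tag{3.10}$$
Next, consider the configuration for which both jets have two constituent particles. This is the same configuration as in Sec. 2.4, where jet $A$ has particles {1, 3} and jet $B$ has particles {2, 4}. The cumulative spectral functions were given already in Eqs. (2.18) and (2.19). Using the integral representation in Eq. (2.11), the spectral metric contains:
$$\text{SEMD}_{\beta}^{(1,1)} = \min(\omega_{13}, \omega_{24}) \left|2 E_1 E_3 - 2 E_2 E_4\right| + \left|\omega_{13} - \omega_{24}\right| \Big[2 E_1 E_3\, \Theta(\omega_{13} - \omega_{24}) + 2 E_2 E_4\, \Theta(\omega_{24} - \omega_{13})\Big]. \tag{3.11}$$
Even with $\beta = 2$, this expression cannot be reduced to the jet mass. Instead, it contains a detailed comparison between the angular separation and the product of energies of the particles in the jets, in a way that cannot be captured by mass alone. An alternative way to derive Eq. (3.11) is using the optimal transportation plan in Fig. 3b. The less energetic $\delta$-function is moved to the location of the more energetic $\delta$-function, and then the extra energy of the more energetic $\delta$-function is moved to the origin. The cost of these moves is:
$$\text{SEMD}_{\beta}^{(1,1)} = \left|\omega_{13} - \omega_{24}\right| \min(2 E_1 E_3, 2 E_2 E_4) + \left|2 E_1 E_3 - 2 E_2 E_4\right| \Big[\omega_{13}\, \Theta(E_1 E_3 - E_2 E_4) + \omega_{24}\, \Theta(E_2 E_4 - E_1 E_3)\Big]. \tag{3.12}$$
By enumerating all cases, it is straightforward to verify that Eqs. (3.11) and (3.12) are indeed identical. These two methods of performing the calculation illustrate distinct ways of thinking about the problem: either ordering in the product of energies or in the pairwise angles.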
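The case-by-case agreement between the angle-ordered and energy-ordered derivations can also be checked by brute force. In the sketch below, both closed forms are written out from the verbal description of the two-particle configuration with equal total energies, so their explicit shape is an assumption of this illustration; `a` and `b` stand for the peak weights $2E_1E_3$ and $2E_2E_4$, and all function names are hypothetical.

```python
import random

def semd_by_angles(a, b, w13, w24):
    """Integrate |S_A - S_B| strip by strip in omega (equal total energies)."""
    outer = b if w24 > w13 else a  # weight of the peak sitting farther out
    return min(w13, w24) * abs(a - b) + abs(w13 - w24) * outer

def semd_by_energies(a, b, w13, w24):
    """Move the lighter peak onto the heavier one, then the excess to the origin."""
    heavier_location = w13 if a > b else w24
    return abs(w13 - w24) * min(a, b) + abs(a - b) * heavier_location

random.seed(1)
worst = 0.0
for _ in range(10000):
    z_a, z_b = random.random(), random.random()        # energy fractions, E = 1
    a, b = 2 * z_a * (1 - z_a), 2 * z_b * (1 - z_b)
    w13, w24 = random.uniform(0, 2), random.uniform(0, 2)
    gap = abs(semd_by_angles(a, b, w13, w24) - semd_by_energies(a, b, w13, w24))
    worst = max(worst, gap)
print(worst)
```

A maximum discrepancy at the level of floating-point rounding over many random configurations is consistent with the two orderings describing the same cost.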
Combining the results of Eqs. (3.10) and (3.11), the spectral metric between two jets at $\mathcal{O}(\alpha_s^2)$ takes the compact expression:
$$\text{SEMD}_{\beta}^{(2)} = \sum_{a < a'} 2 E_a E_{a'}\, \omega_{aa'} + \sum_{b < b'} 2 E_b E_{b'}\, \omega_{bb'} - 2 \sum_{a < a'} \sum_{b < b'} \min\left(2 E_a E_{a'}, 2 E_b E_{b'}\right) \min\left(\omega_{aa'}, \omega_{bb'}\right),$$
where the sums run over all possible pairs of particles $a, a'$ in jet $A$ and $b, b'$ in jet $B$. If any of the energies vanish or the particles in a jet become collinear, the distance reduces to an energy-energy correlation function of the more massive jet, making IRC safety manifest. These calculations can be continued to higher orders, and one can observe how IRC safety manifests itself at every perturbative order, as it must, due to the IRC safety of the spectral functions themselves.

Double-Logarithmic Distance Distributions Between Jets
We now move to calculating the distribution of distances between jets initiated by different partons. The distribution of distances depends on the squared-amplitudes for two processes:
$$\varrho(\ell_\beta) = \frac{1}{Z} \int d\vec{x}_A\, d\vec{x}_B\, \left|\mathcal{M}_A(\vec{x}_A)\right|^2 \left|\mathcal{M}_B(\vec{x}_B)\right|^2\, \delta\Big(\ell_\beta - \text{SEMD}_\beta(s_A, s_B)\Big). \tag{4.1}$$
Here, $\vec{x}_A$ is the vector of phase space coordinates for jet $A$ and $\mathcal{M}_A$ is the corresponding matrix element, and similarly for jet $B$. This quantity is not a cross section, since the integration is over two phase spaces for distinct jets. With the normalization factor $Z$, Eq. (4.1) is nevertheless a probability density for $\ell_\beta$, which justifies the notation $\varrho$.
For the analyses in this paper, we focus on jets initiated by quarks or gluons, so the matrix elements can be calculated in perturbative QCD. Then, we can expand Eq. (4.1) in powers of $\alpha_s$:
$$\varrho(\ell_\beta) = \varrho^{(0)}(\ell_\beta) + \alpha_s\, \varrho^{(1)}(\ell_\beta) + \alpha_s^2\, \varrho^{(2)}(\ell_\beta) + \mathcal{O}(\alpha_s^3).$$
We established in Sec. 3 that it is sufficient at $\mathcal{O}(\alpha_s^0)$ and $\mathcal{O}(\alpha_s^1)$ to let at least one of the jets be massless, such that the corresponding spectral function is simply a delta function at the origin. Therefore, through $\mathcal{O}(\alpha_s^1)$, this distribution is equivalent to the distribution of an energy correlation function-like observable [45,46] measured on a single jet. Starting at $\mathcal{O}(\alpha_s^2)$, though, the distance $\ell_\beta$ describes honest correlations between jets that both have non-zero mass, as discussed further in Sec. 5.
It is straightforward to compute the distribution of distances at double-logarithmic accuracy, where jet emissions are strongly-ordered in both energy and angle. At this accuracy, we can immediately write down resummed results for the distribution of distances between two jets, since the spectral EMD is dominated by a single emission. Using Eq. (3.4) and taking the two jets to have equal energy $E$, the dimensionless distance at double-logarithmic accuracy is
$$\hat\ell_\beta \equiv \frac{\ell_\beta}{E^2} \simeq z_A\, \theta_A^\beta + z_B\, \theta_B^\beta. \tag{4.3}$$
Here, the energy fractions and angles are:
$$z_A = \frac{\min(E_1, E_3)}{E}, \qquad \theta_A = \theta_{13}, \qquad z_B = \frac{\min(E_2, E_4)}{E}, \qquad \theta_B = \theta_{24}.$$
For strongly-ordered emissions, only one term in Eq. (4.3) will dominate for a given phase space configuration. Thus, to double-logarithmic accuracy, $\hat\ell_\beta$ is simply the sum of two dimensionless energy-energy correlation functions with angular exponent $\beta$, and the distribution of this observable can therefore be copied from standard results. The resummed cumulative distribution for the metric distances exponentiates into a familiar Sudakov form, but because of the two terms in Eq. (4.3), the probability of no emissions is controlled by the sum of the color Casimirs of the two jets, $C_A + C_B$. Said another way, we must forbid emissions in both jets to be larger than the observed value of $\hat\ell_\beta$. To double-logarithmic accuracy, we have
$$\Sigma(\hat\ell_\beta) = \exp\left[-\frac{\alpha_s}{\pi}\, \frac{C_A + C_B}{\beta}\, \log^2 \hat\ell_\beta\right]. \tag{4.5}$$
The QCD color factors are $C_q = 4/3$ for quark-initiated jets and $C_g = 3$ for gluon-initiated jets. This distribution between QCD jets of different origins was also calculated in Ref. [19] for the original EMD, where it was interpreted as the dimension of the space of jets as a function of resolution or distance. As shown in Sec. 7.1, the original EMD and spectral EMD have the same behavior at double-logarithmic accuracy, so Eq. (4.5) holds in both cases. This calculation illustrates the expected behavior of a metric distance. In general, gluon jets are more massive than quark jets because $C_g > C_q$, and as such the difference between the masses of gluon jets from one another is expected to be larger than for quark jets. Thus, the spectral EMD between two gluon jets is expected to be larger than the spectral EMD between two quark jets, as borne out by this calculation. We will see in Sec. 6 that this same behavior is exhibited by parton shower simulations of quark- and gluon-initiated jets.
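As a sketch of where a Sudakov exponent controlled by the sum of Casimirs can come from, the following worked steps assume a fixed coupling, the soft-and-collinear emission probability $(2\alpha_s C_i/\pi)\, dz/z\; d\theta/\theta$ with $z, \theta \in (0, 1]$, and a single-emission observable $z\theta^\beta$. These conventions, including the upper limit on $\theta$ and the symbols $\Sigma_i$, $x$, $y$, and $L$, are assumptions made for illustration.

```latex
% Veto all emissions in jet i that would give z * theta^beta above the observed value.
\begin{align}
-\log \Sigma_i(\hat\ell_\beta)
  &= \frac{2\alpha_s C_i}{\pi} \int_0^1 \frac{dz}{z} \int_0^1 \frac{d\theta}{\theta}\,
     \Theta\!\left(z\,\theta^\beta - \hat\ell_\beta\right) \\
  &= \frac{2\alpha_s C_i}{\pi} \int_0^{L} dx \int_0^{(L - x)/\beta} dy ,
     \qquad x = \log\frac{1}{z},\quad y = \log\frac{1}{\theta},\quad L = \log\frac{1}{\hat\ell_\beta} \\
  &= \frac{2\alpha_s C_i}{\pi}\, \frac{L^2}{2\beta}
   = \frac{\alpha_s C_i}{\pi \beta}\, \log^2 \hat\ell_\beta .
\end{align}
% Independent vetoes in the two jets multiply, so the exponents add.
\begin{equation}
\Sigma_A(\hat\ell_\beta)\, \Sigma_B(\hat\ell_\beta)
  = \exp\!\left[-\frac{\alpha_s}{\pi}\, \frac{C_A + C_B}{\beta}\, \log^2 \hat\ell_\beta\right].
\end{equation}
```

The triangular region in the $(x, y)$ plane is the usual Lund-plane picture: the veto removes an area $L^2/(2\beta)$ per jet, and the two jets contribute independently.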

Fixed-Order Correlations Between Jets at e + e − Colliders
We now explore the structure of the spectral EMD through $\mathcal{O}(\alpha_s^2)$. Our analysis will be based on hemisphere jets produced in $e^+e^-$ collisions, calculated at fixed order. The leading non-trivial correlations between the two hemispheres are referred to as non-global logarithms [36], and we will be able to directly probe them through the spectral EMD distribution. As a baseline, we compute the spectral EMD distribution between jets from distinct events, where such non-global effects are absent at this order.

Isolating the Non-Trivial Correlations
Up through O(α_s), the spectral EMD between two jets is determined by their individual properties, with no non-trivial correlations. Only at O(α_s^2) do we see such correlations, so we would like to define an observable that isolates those effects.
We define the reduced spectral EMD ∆_β in Eq. (5.1) in terms of the reduced spectral functions ŝ_A and ŝ_B. Because of the signs in Eq. (5.1), larger values of ∆_β correspond to stronger correlations (i.e. smaller distances) between jets.
Furthermore, ∆_β ≥ 0, which we argue as follows. Without loss of generality, we assume that E_A ≥ E_B and can then immediately write closed-form expressions for much of ∆_β. First, the distance between the reduced spectral functions is the cost of removing the excess squared energy from the origin to outside the jet.
Next, the distance between the reduced spectral function of jet A and the full spectral function of jet B has two pieces, because we must move all peaks in s_B to the origin and remove the excess squared energy from the origin to outside the jet. Comparing these two results, their difference is the distance between the reduced and full spectral functions of jet B itself. Then, ∆_β can be equivalently expressed in terms of distances among s_A, s_B, and ŝ_B alone. Because the spectral EMD is a metric, the triangle inequality holds for these three spectral functions, and therefore ∆_β ≥ 0 as promised.
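The chain of steps in this argument can be written out schematically. The display below is a reconstruction from the surrounding prose rather than a quotation of Eqs. (5.1) to (5.7): it assumes that the reduced spectral function places all of the squared jet energy at the origin, and that Eq. (5.1) is the signed combination shown in the first line. Overall normalizations are not reproduced.

```latex
\begin{align}
\hat{s}_A(\omega) &= E_A^2\,\delta(\omega), \qquad
\hat{s}_B(\omega) = E_B^2\,\delta(\omega), \\
\Delta_\beta &= \mathrm{SEMD}_\beta(\hat{s}_A, s_B) + \mathrm{SEMD}_\beta(s_A, \hat{s}_B)
              - \mathrm{SEMD}_\beta(\hat{s}_A, \hat{s}_B) - \mathrm{SEMD}_\beta(s_A, s_B), \\
\mathrm{SEMD}_\beta(\hat{s}_A, s_B) - \mathrm{SEMD}_\beta(\hat{s}_A, \hat{s}_B)
             &= \mathrm{SEMD}_\beta(\hat{s}_B, s_B) \qquad (E_A \ge E_B), \\
\Delta_\beta &= \mathrm{SEMD}_\beta(s_A, \hat{s}_B) + \mathrm{SEMD}_\beta(\hat{s}_B, s_B)
              - \mathrm{SEMD}_\beta(s_A, s_B) \;\ge\; 0 .
\end{align}
```

The final inequality is the triangle inequality applied to the path from s_A to s_B that passes through ŝ_B.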
At O(α_s^0), ∆_β is manifestly zero because ŝ_A^(0) = s_A^(0) and ŝ_B^(0) = s_B^(0). Similarly, at O(α_s), where one of the jets must be massless, ∆_β is also zero. Only at O(α_s^2) do we get a non-zero reduced spectral EMD. Following the same logic as in Sec. 3.3, we obtain the expression for ∆_β^(2) in Eq. (5.8), which holds even if E_A ≠ E_B. Note that this formula also works at lower orders, since ∆_β^(2) → 0 in the soft and/or collinear limits.
This expression for ∆_β has a similar structure to the minimum of the two jets' hemisphere energy correlation functions, with the key difference that the minimum is taken independently over the energy and angular factors. In general, the minimal energy and minimal angle of emission do not have to occur within the same jet. This has the interesting feature of significantly suppressing configurations in which one jet has a soft, wide-angle emission and the other jet has a hard, collinear emission. Effectively, to the observable ∆_β, such a configuration would involve both a soft and a collinear emission.
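The difference between taking one minimum over products and taking independent minima over factors can be seen in a toy comparison. This is only a schematic sketch: it does not reproduce Eq. (5.8) or its normalization, and the function names and the numerical energy and angular factors are hypothetical.

```python
def correlated_minimum(energy_a, angle_a, energy_b, angle_b):
    """Minimum over jets of (energy factor * angular factor)."""
    return min(energy_a * angle_a, energy_b * angle_b)


def independent_minimum(energy_a, angle_a, energy_b, angle_b):
    """Product of the separately minimized energy and angular factors."""
    return min(energy_a, energy_b) * min(angle_a, angle_b)


if __name__ == "__main__":
    # Jet A: soft, wide-angle emission. Jet B: hard, collinear emission.
    soft_wide = (0.01, 1.0)
    hard_collinear = (0.25, 0.01)
    print(correlated_minimum(*soft_wide, *hard_collinear))
    print(independent_minimum(*soft_wide, *hard_collinear))
```

With independent minima, the soft energy factor of one jet is paired with the collinear angular factor of the other, so the combination is never larger than the minimum of the per-jet products and is strongly suppressed in this mixed configuration.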
For the subsequent calculations, we assume that the hemisphere jets come from events with a common center-of-mass energy 2E, such that the Born-order hemisphere jet energy is E. Because of energy-momentum conservation, the jet energies will in general differ from E at higher orders. We define a dimensionless reduced spectral EMD, ∆β, in Eq. (5.9). In Sec. 5.2, jets A and B correspond to hemisphere jets from different events, and we randomly choose one jet per event. In Sec. 5.3, jets A and B come from different hemispheres of the same event. By comparing these two distributions, we can isolate the effects of non-trivial correlations starting at O(α_s^2).

Distance Between Jets in Distinct Events to O(α_s^2)
Consider the reduced spectral EMD between hemisphere jets A and B in distinct uncorrelated events. We randomly choose one hemisphere jet from each event, which ensures that A and B are identically distributed. The distribution of the dimensionless reduced spectral EMD ∆β is given in Eq. (5.10), where Z is a normalization factor, x⃗_A is the phase space coordinate for event A, |M(x⃗_A)|^2 is the corresponding squared matrix element, and similarly for event B. Crucially, the spectral functions s_A and s_B are associated with the hemisphere jets (and not with the event as a whole). By construction, this expression is only non-trivial starting at O(α_s^2). Expanding the squared matrix elements order by order, we obtain Eq. (5.11), where σ_0 is the Born-order cross section, and ∆_β^(2) is defined in Eq. (5.8). Terms proportional to the Born-level squared matrix element |M^(0)|^2 do not appear in this expression, since each jet requires at least two particles for ∆β to be non-zero. This distribution can be calculated numerically from the well-known matrix element for e⁺e⁻ → q q̄ g scattering, as shown in Sec. 5.4 below.

Distance Between Jets in a Single Event to O(α_s^2)
Now consider the reduced spectral EMD between hemisphere jets A and B within the same event, whose distribution is given in Eq. (5.12). Though this expression is a proper cross section, and thus deserving of the symbol σ, we continue to use ϱ for ease of comparison.
As with the uncorrelated case, this expression is only non-trivial starting at O(α_s^2). Here, we see the appearance of |M^(2)(x⃗)|^2, since this is the first squared amplitude that allows each jet to have at least two particles. This distribution involves the matrix elements for e⁺e⁻ → q q̄ g g and e⁺e⁻ → q q̄ q′ q̄′, which we compute numerically using EVENT2 [35].

Numerical Results
We now show numerical results for the distributions of ∆β=2 in the uncorrelated and correlated cases. We restrict to β = 2 for simplicity and familiarity with hemisphere jet masses. In Fig. 4, we show the distributions of log ∆β=2, where we remove overall factors of the coupling and color factors. That is, we plot dϱ^(2)/d log ∆β=2, defined implicitly through a relation in which σ_0 is the Born-order cross section for e⁺e⁻ → q q̄ scattering and C^(2) is the appropriate color factor for the secondary emission. We study the three possible color structures, C^(2) = C_F, C_A, and n_f T_R, where n_f is the number of active quark flavors and T_R is the normalization of the fundamental representation of color SU(3).
The C^(2) = C_F channel appears for both the uncorrelated and correlated cases in Fig. 4a. Soft gluon emissions proportional to C_F are emitted incoherently from one another, just like photons from charged particles. Such emissions are described by the product of uncorrelated matrix elements |M^(1)|^2 |M^(1)|^2. Two Abelian gluons produced in the same event are identical bosons, though, so there is a Bose factor of 1/2 in the matrix element |M^(2)|^2. Thus, in the deep infrared, where ∆β ≪ 1, we expect that the correlated contribution from C_F gluon emission in |M^(2)|^2 is a factor of 2 smaller than the uncorrelated contribution in |M^(1)|^2 |M^(1)|^2. Indeed, the general trends of the distributions agree well, with differences arising at subleading order where specific angular or energy ordering becomes important.
The results for the C^(2) = C_A channel are plotted in Fig. 4b, where only the correlated case contributes. The linear behavior in the deep infrared is expected from the form of leading non-global logarithms (NGLs). To obtain a non-zero value of ∆β, the two correlated gluons must be in different hemispheres, and therefore exhibit no collinear singularities, but can have hierarchical low energies. On this plot, we also include a linear fit and find that the leading logarithm (the slope on this plot) is well described by the value expected from the leading NGLs for hemisphere mass [36]. Because there is no collinear singularity that contributes to the leading NGLs, the fact that the angular dependence of the hemisphere mass and ∆_{β=2} are different has no effect.
Figure 4: Distributions of the reduced spectral EMD distance ∆β=2 calculated on jets produced in e⁺e⁻ events at O(α_s^2), separated by color channel. For the C_F channel in (a), the correlated distribution is a factor of 2 smaller than the uncorrelated one due to a Bose factor. For the C_A channel in (b), we compare the output of EVENT2 to a fit accounting for leading non-global logarithms. For the n_f T_R channel in (c), we simply show the output from EVENT2.
Finally, we plot the C^(2) = n_f T_R channel in Fig. 4c, where again only the correlated case contributes. There is no divergence associated with hierarchical energies from a gluon splitting into two quarks, so this distribution is only single logarithmic (flat on this plot) in the deep infrared. Along with the subleading logarithms in the C_A channel, the fit we establish in these plots is given in Eq. (5.17). By contrast, the values of the subleading hemisphere mass NGLs are known from Refs. [47,48]. While the two sets of values are close, we do not expect perfect agreement between subleading hemisphere mass NGLs and ∆β because of the distinct energy and angular ordering in the definition of ∆β.

Results from a Parton Shower
Having established some resummed and fixed-order results, we now investigate spectral EMD distributions obtained from a parton shower. We generate events in MadGraph 3.4.0 [49] at a center-of-mass collision energy of 2 TeV, and use the following ℓ⁺ℓ⁻ → jj processes to produce samples of quark and gluon jets:
• Quark jets: e⁺e⁻ → u ū;
• Gluon jets: τ⁺τ⁻ → gg.
These hard scattering events are showered using Pythia 8.306 [50] with its default settings, except when we turn off hadronization. Two exclusive k_T jets [51] are found in each event with FastJet 3.4.0 [52], and one jet per event is chosen randomly for analysis.

Distance between Jets at Parton Level
Following Eq. (4.3), we compute the normalized spectral EMD lβ. In this subsection, jets are simulated at parton level with no hadronization effects. In Fig. 5a, we plot the spectral EMD distribution between two quark jets, a quark and a gluon jet, and two gluon jets, focusing on β = 2. There is a clear ordering to the distances between jets: log lβ=2 (qq) < log lβ=2 (qg) < log lβ=2 (gg), (6.1) where the labels in parentheses denote the two jet categories being compared.
[Figure 5: Parton-Level Spectral EMD; MadGraph+Pythia, e⁺e⁻ → qq @ 2 TeV.]
From the double-logarithmic analysis in Eq. (4.5), the average log distance between jets is related to the sum of the color Casimirs through Eq. (6.2). Taking ratios, this relation is well reproduced by the Pythia parton-level samples. In Fig. 5b, we compare the spectral EMD for angular weighting parameters β = 1/2, β = 1, and β = 2, focusing on the quark-quark sample. Here, we find: log lβ=1/2 (qq) ≈ log lβ=1 (qq) ≲ log lβ=2 (qq). (6.6) The ratios of these means differ rather substantially from the leading-logarithmic predictions (note the minus sign in Eq. (6.2)). This difference seems to be due to physics at log lβ ≈ 0, where the double-logarithmic approximation is no longer accurate. Importantly, at double-logarithmic accuracy, the maximal value of lβ is 1, but the true upper bound depends on β through the maximum angular value ω_max = 2^(β−1). We can reduce sensitivity to this upper bound effect by considering the variance of the spectral EMD. Using Eq. (4.5), the variance predicted from a double-logarithmic analysis is again related to the sum of the color Casimirs. The ratios of the variances in the Pythia quark-quark sample are much better described than the means, with σ²_{β=2,qq} / σ²_{β=1/2,qq} ≈ 3.13. (6.11) Computing these ratios at higher orders would be an interesting avenue for future studies.
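The dependence of the mean and variance on the Casimirs and on β can be made explicit with a short calculation. As a sketch, assume the double-logarithmic cumulative distribution has the fixed-coupling Sudakov form Σ(l) = exp(−a log² l) with a = α_s (C_A + C_B)/(π β). This normalization is an assumption adopted here for illustration and is not a quotation of Eq. (4.5), (6.2), or the variance formula.

```latex
\begin{align}
p(L) &= \frac{d\Sigma}{dL} = 2a\,|L|\,e^{-aL^2}, \qquad L \equiv \log l_\beta \le 0, \\
\langle L \rangle &= -\int_0^\infty 2a\,x^2 e^{-a x^2}\,dx
   = -\frac{1}{2}\sqrt{\frac{\pi}{a}} \;\propto\; -\sqrt{\frac{\beta}{C_A + C_B}}, \\
\langle L^2 \rangle &= \int_0^\infty 2a\,x^3 e^{-a x^2}\,dx = \frac{1}{a}, \\
\sigma^2 &= \langle L^2 \rangle - \langle L \rangle^2
   = \frac{1 - \pi/4}{a} \;\propto\; \frac{\beta}{C_A + C_B}.
\end{align}
```

Under this assumption, the ratio of mean log distances between gluon-gluon and quark-quark pairs at fixed β is √(C_q/C_g) = 2/3, and the variance ratio between β = 2 and β = 1/2 for the same jet types is 4.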

Impact of Hadronization
As a first study of the effect of non-perturbative physics on the spectral EMD, we compare the distributions of lβ=2 between jets at parton level versus hadron level. This is shown for quark jets (e⁺e⁻ → u ū) in Fig. 5 for three different collision energies. The difference between parton level and hadron level is relatively modest, with no discernible scaling with center-of-mass collision energy.
To understand whether this non-perturbative insensitivity is expected, we can try to draw an analogy with jet mass. As shown in Sec. 3.2, the β = 2 spectral EMD at O(α_s) is closely related to the sum of jet masses. Perturbatively, the mass (or two-point energy correlation function) of a jet is well understood [45,53,54] and is closely related to the observables thrust and angularities [55][56][57]. Because of its simplicity, the leading non-perturbative corrections to mass can be estimated by considering a jet with a single emission sensitive to the non-perturbative scale Λ_QCD ≃ 1 GeV, which is of comparable order to the Landau pole or hadron masses [58]. The relative transverse momentum of this non-perturbative emission depends on its energy E_NP and its angle θ_NP from the hard jet core. The non-perturbative contribution to jet mass is dominated by wide-angle emissions with θ_NP ≃ 1, and the resulting correction depends on the jet energy E. Consider E = 250 GeV, which is the approximate jet energy in Fig. 5a. If the analogy with jet mass held, then we would expect a non-perturbative shift in lβ=2 of about 0.004. Note that log 0.004 ≈ −5.5, which is a region on this plot where the distribution has nearly vanished, so it is difficult to draw robust conclusions about what is happening given the limited statistics. Nevertheless, at log lβ=2 ∼ −4, the expected non-perturbative shift would push the distribution up to log lβ=2 ∼ −3.8. If anything, the distribution appears to shift down by roughly this amount, in the opposite direction from the jet mass expectation.
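The numbers in this estimate can be reproduced with a few lines of arithmetic. The sketch below assumes that the jet-mass analogy gives an additive shift of Λ_QCD/E in the dimensionless distance, which is consistent with the quoted value 0.004 for E = 250 GeV and Λ_QCD ≃ 1 GeV. The function names are hypothetical, and natural logarithms are assumed.

```python
import math


def np_shift(jet_energy_gev, lambda_qcd_gev=1.0):
    """Assumed additive non-perturbative shift, Lambda_QCD / E."""
    return lambda_qcd_gev / jet_energy_gev


def shifted_log_distance(log_l, jet_energy_gev, lambda_qcd_gev=1.0):
    """Log distance after adding the assumed shift to the distance itself."""
    return math.log(math.exp(log_l) + np_shift(jet_energy_gev, lambda_qcd_gev))


if __name__ == "__main__":
    shift = np_shift(250.0)
    print(shift, math.log(shift))             # size of the shift and its log
    print(shifted_log_distance(-4.0, 250.0))  # where log l = -4 would move
```

A positive additive shift of this size moves log l = −4 upward by roughly 0.2, which is the direction expected for an additive observable such as jet mass.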
One reason that the analogy with jet mass might be misleading is that the spectral EMD is not an additive observable. An IRC-safe observable is additive if its value never decreases when a new emission is added to the jet [45], with jet mass being the canonical example. For a large class of additive observables, one can prove that non-perturbative physics affects the perturbative region via a simple positive shift of the distribution [59]. While the spectral EMD has an additive structure at O(α_s), this no longer holds at higher orders. Specifically, the value of the spectral EMD can decrease due to additional emissions at O(α_s^2), because of the negative contribution appearing in Eq. (3.13).
Without an all-orders understanding of how the spectral EMD is modified by non-perturbative emissions, we cannot make more concrete statements at this point. Ref. [10] showed how the EMD between parton- and hadron-level jets can be used to bound the non-perturbative shift on certain IRC-safe observables. It is not clear, though, how to translate this into a bound on the shift of the spectral EMD distribution. We leave a further analysis of non-perturbative effects to future work.

Comparison with the Original EMD
To gain more intuition for the spectral EMD, it is worth comparing its properties to the original EMD from Eq. (2.3). Already at O(α_s^0), the distances have quite different expressions due to the differing treatment of isometries. For two jets A and B each consisting of a single particle, the two EMDs depend on the jet energies E_A and E_B and, for the original EMD only, on their relative angle Ω_AB. In addition to having different units, we see that the original EMD is sensitive to Ω_AB while the spectral EMD is not.
To make a meaningful comparison between the original EMD and spectral EMD, we assume that the two jets are aligned and have equal energies, i.e. Ω_AB = 0 and E_A = E_B. With this assumption, both EMDs have values of 0 at O(α_s^0). For the remainder of our analysis, we set R = 1 for simplicity.
For jets with more than one particle, we need to specify what we mean by Ω_AB = 0. We define the jet axis n_{A,β} through a β-dependent weighted average of the jet constituents, obtained from the minimization in Eq. (7.4), and similarly for n_B. For β = 2 and massless jet constituents, this corresponds to the usual jet axis aligned with the jet momentum. Regardless of the value of β, we align the two jet axes as in Eq. (7.6), which is implicitly β dependent via the jet axes.
At O(α_s), one jet can have at most two particles, while the other jet must have one. For the following analysis, let jet A consist of particles {1, 3} and jet B consist of particle 2. From Eqs. (3.4) and (2.10), we see that the spectral EMD depends on the angle Ω_13 between the two particles in jet A. For the original EMD, all of the radiation from particles 1 and 3 has to be transported to particle 2. While the spectral EMD depends on the angle between particles in the same jet, the original EMD depends on the angle between particles in different jets.
For the special case of β = 2, we know from Eq. (3.5) that the spectral EMD equals the squared jet mass at this order. Taking the collinear limit Ω_12 + Ω_23 ≈ Ω_13 ≪ 1, the original EMD reduces to the squared jet mass divided by the jet energy. Note that thrust and β = 2 angularities [55][56][57] are also proportional to the squared jet mass in the collinear limit. For generic values of β, we can solve Eq. (7.4) in the collinear limit. Taking also the soft limit of E_3 → 0, we find that, up to normalization, the original and spectral EMDs agree at O(α_s) in the simultaneous soft and collinear limits. This means that the double-logarithmic analysis of Sec. 4 holds for both cases, with differences appearing at higher orders.
Since jet B consists of a single particle at this order, one way to interpret these EMDs is as the distance of closest approach between jet A and the manifold of one-particle configurations. For the original EMD, this interpretation was identified in Ref. [15], where it was shown more generally that the N-(sub)jettiness observables [63][64][65][66][67] correspond to the distance of closest approach to N-particle manifolds. This interpretation crucially relies on setting the jet axis via Eq. (7.4). For the spectral EMD, which is automatically invariant to isometries, the manifold of one-particle configurations consists of a single configuration with s(ω) = E^2 δ(ω).

O(α_s^2): Jets with Up to Three Particles
To the best of our knowledge, the expression for the original EMD at O(α_s^2) has not been presented in the literature. Just as in Sec. 3.3, there are two phase space configurations to consider. One contribution arises when jet A has three particles {1, 3, 5} and jet B consists of a single particle {2}. The spectral EMD follows from Eq. (3.9), and up to an overall energy scaling, it agrees with the original EMD in the strongly ordered limit.
The second contribution arises when jet A has two particles {1, 3} and jet B also has two particles {2, 4}. The spectral EMD follows from Eq. (3.13). The original EMD requires solving a genuine optimal transport problem in two dimensions. Let f ≡ f_12 be the amount of energy transported from particle 1 to particle 2. Because of the constraints in Eq. (2.4), the other elements of the transportation plan are fixed by f, where the energy coefficients must all be positive. Solving the minimization in Eq. (7.15), the value of the original EMD depends on the hierarchy of the energies and angles. We can express this as a minimization over the four possible hierarchies. Once again, the spectral EMD depends on the angle between particles in the same jet, while the original EMD depends on the angle between particles in different jets.
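The one-parameter transport problem for two jets with two particles each can be sketched as follows. This is a minimal illustration under stated assumptions: the two jets have equal total energy (so no energy-difference term appears), R = 1, and the ground cost for each pair is supplied directly as θ_ij^β. The function names and the numerical inputs are hypothetical.

```python
def transport_work(f, e1, e3, e2, e4, cost):
    """Work for the plan fixed by f = f_12 through the marginal constraints."""
    plan = {
        (1, 2): f,            # energy moved from particle 1 to particle 2
        (1, 4): e1 - f,       # remainder of particle 1 goes to particle 4
        (3, 2): e2 - f,       # particle 2 is topped up from particle 3
        (3, 4): e3 - e2 + f,  # remainder of particle 3 goes to particle 4
    }
    return sum(flow * cost[pair] for pair, flow in plan.items())


def original_emd_2x2(e1, e3, e2, e4, cost, tol=1e-12):
    """EMD between jet A = {1, 3} and jet B = {2, 4} with equal total energy."""
    if abs((e1 + e3) - (e2 + e4)) > tol:
        raise ValueError("this sketch assumes equal jet energies")
    f_lo = max(0.0, e2 - e3)  # keeps f_12 and f_34 non-negative
    f_hi = min(e1, e2)        # keeps f_14 and f_32 non-negative
    # The work is linear in f, so the optimum lies at an endpoint of the range.
    return min(transport_work(f_lo, e1, e3, e2, e4, cost),
               transport_work(f_hi, e1, e3, e2, e4, cost))


if __name__ == "__main__":
    cost = {(1, 2): 0.01, (1, 4): 0.25, (3, 2): 0.16, (3, 4): 0.04}
    print(original_emd_2x2(0.7, 0.3, 0.6, 0.4, cost))
```

Because the work is linear in f, the candidate optima are the endpoint values 0, E_2 − E_3, E_1, and E_2, and which one applies depends on the ordering of the energies and angles.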
In the collinear limit for β = 2, we can express this EMD in a nice form that we will further simplify in Sec. 7.3 by incorporating rotations about the jet axis. As discussed above, to make a meaningful comparison between jets with the EMD, we align their axes, which for β = 2 means that we align their net momenta. Then, we consider the particles in the two jets as illustrated in Fig. 7, where the relative azimuthal angle ϕ of the particles in the two jets is measured between the common jet axis and particles a and b.
Using the law of cosines in the collinear limit, the pairwise angles that appear in Eq. (7.17) can be expressed in terms of the intrajet angles Ω_13 and Ω_24 and the azimuth ϕ. The EMD then takes the simpler form given in Eq. (7.21).

Incorporating Rotational Isometries
For the above analysis, we assumed that we can freely translate the jets to align their axes, as in Eq. (7.6). As discussed in Ref. [68], one can also perform rotations to further align the radiation. This strategy of projecting the original EMD out by translations and rotations is known as the tangent EMD (TEMD) [34]. For general angles and β, we are unaware of a closed-form expression for the TEMD. By working in the collinear limit and fixing β = 2, though, we can gain insight from an approximate closed-form expression.
Assuming that rotations about the jet axis are isometries, to calculate the TEMD in the collinear limit, we simply fix the relative azimuthal angle that appears in Eq. (7.21) to the value that minimizes the EMD. This angle is clearly ϕ = π, which fixes the TEMD in the collinear limit with β = 2. The original EMD differs from this expression by a non-negative term that depends on the relative azimuthal angle ϕ. Thus, we see that there is an explicit EMD cost to rotations about the jet axis, which enforces the relationship EMD ≥ TEMD. Both the TEMD and the SEMD respect isometries, so it is interesting to compare their behavior in the collinear limit with β = 2. The sign of their difference fixes the relative size of the SEMD with respect to the TEMD. This difference can be either positive or negative, however, so there exists no fixed relationship between the SEMD and TEMD. The angular factors and the energy factors that enter the two distances obey opposite hierarchies. Therefore, the relative size of the SEMD and the TEMD depends in detail on how the energy is shared among the two particles in the jets and how that sharing compares to the relative angles between particles. This also implies that the relative size of the original EMD and the SEMD depends sensitively on the particular momentum of the particles in the jets, which highlights the complementarity of these approaches.

A Spectral Metric for Theory Space
Finally, we introduce a spectral approach for constructing a metric on the space of theories. As shown in Ref. [15], one can lift the original EMD to a cross-section mover's distance (ΣMD), which provides a data-driven way to define the distance between theories. Here, we introduce the spectral ΣMD, which is invariant to the isometries of a theory by construction, even if the explicit form of those isometries is not known. We note that the idea of a metric on theory space has a long history. In conformal field theories (CFTs), the Zamolodchikov metric [37,38] is the canonical Riemannian metric on the space of theories, and it can be used to establish general and far-reaching results regarding the growth of degrees of freedom due to renormalization group evolution from high scales to low scales. The differential Zamolodchikov metric, or line element, is defined as the value of a two-point correlation function between local operators constructed from tangent vectors on the space of theories. This metric is therefore very nice for spaces that are smooth under variation of parameters, though not all CFTs are of this form. Furthermore, it is unclear how to practically use the Zamolodchikov metric to interpret realistic collider data. More recent approaches to theory space include an application of information geometry in quantum field theory [70] and a reformulation of the exact renormalization group [71] as an optimal transport problem [72].
In this section, we first review the original ΣMD before constructing its spectral variant.For the 2-Wasserstein variant of the spectral ΣMD in particular, we can write down an explicit expression for the metric tensor in terms of Lagrangian parameters.This metric tensor exhibits an intriguing link between the spectral ΣMD and renormalization group flow.

Review of the Original ΣMD
Consider a theory T defined as a set of events E_i with associated cross sections σ_i. In analogy to Eq. (2.1), we can define T as a distribution over events [15]. Integrating over all events, a theory is normalized to the total cross section σ_tot. Strictly speaking, the volume element dE involves separate integrations over different multiplicity final states. Given a distance between events d(E_i, E_j), we can define a distance between theories as the work needed to rearrange theory A to theory B. Since the weight being moved is cross section, Ref. [15] called this the cross-section mover's distance, where the transportation plan F_ab is constrained analogously to Eq. (2.4). The ΣMD depends on γ and S (the analogs of β and R in the original EMD) and the choice of ground metric d (the analog of Ω in the original EMD). While Ref. [15] advocated setting d(E_i, E_j) equal to EMD_{β,R}(E_i, E_j), the ΣMD can be defined using any ground metric on event space. The geometry on theory space is then induced by the ΣMD.

Introducing the Spectral ΣMD
Just as the spectral representation of an event in Eq. (2.8) is invariant to isometries, we can introduce a spectral representation of a theory that is invariant to isometries. Unlike collision events or jets in particle physics, the isometries in theory space are not necessarily known a priori, unless one has a known model space being studied. In the case of collision events, the solution was to represent an event exclusively in terms of pairwise angular distances, weighted by particle energies, which is clearly invariant to O(3) rotations or reflections of the celestial sphere. In the case of theories, we can represent a theory in terms of pairwise distances between events, ζ(E_i, E_j). With this motivation, we introduce the theory spectral function, which is defined through pairwise distances between events and weighted by event cross sections. Here, T ≡ {E} is a set of events as produced in some theory, ζ(E_i, E_j) is the distance between events i and j, and σ_i is the cross section for event E_i. By definition, pairwise distances are invariant to isometries, and any metric distance between events could be used to define ζ, including the original EMD or spectral EMD.
From the theory spectral function, we can define its cumulative function by integrating over ζ. The spectral ΣMD between theory A and theory B is then given by Eq. (8.6), where the p = 1 subscript reminds us that this is a 1-Wasserstein distance. We have suppressed integration bounds in this expression, but the integral ranges over all physical values of ζ, from 0 up to the maximal distance between events ζ_max. By construction, Eq. (8.6) is a metric on the space of theory spectral functions. From the theorem of Ref. [44] (see App. A), we expect it to also be a metric on the space of theories modulo isometries, though there may potentially be some subtleties. Technically, Ref. [44] proved that knowing the unordered pairwise distances of n points in the space R^k is sufficient to uniquely determine the set of points, up to isometries, as long as all pairwise distances are distinct. In general, we do not know what the manifold of theories is nor do we know its topology. We expect, however, that this is not an issue because of results like the Whitney [73] (see, e.g., Refs. [74,75] for modern presentations) or Nash [76][77][78] embedding theorems, which establish that smooth manifolds can be isometrically embedded in sufficiently high dimensional Euclidean space. So, we assume that the theorem of Ref. [44] can be applied to generic theories as long as the number of events is countable, but we leave a more detailed justification or identification of limitations of this assumption to future work.
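The construction described above can be sketched for a toy theory with a finite list of events. The code assumes that the theory spectral function sums σ_i σ_j over all ordered pairs of events (including i = j), that the two theories have equal total cross section, and that the p = 1 distance is the integral of the absolute difference of cumulative functions. These choices, the function names, and the toy events (points on a line with ζ given by their separation) are illustrative assumptions rather than a transcription of Eqs. (8.4) to (8.6).

```python
def theory_spectral_function(events, cross_sections, distance):
    """Peaks (zeta, weight) with weight sigma_i * sigma_j over ordered pairs."""
    peaks = {}
    for event_i, sigma_i in zip(events, cross_sections):
        for event_j, sigma_j in zip(events, cross_sections):
            zeta = distance(event_i, event_j)
            peaks[zeta] = peaks.get(zeta, 0.0) + sigma_i * sigma_j
    return sorted(peaks.items())


def spectral_sigma_md(peaks_a, peaks_b, tol=1e-9):
    """1-Wasserstein distance: integral over zeta of |S_A(zeta) - S_B(zeta)|."""
    total_a = sum(weight for _, weight in peaks_a)
    total_b = sum(weight for _, weight in peaks_b)
    if abs(total_a - total_b) > tol:
        raise ValueError("this sketch assumes equal total cross sections")
    # Signed steps of S_A - S_B, ordered by the position zeta of each peak.
    steps = sorted([(z, w) for z, w in peaks_a] + [(z, -w) for z, w in peaks_b])
    work, cumulative_diff, previous = 0.0, 0.0, steps[0][0]
    for zeta, weight in steps:
        work += abs(cumulative_diff) * (zeta - previous)
        cumulative_diff += weight
        previous = zeta
    return work


if __name__ == "__main__":
    separation = lambda x, y: abs(x - y)
    theory_a = theory_spectral_function([0.0, 1.0], [0.5, 0.5], separation)
    theory_b = theory_spectral_function([0.0, 2.0], [0.5, 0.5], separation)
    translated_a = theory_spectral_function([5.0, 4.0], [0.5, 0.5], separation)
    print(spectral_sigma_md(theory_a, theory_b))
    print(spectral_sigma_md(theory_a, translated_a))
```

Because only pairwise distances enter, translating or reflecting the toy events leaves the spectral function unchanged, so the distance between a theory and its isometric copy vanishes without the isometry ever being specified.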

Riemannian Theory Space
A key advantage of the spectral ΣMD over the original ΣMD is that we can express it in closed form.The 2-Wasserstein metric has a Riemannian structure [79][80][81] and this can be used to extract a metric tensor for theory space, expressed in terms of the cumulative theory spectral functions.
Analogous to Eq. (2.15), the (squared) 2-Wasserstein version of Eq. (8.6) is given in Eq. (8.7), where S^{-1}(σ^2) is the inverse cumulative spectral function, whose argument is a squared cross section. With continuous distributions of pairwise event distances, we can express this (squared) Riemannian metric in terms of the cumulative spectral function. First, the cumulative spectral function is written as an integral over pairs of events, where dσ_i is the differential cross section squared for events in ensemble i. The inverse cumulative spectral function can then be written as in Eq. (8.9), which satisfies the inverse function theorem. The expression in Eq. (8.9) can be used to evaluate the p = 2 spectral ΣMD of Eq. (8.7).
To convert the spectral ΣMD into a metric tensor, we need to define coordinates on theory space. Let the theory spectral functions be dependent on a set of parameters {λ}, like couplings or masses, that define the theory. We will be interested in the spectral ΣMD between a theory evaluated at energy scales Q and Q + dQ, respectively, where these parameters change under renormalization group flow according to their β-functions. Assuming that Q only appears implicitly through the parameters {λ}, the change in the inverse cumulative spectral function follows from the chain rule. Plugging this into Eq. (8.7), we obtain the differential line element in Eq. (8.13), where the symmetric rank-two tensor g_ij, given in Eq. (8.14), can be viewed as a metric on theory space.
Since it is often more convenient to compute the cumulative spectral function S(ζ) rather than its inverse S^{-1}(σ^2), we provide an alternative expression for the metric tensor g_ij. Taking derivatives of Eq. (8.9) with respect to λ_i yields Eq. (8.15).
Using the fact that the cross-section-squared coordinate is related to the event distance coordinate ζ as σ^2 = S(ζ), we can rewrite these derivatives in terms of S(ζ). Inserting these relations into Eq. (8.14) yields an alternative form for g_ij. The differential line element in Eq. (8.13) shares a key property with the Zamolodchikov metric, namely that non-zero distances are only accumulated with non-zero β-functions. A key difference, though, is that the spectral ΣMD has a direct connection to measured quantities in collider events. Here, we just note this fascinating connection and leave a deeper interpretation and understanding to future work.
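The steps leading to the metric tensor can be sketched in a few lines. The display below is a reconstruction from the surrounding prose, not a quotation of Eqs. (8.7) to (8.15): it assumes the usual one-dimensional 2-Wasserstein form in terms of inverse cumulative functions, defines β_i ≡ dλ_i / d log Q, and ignores overall normalization conventions.

```latex
\begin{align}
\Sigma\mathrm{MD}_{p=2}^2 &= \int d\sigma^2 \,
   \Big[ S_A^{-1}(\sigma^2) - S_B^{-1}(\sigma^2) \Big]^2 , \\
S^{-1}_{Q+dQ}(\sigma^2) &= S^{-1}_{Q}(\sigma^2)
   + \sum_i \frac{\partial S^{-1}}{\partial \lambda_i}\,\beta_i\,\frac{dQ}{Q} , \\
ds^2 &= \sum_{i,j} g_{ij}\,\beta_i\,\beta_j \left(\frac{dQ}{Q}\right)^2 ,
\qquad
g_{ij} = \int d\sigma^2 \,
   \frac{\partial S^{-1}}{\partial \lambda_i}\,
   \frac{\partial S^{-1}}{\partial \lambda_j} , \\
S\big(S^{-1}(\sigma^2;\lambda);\lambda\big) = \sigma^2
 \;&\Rightarrow\;
\frac{\partial S^{-1}}{\partial \lambda_i}
   = -\,\frac{\partial S/\partial \lambda_i}{\partial S/\partial \zeta}\bigg|_{\zeta = S^{-1}(\sigma^2)} , \\
g_{ij} &= \int d\zeta \,
   \frac{(\partial S/\partial \lambda_i)\,(\partial S/\partial \lambda_j)}{\partial S/\partial \zeta} .
\end{align}
```

The last line uses dσ^2 = (∂S/∂ζ) dζ. In this form the line element vanishes whenever all β_i vanish, which is the property shared with the Zamolodchikov metric.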

Conclusions
Equipping the space of particle collision data with a metric opens up a suite of geometric data analysis strategies.The spectral EMD introduced in this paper offers a complementary approach to the original EMD from Ref. [10].The spectral EMD respects isometries by construction, unlike the original EMD which is faithful (though not invariant) to the symmetries of the ground metric.Futhermore, since spectral functions are one-dimensional objects, we can avoid the numerical optimization needed for two-dimensional optimal transport.Two drawbacks of the spectral approach is that not every spectral function corresponds to a physical arrangement of particles, and when it does, the spectral function redundantly encodes the particle information.We view these as reasonable tradeoffs to achieve closed form expressions for the spectral EMD that are amenable to precision calculations.
This paper has just scratched the surface of potential applications and consequences of the spectral EMD, and there are several questions introduced in the text that deserve further study.Both the original EMD and spectral EMD define a metric space for collider events; can their similarities and differences be made more precise and quantitative?Both the tangent EMD and spectral EMD are invariant to isometries and have the same behavior at double logarithmic accuracy; how do their properties differ going to higher orders?The spectral EMD is not an additive observable, which complicates the analysis of non-perturbative effects; can we nevertheless understand the apparently small impact of hadronization?The spectral function provides a unique jet representation up to isometries and sets of measure zero; is there an experimental impact from degeneracies that can appear due to finite angular resolution?We focused on a spectral EMD construction based on the 1-Wasserstein metric where there is a duality between energy ordering and angular ordering; are there advantages from instead using the 2-Wasserstein metric which exhibits a Riemannian structure?Going to theory space, we found an intriguing connection between the spectral ΣMD and the Zamolodchikov metric; can this relation be sharpened, and what distinguishes theories at different points along their renormalization group flow?
We hope that the spectral EMD and the way that the spectral function encodes information finds broad applications for physics analyses.For example, the spectral function approach may provide a novel method for extracting physical quantities like the strong coupling α s or the top quark mass m t .With the connection to a theory space metric, perhaps it would unlock new ways to observe and measure the QCD β-function, through the flow of QCD as different energy scales are probed.This could also provide a new perspective on entropy growth as a parton shower evolves to the infrared [82,83].Recently, a procedure for measuring the top quark mass was proposed that exploited the correlation between pairwise angles and mass scales in the hadronic decay of a top quark [84], utilizing three-point energy correlators.Energy correlators are the first moment of the spectral function, or its higherpoint generalizations, and their structure has an immediate interpretation as the correlation function of local operators on the celestial sphere.
Finally, the presence of a (Riemannian) metric implies that collider data lives on a manifold, and the properties of this manifold can be studied. The study of Ref. [28] showed that N-body massless relativistic phase space is the product of a simplex and a hypersphere, which has a non-trivial topology as encoded in homotopy groups. This non-trivial topology has consequences for machine learning on the data and may present obstructions that cannot be overcome by some architectures [85]. The collider geometry induced by a metric is sensitive to the structure of phase space and four-momentum conservation, but also implicitly involves the squared matrix element. In perturbation theory, matrix elements generically exhibit divergences at phase space boundaries, which might dramatically alter the geometry and topology relative to the naive expectation from phase space. Does the simple form of the spectral EMD enable the prediction, calculation, and observation of quantities like the Ricci curvature in data as an optimal transport space [86]? We hope that answers to these questions and more will produce a rich and fruitful perspective on the vast quantities and high dimensionality of collider physics data.

A Uniqueness of the Spectral Representation
As mentioned in Sec. 2.2, the spectral function determines an event uniquely up to isometries and configurations of measure zero. In this appendix, we present a proof of this statement and highlight some pathological cases.
It is worth mentioning that there are other approaches to enforce isometries. For example, one can construct irreducible representations of the isometry group from invariant polynomials of particle momenta. This is similar in spirit to efforts to establish invariant operator bases in which equation-of-motion redundancies have been eliminated; see, e.g., Ref. [87]. The systematic procedure for doing this is rather challenging, though, especially as the multiplicity of operators (or particles) increases. Alternatively, if one is willing to forgo irreducibility, then one can use a jet representation that is overcomplete, like C-correlators [40] or energy flow polynomials [88]. Such bases can be excessively overcomplete, though, requiring tens of thousands of terms, even for modest particle multiplicities. By contrast, the spectral function of a jet with n resolved particles has n(n − 1)/2 + 1 δ-functions, corresponding to all distinct particle pairs and the contact term at z = 0. While some of this information is redundant, it is not excessively so, with a quadratic growth of the required information with multiplicity n. Crucially, this information is encoded in such a way that distinct jets are represented distinctly, up to isometries.
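The counting and the isometry invariance described above can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes the spectral function takes the form s(ω) = Σ_i E_i² δ(ω) + Σ_{i<j} 2 E_i E_j δ(ω − ω_ij), with ω_ij taken to be the Euclidean distance between particles on a plane. The function name `spectral_function` and the random test jet are hypothetical.

```python
import numpy as np

def spectral_function(energies, positions):
    """Return sorted spike locations and weights of a jet's spectral function.

    Assumed form: one contact term at omega = 0 with weight sum_i E_i^2,
    plus one spike per distinct pair (i < j) with weight 2 E_i E_j.
    """
    energies = np.asarray(energies, dtype=float)
    positions = np.asarray(positions, dtype=float)
    i, j = np.triu_indices(len(energies), k=1)  # all distinct pairs i < j
    pair_dist = np.linalg.norm(positions[i] - positions[j], axis=1)
    omegas = np.concatenate(([0.0], pair_dist))
    weights = np.concatenate(([np.sum(energies**2)], 2.0 * energies[i] * energies[j]))
    order = np.argsort(omegas)
    return omegas[order], weights[order]

# Hypothetical jet: n particles with random energies and planar positions.
rng = np.random.default_rng(0)
n = 5
E = rng.uniform(0.1, 1.0, size=n)
x = rng.normal(size=(n, 2))
omegas, weights = spectral_function(E, x)

# Apply an isometry (rotation + translation) and relabel the particles.
angle = 0.7
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle), np.cos(angle)]])
perm = rng.permutation(n)
x_iso = (x @ R.T + np.array([0.3, -1.2]))[perm]
omegas_iso, weights_iso = spectral_function(E[perm], x_iso)

print("number of spikes:", len(omegas), "expected:", n * (n - 1) // 2 + 1)
print("invariant under isometry and relabeling:",
      np.allclose(omegas, omegas_iso) and np.allclose(weights, weights_iso))
```

With this assumed normalization the weights sum to the squared total energy, which is why the number of stored values grows only quadratically with n.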

A.1 Proof of Uniqueness
The key assumption for the following proof is that all pairwise distances between particles in a jet are distinct real numbers. For experimental data, particle locations in the tracker or calorimeter are discrete because of the finite angular resolution of the detectors. This complicates the construction of the jet from its spectral function as presented here, but we leave a detailed study to future work.
The proof of the uniqueness of the spectral function representation goes as follows. We start with angular information. The spectral function clearly encodes all pairwise distances between particles in the jet through the location of its δ-function spikes. These pairwise distances are real numbers and in general have continuous, non-zero probability to lie anywhere in the interval ω ∈ [0, ω_max]. A jet only ever has a finite integer multiplicity, and so there is strictly zero probability that two distinct pairs of particles have the same pairwise distance.
With this, we can apply Theorem 2.6 of Ref. [44], which established the conditions for constructibility of n points on the plane from their distribution of pairwise distances. They prove the following result. Let P = (p_1, p_2, ..., p_n) and Q = (q_1, q_2, ..., q_n) be two collections of points on the plane, each drawn from some non-degenerate, continuous probability distributions. Then, with probability 1, if the sets of unordered pairwise distances of points in P and Q are identical, then there exists an isometry in E(2) ⊗ S_n that renders the sets P and Q [...]
[...] θ_26 = θ_46, then s_B(ω) will have a single peak at the same location as for s_A(ω). Furthermore, these peaks will have the same height if E_1 E_3 = E_2 E_4 + E_2 E_6 + E_4 E_6. Thus, to render these spectral functions identical, there must be four constraints imposed on the structure of s_B(ω), such that the degenerate subspace has codimension 4.
To estimate when experimental resolution might render this configuration problematic, assume that there is some angular resolution ϵ_θ within which all angles are equal and an energy resolution ϵ_E within which energies are equal. Then, the probability that these two spectral functions are within resolution of one another scales like ϵ_θ^3 ϵ_E, assuming continuous and smooth probability distributions for pairwise angles and particle energies. A conservative estimate of these resolution factors exclusively from the resolution of calorimetry at ATLAS or CMS at the Large Hadron Collider, for example, assumes ϵ_θ ∼ ϵ_E ∼ 0.1, so these nearly degenerate configurations are suppressed by at least 10^−4 with respect to generic particle momenta, with no other assumptions on the distributions of particles. There may be applications that enhance these nearly degenerate contributions, for example if there are low-multiplicity configurations from resonance decays that have preferred angular scales. The main lesson from this study is that one has to be mindful of configurations that can be close in SEMD even if they are far apart in EMD.
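Written out, the suppression estimate is simple arithmetic. The block below is a sketch under one assumption: that the scaling ϵ_θ^3 ϵ_E reflects one factor of ϵ_θ for each angular constraint and one factor of ϵ_E for the energy constraint, in line with the codimension-4 counting. The label P_degenerate is introduced here for illustration only.

```latex
P_{\text{degenerate}} \;\sim\; \epsilon_\theta^{3}\,\epsilon_E
\;\sim\; (0.1)^{3} \times (0.1) \;=\; 10^{-4}.
```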

B Comparison with the 2-Wasserstein Metric
In the body of this paper, we focused on the 1-Wasserstein metric for defining the spectral EMD. As discussed in Eq. (2.15), it is also natural to define the (p-th power of the) p-Wasserstein metric. The p = 2 case is particularly interesting, since the 2-Wasserstein distance enjoys many of the properties of a Riemannian metric, like the uniqueness of (affine) geodesics. Indeed, Riemannian manifolds like the surface of the Earth, or Lorentzian manifolds like space-time, are the most familiar contexts in which a metric is used in physics. In this appendix, we briefly discuss some properties of the 2-Wasserstein metric on the spectral function, leaving an in-depth study to future work.
Following Eq. (2.15) the (squared) 2-Wasserstein distance between two spectral functions s_A(ω) and s_B(ω) is:
For simplicity, we have assumed that the two jets have the same total energy E. Unlike the p = 1 case from Eq. (2.11), where we could define the spectral EMD through the cumulative spectral function, the p = 2 case only has a closed-form expression in terms of the inverse cumulative spectral function S^−1, whose argument is a squared energy and whose value is a pairwise angular distance.
Now we want to repeat parts of the analysis of Sec. 3 and consider the p = 2 spectral EMD on low-multiplicity jets. If two jets A and B each consist of a single particle with the
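The inverse-cumulative construction described in this appendix can be illustrated with the standard one-dimensional optimal transport result, W_p^p = ∫ dt |S_A^−1(t) − S_B^−1(t)|^p, where t runs from 0 to the common total weight. The sketch below applies this generic formula to weighted δ-function spikes. It is not taken from the paper: the function name `spectral_wasserstein`, the normalization and the example jets are assumptions.

```python
import numpy as np

def spectral_wasserstein(omegas_a, weights_a, omegas_b, weights_b, p=2):
    """p-th power of the 1D p-Wasserstein distance between two sets of
    weighted spikes with equal total weight, via quantile functions."""
    oa, wa = np.asarray(omegas_a, float), np.asarray(weights_a, float)
    ob, wb = np.asarray(omegas_b, float), np.asarray(weights_b, float)
    ia, ib = np.argsort(oa), np.argsort(ob)
    oa, wa, ob, wb = oa[ia], wa[ia], ob[ib], wb[ib]
    if not np.isclose(wa.sum(), wb.sum()):
        raise ValueError("both spectral functions must have the same total weight")
    ca, cb = np.cumsum(wa), np.cumsum(wb)  # cumulative spectral functions
    # Both inverse cumulative functions are piecewise constant between
    # consecutive breakpoints of the merged cumulative weights.
    t = np.concatenate(([0.0], np.union1d(ca, cb)))
    dt = np.diff(t)
    mid = 0.5 * (t[1:] + t[:-1])
    # S^{-1}(t) is the first spike whose cumulative weight reaches t.
    qa = oa[np.minimum(np.searchsorted(ca, mid, side="left"), len(oa) - 1)]
    qb = ob[np.minimum(np.searchsorted(cb, mid, side="left"), len(ob) - 1)]
    return float(np.sum(dt * np.abs(qa - qb) ** p))

# Hypothetical two-particle jets with unit total energy and equal energy
# sharing: contact term of weight 0.5 at omega = 0 and a pair spike of
# weight 0.5 at the opening angle (0.4 for jet A, 0.1 for jet B).
jet_a = ([0.0, 0.4], [0.5, 0.5])
jet_b = ([0.0, 0.1], [0.5, 0.5])
w2_squared = spectral_wasserstein(*jet_a, *jet_b, p=2)
w1 = spectral_wasserstein(*jet_a, *jet_b, p=1)
print(w2_squared, w1)
```

For these inputs only the pair spikes differ, so the p = 2 value is the shared weight times the squared angle difference, 0.5 × 0.3², and the p = 1 value is 0.5 × 0.3.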

Figure 1: Labeling convention for particles in this paper. Jet A consists of odd-numbered particles and jet B consists of even-numbered particles.

Figure 7: Illustration of the azimuthal angle ϕ that quantifies the relative orientation of the two jets about their common axis in the collinear limit for calculating EMD(1,1).