1 Introduction

Voronoi tessellations (for a basic introduction, see, e.g., [1]) are useful in a wide variety of fields, from biology [2] to astronomy [3, 4] to condensed matter physics [5]. In high energy physics, they have been used only sporadically, e.g., as an optional approach to QCD jet-finding and area determination in FastJet [6] and in the model-independent definition of search regions in SLEUTH [7,8,9,10]. In Ref. [11], some of us pointed out that Voronoi methods can be applied directly to the analysis of data from high energy physics experiments, e.g., when trying to detect the presence of a new physics signal in the data or to perform parameter measurements.

In most Voronoi-based approaches, the goal is to use Voronoi tessellations to identify “neighbors” of data points. The tessellation then automatically provides a number of cell-based attributes for each data point. Reference [11] argued that using the geometric properties of Voronoi cells, and, in particular, functions of the geometric properties of Voronoi cells and their neighbors, gives valuable additional information and can allow for relatively model-independent searches for targeted “features” in the data. As briefly discussed in Ref. [11], a particularly useful application is the study of kinematic edges when investigating cascade decays in new physics models such as supersymmetry (SUSY) [12].

To understand the importance of edge-finding in multidimensional spaces for SUSY mass measurement, we first note that many extensions of the standard model (SM) are characterized by a \(\mathbb {Z}_2\) symmetry under which new physics particles (NPPs) are charged but the SM particles are uncharged. Such a symmetry ensures that the lightest NPP will be stable and hence may constitute the dark matter. With the assumption of such a symmetry, a typical collider event involving NPPs proceeds as follows:

  1.

    NPPs are pair produced.

  2.

    Each NPP goes through a series of (generally two and three-body) decays called a “decay chain”. In each decay, an NPP decays to another, lighter, NPP, and one or more SM particles. The NPPs generally have a small intrinsic width compared with their mass. Hence it is generally a good approximation to view the decay chain as consisting of a series of on-shell decays of NPPs.

  3.

    Eventually the lightest particle charged under the \(\mathbb {Z}_2\) is reached. It is stable, and, if a dark matter candidate, uncharged and uncolored. Hence it will escape the detector without being detected.

Popular new physics models within this paradigm include SUSY, where the \(\mathbb {Z}_2\) symmetry is called “R-parity”; Universal Extra Dimensions (UED) [13], in which the \(\mathbb {Z}_2\) symmetry is called “KK-parity”; and Little Higgs models, in which the \(\mathbb {Z}_2\) symmetry is called “T-parity” [14].

As the lightest NPP escapes detection, we are not able to determine directly the masses of the initial new physics particles produced in the collision, nor the masses of any intermediate particles in the decay, as we would if we were studying a resonance decaying to visible particles. However, we can determine the masses of the NPPs by studying the distributions of (functions of) the momenta of observed particles [15].

Much effort has gone into determining the best way to actually perform this mass measurement. The simplest methods involve finding an edge or an endpoint in the one-dimensional distribution of the invariant mass of two (or more) reconstructed objects (see Footnote 1) [16,17,18,19,20,21,22,23,24,25,26,27,28]. If one is able to measure enough of these kinematic endpoints, it is then possible to solve for the unknown masses, possibly up to discrete ambiguities [29,30,31]. This approach naturally evolved into the so-called “polynomial method” [32,33,34,35,36,37,38,39,40,41,42,43,44,45,46], where one attempts to solve explicitly for the momenta of the invisible particles in a given event, possibly using additional information from prior measurements of kinematic invariant mass endpoints. Since at hadron colliders the longitudinal momenta of the initial state partons are unknown, much effort went into the development of suitable “transverse variables” [47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62], which are invariant under longitudinal boosts (see Footnote 2). In principle, the optimal approach to a mass measurement is provided by the so-called “Matrix Element Method” (MEM) [67,68,69,70,71,72,73,74]. However, its use is often computationally prohibitive, especially when dealing with complicated final states and/or large reducible backgrounds. Many of the approaches described above have been extended in various ways, e.g., the \(M_{T2}\) kink method allows the measurement of the mass of the lightest NPP [75,76,77,78,79,80,81,82], and useful \(3+1\)-dimensional analogues of the “transverse” invariant mass variables have been suggested [83,84,85,86,87,88,89,90,91,92].

The approach to the mass measurement taken here seeks to improve on those described in the existing literature in the following ways:

  1.

    Instead of finding edges or endpoints in the one-dimensional distribution of a single variable, we will attempt to determine the boundary of the signal region in a higher-dimensional phase space. This improves on one-dimensional methods by greatly increasing the amount of information that can be extracted from the data [93] (see Footnote 3). To be specific, we shall consider the classic SUSY decay chain of three successive two-body decays as shown in Fig. 1. From the measured four-momenta, \(p_i (i=1,2,3)\), of the three visible particles, \(v_i\), in the decay, one can form three two-body invariant mass combinations, \(m_{ij}=\sqrt{(p_i+p_j)^2}\), for \(i<j\). Signal events will then populate the interior of a compact region, \(\mathcal{V}_3\), in the three-dimensional phase space, \((m_{12}, m_{23}, m_{13})\), of invariant masses [94, 95]. The size and the shape of the two-dimensional surface boundary of \(\mathcal{V}_3\), which we term \(\mathcal{S}_2\), contain the complete information about the underlying mass spectrum, \(m_{X_i} (i=1,2,3,4),\) of the NPPs in Fig. 1. Therefore, we shall focus on methods for detecting \(\mathcal{S}_2\) directly (see Footnote 4). One can imagine doing this in two ways:

    • By defining a kinematic variable which takes the same (constant) value (e.g., zero) everywhere along the phase-space boundary (see Footnote 5). This approach will be discussed below in Sect. 3, where we review the relevant variable, \(\Delta _4\), introduced in Ref. [93].

    • By analyzing the measured density of events in phase space and locating the boundary, \(\mathcal{S}_2\), directly using techniques inspired by spatial analyses performed in other fields of science. This will be the main subject of this paper, and it will be discussed in Sects. 2.2, 3.2, 3.3, and 4.

  2.

    We build on the idea of Ref. [11] that Voronoi tessellations provide a powerful and model-independent tool for identifying edges (for a brief introduction to Voronoi tessellations, see Sect. 2.1 below). While the analysis of Ref. [11] was limited to data in two dimensions, here we extend the procedure to the three-dimensional case and try to delineate the region, \(\mathcal{V}_3\), in the phase space of the three variables, \((m_{12}, m_{23}, m_{13})\). Before tackling a SUSY physics example in Sect. 4, we consider several analogous toy examples of increasing complexity in Sects. 2 and 3. This helps develop the reader’s intuition and motivates some of our analysis choices. Following Ref. [11], in order to select “edge” cells in the Voronoi tessellation of the data, we consider the relative standard deviation (RSD; see Footnote 6), \({\bar{\sigma }}_i\), of the volumes of neighboring cells, which is defined as follows. Let \(N_i\) be the set of neighbors of the ith Voronoi cell, with volumes, \(\{v_j\}\), for \(j\in N_i\). The RSD, \({\bar{\sigma }}_i\), is then defined by

    $$\begin{aligned} \bar{\sigma }_i \equiv \frac{1}{\langle {v}(N_i)\rangle }\, \sqrt{\sum _{j\in N_i} \frac{\left( v_j-\langle {v}(N_i)\rangle \right) ^2}{|N_i|-1}}, \end{aligned}$$
    (1)

    where we have normalized by the average volume of the set of neighbors, \(N_i\), of the ith cell

    $$\begin{aligned} \langle {v}(N_i)\rangle \equiv \frac{1}{|N_i|}\sum _{j\in N_i} v_j. \end{aligned}$$
    (2)

    The variable defined in Eq. (1) is a straightforward extension to three dimensions of the “scaled standard deviation” of neighbor areas found to be helpful in Ref. [11].

  3.

    We make crucial use of the recent observation of Ref. [93] that for sufficiently many-body final states there is an enhancement (in fact a singularity) in the phase-space density near the boundary, \(\mathcal{S}_2\), of the allowed phase space, \(\mathcal{V}_3\). Due to the enhancement in the density of signal events near the boundary of phase space, we can alternatively target the boundary points of \(\mathcal{V}_3\) as being points in a densely populated region. This motivates us to consider, in addition to \({\bar{\sigma }}_i\), a second variable related to volume. We choose

    $$\begin{aligned} \bar{v}_i = \frac{v_i}{\langle {v}(N_i)\rangle }, \end{aligned}$$
    (3)

    where again we normalize by the average volume (2) of the set of neighbors, \(N_i\).

Fig. 1. Decay process of a heavy resonance \(X_1\) into three visible particles, \(v_1\), \(v_2\) and \(v_3\), along with an invisible particle, \(X_4\), via two on-shell intermediate states, \(X_2\) and \(X_3\). The NPPs, \(X_i\), are denoted by red dashed lines while visible SM particles are denoted by black solid lines

In the following, we will therefore focus on the two Voronoi-based dimensionless variables (1) and (3). The former is motivated by the discontinuity in the density of events at the boundary [11], while the latter is motivated by the enhancement in the density of signal events at the boundary [93]. We will find that the judicious combination of these two variables yields a significant increase in sensitivity as compared with either variable in isolation. As a result we find that we are able to identify the boundary, \(\mathcal{S}_2\), of the allowed signal phase space, \(\mathcal{V}_3\), with a high degree of accuracy, even when the ratio of signal-to-background events, S / B, is relatively small.
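To make the definitions (1)–(3) concrete, the following minimal Python sketch evaluates both variables for a single cell, given the volumes of its neighbors. The function names and the numerical inputs are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def neighbor_rsd(neighbor_volumes):
    """Eq. (1): sample standard deviation (|N_i| - 1 in the denominator)
    of the neighbor volumes, divided by their mean, Eq. (2)."""
    v = np.asarray(neighbor_volumes, dtype=float)
    if v.size < 2:
        return float("nan")  # Eq. (1) is undefined for fewer than two neighbors
    return float(np.std(v, ddof=1) / np.mean(v))

def relative_volume(v_i, neighbor_volumes):
    """Eq. (3): volume of cell i divided by the mean neighbor volume, Eq. (2)."""
    return float(v_i / np.mean(np.asarray(neighbor_volumes, dtype=float)))

# Hypothetical inputs: similar neighbors (bulk-like) versus a mix of
# small and large neighbors (edge-like).
bulk_like = [1.0, 1.1, 0.9, 1.0]
edge_like = [1.0, 1.0, 6.0, 6.0]
rsd_bulk = neighbor_rsd(bulk_like)
rsd_edge = neighbor_rsd(edge_like)
vbar = relative_volume(0.5, bulk_like)
```

With these made-up inputs, the edge-like neighborhood yields a much larger RSD than the bulk-like one, which is the qualitative behavior the variable is designed to capture.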

To support these conclusions, as well as to explain in detail to the reader the methods employed, we proceed as follows. In Sect. 2 we will provide a brief, but sufficient, review of Voronoi tessellation methods and the use of the geometric properties of Voronoi cells for identifying features in high energy physics data. Section 3.1 will review the consideration of events in multidimensional phase space, and, especially the observation that, for sufficiently many dimensions, the differential phase-space volume is highly peaked near the boundary. Then in Sects. 3.2 and 3.3 we study the efficacy of Voronoi methods for finding a densely populated spherical boundary in a generalized “phase space”, while Sect. 4 will examine the application of these methods to an actual benchmark point. We present our conclusions in Sect. 5. Throughout our studies, we will use ROC curves to quantify the sensitivity of the variables we define; we briefly review and discuss this approach in Appendix A.

2 Voronoi methods for finding boundaries

Voronoi tessellation [97] refers to the procedure, previously proposed by Dirichlet [98] and hinted at by Descartes [99], through which an n-volume containing a set of \(N_d\) data points, \(\{d_i\}\), is divided into \(N_d\) non-overlapping regions, \(\{R_i\}\), such that \(d_i \in R_i, \forall i\). The boundaries of \(R_i\) are chosen such that, for every point in some region \(R_j\), the corresponding data point, \(d_j\), is the nearest data point.
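The nearest-point definition above can be illustrated without constructing the cells explicitly: a query point lies in region \(R_j\) exactly when \(d_j\) is its nearest data point. The sketch below assumes SciPy's `cKDTree` for the nearest-neighbor lookup; the sample size, seed, and the helper name `cell_index` are arbitrary illustrative choices.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
data = rng.random((50, 2))   # hypothetical data points d_i in the unit square
tree = cKDTree(data)

def cell_index(points):
    """Return, for each query point, the index j of the region R_j containing it,
    i.e. the index of the nearest data point."""
    _, idx = tree.query(np.atleast_2d(points))
    return idx

# Every data point lies in its own region, d_i in R_i.
own = cell_index(data)
```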

For applications in high energy physics, we consider the data points to be events in a suitably chosen phase space (see Footnote 7). It is important to make a judicious choice of phase space – on the one hand, it should be of low enough dimensionality to keep the problem tractable in practice, yet the dimensional reduction should not result in the loss of any useful information, e.g., the washing out of interesting features in the underlying phase-space distributions. Consider, as an example, the decay chain of Fig. 1. In general, the inclusive production of the \(X_1\) resonance will be described by a nine-dimensional phase space, consisting, e.g., of the nine momentum components of the visible particles, \(v_i\). However, three of those degrees of freedom correspond to uninteresting Lorentz boosts of \(X_1\), another three degrees of freedom are angular variables which are sensitive to spins but not the \(X_i\) mass spectrum, leaving only the three degrees of freedom relevant to a mass measurement. As already mentioned in the introduction, we can take these three degrees of freedom to be the invariant mass quantities \((m_{12}, m_{23}, m_{13})\). We shall present the results from our analysis of this physics example in Sect. 4, but we first begin with a few toy studies.
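For concreteness, the three invariant masses \((m_{12}, m_{23}, m_{13})\) can be computed from the visible four-momenta as in the following minimal sketch. It assumes four-vectors stored as \((E, p_x, p_y, p_z)\) with metric signature \((+,-,-,-)\); the example momenta are hypothetical massless particles chosen only so that the result is easy to verify by hand.

```python
import numpy as np

def minkowski_dot(p, q):
    """Lorentz-invariant product with signature (+, -, -, -)."""
    return p[0] * q[0] - np.dot(p[1:], q[1:])

def invariant_masses(p1, p2, p3):
    """Return (m12, m23, m13) with m_ij = sqrt((p_i + p_j)^2)."""
    def m(a, b):
        s = a + b
        # Guard against tiny negative values caused by rounding.
        return float(np.sqrt(max(minkowski_dot(s, s), 0.0)))
    return m(p1, p2), m(p2, p3), m(p1, p3)

# Hypothetical massless visible particles, (E, px, py, pz).
p1 = np.array([1.0, 0.0, 0.0, 1.0])
p2 = np.array([1.0, 0.0, 0.0, -1.0])
p3 = np.array([1.0, 1.0, 0.0, 0.0])
m12, m23, m13 = invariant_masses(p1, p2, p3)
```

Each event then corresponds to one point \((m_{12}, m_{23}, m_{13})\) in the three-dimensional space that is tessellated.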

2.1 Voronoi tessellations in two dimensions

In order to make contact with Ref. [11] and to introduce our notation, we begin by studying several simplified scenarios in two dimensions. In Sect. 2.2, we will generalize the approaches taken and the lessons learned here to the case of three dimensions.

2.1.1 A linear boundary in two dimensions

Fig. 2. \(N_{\mathrm{events}}=280\) events distributed according to (4) with \(\rho =6\), together with the corresponding Voronoi tessellation. The shaded cells are those crossed by the boundary (vertical yellow line) and are defined to be the “edge cells”

In Fig. 2, we consider the unit square in the first quadrant (\(x\ge 0, y\ge 0\)) and simulate \(N_{\mathrm{events}}=280\) events (data points) according to the probability distribution

$$\begin{aligned} f(x,y) = \frac{2}{1+\rho } \left[ \rho H(0.5-x) + H(x-0.5) \right] , \end{aligned}$$
(4)

where H(x) is the Heaviside step function and \(\rho \) is a constant density ratio, taken in Fig. 2 to be \(\rho =6\). The meaning of the distribution (4) is very simple: the unit square is divided into two equal halves by the vertical boundary at \(x=0.5\) (the yellow line in Fig. 2). Within each half, the density is constant (on average), but the left region is denser by a factor of \(\rho \). This setup produces an edge at \(x=0.5\), where the density changes by a factor of \(\rho \). Our goal will be to detect this edge by tagging the Voronoi cells that are crossed by the boundary line – such cells from now on will be referred to as “edge cells” and in Fig. 2 they are shaded in brown. The remaining Voronoi cells away from the edge will be referred to as “bulk” cells, and in Fig. 2 they are left white.
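A minimal sketch of how events could be drawn from the distribution (4) is given below. Integrating (4) over the left half of the unit square gives a probability \(\rho /(\rho +1)\), so each event is first assigned to a side and then placed uniformly within it. The function name, seed, and generator are illustrative assumptions.

```python
import numpy as np

def sample_linear_edge(n_events, rho, rng):
    """Draw n_events points in the unit square from Eq. (4):
    uniform density on each side of x = 0.5, with the left side rho times denser."""
    left = rng.random(n_events) < rho / (rho + 1.0)
    u = rng.random(n_events)
    x = np.where(left, 0.5 * u, 0.5 + 0.5 * u)
    y = rng.random(n_events)
    return np.column_stack([x, y])

rng = np.random.default_rng(1)
pts = sample_linear_edge(280, 6.0, rng)
n_left = int(np.sum(pts[:, 0] < 0.5))  # fluctuates around rho*N/(rho+1) = 240
```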

Fig. 3. The Voronoi tessellation shown in Fig. 2, with cells color coded according to the number of neighboring polygons (upper left), normalized area (7) (upper right), isoperimetric quotient (8) (lower left), and RSD (9) (lower right)

The basic idea put forth in Ref. [11] was to study the resulting Voronoi tessellation and identify edge cells from their geometric properties (as well as from the geometric properties of neighboring cells within the immediate vicinity). The Voronoi cells in two dimensions are planar polygons, for which one can investigate the usual geometric properties like number of sides, area, perimeter, etc. Figure 3 shows four possibilities, where the Voronoi cells are color coded according to the value of the corresponding geometric quantity. Then, in Fig. 4, we plot the probability distributions for these geometric quantities separately for the edge cells (red solid lines) and the bulk cells (blue dotted lines). As can be seen in Fig. 2, the edge cells, by construction, represent a very small fraction of the total number of Voronoi cells in the tessellation. Thus, in order to increase the statistical precision of the distributions in Fig. 4, we generated \(N_{\mathrm{exp}}=1000\) pseudo-experiments analogous to the one shown in Fig. 2.

In the upper left panels of Figs. 3 and 4, we study the number of elements, \(|N_i|\), in the set of neighbors, \(N_i\). This is equivalent to the number of sides of the ith Voronoi polygon. This variable has been studied in the literature (see, e.g., [100]): for example, it is known that the Voronoi polygons most commonly have 5 or 6 sides, which is confirmed in Figs. 3 and 4. There is also a long tail of polygons with many sides, which is conjectured to behave asymptotically as \(|N_i|^{-2|N_i|}\) [101]. Indeed, Fig. 2 contains an example of a polygon with as many as 12 sides! The upper left panel of Fig. 4 demonstrates that, as expected, the \(|N_i|\) distributions for bulk and edge cells are rather similar, and are thus not suitable for tagging edge cells [11].

Fig. 4. Unit-normalized distributions of the four Voronoi cell properties depicted in Fig. 3. Blue dotted (red solid) histograms refer to bulk (edge) cells. In order to increase the statistics, we show results from \(N_{\mathrm{exp}}=1000\) pseudo-experiments with \(N_{\mathrm{events}}=280\) each

The upper right panels of Figs. 3 and 4 illustrate a different geometric quantity related to the areas, \(a_i~(i=1,2,\ldots ,N_{\mathrm{events}})\), of the Voronoi cells. The areas of the Voronoi polygons are meaningful because they provide an estimate of the value of the underlying distribution \(f(x,y)\) (4) at the corresponding data point \((x_i,y_i)\):

$$\begin{aligned} f(x_i,y_i) \approx \frac{1}{N_{\mathrm{events}}\times a_i}, \end{aligned}$$
(5)

so that \(f(x,y)\) is still unit-normalized:

$$\begin{aligned} \sum _{i=1}^{N_{\mathrm{events}}} f(x_i,y_i) \times a_i = \frac{1}{N_{\mathrm{events}}} \sum _{i=1}^{N_{\mathrm{events}}} \frac{a_i}{a_i} = 1. \end{aligned}$$
(6)

In Figs. 3 and 4 we choose to normalize the cell areas not locally as in (3), but by the expected average area in the dense region. Thus, a typical bulk cell to the left of the vertical boundary has normalized area of approximately 1, while a typical bulk cell to the right of the boundary has a normalized area of approximately \(\rho =6\). Note that while we fix the total number of events, \(N_{\mathrm{events}}\), the fraction which ends up on one side of the boundary varies – for example, in the single pseudo-experiment depicted in Figs. 2 and 3, there happen to be 243 events on the left side and 37 events on the right side (to be compared with the expectation of \(\rho N_{\mathrm{events}}/(\rho +1)=240\) events on the left side and \(N_{\mathrm{events}}/(\rho +1)=40\) events on the right side). If the total area is \(A=\sum _i a_i\), the expected average size of a bulk cell in the dense region on the left is given by \(A(\rho +1)/(2\rho N_{\mathrm{events}})\), hence in the upper right panels of Figs. 3 and 4 we plot the cell areas, \(a_i\), normalized as

$$\begin{aligned} \bar{a}_i = \frac{2\rho N_{\mathrm{events}}}{\rho +1}\, \frac{a_i}{A}. \end{aligned}$$
(7)
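As a quick numerical illustration of this normalization (a worked example using only the values quoted above, \(A=1\), \(\rho =6\) and \(N_{\mathrm{events}}=280\); the labels “dense” and “sparse” are introduced here purely for convenience), the expected average cell areas on the two sides of the boundary are

$$\begin{aligned} \langle a\rangle _{\mathrm{dense}} = \frac{A(\rho +1)}{2\rho N_{\mathrm{events}}} = \frac{7}{3360} = \frac{1}{480}, \qquad \langle a\rangle _{\mathrm{sparse}} = \frac{A(\rho +1)}{2 N_{\mathrm{events}}} = \frac{7}{560} = \frac{1}{80}, \end{aligned}$$

so that (7) maps a typical dense-side cell to \(\bar{a}_i\approx 1\) and a typical sparse-side cell to \(\bar{a}_i\approx (1/80)/(1/480)=6=\rho \).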

The distribution of the Voronoi cell areas when the points have been randomly selected (in a “Poisson process”) is not known analytically, and is typically approximated with a three-parameter generalized gamma function, where the parameters are fitted to results from Monte Carlo simulations [102]. For our purposes, we are not interested in the form of the actual distributions but in the question of whether the distributions for bulk and edge cells show any appreciable differences. As seen in the upper right panel of Fig. 4, the area distribution for bulk cells is nearly bimodal; with the normalization (7) two peaks are expected near \(\bar{a}_i\sim 1\) and \(\bar{a}_i\sim \rho =6\) (see Footnote 8). The area distribution for edge cells, on the other hand, is unimodal, peaking relatively close to \(\bar{a}_i\sim 1\). After all, we expect a larger fraction, namely \(\rho /(\rho +1)\) of the edge cells, to have their centers on the “dense” side of the boundary and only \(1/(\rho +1)\) of the edge cells to have their centers on the “sparse” side of the boundary. A close inspection of Fig. 2 confirms this expectation: out of the 21 edge cells, 18 (3) have their centers to the left (to the right) of the vertical boundary, which is consistent with our expectations for \(\rho = 6\). In conclusion, it is clear that in this case, the Voronoi cell area by itself is not a very good candidate for an edge-tagging variable [11]. We expect that in the more general situation, where the densities on each side of the boundary are not uniform, this variable will be even more unsuitable.

Having investigated a variable describing the size of the Voronoi polygon, we now examine a variable characterizing the shape of the polygon, e.g., the isoperimetric quotient

$$\begin{aligned} q_i \equiv \frac{4\pi a_i}{p_i^2}, \end{aligned}$$
(8)

where \(a_i\) is the area and \(p_i\) is the perimeter of the ith Voronoi polygon. The variable (8) is a measure of “roundedness” – it is equal to zero for infinitely thin (pencil-like) polygons, and is equal to 1 in the limit of a perfectly symmetric polygon with infinitely many sides (i.e., a circle). The corresponding results for the isoperimetric quotient are shown in the lower left panels of Figs. 3 and 4. We observe that the edge cells tend to be slightly more “squashed”, but the difference is very minor and not useful in practice.
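The limiting behavior of (8) described above is easy to check numerically. The sketch below computes the isoperimetric quotient of a simple polygon from its ordered vertices, using the shoelace formula for the area; the helper names and test shapes are illustrative assumptions.

```python
import numpy as np

def isoperimetric_quotient(vertices):
    """Eq. (8), q = 4*pi*a / p^2, for a simple polygon with ordered vertices."""
    v = np.asarray(vertices, dtype=float)
    nxt = np.roll(v, -1, axis=0)  # next vertex, wrapping around
    area = 0.5 * abs(np.sum(v[:, 0] * nxt[:, 1] - nxt[:, 0] * v[:, 1]))
    perimeter = np.sum(np.linalg.norm(nxt - v, axis=1))
    return float(4.0 * np.pi * area / perimeter**2)

def regular_polygon(n):
    """Vertices of a regular n-gon inscribed in the unit circle."""
    t = 2.0 * np.pi * np.arange(n) / n
    return np.column_stack([np.cos(t), np.sin(t)])

q_square = isoperimetric_quotient([(0, 0), (1, 0), (1, 1), (0, 1)])      # pi/4
q_thin = isoperimetric_quotient([(0, 0), (1, 0), (1, 0.01), (0, 0.01)])  # close to 0
q_round = isoperimetric_quotient(regular_polygon(1000))                  # close to 1
```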

Fig. 5. The analogue of Fig. 2 (left panel) and the analogue of the lower right panel in Fig. 3 (right panel), for the radially symmetric distribution (11). In order to keep the statistics the same as in Figs. 2 and 3 we place \(N_{\mathrm{events}}=280\) events inside the dashed circle with radius \(r=\sqrt{2}\)

In a similar vein, one could continue to study other geometric characteristics of a single Voronoi cell, e.g., perimeter, average side length, etc. [103], but, similarly, this is unlikely to lead to any success in identifying edge cells. The reason is that we are trying to detect a discontinuity and therefore we need to study the relative properties of cells on both sides of the boundary. One possibility is to compute a derivative quantity, e.g., the gradient at each cell location [11]. Another option is to compare the spread in the areas of the neighboring cells, e.g., by computing the RSD, \(\bar{\sigma }_i\), of the areas of the cells in \(N_i\) (the set of neighbors of the ith Voronoi polygon) in analogy to (1) [11]:

$$\begin{aligned} \bar{\sigma }_i \equiv \frac{1}{\langle {a}(N_i)\rangle }\, \sqrt{\sum _{j\in N_i} \frac{\left( a_j-\langle {a}(N_i)\rangle \right) ^2}{|N_i|-1}}, \end{aligned}$$
(9)

where the normalization now is done by the average area of the neighbors,

$$\begin{aligned} \langle {a}(N_i)\rangle \equiv \frac{1}{|N_i|}\sum _{j\in N_i} a_j. \end{aligned}$$
(10)

The idea is very simple – the neighbors of edge cells are typically quite diverse – some happen to be on the dense side and are therefore relatively small, while others are on the sparse side and are relatively large. As a result, the RSD of neighbor areas for edge cells is expected to be enhanced. On the other hand, for bulk cells, all neighbors are roughly similar, and the RSD of their areas should be small. These expectations are confirmed in the lower right panels of Figs. 3 and 4. In the temperature plot of Fig. 3, the edge cells are clearly distinguished by the different color, and the \(\bar{\sigma }_i\) distributions for bulk and edge cells in Fig. 4 are visibly displaced from each other. We see that, in agreement with the conclusions from Ref. [11], the RSD, \(\bar{\sigma }_i\), is a promising variable for edge detection (see Footnote 9).
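The full chain for the linear-boundary toy example (tessellate, compute cell areas, collect neighbors, evaluate (9), and tag the cells crossed by \(x=0.5\)) can be sketched as follows. This is an illustrative reconstruction rather than the procedure used to produce the figures: it assumes SciPy's `Voronoi` and `ConvexHull`, and it bounds the cells by the unit square through mirror images of the points across the four sides, a choice made here for convenience since the treatment of boundary cells is not specified in the text.

```python
import numpy as np
from scipy.spatial import ConvexHull, Voronoi

def mirrored(points):
    """Append mirror images across the four sides of the unit square, so that
    the cells of the original points are clipped to the square."""
    x, y = points[:, 0], points[:, 1]
    images = [np.column_stack([-x, y]), np.column_stack([2.0 - x, y]),
              np.column_stack([x, -y]), np.column_stack([x, 2.0 - y])]
    return np.vstack([points] + images)

def areas_and_neighbors(points):
    n = len(points)
    vor = Voronoi(mirrored(points))
    areas = np.empty(n)
    for i in range(n):
        region = vor.regions[vor.point_region[i]]
        # For 2D input, ConvexHull.volume is the enclosed area.
        areas[i] = ConvexHull(vor.vertices[region]).volume
    neighbors = [[] for _ in range(n)]
    for a, b in vor.ridge_points:
        if a < n and b < n:  # ignore ridges shared with mirror images
            neighbors[a].append(b)
            neighbors[b].append(a)
    return vor, areas, neighbors

def neighbor_rsd(areas, neighbors):
    """Eq. (9) for every cell; NaN if a cell has fewer than two neighbors."""
    out = np.full(len(areas), np.nan)
    for i, nb in enumerate(neighbors):
        if len(nb) > 1:
            a = areas[nb]
            out[i] = np.std(a, ddof=1) / np.mean(a)
    return out

def edge_flags(vor, n, x0=0.5):
    """A convex cell is crossed by the line x = x0 if its vertices straddle x0."""
    flags = np.zeros(n, dtype=bool)
    for i in range(n):
        xs = vor.vertices[vor.regions[vor.point_region[i]], 0]
        flags[i] = xs.min() < x0 < xs.max()
    return flags

# Events drawn from Eq. (4) with rho = 6 (seed chosen arbitrarily).
rng = np.random.default_rng(2)
n, rho = 280, 6.0
left = rng.random(n) < rho / (rho + 1.0)
u = rng.random(n)
pts = np.column_stack([np.where(left, 0.5 * u, 0.5 + 0.5 * u), rng.random(n)])

vor, areas, neighbors = areas_and_neighbors(pts)
sigma = neighbor_rsd(areas, neighbors)
edge = edge_flags(vor, n)
```

Comparing, e.g., `np.nanmedian(sigma[edge])` with `np.nanmedian(sigma[~edge])` then gives a rough numerical counterpart of the separation seen in the lower right panel of Fig. 4; the exact values depend on the random seed.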

Fig. 6. Two-dimensional slices at \(z=0\) through phase space for the three-dimensional toy example studied in Sect. 2.2. We distribute \(N_{\mathrm{events}}=4200\) points according to the three-dimensional probability distribution (12) within a sphere of radius \(\root 3 \of {2}\) centered at the origin \((x,y,z)=(0,0,0)\). The Voronoi tessellation is done before taking the two-dimensional slice, i.e., the cell boundaries seen on these four plots are obtained by intersecting the three-dimensional Voronoi cell boundaries with the plane at \(z=0\). The yellow circle marks the boundary of the dense core. The resulting cells in the two-dimensional slice are color coded by a certain attribute of the corresponding three-dimensional Voronoi cell: number of neighbors (upper left); normalized volume (upper right); isoperimetric quotient (13) (lower left); and RSD of the neighboring volumes (1) (lower right)

2.1.2 A circular boundary in two dimensions

Before concluding our discussion in two dimensions, we perform one more toy exercise. In the example of the previous Sect. 2.1.1, the boundary was a straight line; in a more realistic situation we will encounter a boundary which is an arbitrary curved line. In anticipation of the physics example discussed in Sect. 4, we now consider a two-dimensional example with a curved boundary in the shape of a circle. Instead of the rectangular pattern given by (4), we consider the radially symmetric distribution

$$\begin{aligned} f(\vec { r}) \sim \rho H(1-r) + H(r-1) H(\sqrt{2}-r), \end{aligned}$$
(11)

where \(\vec { r}=(x,y)\) is the position vector in two dimensions and \(r\equiv |\vec { r}|\) is its magnitude. As in (4), the distribution (11) describes two regions: the inner region is the unit disk, while the outer region is an annulus (hollow disk) extending up to \(r=\sqrt{2}\) (the circular dashed line in Fig. 5). The regions are separated by a circular boundary at \(r=1\), marked with the solid yellow curve in Fig. 5. As in the example from Sect. 2.1.1, the two regions have equal areas, each region has a constant density, and one region is \(\rho \) times denser than the other (see Fig. 5).

Just as in Sect. 2.1.1, we choose \(\rho =6\) and generate \(N_{\mathrm{events}}=280\) events according to (11); they are distributed so that the ratio of the bulk events on the two sides of the boundary is equal to \(\rho \). Thus, out of the \(N_{\mathrm{events}}=280\) events inside the dashed circle with \(r=\sqrt{2}\), on average we will have \(\rho N_{\mathrm{events}}/(\rho +1)=240\) events in the dense interior region (the unit circle) and \(N_{\mathrm{events}}/(\rho +1)=40\) events within the sparse exterior hollow disk (see Footnote 10).
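One possible way to generate such events is sketched below. Since the two regions have equal areas, an event falls in the interior with probability \(\rho /(\rho +1)\); a uniform density in area is then obtained by drawing \(r^2\) uniformly in \([0,1]\) (interior) or \([1,2]\) (exterior). The function name and seed are illustrative assumptions.

```python
import numpy as np

def sample_disk_edge(n_events, rho, rng):
    """Draw n_events points from Eq. (11): uniform density inside r < 1 and
    inside 1 < r < sqrt(2), with the inner region rho times denser."""
    inner = rng.random(n_events) < rho / (rho + 1.0)
    u = rng.random(n_events)
    # r^2 uniform on [0, 1) or [1, 2) gives a density that is uniform in area.
    r = np.sqrt(np.where(inner, u, 1.0 + u))
    phi = 2.0 * np.pi * rng.random(n_events)
    return np.column_stack([r * np.cos(phi), r * np.sin(phi)])

rng = np.random.default_rng(4)
pts = sample_disk_edge(280, 6.0, rng)
n_inner = int(np.sum(np.hypot(pts[:, 0], pts[:, 1]) < 1.0))  # fluctuates around 240
```

The same construction should carry over to a spherically symmetric distribution in three dimensions by drawing \(R^3\) uniformly and choosing an isotropic direction.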

In the left panel of Fig. 5, the brown-shaded polygons are by definition the edge cells (those crossed by the yellow circular boundary). The right panel in Fig. 5 demonstrates that, once again, the edge cells can be effectively selected by the RSD, \(\bar{\sigma }_i\), of the areas of the neighboring cells.

2.2 Voronoi tessellations in three dimensions

Since the relevant physics example we treat in Sect. 4 is in a three-dimensional phase space, \((m_{12}, m_{23}, m_{13})\), we shall now generalize our previous discussion to three dimensions. For this purpose, we consider the three-dimensional analogue of (11):

$$\begin{aligned} f(\vec { R}) \sim \rho H(1-R) + H(R-1) H(\root 3 \of {2}-R) , \end{aligned}$$
(12)

where now \(\vec { R}=(x,y,z)\) is the position vector in three dimensions and \(R\equiv |\vec { R}|\). The distribution (12) again describes two regions of constant density, except now the dense region is a three-dimensional spherical core of radius 1. Again, we choose \(\rho =6\) and generate \(N_{\mathrm{events}}=4200\) events according to (12). The events populate a ball of radius \(R=\root 3 \of {2}\) centered at the origin, \((x,y,z)=(0,0,0)\). On average, we expect to have \(\rho N_{\mathrm{events}}/(\rho +1)=3600\) events in the core and \(N_{\mathrm{events}}/(\rho +1)=600\) events in the outer hollow spherical shell (\(1\le R\le \root 3 \of {2}\)) (see Footnote 11).

Figures 6 and 7 illustrate this three-dimensional simplified scenario in analogy to Figs. 3 and 4. Since the Voronoi cells in three dimensions are polyhedra, it is difficult to visualize them on a planar plot. Thus, Fig. 6 shows only a slice at a fixed \(z=0\), i.e., an equatorial plane view. The cell boundaries seen in the figure are the intersections of the equatorial plane with the walls of the Voronoi polyhedra. The interiors of those cells are color coded according to the value of the geometric property (number of faces, volume, etc.) of the corresponding three-dimensional polyhedron (see Footnote 12). For example, the upper left panel in Fig. 6 shows that the Voronoi polyhedra typically have a relatively large number of faces (or equivalently, neighbors); the corresponding distribution for bulk cells, shown in the upper left panel in Fig. 7, is known to peak at 15 [104]. We also observe that the edge cells are very similar in that regard, i.e., there is no appreciable difference in the number of neighbors as we move across the boundary.

Fig. 7. The same as Fig. 4, but for the three-dimensional toy example depicted in Fig. 6. Edge cells are defined to be those Voronoi cells which are crossed by the boundary of the unit sphere (\(R=1\))

In the upper right panels of Figs. 6 and 7 we show the corresponding result for the normalized volumes, \(\bar{v}_i\), of the Voronoi polyhedra, where, in analogy to (7), we scale each volume, \(v_i\), by the expected average volume in the dense core, \(\frac{4}{3}\pi (\rho +1)/(\rho N_{\mathrm{events}})\). As expected, the distribution for bulk cells is bimodal, while edge cells behave somewhat similarly to the interior bulk cells (as we already saw in the two-dimensional example of Sect. 2.1.1).
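As a worked check of this normalization (using only the values quoted above, \(\rho =6\) and \(N_{\mathrm{events}}=4200\); the labels “core” and “shell” are introduced here for convenience), note first that the shell and the core have equal volumes, and then evaluate the expected average cell volumes:

$$\begin{aligned} \frac{4}{3}\pi \left[ \left( \root 3 \of {2}\right) ^3-1\right] = \frac{4}{3}\pi , \qquad \langle v\rangle _{\mathrm{core}} = \frac{4\pi }{3}\, \frac{\rho +1}{\rho N_{\mathrm{events}}} = \frac{4\pi }{3}\, \frac{1}{3600} \approx 1.16\times 10^{-3}, \qquad \langle v\rangle _{\mathrm{shell}} = \frac{4\pi }{3}\, \frac{\rho +1}{N_{\mathrm{events}}} = \frac{4\pi }{3}\, \frac{1}{600} \approx 6.98\times 10^{-3}. \end{aligned}$$

The two bulk peaks of the normalized volume are therefore expected near 1 and near \(\rho =6\), as in the two-dimensional case.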

In the lower left panels of Figs. 6 and 7 we plot the analogous “isoperimetric quotient” for the three-dimensional case,

$$\begin{aligned} Q_i \equiv \frac{6\sqrt{\pi }\, v_i}{s_i^{3/2}}, \end{aligned}$$
(13)

where \(v_i\) (\(s_i\)) is the volume (surface area) of the Voronoi polyhedron and the normalization is chosen so that \(Q_i=1\) for a perfect sphere. Figures 6 and 7 show that the shapes of the Voronoi polyhedra, as measured by (13), are very similar in the two bulk regions and not much different near the boundary either.
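The quantity (13) can be evaluated for any convex polyhedron from its vertices. The sketch below assumes SciPy's `ConvexHull`, whose `volume` and `area` attributes give the enclosed volume and the surface area for three-dimensional input; the two test shapes are hypothetical and were chosen because their values follow from elementary geometry.

```python
import numpy as np
from scipy.spatial import ConvexHull

def isoperimetric_quotient_3d(vertices):
    """Eq. (13), Q = 6*sqrt(pi)*v / s^(3/2), for a convex polyhedron."""
    hull = ConvexHull(np.asarray(vertices, dtype=float))
    return float(6.0 * np.sqrt(np.pi) * hull.volume / hull.area**1.5)

def box(a, b, c):
    """The eight corners of an a x b x c box."""
    return np.array([[x, y, z] for x in (0.0, a) for y in (0.0, b) for z in (0.0, c)])

q_cube = isoperimetric_quotient_3d(box(1.0, 1.0, 1.0))   # sqrt(pi/6), about 0.72
q_slab = isoperimetric_quotient_3d(box(1.0, 1.0, 0.01))  # much smaller: a thin slab
```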

This leaves us with the RSD, \({\bar{\sigma }}_i\), of the volumes for the set of neighbors, \(N_i\). This quantity was already defined in (1) and our results are shown in the lower right panels of Figs. 6 and 7. We see that \(\bar{\sigma }_i\) can efficiently identify edge cells; the circular boundary is clearly seen in the lower right plot of Fig. 6. The \(\bar{\sigma }_i\) distributions for bulk and edge cells are quite distinct, as shown in Fig. 7. Thus we verify that \(\bar{\sigma }_i\) remains a promising variable for edge detection beyond the two-dimensional examples studied in Ref. [11].

3 Phase-space considerations

While two- and three-body phase space is discussed at length in most standard lectures and textbooks on quantum field theory, a Lorentz-invariant formulation of the general case with an arbitrary number of final state particles is often omitted. This is in part because processes with more than three final state particles can, in almost all circumstances, be analyzed as a sequence of on-shell production and decay stages and in part because the level of formalism required to describe the general case is significantly more involved. Nevertheless, as was shown in Ref. [93], even when a cascade decay proceeds through multiple on-shell stages, treating the phase space in its fully differential form captures important correlations that cannot be inferred from more traditional one-dimensional observables such as kinematic edges and endpoints. In this context, we briefly review the geometry of four-body phase space (see Footnote 13), concentrating on the equation describing the boundary of the kinematically available region and on the differential volume element. In Sect. 3.2, we shall apply kinematic features obtained from the phase-space considerations in Sect. 3.1 to our uniform sphere example from Sect. 2.2 and use the resulting toy example to study our Voronoi methods for three-dimensional data.

3.1 Review of the four-body phase space of a cascade decay

Let us consider the process where a heavy resonance, X, decays into four particles. We first focus on presenting the form of four-body phase space in full generality and will further specialize to the case where the decay proceeds via three-step two-body cascade decays as in Fig. 1. Following the argument in Ref. [105], we begin by introducing a \(4\times 4\) matrix defined as

$$\begin{aligned} {\mathcal Z}=\left\{ z_{ij}\right\} \quad \mathrm{with}\quad z_{ij}= p_{i}\cdot p_{j}, \end{aligned}$$
(14)

where the \(\{p_i\}\) denote the four-momenta of final state particles, including the NPP \(X_4\).Footnote 14 We then define the characteristic polynomial of \(\mathcal {Z}\) as

$$\begin{aligned} \det \left[ \lambda I_{4\times 4}-\mathcal {Z}\right] \equiv \lambda ^4-\left( \sum _{i=1}^{4} \Delta _{i}\lambda ^{4-i}\right) =0, \end{aligned}$$
(15)

where \(\lambda \) represents the relevant eigenvalues and the \(\Delta _i\) identify the coefficients of the above polynomial. Specifically, one finds that \(\Delta _{1}=\mathrm{Tr}[{\mathcal Z}]= \sum _{i=1}^{4}m_{i}^{2}\) and \(\Delta _{4}=-\det \left[ {\mathcal Z}\right] \). It turns out that the kinematically allowed region is given by \(\Delta _{1,2,3,4}>0\) [105]; the boundary of this region is formed by

$$\begin{aligned} \Delta _4 =0,\quad \Delta _{1,2,3}>0. \end{aligned}$$
(16)
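The definitions in (14)–(16) can be checked numerically. The following is a minimal sketch, assuming NumPy is available, which builds \(\mathcal {Z}\) for four arbitrary on-shell momenta (random momenta chosen for illustration, not events from the cascade in Fig. 1) and reads off the \(\Delta _i\) from the characteristic polynomial. It also uses the linear-algebra identity \(\det \mathcal {Z}=-(\det P)^2\), where P is the \(4\times 4\) matrix of momentum components, so that \(\Delta _4\ge 0\) for any physical configuration.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])  # (+,-,-,-) signature


def on_shell_momenta(masses, three_momenta):
    """Stack four-vectors (E, px, py, pz) for given masses and 3-momenta."""
    three_momenta = np.asarray(three_momenta, dtype=float)
    masses = np.asarray(masses, dtype=float)
    energies = np.sqrt(masses**2 + np.sum(three_momenta**2, axis=1))
    return np.column_stack([energies, three_momenta])


def delta_coefficients(momenta):
    """Return (Delta_1, ..., Delta_4) from the characteristic polynomial of Z."""
    z = momenta @ METRIC @ momenta.T       # z_ij = p_i . p_j
    eigenvalues = np.linalg.eigvalsh(z)    # Z is real and symmetric
    coeffs = np.poly(eigenvalues)          # monic polynomial [1, c1, c2, c3, c4]
    return -coeffs[1:]                     # Delta_i = -c_i, following Eq. (15)


# Illustrative input: three massless particles plus one massive particle.
rng = np.random.default_rng(0)
masses = np.array([0.0, 0.0, 0.0, 100.0])
momenta = on_shell_momenta(masses, rng.normal(scale=100.0, size=(4, 3)))
deltas = delta_coefficients(momenta)
```

Here `deltas[0]` reproduces \(\sum _i m_i^2\) and `deltas[3]` equals \((\det P)^2\), which vanishes exactly when the four momenta become linearly dependent, i.e., on the boundary (16).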

What makes four-body (and higher) phase space particularly interesting is the form of the volume element. In terms of \(m_{ij}^{2}=\left( p_{i}+p_{j}\right) ^2=2z_{ij}+m_{i}^{2}+m_{j}^{2}\), the four-body phase space, \(\Pi _4\), can be written as

$$\begin{aligned} \mathrm{d}\Pi _{4}= & {} \left( \prod _{i<j}\mathrm{d}m_{ij}^{2}\right) \frac{8}{(4\pi )^{10}M_{X}^{2} \Delta _{4}^{1/2}}\nonumber \\&\times \delta \left( \sum _{i<j}m_{ij}^{2}-\left( M_{X}^{2} + 2 \sum _{i=1}^{4} m_{i}^{2}\right) \right) , \end{aligned}$$
(17)

where the normalization has been chosen to reproduce the PDG convention [106] for the well-known expression with non-Lorentz-invariant quantities,

$$\begin{aligned} \mathrm{d}\Pi _{4}=\delta ^{4}\left( p_{X}-\sum _{i=1}^{4}p_{i}\right) \prod _{i=1}^{4} \frac{\mathrm{d}^{3}p_{i}}{(2\pi )^{3}2E_{i}}. \end{aligned}$$
(18)
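For completeness, the argument of the delta function in (17) can be checked using only momentum conservation, \(p_X=\sum _i p_i\), and the relation \(m_{ij}^{2}=2z_{ij}+m_{i}^{2}+m_{j}^{2}\) quoted above:

$$\begin{aligned} M_{X}^{2}=\left( \sum _{i=1}^{4}p_{i}\right) ^{2}=\sum _{i=1}^{4}m_{i}^{2}+2\sum _{i<j}z_{ij}, \qquad \sum _{i<j}m_{ij}^{2}=2\sum _{i<j}z_{ij}+3\sum _{i=1}^{4}m_{i}^{2}=M_{X}^{2}+2\sum _{i=1}^{4}m_{i}^{2}, \end{aligned}$$

where the factor of 3 counts the number of pairs in which each particle appears. Thus only five of the six \(m_{ij}^{2}\) are independent.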

We remark that \(\mathrm{d}\Pi _4\) is inversely proportional to \(\sqrt{\Delta _4}\), and, since \(\Delta _4=0\) defines the kinematic boundary, as in (16), the phase-space density is singular at the boundary. The singularity is integrable, but it implies that events are more likely to populate the region close to the boundary than the region far away from it. This feature is ideal for mass measurements, which ultimately rely on the determination of this phase-space boundary.

Fig. 8

The phase-space structure implied by Eq. (25). The data points are generated with the event topology in Fig. 1, using a constant matrix element. The mass spectrum is \((m_{X_1},m_{X_2},m_{X_3},m_{X_4}) = (500,350,200,100)\) GeV. A three-dimensional scatter plot (upper left) and three phase-space slices at fixed \(\xi _{13}\): \(\xi _{13}=0.25\) (upper right), \(\xi _{13}=0.5\) (lower left), and \(\xi _{13}=0.75\) (lower right). The red dot-dashed (outermost) curve is the contour for \(\Delta _4=0\), while the black dashed curves correspond to \(\Delta _4\) contours for 10, 30, 50, 70, and 90% of \(\Delta _{4,\max }\). The data points which would have emerged via the flat component in Eq. (27) are represented by blue \(\times \) symbols, whereas the data points from the remaining enhanced component \(\sim \frac{1}{\sqrt{q}}-1\) are represented by red \(+\) symbols

Given the generic formalism for the phase space with four particles in the final state, we now specialize to the case where the decay proceeds through the three consecutive two-body decays shown in Fig. 1. The \(X_{i}\)’s are NPPs represented by red dashed lines, while the \(v_{i}\)’s are SM particles represented by black solid lines. For simplicity, we assume that all SM particles are massless unless specified otherwise. \(X_{1,2,3}\) are assumed to be narrow resonances, while \(X_{4}\) is collider-stable and invisible.

We point out that the presence of the intermediate particles does not affect the enhancement near the boundary of phase space discussed above. Within the narrow width approximation, each internal propagator squared can be replaced by a delta function, whose argument is linear in the \(z_{ij}\) or, equivalently, in the \(m_{ij}^{2}\) variables. Therefore, integrating over those delta functions does not introduce any non-trivial Jacobian factors which would ruin the enhancement.

To quantify the enhancement near the boundary for this event topology, we derive the analytic form of the \(\Delta _4\) probability distribution and show that it is completely independent of \(m_{X_i}\) for the massless limit, i.e., \(m_{v_i}=0\). We start by writing \(\Delta _{4}\) in terms of the experimental observables \(m_{v_{i}v_{j}}^{2}\) which are denoted by \(m_{12}^{2}\), \(m_{13}^{2}\), and \(m_{23}^{2}\). These dimensionful variables can be traded for dimensionless, unit-normalized variables \(\xi _{ij}\) as

$$\begin{aligned} m_{ij}^{2}\equiv \xi _{ij}\,m_{ij,\mathrm{max}}^{2}, \end{aligned}$$
(19)

where \(0\le \xi _{ij}\le 1\) and the maximal values, \(m_{ij,\mathrm{max}}^{2}\), are given by the well-known kinematic endpoint formulas (see, e.g., [18]):

$$\begin{aligned} m_{12,\mathrm{max}}^{2}= & {} \frac{(m_{X_{1}}^{2}-m_{X_{2}}^{2})(m_{X_{2}}^{2}- m_{X_{3}}^{2})}{m_{X_{2}}^{2}}, \end{aligned}$$
(20)
$$\begin{aligned} m_{13,\mathrm{max}}^{2}= & {} \frac{(m_{X_{1}}^{2}-m_{X_{2}}^{2})(m_{X_{3}}^{2}- m_{X_{4}}^{2})}{m_{X_{3}}^{2}}, \end{aligned}$$
(21)
$$\begin{aligned} m_{23,\mathrm{max}}^{2}= & {} \frac{(m_{X_{2}}^{2}-m_{X_{3}}^{2})(m_{X_{3}}^{2}- m_{X_{4}}^{2})}{m_{X_{3}}^{2}}. \end{aligned}$$
(22)

We also trade the dimension-8 quantity \(\Delta _4\) for a dimensionless and unit-normalized quantity q defined as

$$\begin{aligned} \Delta _{4}\equiv q\,\Delta _{4,\mathrm{max}}, \quad 0\le q\le 1. \end{aligned}$$
(23)

Here the maximum value of \(\Delta _4\) is given by

$$\begin{aligned} \Delta _{4,\mathrm{max}}=\left( \frac{(m_{X_{1}}^{2}-m_{X_{2}}^{2})(m_{X_{2}}^{2}- m_{X_{3}}^{2})(m_{X_{3}}^{2}-m_{X_{4}}^{2})}{8m_{X_{2}}m_{X_{3}}}\right) ^{2}.\nonumber \\ \end{aligned}$$
(24)

As shown in Ref. [93], for any given set of masses, \(\left\{ m_{X_{i}}\right\} \), in this topology, the probability of obtaining any given event near the point, \(\left\{ m_{ij}^{2}\right\} \), is expressed as

$$\begin{aligned} \mathrm{d}P= & {} \frac{1}{4\pi m_{X_{1}}^{2}}\left( 1-\frac{m_{X_{2}}^{2}}{m_{X_{1}}^{2}}\right) ^{-1} \Bigg (1-\frac{m_{X_{3}}^{2}}{m_{X_{2}}^{2}}\Bigg )^{-1}\nonumber \\&\times \;\left( 1-\frac{m_{X_{4}}^{2}}{m_{X_{3}}^{2}}\right) ^{-1} \frac{H\left( \Delta _{4} \right) }{\sqrt{\Delta _{4}}} \mathrm{d}m_{12}^{2}\,\mathrm{d}m_{13}^{2}\,\mathrm{d}m_{23}^{2},\nonumber \\ \end{aligned}$$
(25)

or equivalently, in terms of the dimensionless quantities, \(\xi _{ij}\) and q, defined in (19) and (23), and using (20)–(22) and (24) to convert the measure and \(\sqrt{\Delta _4}\), this can be rewritten as

$$\begin{aligned} \mathrm{d}P=\frac{2}{\pi }\frac{m_{X_{2}}}{m_{X_{3}}} \frac{H\left( q \right) }{\sqrt{q}} \mathrm{d}\xi _{12}\,\mathrm{d}\xi _{13}\,\mathrm{d}\xi _{23}. \end{aligned}$$
(26)

Here H(x) is the usual Heaviside step function.

Obviously, the expression in Eq. (26) diverges for \(q\rightarrow 0\), as expected from the general discussion earlier, and has a non-zero finite value at \(q_{\max }=1\). In order to visualize the enhancement near \(q\sim 0\), it is useful to partition the probability density in (26) into two components: a flat piece, proportional to 1, and an enhanced piece, containing the \(q^{-1/2}\) singularity:

$$\begin{aligned} \frac{\mathrm{d}P}{\mathrm{d}V_{\xi }}\sim \frac{1}{\sqrt{q_{\max }}} + \left( \frac{1}{\sqrt{q}}-\frac{1}{\sqrt{q_{\max }}} \right) = 1+ \left( \frac{1}{\sqrt{q}}-1 \right) ,\nonumber \\ \end{aligned}$$
(27)

where \(\mathrm{d}V_{\xi }\equiv \mathrm{d}\xi _{12}\,\mathrm{d}\xi _{23}\,\mathrm{d}\xi _{13}\) is a shorthand notation. If events were uniformly distributed over the entire phase space in \(\xi _{ij}\), their probability density would simply be proportional to the first (constant) term in Eq. (27). Hence, all non-trivial effects in the phase-space density distribution are due to the second term (inside the parentheses) in (27).
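The split in (27) suggests a simple way to label individual events. Since the total density scales like \(1/\sqrt{q}\) and the flat piece like 1, the flat share of the local density at a given q is \(\sqrt{q}\). The sketch below assigns each event to the flat component with that probability; this is one possible realization, stated here as an assumption, and not necessarily the procedure used to produce the figures. The function name and labels are illustrative.

```python
import math
import random


def classify_event(q, rng):
    """Label an event 'flat' with probability sqrt(q), else 'enhanced'.

    The total density goes like 1/sqrt(q) and the flat piece like 1, so
    the flat share of the local density at a given q is sqrt(q).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    return "flat" if rng.random() < math.sqrt(q) else "enhanced"


# At q = 0.25 about half of the events should be attributed to the flat piece.
rng = random.Random(1)
labels = [classify_event(0.25, rng) for _ in range(20000)]
flat_share = labels.count("flat") / len(labels)
```

With this labeling, events at \(q=1\) are always attributed to the flat component, while events on the boundary, \(q=0\), are always attributed to the enhanced one.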

Figure 8 helps us develop some useful intuition about the probability distribution (27). The upper left panel shows a scatter plot of physical events in the dimensionless \(\xi _{ij}\)-space, generated according to (27). We used a mass spectrum of \((m_{X_1},m_{X_2},m_{X_3},m_{X_4}) = (500,350,200,100)\) GeV. The events populate a compact region whose shape has been likened to that of a “samosa” [95]. Since it is difficult to visualize the enhancement near the phase-space boundary in this three-dimensional view, in the next three panels of Fig. 8 we take a few slices at fixed \(\xi _{13}\): \(\xi _{13}=0.25\) (upper right), \(\xi _{13}=0.5\) (lower left), and \(\xi _{13}=0.75\) (lower right). For each slice at a fixed \(\xi _{13}\), we show all data points whose \(m_{13}^2\) values fall within 0.5 GeV\(^2\) of the nominal value for that slice, i.e., within \(\xi _{13} m_{13,\mathrm{max}}^2 \pm 0.5\) GeV\(^2\). Then we project those points onto the plane of \(\xi _{12}\) vs. \(\xi _{23}\) and divide them into two (color-coded) groups. The data points which would have emerged from the flat piece in (27) are denoted with blue “\(\times \)” symbols, whereas the points arising from the enhanced piece in (27) are identified by red “\(+\)” symbols. In addition, we also show several theoretical contours of constant \(\Delta _4\) values, starting with the outermost red dot-dashed curve at \(\Delta _4=0\) (i.e., \(q=0\)) representing the phase-space boundary. The internal, black dashed curves mark the contours for \(\Delta _4=0.1\, \Delta _{4,\max }\), \(\Delta _4=0.3\, \Delta _{4,\max }\), \(\Delta _4=0.5\, \Delta _{4,\max }\), \(\Delta _4=0.7\, \Delta _{4,\max }\), and \(\Delta _4=0.9\, \Delta _{4,\max }\), respectively. Note that some of these contours are absent from the bottom panels because the relevant hyper-surfaces, corresponding to large \(\Delta _4\) values, do not intersect those slices.

Comparing the densities of red and blue data points, we get an idea about the effect of the enhancement in the vicinity of the phase-space boundary. The blue points are more or less uniformly distributed, which is by design. In contrast, the distribution of red points is highly irregular, and their density peaks at the phase-space boundary. For a more quantitative understanding, we derive the analytic expression for the probability density function in q and obtain [107]

$$\begin{aligned} \frac{\mathrm{d}P}{\mathrm{d}q}=\frac{\arcsin (\sqrt{1-q})}{2\sqrt{q}}. \end{aligned}$$
(28)

As previously advertised, this probability density function is completely independent of all \(\left\{ m_{X_{i}}\right\} \) and is enhanced near \(q\approx 0\). In other words, the fraction of events that lie in a fixed q-interval is universal, and it is enhanced near the boundary of the phase-space region. For example, roughly 5% of events have \(q\le 10^{-3}\), i.e., less than 0.1% of \(\Delta _{4,\max }\).
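The quoted fractions follow directly from integrating (28). A short sketch, using only the Python standard library, is given below. The closed-form cumulative distribution, \(P(q'\le q)=1-\sqrt{1-q}+\sqrt{q}\,\arccos \sqrt{q}\), is our own integration of (28) and is cross-checked numerically in the code; the function names are illustrative.

```python
import math


def pdf_q(q):
    """Eq. (28): dP/dq = arcsin(sqrt(1 - q)) / (2 sqrt(q)), for 0 < q <= 1."""
    return math.asin(math.sqrt(1.0 - q)) / (2.0 * math.sqrt(q))


def cdf_q(q):
    """Closed-form integral of Eq. (28) from 0 to q."""
    return 1.0 - math.sqrt(1.0 - q) + math.sqrt(q) * math.acos(math.sqrt(q))


def cdf_q_numeric(q, steps=100000):
    """Midpoint-rule check: with t = sqrt(q) the integrand becomes acos(t)."""
    upper = math.sqrt(q)
    h = upper / steps
    return h * sum(math.acos((k + 0.5) * h) for k in range(steps))


# Fractions of events below a few representative cuts on q.
for q_cut in (1e-3, 0.05, 1.0):
    print(f"P(q <= {q_cut:g}) = {cdf_q(q_cut):.4f}")
```

The distribution integrates to one, the fraction with \(q\le 10^{-3}\) comes out close to 0.05, and a cut at \(q\le 0.05\) retains roughly one third of the events.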

Fig. 9

Probability density distribution in the q variable (23) using the same event sample as in Fig. 8. The blue-shaded histogram is contributed by the boundary data points which are tagged by the Voronoi tessellation. The black solid curve is the theory prediction for \(\mathrm{d}P/\mathrm{d}q\) in Eq. (28)

Fig. 10

The same as Fig. 6, but for a toy example in which the dense core has a non-uniform distribution given by (30)

Fig. 11

The same as Fig. 7, but for the example shown in Fig. 10

In Fig. 9, we plot the distribution of the q variable from (23), taking 20,000 events out of the same event sample as the one used for Fig. 8. If we define any phase-space point whose q value is less than 5% of \(q_{\max }\) as a boundary point, we find that \(\sim \)33% of the events are then categorized as boundary points. The red histogram represents the q distribution with respect to the full data set; the black dashed, vertical line denotes the location of \(0.05\,q_{\max }\). The black solid curve shows the theoretical prediction from Eq. (28). One can easily see that the q distribution (red histogram) is fully consistent with the theory expectation. However, the value of q (or equivalently, \(\Delta _4\)) is not an experimental observable, since it requires a model assumption (the input of a mass spectrum for \(X_i\)). What is needed then is a practical way of tagging the boundary data points with such low values of q by some other means; we employ Voronoi tessellations for this purpose. We Voronoi tessellate our phase space using the full data set. If a given Voronoi cell has vertices on both sides of the “samosa” surface defined by \(q=0\) (or equivalently, \(\Delta _4=0\)), then the associated data point is tagged as a boundary point (recall the definition of “edge” cells in Fig. 2). The contribution from the boundary points extracted with the above algorithm is shown by the blue-shaded histogram in Fig. 9, which we find represents \(\sim \)38% of the events in the sample. As Fig. 9 demonstrates, the set of boundary cells which can be tagged by placing a cut on q is essentially the same as the set of boundary cells identified with the Voronoi tessellation. In the following, therefore, instead of using the variable q, which is experimentally inaccessible, we shall focus on the Voronoi cells belonging to the blue-shaded histogram in Fig. 9 and try to develop a tagging method based on their geometric properties, since those properties are experimentally observable.
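The truth-level definition of a boundary cell used above (vertices on both sides of the surface) can be sketched with `scipy.spatial.Voronoi`. The example below is a toy illustration under stated assumptions: it uses uniformly distributed points and the unit sphere as a stand-in for the \(q=0\) surface, and it judges cells that extend to infinity on their finite vertices only. The function `tag_boundary_cells` and its arguments are hypothetical names introduced for this sketch.

```python
import numpy as np
from scipy.spatial import Voronoi


def tag_boundary_cells(points, is_inside):
    """Flag points whose Voronoi cell has vertices on both sides of a surface.

    points    : (N, d) array of data points
    is_inside : callable mapping an (M, d) array to a boolean array,
                True on the signal side of the surface
    Cells that extend to infinity are judged on their finite vertices only.
    """
    vor = Voronoi(points)
    vertex_inside = np.asarray(is_inside(vor.vertices), dtype=bool)
    tagged = np.zeros(len(points), dtype=bool)
    for i, region_index in enumerate(vor.point_region):
        finite = [v for v in vor.regions[region_index] if v != -1]
        if finite:
            flags = vertex_inside[finite]
            tagged[i] = flags.any() and not flags.all()
    return tagged


# Toy usage: uniform points in a cube, with the unit sphere as the surface.
rng = np.random.default_rng(0)
pts = rng.uniform(-1.5, 1.5, size=(2000, 3))
unit_sphere = lambda x: np.linalg.norm(x, axis=1) < 1.0
tags = tag_boundary_cells(pts, unit_sphere)
radii = np.linalg.norm(pts, axis=1)
```

In this toy setup the tagged points cluster around \(R=1\), as expected for cells that straddle the surface. For the physical case, `is_inside` would be replaced by a test of the sign of \(\Delta _4\), which requires the true mass spectrum as input.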

Fig. 12

ROC curves for the toy example depicted in Fig. 10. Left: the four different ROC curves resulting from each of the four variables shown in Fig. 11. Right: improved ROC curves with optimal two-dimensional cuts in the \(({\bar{v}}, {\bar{\sigma }})\) plane as illustrated in Fig. 13: with \(20\times 20\) binning (green dot-dashed) or \(100\times 100\) binning (dotted blue). The blue dashed line is the ROC curve based on the \({\bar{\sigma }}\) variable alone and is identical to the solid red line in the left panel

Fig. 13

Two-dimensional histograms of the expected signal-to-background ratio in the \(({\bar{v}}, {\bar{\sigma }})\) plane: for \(20\times 20\) bins (left) and \(100\times 100\) bins (right). The ROC curves in the right panel of Fig. 12 were built by successively cutting away the bin with the lowest signal-to-background ratio among all remaining bins. (Alternatively, one could start from zero and successively keep adding the bin with the highest signal-to-background ratio among all remaining bins)

3.2 Density-enhanced sphere boundaries

Inspired by the behavior of the phase-space density near the boundary, we deform the density of data points from the sphere example considered previously in Sect. 2.2. Performing Voronoi tessellations and studying the properties of the resulting Voronoi cells, we can develop our insight on what is expected from physical examples. Note that \(\Delta _4\) vanishes on the phase-space boundary and takes its maximum somewhere in the bulk. In other words, the \(\Delta _4\) value increases as the distance between the data point of interest and the boundary surface increases (see also contours in Fig. 8). Although specifying the value of distance does not determine the \(\Delta _4\) value, it turns out that there exists a positive correlation between the two quantities [107]. To proceed, we make the simplifying Ansatz that the distribution of the data points inside a unit sphere depends only on the radius, R, with an enhancement at \(R=1\). Motivated by the form of the probability density in (26), we introduce the following volume density function for the data points inside the unit sphere,

$$\begin{aligned} \frac{\mathrm{d}P}{\mathrm{d}V}\sim \frac{1}{\sqrt{1-R}}. \end{aligned}$$
(29)

Now in analogy to (12), we consider the three-dimensional distribution

$$\begin{aligned} f(\vec { R}) \sim \frac{\rho }{\sqrt{1-R}} H(1-R) + H(R-1)H(\root 3 \of {2}-R). \end{aligned}$$
(30)

Following the example from Sect. 2.2, we again take the density ratio \(\rho =6\) and generate \(N_{\mathrm{events}}=4200\) events according to (30). Our results are shown in Figs. 10 and 11, which are the analogues of Figs. 6 and 7, respectively.
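Sampling from (30) is straightforward because the radial distribution in the core, \(\propto R^2/\sqrt{1-R}\), is a Beta(3, 1/2) distribution. The sketch below reads (30) literally as an unnormalized density with coefficient \(\rho \); whether \(\rho \) in the text refers to this coefficient or to an integrated density ratio is not specified here, so the relative weights in the code should be treated as an assumption. The function name is illustrative.

```python
import math
import random


def sample_point(rho, rng):
    """Draw one 3D point from Eq. (30), read literally as an unnormalized density."""
    # Integrated weights of the two pieces:
    #   core : rho * 4*pi * Integral_0^1 R^2 / sqrt(1-R) dR = rho * 64*pi/15
    #   shell: volume between R = 1 and R = 2^(1/3)          = 4*pi/3
    w_core = rho * 64.0 * math.pi / 15.0
    w_shell = 4.0 * math.pi / 3.0
    if rng.random() < w_core / (w_core + w_shell):
        radius = rng.betavariate(3.0, 0.5)            # pdf ~ R^2 (1-R)^(-1/2)
    else:
        radius = (1.0 + rng.random()) ** (1.0 / 3.0)  # uniform in the shell
    # Isotropic direction from a normalized Gaussian vector.
    x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
    norm = math.sqrt(x * x + y * y + z * z)
    return (radius * x / norm, radius * y / norm, radius * z / norm)


rng = random.Random(42)
points = [sample_point(6.0, rng) for _ in range(4200)]
core_fraction = sum(math.hypot(*p) < 1.0 for p in points) / len(points)
```

Under this literal reading with \(\rho =6\), about 95% of the points fall inside the unit sphere, and they pile up toward \(R=1\).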

We see that, in principle, all four variables plotted in Fig. 10 show some potential for discriminating edge cells. For example, a careful inspection of the lower left panel of Fig. 10 reveals that the edge cells appear somewhat elongated, which results in a lower isoperimetric quotient, as confirmed by the lower left panel in Fig. 11. On the other hand, due to the density enhancement near the boundary, we would also expect the edge cells to have smaller normalized volumes. This expectation is also confirmed in the upper right panels of Figs. 10 and 11. Finally, the lower right panels of Figs. 10 and 11 again demonstrate that the RSD of the neighboring volumes is a good discriminator, in agreement with our observations from the earlier toy examples. In order to compare the performance of the four variables investigated in Figs. 10 and 11, we use the concept of a ROC curve, which is reviewed in Appendix A.

3.3 Finding density-enhanced sphere boundaries with Voronoi tessellations

We now analyze the example of a sphere with an enhanced density near the boundary considered in Sect. 3.2, in terms of ROC curves. In the left panel of Fig. 12, we show the ROC curve for each of the four variables depicted in Figs. 10 and 11: number of neighbors (magenta dotted line), normalized volume (green dashed line), isoperimetric quotient (blue dot-dashed line) and RSD of neighbor volumes (red solid line). We observe that the RSD outperforms the other three variables, in agreement with the conclusions from [11] for the two-dimensional case.

However, the other three variables also have a certain degree of discriminating power, as seen in Fig. 11. The natural question then is how much additional sensitivity can be gained by considering not just one, but two variables simultaneously. We studied the correlations between the RSD, \({\bar{\sigma }}\), and each of the other three variables, and generally found that they are not perfectly correlated. (This makes sense intuitively because the RSD is computed from the neighbor set, \(N_i\), while the other three variables are properties of the individual cell.) We concluded that, among the three options, the normalized volume, \({\bar{v}}\), is the most promising, since it appears least correlated with \({\bar{\sigma }}\). Therefore, we expect that the sensitivity will improve once we incorporate the normalized volume, \({\bar{v}}\), in the analysis. This expectation is confirmed in the right panel of Fig. 12, where we show “improved” ROC curves based on binning in both \({\bar{v}}\) and \({\bar{\sigma }}\). The procedure, illustrated in Fig. 13, is as follows. We consider the \(({\bar{v}}, {\bar{\sigma }})\) plane divided into \(20\times 20\) bins (left panel of Fig. 13) or \(100\times 100\) bins (right panel of Fig. 13). We expect the signal, i.e., the boundary Voronoi cells, to populate the bins with small volume and relatively large RSD, while the background, i.e., the bulk cells, is distributed more uniformly throughout the \(({\bar{v}}, {\bar{\sigma }})\) plane. In order to build the optimal ROC curve, we need to determine the signal-to-background ratio, S / B, in each bin, and design the cuts so that we successively remove the bins with the smallest S / B. The bins in Fig. 13 are color coded according to the corresponding value of \(\log (S/B)\).Footnote 15 Given the finite statistics, there are bins which have no events (neither signal nor background); they are left uncolored.
For definiteness, the bins which have some signal events, but no background events, are assigned the same value as the maximal \(\log (S/B)\) value among the bins containing both signal and background events. Similarly, the bins which have some background events, but no signal events, are assigned the same value as the minimal \(\log (S/B)\) value among the bins containing both signal and background events.
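The binned ROC construction described above can be sketched in a few lines. The code below accepts bins one at a time in order of decreasing S / B, which traces the same curve as successively removing the bins with the smallest S / B, and it applies the stated conventions for bins that lack signal or background. The function name and the toy counts are illustrative assumptions.

```python
def roc_from_bins(signal, background):
    """Return ROC points (background eff., signal eff.) from per-bin counts.

    Bins are accepted one at a time in order of decreasing S/B. Bins with
    S > 0 and B = 0 inherit the largest finite S/B, bins with S = 0 and
    B > 0 inherit the smallest, and empty bins are skipped.
    """
    finite = [s / b for s, b in zip(signal, background) if s > 0 and b > 0]
    if not finite:
        raise ValueError("need at least one bin with both signal and background")
    hi, lo = max(finite), min(finite)
    scored = []
    for s, b in zip(signal, background):
        if s == 0 and b == 0:
            continue
        ratio = s / b if (s > 0 and b > 0) else (hi if s > 0 else lo)
        scored.append((ratio, s, b))
    scored.sort(key=lambda item: item[0], reverse=True)
    s_tot = sum(s for _, s, _ in scored)
    b_tot = sum(b for _, _, b in scored)
    points, s_acc, b_acc = [(0.0, 0.0)], 0, 0
    for _, s, b in scored:
        s_acc += s
        b_acc += b
        points.append((b_acc / b_tot, s_acc / s_tot))
    return points


# Flattened toy histogram with four bins (made-up counts).
roc = roc_from_bins(signal=[10, 0, 5, 0], background=[1, 4, 5, 0])
```

For a two-dimensional grid, the per-bin counts are simply flattened into one list before being passed to the function; the ordering of the bins in the list does not matter beyond tie-breaking.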

Figure 13 shows that, as expected, the bins with the largest S / B (colored in black) are located in the upper left corner of the plot, corresponding to small \({\bar{v}}\) and large \({\bar{\sigma }}\). The spread in the cluster of black-colored bins is indicative of the gain in sensitivity due to simultaneous consideration of the two variables, \({\bar{v}}\) and \({\bar{\sigma }}\). According to the right panel of Fig. 12, the bulk of the gain is already obtained with a \(20\times 20\) grid; increasing the number of bins 25 times to a \(100\times 100\) grid does not lead to substantial improvement. Therefore, in practice, one might want to consider grids of even smaller size, especially since the true ranking of the bins in terms of S / B depends on the parameter values, e.g., the density enhancement on the boundary and the value of \(\rho \), which are not known a priori. This is why when we consider the physics example in the next section, we shall utilize a smaller grid of \(15\times 15\) bins in the \(({\bar{v}}, {\bar{\sigma }})\) plane (see Fig. 16).

4 Finding phase-space boundaries with Voronoi tessellations

We now use Voronoi tessellations to find the phase-space boundary for SUSY events at the 14 TeV LHC. We consider the \(2+2+2\) topology from Fig. 1, where, as usual, a (left-handed) squark \(X_1={\tilde{q}}\) undergoes a cascade decay through a heavy neutralino, \(X_2={\tilde{\chi }}^0_2\); a slepton, \(X_3={\tilde{\ell }}\); and a light neutralino, \(X_4={\tilde{\chi }}^0_1\). As in Ref. [93], we consider the production of a squark in association with a neutralino LSP (\({\tilde{\chi }}^0_1\)). Events were generated with MadGraph5 [108]. The mass spectrum that we used was \(m_{{\tilde{q}}}=350\) GeV, \(m_{{\tilde{\chi }}^0_2}=300\) GeV, \(m_{{\tilde{\ell }}}=250\) GeV, and \(m_{{\tilde{\chi }}^0_1}=200\) GeV.Footnote 16 The particles visible in the detector are: a quark jet \(v_1=j\), a “near” lepton \(v_2=\ell _n\), and a “far” lepton \(v_3=\ell _f\). The relevant phase space is then \((m_{12}, m_{23}, m_{13})\equiv (m_{j\ell _n}, m_{\ell \ell }, m_{j\ell _f})\). For SUSY signal events, each of these three variables exhibits an upper kinematic endpoint. The three endpoint values are given by Eqs. (20)–(22):

$$\begin{aligned} m^2_{j\ell _n,\mathrm{max}}= & {} 9931\ \mathrm{GeV}^2,\end{aligned}$$
(31)
$$\begin{aligned} m^2_{\ell \ell ,\mathrm{max}}= & {} 9900\ \mathrm{GeV}^2, \end{aligned}$$
(32)
$$\begin{aligned} m^2_{j\ell _f,\mathrm{max}}= & {} 11700\ \mathrm{GeV}^2. \end{aligned}$$
(33)
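These numbers can be reproduced directly from (20)–(22) with the quoted mass spectrum. The following short sketch does so; the function name and dictionary keys are illustrative.

```python
def endpoints_squared(m1, m2, m3, m4):
    """Eqs. (20)-(22) for the chain X1 -> X2 -> X3 -> X4 with massless SM particles."""
    a = m1**2 - m2**2
    b = m2**2 - m3**2
    c = m3**2 - m4**2
    return {
        "m12": a * b / m2**2,   # jet + near lepton
        "m13": a * c / m3**2,   # jet + far lepton
        "m23": b * c / m3**2,   # dilepton
    }


# Squark, heavy neutralino, slepton, light neutralino masses in GeV.
susy = endpoints_squared(350.0, 300.0, 250.0, 200.0)
for name, value in susy.items():
    print(f"{name}_max^2 = {value:.0f} GeV^2")
```

The value for \(m^2_{j\ell _n,\mathrm{max}}\) is 9930.6 GeV\(^2\) before rounding, while the other two endpoints are exact integers for this spectrum.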

From the three-dimensional point of view, the signal events populate the interior of a compact region in the \((m_{j\ell _n}, m_{\ell \ell }, m_{j\ell _f})\) space, whose boundary is given by the constraint [94, 95]

$$\begin{aligned} \hat{m}^2_{j\ell _f} = \left[ \sqrt{ \hat{m}^2_{\ell \ell } \left( 1- \hat{m}^2_{j\ell _n}\right) } \pm \frac{m_{{\tilde{\ell }}}}{m_{{\tilde{\chi }}^0_2}} \sqrt{ \hat{m}^2_{j\ell _n} \left( 1-\hat{m}^2_{\ell \ell }\right) } \right] ^2 ,\nonumber \\ \end{aligned}$$
(34)

which, for convenience, is written in terms of unit-normalized variables (see also (19))

$$\begin{aligned} \hat{m}_{j\ell _n}= & {} \frac{m_{j\ell _n}}{ m_{j\ell _n,\mathrm{max}}}, \end{aligned}$$
(35)
$$\begin{aligned} \hat{m}_{\ell \ell }= & {} \frac{m_{\ell \ell }}{ m_{\ell \ell ,\mathrm{max}}}, \end{aligned}$$
(36)
$$\begin{aligned} \hat{m}_{j\ell _f}= & {} \frac{m_{j\ell _f}}{ m_{j\ell _f,\mathrm{max}}}. \end{aligned}$$
(37)
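The boundary (34) is easy to evaluate for a given point in the \((\hat{m}^2_{j\ell _n}, \hat{m}^2_{\ell \ell })\) plane. The sketch below returns the two branches of \(\hat{m}^2_{j\ell _f}\) and tests whether a point lies between them, which is how we read “the interior of a compact region” here. The function names are illustrative, and the inputs are the squared, unit-normalized masses.

```python
import math


def jlf_boundary(mhat2_ll, mhat2_jln, mass_ratio):
    """Lower and upper branches of Eq. (34) for unit-normalized squared masses.

    mass_ratio is m_slepton / m_chi2, the coefficient of the second term.
    """
    a = math.sqrt(mhat2_ll * (1.0 - mhat2_jln))
    b = mass_ratio * math.sqrt(mhat2_jln * (1.0 - mhat2_ll))
    return (a - b) ** 2, (a + b) ** 2


def is_inside(mhat2_jln, mhat2_ll, mhat2_jlf, mass_ratio):
    """True if the point lies between the two branches of Eq. (34)."""
    lower, upper = jlf_boundary(mhat2_ll, mhat2_jln, mass_ratio)
    return lower <= mhat2_jlf <= upper


ratio = 250.0 / 300.0   # m_slepton / m_chi2 for the spectrum quoted above
lower, upper = jlf_boundary(0.5, 0.5, ratio)
```

At \(\hat{m}^2_{\ell \ell }=\hat{m}^2_{j\ell _n}=0.5\), the allowed range of \(\hat{m}^2_{j\ell _f}\) for this spectrum runs from \((1/12)^2\) to \((11/12)^2\).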

Our main goal in this section will be to test the algorithms from the previous sections for tagging the Voronoi cells in the vicinity of the boundary surface (34). In addition to the signal events from squark–neutralino associated production with the squark decaying as in Fig. 1, we shall also consider a representative number of background events. In order to make contact with the results from the previous sections, in Sect. 4.1 we first take the background events to be uniformly distributed in mass-squared phase space, and we ignore the combinatorial background. Then in Sect. 4.2 we study a more realistic case, where the SM background consists of dilepton \(t\bar{t}\) events and we also account for the combinatorial problem with the two leptons.

4.1 An example with uniform background and no combinatorics

As in the other two three-dimensional examples considered in Sects. 2.2 and 3.3, in this section we include “SM physics background” events, which we take to be uniformly distributed everywhere throughout the mass-squared phase space \((m^2_{j\ell _n}, m^2_{\ell \ell }, m^2_{j\ell _f})\) and normalized so that the density contrast across the boundary (34) is equal to \(\rho =4\). Note that in this scenario the interior “bulk” events and the “edge” cells on the surface boundary (34) consist of both SUSY signal and SM background events.

Fig. 14

Two-dimensional slices of the relevant three-dimensional phase space of the SUSY-like cascade decay in Fig. 1. Each slice is in the \((m_{\ell \ell }^2,m_{j\ell _n}^2)\) plane at a fixed value of \(m_{j\ell _f}^2=2000\) \(\mathrm{GeV}^2\) (upper left panel); \(m_{j\ell _f}^2=4000\) \(\mathrm{GeV}^2\) (upper middle panel); \(m_{j\ell _f}^2=6000\) \(\mathrm{GeV}^2\) (upper right panel); \(m_{j\ell _f}^2=8000\) \(\mathrm{GeV}^2\) (lower left panel); \(m_{j\ell _f}^2=10000\) \(\mathrm{GeV}^2\) (lower middle panel); and \(m_{j\ell _f}^2=11000\) \(\mathrm{GeV}^2\) (lower right panel). As in Figs. 6 and 10, the two-dimensional cells seen in the plots result from the intersection of the projective plane with the three-dimensional Voronoi cells, and are color coded by the value of \({\bar{\sigma }}_i\) for the corresponding three-dimensional Voronoi cell

As before, we visualize the resulting Voronoi tessellation by presenting two-dimensional slices of the relevant three-dimensional phase space, in this caseFootnote 17 \((m^2_{j\ell _n}, m^2_{\ell \ell }, m^2_{j\ell _f})\). In Figs. 14 and 15 we show six slices in the \((m_{\ell \ell }^2,m_{j\ell _n}^2)\) plane at a fixed value of \(m^2_{j\ell _f}\) as follows: \(m_{j\ell _f}^2=2000\) \(\mathrm{GeV}^2\) (upper left panel); \(m_{j\ell _f}^2=4000\) \(\mathrm{GeV}^2\) (upper middle panel); \(m_{j\ell _f}^2=6000\) \(\mathrm{GeV}^2\) (upper right panel); \(m_{j\ell _f}^2=8000\) \(\mathrm{GeV}^2\) (lower left panel); \(m_{j\ell _f}^2=10000\) \(\mathrm{GeV}^2\) (lower middle panel); \(m_{j\ell _f}^2=11000\) \(\mathrm{GeV}^2\) (lower right panel). As in Figs. 6 and 10, the two-dimensional cells seen in the plots result from the intersection of the projective plane with the three-dimensional Voronoi cells and are color coded by the value of the RSD, \({\bar{\sigma }}_i\), defined in (1) (in Fig. 14) or the normalized volume \(\bar{v}_i\) defined in (3) (in Fig. 15) of the corresponding three-dimensional Voronoi cell.

Fig. 15

The same as Fig. 14, but color-coding the cells according to the normalized volume, \(\bar{v}_i\), defined in (3)

Just as in the case of the density-enhanced sphere considered in Sect. 3.3, Figs. 14 and 15 suggest that the edge cells near the phase-space boundary (34) are characterized both by a large value of \({\bar{\sigma }}_i\) and a small value of \(\bar{v}_i\). Therefore, in designing a selection cut to pick up edge cells, it makes sense to consider both variables at the same time. This is illustrated in the left panel of Fig. 16, which is the analogue of Fig. 13 for this case.

Fig. 16

The integer ranking of the \(15\times 15\) bins in the \(({\bar{v}}, {\bar{\sigma }})\) plane according to their signal-to-background ratio. Left: the ranking for the case of \(\rho =4\). Right: the average ranking for the cases of \(\rho =1.2\), \(\rho =1.5\), \(\rho =2.0\), \(\rho =3.0\), and \(\rho =4.0\)

We consider a moderately large \(15\times 15\) grid in the \(({\bar{v}}, {\bar{\sigma }})\) plane and rank the resulting bins according to their signal-to-background ratioFootnote 18 as follows. The bin with the highest S / B is assigned rank 1, while the bin with the lowest S / B is assigned rank \(15\times 15=225\). In case of a tie between several bins, each bin is assigned the same average rank. Finally, bins with no events at all are ranked at the very bottom.Footnote 19 We observe that, similarly to Fig. 13, the highest ranked bins in terms of S / B appear at large values of \({\bar{\sigma }}\) and small values of \(\bar{v}\). Using the obtained bin ranking, we can build the corresponding ROC curve, shown with the red solid line in the left panel of Fig. 17, which in some sense is the “ideal” ROC curve that could be achieved, if \({\bar{\sigma }}\) and \(\bar{v}\) were the only discriminating variables under consideration.

Fig. 17

The same as the ROC curves shown in the right panel of Fig. 12, but for a \(15\times 15\) grid. Left: the ranking of the bins in constructing the ROC curve was done with the correct value of \(\rho \) (shown on each curve) used in generating the “data”. Right: the ranking of the bins was always done according to the “average” ranking shown in the right panel of Fig. 16

Fig. 18

The Voronoi cells which pass the two-dimensional selection cut requiring the cell to belong to one of the top 5 bins in terms of signal-to-background ratio for the correct choice of \(\rho =4\) (see the left panel in Fig. 16)

Fig. 19

The same as Fig. 18, but using the average ranking of the bins shown in the right panel of Fig. 16

Fig. 20

Scatter plots of signal events (black and red points, top three rows) and dilepton \(t\bar{t}\) events (blue points, bottom row), for different ranges of the dilepton invariant mass squared: \(1000\ \mathrm{GeV}^2 \le m_{\ell \ell }^2 \le 3000\ \mathrm{GeV}^2\) (first column); \(3000\ \mathrm{GeV}^2 \le m_{\ell \ell }^2 \le 5000\ \mathrm{GeV}^2\) (second column); \(5000\ \mathrm{GeV}^2 \le m_{\ell \ell }^2 \le 7000\ \mathrm{GeV}^2\) (third column); and \(7000\ \mathrm{GeV}^2 \le m_{\ell \ell }^2 \le 9000\ \mathrm{GeV}^2\) (fourth column). In the first row, the signal events are plotted in the plane of \((m^2_{j\ell _n}, m^2_{j\ell _f})\) and colored red (black) if \(m^2_{j\ell _n} \ge m^2_{j\ell _f}\) (\(m^2_{j\ell _n} \le m^2_{j\ell _f}\)). The same points are then plotted in the planes of \((m^2_{j\ell }(\mathrm{low}), m^2_{j\ell }(\mathrm{high}))\) (second row) and \((m^2_{j\ell }(\mathrm{low}), m^2_{j\ell }(\mathrm{high})-m^2_{j\ell }(\mathrm{low}))\) (third row). The background events in the fourth row are plotted in the plane of \((m^2_{j\ell }(\mathrm{low}), m^2_{j\ell }(\mathrm{high})-m^2_{j\ell }(\mathrm{low}))\)

Fig. 21

The analogue of Fig. 14 for the physics example considered in Sect. 4.2. We show nine slices at fixed values of \(m_{\ell \ell }^2\) as indicated at the top of each panel. The red (black) dashed line in each plot corresponds to the expected theoretical boundary implied by Eq. (34) for the set of points with \(m^2_{j\ell _n} \ge m^2_{j\ell _f}\) (\(m^2_{j\ell _n} \le m^2_{j\ell _f}\)) (see also the third row in Fig. 20)

One can now repeat the same procedure for different values of \(\rho \). We show four more examples in the left panel of Fig. 17, with increasingly pessimistic values for the density contrast \(\rho \): 3, 2, 1.5 and 1.2. As expected, the ROC curves become progressively worse, as quantified in the figure. With regard to the bin ranking in each case, we notice the following trend – the highest ranked bins remain at the lowest possible values of \(\bar{v}\), but slide down the \({\bar{\sigma }}\) axis to slightly lower values of the RSD, near, or even below \({\bar{\sigma }}\sim 1\). This is easy to understand intuitively – as \(\rho \) is decreased, the number densities on both sides of the surface boundary become more similar, and there is less variation between the sizes of bulk cells on the inside and on the outside. The fact that the bin ranking derived from our Monte Carlo simulations depends on the value of \(\rho \) poses an important conceptual problem with this procedure – when the analysis is performed on real data, we will not know the actual value of \(\rho \), and, hence, we will not be certain which particular ordering of bins to use. Nevertheless, the fact that the highest ranked bins are clustered, more or less, in the same location, suggests a possible resolution: we can simply average our results obtained for several different values of \(\rho \) and use the resulting average rank for each bin, which is shown in the right panel of Fig. 16. The corresponding ROC curves derived with the help of this “average” bin ranking procedure are shown in the right panel of Fig. 17. Comparing the two panels of Fig. 17, we see that the ROC curves based on the average ranking are only slightly degraded compared to the “ideal” case.
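The ranking and averaging steps can be summarized in a short sketch. The code below ranks bins by S / B with ties sharing their average rank, places empty bins at the bottom, and then averages the ranks over several rankings (for example, one per value of \(\rho \)). Treating all empty bins as tied at the bottom is an assumption of this sketch, and the function names and toy inputs are illustrative.

```python
def rank_bins(ratios):
    """Rank bins by S/B (rank 1 = highest); ties share their average rank.

    `ratios` holds one S/B value per bin, or None for a bin with no events.
    Assumption: all empty bins share the average of the bottom ranks.
    """
    order = sorted(range(len(ratios)),
                   key=lambda i: (ratios[i] is None, -(ratios[i] or 0.0)))
    ranks = [0.0] * len(ratios)
    start = 0
    while start < len(order):
        stop = start
        while (stop + 1 < len(order)
               and ratios[order[stop + 1]] == ratios[order[start]]):
            stop += 1
        shared = 0.5 * ((start + 1) + (stop + 1))   # average rank of the tie
        for k in range(start, stop + 1):
            ranks[order[k]] = shared
        start = stop + 1
    return ranks


def average_ranks(rankings):
    """Average the rank of each bin over several rankings (e.g. several rho values)."""
    return [sum(column) / len(column) for column in zip(*rankings)]


# Two toy rankings of the same four bins (made-up S/B values).
ranks_a = rank_bins([3.0, 1.0, 3.0, None])
ranks_b = rank_bins([2.0, 5.0, 1.0, None])
mean_ranks = average_ranks([ranks_a, ranks_b])
```

In the toy input, the two bins tied at the top of the first ranking each receive rank 1.5, and the empty bin keeps rank 4 in both rankings and in the average.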

We are now ready to start designing selection cuts for the edge cells. One possibility is to select the cells which fall into a predetermined number, \(N_{\mathrm{top}\ \mathrm{bins}}\), of the highest ranked bins in the \(({\bar{v}} , {\bar{\sigma }})\) plane. If we “cheat”, i.e., use the correct value of \(\rho \) for the ranking (as in the left panel of Fig. 16), we obtain the result shown in Fig. 18, where we have chosen \(N_{\mathrm{top}\ \mathrm{bins}}=5\). In general, the tagged cells are distributed throughout the volume of the three-dimensional phase space, so for illustration purposes we again use the same six two-dimensional slices as in Figs. 14 and 15. We observe that the procedure is quite efficient in tagging edge cells, although it occasionally tags an isolated bulk cell. Of course, for such a low value of \(N_{\mathrm{top}\ \mathrm{bins}}\), not all edge cells will pass the cut, which will cause the boundary contours (marked with black dashed lines) to appear segmented and incomplete. By increasing the value of \(N_{\mathrm{top}\ \mathrm{bins}}\), we can tag more edge cells and eventually “close” those contours, but at the cost of more mistagged bulk cells.
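A selection of this type can be sketched as follows. The sketch assumes that the per-cell values of \(\bar{v}_i\) and \({\bar{\sigma }}_i\) are already available as arrays, and that the bin ranking is stored as a two-dimensional array indexed by (\(\bar{v}\) bin, \({\bar{\sigma }}\) bin). The function name `tag_edge_candidates` and the treatment of cells outside the histogram range (never tagged) are hypothetical choices made for illustration.

```python
import numpy as np

def tag_edge_candidates(v, sigma, v_edges, sigma_edges, rank_map, n_top_bins=5):
    """Return a boolean mask of cells lying in the n_top_bins best-ranked bins.

    v, sigma: per-cell values of the two Voronoi variables.
    v_edges, sigma_edges: increasing bin edges of the 2D histogram.
    rank_map: integer ranks (1 = best), shape (len(v_edges)-1, len(sigma_edges)-1).
    """
    v = np.asarray(v, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    rank_map = np.asarray(rank_map)
    # np.digitize returns 0 below the first edge and len(edges) at or above
    # the last edge, so after subtracting 1 valid indices are 0..n_bins-1.
    iv = np.digitize(v, v_edges) - 1
    js = np.digitize(sigma, sigma_edges) - 1
    inside = (
        (iv >= 0) & (iv < rank_map.shape[0])
        & (js >= 0) & (js < rank_map.shape[1])
    )
    tagged = np.zeros(v.shape, dtype=bool)
    # Assumption: cells outside the histogram range are left untagged.
    tagged[inside] = rank_map[iv[inside], js[inside]] <= n_top_bins
    return tagged
```

Raising `n_top_bins` enlarges the accepted region of the plane, which tags more edge cells and more bulk cells alike.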

Since the true value of \(\rho \) will be unknown, the plots in Fig. 18 are for academic purposes only. The more realistic situation is depicted in the analogous Fig. 19, where we again choose \(N_{\mathrm{top}\ \mathrm{bins}}=5\), only this time we use the average bin ranking from the right panel of Fig. 16. The result in Fig. 19 is slightly worse than that in Fig. 18 – while we do find a higher rate for mistagging bulk cells (typically in the interior), the majority of the tagged cells are very close to the surface boundary, suggesting that this is a promising technique for identifying edge cells.
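The trade-off between tagged edge cells and mistagged bulk cells can be quantified by scanning over \(N_{\mathrm{top}\ \mathrm{bins}}\). The sketch below assumes Monte Carlo truth labels for each cell and one particular construction: the edge-cell efficiency and the bulk-cell mistag rate are accumulated as bins are added in rank order. The function name `roc_from_bin_ranks` and this construction are assumptions for illustration and need not coincide with the procedure behind Fig. 17.

```python
import numpy as np

def roc_from_bin_ranks(cell_rank, is_edge, n_bins):
    """Edge-cell efficiency versus bulk-cell mistag rate as bins are added.

    cell_rank: for each cell, the rank (1..n_bins) of the bin it falls into.
    is_edge:   Monte Carlo truth label for each cell (True for edge cells).
    Returns (fpr, tpr); element k corresponds to keeping the top k+1 bins.
    Assumes at least one edge cell and one bulk cell are present.
    """
    cell_rank = np.asarray(cell_rank, dtype=int)
    is_edge = np.asarray(is_edge, dtype=bool)
    # Count edge and bulk cells per rank; index 0 is unused and dropped.
    edge_counts = np.bincount(cell_rank[is_edge], minlength=n_bins + 1)[1:]
    bulk_counts = np.bincount(cell_rank[~is_edge], minlength=n_bins + 1)[1:]
    tpr = np.cumsum(edge_counts) / is_edge.sum()
    fpr = np.cumsum(bulk_counts) / (~is_edge).sum()
    return fpr, tpr
```

Evaluating this function once with the ranking for the true \(\rho \) and once with the average ranking gives a direct numerical comparison of the two selections.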

4.2 A realistic example with \(t\bar{t}\) background and combinatorics

We now repeat the exercise from the previous Sect. 4.1 with two improvements. First, we have to address the combinatorial problem of distinguishing the “near” and “far” lepton. The standard approach in the literature is to trade the original variables \(m_{j\ell _n}\) and \(m_{j\ell _f}\) for the ordered pair [18, 20, 31]

$$\begin{aligned} m_{j\ell }(\mathrm{high})&\equiv \max \left\{ m_{j\ell _n}, m_{j\ell _f} \right\} , \end{aligned}$$
(38a)
$$\begin{aligned} m_{j\ell }(\mathrm{low})&\equiv \min \left\{ m_{j\ell _n}, m_{j\ell _f} \right\} . \end{aligned}$$
(38b)

This reordering procedure is pictorially illustrated with the first two rows of plots in Fig. 20, where we show scatter plots of signal events for different ranges of the third invariant mass variable (the dilepton mass): \(1000\ \mathrm{GeV}^2 \le m_{\ell \ell }^2 \le 3000\ \mathrm{GeV}^2\) (first column); \(3000\ \mathrm{GeV}^2 \le m_{\ell \ell }^2 \le 5000\ \mathrm{GeV}^2\) (second column); \(5000\ \mathrm{GeV}^2 \le m_{\ell \ell }^2 \le 7000\ \mathrm{GeV}^2\) (third column); and \(7000\ \mathrm{GeV}^2 \le m_{\ell \ell }^2 \le 9000\ \mathrm{GeV}^2\) (fourth column).

In the first row, the signal events are plotted in the original plane of \((m^2_{j\ell _n}, m^2_{j\ell _f})\), and the points are color coded as follows. The points below the diagonal \(45^\circ \) line, which have \(m^2_{j\ell _n} \ge m^2_{j\ell _f}\), are colored in red, while the remaining points above the diagonal \(45^\circ \) line with \(m^2_{j\ell _n} \le m^2_{j\ell _f}\), are colored in black. The same data is then plotted in the second row of Fig. 20 in the plane of \((m^2_{j\ell }(\mathrm{low}), m^2_{j\ell }(\mathrm{high}))\). Notice that the effect of the reordering procedure (38) is to leave all the black points in place, while interchanging the x and y coordinates of the red points. After the reordering (38), half of the plane on each plot is left blank. In order to avoid such voids, in the third row of Fig. 20 we replot the data in the plane of \((m^2_{j\ell }(\mathrm{low}), m^2_{j\ell }(\mathrm{high})-m^2_{j\ell }(\mathrm{low}))\), which is fully accessible.
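The change of variables is simple enough to state in a few lines. The sketch below takes the two squared jet-lepton invariant masses of each event in either order and returns the coordinates used in the third row of Fig. 20. The function name `reorder_jet_lepton_masses` is hypothetical.

```python
import numpy as np

def reorder_jet_lepton_masses(m2_a, m2_b):
    """Map two squared jet-lepton masses to (low, high - low).

    m2_a, m2_b: arrays of squared invariant masses for the two jet-lepton
    pairings; which lepton is 'near' or 'far' does not need to be known.
    """
    m2_a = np.asarray(m2_a, dtype=float)
    m2_b = np.asarray(m2_b, dtype=float)
    low = np.minimum(m2_a, m2_b)    # m^2_{jl}(low), cf. Eq. (38b)
    high = np.maximum(m2_a, m2_b)   # m^2_{jl}(high), cf. Eq. (38a)
    return low, high - low          # both coordinates are non-negative
```

Because the maximum and minimum are symmetric in their arguments, the result is the same for either lepton assignment, which is what removes the combinatorial ambiguity.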

As expected, the scatter plots in the third row of Fig. 20 exhibit boundary lines, which we can target with our edge-detecting method. In fact, each plot has two such boundaries where the signal number density sharply changes – one for the red points and another for the black points. At low values of \(m_{\ell \ell }\) the “red” (“black”) boundary line is an external (internal) boundary, while for high values of \(m_{\ell \ell }\) it is the other way around. At intermediate values of \(m_{\ell \ell }\) the two boundaries are very close to each other and that is where we expect the edge detection method to perform best.

Having thus taken care of the combinatorial problem, we now also improve our treatment of the background – instead of uniformly distributed background events as in Sect. 4.1, we consider dilepton events from \(t\bar{t}\) production. The corresponding scatter plots are shown in the fourth (last) row of Fig. 20. Since there are two b-jets, each background event contributes two entries to the scatter plot. We see that within the relevant range of the plotted variables \(m^2_{j\ell }(\mathrm{low})\) and \(m^2_{j\ell }(\mathrm{high})-m^2_{j\ell }(\mathrm{low})\), the distribution of the background events is roughly uniform, with some noticeable clustering near the origin.

We are now in a position to apply our Voronoi boundary detection algorithm. The result is shown in Fig. 21, where we present nine slices at fixed values of \(m_{\ell \ell }^2\) as indicated at the top of each panel. The red (black) dashed line in each plot corresponds to the expected theoretical boundary implied by Eq. (34) for the set of points with \(m^2_{j\ell _n} \ge m^2_{j\ell _f}\) (\(m^2_{j\ell _n} \le m^2_{j\ell _f}\)) (see also the third row in Fig. 20). Figure 21 demonstrates that the Voronoi cells with the highest values of RSD \({\bar{\sigma }}_i\) (colored in red or orange) are indeed found near the theoretical boundaries (the red or black dashed lines). As anticipated, the method performs well for intermediate values of \(m^2_{\ell \ell }\sim (4000\mathrm{-}5000)\ \mathrm{GeV}^2\), where the two boundaries resulting from the reordering (38) tend to overlap. We also observe that the boundary closer to the origin seems to be singled out, especially at high values of \(m^2_{\ell \ell }\).

5 Conclusions

In this paper, we took the first steps toward developing a general method for identifying surface boundaries in 3D phase-space distributions from Voronoi tessellations. In the case of a sequential cascade decay like the one exhibited in Fig. 1, the surface boundary in the relevant \((m_{12}, m_{23}, m_{13})\) space is characterized by two properties: (1) it is the location of points where the number density is enhanced, due to the \(\Delta _4^{-1/2}\) factor in the phase-space distribution (25) [93]; (2) it is the location of points where the number density suddenly changes due to the lack of signal events outside the kinematically allowed boundary. These two properties motivate the use of the geometric variables, \({\bar{\sigma }}_i\) and \(\bar{v}_i\), derived from the Voronoi tessellation of the data. (For other options, see [11].) We showed that the edge cells tend to have large values of \({\bar{\sigma }}_i\) and small values of \(\bar{v}_i\); we therefore advocated empirically selected cuts in terms of \(\bar{v}_i\) and \({\bar{\sigma }}_i\) for tagging edge cell candidates. We considered several examples of increasing complexity and quantified the efficiency of those selection cuts using the language of ROC curves.

There are several directions in which this line of research can proceed.

  • Statistical significance of a set of tagged edge cells. As we have seen in Figs. 18 and 19, the method is not perfect, in the sense that it occasionally tags a few bulk cells. Therefore, we need to develop a statistical procedure for determining the statistical significance of a given observed set of tagged edge cell candidates. Such a procedure should involve not just the relative number of tagged cells, but also their correlations, e.g., proximity to each other, connectedness, etc. This is an interesting subject on its own and will be addressed in a future publication [107].

  • Parameter measurements from fitting to a set of tagged edge cells. Having selected a set of edge cell candidates, one could imagine fitting to the theoretical prediction for the shape of the boundary surface (34), obtaining a measurement of the mass spectrum of the new particles \(X_1, X_2, X_3, X_4\). The actual fitting can be done in several different ways, which will also be investigated in [107].

  • Experimental effects. In order to keep the discussion simple and to the point, in this paper we have ignored the experimental complications arising from finite particle widths, detector resolution, ISR jet combinatorics, fakes, etc. Our goal was to present the method as a proof of principle, since the Voronoi approach to data analysis is still in its infancy. Once Voronoi-based methods have become more established and mature, it will become worthwhile to perform detailed and more realistic studies beyond parton level and with full detector simulation.

Here we have focused on cases where the number of signal events on the boundary is significant, leading to a “step” function discontinuity in the total density of events as one moves across the boundary. However, there are interesting examples of distributions where the number density is continuous, but exhibits a “kink”, i.e., a discontinuity in its derivative (gradient) [110,111,112]. In such cases, our methods can still be applied – not directly on the initial data itself, but on a secondary data sample created by taking suitable derivatives. Indeed, while we have identified and studied a promising use of Voronoi tessellations in the analysis of particle physics data, there are many exciting applications yet to be developed. We look forward with anticipation to the future development and adoption of these novel and powerful methods.