
Dynamical projections for the visualization of PDFSense data

  • Dianne Cook
  • Ursula Laa
  • German Valencia
Open Access
Regular Article - Theoretical Physics

Abstract

A recent paper on visualizing the sensitivity of hadronic experiments to nucleon structure (Wang et al. in arXiv:1803.02777, 2018) introduces the tool PDFSense, which defines measures that allow the user to judge the sensitivity of PDF fits to a given experiment. The sensitivity is characterized by high-dimensional data residuals that are visualized in a 3-d subspace of the first 10 principal components or using non-linear embeddings. We show how a tour, a dynamic visualisation of high-dimensional data, can extend this tool beyond 3-d relationships. This approach makes it possible to resolve structure orthogonal to the 2-d viewing plane used so far, and hence allows a more finely tuned assessment of the sensitivity.

1 Introduction

Many problems in physics can be broadly characterized as a description of a large number of observations with models that contain multiple parameters. It is common practice to perform a global fit to the observations to arrive at the set of parameter values that best fits the data. To understand how well this fit describes the observations, a series of one or two-dimensional projections of confidence level regions are usually provided.

It is desirable to visually inspect the results of such fits to gain insight into their structure. One possibility is to directly compare the predictions of different parameter sets in the vicinity of the best fit. A simple algorithm to organise this idea that results in a manageable number of such parameter sets can be constructed using singular value decomposition (SVD). One first decides the confidence level at which to make the desired comparison and quantifies it with the corresponding \(\Delta \chi ^2\) for the appropriate number of parameters being fit, n. The region in parameter space within the desired confidence level is approximately an n-dimensional ellipsoid, and SVD provides an ideal set of 2\(\times n\) points on which to evaluate the predictions of the model for visual inspection. These points are given by the intersections of the ellipsoid with its principal axes and clearly provide a minimal sample of parameter space that covers all relevant directions at a desired confidence level.

A tool for the direct visualisation of the high dimensional model predictions thus constructed has existed in the statistics literature for many years, but has not been applied to high energy physics problems recently.1 It is called a tour, and is a dynamic visualization of low-dimensional projections of high-dimensional spaces. The most recent incarnation of the tool is available in the R [3] package, called tourr [4]. The goal of this paper is to introduce the use of a tour as a visualisation tool for sensitivity studies of parton distribution functions (PDFs) building on the formalism that has been developed over the years by the CTEQ collaboration. It is beyond the scope of this article to provide a detailed analysis of the PDF uncertainties. The choice of this example has two motivations: the PDF fits embody the generic problem of multidimensional fits to large numbers of observables that are common in high energy physics; and Ref. [1] has recently provided the parameter sets for this problem in an initial effort to visualize the PDF fits. Our starting point will be the PDFSense [1] results but our study differs in an important way: PDFSense utilizes the Tensorflow Embedding Projector [5], limiting visualisation to three of the first ten principal components, that is, a 3-d subspace, whereas the tour allows us to explore the full space. As we will see here, this allows additional insights into the fits.

Our paper is organised as follows. In Sect. 2 we first describe the problem as formulated in Ref. [1] and we discuss a toy example to illustrate the concepts involved. We then introduce the tour algorithm and its implementation in Sect. 3. Finally we discuss the results obtained by applying tour to the PDFSense dataset in Sect. 4 and present our conclusions in Sect. 5.

2 PDF fits and residuals

The analysis of collider physics results relies on theoretical calculations of cross-sections and distributions. Factorization theorems allow us to bypass non-perturbative physics that cannot be calculated from first principles and to describe instead, the initial state of a reaction in terms of parton distribution functions or PDFs. These consist of simple functional forms describing the probability density for finding a given quark or gluon in the proton with a given momentum fraction x, at a given momentum transfer scale Q, in the lowest order approximation. The PDFs used today have been constructed by fitting high energy physics data collected over many years by multiple experiments and are produced by large collaborations. As such, they constitute an ideal example of a multidimensional parameter fit to a large data set to study with a tour.

For our study we will make use of the framework for treating uncertainties of the PDF predictions as has been defined in [6, 7]. The best fit PDF, defined by the set of n parameters \(a^0_i\), is obtained by finding the global minimum of a \(\chi ^2\) function. To study uncertainties in the fit one considers small variations of the parameters around the minimum using a quadratic approximation for the \(\chi ^2\) function written in terms of the Hessian matrix of second derivatives at the minimum, H. The eigenvectors of this matrix provide the principal axes of the confidence level ellipsoids around the global minimum, and one defines a displacement along these directions to find the n dimensional set of points \(a_i\) which provide 2n PDF sets that differ from the best fit by a desired confidence level.

Reference [1] has introduced the package PDFSense to study the sensitivity of different experiments to different aspects of the PDFs. An ingredient of that study is the set of so-called shifted residuals, which are related to the experimental error contribution to the \(\chi ^2\) by [8]
$$\begin{aligned} \chi ^2_E (\vec {a}) = \sum _{i=1}^{N_d} r^2_i(\vec {a}) + \sum _{\alpha =1}^{N_{\lambda }}\bar{\lambda }_{\alpha }^2(\vec {a}) \end{aligned}$$
(1)
where the \(\bar{\lambda }_{\alpha }\) are the best-fit nuisance parameters. The shifted residuals \(r_i(\vec {a})\) are calculated as the difference between the theoretical prediction \(T_i(\vec {a})\) and the shifted central data value \(D_{i,sh}(\vec {a})\), normalised by the total uncorrelated uncertainty \(s_i\),
$$\begin{aligned} r_i(\vec {a}) = \frac{1}{s_i}(T_i(\vec {a}) - D_{i,sh}(\vec {a})). \end{aligned}$$
(2)
Note that \(D_{i,sh}(\vec {a})\) is the observed central value shifted by a function of the optimal nuisance parameters \(\bar{\lambda }_{\alpha }\) and therefore depends on the point in parameter space considered. The so-called response of a residual to an experimental result i is then defined as [1]
$$\begin{aligned} \delta _{i,l}^{\pm } \equiv (r_i(\vec {a}_l^{\pm })-r_i(\vec {a}_0))/\langle r_0\rangle _E \end{aligned}$$
(3)
with \(\langle r_0\rangle _E\) the root-mean squared residuals characterizing the quality of fit to experiment E, following from Eq. 1
$$\begin{aligned} \langle r_0\rangle _E \approx \sqrt{\frac{\chi ^2_E(\vec {a}_0)}{N_d}}. \end{aligned}$$
(4)
This residual response parameterizes the change in residuals with variations along the independent directions \(\vec {a}_l^{\pm }\).2
Large values of \(\delta _{i,l}^{\pm }\) therefore indicate considerable variation in the theory prediction values within the selected window of allowed probability variation along the considered direction. We thus consider a 2N dimensional vector
$$\begin{aligned} \vec {\delta }_i=\{\delta _{i,1}^{+},\delta _{i,1}^{-},\ldots ,\delta _{i,N}^{+},\delta _{i,N}^{-}\}. \end{aligned}$$
(5)
for each data point (i.e. experimental result). Concretely, here we consider a 56 dimensional parameter space in which we want to compare and group the experimental results. These responses \(\vec {\delta }_i\) are calculated and provided by Ref. [1] and they constitute the starting point of our study.
Fig. 1

For illustrative purposes, two data sets for the gluon parton distribution function, in the form \(p(x)\pm \Delta p(x)\) for 15 and 16 values of x (shown in red and blue, respectively). The left (right) panel shows the low (high) x region

Fig. 2

Difference between the \(\chi ^2\)-function (black) and its quadratic approximation (orange). Their intersection with a 95% confidence level plane is shown in the right panel. The intersections of the principal axes with the ellipse (that occurs in the quadratic approximation) are shown as the black dots in the right panel. The numbers label the eigenvector of H corresponding to that direction

2.1 Simple illustrative example

The procedure described so far has been used for many years, but it is complicated. For newcomers to the field, we illustrate it here using a simple example drawn from two early data sets for the gluon parton distribution function extracted from two types of \(\psi \) production experiments [9]. This example will allow us to illustrate all the concepts involved. In Fig. 1 we show these two data sets, labelling the points and their error bars \(p(x)\pm \Delta p(x)\), for 15 and 16 values of x (in red and blue) respectively. The points are fit to the two-parameter function
$$\begin{aligned} g(a,b,x)=\frac{1}{2}(1+b)(1-x)^b x^a, \end{aligned}$$
(6)
similar to but simpler than the forms used today. The next step is to minimise the \(\chi ^2\)-function defined by
$$\begin{aligned} \chi ^2(a,b)=\sum _{x_i}\left( \frac{g(a,b,x_i)-p(x_i)}{\Delta p(x_i)}\right) ^2. \end{aligned}$$
(7)
The parameters \(a_0,b_0\) that result in the global minimum \(\chi ^2(a,b)_\mathrm{min}\) define the best fit to the data. They are shown as the cross in the right panel of Fig. 2, and produce the solid black curve shown in Fig. 1. At the same time one adopts a quadratic approximation to the \(\chi ^2\) function in the vicinity of its minimum
$$\begin{aligned} \chi ^2(a,b)\approx \chi ^2(a_0,b_0) +\frac{1}{2} \left( \begin{array}{cc} a-a_0&b-b_0 \end{array} \right) \nonumber \\ \left( \begin{array}{cc} \frac{\partial ^2\chi ^2(a,b)}{\partial a^2} & \frac{\partial ^2\chi ^2(a,b)}{\partial a\partial b} \\ \frac{\partial ^2\chi ^2(a,b)}{\partial a\partial b} & \frac{\partial ^2\chi ^2(a,b)}{\partial b^2} \end{array} \right) _0 \left( \begin{array}{c} a-a_0 \\ b-b_0 \end{array} \right) , \end{aligned}$$
(8)
where the matrix of second derivatives evaluated at the global minimum is the well-known Hessian. This approximation seems unnecessary for the simple example we are discussing now but is used for the current global fits offering complementary features to exact numerical methods [10]. To quantify the error in the fit one then constructs the region in ab parameter space corresponding to a given confidence level. For our example we take \(\chi ^2(a,b)-\chi ^2(a_0,b_0)\le 5.99\) which corresponds to a 95% confidence level in the estimation of two parameters. The intersection of the plane \(\chi ^2(a,b)=\chi ^2(a_0,b_0)+ 5.99\) (green) with the \(\chi ^2(a,b)\) function (shown in black) and its quadratic approximation (in orange) is shown in the left panel of Fig. 2. The right panel in the same figure shows the ellipsoid (two-dimensional in this case) defined by this intersection for the quadratic approximation (in orange) and the deformed ellipsoid in black for the exact \(\chi ^2(a,b)\) function. The difference between the two is small indicating that the quadratic approximation is quite adequate for this confidence level. The eigenvectors of the Hessian matrix provide the directions of the principal axes of the ellipsoid and are shown in black in the right panel of Fig. 2: the dashed (dotted) lines correspond to the direction associated with the largest (smallest) eigenvalue. The intersections of these axes with the ellipse, shown as black dots, provide a set of fits to the data that can be compared with the best fit and used as a means of quantifying the uncertainty in the fitting procedure. These are also shown in Fig. 1.
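To make the construction concrete, the following is a minimal sketch of these steps in R for a toy version of this two-parameter fit; the data values and all object names are placeholders, not the measurements of Ref. [9].

```r
# Toy stand-in data for Fig. 1 (placeholders)
x  <- seq(0.05, 0.60, length.out = 15)                              # toy x values
y  <- 0.5 * (1 + 2) * (1 - x)^2 * x^0.1 + rnorm(15, sd = 0.02)      # toy p(x) near Eq. 6 with a = 0.1, b = 2
dy <- rep(0.05, 15)                                                 # toy uncertainties Delta p(x)

# chi^2 of Eq. 7 for the two-parameter form g(a, b, x) of Eq. 6
chisq <- function(p) sum(((0.5 * (1 + p[2]) * (1 - x)^p[2] * x^p[1] - y) / dy)^2)

fit <- optim(c(0.1, 1), chisq, hessian = TRUE)   # best fit (a0, b0) and numerical Hessian H
eig <- eigen(fit$hessian)                        # principal axes of the confidence ellipsoid
t95 <- sqrt(2 * 5.99 / eig$values)               # displacement where (1/2) lambda_i t^2 = 5.99 (95% CL)

# The 2n = 4 parameter sets where the principal axes intersect the 95% CL ellipse
pts <- rbind(fit$par + t95[1] * eig$vectors[, 1],
             fit$par - t95[1] * eig$vectors[, 1],
             fit$par + t95[2] * eig$vectors[, 2],
             fit$par - t95[2] * eig$vectors[, 2])
```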
The set of responses, \(\delta _{i,l}^{\pm }\), in this example is shown in Fig. 3. From inspecting the limiting behaviour of Eq. 6 it is clear that the description at low x is dependent mainly on a while large values of x are mostly sensitive to b. This is reflected in the uncertainty curves in Fig. 1, and also when looking at the \(\delta \)s. For this simple example the main directions identified by the Hessian method are in fact well aligned with the original directions in parameter space. Considering the values of \(\delta \) we find that \(\delta _1^{\pm }\), which corresponds mainly to a variation of a, takes large values for bins with low values of x, while \(\delta _2^{\pm }\) takes large values for bins with large values of x. We conclude that the parameter dependence is captured by the \(\delta \)s as expected. Going to more complex descriptions and fits, as we do in the following, this correspondence is no longer clear from the description and the \(\delta \) values may be used to infer the parameter dependence of a given prediction.
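A corresponding sketch of the residual responses of Eqs. 2-4 for this toy example (there are no correlated errors here, so the shifted data equal the observed data); it reuses the placeholder objects x, y, dy, chisq, fit and pts from the previous block.

```r
# r_i(a) of Eq. 2: residuals of the prediction g(a, b, x) with respect to the toy data
resid_vec <- function(p) (0.5 * (1 + p[2]) * (1 - x)^p[2] * x^p[1] - y) / dy

r0 <- sqrt(chisq(fit$par) / length(x))           # <r_0>_E of Eq. 4

# delta of Eq. 3: one column per +/- displacement along a principal axis, one row per data point
delta <- sapply(seq_len(nrow(pts)),
                function(l) (resid_vec(pts[l, ]) - resid_vec(fit$par)) / r0)
```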
Fig. 3

The \(\delta \) parameter space of the simple illustrative example: \(\delta _i^+\) form the axes and color indicates the respective value of x. Note that only \(\delta _i^+\) is shown because for this problem the \(\delta _i^-\) directions contain the same information. Labelled points are the same as those labelled in Fig. 1, and illustrate key features of the fits

In Figs. 1 and 3 we have labelled the following four points:
  1. point with highest value in \(\delta _1\), found at low x and with small error bar
  2. point with the highest value in \(\delta _2\), which also has the highest value of x
  3. point that is not well described by the fits, but has small values of \(\delta \)
  4. point with intermediate value of x and small errors, resulting in large values in both \(\delta \) directions.
These observations illustrate that large values of \(\delta \) correlate with points with errors that are comparable to or smaller than the uncertainty in the fit as parametrized by the Hessian method. At the same time, points that are not well described by the fits do not necessarily result in large \(\delta \)s.

3 Data visualisation

When looking for structure in high dimensional parameter spaces we rely on tools for dimensional reduction and visualisation. Due to the importance of this task, many methods have been developed. Here we give a brief overview of the tools used in the following work. Note that in the following we adopt the broader definition of the word "data" that is generally used in statistics, which is not restricted to experimental results.

3.1 Dimension reduction

3.1.1 Principal component analysis

Principal component analysis (PCA) is an orthogonal linear transformation of elliptical data into a coordinate system, such that the first basis vector aligns with the direction of maximum variance. The second basis vector is the direction of maximum variation orthogonal to the first coordinate, and the remaining basis vectors are sequentially computed analogously. It is typically used for dimension reduction. To choose the number of principal components (PCs) to use, the proportion of variance explained by each component is examined,
$$\begin{aligned} v_i^{prop} = v_i \Big / \sum _j v_j, \end{aligned}$$
(9)
with \(v_i\) the variance in the direction of PC i. Either a pre-determined proportion of total variance is required, or the proportions are plotted against the number of PCs and the point where this curve flattens towards zero is chosen. PCA is an optimization problem with a well defined solution. However, the outcome of the PCA is affected by the preparation of the input data. The preparation can also be used to highlight specific aspects of the data distribution. For example, the input data is generally centered before performing PCA by setting each variable to have a mean value of zero. In this way, large variation describing only mean values different from zero is removed from the results. Another approach is to normalize the distribution, to emphasize directional information. Typically this means "sphering" the data points, by normalizing each vector to have length one. This results in a comparison of similar, or different, directions in the parameter space, but information about differences in length is lost by this approach.

In this work we use the standard implementation prcomp in R for the computation of the principal components.
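As an illustration, a minimal sketch of the two data preparations in R; the matrix of residual responses below is a random placeholder.

```r
# `deltas` stands in for the m x 56 matrix of residual responses (one row per data point)
deltas <- matrix(rnorm(100 * 56), ncol = 56)

# PCA1: centered data only (prcomp centers each variable by default)
pca1 <- prcomp(deltas, center = TRUE, scale. = FALSE)

# PCA2: each observation first scaled ("sphered") to length one, then centered
deltas_sph <- deltas / sqrt(rowSums(deltas^2))
pca2 <- prcomp(deltas_sph, center = TRUE, scale. = FALSE)

# Proportion of variance explained by each PC (Eq. 9)
prop_var <- pca1$sdev^2 / sum(pca1$sdev^2)
```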

3.1.2 Nonlinear embeddings

It is also common to examine non-linear mappings of the data points onto a low dimensional embedding. The aim is to preserve multidimensional structure by minimizing the difference between distances in the full parameter space and distances in the low dimensional projection. PCA is a simple member of this more general type of transformation. A widely used method in machine learning is the algorithm called t-distributed stochastic neighbor embedding (t-SNE) [11]. Its goal is to cluster similar points together (i.e. points with small Euclidean distance) while separating the individual clusters from one another. This gives appealing and often useful pictures, but the results should be considered with care, as t-SNE is a nonlinear transformation and does not preserve the original distances. Note that while nonlinear embeddings may be useful in identifying clusters in the data, their interpretation is limited by the lack of an analytical description of the transformation. This is not the case for linear transformations such as PCA, where the transformation can be readily reversed to identify the contribution of the original parameters to a given principal component direction.
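For example, a minimal sketch of reading off this inverse map, assuming pca1 is the prcomp result from the sketch above (the `rotation` matrix encodes the linear transformation):

```r
loadings_pc1 <- pca1$rotation[, 1]                 # delta-space direction defining the first PC
head(sort(abs(loadings_pc1), decreasing = TRUE))   # original delta directions contributing most to PC 1
```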

3.2 Tour algorithm

3.2.1 Overview

When a data set has more than two parameters, the tour [12] can be used to plot the multiple dimensions. Currently the typical approach is to plot two parameters or pairs of combinations of the parameters. The tour extends this idea to plot all possible combinations. The viewer is provided with a continuous movie of smooth transitions from one combination to another, from which it is possible to extrapolate the shape of the parameter space in high-dimensions. Seeing many combinations in quick succession shows the associations between all the parameters.

There are several types of tours. Here we use a grand tour of projections from the n-dimensional parameter space onto 2-d planes. A projection of the data is computed by multiplying an \( m \times n\) data matrix, \(\mathrm{\mathbf{X}}\), having m sample points in n dimensions, by an orthonormal \(n \times d\) projection matrix, \(\mathrm{\mathbf{A}}\), yielding a d-dimensional projection. The grand tour is a mechanism for choosing which projections to display, and how the smooth transitions happen. New projections are chosen from all possible projections, and a geodesic interpolation to a target projection provides the smooth transition. The original algorithm is documented in [13]. The implementation used in this paper is from the tourr [4] package in R [3].
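A minimal sketch of both ingredients in R with the tourr package; the data matrix below is a random placeholder.

```r
library(tourr)

X <- matrix(rnorm(200 * 6), ncol = 6)            # m = 200 sample points in n = 6 dimensions

A    <- qr.Q(qr(matrix(rnorm(6 * 2), ncol = 2))) # an orthonormal n x d projection matrix (d = 2)
proj <- X %*% A                                  # the corresponding 2-d projection of the data

# The grand tour chooses a sequence of such planes and interpolates smoothly between them
animate_xy(X, tour_path = grand_tour(d = 2))
```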

The tour shows linear projections of the parameter space. In contrast, methods like t-SNE [11] produce non-linear mappings from high- to low- dimensional space. The difference is that the shape of the data in high-dimensions is preserved by linear projections, but not with nonlinear mappings.

3.2.2 Algorithm

A movie of data projections is created by interpolating along a geodesic path from the current (starting) plane to the new target plane. In the grand tour, the target plane is chosen by randomly selecting a plane. The interpolation algorithm (as described in [14]) follows these steps:
Table 1

Summary of key findings, comparing observations made with visualising PDFSense results with the TFEP and with additional insights that can be made using tour. A complete list of experimental datasets together with their CTEQ labelling IDs is given in Appendix A

1. PDFSense & TFEP: Three clusters can be separated in the visualisation, labelled DIS, VBP and jet cluster. In the selected view the jet cluster is roughly orthogonal to the DIS cluster
   Tour: We observe the differences in distributions between the three clusters more clearly. Substructure within the clusters is also observed, and studied in some detail

2. PDFSense & TFEP: New ATLAS and CMS results will dominate the jet cluster
   Tour: A more detailed comparison of jet cluster results shows that CMS results are mainly responsible for extending the range, consistent with sensitivity rankings

3. PDFSense & TFEP: \(t\bar{t}\) results are characterized by large \(\vec {\delta }\) but there are only a few points and they are found inside the jet cluster
   Tour: While the \(t\bar{t}\) results follow similar distributions to the jet cluster, they do contain outlying points

4. PDFSense & TFEP: Results from semi-inclusive charm production at HERA (147) are found to overlap with the DIS and jet clusters
   Tour: These results do not take significant values in any direction of the \(\vec {\delta }\) space, directional information is misleading here

5. PDFSense & TFEP: CCFR/NuTeV dimuon SIDIS results (124–127) are orthogonal, the direction cannot be resolved in the selected view
   Tour: The tour resolves the orthogonal direction and further allows to identify outlying points

6. PDFSense & TFEP: Reciprocated distance as summary statistic to characterize "relevance" of results
   Tour: We can use the ranking as guidance to select results to highlight in the visualisation to gain understanding of how the summary statistics relate to raw distributions

  1. Given a starting \(n\times d\) projection \(\mathrm{\mathbf{A}}_a\), describing the starting plane, create a new target projection \(\mathrm{\mathbf{A}}_z\), describing the target plane. It is important to check that \(\mathrm{\mathbf{A}}_a\) and \(\mathrm{\mathbf{A}}_z\) describe different planes, and generate a new \(\mathrm{\mathbf{A}}_z\) if necessary. To find the optimal rotation of the starting plane into the target plane we need to find the frames in each plane which are the closest.
  2. Determine the shortest path between frames using singular value decomposition: \(\mathrm{\mathbf{A}}_a'\mathrm{\mathbf{A}}_z=\mathrm{\mathbf{V}}_a\Lambda \mathrm{\mathbf{V}}_z', ~~~\Lambda =\text{ diag }(\lambda _1\ge \dots \ge \lambda _d)\), and the principal directions in each plane are \(\mathrm{\mathbf{B}}_a=\mathrm{\mathbf{A}}_a\mathrm{\mathbf{V}}_a, \mathrm{\mathbf{B}}_z=\mathrm{\mathbf{A}}_z\mathrm{\mathbf{V}}_z\), a within-plane rotation of the descriptive bases \(\mathrm{\mathbf{A}}_a, \mathrm{\mathbf{A}}_z\) respectively. The principal directions are the frames describing the starting and target planes which have the shortest distance between them. The rotation is defined with respect to these principal directions. The singular values, \(\lambda _i, i=1,\dots , d\), define the smallest angles between the principal directions.
  3. Orthonormalize \(\mathrm{\mathbf{B}}_z\) on \(\mathrm{\mathbf{B}}_a\), giving \(\mathrm{\mathbf{B}}_*\), to create a rotation framework.
  4. Calculate the principal angles, \(\tau _i = \cos ^{-1}\lambda _i, i=1,\dots , d\).
  5. Rotate the frames by dividing the angles into increments, \(\tau _i(t)\), for \(t\in (0,1]\), and create the ith column of the new frame, \(\mathrm{\mathbf{b}}_i\), from the ith columns of \(\mathrm{\mathbf{B}}_a\) and \(\mathrm{\mathbf{B}}_*\), by \(\mathrm{\mathbf{b}}_i(t) = \cos (\tau _i(t))\mathrm{\mathbf{b}}_{ai} + \sin (\tau _i(t))\mathrm{\mathbf{b}}_{*i}\). When \(t=1\), the frame will be \(\mathrm{\mathbf{B}}_z\).
  6. Project the data into \(\mathrm{\mathbf{A}}(t)=\mathrm{\mathbf{B}}(t)\mathrm{\mathbf{V}}_a'\).
  7. Continue the rotation until \(t=1\). Set the current projection to be \(\mathrm{\mathbf{A}}_a\) and go back to step 1.
In a grand tour the target plane is drawn randomly from all possible target planes, which means that any plane is equally likely to be shown. That is, we are sampling from a uniform distribution on a sphere. To achieve this, sample n values from a standard univariate normal distribution, resulting in a sample from a standard multivariate normal. Standardizing this vector to have length one gives a random value from a \((n-1)\)-dimensional sphere, that is, a randomly generated projection vector. Doing this twice gives a 2-dimensional projection, where the second vector is orthonormalized on the first.
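A minimal sketch of this construction in base R (n = 6 is used only as an illustration):

```r
n  <- 6
v1 <- rnorm(n); v1 <- v1 / sqrt(sum(v1^2))   # random unit vector on the (n-1)-sphere
v2 <- rnorm(n)
v2 <- v2 - sum(v1 * v2) * v1                 # orthogonalize the second vector against the first
v2 <- v2 / sqrt(sum(v2^2))                   # and normalize it
A_target <- cbind(v1, v2)                    # orthonormal n x 2 basis describing the target plane
```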

The data typically needs some standardization or scaling before computing the tour. This is because we are considering linear combinations of the different parameter directions and differences in overall range might otherwise dominate the resulting display.3 This can be as simple as centering each variable on 0, and standardizing to a range of − 1 to 1. It could be as severe as sphering the data which in statistics means that the data is transformed into principal components (from elliptical shape to spherical shape). The same term is used for a different type of transformation in other fields, where observations are scaled to fall on a high-dimensional sphere, by scaling each observation to have length 1. (An interesting diversion: this type of sphering is the same transformation made on multivariate normal vectors to obtain a point on a sphere, to choose the target planes in the grand tour.)
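As a sketch of the simpler option, the following helper centers each variable on 0 and scales it to the range −1 to 1; the function name is a placeholder.

```r
scale_pm1 <- function(X) {
  Xc <- scale(X, center = TRUE, scale = FALSE)    # center each variable on 0
  sweep(Xc, 2, apply(abs(Xc), 2, max), "/")       # divide each variable by its maximum absolute value
}
```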

The initial description of the tour promised display of all possible projections. Theoretically this is true, but practically it would require that the user stay watching forever! However, the coverage of the space is fairly fast, depending on n, and within a short time it is possible to guarantee all possible projections are displayed within an angle of tolerance.

3.2.3 Display

For physics problems, setting \(d=2\) would be most common. The projected data is displayed as a scatterplot of points. It is also possible to overlay confidence regions, or contours. Groups in the data can be highlighted by color. Displaying the combination of variables for a particular projection can be useful for interpreting patterns. This can be realized by plotting a circle with segments indicating the magnitude and direction of each variable's contribution; this display is called the axes.

The same tour path can be used to display subsets of the data, in different plots, to compare groups. When we break the display into subsets, the full data is also shown in each plot, in light grey. This makes it easier to do group comparison.
Fig. 4

Projections obtained with TFEP, where principal components 3, 5 and 8 have been selected, and the view was rotated such that the jet+\(t\bar{t}\) cluster is roughly orthogonal to the DIS cluster. The top left plot shows grouping into jets+\(t\bar{t}\) (red), DIS (blue) and VBP (orange), the remaining plots highlight subgroups (indicated by CTEQ labelling IDs shown in the appendix) of the jets+\(t\bar{t}\) cluster in the same view

4 Results

This section compares the findings made using the tour with those made with PDFSense using the recent CT14HERA2 fits [15]. The PDFSense results form the basis on which to expand the knowledge of PDF fits. The results from both tools are summarized in Table 1, where the PDFSense results were obtained using the TensorFlow Embedding Projector (TFEP) software [5] for the visualisation of high-dimensional data. The summary statistic "reciprocated distance" referenced in Table 1 is defined as:
$$\begin{aligned} \mathcal {D}_{i}\ \equiv \ \left( \sum _{j\ne i}^{N_{ all }}\frac{1}{|\vec {\delta }_{j}-\vec {\delta }_{i}|}\right) ^{-1}. \end{aligned}$$
(10)
This pair-wise distance measure will take larger values for experimental results with residual responses different from most other results considered, and small values if the responses are similar to most other results. For the example shown in Fig. 3 the largest value of reciprocated distance is found for point 2, followed by point 4 and point 1. Point 3 on the other hand has a reciprocated distance that is about a factor 10 below the maximum one, since it is found to have \(\delta \) values close to the majority of other data points. \(\mathcal {D}_{i}\) can therefore be used to quantify similarity, enabling for example the identification of systematically different experimental results. TFEP provides two methods, PCA and t-SNE, and [1] explores both for the visualisation of the data set. The PCA implementation returns projections onto the first 10 PCs evaluated from centered and sphered data, and allows the user to choose two or three of them to view the results.
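A minimal sketch of this summary statistic in R, assuming deltas is the matrix of residual-response vectors (one row per experimental data point):

```r
recip_distance <- function(deltas) {
  d <- as.matrix(dist(deltas))    # pairwise Euclidean distances |delta_j - delta_i|
  diag(d) <- Inf                  # drop the j = i term (1/Inf contributes 0 to the sum)
  1 / rowSums(1 / d)              # D_i = (sum_{j != i} 1 / |delta_j - delta_i|)^(-1), Eq. 10
}
```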

4.1 Results from PDFSense & TFEP

For comparison we first reproduce results similar to those found in [1] by using the TFEP software. A selection of four views is shown in Fig. 4, for a complete set of plots related to the PDFSense column in Table 1 we refer the reader to [1]. The selected examples show how the view was chosen based on orthogonality of assigned groups, and how for the example of the jet+\(t\bar{t}\) group the various contributions have been compared.

We can identify several limitations in using the TFEP software for the visualisation:
  • Relevant information about the distributions is encoded in more than 3 dimensions. This is clear since PCs 3, 5 and 8 were selected for the visualisation, so the majority of the variation in the data is not captured in Fig. 4. Moreover, the application of t-SNE clustering shown in [1] results in a large number of clusters, indicating higher dimensional structure. It would be preferable to display this structure as a linear projection, for which interpretations are straightforward.

  • The sphering of data points when preparing the PCA visualisation removes relevant information about the length of the vectors \(\vec {\delta }_i\).

  • In addition, while the online tool allows highlighting of groups, it is considerably less flexible in selecting options than scripted tools like the tour, limiting the detail in which the results can efficiently be studied.

We next explore how these points can be addressed, in particular in the framework of dynamical projections and the tour algorithm.

4.2 Expanded findings made using the tour

We first optimize the number of principal components considered in our study, and then show how the tour results expand on previous observations, as summarized in Table 1. The mapping from the original \(\delta \) coordinates onto the PCs for all PCAs considered in this work is listed in Appendix B.

4.2.1 PCA, normalisation and variance explained

In the following we study two sets of principal components (PCA1, PCA2), corresponding to the two data preparation choices described above (i.e. PCA1 = centered, and PCA2 = centered and sphered). Results from each are compared. Note that for this problem, the centering has negligible impact on the results as the mean value in each direction \(\delta _{i,l}^{\pm }\) is close to zero.

An important consideration is the number of PCs that contain relevant information. To study this we show in Fig. 5 the proportional variance (see Eq. 9) that is explained by the principal components, for the two choices of the PCA, with labels "Centered" for the PCA performed on centered data (PCA1) and "Sphered" for the PCA obtained from centered and sphered data (PCA2), the latter reproducing the preparation used in Fig. 4. We find a steep curve for the first few PCs, followed by a slow decay of the proportional variance, and the curve only flattens out towards zero around PC30. As a consequence we expect that looking at a 3 dimensional subset of the first 10 PCs is not sufficient to understand the variation in the considered parameter space, and that judging similarity based on the view in Fig. 4 only is misleading.
Fig. 5

Proportional variance explained by the principal components of the 56 dimensional parameter space. To capture all the variation, one would need close to 30 principal components, but around 6 captures about 50% of the variation. Both data preparations produce similar variance explanation, but the differences are enough to matter in some interpretations

In the following we want to study a higher dimensional subspace where we base the number of dimensions considered on the results found in Fig. 5.

For simplicity, we illustrate the tour approach using just the first 6 PCs, which capture about 50% of the overall variation (Table 2). This is sufficient to provide new insights as compared to Fig. 4 (left), and additional PCAs can be added for detailed studies of subgroups, as we do below.
Table 2

Cumulative variance as % explained by the first 15 PCs

PC      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
PCA1   12  21  30  37  43  48  53  57  61  65  68  72  75  78  80
PCA2   12  24  32  40  47  53  59  63  68  71  74  77  80  82  84
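A minimal sketch of this selection step in R, assuming pca1 and pca2 are the prcomp results from the two data preparations sketched earlier:

```r
# Cumulative % variance explained (as in Table 2 and Fig. 5)
cum_var <- function(p) round(100 * cumsum(p$sdev^2) / sum(p$sdev^2))
cum_var(pca1)[1:15]

scores6 <- pca1$x[, 1:6]    # scores on the first 6 PCs, the input handed to the tour below
```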

4.2.2 Grand tour result details

A short tour path is generated, consisting of 20 basis planes with interpolation between them, showing 2-dimensional projections of the 6-d space. This is used to compare between multiple groups. The examples considered are guided by findings in [1] and are summarized in Table 1.
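A minimal sketch of this setup with the tourr package; scores6 is taken from the previous sketch and the group labels are random placeholders standing in for the DIS/VBP/jet assignment.

```r
library(tourr)
set.seed(2018)
group  <- factor(sample(c("DIS", "VBP", "jet"), nrow(scores6), replace = TRUE))  # placeholder labels

path20 <- save_history(scores6, tour_path = grand_tour(d = 2), max_bases = 20)   # 20 basis planes
animate_xy(scores6, tour_path = planned_tour(path20), col = group)               # replay with interpolation
```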

Grouping of data points We first consider a display corresponding to Fig. 4 (left), i.e. the data set is grouped into three main clusters. Selected views from the animation are shown in Fig. 6, PCA1 (left) and PCA2 (right). The same colors used in Ref. [1] indicate the grouping: the DIS cluster is shown in blue, VBP in orange and the jets cluster in red. The first window in the display shows the axes, the other windows show the projected data, where one group is highlighted in color, while the remaining points are shown below in grey for easy comparison. As can be seen from the selected views, in any particular static view it is only possible to separate two of them at a time. The static views are not sufficient to convey the full picture obtained by watching the tour animation, which allows all three groups to be separated. The tour indicates that there is higher dimensional structure in the data points, as can be seen in the linked animation.

In addition, it is possible to visually identify substructure within the clusters (e.g. groups of points aligned along some direction) as well as outlying points. This is especially true for PCA1 which is found to provide a much clearer picture than PCA2. We also find that the DIS and VBP clusters extend in multiple directions, while the jets cluster seems to be well described in a single plane.
Fig. 6

Selected views from the grand tour results of the full dataset. The data points are grouped into DIS, VBP and jets cluster, shown in blue, orange and red respectively. Top left plot shows the projection of the PCs, and other plots show the three subgroups. Colour indicates group, and grey shows the entire data set, as a reference in order to make comparisons between groups. PCA1 (see animation here) is shown on the top row, PCA2 (see animation here) on the bottom row, the left views show a separation between DIS and jets clusters, the right views show the multidimensionality in the DIS cluster

Fig. 7

Focusing on the jets cluster, showing only the first 4 PCs. Top left plot shows the projection coordinates, groups (Tevatron, ATLAS7old, \(\ldots \)) are focused in black in each plot, and grey shows all the data enabling direct comparison between subgroups. This view from the grand tour was selected because it clearly separates the outlying point in the ATLAS7new dataset. In addition the view also illustrates how the CMS results extend the reach away from the main cluster (see animation here)

The jet cluster We next investigate the jet cluster in more detail. These results are of special interest since they contain the largest data sets to be added to the fit, which were indeed found to be important according to [1]. In addition, the new experimental data from LHC jet measurements is of interest because of possible tensions, such as the systematic offsets in opposing directions for different rapidity bins observed in the ATLAS measurements, see [16] for a general discussion of the issue. As pointed out in [16], tensions can be reduced when adapting the treatment of systematic uncertainties, but cannot be fully resolved [17]. As seen above, the jet cluster appears to be described in a lower dimensional subspace. Indeed, performing PCA on the results in the jet cluster alone, we see that the cumulative proportional variance reaches 49/75/91/95% for PC1/2/3/4 respectively, with the proportional variance dropping to less than 2% for PC5. We therefore study substructure in this 4 dimensional space. While [1] distinguishes three types of groups, i.e. "old" jet results (those included in the CT14HERA2 fit), "new" jet results (more recent ATLAS and CMS results) and \(t\bar{t}\), it makes sense to differentiate the LHC results further by experiment and \(\sqrt{s}\) (motivated also by the differences in sensitivities observed in [1]). For simplicity we consider only the results from performing PCA on the centered data, shown in Fig. 7 with grouping into: Tevatron (IDs 504, 514), ATLAS7old (535), CMS7old (538), CMS7new (542), ATLAS7new (544), \(t\bar{t}\)-energy (565, 567), \(t\bar{t}\)-rap (566, 568) and CMS8 (545). Indeed we observe that the Tevatron results as well as the ATLAS results generally fall in the center of the cluster, with the exception of some outlying points. On the other hand, the CMS 7 and 8 TeV results extend in (different) new directions. It is interesting to note that "old" CMS 7 TeV results extend further out than the corresponding "new" ones. In fact, while the new measurement extended to higher rapidities and lower values in jet \(p_T\), the old measurement contains higher \(p_T\) bins no longer present in the updated result, which turn out to give large values of \(\vec {\delta }\). Finally, for the \(t\bar{t}\) results we distinguish the observations binned in energy (\(p_T^t\) or \(m_{t\bar{t}}\)) or rapidity (\(y_{\langle t/\bar{t}\rangle }\) or \(y_{t\bar{t}}\)). We can identify differences between the two groups in the visualisation, however, as already noted in [1], the data points are not significantly different from the main jet cluster.

It is interesting to study which data points are found to be outlying in the visualisation. These points are highlighted in Fig. 7 and are best distinguished when watching the tour animation:
  • \(|y| > 2.5\) and \(\mu > 950\) GeV – marked with a star symbol: only one such point is found in the 7 TeV data sets. It occurs in ATLAS7new, in the last rapidity bin, and is clearly outlying (large negative values in PCs 1, 2 and 3). However, no particular trend is observed when comparing with points in nearby bins. There are two more such data points in the CMS8 data set, but they do not stand out in \(\delta \) space.

  • \(|y| > 2\) and \(\mu > 1000\) GeV – marked with a downward pointing triangle. These points are seen to align in a new direction, away from the main cluster, highlighting their importance in the fits. They are also useful for comparing the different CMS results: in this case there are points common to both datasets that nevertheless look different, suggesting the need for further study of these points.

  • For CMS8 we also highlight \(|y| < 1\) and \(\mu < 200\) GeV – marked with a diamond symbol: they are very different from the main distribution and give large positive values in PC1. It is interesting that we can clearly separate these low \(\mu \) bins in the CMS8 set but not in CMS7.

The DIS cluster We next consider subgroups of the DIS cluster for which the TFEP visualisation allowed only limited interpretation. Concretely, while the bulk of the cluster was clearly spanned by the HERA results (ID 160) as expected, other results were found to follow quite different distributions. In particular the Charm SIDIS (ID 147) results are distributed in a different direction, overlapping partly with both the DIS and the jet clusters, while the dimuon SIDIS results (IDs 124–127) were found in the center of the distribution and it was concluded that this cluster extends in an orthogonal direction, although it was not shown explicitly.

We therefore compare in detail these three groups. In this case it is useful to consider both PCA1 and PCA2, the latter being more closely related to the TFEP output. First, we observe that the dimuon SIDIS is poorly separated in the PCA2 projection, whereas PCA1 clearly shows how it extends considerably away from the main DIS cluster (ID 160). On the other hand, the charm SIDIS can be separated more easily when studying the directional information in the PCA2 projection, because the individual values in the space of deltas are all comparatively small. These results suggest that either predictions for these types of observables are well under control in the existing fits, or that alternatively the experimental errors are too large for them to be constraining. We also observe substructure in the DIS HERA1+2 results, see Fig. 8 and the corresponding animation, indicating that this group combines a number of qualitatively different types of results.

Comparison with summary statistics We now consider the experimental results with the highest values in reciprocated distances to show they can also be easily distinguished with our visualisation. We highlight three groups in Fig. 9: the HERA dataset (ID 160), the W asymmetry measurements (ID 234, 266 and 281) and the fixed-target Drell-Yan measurements from E605 and E866 (ID 201, 203 and 204).
Fig. 8

As Fig. 6, but showing only selected results in the DIS cluster, i.e. DIS HERA1+2 (black), Charm SIDIS (red) and dimuon SIDIS (green). The left view, for PCA1 (see animation here), shows clear separation of the dimuon SIDIS results; the right view, for PCA2 (see animation here), shows apparent separation of the charm SIDIS results obtained by focussing on directional information

Fig. 9

Left: Comparison of groups with large reciprocated distance measures, where now the full dataset is shown below in gray. Right: Comparison in subspace found by performing PCA on DY data only, where DY data is shown in red and all other data points are shown below in gray. Again selected views from the grand tour results are shown here. The left view (see animation here) roughly shows how the HERA and WASY data points are far away from the main distribution of data points, while the DY points are found only in the center. The right view (see animation here) illustrates the three different types of distributions found in the DY group

Indeed we find that the W asymmetry measurements (234, 266 and 281) follow a very distinct distribution, as does the HERA DIS dataset (160). On the other hand, the fixed-target Drell-Yan measurements (201, 203 and 204) do not stand out in our visualisation. We find that this is a consequence of the dimension reduction,4 and we can easily identify views separating this group from the other data points when considering additional dimensions. Here we show this by looking at projections found by performing PCA on this data subset only, and comparing it to the other data sets in the subspace of the first 4 PCs thus defined. Note however that the tour allows visualisation of the distributions in the full parameter space, which would yield the same information. Our choice of procedure is simply to limit the viewing times required, which grow with the number of dimensions considered.5
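A minimal sketch of this subset-PCA comparison in R; deltas stands in for the full residual-response matrix from the earlier sketch and the Drell-Yan flag is a placeholder rather than the actual ID selection.

```r
# Placeholder flag for the fixed-target Drell-Yan points (IDs 201, 203, 204 in the real data)
is_dy <- sample(c(TRUE, FALSE), nrow(deltas), replace = TRUE)

# PCA on the DY subset only
pca_dy <- prcomp(deltas[is_dy, ], center = TRUE, scale. = FALSE)

# Express every data point in the subspace of the first 4 PCs of the DY subset,
# which can then be handed to the tour as before.
all_in_dy <- scale(deltas, center = pca_dy$center, scale = FALSE) %*% pca_dy$rotation[, 1:4]
```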

This type of visualisation, together with inverting the mapping onto principal components, may be used to identify the origin (i.e. underlying physics) of the large differences. For example the first three PCs found for the DY dataset capture three different distributions, and mapping those back to the original \(\delta \) directions together with study of those directions with respect to uncertainty in individual parton pdfs may provide additional insight. Such detailed investigations are however beyond the scope of this study.

5 Summary and conclusions

Starting from the set of 56 dimensional vectors in the space of residual responses calculated in  [1], we have demonstrated how the grand tour may be used for visualizations in particle physics. The 56 dimensions are reduced to 6 dimensions (for illustration) using principal component analysis, and the resulting representation is then passed onto the tour. The findings made about the fits using the tour, even with only 6 dimensions, are more comprehensive and clearer than what TFEP allows.

The tour visualisation verified several results from [1], notably, the separation between DIS, VBP and JET experiments into clusters populating different regions of delta space. It also allowed us to go into further detail by examining certain substructures within these groups. We have moreover demonstrated that the tour can complement and support analyses based on the use of reciprocated distances.

In our examples we have considered performing the PCA either on centered data (PCA1) or on centered and sphered data (PCA2), as they highlight different aspects of the structure, the former retaining length information and the latter emphasizing directionality. In general we find the results from PCA1 more useful, in particular for this application where the length of the individual data point vectors (i.e. for each experiment) carries important information that is lost when sphering the input data.

The sensitivity defined in  [1], or projection of \(\delta \)s onto a direction given by the gradient of a QCD variable (e.g. cross section prediction) can also be inspected visually and the tour permits this visualisation in multiple dimensions.
Table 3

Experimental datasets considered as part of CT14HERA2 and included in the analysis. IDs are following the standard CTEQ labelling system with 1XX/2XX/5XX representing datasets in the DIS/VBP/JET group

ID#   Experimental dataset   Ref.   Group
101   BCDMS \(F_{2}^{p}\)   [18]   DIS
102   BCDMS \(F_{2}^{d}\)   [19]   DIS
104   NMC \(F_{2}^{d}/F_{2}^{p}\)   [20]   DIS
108   CDHSW \(F_{2}^{p}\)   [21]   DIS
109   CDHSW \(F_{3}^{p}\)   [21]   DIS
110   CCFR \(F_{2}^{p}\)   [22]   DIS
111   CCFR \(xF_{3}^{p}\)   [23]   DIS
124   NuTeV \(\nu \mu \mu \) SIDIS   [24]   DIS
125   NuTeV \(\bar{\nu }\mu \mu \) SIDIS   [24]   DIS
126   CCFR \(\nu \mu \mu \) SIDIS   [25]   DIS
127   CCFR \(\bar{\nu }\mu \mu \) SIDIS   [25]   DIS
145   H1 \(\sigma _{r}^{b}\) (\(57.4 \text{ pb }^{-1}\))   [26, 27]   DIS
147   Combined HERA charm production (\(1.504 \text{ fb }^{-1}\))   [28]   DIS
160   HERA1+2 Combined NC and CC DIS (\(1 \text{ fb }^{-1}\))   [29]   DIS
169   H1 \(F_{L}\) (\(121.6 \text{ pb }^{-1}\))   [30]   DIS
201   E605 DY   [31]   VBP
203   E866 DY, \(\sigma _{pd}/(2\sigma _{pp})\)   [32]   VBP
204   E866 DY, \(Q^{3}d^{2}\sigma _{pp}/(dQdx_{F})\)   [33]   VBP
225   CDF Run-1 \(A_{e}(\eta ^{e})\) (\(110 \text{ pb }^{-1}\))   [34]   VBP
227   CDF Run-2 \(A_{e}(\eta ^{e})\) (\(170 \text{ pb }^{-1}\))   [35]   VBP
234   D\(\emptyset \) Run-2 \(A_{\mu }(\eta ^{\mu })\) (\(0.3 \text{ fb }^{-1}\))   [36]   VBP
240   LHCb 7 TeV W / Z muon forward-\(\eta \) Xsec (\(35 \text{ pb }^{-1}\))   [37]   VBP
241   LHCb 7 TeV W \(A_{\mu }(\eta ^{\mu })\) (\(35 \text{ pb }^{-1}\))   [37]   VBP
260   D\(\emptyset \) Run-2 Z \(d\sigma /dy_{Z}\) (\(0.4 \text{ fb }^{-1}\))   [38]   VBP
261   CDF Run-2 Z \(d\sigma /dy_{Z}\) (\(2.1 \text{ fb }^{-1}\))   [39]   VBP
266   CMS 7 TeV \(A_{\mu }(\eta )\) (\(4.7 \text{ fb }^{-1}\))   [40]   VBP
267   CMS 7 TeV \(A_{e}(\eta )\) (\(0.840 \text{ fb }^{-1}\))   [41]   VBP
268   ATLAS 7 TeV W / Z Xsec, \(A_{\mu }(\eta )\) (\(35 \text{ pb }^{-1}\))   [42]   VBP
281   D\(\emptyset \) Run-2 \(A_{e}(\eta )\) (\(9.7 \text{ fb }^{-1}\))   [43]   VBP
504   CDF Run-2 incl. jet (\(d^2\sigma /dp_{T}^{j}dy_{j}\)) (\(1.13 \text{ fb }^{-1}\))   [44]   JET
514   D\(\emptyset \) Run-2 incl. jet (\(d^2\sigma /dp_{T}^{j}dy_{j}\)) (\(0.7 \text{ fb }^{-1}\))   [45]   JET
535   ATLAS 7 TeV incl. jet (\(d^2\sigma /dp_{T}^{j}dy_{j}\)) (\(35 \text{ pb }^{-1}\))   [46]   JET
538   CMS 7 TeV incl. jet (\(d^2\sigma /dp_{T}^{j}dy_{j}\)) (\(5 \text{ fb }^{-1}\))   [47]   JET

We conclude that the above described method is a valuable tool for PDF uncertainty and sensitivity studies. In addition, the visual analysis allows a better understanding of the method itself and can uncover unexpected features, and even possibly errors. It can provide experiments with a guide to the measurements needed to improve PDF fits.

Footnotes

  1. The precursor [2] of this tool was originally developed to tackle problems in high energy physics.

  2. Note that the shifted central data value enters the residuals; thus, while the observed central value cancels in the definition of \(\delta _{i,l}^{\pm }\), differences in the shift arising from differences of the optimized nuisance parameters at \(\vec {a}_l^{\pm }\) are encoded in the results, together with differences in the theory predictions.

  3. Such a standardisation is in fact done routinely by selecting an axis scale appropriate to the relevant range in each parameter direction in a 2-d plot.

  4. Recall that the selected first six PCs only capture 48% of the overall variance.

  5. When working in the full parameter space one should consider the definition of projection pursuit indices to guide the tour to interesting views; one may e.g. define an index that finds views where a selected group of data points is maximally separated from the cluster of points, similar to the definition of reciprocated distances.


Acknowledgements

This work was supported in part by the Australian Research Council. We thank Nicholas Spyrison for help with the animations and Timothy Hobbs and Fred Olness for clarifications on their work.

References

  1. B.-T. Wang, T.J. Hobbs, S. Doyle, J. Gao, T.-J. Hou, P.M. Nadolsky et al., Visualizing the sensitivity of hadronic experiments to nucleon structure (2018). arXiv:1803.02777
  2. M. Fisherkeller, J.H. Friedman, J. Tukey, PRIM-9: An Interactive Multidimensional Data Display and Analysis System. ASA Statistical Graphics Video Lending Library. http://stat-graphics.org/movies/prim9.html (1973)
  3. R Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2018)
  4. H. Wickham, D. Cook, H. Hofmann, A. Buja, tourr: an R package for exploring multivariate data with projections. J. Stat. Softw. 40, 1 (2011)
  5. TensorFlow Embedding Projector. http://projector.tensorflow.org
  6. J. Pumplin, D.R. Stump, W.K. Tung, Multivariate fitting and the error matrix in global analysis of data. Phys. Rev. D 65, 014011 (2001). arXiv:hep-ph/0008191
  7. J. Pumplin, D. Stump, R. Brock, D. Casey, J. Huston, J. Kalk, Uncertainties of predictions from parton distribution functions. 2. The Hessian method. Phys. Rev. D 65, 014013 (2001). arXiv:hep-ph/0101032
  8. D. Stump, J. Pumplin, R. Brock, D. Casey, J. Huston, J. Kalk, Uncertainties of predictions from parton distribution functions. 1. The Lagrange multiplier method. Phys. Rev. D 65, 014012 (2001). arXiv:hep-ph/0101051
  9. V.D. Barger, W.-Y. Keung, R.J.N. Phillips, On psi and upsilon production via gluons. Phys. Lett. 91B, 253 (1980)
  10. T.-J. Hou et al., Reconstruction of Monte Carlo replicas from Hessian parton distributions. JHEP 03, 099 (2017). arXiv:1607.06066
  11. L. van der Maaten, G. Hinton, Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579 (2008)
  12. D. Asimov, The grand tour: a tool for viewing multidimensional data. SIAM J. Sci. Stat. Comput. 6, 128 (1985)
  13. A. Buja, D. Cook, D. Asimov, C. Hurley, Computational Methods for High-Dimensional Rotations in Data Visualization, in Handbook of Statistics, vol. 24, pp. 391–413. Elsevier (2005). https://doi.org/10.1016/S0169-7161(04)24014-7
  14. D. Cook, E.-K. Lee, A. Buja, H. Wickham, Grand Tours, Projection Pursuit Guided Tours and Manual Controls, ch. III.2, pp. 295–314. Springer Handbooks of Computational Statistics. Springer (2008)
  15. T.-J. Hou, S. Dulat, J. Gao, M. Guzzi, J. Huston, P. Nadolsky et al., CTEQ-TEA parton distribution functions and HERA Run I and II combined data. Phys. Rev. D 95, 034003 (2017). arXiv:1609.07968
  16. L.A. Harland-Lang, A.D. Martin, R.S. Thorne, The impact of LHC jet data on the MMHT PDF fit at NNLO. Eur. Phys. J. C 78, 248 (2018). arXiv:1711.05757
  17. ATLAS collaboration, M. Aaboud et al., Measurement of the inclusive jet cross-sections in proton-proton collisions at \( \sqrt{s}=8 \) TeV with the ATLAS detector. JHEP 09, 020 (2017). arXiv:1706.03192
  18. BCDMS collaboration, A.C. Benvenuti et al., A high statistics measurement of the proton structure functions F(2) (x, Q**2) and R from deep inelastic muon scattering at high Q**2. Phys. Lett. B 223, 485 (1989)
  19. BCDMS collaboration, A.C. Benvenuti et al., A high statistics measurement of the deuteron structure functions F2 (X, \(Q^2\)) and R from deep inelastic muon scattering at high \(Q^2\). Phys. Lett. B 237, 592 (1990)
  20. M. Arneodo et al., Measurement of the proton and deuteron structure functions, F2(p) and F2(d), and of the ratio sigma-L / sigma-T. Nucl. Phys. B 483, 3 (1997). arXiv:hep-ph/9610231
  21. J.P. Berge et al., A measurement of differential cross-sections and nucleon structure functions in charged current neutrino interactions on iron. Z. Phys. C 49, 187 (1991)
  22. CCFR/NuTeV collaboration, U.-K. Yang et al., Measurements of \(F_2\) and \(xF^{\nu }_3 - x F^{\bar{\nu }}_3\) from CCFR \(\nu_\mu -\)Fe and \(\bar{\nu }_\mu -\)Fe data in a physics model independent way. Phys. Rev. Lett. 86, 2742 (2001). arXiv:hep-ex/0009041
  23. W.G. Seligman et al., Improved determination of alpha(s) from neutrino nucleon scattering. Phys. Rev. Lett. 79, 1213 (1997). arXiv:hep-ex/9701017
  24. D.A. Mason, Measurement of the strange-antistrange asymmetry at NLO in QCD from NuTeV dimuon data. Ph.D. thesis, Oregon U. (2006). https://doi.org/10.2172/879078
  25. NuTeV collaboration, M. Goncharov et al., Precise measurement of dimuon production cross-sections in \(\nu_{\mu }\) Fe and \(\bar{\nu }_{\mu }\) Fe deep inelastic scattering at the Tevatron. Phys. Rev. D 64, 112006 (2001). arXiv:hep-ex/0102049
  26. H1 collaboration, A. Aktas et al., Measurement of F2(\(c \bar{c}\)) and F2(\(b \bar{b}\)) at high \(Q^{2}\) using the H1 vertex detector at HERA. Eur. Phys. J. C 40, 349 (2005). arXiv:hep-ex/0411046
  27. H1 collaboration, A. Aktas et al., Measurement of F(2)**c anti-c and F(2)**b anti-b at low Q**2 and x using the H1 vertex detector at HERA. Eur. Phys. J. C 45, 23 (2006). arXiv:hep-ex/0507081
  28. ZEUS, H1 collaboration, H. Abramowicz et al., Combination and QCD analysis of charm production cross section measurements in deep-inelastic ep scattering at HERA. Eur. Phys. J. C 73, 2311 (2013). arXiv:1211.1182
  29. ZEUS, H1 collaboration, H. Abramowicz et al., Combination of measurements of inclusive deep inelastic \({e^{\pm }p}\) scattering cross sections and QCD analysis of HERA data. Eur. Phys. J. C 75, 580 (2015). arXiv:1506.06042
  30. H1 collaboration, F.D. Aaron et al., Measurement of the inclusive \(e^{\pm }p\) scattering cross section at high inelasticity y and of the structure function \(F_L\). Eur. Phys. J. C 71, 1579 (2011). arXiv:1012.4355
  31. G. Moreno et al., Dimuon production in proton–copper collisions at \(\sqrt{s}\) = 38.8 GeV. Phys. Rev. D 43, 2815 (1991)
  32. NuSea collaboration, R.S. Towell et al., Improved measurement of the anti-d / anti-u asymmetry in the nucleon sea. Phys. Rev. D 64, 052002 (2001). arXiv:hep-ex/0103030
  33. NuSea collaboration, J.C. Webb et al., Absolute Drell-Yan dimuon cross-sections in 800 GeV / c pp and pd collisions. arXiv:hep-ex/0302019
  34. CDF collaboration, F. Abe et al., Forward-backward charge asymmetry of electron pairs above the \(Z^0\) pole. Phys. Rev. Lett. 77, 2616 (1996)
  35. CDF collaboration, D. Acosta et al., Measurement of the forward-backward charge asymmetry from \(W \rightarrow e \nu \) production in \(p\bar{p}\) collisions at \(\sqrt{s} = 1.96\) TeV. Phys. Rev. D 71, 051104 (2005). arXiv:hep-ex/0501023
  36. D0 collaboration, V.M. Abazov et al., Measurement of the muon charge asymmetry from \(W\) boson decays. Phys. Rev. D 77, 011106 (2008). arXiv:0709.4254
  37. LHCb collaboration, R. Aaij et al., Inclusive \(W\) and \(Z\) production in the forward region at \(\sqrt{s} = 7\) TeV. JHEP 06, 058 (2012). arXiv:1204.1620
  38. D0 collaboration, V.M. Abazov et al., Measurement of the ratios of the Z/gamma* + >= n jet production cross sections to the total inclusive Z/gamma* cross section in p anti-p collisions at s**(1/2) = 1.96 TeV. Phys. Lett. B 658, 112 (2008). arXiv:hep-ex/0608052
  39. CDF collaboration, T.A. Aaltonen et al., Measurement of \(d\sigma /dy\) of Drell-Yan \(e^+e^-\) pairs in the \(Z\) mass region from \(p\bar{p}\) collisions at \(\sqrt{s}=1.96\) TeV. Phys. Lett. B 692, 232 (2010). arXiv:0908.3914
  40. CMS collaboration, S. Chatrchyan et al., Measurement of the muon charge asymmetry in inclusive \(pp \rightarrow W+X\) production at \(\sqrt{s} =\) 7 TeV and an improved determination of light parton distribution functions. Phys. Rev. D 90, 032004 (2014). arXiv:1312.6283
  41. CMS collaboration, S. Chatrchyan et al., Measurement of the electron charge asymmetry in inclusive \(W\) production in \(pp\) collisions at \(\sqrt{s}=7\) TeV. Phys. Rev. Lett. 109, 111806 (2012). arXiv:1206.2598
  42. ATLAS collaboration, G. Aad et al., Measurement of the inclusive \(W^\pm \) and Z/gamma cross sections in the electron and muon decay channels in \(pp\) collisions at \(\sqrt{s}=7\) TeV with the ATLAS detector. Phys. Rev. D 85, 072004 (2012). arXiv:1109.5141
  43. D0 collaboration, V.M. Abazov et al., Measurement of the electron charge asymmetry in \(p\bar{p}\rightarrow W+X \rightarrow e\nu +X\) decays in \(p\bar{p}\) collisions at \(\sqrt{s}=1.96\) TeV. Phys. Rev. D 91, 032007 (2015). arXiv:1412.2862
  44. CDF collaboration, T. Aaltonen et al., Measurement of the inclusive jet cross section at the Fermilab Tevatron p anti-p collider using a cone-based jet algorithm. Phys. Rev. D 78, 052006 (2008). arXiv:0807.2204
  45. D0 collaboration, V.M. Abazov et al., Measurement of the inclusive jet cross-section in \(p \bar{p}\) collisions at \(s^{(1/2)}\) = 1.96 TeV. Phys. Rev. Lett. 101, 062001 (2008). arXiv:0802.2400
  46. ATLAS collaboration, G. Aad et al., Measurement of inclusive jet and dijet production in \(pp\) collisions at \(\sqrt{s}=7\) TeV using the ATLAS detector. Phys. Rev. D 86, 014022 (2012). arXiv:1112.6297
  47. CMS collaboration, S. Chatrchyan et al., Measurements of differential jet cross sections in proton-proton collisions at \(\sqrt{s}=7\) TeV with the CMS detector. Phys. Rev. D 87, 112002 (2013). arXiv:1212.6660
  48. LHCb collaboration, R. Aaij et al., Measurement of the forward \(Z\) boson production cross-section in \(pp\) collisions at \(\sqrt{s}=7\) TeV. JHEP 08, 039 (2015). arXiv:1505.07024
  49. LHCb collaboration, R. Aaij et al., Measurement of forward \(\rm Z\rightarrow e^+e^-\) production at \(\sqrt{s}=8\) TeV. JHEP 05, 109 (2015). arXiv:1503.00963
  50. ATLAS collaboration, G. Aad et al., Measurement of the \(Z/\gamma ^*\) boson transverse momentum distribution in \(pp\) collisions at \(\sqrt{s}\) = 7 TeV with the ATLAS detector. JHEP 09, 145 (2014). arXiv:1406.3660
  51. CMS collaboration, V. Khachatryan et al., Measurement of the differential cross section and charge asymmetry for inclusive \({{\rm pp}\rightarrow {\rm W}^{\pm }+X}\) production at \({\sqrt{s}} = 8\) TeV. Eur. Phys. J. C 76, 469 (2016). arXiv:1603.01803
  52. LHCb collaboration, R. Aaij et al., Measurement of forward W and Z boson production in \(pp\) collisions at \( \sqrt{s}=8 \) TeV. JHEP 01, 155 (2016). arXiv:1511.08039
  53. ATLAS collaboration, G. Aad et al., Measurement of the double-differential high-mass Drell-Yan cross section in pp collisions at \( \sqrt{s}=8 \) TeV with the ATLAS detector. JHEP 08, 009 (2016). arXiv:1606.01736
  54. ATLAS collaboration, G. Aad et al., Measurement of the transverse momentum and \(\phi ^*_{\eta }\) distributions of Drell-Yan lepton pairs in proton-proton collisions at \(\sqrt{s}=8\) TeV with the ATLAS detector. Eur. Phys. J. C 76, 291 (2016). arXiv:1512.02192
  55. CMS collaboration, S. Chatrchyan et al., Measurement of the ratio of inclusive jet cross sections using the anti-\(k_T\) algorithm with radius parameters R=0.5 and 0.7 in pp collisions at \(\sqrt{s}=7\) TeV. Phys. Rev. D 90, 072006 (2014). arXiv:1406.0324
  56. ATLAS collaboration, G. Aad et al., Measurement of the inclusive jet cross-section in proton-proton collisions at \(\sqrt{s}=7\) TeV using 4.5 fb-1 of data with the ATLAS detector. JHEP 02, 153 (2015). arXiv:1410.8857
  57. CMS collaboration, V. Khachatryan et al., Measurement and QCD analysis of double-differential inclusive jet cross sections in pp collisions at \( \sqrt{s}=8 \) TeV and cross section ratios to 2.76 and 7 TeV. JHEP 03, 156 (2017). arXiv:1609.05331
  58. ATLAS collaboration, G. Aad et al., Measurements of top-quark pair differential cross-sections in the lepton+jets channel in \(pp\) collisions at \(\sqrt{s}=8\) TeV using the ATLAS detector. Eur. Phys. J. C 76, 538 (2016). arXiv:1511.04716

Copyright information

© The Author(s) 2018

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Funded by SCOAP3

Authors and Affiliations

  1. School of Econometrics and Business Statistics, Monash University, Melbourne, Australia
  2. School of Physics and Astronomy, Monash University, Melbourne, Australia
