Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis

  • Original Article
  • Published in: Computing and Software for Big Science

Abstract

We provide a bridge between generative modeling in the Machine Learning community and simulated physical processes in high energy particle physics by applying a novel Generative Adversarial Network (GAN) architecture to the production of jet images—2D representations of energy depositions from particles interacting with a calorimeter. We propose a simple architecture, the Location-Aware Generative Adversarial Network, that learns to produce realistic radiation patterns from simulated high energy particle collisions. The pixel intensities of GAN-generated images faithfully span many orders of magnitude and exhibit the desired low-dimensional physical properties (e.g., jet mass, n-subjettiness). We shed light on limitations, and provide a novel empirical validation of image quality and of the validity of GAN-produced simulations of the natural world. This work provides a basis for further exploration of GANs for faster simulation in high energy particle physics.


Notes

  1. Full simulation can take up to \({\mathcal {O}}(\text {min/event})\).

  2. While the azimuthal angle \(\phi\) is a true angle, pseudorapidity \(\eta =-\ln \tan (\theta /2)\) is only a proxy for the polar angle \(\theta\). However, the radiation pattern is nearly symmetric in \(\phi\) and \(\eta\), and so these standard coordinates are used to describe the jet constituent locations.

  3. For more details about this rotation, which slightly differs from Ref. [20], see Appendix B.

  4. Bicubic spline interpolation in the rotation process causes a large number of pixels to be interpolated between their original value and zero, the most likely intensity value of neighboring cells. Though a zero-order interpolation would solve sparsity problems, we empirically determine that the loss in jet-observable resolution is not worth the sparsity preservation. A more in-depth discussion can be found in Appendix B.

  5. Similar plots for the average signal and background images are shown in Figs. 25, 26.
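The trade-off described in footnote 4—higher-order interpolation destroys sparsity, zero-order interpolation preserves it—can be seen with a toy rotation. The sketch below is illustrative only (the paper's pipeline uses bicubic splines; here nearest-neighbour sampling stands in for zero-order and bilinear for higher-order interpolation, and the image contents are made up):

```python
import numpy as np

def rotate_image(img, alpha, order=1):
    """Rotate a pixel grid about its center by angle alpha (radians).

    order=0 -> nearest-neighbour sampling (preserves sparsity),
    order=1 -> bilinear interpolation (smears zeros into small values).
    Hypothetical helper for illustration only.
    """
    n, m = img.shape
    cy, cx = (n - 1) / 2.0, (m - 1) / 2.0
    yy, xx = np.mgrid[0:n, 0:m]
    # inverse-rotate each output coordinate back into the input frame
    c, s = np.cos(alpha), np.sin(alpha)
    ys = c * (yy - cy) + s * (xx - cx) + cy
    xs = -s * (yy - cy) + c * (xx - cx) + cx
    out = np.zeros_like(img, dtype=float)
    if order == 0:
        yi, xi = np.rint(ys).astype(int), np.rint(xs).astype(int)
        ok = (yi >= 0) & (yi < n) & (xi >= 0) & (xi < m)
        out[ok] = img[yi[ok], xi[ok]]
    else:
        y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
        wy, wx = ys - y0, xs - x0
        for dy in (0, 1):
            for dx in (0, 1):
                yi, xi = y0 + dy, x0 + dx
                ok = (yi >= 0) & (yi < n) & (xi >= 0) & (xi < m)
                w = np.where(dy, wy, 1 - wy) * np.where(dx, wx, 1 - wx)
                out[ok] += w[ok] * img[yi[ok], xi[ok]]
    return out

# a sparse toy "jet image": a handful of hot pixels on a 25x25 grid
rng = np.random.default_rng(0)
img = np.zeros((25, 25))
idx = rng.integers(5, 20, size=(6, 2))
img[idx[:, 0], idx[:, 1]] = rng.uniform(1, 100, size=6)

nn = rotate_image(img, np.pi / 7, order=0)
bl = rotate_image(img, np.pi / 7, order=1)
# nearest-neighbour keeps roughly the original number of nonzero pixels,
# while interpolation spreads each hot pixel over several neighbours
print(np.count_nonzero(nn), np.count_nonzero(bl))
```

The interpolated image has markedly more nonzero pixels, which is exactly the sparsity loss footnote 4 describes.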

References

  1. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks. In: Advances in neural information processing systems, pp 2672–2680

  2. Odena A, Olah C, Shlens J (2016) Conditional image synthesis with auxiliary classifier GANs. arXiv:1610.09585

  3. Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P (2016) InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. arXiv:1606.03657

  4. Salimans T, Goodfellow IJ, Zaremba W, Cheung V, Radford A, Chen X (2016) Improved techniques for training GANs. arXiv:1606.03498

  5. Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv:1411.1784

  6. Odena A (2016) Semi-supervised learning with generative adversarial networks. arXiv:1606.01583

  7. Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv:1511.06434

  8. Reed SE, Akata Z, Mohan S, Tenka S, Schiele B, Lee H (2016) Learning what and where to draw. arXiv:1610.02454

  9. Reed SE, Akata Z, Yan X, Logeswaran L, Schiele B, Lee H (2016) Generative adversarial text to image synthesis. arXiv:1605.05396

  10. Zhang H, Xu T, Li H, Zhang S, Huang X, Wang X, Metaxas D (2016) StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. arXiv:1612.03242

  11. Goodfellow IJ (2014) On distinguishability criteria for estimating generative models. arXiv:1412.6515

  12. Aad G et al (2010) The ATLAS simulation infrastructure. Eur Phys J C 70:823–874


  13. CMS Collaboration (2006) CMS Physics: Technical Design Report Volume 1: Detector Performance and Software, Geneva: CERN, p 521

  14. Agostinelli S et al (2003) GEANT4: a simulation toolkit. Nucl Instrum Methods A506:250–303


  15. Edmonds K, Fleischmann S, Lenz T, Magass C, Mechnich J, Salzburger A (2008) The fast ATLAS track simulation (FATRAS). Tech. Rep. ATL-SOFT-PUB-2008-001, CERN, Geneva

  16. Beckingham M, Duehrssen M, Schmidt E, Shapiro M, Venturi M, Virzi J, Vivarelli I, Werner M, Yamamoto S, Yamanaka T (2010) The simulation principle and performance of the ATLAS fast calorimeter simulation FastCaloSim. Tech. Rep. ATL-PHYS-PUB-2010-013, CERN, Geneva

  17. Abdullin S, Azzi P, Beaudette F, Janot P, Perrotta A (2011) The fast simulation of the CMS detector at LHC. J Phys Conf Ser 331:032049


  18. Childers JT, Uram TD, LeCompte TJ, Papka ME, Benjamin DP (2015) Simulation of LHC events on a million threads. J Phys Conf Ser 664(9):092006


  19. Cogan J, Kagan M, Strauss E, Schwarztman A (2015) Jet-images: computer vision inspired techniques for jet tagging. JHEP 02:118


  20. de Oliveira L, Kagan M, Mackey L, Nachman B, Schwartzman A (2016) Jet-images—deep learning edition. JHEP 07:069

  21. Almeida LG, Backović M, Cliche M, Lee SJ, Perelstein M (2015) Playing tag with ANN: boosted top identification with pattern recognition. JHEP 07:086


  22. Komiske PT, Metodiev EM, Schwartz MD (2016) Deep learning in color: towards automated quark/gluon jet discrimination. arXiv:1612.01551

  23. Barnard J, Dawe EN, Dolan MJ, Rajcic N (2016) Parton shower uncertainties in jet substructure analyses with deep neural networks. arXiv:1609.00607

  24. Baldi P, Bauer K, Eng C, Sadowski P, Whiteson D (2016) Jet substructure classification in high-energy physics with deep neural networks. Phys Rev D 93(9):094034


  25. Chatrchyan S et al (2008) The CMS experiment at the CERN LHC. JINST 3:S08004


  26. Cacciari M, Salam GP, Soyez G (2008) The catchment area of jets. JHEP 0804:005. doi:10.1088/1126-6708/2008/04/005


  27. Cacciari M, Salam GP, Soyez G (2012) FastJet user manual. Eur Phys J C 72:1896


  28. Krohn D, Thaler J, Wang L-T (2010) Jet trimming. JHEP 1002:084


  29. Sjostrand T, Mrenna S, Skands PZ (2008) A brief introduction to PYTHIA 8.1. Comput Phys Commun 178:852–867


  30. Sjostrand T, Mrenna S, Skands PZ (2006) PYTHIA 6.4 physics and manual. JHEP 0605:026

  31. Nachman B, de Oliveira L, Paganini M (2017) Dataset release—Pythia generated jet images for location aware generative adversarial network training. doi:10.17632/4r4v785rgx.1

  32. de Oliveira L, Paganini M (2017) lukedeo/adversarial-jets: initial code release. doi:10.5281/zenodo.400708

  33. Maas AL, Hannun AY, Ng AY (2013) Rectifier nonlinearities improve neural network acoustic models. In: ICML workshop on deep learning, vol 28

  34. Chintala S (2016) How to train a GAN? In: NIPS, workshop on generative adversarial networks

  35. Maas AL, Hannun AY, Ng AY (2013) Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of the 30th international conference on machine learning, Atlanta, Georgia, USA

  36. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167

  37. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. CoRR. arXiv:1412.6980

  38. Chollet F (2017) Keras. https://github.com/fchollet/keras

  39. Abadi M et al (2015) TensorFlow: large-scale machine learning on heterogeneous systems. https://www.tensorflow.org/

  40. Thaler J, Van Tilburg K (2011) Identifying boosted objects with N-subjettiness. JHEP 1103:015


  41. Larkoski AJ, Neill D, Thaler J (2014) Jet shapes with the broadening axis. JHEP 04:017


  42. Goodfellow IJ et al (2013) Maxout networks. arXiv:1302.4389

  43. Rubner Y, Tomasi C, Guibas LJ (2000) The earth mover’s distance as a metric for image retrieval. Int J Comput Vis 40:99–121



Acknowledgements

The authors would like to thank Ian Goodfellow for insightful deep learning related discussion, and would like to acknowledge Wahid Bhimji, Zach Marshall, Mustafa Mustafa, Chase Shimmin, and Paul Tipton, who helped refine our narrative.

Author information

Correspondence to Michela Paganini.

Ethics declarations

Conflict of Interest

The work of Benjamin Nachman and Michela Paganini was supported in part by the Office of High Energy Physics of the U.S. Department of Energy under contracts DE-AC02-05CH11231 and DE-FG02-92ER40704. Luke de Oliveira is founder and CEO at Vai Technologies, LLC.

Appendices

Appendix A: Additional Material

See Figs. 25, 26, 27, 28, 29, 30, 31, 32 and 33.

Fig. 25

Average signal image produced by Pythia (left) and by the GAN (right), displayed on log scale, with the difference between the two (center), displayed on linear scale

Fig. 26

Average background image produced by Pythia (left) and by the GAN (right), displayed on log scale, with the difference between the two (center), displayed on linear scale

Fig. 27

The average signal Pythia image labeled as real (left), as fake (right), and the difference between these two (middle) plotted on linear scale

Fig. 28

Average background Pythia image labeled as real (left), as fake (right), and the difference between these two (middle) plotted on linear scale

Fig. 29

Average signal GAN-generated image labeled as real (left), as fake (right), and the difference between these two (middle) plotted on linear scale

Fig. 30

Average background GAN-generated image labeled as real (left), as fake (right), and the difference between these two (middle) plotted on linear scale

Fig. 31

Difference between the average signal and the average background images labeled as real, produced by Pythia (left) and by the GAN (right), both displayed on linear scale

Fig. 32

Difference between the average signal and the average background images labeled as fake, produced by Pythia (left) and by the GAN (right), both displayed on linear scale

Fig. 33

Normalized confusion matrices showing the correlation of the predicted D output with the true physical process used to produce the images. The matrices are plotted for all images (left), Pythia images only (center) and GAN images only (right)

Appendix B: Image Pre-processing

Reference [20] contains a detailed discussion of the impact of image pre-processing on the information content of the image. For example, it is shown that normalizing each image removes a significant amount of information about the jet mass. One important step that was not fully discussed there is the rotational symmetry about the jet axis. It was shown in Ref. [20] that a rotation about the jet axis in \(\eta -\phi\), i.e. \(\eta _i\mapsto \cos (\alpha )\eta _i+\sin (\alpha )\phi _i\), \(\phi _i\mapsto \cos (\alpha )\phi _i-\sin (\alpha )\eta _i\), where \(\alpha\) is the rotation angle and i runs over the constituents of the jet, does not preserve the jet mass. One can instead perform a proper rotation about the x-axis (keeping the leading subjet at \(\phi =0\)) via

$$\begin{aligned} p_{x,i}&\mapsto p_{x,i}\end{aligned}$$
(7)
$$\begin{aligned} p_{y,i}&\mapsto p_{y,i}\cos (\beta )+p_{z,i}\sin (\beta )\end{aligned}$$
(8)
$$\begin{aligned} p_{z,i}&\mapsto p_{z,i}\cos (\beta )-p_{y,i}\sin (\beta )\end{aligned}$$
(9)
$$\begin{aligned} E_i&\mapsto E_i, \end{aligned}$$
(10)

where

$$\begin{aligned} \beta = -\text {atan}\left( \frac{p_\text {y,translated subjet 2}}{p_\text {z,translated subjet 2}}\right) - \pi /2. \end{aligned}$$
(11)
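As a numerical illustration (a minimal numpy sketch, not the paper's code), the rotation in Eqs. (7)–(10) can be applied to constituent four-momenta; being a proper spatial rotation, it preserves the jet invariant mass exactly, unlike the \(\eta -\phi\) rotation above. The constituent momenta below are made-up numbers, and \(\beta\) is computed here from the jet's summed momentum rather than the second subjet's coordinates used in Eq. (11):

```python
import numpy as np

def rotate_x(p, beta):
    """Apply Eqs. (7)-(10): rotate four-momenta [[E, px, py, pz], ...]
    about the x-axis by angle beta (E and px are unchanged)."""
    E, px, py, pz = p.T
    return np.stack([E,
                     px,
                     py * np.cos(beta) + pz * np.sin(beta),
                     pz * np.cos(beta) - py * np.sin(beta)], axis=1)

def jet_mass(p):
    """Invariant mass of the summed constituent four-momentum."""
    E, px, py, pz = p.sum(axis=0)
    return float(np.sqrt(max(E**2 - px**2 - py**2 - pz**2, 0.0)))

# toy massless constituents (hypothetical numbers, not from the paper)
rng = np.random.default_rng(1)
mom = rng.uniform(1.0, 10.0, size=(5, 3))                # px, py, pz
p = np.column_stack([np.linalg.norm(mom, axis=1), mom])  # E = |p|

# beta in the spirit of Eq. (11), computed here from the summed momentum
# purely for illustration
beta = -np.arctan2(p[:, 2].sum(), p[:, 3].sum()) - np.pi / 2
m_before, m_after = jet_mass(p), jet_mass(rotate_x(p, beta))
print(m_before, m_after)  # a proper rotation preserves the jet mass
```

The two masses agree to floating-point precision, whereas applying the naive \(\eta -\phi\) rotation to the same constituents would shift the mass.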

Figure 34 quantifies the information lost by various preprocessing steps, highlighting in particular the rotation step. A ROC curve is constructed for a classifier that tries to distinguish the preprocessed variable from the unprocessed one; if the two cannot be distinguished, then no information has been lost. Similar plots showing the degradation in signal-versus-background classification performance are shown in Fig. 35. The best fully preprocessed option under all metrics is Pix + Trans + Rotation (Cubic) + Renorm. This option uses the cubic spline interpolation from Ref. [20], but adds a small additional step that ensures that the sum of the pixel intensities is the same before and after rotation. This is the procedure used throughout the body of the manuscript.
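This distinguishability test can be summarized by the area under the ROC curve: an AUC of 0.5 means the preprocessed and unprocessed distributions are statistically indistinguishable, i.e. no information was lost. A minimal numpy sketch with toy lognormal "mass" spectra (the distributions and the 0.8 scale factor are illustrative, not from the paper):

```python
import numpy as np

def auc(a, b):
    """Mann-Whitney estimate of the ROC area for separating sample a
    from sample b: the probability that a random draw from a exceeds
    one from b, with ties counted as 1/2. AUC = 0.5 means the two
    samples are indistinguishable by thresholding."""
    a = np.asarray(a)[:, None]
    b = np.asarray(b)[None, :]
    return float(np.mean((a > b) + 0.5 * (a == b)))

rng = np.random.default_rng(2)
m = rng.lognormal(mean=4.0, sigma=0.3, size=2000)  # toy "jet mass" spectrum

# a lossless step leaves the distribution unchanged: AUC ~ 0.5
identical = auc(m, rng.lognormal(4.0, 0.3, 2000))
# a lossy step (here an overall 0.8 rescaling) is detectable: AUC far from 0.5
shifted = auc(m, 0.8 * rng.lognormal(4.0, 0.3, 2000))
print(round(identical, 2), round(shifted, 2))
```

Scanning a threshold over the two samples would trace out the full ROC curve; the AUC condenses it into the single number compared across preprocessing options.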

Fig. 34

ROC curves quantifying the information lost about the jet mass (left) or n-subjettiness (right) after pre-processing. A preprocessing step that does not lose any information will lie exactly on the random classifier line \(f(x)=1/x\). Both plots use signal boosted W boson jets for illustration

Fig. 35

ROC curves for classifying signal versus background based only on the mass (left) or n-subjettiness (right). Note that in some cases, the preprocessing can actually improve discrimination (but always degrades the information content—see Fig. 34)


Cite this article

de Oliveira, L., Paganini, M. & Nachman, B. Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis. Comput Softw Big Sci 1, 4 (2017). https://doi.org/10.1007/s41781-017-0004-6
