Introduction to Neural Networks

Machine Learning Meets Quantum Physics

Part of the book series: Lecture Notes in Physics (LNP, volume 968)

Abstract

Machine learning has become an essential tool for extracting regularities from data and for making inferences. Neural networks, in particular, provide the scalability and flexibility needed to convert complex datasets into structured and well-generalizing models. Pretrained models have greatly facilitated the application of neural networks to image and text data. Application to other types of data, e.g., in physics, remains more challenging and often requires ad hoc approaches. In this chapter, we give an introduction to neural networks with a focus on the latter applications. We present practical steps that ease the training of neural networks and then review simple approaches for introducing prior knowledge into the model. The discussion is supported by theoretical arguments as well as examples showing how well-performing neural networks can be implemented easily in modern neural network frameworks.

Notes

  1. https://pytorch.org/

  2. https://keras.io/

Acknowledgements

This work was supported by the German Ministry for Education and Research as Berlin Center for Machine Learning (01IS18037I). The author is grateful to Klaus-Robert Müller for the valuable feedback.

Author information

Corresponding author

Correspondence to Grégoire Montavon.

Copyright information

© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Montavon, G. (2020). Introduction to Neural Networks. In: Schütt, K., Chmiela, S., von Lilienfeld, O., Tkatchenko, A., Tsuda, K., Müller, KR. (eds) Machine Learning Meets Quantum Physics. Lecture Notes in Physics, vol 968. Springer, Cham. https://doi.org/10.1007/978-3-030-40245-7_4
