Introduction to Neural Networks

Montavon, Grégoire

doi:10.1007/978-3-030-40245-7_4

Part of the book series: Lecture Notes in Physics (LNP, volume 968)

Abstract

Machine learning has become an essential tool for extracting regularities from data and for making inferences. Neural networks, in particular, provide the scalability and flexibility needed to convert complex datasets into structured, well-generalizing models. Pretrained models have greatly facilitated the application of neural networks to image and text data. Application to other types of data, e.g., in physics, remains more challenging and often requires ad hoc approaches. In this chapter, we give an introduction to neural networks with a focus on the latter applications. We present practical steps that ease the training of neural networks, and then review simple approaches for introducing prior knowledge into the model. The discussion is supported by theoretical arguments as well as by examples showing how well-performing neural networks can be implemented easily in modern neural network frameworks.

Notes

1. https://pytorch.org/
2. https://keras.io/

Acknowledgements

This work was supported by the German Ministry for Education and Research as Berlin Center for Machine Learning (01IS18037I). The author is grateful to Klaus-Robert Müller for the valuable feedback.

Author information

Authors and Affiliations

Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
Grégoire Montavon

Corresponding author

Correspondence to Grégoire Montavon.

Editor information

Editors and Affiliations

Machine Learning, Technical University of Berlin, Berlin, Germany
Kristof T. Schütt
Machine Learning Group, Technical University of Berlin, Berlin, Germany
Stefan Chmiela
Institute of Physical Chemistry and MARVEL, University of Basel, Basel, Switzerland
O. Anatole von Lilienfeld
Department of Physics and Materials Science, University of Luxembourg, Luxembourg, Luxembourg
Alexandre Tkatchenko
Graduate School of Frontier Sciences, University of Tokyo, Kashiwa, Japan
Koji Tsuda
Computer Science, Technical University of Berlin, Berlin, Germany
Klaus-Robert Müller

Cite this chapter

Montavon, G. (2020). Introduction to Neural Networks. In: Schütt, K., Chmiela, S., von Lilienfeld, O., Tkatchenko, A., Tsuda, K., Müller, KR. (eds) Machine Learning Meets Quantum Physics. Lecture Notes in Physics, vol 968. Springer, Cham. https://doi.org/10.1007/978-3-030-40245-7_4

DOI: https://doi.org/10.1007/978-3-030-40245-7_4
Published: 04 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-40244-0
Online ISBN: 978-3-030-40245-7
eBook Packages: Physics and Astronomy (R0)
