Abstract
Machine learning has become an essential tool for extracting regularities from data and for making inferences. Neural networks, in particular, provide the scalability and flexibility needed to convert complex datasets into structured, well-generalizing models. Pretrained models have greatly facilitated the application of neural networks to image and text data. Application to other types of data, e.g., in physics, remains more challenging and often requires ad hoc approaches. In this chapter, we give an introduction to neural networks with a focus on the latter applications. We present practical steps that ease the training of neural networks, and then review simple approaches for introducing prior knowledge into the model. The discussion is supported by theoretical arguments as well as examples showing how well-performing neural networks can be implemented easily in modern neural network frameworks.
Acknowledgements
This work was supported by the German Ministry for Education and Research as the Berlin Center for Machine Learning (01IS18037I). The author is grateful to Klaus-Robert Müller for valuable feedback.
Copyright information
© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Montavon, G. (2020). Introduction to Neural Networks. In: Schütt, K., Chmiela, S., von Lilienfeld, O., Tkatchenko, A., Tsuda, K., Müller, KR. (eds) Machine Learning Meets Quantum Physics. Lecture Notes in Physics, vol 968. Springer, Cham. https://doi.org/10.1007/978-3-030-40245-7_4
DOI: https://doi.org/10.1007/978-3-030-40245-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-40244-0
Online ISBN: 978-3-030-40245-7
eBook Packages: Physics and Astronomy (R0)