Abstract
Machine learning consists of designing algorithms that exploit data (sometimes called observations) to acquire domain knowledge and perform an automated decision-making task. Unlike most conventional computing tasks, learning algorithms are data-dependent, in the sense that they build task-specific models and improve upon them using the data fed to them. Machine learning algorithms are commonly divided into four classes: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Each of these classes has its own interest and peculiarities. In this book, we focus on supervised classification to illustrate and formalize the main concepts of robust machine learning. Most of the robustness techniques we discuss in the book, however, can also be applied to the other classes. In this chapter, we present the fundamentals of supervised learning through the specifics of the supervised classification task, and we review some of the standard optimization algorithms used to solve it.
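To ground these notions, the sketch below trains a linear classifier with stochastic gradient descent on the logistic loss, i.e., the kind of supervised classification pipeline this chapter formalizes. It is a minimal illustration only: the synthetic data, constants, and function names are ours, not the book's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data (illustrative only):
# two Gaussian clusters labeled 0 and 1.
n, d = 200, 2
X = np.vstack([rng.normal(-1.0, 1.0, size=(n // 2, d)),
               rng.normal(+1.0, 1.0, size=(n // 2, d))])
y = np.concatenate([np.zeros(n // 2), np.ones(n // 2)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_logistic(theta, x_i, y_i):
    """Gradient of the logistic (cross-entropy) loss on a single sample."""
    return (sigmoid(x_i @ theta) - y_i) * x_i

# Stochastic gradient descent on the empirical risk:
# one model update per (shuffled) training sample.
theta = np.zeros(d)
gamma = 0.1  # step size
for epoch in range(50):
    for i in rng.permutation(n):
        theta -= gamma * grad_logistic(theta, X[i], y[i])

accuracy = np.mean((sigmoid(X @ theta) > 0.5) == y)
print(f"training accuracy: {accuracy:.2f}")
```

The data-dependence mentioned above is visible here: the model \(\boldsymbol{\theta}\) is nothing but an accumulation of updates computed from the observations fed to the algorithm.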
Notes
- 1. The animal icons we use in this figure are courtesy of the Freepik team. See the following links for the respective icons: Elephant, Chicken, Mouse and Snake.
- 2. Hint: When \(\lVert \nabla \mathcal{L}(\boldsymbol{\theta}) \rVert > 0\), there exists \(\delta > 0\) such that \(\frac{\psi(\gamma \nabla \mathcal{L}(\boldsymbol{\theta}))}{\gamma \lVert \nabla \mathcal{L}(\boldsymbol{\theta}) \rVert} < \frac{1}{2} \lVert \nabla \mathcal{L}(\boldsymbol{\theta}) \rVert\) for all \(\gamma\) such that \(\gamma \lVert \nabla \mathcal{L}(\boldsymbol{\theta}) \rVert < \delta\). A numerical illustration of the resulting decrease appears after these notes.
- 3. We use the Bachmann–Landau notation \(\mathcal{O}_a(\cdot)\) to describe the asymptotic behavior of a function as \(a \to +\infty\). See section “Notation” for more details.
- 4. This condition is referred to as the second-order sufficient condition for optimality; its standard statement is displayed after these notes.
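As a companion to note 2, the following numerical check illustrates why a small enough step size \(\gamma\) guarantees a decrease of the loss: as \(\gamma\) shrinks, the first-order term \(\gamma \lVert \nabla \mathcal{L}(\boldsymbol{\theta}) \rVert^2\) dominates the remainder \(\psi\). The test function and constants are our own choices for illustration, not taken from the book.

```python
import numpy as np

def loss(theta):
    """A simple smooth test function (our choice, for illustration)."""
    return np.sum(theta ** 2) + np.sin(theta[0])

def grad(theta, eps=1e-6):
    """Central finite-difference gradient; accurate enough here."""
    g = np.zeros_like(theta)
    for j in range(theta.size):
        e = np.zeros_like(theta)
        e[j] = eps
        g[j] = (loss(theta + e) - loss(theta - e)) / (2 * eps)
    return g

theta = np.array([2.0, -1.5])
for gamma in [1.0, 0.1, 0.01]:
    g = grad(theta)
    decrease = loss(theta) - loss(theta - gamma * g)
    # As gamma shrinks, the actual decrease approaches the
    # first-order prediction gamma * ||g||^2.
    print(f"gamma={gamma:5.2f}  decrease={decrease:+.4f}  "
          f"first-order term={gamma * np.dot(g, g):+.4f}")
```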
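For note 4, the condition in question is presumably the standard second-order sufficient condition, stated below in the chapter's notation (a textbook fact, not a quote from the book):

\[ \nabla \mathcal{L}(\boldsymbol{\theta}^\star) = \mathbf{0} \quad \text{and} \quad \nabla^2 \mathcal{L}(\boldsymbol{\theta}^\star) \succ 0 \quad \Longrightarrow \quad \boldsymbol{\theta}^\star \text{ is a strict local minimizer of } \mathcal{L}. \]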
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Guerraoui, R., Gupta, N., Pinot, R. (2024). Basics of Machine Learning. In: Robust Machine Learning. Machine Learning: Foundations, Methodologies, and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-97-0688-4_2
DOI: https://doi.org/10.1007/978-981-97-0688-4_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0687-7
Online ISBN: 978-981-97-0688-4
eBook Packages: Computer Science, Computer Science (R0)