Machine Learning: A Concise Overview

Duarte, Denio; Ståhl, Niclas

doi:10.1007/978-3-319-97556-6_3

Part of the book series: Studies in Big Data (SBD, volume 46)

Abstract

Machine learning is a sub-field of computer science that aims to make computers learn. This is a simplified view of the field, but ever since the first computer was built, we have wondered whether computers can learn as we do.

Notes

1. Other classes of machine learning algorithms exist, such as semi-supervised learning, reinforcement learning, and recommender systems. In this chapter, we focus on the two most popular ones. We refer readers to [32] for the classes not covered here.
2. For some representations, zero is not a good initial value. Random values between 0 and 1 work in most cases.
3. Select a small value for \(\alpha \), say 0.01, and plot \(J(\varTheta )\) to see how gradient descent is converging; then increase \(\alpha \) (e.g., by doubling it) until the expected convergence is reached.
4. Multiplication is \(O(n^2)\), and inversion is \(O(n^3)\).
5. The F\(_1\)-score is a specialization of the F\(_{\beta }\)-score, which is not covered in this chapter.
6. The definition of eigenvectors can be found in standard linear algebra textbooks.
7. For the sake of simplicity, we also treat an invalid value for a feature (e.g., mixed characters and numerical values) as missing data.
8. Also known as multivariate or multi-output regression.
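
The learning-rate heuristic in note 3 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the chapter: the model is a one-feature linear regression with a mean squared error cost, the function names (`cost`, `gradient_descent`, `converges`, `pick_alpha`) are hypothetical, and "expected convergence" is interpreted as "the cost never increases between iterations". Instead of plotting \(J(\varTheta )\), the sketch inspects the recorded cost history.

```python
def cost(theta, xs, ys):
    """Mean squared error cost J(theta) for the line y = theta[0] + theta[1] * x."""
    m = len(xs)
    return sum((theta[0] + theta[1] * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha, steps=200):
    """Run batch gradient descent from theta = [0, 0]; return (theta, cost history)."""
    m = len(xs)
    theta = [0.0, 0.0]
    history = [cost(theta, xs, ys)]
    for _ in range(steps):
        errors = [theta[0] + theta[1] * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta = [theta[0] - alpha * grad0, theta[1] - alpha * grad1]
        history.append(cost(theta, xs, ys))
    return theta, history


def converges(history):
    """Assumed convergence check: the cost never increases from one step to the next."""
    return all(b <= a for a, b in zip(history, history[1:]))


def pick_alpha(xs, ys, alpha=0.01, max_doublings=20):
    """Start small and keep doubling alpha while the cost curve still decreases.

    Returns the last alpha that passed the check, or None if even the
    starting value failed.
    """
    best = None
    for _ in range(max_doublings):
        _, history = gradient_descent(xs, ys, alpha)
        if not converges(history):
            break
        best = alpha
        alpha *= 2
    return best


# Toy data generated from y = 1 + 2x (illustrative only).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
print("selected alpha:", pick_alpha(xs, ys))
```

The largest learning rate that passes the check depends on the scale of the features, so the value found on one dataset should not be expected to carry over to another.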

References

Bengio, Y., Courville, A., & Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798–1828.
Boutell, M. R., Luo, J., Shen, X., & Brown, C. M. (2004). Learning multi-label scene classification. Pattern Recognition, 37(9), 1757–1771.
Bratko, I., Michalski, R. S., & Kubat, M. (1999). Machine learning and data mining: Methods and applications.
Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2015). Gated feedback recurrent neural networks. In International Conference on Machine Learning (pp. 2067–2075).
Connor, J. T., Martin, R. D., & Atlas, L. E. (1994). Recurrent neural networks and robust time series prediction. IEEE Transactions on Neural Networks, 5(2), 240–254.
De Houwer, J., Barnes-Holmes, D., & Moors, A. (2013). What is learning? On the nature and merits of a functional definition of learning. Psychonomic Bulletin & Review, 20(4), 631–642.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 1–38.
Domingos, P. (2012). A few useful things to know about machine learning. Communications of the ACM, 55(10), 78–87.
Erhan, D., Bengio, Y., Courville, A., Manzagol, P. A., Vincent, P., & Bengio, S. (2010). Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research 11, 625–660.
Färber, I., Günnemann, S., Kriegel, H. P., Kröger, P., Müller, E., & Schubert, E., et al. (2010). On using class-labels in evaluation of clusterings. In Multiclust: 1st International Workshop on Discovering, Summarizing and Using Multiple Clusterings Held in Conjunction with KDD (p. 1).
Fujita, A., Takahashi, D. Y., & Patriota, A. G. (2014). A non-parametric method to estimate the number of clusters. Computational Statistics & Data Analysis, 73, 27–39.
Gauthier, J. (2014). Conditional generative adversarial nets for convolutional face generation. In Class Project for Stanford CS231N: Convolutional neural networks for visual recognition (Vol. 2014, No. 5, p. 2). Winter Semester.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., & Ozair, S., et al. (2014). Generative adversarial nets. In Advances in neural information processing systems (pp. 2672–2680).
Graves, A. (2013). Generating sequences with recurrent neural networks. arXiv:1308.0850.
Gunn, S. R. (1998). Support vector machines for classification and regression. Technical report, Faculty of Engineering, Science and Mathematics–School of Electronics and Computer Science.
Hebb, D. (1949). The organization of behavior: A neuropsychological theory. Wiley.
Izenman, A. J. (2008). Modern multivariate statistical techniques. Regression, classification and manifold learning.
Jain, A. K. (2010). Data clustering: 50 years beyond k-means. Pattern Recognition Letters, 31(8), 651–666.
Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering: A review. ACM Computing Surveys (CSUR), 31(3), 264–323.
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260.
Keogh, E., & Mueen, A. (2010). Curse of dimensionality. US: Springer.
Le Cun, Y., Touretzky, D., Hinton, G., & Sejnowski, T. (1988). A theoretical framework for back-propagation. In The connectionist models summer school (Vol. 1, pp. 21–28).
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.
LeCun, Y., Jackel, L., Bottou, L., Brunot, A., Cortes, C., & Denker, J., et al. (1995). Comparison of learning algorithms for handwritten digit recognition. In International Conference on Artificial Neural Networks, Perth, Australia (Vol. 60, pp. 53–60).
Loh, W. Y. (2011). Classification and regression trees. In Wiley interdisciplinary reviews: Data mining and knowledge discovery (Vol. 1, No. 1, pp. 14–23).
Maaten, L. V. D., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research 9, 2579–2605.
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), 115–133.
Mikolov, T., & Zweig, G. (2012). Context dependent recurrent neural network language model. SLT, 12, 234–239.
Minsky, M., & Papert, S. (1969). Perceptrons.
Mirza, M., & Osindero, S. (2014). Conditional generative adversarial nets. arXiv:1411.1784.
Mitchell, T. M. (1997). Machine learning (1st ed.). New York, NY, USA: McGraw-Hill Inc.
Molina, L. C., Belanche, L., & Nebot, A. (2002). Feature selection algorithms: A survey and experimental evaluation. In Proceedings of the 2002 IEEE International Conference on Data Mining (pp. 306–313).
Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML 2010) (pp. 807–814).
Oja, E. (1997). The nonlinear PCA learning rule in independent component analysis. Neurocomputing, 17(1), 25–45.
Pascanu, R., Mikolov, T., & Bengio, Y. (2013). On the difficulty of training recurrent neural networks. ICML, 3(28), 1310–1318.
Pineda, F. J. (1987). Generalization of back-propagation to recurrent neural networks. Physical Review Letters, 59(19), 2229.
Radford, A., Metz, L., & Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv:1511.06434.
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386.
Samuel, A. L. (1959). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3(3), 210–229.
Smola, A. J., Vishwanathan, S. V. N., & Hofmann, T. (2005). Kernel methods for missing variables. In R. G. Cowell, & Z. Ghahramani (eds.) Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (pp. 325–332). Society for Artificial Intelligence and Statistics.
Weisberg, S. (2005). Applied linear regression (Vol. 528). Wiley.
Willmott, C. J. (1982). Some comments on the evaluation of model performance. Bulletin of the American Meteorological Society, 63(11), 1309–1313.
Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In European Conference on Computer Vision (pp. 818–833). Springer.
Zhang, S. (2011). Shell-neighbor method and its application in missing data imputation. Applied Intelligence, 35(1), 123–133.

Acknowledgements

Denio Duarte is partially funded by Coordenadoria de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Brazil, under process number 88881.119081/2016-01, during his visit to the Skövde Artificial Intelligence Laboratory (SAIL) at the University of Skövde (HiS).

Author information

Authors and Affiliations

Universidade Federal da Fronteira Sul, Chapecó, Brazil
Denio Duarte
University of Skövde, Skövde, Sweden
Denio Duarte & Niclas Ståhl

Corresponding author

Correspondence to Denio Duarte.

Editor information

Editors and Affiliations

University of Skövde, Skövde, Sweden
Alan Said
University of Skövde, Skövde, Sweden
Vicenç Torra

About this chapter

Cite this chapter

Duarte, D., Ståhl, N. (2019). Machine Learning: A Concise Overview. In: Said, A., Torra, V. (eds) Data Science in Practice. Studies in Big Data, vol 46. Springer, Cham. https://doi.org/10.1007/978-3-319-97556-6_3

DOI: https://doi.org/10.1007/978-3-319-97556-6_3
Published: 20 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97555-9
Online ISBN: 978-3-319-97556-6
eBook Packages: Intelligent Technologies and Robotics (R0)
