Introduction to Machine Learning

  • F. Richard Yu
  • Ying He
Part of the SpringerBriefs in Electrical and Computer Engineering book series (BRIEFSELECTRIC)


Machine learning evolved from a collection of powerful techniques in artificial intelligence and has been used extensively in data mining, where it allows a system to learn useful structural patterns and models from training data. Machine learning algorithms can be broadly classified into four categories: supervised, unsupervised, semi-supervised, and reinforcement learning. This chapter introduces widely used machine learning algorithms and briefly explains each one with examples.



Copyright information

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2019

Authors and Affiliations

  • F. Richard Yu, Carleton University, Ottawa, Canada
  • Ying He, Carleton University, Ottawa, Canada
