Machine Learning

  • Xin Yao
  • Yong Liu


Machine learning is a very active subfield of artificial intelligence concerned with the development of computational models of learning. It draws on work in several disciplines: cognitive science, computer science, statistics, computational complexity, information theory, control theory, philosophy, and biology. Put simply, machine learning is learning by machines. From a computational point of view, it refers to the ability of a machine to improve its performance based on previous results. From a biological point of view, it is the study of how to create computers that learn from experience and modify their activity accordingly, in contrast to traditional computers, whose activity does not change unless a programmer explicitly changes it.
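The computational view, a machine improving its performance based on previous results, can be illustrated with a small sketch. The example below is not taken from the entry: the toy task (logical AND), the perceptron update rule, and all names (`DATA`, `predict`, `count_errors`, `train`) are assumptions chosen for illustration. The program adjusts its connection weights after each mistake, so its behaviour changes with experience rather than through explicit reprogramming.

```python
from typing import List, Tuple

Example = Tuple[Tuple[int, int], int]

# Toy task (assumed for illustration): logical AND of two binary inputs.
DATA: List[Example] = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights: List[float], bias: float, x: Tuple[int, int]) -> int:
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def count_errors(weights: List[float], bias: float) -> int:
    """Number of examples in DATA that the current weights misclassify."""
    return sum(predict(weights, bias, x) != y for x, y in DATA)


def train(epochs: int = 20, lr: float = 1.0):
    """Perceptron rule: after each mistake, nudge the weights toward the target."""
    weights = [0.0, 0.0]
    bias = 0.0
    history = [count_errors(weights, bias)]  # performance before any learning
    for _ in range(epochs):
        for x, y in DATA:
            error = y - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
        history.append(count_errors(weights, bias))  # performance after this pass
    return weights, bias, history


if __name__ == "__main__":
    weights, bias, history = train()
    print("errors per pass:", history)
    print("learned weights:", weights, "bias:", bias)
```

The error count need not fall on every pass, but because AND is linearly separable the perceptron rule eventually reaches zero errors on this task. A fixed program with hard-coded weights would, by contrast, keep making the same mistakes until someone edited it.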


Neural Network · Machine Learning · Learning Algorithm · Reinforcement Learning · Connection Weight



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. School of Computer Science, University of Birmingham, Birmingham, UK
  2. University of Aizu, Aizuwakamatsu, Japan
