Global Learning vs. Local Learning

  • Chapter
Machine Learning

Part of the book series: Advanced Topics in Science and Technology in China (ATSTC)

Abstract

In this chapter, we present a more detailed and formal review of two schools of learning approaches: global learning and local learning. We first provide a hierarchy graph, illustrated in Fig. 2.1, in which we attempt to classify many statistical models as either global learning or local learning. Our review follows this hierarchy. For clarity, filled shapes in the graph highlight our own work.

Copyright information

© 2008 Zhejiang University Press, Hangzhou and Springer-Verlag GmbH Berlin Heidelberg

About this chapter

Cite this chapter

(2008). Global Learning vs. Local Learning. In: Machine Learning. Advanced Topics in Science and Technology in China. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79452-3_2

  • DOI: https://doi.org/10.1007/978-3-540-79452-3_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-79451-6

  • Online ISBN: 978-3-540-79452-3

  • eBook Packages: Computer Science (R0)
