Global Learning vs. Local Learning

doi:10.1007/978-3-540-79452-3_2

Part of the book series: Advanced Topics in Science and Technology in China (ATSTC)

Abstract

In this chapter, we give a more detailed and more formal review of two schools of learning approaches: global learning and local learning. We first provide a hierarchy graph, illustrated in Fig. 2.1, in which we classify many statistical models as either global learning or local learning. Our review then follows this hierarchy. For clarity, filled shapes in the graph highlight our own work.

About this chapter

Cite this chapter

(2008). Global Learning vs. Local Learning. In: Machine Learning. Advanced Topics in Science and Technology in China. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79452-3_2

DOI: https://doi.org/10.1007/978-3-540-79452-3_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79451-6
Online ISBN: 978-3-540-79452-3
eBook Packages: Computer Science, Computer Science (R0)