Abstract
In this chapter, we present a more detailed and more formal review of two schools of learning approaches: global learning and local learning. We first provide a hierarchy graph, illustrated in Fig. 2.1, in which we classify many statistical models into their proper categories, either global learning or local learning. Our review follows this hierarchy. For clarity, filled shapes in the graph highlight our own work.
Copyright information
© 2008 Zhejiang University Press, Hangzhou and Springer-Verlag GmbH Berlin Heidelberg
About this chapter
Cite this chapter
(2008). Global Learning vs. Local Learning. In: Machine Learning. Advanced Topics in Science and Technology in China. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79452-3_2
DOI: https://doi.org/10.1007/978-3-540-79452-3_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79451-6
Online ISBN: 978-3-540-79452-3
eBook Packages: Computer Science, Computer Science (R0)