Abstract
Two intelligent abilities and three inverse problems are re-elaborated from a probability-theoretic, two-pathway perspective, together with an overview of the challenges of statistical learning and the efforts made towards them. A detailed introduction is then provided to Bayesian Ying-Yang (BYY) harmony learning. First proposed in (Xu, 1995) and systematically developed over the past decade, this approach consists of a two-pathway BYY system as a general framework that unifies a number of typical learning models, and a best Ying-Yang harmony principle as a general theory for parameter learning and model selection. BYY harmony learning leads not only to a criterion that outperforms typical model selection criteria in a two-phase implementation, but also to model selection performed automatically during parameter learning for several typical learning tasks, with significant savings in computing cost. Beyond these fundamentals, several typical learning approaches are systematically compared and re-elaborated from the BYY harmony learning perspective. Finally, a brief account is given of the features and applications of a particular family called Gaussian manifold based BYY systems.
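For orientation, a minimal sketch of the principle just summarized, following the formulation in the cited BYY literature (e.g., Xu, 2007, Scholarpedia); the precise definitions appear in the chapter body. The Yang machine describes the pathway from observations $X$ to inner representations $Y$ via $p(Y \mid X)\,p(X)$, the Ying machine describes the reverse pathway via $q(X \mid Y)\,q(Y)$, and learning seeks the best Ying-Yang harmony by maximizing

$$ H(p\,\|\,q) \;=\; \int p(Y \mid X)\, p(X)\, \ln\!\big[\, q(X \mid Y)\, q(Y) \,\big]\, dY\, dX . $$

Maximizing $H$ fits the two machines to each other in a least-complexity manner, which is what allows redundant structural parts to be pruned during parameter learning, i.e., automatic model selection.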
References
Akaike, H.: A new look at the statistical model identification. IEEE Tr. Automatic Control 19, 716–723 (1974)
Akaike, H.: Likelihood of a model and information criteria. Journal of Econometrics 16, 3–14 (1981)
Amari, S., Cichocki, A., Yang, H.: A new learning algorithm for blind signal separation. In: Advances in NIPS, vol. 8, pp. 757–763. MIT Press, Cambridge (1996)
An, Y.J., et al.: A Comparative Investigation on Model Selection in Independent Factor Analysis. J. Mathematical Modelling and Algorithms 5, 447–473 (2006)
Barndorff-Nielsen, O.E.: Information and Exponential Families in Statistical Theory. Wiley, Chichester (1978)
Bourlard, H., Kamp, Y.: Auto-association by multilayer Perceptrons and singular value decomposition. Biological Cybernetics 59, 291–294 (1988)
Bozdogan, H.: Model Selection and Akaike’s Information Criterion: The general theory and its analytical extension. Psychometrika 52, 345–370 (1987)
Bozdogan, H., Ramirez, D.E.: FACAIC: Model selection algorithm for the orthogonal factor model using AIC and FACAIC. Psychometrika 53(3), 407–415 (1988)
Brown, L.: Fundamentals of Statistical Exponential Families. Institute of Mathematical Statistics, Hayward, CA (1986)
Cavanaugh, J.E.: Unifying the derivations for the Akaike and corrected Akaike information criteria. Statistics & Probability Letters 33, 201–208 (1997)
Dayan, P., Hinton, G.E., Neal, R.M., Zemel, R.S.: The Helmholtz machine. Neural Computation 7(5), 889–904 (1995)
Gilks, W.R., Richardson, S., Spiegelhalter, D.J.: Markov Chain Monte Carlo in Practice. Chapman and Hall, London (1996)
Girosi, F., et al.: Regularization theory and neural architectures. Neural Computation 7, 219–269 (1995)
Grossberg, S.: Adaptive pattern classification and universal recoding: I & II. Biological Cybernetics 23, 187–202 (1976)
Hinton, G.E., Zemel, R.S.: Autoencoders, minimum description length and Helmholtz free energy. In: Advances in NIPS, vol. 6, pp. 3–10 (1994)
Hinton, G.E., Dayan, P., Frey, B.J., Neal, R.M.: The wake-sleep algorithm for unsupervised neural networks. Science 268, 1158–1161 (1995)
Hu, X.L., Xu, L.: A Comparative Study on Selection of Cluster Number and Local Subspace Dimension in the Mixture PCA Models. In: Wang, J., Yi, Z., Żurada, J.M., Lu, B.-L., Yin, H. (eds.) ISNN 2006. LNCS, vol. 3971, pp. 1214–1221. Springer, Heidelberg (2006)
Hu, X.L., Xu, L.: A comparative investigation on subspace dimension determination. Neural Networks 17, 1051–1059 (2004)
Jaakkola, T.S.: Tutorial on variational approximation methods. In: Opper, M., Saad, D. (eds.) Advanced Mean Field Methods: Theory and Practice, pp. 129–160. MIT Press, Cambridge (2001)
Jordan, M., Ghahramani, Z., Jaakkola, T., Saul, L.: Introduction to variational methods for graphical models. Machine Learning 37, 183–233 (1999)
Kass, R.E., Raftery, A.E.: Bayes Factors. Journal of the American Statistical Association 90, 773–795 (1995)
MacKay, D.J.C.: Information Theory, Inference, and Learning Algorithms. Cambridge University Press, Cambridge (2003)
Neath, A.A., Cavanaugh, J.E.: Regression and time series model selection using variants of the Schwarz information criterion. Communications in Statistics A 26, 559–580 (1997)
Neal, R., Hinton, G.E.: A view of the EM algorithm that justifies incremental, sparse, and other variants. In: Jordan, M.I. (ed.) Learning in graphical models, pp. 355–368. MIT Press, Cambridge (1999)
Press, S.J.: Bayesian Statistics: Principles, Models, and Applications. John Wiley & Sons, Inc., Chichester (1989)
Poggio, T., Girosi, F.: Networks for approximation and learning. Proc. of IEEE 78, 1481–1497 (1990)
Redner, R.A., Walker, H.F.: Mixture densities, maximum likelihood, and the EM algorithm. SIAM Review 26, 195–239 (1984)
Rissanen, J.: Stochastic complexity and modeling. Annals of Statistics 14(3), 1080–1100 (1986)
Rissanen, J.: Stochastic Complexity in Statistical Inquiry. World Scientific, Singapore (1989)
Rivals, I., Personnaz, L.: On Cross Validation for Model Selection. Neural Computation 11, 863–870 (1999)
Rockafellar, R.: Convex Analysis. Princeton University Press, Princeton (1972)
Ó Ruanaidh, J.J.K., Fitzgerald, W.J.: Numerical Bayesian Methods Applied to Signal Processing. Springer, New York (1996)
Rustagi, J.: Variational Method in Statistics. Academic Press, New York (1976)
Schwarz, G.: Estimating the dimension of a model. Annals of Statistics 6, 461–464 (1978)
Shi, L.: Bayesian Ying-Yang harmony learning for local factor analysis: a comparative investigation. In: Tizhoosh, Ventresca (eds.) Oppositional Concepts in Computational Intelligence (Studies in CI). Springer, Heidelberg (2008)
Shi, L., Xu, L.: Local Factor Analysis with Automatic Model Selection: A Comparative Study and Digits Recognition Application. In: Kollias, S., Stafylopatis, A., Duch, W., Oja, E. (eds.) ICANN 2006. LNCS, vol. 4132, pp. 260–269. Springer, Heidelberg (2006)
Stone, M.: Cross-validatory choice and assessment of statistical prediction. J. Royal Statistical Society B 36, 111–147 (1974)
Stone, M.: An asymptotic equivalence of choice of model by cross-validation and Akaike’s criterion. J. Royal Statistical Society B 39(1), 44–47 (1977)
Stone, M.: Cross-validation: A review. Math. Operat. Statist. 9, 127–140 (1978)
Tikhonov, A.N., Arsenin, V.Y.: Solutions of Ill-posed Problems. Winston and Sons (1977)
Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, Heidelberg (1995)
Wallace, C.S., Boulton, D.M.: An information measure for classification. Computer Journal 11, 185–194 (1968)
Wallace, C.S., Freeman, P.R.: Estimation and inference by compact coding. J. Royal Statistical Society B 49(3), 240–265 (1987)
Wang, L., Feng, J.: Learning Gaussian mixture models by structural risk minimization. In: Proc. ICMLC 2005, August 19-21, Guangzhou, China, pp. 4858–4863 (2005)
Xu, L.: Machine learning problems from optimization perspective. Journal of Global Optimization (to appear, 2008)
Xu, L.: A unified perspective and new results on RHT computing, mixture based learning, and multi-learner based problem solving. Pattern Recognition 40, 2129–2153 (2007)
Xu, L.: Bayesian Ying Yang learning. Scholarpedia 2(3), 1809 (2007), http://scholarpedia.org/article/BayesianYingYangLearning
Xu, L.: Rival penalized competitive learning. Scholarpedia 2(8), 1810 (2007), http://scholarpedia.org/article/RivalPenalizedCompetitiveLearning
Xu, L.: A trend on regularization and model selection in statistical learning: a Bayesian Ying Yang learning perspective. In: Duch, W., Mandziuk, J. (eds.) Challenges for Computational Intelligence, pp. 365–406. Springer, Heidelberg (2007)
Xu, L.: Temporal BYY encoding, Markovian state spaces, and space dimension determination. IEEE Tr. Neural Networks 15, 1276–1295 (2004)
Xu, L.: Advances on BYY harmony learning: information theoretic perspective, generalized projection geometry, and independent factor auto-determination. IEEE Tr. Neural Networks 15, 885–902 (2004)
Xu, L.: Bayesian Ying Yang learning (I) & (II). In: Zhong, Liu (eds.) Intelligent Technologies for Information Analysis, pp. 615–706. Springer, Heidelberg (2004)
Xu, L.: BI-directional BYY learning for mining structures with projected polyhedra and topological map. In: Lin, Smale, Poggio, Liau (eds.) Proc. of FDM 2004: Foundations of Data Mining, Brighton, UK, pp. 5–18 (2004)
Xu, L.: BYY learning, regularized implementation, and model selection on modular networks with one hidden layer of binary units. Neurocomputing 51, 227–301 (2003)
Xu, L.: Data smoothing regularization, multi-sets-learning, and problem solving strategies. Neural Networks 15(5-6), 817–825 (2003)
Xu, L.: Independent component analysis and extensions with noise and time: a Bayesian Ying-Yang learning perspective. Neural Information Processing Letters and Reviews 1, 1–52 (2003)
Xu, L.: BYY harmony learning, structural RPCL, and topological self-organizing on unsupervised and supervised mixture models. Neural Networks 15, 1125–1151 (2002)
Xu, L.: Bayesian Ying Yang harmony learning. In: Arbib, M.A. (ed.) The Handbook of Brain Theory and Neural Networks, pp. 1231–1237. The MIT Press, Cambridge (2002)
Xu, L.: BYY harmony learning, independent state space and generalized APT financial analyses. IEEE Tr. Neural Networks 12, 822–849 (2001)
Xu, L.: Best harmony, unified RPCL and automated model selection for unsupervised and supervised learning on Gaussian mixtures, ME-RBF models and three-layer nets. Intl J. Neural Systems 11, 3–69 (2001)
Xu, L.: Temporal BYY learning for state space approach, hidden Markov model and blind source separation. IEEE Tr. on Signal Processing 48, 2132–2144 (2000)
Xu, L.: RBF nets, mixture experts, and Bayesian Ying-Yang learning. Neurocomputing 19(1-3), 223–257 (1998)
Xu, L.: Bayesian Ying-Yang machine, clustering and number of clusters. Pattern Recognition Letters 18(11-13), 1167–1178 (1997)
Xu, L.: Bayesian-Kullback coupled YING-YANG machines: unified learnings and new results on vector quantization. In: Proc. ICONIP 1995, Beijing, October 30-November 3, pp. 977–988 (1995)
Xu, L., Krzyzak, A., Oja, E.: Rival penalized competitive learning for clustering analysis, RBF net and curve detection. IEEE Tr. on Neural Networks 4, 636–649 (1993); an early version in: Proc. of the 11th ICPR, vol. I, pp. 672–675 (1992)
Xu, L.: Least mean square error reconstruction for self-organizing neural-nets. Neural Networks 6, 627–648 (1993); an early version in: Proc. IJCNN 1991, Singapore, pp. 2363–2373 (1991)