Abstract
Major dependence-structure mining tasks are surveyed from a general statistical-learning perspective. Bayesian Ying Yang (BYY) harmony learning is introduced as a unified framework for mining these dependence structures, with new mechanisms for model selection and regularization that work at finite sample sizes. The main results are summarized and bibliographic remarks are given. Two typical approaches to implementing learning, namely optimization search and accumulation consensus, are also introduced.
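For readers new to the framework, the following is a minimal sketch of the harmony measure that BYY harmony learning maximizes, written in the two-pathway notation of Xu's BYY papers. This is a simplified form for orientation only; the chapter's full functional also carries normalization and data-smoothing regularization terms.

% Simplified BYY harmony measure (a sketch, not the chapter's full functional).
% Yang machine: p(Y|X)p(X); Ying machine: q(X|Y)q(Y), where X is the
% observation and Y the inner representation.
\[
  H(p \,\|\, q) \;=\; \int p(Y \mid X)\, p(X)\,
      \ln\!\big[\, q(X \mid Y)\, q(Y) \,\big]\, dX\, dY .
\]

Maximizing H with respect to both machines performs parameter learning, while comparing the maximized H across candidate scales of the inner representation Y underlies the model-selection mechanism mentioned above.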