Bayesian Ying Yang Learning (I): A Unified Perspective for Statistical Modeling

  • Chapter
Intelligent Technologies for Information Analysis

Abstract

Major dependence structure mining tasks are reviewed from a general statistical learning perspective. Bayesian Ying Yang (BYY) harmony learning is introduced as a unified framework for mining these dependence structures, with new mechanisms for model selection and regularization on a finite number of samples. The main results are summarized, with bibliographic remarks. Two typical approaches to implementing learning, namely optimization search and accumulation consensus, are also introduced.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Xu, L. (2004). Bayesian Ying Yang Learning (I): A Unified Perspective for Statistical Modeling. In: Intelligent Technologies for Information Analysis. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-07952-2_22

  • DOI: https://doi.org/10.1007/978-3-662-07952-2_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-07378-6

  • Online ISBN: 978-3-662-07952-2

  • eBook Packages: Springer Book Archive
