Abstract
Kernel-based methods have recently attracted considerable attention in statistical data mining and knowledge discovery. In particular, a family of regularized least squares regression models in a Reproducing Kernel Hilbert Space (RKHS) has been developed (Aronszajn, 1950). This family includes kernel principal component regression (K-PCR) (Rosipal et al., 2000, 2001), kernel ridge regression (K-RR) (Saunders et al., 1998; Cristianini and Shawe-Taylor, 2000), and, most recently, kernel partial least squares regression (K-PLSR) (Rosipal and Trejo, 2001; Bennett and Embrechts, 2003). Rosipal et al. (2001) compared K-PLSR, K-PCR, and K-RR using conventional statistical procedures and demonstrated through computational experiments that “K-PLSR achieves the same results as K-PCR, but uses significantly fewer and qualitatively different components.”
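To illustrate the simplest member of this RKHS family, kernel ridge regression has a closed-form solution in dual variables (Saunders et al., 1998): given the kernel matrix K and ridge parameter λ, the dual coefficients are α = (K + λI)⁻¹y, and predictions are kernel expansions over the training points. The sketch below is a minimal illustration under assumed choices (an RBF kernel and the function names are for exposition only, not from the chapter):

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - z_j||^2)
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_kernel_ridge(X, y, lam=1e-2, gamma=1.0):
    # Dual solution: alpha = (K + lam * I)^{-1} y
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, alpha, X_new, gamma=1.0):
    # Kernel expansion: f(x) = sum_i alpha_i * k(x_i, x)
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```

In practice λ and the kernel parameter must be tuned, for instance by cross-validation or, in the setting of this chapter, by an information-complexity criterion.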
References
Akaike, H. (1973). “Information theory and an extension of the maximum likelihood principle,” in Second international symposium on information theory, eds. B. N. Petrov and F. Csáki, Académiai Kiadó, Budapest, pp. 267–281.
Aronszajn, N. (1950). “Theory of reproducing kernels,” Transactions of the American Mathematical Society, 68, 337–404.
Bennett, K. P. and Embrechts, M. J. (2003). “An optimization perspective on kernel partial least squares regression,” in Advances in Learning Theory: Methods, Models and Applications, eds. J. Suykens, G. Horvath, S. Basu, C. Micchelli, J. Vandewalle, NATO Science Series III: Computer & Systems Sciences, Volume 190, IOS Press, Amsterdam, pp. 227–250.
Bozdogan, H. (1988). “ICOMP: A new model-selection criterion,” in Classification and Related Methods of Data Analysis, ed. H. Bock, Amsterdam, Elsevier Science Publishers B. V. (North Holland), pp. 599–608.
Bozdogan, H. (1990). “On the information-based measure of covariance complexity and its application to the evaluation of multivariate linear models,” Communications in Statistics - Theory and Methods, 19, 221–278.
Bozdogan, H. (1994). “Mixture-model cluster analysis using a new informational complexity and model selection criteria,” in Multivariate Statistical Modeling, ed. H. Bozdogan, Vol. 2, Proceedings of the First US/Japan Conference on the Frontiers of Statistical Modeling: An Informational Approach, Kluwer Academic Publishers, the Netherlands, Dordrecht, pp. 69–113.
Bozdogan, H. (2000). “Akaike’s information criterion and recent developments in information complexity,” Journal of Mathematical Psychology, 44, 62–91.
Bozdogan, H. (2004). “Intelligent statistical data mining with information complexity and genetic algorithm,” in Statistical Data Mining and Knowledge Discovery, ed. H. Bozdogan, Chapman and Hall/CRC, pp. 15–56.
Cristianini, N. and Shawe-Taylor, J. (2000). An introduction to Support Vector Machines, Cambridge University Press.
Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, New York.
Kullback, S. and Leibler, R. (1951). “On information and sufficiency,” Annals of Mathematical Statistics, 22, 79–86.
Lin, C.-T. and Lee, C. S. G. (1996). Neural Fuzzy Systems, Prentice Hall, Upper Saddle River, NJ.
Liu, Z. and Bozdogan, H. (2004). “Kernel PCA for feature extraction with information complexity,” in Statistical Data Mining and Knowledge Discovery, ed. H. Bozdogan, Chapman and Hall/CRC.
Mercer, J. (1909). “Functions of positive and negative type and their connection with the theory of integral equations,” Philosophical Transactions Royal Society London, A209, 415–446.
Rosipal, R. and Trejo, L. (2001). “Kernel partial least squares regression in reproducing kernel Hilbert space,” Journal of Machine Learning Research, 2, 97–123.
Rosipal, R., Girolami, M., and Trejo, L. J. (2000). “Kernel PCA for feature extraction of event-related potentials for human signal detection performance,” in Proceedings of ANNIMAB-1 Conference, Göteborg, Sweden, pp. 321–326.
Rosipal, R., Girolami, M., Trejo, L. J., and Cichocki, A. (2001). “Kernel PCA for feature extraction and de-noising in non-linear regression,” Neural Computing and Applications, 10.
Saunders, C., Gammerman, A., and Vovk, V. (1998). “Ridge regression learning algorithm in dual variables,” in Proceedings of the 15th International Conference on Machine Learning, Madison, Wisconsin, pp. 515–521.
Ungar, L. H. (1995). UPenn ChemData Repository. Philadelphia, PA. Available electronically via ftp://ftp.cis.upenn.edu/pub/ungar/chemdata.
Van Emden, M. H. (1971). An Analysis of Complexity, Mathematical Centre Tracts, 35, Amsterdam.
Vapnik, V. (1995). The Nature of Statistical Learning Theory, Springer-Verlag, New York.
Wold, S. (1978). “Cross-validatory estimation of the number of components in factor and principal components models,” Technometrics, 20, 397–405.
© 2004 Springer-Verlag Berlin Heidelberg
Bao, X., Bozdogan, H. (2004). Subsetting Kernel Regression Models Using Genetic Algorithm and the Information Measure of Complexity. In: Banks, D., McMorris, F.R., Arabie, P., Gaul, W. (eds) Classification, Clustering, and Data Mining Applications. Studies in Classification, Data Analysis, and Knowledge Organisation. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17103-1_19
Print ISBN: 978-3-540-22014-5
Online ISBN: 978-3-642-17103-1