Abstract
Kernel-based methods have recently attracted considerable attention in statistical data mining and knowledge discovery. In particular, a family of regularized least squares regression models in a Reproducing Kernel Hilbert Space (RKHS) has been developed (Aronszajn, 1950). This family includes kernel principal component regression (K-PCR) (Rosipal et al., 2000, 2001), kernel ridge regression (K-RR) (Saunders et al., 1998; Cristianini and Shawe-Taylor, 2000), and, most recently, kernel partial least squares regression (K-PLSR) (Rosipal and Trejo, 2001; Bennett and Embrechts, 2003). Rosipal et al. (2001) compared K-PLSR, K-PCR, and K-RR using conventional statistical procedures and demonstrated through computational experiments that “K-PLSR achieves the same results as K-PCR, but uses significantly fewer and qualitatively different components.”
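To illustrate the simplest member of this RKHS family, kernel ridge regression has a closed-form solution in dual variables (Saunders et al., 1998): given the kernel matrix K and ridge parameter λ, the dual coefficients are α = (K + λI)⁻¹y, and predictions are kernel expansions over the training points. The sketch below is a minimal illustration under assumed choices (an RBF kernel and the function names are for exposition only, not from the chapter):

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - z_j||^2)
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_kernel_ridge(X, y, lam=1e-2, gamma=1.0):
    # Dual solution: alpha = (K + lam * I)^{-1} y
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, alpha, X_new, gamma=1.0):
    # Kernel expansion: f(x) = sum_i alpha_i * k(x_i, x)
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```

In practice λ and the kernel parameter must be tuned, for instance by cross-validation or, in the setting of this chapter, by an information-complexity criterion.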
References
Akaike, H. (1973). “Information theory and an extension of the maximum likelihood principle,” in Second international symposium on information theory, eds. B. N. Petrov and F. Csáki, Académiai Kiadó, Budapest, pp. 267–281.
Aronszajn, N. (1950). “Theory of reproducing kernels,” Transactions of the American Mathematical Society, 68, 337–404.
Bennett, K. P. and Embrechts, M. J. (2003). “An optimization perspective on kernel partial least squares regression,” in Advances in Learning Theory: Methods, Models and Applications, eds. J. Suykens, G. Horvath, S. Basu, C. Micchelli, J. Vandewalle, NATO Science Series III: Computer & Systems Sciences, Volume 190, IOS Press, Amsterdam, pp. 227–250.
Bozdogan, H. (1988). “ICOMP: A new model-selection criterion,” in Classification and Related Methods of Data Analysis, ed. H. Bock, Amsterdam, Elsevier Science Publishers B. V. (North Holland), pp. 599–608.
Bozdogan, H. (1990). “On the information-based measure of covariance complexity and its application to the evaluation of multivariate linear models,” Communications in Statistics - Theory and Methods, 19, 221–278.
Bozdogan, H. (1994). “Mixture-model cluster analysis using a new informational complexity and model selection criteria,” in Multivariate Statistical Modeling, ed. H. Bozdogan, Vol. 2, Proceedings of the First US/Japan Conference on the Frontiers of Statistical Modeling: An Informational Approach, Kluwer Academic Publishers, the Netherlands, Dordrecht, pp. 69–113.
Bozdogan, H. (2000). “Akaike’s information criterion and recent developments in information complexity,” Journal of Mathematical Psychology, 44, 62–91.
Bozdogan, H. (2004). “Intelligent statistical data mining with information complexity and genetic algorithm,” in Statistical Data Mining and Knowledge Discovery, ed. H. Bozdogan, Chapman and Hall/CRC, pp. 15–56.
Cristianini, N. and Shawe-Taylor, J. (2000). An introduction to Support Vector Machines, Cambridge University Press.
Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, New York.
Kullback, S. and Leibler, R. (1951). “On information and sufficiency,” Annals of Mathematical Statistics, 22, 79–86.
Lin, C.-T. and Lee, C. S. G. (1996). Neural Fuzzy Systems, Prentice Hall, Upper Saddle River, NJ.
Liu, Z. and Bozdogan, H. (2004). “Kernel PCA for feature extraction with information complexity,” in Statistical Data Mining and Knowledge Discovery, ed. H. Bozdogan, Chapman and Hall/CRC.
Mercer, J. (1909). “Functions of positive and negative type and their connection with the theory of integral equations,” Philosophical Transactions Royal Society London, A209, 415–446.
Rosipal, R. and Trejo, L. (2001). “Kernel partial least squares regression in reproducing kernel Hilbert space,” Journal of Machine Learning Research, 2, 97–123.
Rosipal, R., Girolami, M., and Trejo, L. J. (2000). “Kernel PCA for feature extraction of event-related potentials for human signal detection performance,” in Proceedings of ANNIMAB-1 Conference, Göteborg, Sweden, pp. 321–326.
Rosipal, R., Girolami, M., Trejo, L. J., and Cichocki, A. (2001). “Kernel PCA for feature extraction and de-noising in non-linear regression,” Neural Computing and Applications, 10.
Saunders, C., Gammerman, A., and Vovk, V. (1998). “Ridge regression learning algorithm in dual variables,” in Proceedings of the 15th International Conference on Machine Learning, Madison, Wisconsin, pp. 515–521.
Ungar, L. H. (1995). UPenn ChemData Repository. Philadelphia, PA. Available electronically via ftp://ftp.cis.upenn.edu/pub/ungar/chemdata.
Van Emden, M. H. (1971). An Analysis of Complexity, Mathematical Centre Tracts, 35, Amsterdam.
Vapnik, V. (1995). The Nature of Statistical Learning Theory, Springer-Verlag, New York.
Wold, S. (1978). “Cross-validatory estimation of the number of components in factor and principal components models,” Technometrics, 20, 397–405.
© 2004 Springer-Verlag Berlin Heidelberg
Bao, X., Bozdogan, H. (2004). Subsetting Kernel Regression Models Using Genetic Algorithm and the Information Measure of Complexity. In: Banks, D., McMorris, F.R., Arabie, P., Gaul, W. (eds) Classification, Clustering, and Data Mining Applications. Studies in Classification, Data Analysis, and Knowledge Organisation. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17103-1_19
Print ISBN: 978-3-540-22014-5
Online ISBN: 978-3-642-17103-1