Reducing Hubness for Kernel Regression
In this paper, we point out that hubness (the phenomenon whereby some samples in a high-dimensional dataset emerge as hubs that are similar to many other samples) influences the performance of kernel regression. Because the feature spaces induced by kernels are usually of very high dimension, hubness occurs, giving rise to multicollinearity, a well-known cause of instability in regression results. We propose hubness-reduced kernels for kernel regression by extending a previous approach for kNN classification that eliminates hubness by reducing spatial centrality.
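As a rough illustration of the idea, the sketch below centers a Gaussian kernel matrix (in the spirit of the kernel-centering approach of Suzuki et al., 2013) before plugging it into kernel ridge regression. The function names, the choice of Gaussian kernel, and the regression setup are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch only: kernel centering as one way to reduce spatial centrality,
# followed by kernel ridge regression on the centered kernel.
# Details (kernel choice, hyperparameters) are assumptions for illustration.
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    # K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2)
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def center_kernel(K):
    # Double-center the kernel matrix so every sample has zero mean
    # similarity, removing the spatial centrality that produces hubs.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_ridge_fit(K, y, lam=1e-2):
    # Solve (K + lam * I) alpha = y for the dual coefficients alpha.
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))   # high-dimensional training samples
y = rng.normal(size=50)

K = center_kernel(gaussian_kernel(X, X))
alpha = kernel_ridge_fit(K, y)
print(alpha.shape)  # (50,)
```

After centering, each row and column of the kernel matrix sums to zero, so no single training sample can dominate the similarity structure as a hub.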
Keywords: Mean Square Error · Kernel Function · Training Sample · Gaussian Kernel · Ridge Regression