Abstract
In linear discriminant (LD) analysis, a high ratio of sample size to feature count is desirable. The linear programming (LP) procedure for LD identification copes with the curse of dimensionality by simultaneously minimizing the L1 norm of the classification errors and the L1 norm of the LD weights. The sparseness of the solution (the fraction of features retained) can be controlled by a parameter in the objective function. By qualitatively analyzing the objective function and the constraints of the problem, we show why sparseness arises. In a sparse solution, large values of the LD weight vector identify the individual features most important for the decision boundary.
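The kind of formulation the abstract describes can be sketched as a standard linear program: split the weight vector as w = u − v with u, v ≥ 0 so that the L1 penalty becomes linear, and add per-sample slack variables for the classification errors. The sketch below is a minimal, hypothetical illustration (the helper name `sparse_lp_discriminant` and the penalty parameter `lam` are our own labels, not the authors' exact formulation), solved with `scipy.optimize.linprog`:

```python
import numpy as np
from scipy.optimize import linprog

def sparse_lp_discriminant(X, y, lam=0.1):
    """Sketch of an L1-penalized linear classifier via LP.

    Solves   min  sum(e) + lam * ||w||_1
             s.t. y_i (w . x_i + b) >= 1 - e_i,   e >= 0,
    where w = u - v (u, v >= 0) linearizes the L1 norm and `lam`
    plays the role of the sparseness-controlling parameter.
    """
    n, d = X.shape
    # Variable order: u (d), v (d), b+ (1), b- (1), e (n); all nonnegative.
    c = np.concatenate([lam * np.ones(2 * d), [0.0, 0.0], np.ones(n)])
    Yx = y[:, None] * X
    # Margin constraints rewritten as A_ub z <= b_ub.
    A_ub = np.hstack([-Yx, Yx, -y[:, None], y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (2 * d + 2 + n))
    z = res.x
    w = z[:d] - z[d:2 * d]          # recover signed weights
    b = z[2 * d] - z[2 * d + 1]     # recover free intercept
    return w, b
```

On data where only one feature separates the classes, increasing `lam` drives the weights of uninformative features to exactly zero, which is the sparseness mechanism the paper analyzes.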
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Pranckeviciene, E., Baumgartner, R., Somorjai, R., Bowman, C. (2004). Control of Sparseness for Feature Selection. In: Fred, A., Caelli, T.M., Duin, R.P.W., Campilho, A.C., de Ridder, D. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR /SPR 2004. Lecture Notes in Computer Science, vol 3138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27868-9_77
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22570-6
Online ISBN: 978-3-540-27868-9