
Advantages of Unbiased Support Vector Classifiers for Data Mining Applications

Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology

Abstract

Many learning algorithms have been used for data mining applications, including Support Vector Classifiers (SVC), which have shown improved capabilities with respect to other approaches, since they provide a natural mechanism for implementing Structural Risk Minimization (SRM), yielding machines with good generalization properties. SVC leads to the optimal hyperplane (maximal margin) criterion for separable datasets; in the nonseparable case, SVC minimizes the L1 norm of the training errors plus a regularizing term that controls the machine complexity. The L1 norm is chosen because it allows the minimization to be solved with a Quadratic Programming (QP) scheme, as in the separable case. But the L1 norm is not a true “error counting” term, as the Empirical Risk Minimization (ERM) inductive principle requires, and it therefore leads to a biased solution. This effect is especially severe in low-complexity machines, such as linear classifiers or machines with few nodes (neurons, kernels, basis functions). Since one of the main goals in data mining is explanation, these reduced architectures are of great interest: they are the starting point for other techniques such as input selection or rule extraction. Training SVMs as accurately as possible in these situations, i.e., without this bias, is therefore an interesting goal.
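To make the bias concrete, the two objectives can be written side by side. The following is a sketch in standard SVM notation (weights w, bias b, slacks ξ_i, regularization constant C, feature map φ, step function u(·)); the symbols are common usage, not necessarily the paper's exact formulation, whose precise unbiased cost is developed in the full text.

    % Standard soft-margin SVC (biased): the slacks penalize the
    % magnitude of margin violations (an L1 term), not their number.
    \min_{w,b,\xi}\; \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
    \quad\text{s.t.}\quad y_i\bigl(w^{\top}\phi(x_i)+b\bigr) \ge 1-\xi_i,\;\; \xi_i \ge 0

    % "Error counting" (ERM-style) variant: every violation is charged
    % the same unit cost through the step function u(z) = 1 for z > 0, else 0.
    \min_{w,b,\xi}\; \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}u(\xi_i)
    \quad\text{s.t.}\quad y_i\bigl(w^{\top}\phi(x_i)+b\bigr) \ge 1-\xi_i,\;\; \xi_i \ge 0

The first objective charges a sample proportionally to how far it lands on the wrong side, so a few distant outliers can drag the hyperplane; the second charges each error equally, which is what ERM asks for but makes the problem non-convex and thus unsuitable for QP.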

We propose here an unbiased implementation of SVC by introducing a more appropriate “error counting” term. This way, the number of classification errors is truly minimized, while the maximal margin solution is still obtained in the separable case. QP can no longer be used to solve the new minimization problem, so we apply instead an iterated Weighted Least Squares (WLS) procedure. This modification of the Support Vector Machine cost function to solve ERM had not been possible to date with the commonly used Quadratic or Linear Programming techniques, but it becomes feasible with the iterated WLS formulation. Computer experiments show that the proposed method is superior to the classical approach in the sense that it truly solves the ERM problem.
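As a rough illustration of the iterated WLS idea, here is a minimal Python/NumPy sketch for a linear classifier: the step penalty is smoothed with a sigmoid of slope beta, and each iteration solves a weighted least-squares problem whose per-sample weights a_i are chosen so that a_i e_i^2 locally matches the error-counting cost C·L(u_i). The function name, the sigmoid smoothing, and the exact weighting rule are illustrative assumptions, not the authors' published algorithm.

    import numpy as np

    def irwls_error_counting(X, y, C=1.0, beta=10.0, n_iter=50, eps=1e-8):
        # X: (n, d) inputs; y: (n,) labels in {-1, +1}.
        n, d = X.shape
        Phi = np.hstack([X, np.ones((n, 1))])    # absorb the bias b as a last weight
        w = np.zeros(d + 1)
        R = np.eye(d + 1)
        R[-1, -1] = 0.0                          # leave the bias unregularized
        for _ in range(n_iter):
            e = y - Phi @ w                      # regression-style errors
            u = y * e                            # u > 0 signals a margin violation
            L = 1.0 / (1.0 + np.exp(-beta * u))  # smooth step: ~1 on errors, ~0 otherwise
            a = C * L / np.maximum(e ** 2, eps)  # weight so that a_i * e_i^2 ~ C * L(u_i)
            A = R + Phi.T @ (a[:, None] * Phi)   # regularized weighted normal equations
            w_new = np.linalg.solve(A, Phi.T @ (a * y))
            if np.linalg.norm(w_new - w) < 1e-6: # stop at the WLS fixed point
                break
            w = w_new
        return w

    # usage: w = irwls_error_counting(Xtr, ytr)
    #        y_hat = np.sign(Xte @ w[:-1] + w[-1])

Each pass is an ordinary weighted least-squares solve, so the non-convex error-counting cost is handled by re-deriving the weights from the current errors and iterating to a fixed point, rather than by a QP or LP solver.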




Cite this article

Navia-Vázquez, A., Pérez-Cruz, F., Artés-Rodríguez, A. et al. Advantages of Unbiased Support Vector Classifiers for Data Mining Applications. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 37, 223–235 (2004). https://doi.org/10.1023/B:VLSI.0000027487.93757.91
