Abstract
Support vector machines (SVMs) and other kernel techniques form a family of powerful statistical classification methods with high accuracy and broad applicability. Because they use all or a significant fraction of the training data, however, they can be slow, especially for large problems. Piecewise linear classifiers are similarly versatile, yet have the additional advantages of simplicity, ease of interpretation and, if the number of component linear classifiers is not too large, speed. Here we show how a simple piecewise linear classifier can be trained from a kernel-based classifier in order to improve classification speed. The method works by finding the root of the difference in conditional probabilities between pairs of opposite classes, thereby building up a representation of the decision boundary. When tested on 17 different datasets, it improved the classification speed of an SVM on 12 of them, by up to two orders of magnitude; of these 12, however, two were less accurate than a simple linear classifier. The method is best suited to problems with continuous feature data and smooth probability functions. Because the component linear classifiers are built up individually from an existing classifier, rather than through a simultaneous optimization procedure, the piecewise linear classifier is also fast to train.
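For intuition, the border-sampling step can be sketched as follows for a binary problem. This is a minimal, hypothetical illustration rather than the paper's implementation: it assumes a scikit-learn SVC with Platt-scaled probability estimates, and the function names (sample_borders, classify) and parameter choices are the sketch's own.

```python
# Minimal illustration (not the paper's implementation) of border sampling
# for a binary problem: find roots of R(x) = P(+1|x) - P(-1|x) on segments
# joining opposite-class samples, store each root with the gradient of R as
# its normal, then classify by the nearest border point.
import numpy as np
from scipy.optimize import brentq
from sklearn.svm import SVC

def sample_borders(svm, X, y, n_borders=64, eps=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    pos, neg = X[y == 1], X[y == -1]
    # R(x) in [-1, 1]: difference of the two class probabilities.
    R = lambda x: 2.0 * svm.predict_proba(x.reshape(1, -1))[0, 1] - 1.0
    borders, normals, attempts = [], [], 0
    while len(borders) < n_borders and attempts < 100 * n_borders:
        attempts += 1
        x0 = neg[rng.integers(len(neg))]
        x1 = pos[rng.integers(len(pos))]
        f = lambda t: R(x0 + t * (x1 - x0))
        if f(0.0) * f(1.0) < 0.0:              # a root is bracketed
            t = brentq(f, 0.0, 1.0)            # root of probability difference
            b = x0 + t * (x1 - x0)
            # Finite-difference gradient of R serves as the boundary normal.
            g = np.array([(R(b + eps * e) - R(b - eps * e)) / (2.0 * eps)
                          for e in np.eye(X.shape[1])])
            borders.append(b)
            normals.append(g / np.linalg.norm(g))
    return np.array(borders), np.array(normals)

def classify(borders, normals, x):
    i = np.argmin(np.sum((borders - x) ** 2, axis=1))  # nearest border point
    return 1 if (x - borders[i]) @ normals[i] > 0.0 else -1

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(400, 2))
    y = np.where(X[:, 0] + X[:, 1] ** 2 > 0.5, 1, -1)
    svm = SVC(probability=True).fit(X, y)          # Platt-scaled probabilities
    B, N = sample_borders(svm, X, y)
    print(classify(B, N, np.array([1.0, 0.0])), svm.predict([[1.0, 0.0]])[0])
```

Once the border points are mapped, classification reduces to a nearest-neighbour search plus one dot product, independent of the number of support vectors, which is the source of the speed-up.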
References
Alimoglu, F.: Combining Multiple Classifiers for Pen-Based Handwritten Digit Recognition. Master’s thesis, Bogazici University (1996)
Bagirov, A.M.: Derivative-free methods for unconstrained nonsmooth optimization and its numerical analysis. Investigação Operacional 19, 75–93 (1999)
Bagirov, A.M.: Max-min separability. Optim. Methods Softw. 20(2–3), 277–296 (2005)
Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 27:1–27:27 (2011)
Crammer, K., Singer, Y.: On the learnability and design of output codes for multiclass problems. Mach. Learn. 47(2–3), 201–233 (2002)
Duarte, M.F., Hu, Y.H.: Vehicle classification in distributed sensor networks. J. Parallel Distrib. Comput. 64, 826–838 (2004)
Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008)
Feldkamp, L., Puskorius, G.V.: A signal processing framework based on dynamic neural networks with application to problems in adaptation, filtering, and classification. Proc. IEEE 86(11), 2259–2277 (1998)
Frey, P., Slate, D.: Letter recognition using Holland-style adaptive classifiers. Mach. Learn. 6(2), 161–182 (1991)
Gai, K., Zhang, C.: Learning discriminative piecewise linear models with boundary points. In: Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, pp. 444–450. Association for the Advancement of Artificial Intelligence (2010)
Guyon, I., Gunn, S., Hur, A.B., Dror, G.: Result analysis of the NIPS 2003 feature selection challenge. In: Proceedings of the 17th International Conference on Neural Information Processing Systems, pp. 545–552. MIT Press, Vancouver (2004)
Herman, G.T., Yeung, K.T.D.: On piecewise-linear classification. IEEE Trans. Pattern Anal. Mach. Intell. 14(7), 782–786 (1992)
Hsu, C.W., Lin, C.J.: A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Netw. 13(2), 415–425 (2002)
Huang, X., Mehrkanoon, S., Suykens, J.A.K.: Support vector machines with piecewise linear feature mapping. Neurocomputing 117(6), 118–127 (2013)
Hull, J.J.: A database for handwritten text recognition research. IEEE Trans. Pattern Anal. Mach. Intell. 16(5), 550–554 (1994)
Iba, W., Wogulis, J., Langley, P.: Trading off simplicity and coverage in incremental concept learning. In: Proceedings of the Fifth International Conference on Machine Learning, pp. 73–79 (1988)
King, R.D., Feng, C., Sutherland, A.: Statlog: comparison of classification algorithms on large real-world problems. Appl. Artif. Intell. 9(3), 289–333 (1995)
Kohonen, T.: Self-Organizing Maps, 3rd edn. Springer, Berlin (2000)
Kohonen, T., Hynninen, J., Kangas, J., Laaksonen, J., Torkkola, K.: LVQ PAK: The Learning Vector Quantization Package, Version 3.1 (1995)
Kostin, A.: A simple and fast multi-class piecewise linear pattern classifier. Pattern Recogn. 39, 1949–1962 (2006)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Lee, T., Richards, J.A.: Piecewise linear classification using seniority logic committee methods with application to remote sensing. Pattern Recogn. 17(4), 453–464 (1984)
Lee, T., Richards, J.A.: A low cost classifier for multitemporal applications. Int. J. Remote Sens. 6(8), 1405–1417 (1985)
Lichman, M.: UCI machine learning repository (2013). http://archive.ics.uci.edu/ml. Accessed 4 Mar 2017
Lin, H.T., Lin, C.J., Weng, R.C.: A note on Platt’s probabilistic outputs for support vector machines. Mach. Learn. 68(3), 267–276 (2007)
Michie, D., Spiegelhalter, D.J., Taylor, C.C. (eds.): Machine Learning, Neural and Statistical Classification. Ellis Horwood Series in Artificial Intelligence. Prentice Hall, Upper Saddle River, NJ (1994). http://www.amsta.leeds.ac.uk/~charles/statlog/. Accessed 13 May 2017
Mills, P.: Isoline retrieval: an optimal method for validation of advected contours. Comput. Geosci. 35(11), 2020–2031 (2009)
Mills, P.: Efficient statistical classification of satellite measurements. Int. J. Remote Sens. 32(21), 6109–6132 (2011)
Mohammad, R., Thabtah, F.A., McCluskey, T.: Predicting phishing websites based on self-structuring neural network. Neural Comput. Appl. 25(2), 443–458 (2014)
Müller, K.R., Mika, S., Rätsch, G., Tsuda, K., Schölkopf, B.: An introduction to kernel-based learning algorithms. IEEE Trans. Neural Netw. 12(2), 181–201 (2001)
Osborne, M.: Seniority logic: a logic of a committee machine. IEEE Trans. Comput. 26(12), 1302–1306 (1977)
Ott, E.: Chaos in Dynamical Systems. Cambridge University Press, Cambridge (1993)
Pavlidis, N.G., Hofmeyr, D.P., Tasoulis, S.K.: Minimum density hyperplanes. J. Mach. Learn. Res. 17(156), 1–33 (2016)
Platt, J.: Probabilistic outputs for support vector machines and comparison to regularized likelihood methods. In: Advances in Large Margin Classifiers. MIT Press (1999)
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C, 2nd edn. Cambridge University Press, Cambridge (1992)
Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication. University of Illinois Press, Urbana (1963)
Sklansky, J., Michelotti, L.: Locally trained piecewise linear classifiers. IEEE Trans. Pattern Anal. Mach. Intell. 2(2), 101–111 (1980)
Tenmoto, H., Kudo, M., Shimbo, M.: Piecewise linear classifiers with an appropriate number of hyperplanes. Pattern Recogn. 31(11), 1627–1634 (1998)
Terrell, G.R., Scott, D.W.: Variable kernel density estimation. Ann. Stat. 20, 1236–1265 (1992)
Uzilov, A.V., Keegan, J.M., Mathews, D.H.: Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC Bioinform. 7, 173 (2006)
Vedaldi, A., Zisserman, A.: Efficient additive kernels via explicit feature maps. IEEE Trans. Pattern Anal. Mach. Intell. 34(3), 480–492 (2012)
Wang, J., Saligrama, V.: Locally-linear learning machines (L3M). Proc. Mach. Learn. Res. 29, 451–466 (2013)
Webb, D.: Efficient Piecewise Linear Classifiers and Applications. Ph.D. thesis, University of Ballarat, Victoria, Australia (2012)
Wu, T.F., Lin, C.J., Weng, R.C.: Probability estimates for multi-class classification by pairwise coupling. J. Mach. Learn. Res. 5, 975–1005 (2004)
Acknowledgements
Thanks to Chih-Chung Chang and Chih-Jen Lin of the National Taiwan University for data from the LIBSVM archive, and also to David Aha and the curators of the UCI Machine Learning Repository for statistical classification datasets.
Appendix A: Subsampling
Let \(n_i\) be the number of samples of the ith class, with the classes ordered by size such that:

$$ n_i \le n_{i+1} $$

Let \(0 \le \alpha (n) \le 1\) be a function used to subsample each of the class distributions in turn:

$$ n_i^\prime = \alpha (n_i) \, n_i $$

We wish to retain the rank ordering of the class sizes:

$$ \frac{\mathrm{d}}{\mathrm{d} n} \left[ \alpha (n) \, n \right] \ge 0 \qquad (11) $$

while ensuring that the smallest classes have some minimum representation, i.e., the retained fraction \(\alpha\) does not increase with class size:

$$ \frac{\mathrm{d}}{\mathrm{d} n} \left[ \alpha (n) \, n \right] \le \alpha (n) \qquad (12) $$

Thus:

$$ 0 \le \frac{\mathrm{d}}{\mathrm{d} n} \left[ \alpha (n) \, n \right] \le \alpha (n) $$

The simplest means of ensuring that both (11) and (12) are fulfilled is to multiply the right side of (12) with a constant, \(0 \le \zeta \le 1\), and equate it with the left side:

$$ \frac{\mathrm{d}}{\mathrm{d} n} \left[ \alpha (n) \, n \right] = \zeta \, \alpha (n) $$

Integrating:

$$ \alpha (n) = C n^{\zeta - 1} $$

The parameter, \(C\), is set such that \(n_1^\prime = n_1\):

$$ C = n_1^{1 - \zeta} $$

while \(\zeta\) is set such that:

$$ \sum_i \alpha (n_i) \, n_i = n_1^{1 - \zeta} \sum_i n_i^{\zeta} = f \sum_i n_i $$

where \(0< f< 1\) is the desired fraction of training data.
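In practice, the power-law rule \(n_i^\prime = n_1^{1-\zeta} n_i^{\zeta}\) that follows from the above can be applied by solving for \(\zeta\) numerically. Below is a minimal Python sketch under those assumptions; the function name subsample_sizes, the use of SciPy's brentq root finder, and the bracket endpoints are illustrative, not from the paper.

```python
# Sketch of the subsampling rule n_i' = n_1^(1 - zeta) * n_i^zeta derived
# above, with zeta found by solving sum_i n_i' = f * sum_i n_i.
import numpy as np
from scipy.optimize import brentq

def subsample_sizes(sizes, f):
    """Return subsampled class sizes for a desired training fraction f."""
    n = np.sort(np.asarray(sizes, dtype=float))   # n[0] is the smallest class
    # g is monotonic in zeta: zeta = 1 keeps every sample, while
    # zeta -> 0 flattens all classes down to the smallest size n[0].
    g = lambda zeta: n[0] ** (1.0 - zeta) * np.sum(n ** zeta) - f * n.sum()
    if g(1e-9) > 0.0:      # f below the flattened minimum: keep n[0] of each
        return np.full(len(n), int(n[0]))
    zeta = brentq(g, 1e-9, 1.0)
    return (n[0] ** (1.0 - zeta) * n ** zeta).astype(int)

# Example: three classes, keeping roughly a quarter of the training data.
print(subsample_sizes([100, 1000, 10000], f=0.25))
```

Note how the smallest class is retained in full while the largest is thinned most aggressively, which preserves both the rank ordering and the minimum representation of small classes.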
Cite this article
Mills, P. Accelerating kernel classifiers through borders mapping. J Real-Time Image Proc 17, 313–327 (2020). https://doi.org/10.1007/s11554-018-0769-9