Online local fisher risk minimization: a new online kernel method for online classification

Applied Intelligence

Abstract

This study presents a new online kernel algorithm for online classification, called online local Fisher risk minimization (OLFRM). Motivated by the Fisher criterion, OLFRM stresses the connection between data points by introducing a local Fisher criterion, which consists of two parts: the local Fisher loss function and the local Fisher regularization term. The local Fisher loss function generates a loss when the distance between heterogeneous nearest neighbors is not large enough, whereas the local Fisher regularization term minimizes the distance between homogeneous nearest neighbors. To reduce computational complexity and save memory, OLFRM is extended to two budgeted OLFRM (BOLFRM) algorithms. The first, BOLFRM-R, uses a removal method as its budget maintenance strategy; the second, BOLFRM-AP, uses an approximate projection method. For both BOLFRM-R and BOLFRM-AP, this study further designs an outlier scheme, based on the unique mechanism of the proposed algorithms, for selecting pending support vectors. Comprehensive experiments were conducted to compare the performance of the related algorithms on public datasets, and the results demonstrate that the BOLFRM algorithms improve robustness.
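The abstract does not give the formulas, so the following is only an illustrative sketch of the two ingredients it describes, not the authors' formulation. It assumes that neighbors are found by Euclidean distance in input space, that "distance" in the criterion is measured between decision values, that the loss is hinge-shaped, and that the regularizer is a squared gap. The names `nearest_neighbors`, `local_fisher_terms`, and `margin` are hypothetical.

```python
import numpy as np

def nearest_neighbors(X, y, i):
    """Indices of the nearest same-class and nearest other-class point to X[i].

    Assumes X[i] has at least one other same-class point and at least
    one other-class point in X (illustrative helper, not from the paper).
    """
    d = np.linalg.norm(X - X[i], axis=1)   # Euclidean distances to X[i]
    d[i] = np.inf                          # exclude the point itself
    same = np.where(y == y[i], d, np.inf)  # keep only same-class distances
    diff = np.where(y != y[i], d, np.inf)  # keep only other-class distances
    return int(np.argmin(same)), int(np.argmin(diff))

def local_fisher_terms(f, X, y, i, margin=1.0):
    """Return (loss, reg) for sample i given decision values f.

    loss: hinge-style penalty when the output gap to the heterogeneous
          nearest neighbor is smaller than `margin` (assumed form).
    reg:  squared output gap to the homogeneous nearest neighbor
          (assumed form).
    """
    same, diff = nearest_neighbors(X, y, i)
    loss = max(0.0, margin - abs(f[i] - f[diff]))
    reg = (f[i] - f[same]) ** 2
    return loss, reg

# Toy usage: four 1-D points, two per class.
X = np.array([[0.0], [1.0], [3.0], [4.0]])
y = np.array([1, 1, -1, -1])
f = np.array([1.0, 0.8, -0.9, -1.0])       # hypothetical decision values
print(local_fisher_terms(f, X, y, i=1))
```

In this toy setup the loss is zero whenever the outputs of a point and its nearest other-class neighbor are at least `margin` apart, while the regularizer shrinks as same-class neighbors receive similar outputs, which mirrors the behavior the abstract attributes to the two terms.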

Acknowledgments

We would like to thank the anonymous reviewers and Editor for their valuable comments and suggestions, which have significantly improved this work. This work was supported in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant Nos. 19KJA550002 and 19KJA610002, by the Priority Academic Program Development of Jiangsu Higher Education Institutions, and by the Collaborative Innovation Center of Novel Software Technology and Industrialization.

Author information

Corresponding author

Correspondence to Li Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Su, C., Zhang, L. & Zhao, L. Online local fisher risk minimization: a new online kernel method for online classification. Appl Intell 53, 17662–17678 (2023). https://doi.org/10.1007/s10489-022-04400-8

  • DOI: https://doi.org/10.1007/s10489-022-04400-8
