Sparse and heuristic support vector machine for binary classifier and regressor fusion

  • Jinhong Huang
  • Zhu Liang Yu
  • Zhenghui Gu
  • Jun Zhang
  • Ling Cen
Original Article


Single-hidden layer feedforward networks (SLFNs) are widely regarded as classical methods for binary classification and regression. Several variants of SLFNs exist, such as support vector machines (SVM) and extreme learning machines (ELM). Obtaining a powerful feature mapper with a simple network structure remains an open problem for SLFNs. In this paper, we propose a framework called sparse and heuristic SVM (SH-SVM) that fuses different SLFNs at the level of feature mapping to obtain powerful feature mapping capability and improve generalization performance. By fusing different SLFNs, SH-SVM benefits from the learning capabilities of each model. As an example, the fusion of SVM and ELM is studied in detail. A sparse representation method then yields a compact SLFN by selecting the most powerful hidden nodes. Furthermore, an efficient method for solving the sparse representation problem in SH-SVM is proposed. Experiments on 25 data sets comparing eight methods show that SH-SVM achieves satisfactory results with a compact network structure.
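The fusion idea described above can be illustrated with a small sketch. This is not the authors' SH-SVM algorithm: the ELM-style random hidden layer, the RBF feature map standing in for the SVM side, the concatenation step, and the magnitude-based node selection (a crude stand-in for the paper's sparse representation method) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data with labels in {-1, +1}.
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] ** 2 - 0.3)
y[y == 0] = 1.0

def elm_features(X, W, b):
    """ELM-style hidden layer: sigmoid of a random affine map."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def rbf_features(X, centers, gamma=0.5):
    """SVM-style feature map: RBF similarities to a set of centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Fuse the two SLFN feature mappings by concatenating their hidden layers.
L = 50  # number of random (ELM) hidden nodes
W = rng.normal(size=(X.shape[1], L))
b = rng.normal(size=L)
centers = X[rng.choice(len(X), 30, replace=False)]
H = np.hstack([elm_features(X, W, b), rbf_features(X, centers)])

# Regularized least-squares output weights (ELM-style training).
lam = 1e-2
beta = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)

# Heuristic sparsification: keep only the most influential hidden nodes,
# then retrain the compact network on that reduced feature map.
k = 20
keep = np.argsort(-np.abs(beta))[:k]
Hk = H[:, keep]
beta_k = np.linalg.solve(Hk.T @ Hk + lam * np.eye(k), Hk.T @ y)

acc = np.mean(np.sign(Hk @ beta_k) == y)
print(f"compact network: {k} of {H.shape[1]} nodes, train acc = {acc:.2f}")
```

The key design point mirrored here is that fusion happens in feature space (the concatenated hidden layer), after which sparsity prunes the combined node pool down to a compact network.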


Single-hidden layer feedforward networks · Extreme learning machine · Support vector machine · Sparse and heuristic · Classification and regression



This work was supported in part by the National Natural Science Foundation of China under Grants 61836003, 61573150, 61573152 and 61403085.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. College of Automation Science and Engineering, South China University of Technology, Guangzhou, China
  2. College of Information Engineering, Guangdong University of Technology, Guangzhou, China
