Nonparametric kernel smoother on topology learning neural networks for incremental and ensemble regression

Original Article

Abstract

Incremental learning is an effective technique for improving the space efficiency of machine learning algorithms, and ensemble learning combines different algorithms to form more accurate ones. Parameter selection for incremental methods is difficult because no retraining is allowed, and the combination of incremental and ensemble learning has not been fully explored. In this paper, we propose a parameter-free regression framework that combines incremental and ensemble learning. First, topology learning neural networks such as the growing neural gas (GNG) and the self-organizing incremental neural network (SOINN) are employed to handle nonlinearity. Then, the vector quantizations produced by GNG and SOINN are transformed into a feed-forward neural network by an improved Nadaraya–Watson estimator, and a maximum likelihood procedure is devised for adaptive selection of the estimator's parameters. Finally, a weighted training strategy is incorporated so that the topology learning regressors can be combined by AdaBoost. Experiments are carried out on 5 UCI datasets, and an application study of short-term traffic flow prediction is given. The results show that the proposed method is comparable to mainstream incremental and non-incremental regression methods and performs better in short-term traffic flow prediction.
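For readers unfamiliar with the estimator named above, the following is a minimal illustrative sketch, not the authors' implementation: a Gaussian-kernel Nadaraya–Watson smoother, y_hat(x) = sum_i K_h(x - c_i) v_i / sum_i K_h(x - c_i), evaluated over prototype nodes c_i with stored target values v_i, with the bandwidth h chosen by a leave-one-out Gaussian likelihood as a stand-in for the paper's maximum likelihood selection step. The prototypes here are plain data subsamples rather than GNG/SOINN nodes, and all function names are hypothetical.

```python
# Illustrative sketch only: Nadaraya-Watson smoothing over prototype nodes,
# with bandwidth chosen by a leave-one-out Gaussian likelihood. The GNG/SOINN
# quantization is replaced by a simple data subsample for brevity.
import numpy as np

def nw_predict(x, nodes, values, h):
    """Nadaraya-Watson estimate at x from prototype nodes and their values."""
    d2 = np.sum((nodes - x) ** 2, axis=1)          # squared distances to nodes
    w = np.exp(-d2 / (2.0 * h ** 2))               # Gaussian kernel weights
    w_sum = w.sum()
    if w_sum < 1e-12:                              # query far from every node
        return values[np.argmin(d2)]               # fall back to nearest node
    return np.dot(w, values) / w_sum

def select_bandwidth(nodes, values, candidates):
    """Pick h maximizing a leave-one-out log-likelihood over the node values."""
    best_h, best_ll = candidates[0], -np.inf
    for h in candidates:
        ll = 0.0
        for i in range(len(nodes)):
            mask = np.arange(len(nodes)) != i      # hold out node i
            pred = nw_predict(nodes[i], nodes[mask], values[mask], h)
            ll += -0.5 * (values[i] - pred) ** 2   # Gaussian log-likelihood, up to constants
        if ll > best_ll:
            best_h, best_ll = h, ll
    return best_h

# Toy usage: "prototypes" taken from noisy sine data.
rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
nodes, vals = X[:30], y[:30]                       # stand-ins for GNG/SOINN nodes
h = select_bandwidth(nodes, vals, [0.1, 0.3, 0.5, 1.0])
print(h, nw_predict(np.array([3.0]), nodes, vals, h))
```

In the paper's setting, the prototype nodes and their target values would come from the topology learning networks' vector quantization, so the smoother's cost grows with the number of nodes rather than the number of training samples.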

Keywords

Incremental learning · Nonparametric regression · Nadaraya–Watson estimator · Competitive learning · Ensemble learning

Notes

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Nos. 61301148, 61272061) and the Fundamental Research Funds for the Central Universities of China.

Compliance with ethical standards

Conflicts of interest

The authors declare no conflict of interest.


Copyright information

© The Natural Computing Applications Forum 2017

Authors and Affiliations

  • Jianhua Xiao (1, 2)
  • Zhiyang Xiang (1)
  • Dong Wang (1)
  • Zhu Xiao (1, 2)

  1. College of Computer Science and Electronics Engineering, Hunan University, Changsha, China
  2. State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, China
