# Incremental model selection and ensemble prediction under virtual concept drifting environments


## Abstract

Model selection is one of the most important issues in obtaining good generalization from a machine learning system. This paper proposes a strategy for carrying out model selection incrementally under virtual concept drifting environments, where the distribution of learning samples varies over time. To carry out incremental model selection, the system generally uses all the learning samples that have been observed so far. Under virtual concept drifting environments, however, the distribution of the observed samples differs considerably from that of the cumulative dataset, so model selection usually fails. To overcome this problem, the author previously proposed a weighted objective function and a model-selection criterion based on the predictive input density of the learning samples. Although that method performs well on some datasets, it occasionally fails to yield appropriate learning results because the prediction of the actual input density fails. To reduce this adverse effect, the method proposed in this paper extends the earlier method so that the desired outputs are produced by an ensemble of the constructed radial basis function neural networks (RBFNNs).
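The two ingredients named in the abstract, a density-weighted objective and an ensemble of RBFNNs, can be illustrated with a minimal sketch. It is not the author's algorithm: the abstract does not give the objective, the density predictor, or the ensemble rule. The sketch assumes one-dimensional inputs, Gaussian basis functions with fixed centers, sample weights equal to the ratio of a predicted input density to the training input density (both taken here as known Gaussians), and a plain normalized weighting of the networks. All function names and parameters (`gaussian_pdf`, `fit_weighted_rbf`, `ensemble_predict`, `ridge`) are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian; a stand-in for a learned density model."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def rbf_features(x, centers, width):
    """Gaussian basis activations for 1-D inputs, one column per center."""
    diff = x[:, None] - centers[None, :]
    return np.exp(-(diff ** 2) / (2.0 * width ** 2))


def fit_weighted_rbf(x, y, sample_weights, centers, width, ridge=1e-6):
    """Output weights minimizing sum_i sample_weights[i] * (y[i] - f(x[i]))**2.

    A small ridge term (an assumption of this sketch) keeps the normal
    equations solvable when some basis functions receive little weight.
    """
    phi = rbf_features(x, centers, width)
    weighted_phi = phi * sample_weights[:, None]
    gram = phi.T @ weighted_phi + ridge * np.eye(len(centers))
    return np.linalg.solve(gram, weighted_phi.T @ y)


def ensemble_predict(x, models, model_weights):
    """Normalized weighted average of the outputs of several RBF networks.

    Each model is a tuple (centers, width, output_weights).
    """
    model_weights = np.asarray(model_weights, dtype=float)
    model_weights = model_weights / model_weights.sum()
    outputs = np.stack([rbf_features(x, c, s) @ w for (c, s, w) in models])
    return model_weights @ outputs


# Toy data: training inputs are drawn around -0.5, while the (assumed)
# predicted input density is centered at +0.5.
rng = np.random.default_rng(0)
x_train = rng.normal(-0.5, 1.0, size=300)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=300)
importance = gaussian_pdf(x_train, 0.5, 0.8) / gaussian_pdf(x_train, -0.5, 1.0)

# Candidate networks of different sizes, each fitted with the same weights.
candidates = [(np.linspace(-3.0, 3.0, k), 6.0 / k) for k in (4, 6, 8)]
models = [
    (c, s, fit_weighted_rbf(x_train, y_train, importance, c, s))
    for (c, s) in candidates
]

# Combine the candidates instead of committing to a single selected model.
x_query = np.linspace(-1.0, 2.0, 5)
y_query = ensemble_predict(x_query, models, model_weights=[1.0, 1.0, 1.0])
```

Equal ensemble weights are used only for simplicity. Averaging over several candidate networks rather than selecting one is a way to limit the damage when the predicted density, and hence the weighting, is wrong.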

## Keywords

- Incremental model selection
- Virtual concept drifting environments
- Radial basis function neural networks
- Weighted least squares