Comparison of Tree-Based Methods for Multi-target Regression on Data Streams
Single-target regression is a classical data mining task that is popular both in the batch and in the streaming setting. Multi-target regression is an extension of the single-target regression task, in which multiple continuous targets have to be predicted together. Recent studies in the batch setting have shown that global approaches, predicting all of the targets at once, tend to outperform local approaches, predicting each target separately. In this paper, we explore how different local and global tree-based approaches for multi-target regression compare in the streaming setting. Specifically, we apply a local method based on the FIMT-DD algorithm and propose a novel global method, named iSOUP-Tree-MTR. Furthermore, we present an experimental evaluation that is mainly oriented towards exploring the differences between the local and the global approach.
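The difference between the local and the global approach can be illustrated with a minimal sketch. It is not the FIMT-DD or iSOUP-Tree-MTR algorithm: a running mean stands in for the tree learner, and the names `RunningMean`, `LocalMTR` and `GlobalMTR` are hypothetical, introduced only for this illustration. The local variant keeps one single-target model per target, while the global variant keeps one model that is updated with, and predicts, the whole target vector.

```python
from typing import List, Sequence


class RunningMean:
    """Incremental mean of one target (assumed stand-in for a single-target learner)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def update(self, x: Sequence[float], y: float) -> None:
        # The features x are ignored by this trivial learner.
        self.n += 1
        self.mean += (y - self.mean) / self.n

    def predict(self, x: Sequence[float]) -> float:
        return self.mean


class LocalMTR:
    """Local approach: one independent single-target model per target."""

    def __init__(self, n_targets: int) -> None:
        self.models = [RunningMean() for _ in range(n_targets)]

    def update(self, x: Sequence[float], y: Sequence[float]) -> None:
        # Each model sees only its own target value.
        for model, y_t in zip(self.models, y):
            model.update(x, y_t)

    def predict(self, x: Sequence[float]) -> List[float]:
        return [model.predict(x) for model in self.models]


class GlobalMTR:
    """Global approach: a single model that predicts all targets at once."""

    def __init__(self, n_targets: int) -> None:
        self.n = 0
        self.mean = [0.0] * n_targets

    def update(self, x: Sequence[float], y: Sequence[float]) -> None:
        # One update step consumes the whole target vector.
        self.n += 1
        self.mean = [m + (y_t - m) / self.n for m, y_t in zip(self.mean, y)]

    def predict(self, x: Sequence[float]) -> List[float]:
        return list(self.mean)


# A toy stream of (features, targets) pairs with two targets; values are made up.
stream = [([0.1], (1.0, 10.0)), ([0.2], (3.0, 30.0)), ([0.3], (5.0, 20.0))]

local, global_ = LocalMTR(2), GlobalMTR(2)
for x, y in stream:
    local.update(x, y)
    global_.update(x, y)

print(local.predict([0.4]), global_.predict([0.4]))
```

With a learner this simple both variants return the same predictions; the sketch only shows the structural difference, namely that the local variant maintains as many models as there are targets, whereas the global variant maintains one.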
Keywords: Data Stream, Predictive Performance, Target Variable, Global Approach, Memory Consumption
The authors are supported by the Slovenian Research Agency (Grant P2-0103 and a young researcher grant) and the European Commission (Grants ICT-2013-612944 MAESTRA and ICT-2013-604102 HBP).