Multi-label Classification via Multi-target Regression on Data Streams

  • Aljaž Osojnik
  • Panče Panov
  • Sašo Džeroski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9356)


Multi-label classification is becoming increasingly critical in data mining applications. Many efficient methods exist for the classical batch setting, but comparatively few exist for the streaming setting. In this paper, we propose a new methodology for multi-label classification via multi-target regression in a streaming setting and develop iSOUP-Tree, a streaming multi-target regressor that uses this approach. We experimentally evaluated two variants of the iSOUP-Tree algorithm and determined that regression trees are preferable to model trees. Furthermore, we compared our results to the state of the art and found that the iSOUP-Tree method is comparable to other streaming multi-label learners. This motivates the potential use of iSOUP-Tree as a base learner in an ensemble setting.
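The reduction of multi-label classification to multi-target regression can be illustrated with a minimal sketch. It is not the iSOUP-Tree algorithm. It assumes that each label is encoded as a numeric 0/1 target and that a label is predicted when its regressed value reaches a threshold of 0.5; neither detail is stated in the abstract. The class `RunningMeanRegressor` is a hypothetical stand-in for a streaming multi-target regressor, and all names and data are illustrative.

```python
from typing import Iterable, List, Set


def labels_to_targets(label_set: Set[str], all_labels: List[str]) -> List[float]:
    """Encode a label set as one numeric 0/1 target per label (assumed encoding)."""
    return [1.0 if label in label_set else 0.0 for label in all_labels]


def targets_to_labels(targets: Iterable[float], all_labels: List[str],
                      threshold: float = 0.5) -> Set[str]:
    """Decode regressed targets into a label set by thresholding (assumed rule)."""
    return {label for label, value in zip(all_labels, targets) if value >= threshold}


class RunningMeanRegressor:
    """Hypothetical stand-in for a streaming multi-target regressor.

    It ignores the input features and predicts the running mean of each
    target; a real learner would condition its predictions on the features.
    """

    def __init__(self, n_targets: int) -> None:
        self.count = 0
        self.means = [0.0] * n_targets

    def learn_one(self, x, y: List[float]) -> None:
        # Incremental mean update: one pass, constant memory per target.
        self.count += 1
        self.means = [m + (t - m) / self.count for m, t in zip(self.means, y)]

    def predict_one(self, x) -> List[float]:
        return list(self.means)


# Illustrative label space and stream of (features, label set) examples.
ALL_LABELS = ["sports", "politics", "tech"]
stream = [
    ({"f": 1.0}, {"sports"}),
    ({"f": 2.0}, {"sports", "tech"}),
    ({"f": 3.0}, {"sports"}),
]

model = RunningMeanRegressor(len(ALL_LABELS))
for x, label_set in stream:
    # Each example is seen once, as in a streaming setting.
    model.learn_one(x, labels_to_targets(label_set, ALL_LABELS))

predicted = targets_to_labels(model.predict_one({"f": 4.0}), ALL_LABELS)
print(predicted)
```

Because the encoding and decoding steps are independent of the regressor, any streaming multi-target regressor with comparable learn and predict operations could be substituted for the stand-in.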


Keywords: Concept Drift, Binary Relevance, Ranking Loss, Hoeffding Tree, Massive Online Analysis



We would like to acknowledge the support of the EC through the projects MAESTRA (FP7-ICT-612944) and HBP (FP7-ICT-604102), and of the Slovenian Research Agency through a young researcher grant and the program Knowledge Technologies (P2-0103).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Aljaž Osojnik (1, 2)
  • Panče Panov (1)
  • Sašo Džeroski (1, 2, 3)
  1. Jožef Stefan Institute, Ljubljana, Slovenia
  2. Jožef Stefan IPS, Ljubljana, Slovenia
  3. CIPKeBiP, Ljubljana, Slovenia
