Performance Metrics for Model Fusion in Twitter Data Drifts

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10255)

Abstract

Ensemble approaches have shown a remarkable ability to tackle a range of learning challenges, notably in dynamic scenarios with concept drift, such as social networks like Twitter. Considerable effort has gone into defining strategies for combining the models that constitute an ensemble. In this work, we investigate the effect of using different metrics, specifically performance-based metrics, to combine an ensemble's models. We propose five performance-based combining metrics, on the premise that an ensemble can exploit the diversity of its classifiers when each classifier's individual performance plays a leading role in determining its contribution to the ensemble. Experimental results on an artificially timestamped Twitter dataset suggest that using performance metrics to combine the models of an ensemble can yield relevant improvements in overall ensemble performance.
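The abstract does not spell out the five combining metrics, so the following is only a minimal sketch of the general idea of performance-based fusion, not the authors' method. It assumes binary labels in {0, 1} and uses the F1 score on a held-out batch as a stand-in weight for each model; the function names `f1_score` and `weighted_vote` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for the positive class (label 1); returns 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_vote(predictions: Sequence[Sequence[int]],
                  weights: Sequence[float]) -> List[int]:
    """Fuse binary predictions, one inner sequence per model.

    Each model's vote counts in proportion to its weight. A sample is
    labelled 1 when the weight voting for 1 exceeds half the total weight.
    If all weights are zero, fall back to an unweighted majority vote.
    """
    weights = list(weights)
    total = sum(weights)
    if total == 0:
        weights = [1.0] * len(weights)
        total = float(len(weights))
    fused = []
    for votes in zip(*predictions):  # iterate sample by sample
        positive_weight = sum(w for w, v in zip(weights, votes) if v == 1)
        fused.append(1 if positive_weight > total / 2 else 0)
    return fused


# Assumed toy data: true labels of a recent batch and three models' outputs on it.
y_recent = [1, 0, 1, 1, 0, 0]
recent_preds = [
    [1, 0, 1, 1, 0, 0],  # model A: perfect on the recent batch
    [1, 1, 0, 1, 0, 1],  # model B: partly right
    [0, 1, 0, 0, 1, 1],  # model C: wrong throughout
]
weights = [f1_score(y_recent, p) for p in recent_preds]

# Predictions of the same three models on two new, unlabelled samples.
new_preds = [[1, 0], [0, 1], [0, 1]]
fused = weighted_vote(new_preds, weights)
majority = weighted_vote(new_preds, [1.0, 1.0, 1.0])
```

In this toy setup the two weaker models outvote the strongest one under a plain majority vote, whereas weighting by recent performance lets the strongest model decide. Any other performance measure could be substituted for F1 as the weight.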

Acknowledgment

It is also financed by national funding via the Foundation for Science and Technology and by the European Regional Development Fund (FEDER), through the COMPETE 2020 - Operational Program for Competitiveness and Internationalization (POCI).

Corresponding author

Correspondence to Joana Costa.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Costa, J., Silva, C., Antunes, M., Ribeiro, B. (2017). Performance Metrics for Model Fusion in Twitter Data Drifts. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science, vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_2

  • DOI: https://doi.org/10.1007/978-3-319-58838-4_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58837-7

  • Online ISBN: 978-3-319-58838-4

  • eBook Packages: Computer Science, Computer Science (R0)
