Performance Metrics for Model Fusion in Twitter Data Drifts

Costa, Joana; Silva, Catarina; Antunes, Mário; Ribeiro, Bernardete

doi:10.1007/978-3-319-58838-4_2

Conference paper
First Online: 12 May 2017

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10255)

Abstract

Ensemble approaches have shown a remarkable ability to tackle different learning challenges, notably in dynamic scenarios with concept drift, such as social networks like Twitter. Several efforts have been devoted to defining strategies for combining the models that constitute an ensemble. In this work, we investigate the effect of using different metrics to combine an ensemble's models, specifically performance-based metrics. We propose five performance-based combining metrics, bearing in mind that we may take advantage of the diversity among classifiers, since each classifier's individual performance plays a leading role in defining its contribution to the ensemble. Experimental results on an artificially timestamped Twitter dataset suggest that using performance metrics to combine the models that constitute an ensemble can introduce relevant improvements in overall ensemble performance.
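The general idea of performance-based combination can be illustrated with a minimal sketch. The abstract does not name the five proposed metrics, so the example below assumes F1 on a held-out validation set as the weighting metric; the function names `f1_score` and `weighted_vote` and all data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def f1_score(y_true, y_pred, positive=1):
    """F1 of the positive class; used here as an assumed weighting metric."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_vote(predictions, weights):
    """Fuse per-model label lists; each model's vote counts `weight` times."""
    fused = []
    for i in range(len(predictions[0])):
        scores = defaultdict(float)
        for model_preds, weight in zip(predictions, weights):
            scores[model_preds[i]] += weight
        fused.append(max(scores, key=scores.get))
    return fused


# Hypothetical validation labels and the predictions of three models.
y_val = [1, 0, 1, 1, 0]
val_preds = [
    [1, 0, 1, 1, 0],  # model A: perfect on validation
    [0, 1, 0, 1, 0],  # model B
    [0, 1, 1, 0, 1],  # model C
]
weights = [f1_score(y_val, preds) for preds in val_preds]

# Hypothetical predictions of the same models on three new samples.
new_preds = [[1, 0, 0], [0, 1, 0], [0, 1, 1]]
fused = weighted_vote(new_preds, weights)          # [1, 0, 0]
majority = weighted_vote(new_preds, [1.0] * 3)     # [0, 1, 0]
```

In this toy data the stronger model A outvotes the two weaker models on the first two samples, whereas unweighted majority voting follows B and C. Any other performance measure could be substituted for `f1_score` without changing `weighted_vote`.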

Acknowledgment

It is also financed by national funding via the Foundation for Science and Technology and by the European Regional Development Fund (FEDER), through the COMPETE 2020 - Operational Program for Competitiveness and Internationalization (POCI).

Author information

Authors and Affiliations

School of Technology and Management, Polytechnic Institute of Leiria, Leiria, Portugal
Joana Costa, Catarina Silva & Mário Antunes
Department of Informatics Engineering, Center for Informatics and Systems of the University of Coimbra (CISUC), Coimbra, Portugal
Joana Costa, Catarina Silva & Bernardete Ribeiro
Center for Research in Advanced Computing Systems, INESC-TEC, University of Porto, Porto, Portugal
Mário Antunes

Corresponding author

Correspondence to Joana Costa.

Editor information

Editors and Affiliations

Universidade da Beira Interior, Covilhã, Portugal
Luís A. Alexandre
University Jaume I, Castellón, Spain
José Salvador Sánchez
University of the Algarve, Faro, Portugal
João M. F. Rodrigues

Cite this paper

Costa, J., Silva, C., Antunes, M., Ribeiro, B. (2017). Performance Metrics for Model Fusion in Twitter Data Drifts. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science, vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_2

DOI: https://doi.org/10.1007/978-3-319-58838-4_2
Published: 12 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58837-7
Online ISBN: 978-3-319-58838-4
eBook Packages: Computer Science, Computer Science (R0)
