DOTS: Drift Oriented Tool System

Costa, Joana; Silva, Catarina; Antunes, Mário; Ribeiro, Bernardete

doi:10.1007/978-3-319-26561-2_72

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9492)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Drift is a given in most machine learning applications. The idea that models must accommodate changes, and thus be dynamic, is ubiquitous. Current challenges include temporal data streams, drift and non-stationary scenarios, often with text data, whether in social networks or in business systems. There are multiple types of drift pattern: concepts may appear and disappear suddenly, recurrently, gradually, or incrementally. Researchers strive to propose and test algorithms and techniques to deal with drift in text classification, but it is difficult to find adequate benchmarks in such dynamic environments.
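As a rough illustration of the drift patterns named above, the following Python sketch builds simple presence profiles of a concept over discrete time steps. It is not taken from DOTS: the function names, the parameters, and the choice to model a concept as a presence level between 0 and 1 are all assumptions made for this example, and the linear ramp is only a crude stand-in for gradual or incremental change.

```python
from typing import List


def sudden_profile(steps: int, start: int, end: int) -> List[float]:
    """Concept is fully present in [start, end) and absent elsewhere (assumed model)."""
    return [1.0 if start <= t < end else 0.0 for t in range(steps)]


def ramp_profile(steps: int, start: int, end: int) -> List[float]:
    """Presence rises linearly from 0 at `start` to 1 at `end`.

    A simplified stand-in for gradual or incremental drift (assumption).
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    profile = []
    for t in range(steps):
        fraction = (t - start) / (end - start)
        # Clamp to [0, 1] so the profile is flat before and after the ramp.
        profile.append(min(1.0, max(0.0, fraction)))
    return profile


def recurrent_profile(steps: int, period: int, active: int) -> List[float]:
    """Concept reappears for `active` steps at the start of every `period` steps."""
    if period <= 0:
        raise ValueError("period must be positive")
    return [1.0 if t % period < active else 0.0 for t in range(steps)]
```

Calling, for instance, `sudden_profile(6, 2, 4)` marks the concept as present only at steps 2 and 3. A dataset generator could multiply such a profile by a base number of documents per time window, but how DOTS itself parameterises drift is not described in this abstract.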

In this paper we present DOTS, Drift Oriented Tool System, a framework that allows for the definition and generation of text-based datasets where drift characteristics can be thoroughly defined, implemented and tested. The usefulness of DOTS is demonstrated using a Twitter stream case study, in which DOTS is used to define datasets and to test the effectiveness of different document representations in a Twitter scenario. Results show the potential of DOTS in machine learning research.

Notes

1. http://www.lemurproject.org/

Acknowledgment

We gratefully acknowledge the iCIS project (CENTRO-07-ST24-FEDER - 107002003).

Author information

Authors and Affiliations

School of Technology and Management, Polytechnic Institute of Leiria, Leiria, Portugal
Joana Costa, Catarina Silva & Mário Antunes
Department of Informatics Engineering, Center for Informatics and Systems of the University of Coimbra (CISUC), Coimbra, Portugal
Joana Costa, Catarina Silva & Bernardete Ribeiro
Center for Research in Advanced Computing Systems, INESC-TEC, University of Porto, Porto, Portugal
Mário Antunes

Corresponding author

Correspondence to Joana Costa.

Editor information

Editors and Affiliations

University of Istanbul, Istanbul, Turkey
Sabri Arik
University at Qatar, Doha, Qatar
Tingwen Huang
Tunku Abdul Rahman University College, Kuala Lumpur, Malaysia
Weng Kin Lai
University of Science Technology, Wuhan, China
Qingshan Liu

Cite this paper

Costa, J., Silva, C., Antunes, M., Ribeiro, B. (2015). DOTS: Drift Oriented Tool System. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9492. Springer, Cham. https://doi.org/10.1007/978-3-319-26561-2_72

DOI: https://doi.org/10.1007/978-3-319-26561-2_72
Published: 18 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26560-5
Online ISBN: 978-3-319-26561-2
eBook Packages: Computer Science, Computer Science (R0)