Joint European Conference on Machine Learning and Knowledge Discovery in Databases

ECML PKDD 2015: Machine Learning and Knowledge Discovery in Databases pp 401-416

Ageing-Based Multinomial Naive Bayes Classifiers Over Opinionated Data Streams

  • Sebastian Wagner
  • Max Zimmermann
  • Eirini Ntoutsi
  • Myra Spiliopoulou
Conference paper

DOI: 10.1007/978-3-319-23528-8_25

Part of the Lecture Notes in Computer Science book series (LNCS, volume 9284)
Cite this paper as:
Wagner S., Zimmermann M., Ntoutsi E., Spiliopoulou M. (2015) Ageing-Based Multinomial Naive Bayes Classifiers Over Opinionated Data Streams. In: Appice A., Rodrigues P., Santos Costa V., Soares C., Gama J., Jorge A. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science, vol 9284. Springer, Cham

Abstract

The long-term analysis of opinionated streams requires algorithms that predict the polarity of opinionated documents while adapting to different forms of concept drift: not only the class distribution but also the vocabulary used by the document authors may change. One of the key properties of a stream classifier is adaptation to concept drifts and shifts; this is typically achieved through ageing of the data. Surprisingly, for one of the most popular classifiers, Multinomial Naive Bayes (MNB), no ageing has been considered thus far. MNB is particularly appropriate for opinionated streams because it allows the seamless adjustment of word probabilities as new words appear for the first time. However, to adapt properly to drift, MNB must also be extended to take the age of documents and words into account.

In this study, we incorporate ageing into the learning process of MNB by introducing the notion of fading for words, on the basis of the recency of the documents containing them. We propose two fading versions, gradual fading and aggressive fading, of which the latter discards old data at a faster pace. Our experiments with Twitter data show that the ageing-based MNBs outperform the standard accumulative MNB approach and recover very quickly in times of change. We experiment with different data granularities in the stream and different degrees of data ageing, and we show how they “work together” towards adaptation to change.
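
The abstract does not state the fading formulas, so the following Python sketch is only a rough illustration of the general idea, not the authors' gradual or aggressive variant. It assumes that every class count and word-class count decays exponentially with the time since it was last updated. The class name `FadingMNB`, the `decay` parameter, the Laplace-style smoothing, and the toy stream are all hypothetical choices made for this example.

```python
import math


class FadingMNB:
    """Multinomial Naive Bayes with exponentially faded counts (illustrative only)."""

    def __init__(self, decay):
        self.decay = decay    # assumed decay rate; 0 behaves like an accumulative MNB
        self.counts = {}      # faded count per key, valid at the key's last update time
        self.seen = {}        # time at which each key was last updated
        self.classes = set()
        self.vocab = set()

    def _faded(self, key, now):
        # Count as of `now`: the stored value shrinks with the time since its last update.
        if key not in self.seen:
            return 0.0
        return self.counts[key] * math.exp(-self.decay * (now - self.seen[key]))

    def _bump(self, key, now):
        # Fade the old evidence first, then add the new observation at full weight.
        self.counts[key] = self._faded(key, now) + 1.0
        self.seen[key] = now

    def learn(self, words, label, now):
        self.classes.add(label)
        self._bump(("class", label), now)
        for w in words:
            self.vocab.add(w)
            self._bump(("word", w, label), now)

    def predict(self, words, now):
        priors = {c: self._faded(("class", c), now) for c in self.classes}
        total_docs = sum(priors.values())
        best, best_score = None, -math.inf
        for c in sorted(self.classes):
            total_words = sum(self._faded(("word", w, c), now) for w in self.vocab)
            # Add-one smoothing; the extra +1 in the denominator reserves mass for unseen words.
            score = math.log((priors[c] + 1.0) / (total_docs + len(self.classes)))
            for w in words:
                score += math.log((self._faded(("word", w, c), now) + 1.0)
                                  / (total_words + len(self.vocab) + 1))
            if score > best_score:
                best, best_score = c, score
        return best


# Toy stream: "sick" first appears in negative documents, later in positive ones.
stream = [(t, ["sick", "tired"], "neg") for t in range(10)]
stream += [(t, ["sick", "beat"], "pos") for t in (20, 21, 22)]

accumulative, fading = FadingMNB(decay=0.0), FadingMNB(decay=0.5)
for t, words, label in stream:
    accumulative.learn(words, label, t)
    fading.learn(words, label, t)

print(accumulative.predict(["sick"], now=23))
print(fading.predict(["sick"], now=23))
```

On this toy stream the model without decay still labels "sick" as negative, because the ten old documents outweigh the three recent ones, while the model with decay follows the recent usage and labels it positive. A larger `decay` value forgets old documents faster, which loosely mirrors the trade-off between slower and faster ageing described above.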

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Sebastian Wagner (1)
  • Max Zimmermann (2)
  • Eirini Ntoutsi (1)
  • Myra Spiliopoulou (2)
  1. Ludwig-Maximilians University of Munich (LMU), Munich, Germany
  2. Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
