Skip to main content

Temporal Methods to Detect Content-Based Anomalies in Social Media

  • Chapter
  • First Online:
From Security to Community Detection in Social Networking Platforms (ASONAM 2017)

Part of the book series: Lecture Notes in Social Networks ((LNSN))

  • 308 Accesses

Abstract

We develop a method for time-dependent topic tracking and meme trending in social media. Our objective is to identify time periods whose content differs significantly from normal, and we utilize two techniques to do so. The first is an information-theoretic analysis of the distributions of terms emitted during different periods of time. In the second, we cluster documents from each time period and analyze the tightness of each clustering. We also discuss a method of combining the scores created by each technique, and we provide ample empirical analysis of our methodology on various Twitter datasets.

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. SAND2017-12340 J.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 119.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Hardcover Book
USD 159.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. AlSumait, L., Barbará, D., Domeniconi, C.: On-line LDA: adaptive topic models for mining text streams with applications to topic detection and tracking. In: Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, pp. 3–12. IEEE, Piscataway (2008)

    Google Scholar 

  2. Banerjee, A., Dhillon, I., Ghosh, J., Sra, S.: Generative model-based clustering of directional data. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 19–28. ACM, New York (2003)

    Google Scholar 

  3. Banerjee, A., Dhillon, I.S., Ghosh, J., Sra, S.: Clustering on the unit hypersphere using von Mises–Fisher distributions. J. Mach. Learn. Res. 6, 1345–1382 (2005)

    MathSciNet  MATH  Google Scholar 

  4. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)

    MATH  Google Scholar 

  5. Blei, D.M., Lafferty, J.D.: Dynamic topic models. In: Proceedings of the 23rd International Conference on Machine learning, pp. 113–120. ACM, New York (2006)

    Google Scholar 

  6. Cheng, Z., Caverlee, J., Lee, K.: You are where you tweet: a content-based approach to geo-locating Twitter users. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 759–768. ACM, New York (2010)

    Google Scholar 

  7. Golbeck, J., Robles, C., Turner, K.: Predicting personality with social media. In: CHI’11 Extended Abstracts on Human Factors in Computing Systems, pp. 253–262. ACM, New York (2011)

    Google Scholar 

  8. Griffiths, T.L., Steyvers, M.: Finding scientific topics. Proc. Natl. Acad. Sci. 101(Suppl. 1), 5228–5235 (2004)

    Article  Google Scholar 

  9. He, D., Parker, D.S.: Topic dynamics: an alternative model of bursts in streams of topics. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 443–452. ACM, New York (2010)

    Google Scholar 

  10. Kleinberg, J.: Bursty and hierarchical structure in streams. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 91–101. ACM, New York (2002)

    Google Scholar 

  11. Lee, K., Palsetia, D., Narayanan, R., Patwary, M.M.A., Agrawal, A., Choudhary, A.: Twitter trending topic classification. In: Proceedings of the 2011 IEEE 11th International Conference on Data Mining Workshops, pp. 251–258. IEEE, Piscataway (2011)

    Google Scholar 

  12. Leskovec, J., Backstrom, L., Kleinberg, J.: Meme-tracking and the dynamics of the news cycle. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 497–506. ACM, New York (2009)

    Google Scholar 

  13. Morinaga, S., Yamanishi, K.: Tracking dynamics of topic trends using a finite mixture model. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 811–816. ACM, New York (2004)

    Google Scholar 

  14. Pennebaker, J.W., Mehl, M.R., Niederhoffer, K.G.: Psychological aspects of natural language use: our words, our selves. Annu. Rev. Psychol. 54(1), 547–577 (2003)

    Article  Google Scholar 

  15. Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), vol. 14, pp. 1532–1543 (2014)

    Google Scholar 

  16. Piantadosi, S.T.: Zipf’s word frequency law in natural language: a critical review and future directions. Psychon. Bull. Rev. 21(5), 1112–1130 (2014)

    Article  Google Scholar 

  17. Simonson, K.: Probabilistic fusion of ATR results. Tech. Rep. SAND98–1699. Sandia National Laboratories (SNL-NM), Albuquerque, NM (1998)

    Google Scholar 

  18. Skryzalin, J., Field, R., Fisher, A., Bauer, T.: Temporal anomaly detection in social media. In: Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 505–508. ACM, New York (2017)

    Google Scholar 

  19. Spinosa, E.J., de Leon F de Carvalho, A.P., Gama, J.: OLINDDA: a cluster-based approach for detecting novelty and concept drift in data streams. In: Proceedings of the 2007 ACM Symposium on Applied Computing, pp. 448–452. ACM, New York (2007)

    Google Scholar 

  20. Sra, S.: A short note on parameter approximation for von Mises–Fisher distributions: and a fast implementation of i s(x). Comput. Stat. 27(1), 177–190 (2012)

    Article  MathSciNet  Google Scholar 

  21. Swan, R., Allan, J.: Extracting significant time varying features from text. In: Proceedings of the 8th International Conference on Information Knowledge Management, pp. 38–45. ACM, New York (1999)

    Google Scholar 

  22. Tausczik, Y.R., Pennebaker, J.W.: The psychological meaning of words: LIWC and computerized text analysis methods. J. Lang. Soc. Psychol. 29(1), 24–54 (2010)

    Article  Google Scholar 

  23. Tumasjan, A., Sprenger, T.O., Sandner, P.G., Welpe, I.M.: Predicting elections with Twitter: what 140 characters reveal about political sentiment. ICWSM 10(1), 178–185 (2010)

    Google Scholar 

  24. Wang, X., McCallum, A.: Topics over time: a non-Markov continuous-time model of topical trends. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 424–433. ACM, New York (2006)

    Google Scholar 

  25. Wang, C., Blei, D., Heckerman, D.: Continuous time dynamic topic models. In: Uncertainty in Artificial Intelligence (UAI). pp. 579–586 (2008)

    Google Scholar 

  26. Yang, J., Leskovec, J.: Patterns of temporal variation in online media. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 177–186. ACM, New York (2011)

    Google Scholar 

  27. Zhang, X., Shasha, D.: Better burst detection. In: Proceedings of the 22nd International Conference on Data Engineering, p. 146. IEEE, Piscataway (2006)

    Google Scholar 

  28. Zhu, Y., Shasha, D.: Efficient elastic burst detection in data streams. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 336–345. ACM, New York (2003)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Richard Field Jr. .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Check for updates. Verify currency and authenticity via CrossMark

Cite this chapter

Skryzalin, J., Field, R., Fisher, A., Bauer, T. (2019). Temporal Methods to Detect Content-Based Anomalies in Social Media. In: Karampelas, P., Kawash, J., Özyer, T. (eds) From Security to Community Detection in Social Networking Platforms. ASONAM 2017. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-11286-8_10

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-11286-8_10

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11285-1

  • Online ISBN: 978-3-030-11286-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics