Topic Detection in Twitter Using Topology Data Analysis

  • Pablo Torres-Tramón
  • Hugo Hromic
  • Bahareh Rahmanzadeh Heravi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9396)

Abstract

The massive volume of content generated by social media greatly exceeds the human capacity to process it manually in order to identify topics of interest. As a solution, various automated topic detection approaches have been proposed, most of which are based on document clustering and burst detection. These approaches normally represent textual features in standard n-dimensional Euclidean metric spaces. However, in such spaces, directly filtering out noisy documents is challenging. Instead, we propose Topol, a topic detection method based on Topology Data Analysis (TDA) that transforms the Euclidean feature space into a topological space where the shapes of noisy, irrelevant documents are much easier to distinguish from those of topically relevant documents. This topological space is organised as a network according to the connectivity of the points, i.e. the documents, and by filtering only on the size of the connected components we obtain results competitive with other state-of-the-art topic detection methods.
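The filtering idea in the abstract (link documents into a network, then keep only sufficiently large connected components) can be illustrated with a minimal sketch. This is not the Topol implementation: the abstract does not say how the topological network is built, so the sketch substitutes a plain cosine-similarity threshold graph over term-frequency vectors. The function names, the `threshold` and `min_size` parameters, and the sample documents are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def connected_components(n, edges):
    """Group node indices 0..n-1 into connected components (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return list(groups.values())


def filter_by_component_size(docs, threshold=0.4, min_size=3):
    """Assumed stand-in for the network step: connect documents whose
    cosine similarity reaches `threshold`, then keep only components
    containing at least `min_size` documents."""
    vectors = [Counter(d.lower().split()) for d in docs]
    edges = [
        (i, j)
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
        if cosine(vectors[i], vectors[j]) >= threshold
    ]
    components = connected_components(len(docs), edges)
    return [c for c in components if len(c) >= min_size]


# Hypothetical sample: three related documents and two unrelated ones.
docs = [
    "earthquake hits city centre",
    "strong earthquake hits the city",
    "earthquake city damage reported",
    "i love my cat",
    "lunch was great today",
]
topics = filter_by_component_size(docs)
print(topics)
```

In this toy input the three earthquake documents form one component, while the two unrelated documents remain isolated and are discarded by the size filter.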

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Pablo Torres-Tramón (1)
  • Hugo Hromic (1)
  • Bahareh Rahmanzadeh Heravi (1)

  1. Insight Centre for Data Analytics, National University of Ireland, Galway, Ireland
