Abstract
The volume of content generated on social media far exceeds the human capacity to process it manually and identify topics of interest. Various automated topic detection approaches have been proposed in response, most of them based on document clustering and burst detection. These approaches normally represent textual features in a standard n-dimensional Euclidean metric space, where directly filtering out noisy documents is difficult. Instead, we propose Topol, a topic detection method based on Topology Data Analysis (TDA) that transforms the Euclidean feature space into a topological space in which the shapes formed by noisy, irrelevant documents are much easier to distinguish from those formed by topically relevant documents. This topological space is organised as a network according to the connectivity of its points, i.e. the documents, and by filtering only on the size of the connected components we obtain results competitive with other state-of-the-art topic detection methods.
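The final filtering step described above, keeping only sufficiently large connected components of a document network, can be illustrated with a minimal sketch. This is not the Topol implementation: the abstract does not specify how the topological network is built, so the sketch simply links two documents when their cosine distance is below a threshold. The function names, the toy vectors, and the parameters `epsilon` and `min_size` are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def cosine_distance(a, b):
    """Cosine distance between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


def connected_components(vectors, epsilon):
    """Group document indices into connected components.

    Two documents are linked when their cosine distance is at most
    `epsilon` (an assumed stand-in for the paper's network construction).
    Components are found with a simple union-find structure.
    """
    n = len(vectors)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine_distance(vectors[i], vectors[j]) <= epsilon:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return list(groups.values())


def filter_by_size(components, min_size):
    """Keep only components with at least `min_size` documents."""
    return [c for c in components if len(c) >= min_size]


# Toy term-weight vectors: three documents on one topic, two isolated ones.
docs = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [1.0, 1.0, 0.0, 0.1],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.1, 0.0, 1.0],
]
topics = filter_by_size(connected_components(docs, epsilon=0.2), min_size=2)
print(topics)
```

In this toy example the two isolated documents form singleton components and are discarded as noise, while the three mutually similar documents survive as one candidate topic. The pairwise comparison is quadratic in the number of documents, so it is only suitable for small illustrations.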
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Torres-Tramón, P., Hromic, H., Heravi, B.R. (2015). Topic Detection in Twitter Using Topology Data Analysis. In: Daniel, F., Diaz, O. (eds) Current Trends in Web Engineering. ICWE 2015. Lecture Notes in Computer Science(), vol 9396. Springer, Cham. https://doi.org/10.1007/978-3-319-24800-4_16
DOI: https://doi.org/10.1007/978-3-319-24800-4_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24799-1
Online ISBN: 978-3-319-24800-4
eBook Packages: Computer Science (R0)