Advances in Information Retrieval
Volume 5478 of the series Lecture Notes in Computer Science pp 776-780
Topic and Trend Detection in Text Collections Using Latent Dirichlet Allocation
- Levent Bolelli, Google Inc.
- Şeyda Ertekin, Department of Computer Science and Engineering, Pennsylvania State University
- C. Lee Giles, College of Information Sciences and Technology, Pennsylvania State University
Abstract
Algorithms that automatically mine distinct topics from document collections have become increasingly important, both because of their applications in many fields and because of the rapid growth in the number of documents across domains. In this paper, we propose a generative model based on latent Dirichlet allocation that integrates the temporal ordering of the documents into the generative process in an iterative fashion. The document collection is divided into time segments, and the topics discovered in each segment are propagated to influence topic discovery in the subsequent segments. Our experimental results on a collection of academic papers from the CiteSeer repository show that the segmented topic model can effectively detect distinct topics and their evolution over time.
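The abstract does not say how topics are propagated between segments, so the following is only a minimal sketch of one plausible reading: plain LDA is fitted to each time segment by collapsed Gibbs sampling, and the topic-word counts of one segment are carried forward as Dirichlet pseudo-counts for the next. The function names `gibbs_lda` and `segmented_lda`, the `carry` weight, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, word_prior, alpha=0.1, iters=30, seed=0):
    """Collapsed Gibbs sampling for LDA with a per-topic, per-word prior.

    docs: list of documents, each a list of integer word ids.
    word_prior: num_topics x vocab_size matrix of positive pseudo-counts.
    Returns the num_topics x vocab_size matrix of topic-word counts.
    """
    rng = random.Random(seed)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    doc_topic = [[0] * num_topics for _ in docs]
    prior_total = [sum(row) for row in word_prior]

    def move(d, w, z, delta):
        # Add or remove one token of word w in document d from topic z.
        topic_word[z][w] += delta
        topic_total[z] += delta
        doc_topic[d][z] += delta

    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            move(d, w, z, +1)
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                move(d, w, assign[d][i], -1)
                # Conditional topic weights given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + word_prior[k][w])
                    / (topic_total[k] + prior_total[k])
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                move(d, w, z, +1)
    return topic_word


def segmented_lda(segments, num_topics, vocab_size, base_prior=0.01, carry=1.0, **lda_kwargs):
    """Fit LDA segment by segment, in temporal order.

    Assumption (not from the paper): the prior for the next segment is a
    flat base prior plus `carry` times the counts learned in this segment.
    """
    prior = [[base_prior] * vocab_size for _ in range(num_topics)]
    history = []
    for docs in segments:
        counts = gibbs_lda(docs, num_topics, vocab_size, prior, **lda_kwargs)
        history.append(counts)
        prior = [[base_prior + carry * c for c in row] for row in counts]
    return history


# Toy data: two time segments, two documents each, six-word vocabulary.
segments = [
    [[0, 1, 0, 1], [2, 3, 2, 3]],
    [[0, 1, 1, 4], [2, 3, 3, 5]],
]
history = segmented_lda(segments, num_topics=2, vocab_size=6)
```

Here `history[t]` holds the topic-word counts for segment `t`, so comparing row `k` across segments gives a rough view of how topic `k` drifts over time. Setting `carry=0` makes every segment independent, which is a convenient baseline for seeing what the propagation contributes.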
- Title
- Topic and Trend Detection in Text Collections Using Latent Dirichlet Allocation
- Book Title
- Advances in Information Retrieval
- Book Subtitle
- 31st European Conference on IR Research, ECIR 2009, Toulouse, France, April 6-9, 2009. Proceedings
- Pages
- pp 776-780
- Copyright
- 2009
- DOI
- 10.1007/978-3-642-00958-7_84
- Print ISBN
- 978-3-642-00957-0
- Online ISBN
- 978-3-642-00958-7
- Series Title
- Lecture Notes in Computer Science
- Series Volume
- 5478
- Series ISSN
- 0302-9743
- Publisher
- Springer Berlin Heidelberg
- Copyright Holder
- Springer-Verlag Berlin Heidelberg
- Editors
- Mohand Boughanem (16)
- Catherine Berrut (17)
- Josiane Mothe (18)
- Chantal Soule-Dupuy (18)
- Editor Affiliations
- 16. Université de Toulouse - IRIT
- 17. Laboratoire d’Informatique de Grenoble, BP 53, Université Joseph Fourier
- 18. Université de Toulouse - IRIT
- Authors
- Levent Bolelli (19)
- Şeyda Ertekin (20)
- C. Lee Giles (21)
- Author Affiliations
- 19. Google Inc., 76 9th Ave., 4th floor, New York, NY 10011, USA
- 20. Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16802, USA
- 21. College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, 16802, USA