A Statistical Approach to the Discovery of Ephemeral Associations among News Topics
News reports are an important source of information about society. Analyzing them makes it possible to understand society's current interests and to measure the social importance and influence of different events. In this paper, we use news analysis as a means of exploring the interests of society. We focus on a very common phenomenon in news: the influence of peak news topics on other current news topics. We propose a simple statistical text mining method to analyze such influences. We differentiate between the observable associations (those discovered from the newspapers) and the real-world associations, and propose a technique by which the real ones can be inferred from the observable ones. We illustrate the method with results from preliminary experiments and argue that the discovery of ephemeral associations can be translated into knowledge about the interests of society and social behavior.
Keywords: Normalization Effect, News Report, Observable Frequency, Society Interest, News Topic