A Statistical Approach to the Discovery of Ephemeral Associations among News Topics

  • M. Montes-y-Gómez
  • A. Gelbukh
  • A. López-López
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2113)

Abstract

News reports are an important source of information about society. Their analysis allows understanding its current interests and measuring the social importance and influence of different events. In this paper, we use the analysis of news as a means to explore society's interests. We focus on the study of a very common phenomenon in news: the influence of peak news topics on other current news topics. We propose a simple statistical text mining method to analyze such influences. We differentiate between the observable associations (those discovered from the newspapers) and the real-world associations, and propose a technique by which the real ones can be inferred from the observable ones. We illustrate the method with some results obtained from preliminary experiments and argue that the discovery of ephemeral associations can be translated into knowledge about the interests of society and social behavior.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • M. Montes-y-Gómez (1)
  • A. Gelbukh (1)
  • A. López-López (2)
  1. Center for Computing Research (CIC), National Polytechnic Institute (IPN), Mexico
  2. Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Puebla, Mexico