Advances in Data Analysis and Classification, Volume 11, Issue 1, pp 159–178

Evaluation of the evolution of relationships between topics over time

  • Wolfgang Gaul
  • Dominique Vincent
Regular Article

Abstract

Topics that attract public attention can originate from current events or developments, might be influenced by situations in the past, and often continue to be of interest in the future. When respective information is made available textually, one possibility of detecting such topics of public importance consists in scrutinizing, e.g., appropriate press articles using—given the continual growth of information—text processing techniques enriched by computer routines which examine present-day textual material, check historical publications, find newly emerging topics, and are able to track topic trends over time. Information clustering based on content-(dis)similarity of the underlying textual material and graph-theoretical considerations to deal with the network of relationships between content-similar topics are described and combined in a new approach. Explanatory examples of topic detection and tracking in online news articles illustrate the usefulness of the approach in different situations.
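As an illustration only, the idea of grouping texts by content-(dis)similarity and then reading the groups off a graph can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' method: the bag-of-words representation, the cosine measure, the threshold value, and the function names `term_vector`, `cosine_dissimilarity`, and `cluster_by_threshold` are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def term_vector(text):
    """Lower-cased bag-of-words term frequencies (an assumed representation)."""
    return Counter(text.lower().split())


def cosine_dissimilarity(a, b):
    """Return 1 minus the cosine similarity of two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    # An empty document shares nothing with any other document.
    return 1.0 if norm == 0 else 1.0 - dot / norm


def cluster_by_threshold(texts, max_dissimilarity=0.5):
    """Group documents as connected components of the graph that links
    every pair whose dissimilarity is at most max_dissimilarity
    (the threshold is an assumption chosen for illustration)."""
    vectors = [term_vector(t) for t in texts]
    n = len(vectors)
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if cosine_dissimilarity(vectors[i], vectors[j]) <= max_dissimilarity:
                neighbours[i].add(j)
                neighbours[j].add(i)

    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(sorted(component))
    return clusters


articles = [
    "flood warning river city",
    "river flood city warning issued",
    "football cup final result",
]
print(cluster_by_threshold(articles))
```

In this toy input the two flood articles share most of their terms and end up in one group, while the football article forms a group of its own. Tracking topics over time would additionally require comparing such groups across successive time windows, which this sketch does not attempt.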

Keywords

  • Topic relationships
  • Topic trend detection
  • Text processing
  • Content-(dis)similarity
  • Information clustering

Mathematics Subject Classification

01-08 62H30 68M11 68P10 68U15 68W27 90C35 91C20 

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Institute of Information Systems and Marketing, Karlsruhe Institute of Technology, Karlsruhe, Germany