Topic Tracking Based on Keywords Dependency Profile

  • Conference paper
Information Retrieval Technology (AIRS 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4993)

Abstract

Topic tracking is an important task in Topic Detection and Tracking (TDT). Its purpose is to detect, in a stream of news, the stories related to known topics. Each topic is “known” through its association with several sample stories that discuss it. In this paper, we propose a new method that builds a keywords dependency profile (KDP) for each story and tracks topics based on the similarity between the profiles of the topic and the story. In this method, the keywords of a story are selected with document summarization technology, and the KDP is built from the co-occurrence frequency of keywords within the same sentences of the story. We demonstrate that this profile accurately describes the core events in a story. Experiments on the Mandarin resources of TDT4 and TDT5 show that a topic tracking system based on KDP improves performance by 13.25% on the training dataset and 7.49% on the testing dataset compared with the baseline.
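
The idea of a profile built from sentence-level keyword co-occurrence can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weighting scheme or the similarity measure, so raw pair counts and cosine similarity are assumed here, and the function names `build_kdp` and `cosine_similarity` are hypothetical. Keywords are supplied by hand, whereas the paper selects them with document summarization technology, and sentences are assumed to be already tokenized.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def build_kdp(sentences, keywords):
    """Count how often each unordered keyword pair shares a sentence.

    `sentences` is a list of token lists; `keywords` is any iterable of
    strings. Raw counts are an assumption, not the paper's weighting.
    """
    keyword_set = set(keywords)
    profile = Counter()
    for tokens in sentences:
        # Keywords present in this sentence, sorted so each pair has one key.
        present = sorted(keyword_set.intersection(tokens))
        for pair in combinations(present, 2):
            profile[pair] += 1
    return profile


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two pair-count profiles (assumed measure)."""
    dot = sum(count * profile_b[pair] for pair, count in profile_a.items())
    norm_a = sqrt(sum(c * c for c in profile_a.values()))
    norm_b = sqrt(sum(c * c for c in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile shares nothing with any other profile
    return dot / (norm_a * norm_b)


# Toy data, invented for illustration only.
keywords = ["earthquake", "city", "rescue"]
topic_sentences = [
    ["earthquake", "hits", "city"],
    ["rescue", "teams", "reach", "city", "after", "earthquake"],
]
story_sentences = [["earthquake", "damages", "city"]]
unrelated_sentences = [["market", "rises"]]

topic_kdp = build_kdp(topic_sentences, keywords)
story_kdp = build_kdp(story_sentences, keywords)
unrelated_kdp = build_kdp(unrelated_sentences, keywords)

on_topic_score = cosine_similarity(topic_kdp, story_kdp)
off_topic_score = cosine_similarity(topic_kdp, unrelated_kdp)
```

In this toy example the on-topic story scores higher than the unrelated one because it repeats the ("city", "earthquake") pair that dominates the topic profile. A tracking system would compare such a score against a threshold, which the abstract does not specify.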

Editor information

Hang Li, Ting Liu, Wei-Ying Ma, Tetsuya Sakai, Kam-Fai Wong, Guodong Zhou

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zheng, W., Zhang, Y., Hong, Y., Fan, J., Liu, T. (2008). Topic Tracking Based on Keywords Dependency Profile. In: Li, H., Liu, T., Ma, WY., Sakai, T., Wong, KF., Zhou, G. (eds) Information Retrieval Technology. AIRS 2008. Lecture Notes in Computer Science, vol 4993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68636-1_13

  • DOI: https://doi.org/10.1007/978-3-540-68636-1_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68633-0

  • Online ISBN: 978-3-540-68636-1

  • eBook Packages: Computer Science (R0)
