Soft Computing, Volume 21, Issue 2, pp 391–401

A semantic frame-based intelligent agent for topic detection

  • Yung-Chun Chang
  • Yu-Lun Hsieh
  • Cen-Chieh Chen
  • Wen-Lian Hsu

Abstract

Detecting the topic of a document can help readers construct the background of that topic and facilitates document comprehension. In this paper, we propose a semantic frame-based topic detection (SFTD) method that simulates this process in human perception. We take advantage of multiple knowledge sources and extract discriminative patterns from documents through highly automated, knowledge-supported frame generation and matching mechanisms. Using a Chinese news corpus containing over 111,000 news articles, we provide a comprehensive performance evaluation, which demonstrates that our approach can effectively detect the topic of a document by exploiting the syntactic structures, semantic associations, and context within the text. Experimental results show that SFTD is comparable to other well-known topic detection methods.

Keywords

Topic detection, Semantic frame, Semantic class, Partial matching

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Institute of Information Science, Academia Sinica, Taipei, Taiwan
  2. Department of Information Management, National Taiwan University, Taipei, Taiwan
  3. Department of Computer Science, National Chengchi University, Taipei, Taiwan
