Detecting the topic of a document can help readers construct the background of the topic and facilitates document comprehension. In this paper, we propose a semantic frame-based topic detection (SFTD) method that simulates this process in human perception. We take advantage of multiple knowledge sources and extract discriminative patterns from documents through highly automated, knowledge-supported frame generation and matching mechanisms. Using a Chinese news corpus containing over 111,000 news articles, we provide a comprehensive performance evaluation, which demonstrates that our novel approach can effectively detect the topic of a document by exploiting the syntactic structures, semantic associations, and context within the text. Experimental results show that the performance of SFTD is comparable to that of other well-known topic detection methods.
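The abstract does not spell out how frames are generated or matched, so the following is only a toy sketch of the general idea of matching a document against topic frames made of semantic classes. Everything in it is an assumption made for illustration: the lexicon `SEMANTIC_CLASSES`, the frames in `TOPIC_FRAMES`, the function names, and the use of a longest-common-subsequence ratio as a stand-in for partial matching. It is not the SFTD implementation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lexicon mapping words to semantic classes (illustration only).
SEMANTIC_CLASSES: Dict[str, str] = {
    "team": "ORGANIZATION",
    "club": "ORGANIZATION",
    "bank": "ORGANIZATION",
    "won": "WIN",
    "beat": "WIN",
    "match": "GAME",
    "final": "GAME",
    "raised": "INCREASE",
    "rates": "FINANCE_TERM",
    "interest": "FINANCE_TERM",
}

# Hypothetical frames: each topic owns ordered sequences of semantic-class slots.
TOPIC_FRAMES: Dict[str, List[List[str]]] = {
    "sports": [["ORGANIZATION", "WIN", "GAME"]],
    "finance": [["ORGANIZATION", "INCREASE", "FINANCE_TERM"]],
}


def label_tokens(tokens: List[str]) -> List[str]:
    """Replace each known token by its semantic class; drop unknown tokens."""
    return [SEMANTIC_CLASSES[t] for t in tokens if t in SEMANTIC_CLASSES]


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def frame_score(frame: List[str], labels: List[str]) -> float:
    """Fraction of frame slots found, in order, in the labeled document."""
    return lcs_length(frame, labels) / len(frame)


def detect_topic(tokens: List[str]) -> Tuple[Optional[str], float]:
    """Return the topic whose best frame matches the tokens most completely."""
    labels = label_tokens(tokens)
    best_topic: Optional[str] = None
    best_score = 0.0
    for topic, frames in TOPIC_FRAMES.items():
        for frame in frames:
            score = frame_score(frame, labels)
            if score > best_score:
                best_topic, best_score = topic, score
    return best_topic, best_score


if __name__ == "__main__":
    print(detect_topic("the home team won the final".split()))
    print(detect_topic("the club won".split()))
```

In this sketch a document that fills only some slots of a frame still receives a nonzero score, which is the sense in which the match is partial. A real system would need a far richer lexicon and frames learned from data rather than written by hand.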
This research was supported by the Ministry of Science and Technology of Taiwan under grant MOST 103-3111-Y-001-027.
This research was supported by the National Science Council of Taiwan under grants NSC102-3111-Y-001-012, NSC102-3113-P-001-006, and NSC 102-3114-Y-307-026.
Communicated by C.-S. Lee.
Cite this article
Chang, YC., Hsieh, YL., Chen, CC. et al. A semantic frame-based intelligent agent for topic detection. Soft Comput 21, 391–401 (2017). https://doi.org/10.1007/s00500-015-1695-4
Keywords
- Topic detection
- Semantic frame
- Semantic class
- Partial matching