A semantic frame-based intelligent agent for topic detection

Abstract

Detecting the topic of a document helps readers construct the background of that topic and facilitates document comprehension. In this paper, we propose a semantic frame-based topic detection (SFTD) method that simulates this process in human perception. We take advantage of multiple knowledge sources and extract discriminative patterns from documents through highly automated, knowledge-supported frame generation and matching mechanisms. Using a Chinese news corpus containing over 111,000 news articles, we provide a comprehensive performance evaluation, which demonstrates that our novel approach can effectively detect the topic of a document by exploiting the syntactic structures, semantic associations, and context within the text. Experimental results show that the performance of SFTD is comparable to that of other well-known topic detection methods.

Notes

  1. http://www-nlp.stanford.edu/ner/.

Fig. 2: Architecture of named entity ontology

Acknowledgments

This research was supported by the Ministry of Science and Technology of Taiwan under grant MOST 103-3111-Y-001-027.

Author information

Corresponding author

Correspondence to Yung-Chun Chang.

Additional information

This research was supported by the National Science Council of Taiwan under Grants NSC102-3111-Y-001-012, NSC102-3113-P-001-006, and NSC 102-3114-Y-307-026.

Communicated by C.-S. Lee.

About this article

Cite this article

Chang, Y., Hsieh, Y., Chen, C. et al. A semantic frame-based intelligent agent for topic detection. Soft Comput 21, 391–401 (2017). https://doi.org/10.1007/s00500-015-1695-4

Keywords

  • Topic detection
  • Semantic frame
  • Semantic class
  • Partial matching