HCI 2013: HCI International 2013 - Posters’ Extended Abstracts, pp 450-453
Improved Keyword Extraction by Separation into Multiple Document Sets According to Time Series
Abstract
This study proposes a keyword extraction method that also captures keywords that appear locally. Established techniques such as TF-IDF and support vector machines are useful for keyword extraction in text mining. However, when keywords are extracted on the basis of time series, local keywords are often not extracted. We propose extracting local keywords by separating the document set, which we call the document separation approach. The approach splits a document set into multiple sets according to time series, extracts keywords from each set, and integrates the results. In an experiment on 1,812 newspaper articles, we demonstrate that the document separation approach extracts local feature keywords.
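The three steps of the document separation approach (split, extract, integrate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the summed TF-IDF score, the month-level period labels, the union-based integration step, and the toy corpus are all assumed here, and the names `top_keywords` and `separated_keywords` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def top_keywords(docs, k):
    """Return the k highest-scoring terms in one document set.

    Assumed score: sum over documents of tf * log(N / df), a plain TF-IDF variant.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = defaultdict(float)
    for doc in docs:
        for term, tf in Counter(doc).items():
            scores[term] += tf * math.log(n_docs / df[term])
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def separated_keywords(dated_docs, k):
    """Split by period, extract k keywords per period, integrate by ordered union."""
    by_period = defaultdict(list)
    for period, tokens in dated_docs:
        by_period[period].append(tokens)
    merged = []
    for period in sorted(by_period):
        for term in top_keywords(by_period[period], k):
            if term not in merged:
                merged.append(term)
    return merged


# Toy corpus (assumed data): (period label, tokenized article).
corpus = [
    ("2012-01", ["economy"] * 3 + ["election"] * 2),
    ("2012-01", ["election"] * 2 + ["vote"]),
    ("2012-01", ["vote", "budget"]),
    ("2012-02", ["economy"] * 3 + ["flood"] * 2),
    ("2012-02", ["economy"] * 3 + ["rescue"]),
    ("2012-02", ["economy"] * 3 + ["budget"]),
]

# Baseline: rank the whole document set at once and keep two keywords.
global_kw = top_keywords([tokens for _, tokens in corpus], 2)
# Separation: keep one keyword per period, then merge.
local_kw = separated_keywords(corpus, 1)
print(global_kw, local_kw)
```

In this toy corpus, "flood" occurs only in the second period. Ranking the whole set keeps "economy" and "election", whereas ranking each period separately keeps "economy" and "flood", so the local term surfaces only after separation.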
Keywords
keyword extraction, document set, text mining