Ensemble Learning for Keyphrases Extraction from Scientific Document
Keyphrase extraction is a task with many applications in information retrieval, text mining, and natural language processing. In this paper, a keyphrase extraction approach based on a neural network ensemble is proposed. To determine whether a phrase is a keyphrase, the following features of the phrase in a given document are used: its term frequency, whether it appears in the title, abstract, or headings (subheadings), and its frequency in the paragraphs of the document. The approach is evaluated with the standard information retrieval metrics of precision and recall. Experimental results show that ensemble learning can significantly increase precision and recall.
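The features and metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Document` container, the function names, the case-insensitive substring matching, and the reading of "frequency in the paragraphs" as the number of paragraphs containing the phrase are all assumptions made for illustration. The neural network ensemble itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical container for the parts of a scientific document."""
    title: str
    abstract: str
    headings: list[str]
    paragraphs: list[str]


def phrase_features(phrase: str, doc: Document) -> dict[str, int]:
    """Compute the abstract's features for one candidate phrase.

    Assumption: matching is naive, case-insensitive substring search.
    """
    p = phrase.lower()
    full_text = " ".join(
        [doc.title, doc.abstract, *doc.headings, *doc.paragraphs]
    ).lower()
    return {
        # Occurrences of the phrase anywhere in the document.
        "term_frequency": full_text.count(p),
        # Binary position features.
        "in_title": int(p in doc.title.lower()),
        "in_abstract": int(p in doc.abstract.lower()),
        "in_headings": int(any(p in h.lower() for h in doc.headings)),
        # Assumption: number of paragraphs that contain the phrase.
        "paragraph_frequency": sum(p in para.lower() for para in doc.paragraphs),
    }


def precision_recall(predicted: set[str], gold: set[str]) -> tuple[float, float]:
    """Standard precision and recall over sets of keyphrases."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall
```

In a setup like the one described, each feature dictionary would be turned into a vector and passed to the ensemble members, and `precision_recall` would compare the extracted keyphrases with the author-assigned ones.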
Keywords: Neural Network, Class Label, Feature Subset, Ensemble Learn, AdaBoost Algorithm