Abstract
A named entity (NE) dictionary is an important resource that affects the performance of NE recognition, but constructing one manually is difficult because human annotation is time-consuming and labor-intensive. We propose a semi-automatic model that constructs an NE dictionary from DBpedia, a free online resource. The proposed model expands and purifies an NE dictionary using an active learning technique. In the experiments, the proposed model classified 99.99% (180,008 out of 180,020) of DBpedia entries into 18 NE categories, with a macro-averaged F1-measure of 0.6980 over 18 NE categories (0.7519 over 17 NE categories).
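For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a macro-averaged F1-measure is usually computed: F1 is calculated separately for each category and the per-category scores are averaged with equal weight. The function name `macro_f1` and the example labels are illustrative assumptions, not taken from the paper, and the paper's own evaluation setup may differ in detail.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-category F1 scores (macro-averaging).

    Illustrative helper: categories are taken from the gold labels.
    """
    categories = sorted(set(gold))
    if not categories:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct assignment for category g
        else:
            fp[p] += 1   # p was assigned but is wrong
            fn[g] += 1   # g was missed

    scores = []
    for c in categories:
        pred_total = tp[c] + fp[c]
        gold_total = tp[c] + fn[c]
        precision = tp[c] / pred_total if pred_total else 0.0
        recall = tp[c] / gold_total if gold_total else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)

    # Every category counts equally, regardless of how many entries it has.
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
gold = ["PERSON", "PERSON", "LOCATION", "ORGANIZATION"]
pred = ["PERSON", "LOCATION", "LOCATION", "ORGANIZATION"]
score = macro_f1(gold, pred)
```

Because each category carries the same weight, a single category with low F1 can pull the macro average down noticeably, even if it contains few entries.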
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Song, Y., Kim, H. (2015). Semi-automatic Construction of a Named Entity Dictionary Based on Active Learning. In: Park, J., Stojmenovic, I., Jeong, H., Yi, G. (eds) Computer Science and its Applications. Lecture Notes in Electrical Engineering, vol 330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45402-2_10
DOI: https://doi.org/10.1007/978-3-662-45402-2_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45401-5
Online ISBN: 978-3-662-45402-2
eBook Packages: Engineering, Engineering (R0)