Semi-automatic Construction of a Named Entity Dictionary Based on Active Learning

  • Conference paper
Computer Science and its Applications

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 330)

Abstract

A named entity (NE) dictionary is an important resource that affects the performance of NE recognition, but constructing one manually is difficult because human annotation is time-consuming and labor-intensive. We propose a semi-automatic model that constructs an NE dictionary from DBpedia, a free online resource. The proposed model expands and purifies the NE dictionary using an active learning technique. In the experiments, the proposed model classified 99.99% (180,008 out of 180,020) of DBpedia entries into 18 NE categories, with a macro-averaged F1-measure of 0.6980 over the 18 NE categories (0.7519 over 17 NE categories).
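
As a reading aid for the figures above, the following is a minimal sketch of how a macro-averaged F1-measure is typically computed: F1 is calculated per category and the per-category scores are averaged with equal weight. It is not the authors' evaluation code. The function name `macro_f1`, the category names, and the toy labels are assumptions made for illustration; the abstract does not list the 18 categories. The last lines only re-check the reported coverage arithmetic.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1: per-category F1 scores averaged with equal weight."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct assignment for category g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true category but was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the category never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels with hypothetical category names (not data from the paper).
gold = ["PERSON", "PERSON", "LOCATION", "ORGANIZATION"]
pred = ["PERSON", "LOCATION", "LOCATION", "ORGANIZATION"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.4f}")

# Coverage reported in the abstract: 180,008 of 180,020 entries classified.
coverage = 180_008 / 180_020
print(f"coverage = {coverage:.2%}")
```

Because every category carries equal weight, a single poorly classified category can lower a macro average noticeably, regardless of how many entries that category contains.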

Author information

Correspondence to Yeongkil Song.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Song, Y., Kim, H. (2015). Semi-automatic Construction of a Named Entity Dictionary Based on Active Learning. In: Park, J., Stojmenovic, I., Jeong, H., Yi, G. (eds) Computer Science and its Applications. Lecture Notes in Electrical Engineering, vol 330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45402-2_10

  • DOI: https://doi.org/10.1007/978-3-662-45402-2_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45401-5

  • Online ISBN: 978-3-662-45402-2

  • eBook Packages: Engineering, Engineering (R0)
