A Hybrid Approach for Entity Recognition and Linking

  • Conference paper
Semantic Web Evaluation Challenges (SemWebEval 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 548)

Abstract

Numerous research efforts tackle the entity recognition and entity linking tasks, resulting in a large body of literature. The proposed approaches can be roughly categorized into two strategies: linguistic-based and semantic-based methods. In this paper, we present our participation in the OKE Challenge, where we experiment with a hybrid approach that combines the strength of a linguistic-based method with the high annotation coverage obtained by using a large knowledge base as an entity dictionary. The main goal of this hybrid approach is to maximize recall at the extraction and recognition stage so that a pruning step can then be applied. On the training set, the results are promising, and the breakdown figures are comparable with the state-of-the-art performance of top-ranked systems. Our hybrid approach ranked first in the OKE Challenge on the test set.
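The general idea of combining a linguistic extractor with a dictionary lookup, and then pruning the over-generated candidates, can be illustrated with a minimal sketch. This is not the authors' system: the capitalization heuristic stands in for a real linguistic tagger, `TOY_KB` and its `dbpedia:`-style identifiers are hypothetical placeholders for a large knowledge base, and the longest-match rule is just one plausible pruning strategy assumed for illustration.

```python
import re

# Hypothetical toy knowledge base: lowercase surface form -> entity identifier.
TOY_KB = {
    "paris": "dbpedia:Paris",
    "eiffel tower": "dbpedia:Eiffel_Tower",
    "tower": "dbpedia:Tower",
}


def linguistic_mentions(text):
    """Stand-in for a linguistic extractor: runs of capitalized words."""
    pattern = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]


def dictionary_mentions(text, kb):
    """Dictionary lookup: every whole-word occurrence of a known surface form."""
    lowered = text.lower()
    found = []
    for surface in kb:
        for m in re.finditer(r"\b" + re.escape(surface) + r"\b", lowered):
            found.append((m.start(), m.end(), text[m.start():m.end()]))
    return found


def prune_overlaps(mentions):
    """Among overlapping candidates, keep only the longest one."""
    kept = []
    by_length = sorted(set(mentions), key=lambda m: (-(m[1] - m[0]), m[0]))
    for cand in by_length:
        if all(cand[1] <= k[0] or cand[0] >= k[1] for k in kept):
            kept.append(cand)
    return sorted(kept)


def hybrid_extract(text, kb=TOY_KB):
    """Union both candidate sets for recall, prune, then look up a link."""
    candidates = linguistic_mentions(text) + dictionary_mentions(text, kb)
    return [
        (start, end, surface, kb.get(surface.lower()))
        for start, end, surface in prune_overlaps(candidates)
    ]


if __name__ == "__main__":
    for mention in hybrid_extract("tourists love the eiffel tower in Paris."):
        print(mention)
```

In this toy input the capitalization heuristic alone misses the lowercase "eiffel tower", while the dictionary lookup recovers it; the pruning step then discards the nested candidate "tower". Candidates with no dictionary entry are returned with `None` as their link.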

Notes

  1. http://lucene.apache.org/
  2. http://wiki.dbpedia.org/services-resources/datasets/datasets2014
  3. https://dumps.wikimedia.org/enwiki/
  4. http://sweble.org/
  5. https://code.google.com/p/gwtwiki/
  6. https://github.com/Stratio/wikipedia-parser
  7. http://www.wsj.com
  8. http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core
  9. https://github.com/wikilinks/neleval
  10. https://gate.d5.mpi-inf.mpg.de/webaida/
  11. http://tagme.di.unipi.it/
  12. http://dbpedia-spotlight.github.io/demo/
  13. http://babelfy.org/

Acknowledgments

This work was partially supported by the EIT Digital 3cixty project and by the French National Research Agency (ANR) within the WAVE Project, under grant number ANR-12-CORD-0027.

Corresponding author

Correspondence to Julien Plu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Plu, J., Rizzo, G., Troncy, R. (2015). A Hybrid Approach for Entity Recognition and Linking. In: Gandon, F., Cabrio, E., Stankovic, M., Zimmermann, A. (eds) Semantic Web Evaluation Challenges. SemWebEval 2015. Communications in Computer and Information Science, vol 548. Springer, Cham. https://doi.org/10.1007/978-3-319-25518-7_3

  • DOI: https://doi.org/10.1007/978-3-319-25518-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25517-0

  • Online ISBN: 978-3-319-25518-7

  • eBook Packages: Computer Science, Computer Science (R0)