Emerging Entity Discovery Using Web Sources

  • Lei Zhang
  • Tianxing Wu
  • Liang Xu
  • Meng Wang
  • Guilin Qi
  • Harald Sack
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1134)


The rapidly growing number of entities in knowledge bases (KBs) benefits many applications, for which the key task is to link entity mentions in text to entities in the KB, known as entity linking (EL). Many methods have been proposed to tackle this problem. However, a KB can never be complete, so emerging entity discovery (EED) is essential for detecting emerging entities (EEs) that are mentioned in text but not yet contained in the KB. In this paper, we propose a new topic-driven approach to EED that represents EEs using context harvested from online Web sources. Experimental results show that our solution outperforms state-of-the-art methods in terms of F1 measure on the EED task, as well as Micro Accuracy and Macro Accuracy in the full EL setting.



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Lei Zhang (1)
  • Tianxing Wu (2)
  • Liang Xu (3)
  • Meng Wang (3)
  • Guilin Qi (3)
  • Harald Sack (1)
  1. FIZ Karlsruhe – Leibniz Institute for Information Infrastructure, Eggenstein-Leopoldshafen, Germany
  2. Nanyang Technological University, Singapore, Singapore
  3. Southeast University, Nanjing, China
