Insights into Entity Name Evolution on Wikipedia

Holzmann, Helge; Risse, Thomas

doi:10.1007/978-3-319-11746-1_4

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8787)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

Working with Web archives raises a number of issues caused by their temporal characteristics. Depending on the age of the content, additional knowledge might be needed to find and understand older texts. Facts about entities in particular are subject to change, and name changes are the most severe in terms of information retrieval. To find entities that have changed their name over time, search engines need to be aware of this evolution. We tackle this problem by analyzing Wikipedia in terms of entity evolutions mentioned in articles, regardless of structural elements. We gathered statistics and automatically extracted minimum excerpts covering name changes by incorporating lists dedicated to that subject. In future work, these excerpts will be used to discover patterns and detect changes in other sources. In this work we investigate whether Wikipedia is a suitable source for extracting the required knowledge.

This work is partly funded by the European Research Council under ALEXANDRIA (ERC 339233).

Author information

Authors and Affiliations

L3S Research Center, Appelstr. 9a, 30167, Hanover, Germany
Helge Holzmann & Thomas Risse

Editor information

Editors and Affiliations

University of New South Wales, Sydney, Australia
Boualem Benatallah
Boston University, Boston, MA, USA
Azer Bestavros
Aristotle University of Thessaloniki, Thessaloniki, Greece
Yannis Manolopoulos & Athena Vakali
Victoria University, Footscray, VIC, Australia
Yanchun Zhang

About this paper

Cite this paper

Holzmann, H., Risse, T. (2014). Insights into Entity Name Evolution on Wikipedia. In: Benatallah, B., Bestavros, A., Manolopoulos, Y., Vakali, A., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2014. WISE 2014. Lecture Notes in Computer Science, vol 8787. Springer, Cham. https://doi.org/10.1007/978-3-319-11746-1_4

DOI: https://doi.org/10.1007/978-3-319-11746-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11745-4
Online ISBN: 978-3-319-11746-1
eBook Packages: Computer Science, Computer Science (R0)
