Entity Resolution

Bhattacharya, Indrajit; Getoor, Lise

doi:10.1007/978-1-4899-7502-7_81-1

Abstract

References to real-world entities are often ambiguous, more commonly across data sources but frequently within a single data source as well. Ambiguities arise for several reasons, such as data entry errors or multiple valid representations of the same entity. Given a collection of ambiguous entity references, the goal of entity resolution is to discover the set of unique underlying entities and to map each reference to its corresponding entity. Resolving these ambiguities is necessary for removing redundancy and for accurate entity-level analysis. The problem arises in many applications and has been studied in several branches of computer science. As evidence for entity resolution, traditional approaches consider pairwise similarity between references, and many sophisticated similarity measures have been proposed for comparing attributes of references. The simplest solution classifies reference pairs whose similarity exceeds a threshold as referring to the same entity. More sophisticated solutions use a probabilistic framework for reasoning with the pairwise probabilities. Recently proposed relational approaches use relationships between references, when available, as additional evidence. Instead of reasoning independently about each pair of references, these approaches reason collectively over related pairwise decisions. One line of work within the relational family performs supervised or unsupervised learning with probabilistic graphical models, while another uses more scalable greedy techniques for merging references in a hypergraph. Beyond improving entity resolution accuracy, such relational approaches yield additional knowledge in the form of relationships between the underlying entities.
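The simplest solution described above, thresholding pairwise similarity, can be sketched as follows. This is only an illustration under stated assumptions, not the method of the entry: the similarity measure (Python's `difflib.SequenceMatcher` on lowercased strings), the threshold value of 0.8, the function names `similarity` and `resolve`, and the use of transitive closure to turn matched pairs into entities are all choices made for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Character-level similarity in [0, 1].

    Assumption: a stand-in for any attribute similarity measure.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve(references, threshold=0.8):
    """Return one entity label per reference.

    Pairs with similarity >= threshold are declared matches. Matches are
    then merged by transitive closure (union-find), which is an assumption
    of this sketch: the thresholding step alone only labels pairs.
    """
    parent = list(range(len(references)))

    def find(i):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair of references independently of all other pairs.
    for i, j in combinations(range(len(references)), 2):
        if similarity(references[i], references[j]) >= threshold:
            parent[find(j)] = find(i)

    return [find(i) for i in range(len(references))]


if __name__ == "__main__":
    refs = ["Lise Getoor", "lise getoor", "Indrajit Bhattacharya"]
    for ref, label in zip(refs, resolve(refs)):
        print(label, ref)
```

Each pair is decided in isolation here, which is the limitation that the relational approaches address by reasoning collectively over related pairwise decisions.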

Author information

Authors and Affiliations

IBM India Research Laboratory, New Delhi, India
Indrajit Bhattacharya
University of Maryland, College Park, MD, USA
Lise Getoor

Corresponding author

Correspondence to Indrajit Bhattacharya.

Editor information

Editors and Affiliations

School of Computer Science & Engineering (CSE), University of New South Wales, Sydney, New South Wales, Australia
Claude Sammut
School of Computer Science & Software Engineering, Monash University, Melbourne, Victoria, Australia
Geoffrey I. Webb

About this entry

Cite this entry

Bhattacharya, I., Getoor, L. (2014). Entity Resolution. In: Sammut, C., Webb, G. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7502-7_81-1

DOI: https://doi.org/10.1007/978-1-4899-7502-7_81-1
Received: 11 November 2014
Accepted: 11 November 2014
Published: 18 February 2015
Publisher Name: Springer, Boston, MA
Online ISBN: 978-1-4899-7502-7
eBook Packages: Springer Reference Computer Sciences, Reference Module Computer Science and Engineering
