Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Data Conflicts

  • Hong-Hai Do
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_97

Synonyms

Data anomalies; Data errors; Data inconsistencies; Data problems; Data quality problems

Definition

Data conflicts are deviations between data intended to capture the same state of a real-world entity. Data with conflicts are often called “dirty” data and can mislead any analysis performed on them. When data conflicts occur, data cleaning is needed to improve data quality and to avoid incorrect analysis results. Understanding the different kinds of data conflicts and their characteristics makes it possible to develop corresponding data cleaning techniques.
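As a small illustration of this definition, the following sketch compares two records that are assumed to describe the same real-world entity and reports the attributes whose values deviate. The function name `find_conflicts`, the record layout, and the sample values are hypothetical and chosen only for this example; they are not taken from the entry, and real data cleaning would typically also need to handle formatting differences, missing values, and entity matching.

```python
from typing import Any, Dict, Tuple

Record = Dict[str, Any]


def find_conflicts(a: Record, b: Record) -> Dict[str, Tuple[Any, Any]]:
    """Return attributes present in both records whose values deviate."""
    conflicts = {}
    # Only attributes that both records carry can be compared directly.
    for attr in sorted(a.keys() & b.keys()):
        if a[attr] != b[attr]:
            conflicts[attr] = (a[attr], b[attr])
    return conflicts


# Two hypothetical records intended to capture the same person.
crm_record = {"name": "Kristen Smith", "birth_year": 1980, "city": "Dresden"}
billing_record = {"name": "Kristen Smith", "birth_year": 1908, "city": "Dresden"}

conflicts = find_conflicts(crm_record, billing_record)
print(conflicts)  # {'birth_year': (1980, 1908)}
```

Here the two sources agree on name and city but disagree on the birth year, so at least one of them is "dirty" with respect to that attribute.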

Historical Background

Statisticians were probably the first to face data conflicts on a large scale. Early applications that required intensive resolution of data conflicts were statistical surveys in governmental administration, public health, and scientific experiments. As early as 1946, Halbert L. Dunn observed the problem of duplicates in data records of a person’s life captured at different places...

Recommended Reading

  1. Barateiro J, Galhardas H. A survey of data quality tools. Datenbank-Spektrum. 2005;14(15–21):48.
  2. Batini C, Scannapieco M. Data quality – concepts, methodologies and techniques. Berlin: Springer; 2006.
  3. Dunn HL. Record linkage. Am J Public Health. 1946;36(12):1412–6.
  4. Fellegi IP, Sunter AB. A theory for record linkage. J Am Stat Assoc. 1969;64(328):1183–210.
  5. Kim W, Choi B-J, Kim S-K, Lee D. A taxonomy of dirty data. Data Min Knowl Discov. 2003;7(1):81–99.
  6. Rahm E, Do H-H. Data cleaning – problems and current approaches. IEEE Techn Bull Data Eng. 2000;23(4):3–13.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. SAP AG, Dresden, Germany

Section editors and affiliations

  • Felix Naumann (1)
  1. Information Systems, Hasso-Plattner-Institute, Potsdam, Germany