An Indicator Function for Insufficient Data Quality – A Contribution to Data Accuracy

  • Quirin Görz
  • Marcus Kaiser
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 129)

Abstract

Because insufficient data quality usually leads to wrong decisions and high costs, managing data quality is a prerequisite for the successful execution of business and decision processes. Economics-driven data quality management requires efficient measurement procedures that allow a largely automated identification of poor data quality. Against this background, the paper investigates how metrics for the data quality (DQ) dimensions completeness, validity, and currency can be aggregated to derive an indicator for accuracy. To this end, existing approaches to measuring these dimensions are analyzed to make explicit which metric addresses which aspect of data quality. Based on this analysis, an indicator function is designed that returns a measure for accuracy at different levels of a data resource. The indicator function’s applicability is demonstrated using a customer database example.

Keywords

Data quality, data quality management, measurement, accuracy

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Quirin Görz
    • 1
  • Marcus Kaiser
    • 2
  1. FIM Research Center, University of Augsburg, Augsburg, Germany
  2. Senacor Technologies AG, Munich, Germany