Abstract
This chapter provides a prospective look at the “big research issues” in data quality. It is based on 25 years of experience, most of it as a practitioner; early work with a terrific team of researchers and business people at Bell Labs and AT&T; constant reflection on the meanings and methods of quality, the strange and wondrous properties of data, the importance of data and data quality in markets and companies, and the underlying reasons that some enterprises make rapid progress while others fall flat; and interactions with most of the leading companies, practitioners, and researchers.
Notes
- 1.
Specialized data providers serve many industries. Bloomberg, Morningstar, and Thomson-Reuters are household names in the financial services sector, for example. In these “pure data markets,” data quality is indeed front and center.
- 2.
Quite obviously, government agencies and other nonprofits should not aim to “make money” from data. For them, “advance organizational mission” might be more appropriate. I’ve purposefully left “make money” in the body of text here, as I want to leave a hard edge on the point. Sooner or later, data must be recognized as being as important as capital and people (and maybe a few other assets).
- 3.
We explicitly recognize that a customer need not be a person. A computer program, an organization, and any other entity that uses data may qualify.
- 4.
The late Dr. William Barnard of the Juran Institute introduced me to this notion.
- 5.
And, to a lesser degree, from good practice in data collection for scientific experimentation, though I know of no good reference to back up the assertion.
- 6.
A (perhaps) interesting historical note: In the early 1990s, the team I worked on at Bell Labs struggled to come up with a good definition of data, for quality purposes, and to define dimensions of data quality. We finally came up with definitions we found acceptable. And then we realized we had completely missed the point. Our approach treated data as “static,” in a database. But static data are stunningly uninteresting. Data are interesting when they are created, moved about, morphed to suit individual needs, put to work, and combined with other data. We wrote [20] as a result, and I personally think it is this team’s most important paper. At the same time, the conclusion, once stated, is obvious!
- 7.
I’ve put “root causes” in quotes because a proper root cause analysis is considerably more disciplined than that conducted here.
- 8.
Several comments here. First, these are by no means the only issues. See Chapter 7 of Data Driven [26] for a fuller explanation of these and many others. Second, I am not the only person to have observed such issues. See Silverman [32] and Thomas [34] for other perspectives. Third, and most importantly, I have no formal training as a social scientist. It would be enormously helpful if sociologists, anthropologists, political scientists, and others brought more sophisticated tools to bear in helping understand these issues.
- 9.
Variously, Tech groups may be called Information Technology, the Chief Information Office, Information Management, Management Information Systems, etc.
- 10.
Dr. Godfrey is Dean, School of Textiles, at North Carolina State University. He made the comment repeatedly as Head of the Quality Theory and Methods Department at Bell Labs and as CEO of the Juran Institute in the 1980s and 1990s. I don’t recall ever seeing it in print nor can I confirm that he was first to make the observation.
- 11.
I believe this observation is due to Robert W. Pautke, Cincinnati, OH.
- 12.
To be clear, I have every expectation that this list is incomplete.
- 13.
Note here another reason not to share data!
- 14.
Some may argue that these points merely reflect our progress in economic development and thus do not constitute a root cause at all. They have a fair point.
- 15.
Earlier, I noted that there was considerable debate on the definition of “information” in our field. I think definitions should be based on entropy and/or uncertainty.
- 16.
I want to be careful here. This statement is not strictly true, as entropy is a probabilistic measure.
- 17.
The careful reader will object that “if data are structured,” then “unstructured data” is nonsensical. Unfortunately, those who coined the phrase appear not to have taken this into account.
- 18.
Even data tracking, in many ways the most powerful measurement technique, employs a form of business rules.
- 19.
Rob Hillard and I have started a research project along these lines.
- 20.
This phrase closely mirrors the moniker of Data Blueprint, “better data for better decisions.”
- 21.
“Stunning” in the sense that these roles were so unexpected even a few years ago.
- 22.
Some may argue that one cannot complete a “fundamental rethink” in advance of a “fundamental think.” They have a point.
- 23.
References
Beer S (1979) The heart of enterprise. Wiley, New York
Borek A, Parlikad AL, Woodall P (2011) Towards a process for total information risk management. In: Proceedings of the 16th international conference on information quality, University of South Australia, Adelaide, 18–20 November 2011
Brackett MH (2000) Data resource quality: turning bad habits into good practice. Addison-Wesley, Boston
Brynjolfsson E, Hitt LM, Kim HH (2011) Strength in numbers: how does data-driven decision making affect firm performance? SSRN: http://ssrn.com/abstract=1819486 or http://dx.doi.org/10.2139/ssrn.1819486
Carr N (2003) IT doesn’t matter. Harv Bus Rev 81(5):41–49
Chandler AD (1977) The visible hand: the managerial revolution in American business. The Belknap Press, Cambridge
Chandler AD, Cortada JW (eds) (2000) A nation transformed by information: how information has shaped the United States from colonial times to the present. Oxford University Press, England
English LP (1999) Improving data warehouse and business information quality. Wiley, New York
Eppler MJ (2003) Managing information quality. Springer, Berlin
Fisher T (2009) The data asset: how smart companies govern their data for business success. Wiley, Hoboken
Fox C, Levitin AV, Redman TC (1994) The notion of data and its quality dimensions. Inf Process Manag 30(1):9–19
Greene R, Elffers J (1998) The 48 laws of power. Viking, New York
Hillard R (2010) Information-driven business: how to manage data and information for maximum advantage. Wiley, Hoboken
Huang KT, Lee YW, Wang RY (1999) Quality information and knowledge. Prentice-Hall, Upper Saddle River
Jaques E (1988) Requisite organization. Cason Hall & Company, Arlington
Juran JM, Godfrey AB (1999) Juran’s quality handbook, 5th edn. McGraw-Hill, New York
Kushner T, Villar M (2009) Managing your business data: from chaos to confidence. Racom Communications, Chicago
Laney D (2011) Infonomics: the economics of information and principles of information asset management. In: Proceedings of 5th MIT information quality industry symposium, Cambridge, Massachusetts, 13–15 July 2011
Lee YW, Pipino LL, Funk JD, Wang RY (2006) Journey to data quality. MIT Press, Cambridge
Levitin AV, Redman TC (1993) A model of data (life) cycles with applications to quality. Inf Softw Technol 35(4):217–224
Levitin AV, Redman TC (1995) Quality dimensions of a conceptual view. Inf Process Manag 31(1):81–88
Loshin D (2011) The practitioner’s guide to data quality improvement. Elsevier, Amsterdam
McGilvray D (2008) Executing data quality projects: ten steps to trusted data. Morgan Kaufmann, Amsterdam
Olson JE (2009) Data quality: the accuracy dimension. Morgan Kaufmann, Amsterdam
Pyzdek T, Keller P (2009) The six-sigma handbook, 3rd edn. McGraw-Hill, New York
Redman TC (2008) Data driven: profiting from your most important business asset. Harv Bus Press, Boston
Redman TC (2001) Data quality: the field guide. Digital Press, Boston
Redman TC (2004) Measuring data accuracy: a framework and review. Stud Commun Sci 4(2):53–58
Roberts DJ (2004) The modern firm. Oxford University Press, Oxford
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:623–656
Silverman L (2006) Wake me when the data is over: how organizations use stories to drive results. Jossey-Bass, San Francisco
Talburt JR (2011) Entity resolution and information quality. Morgan Kaufmann, Amsterdam
Thomas G (2006) Alpha males and data disasters: the case for data governance. Brass Cannon, Orlando
Wang RY, Strong DM (1996) Beyond accuracy: what data quality means to data consumers. J Manag Inf Syst 12(4):5–33
Yoon Y, Aiken P, Guimaraes T (2000) Managing organizational data resources: quality dimensions. Inf Resour Manag J 13(3):5–13
Copyright information
© 2012 Thomas C. Redman, Ph.D.
About this chapter
Cite this chapter
Redman, T.C. (2012). Data Quality Management Past, Present, and Future: Towards a Management System for Data. In: Sadiq, S. (ed) Handbook of Data Quality. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36257-6_2
DOI: https://doi.org/10.1007/978-3-642-36257-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36256-9
Online ISBN: 978-3-642-36257-6
eBook Packages: Computer Science, Computer Science (R0)