Abstract
This paper is divided into two parts. In the first, I shall briefly analyse the phenomenon of “big data”, and argue that the real epistemological challenge posed by the zettabyte era is small patterns. The valuable undercurrents in the ocean of data that we are accumulating are invisible to the computationally-naked eye, so more and better technology will help. However, because the problem with big data is small patterns, ultimately, the game will be won by those who “know how to ask and answer questions” (Plato, Cratylus, 390c). This introduces the second part, concerning information quality (IQ): which data may be useful and relevant, and so worth collecting, curating, and querying, in order to exploit their valuable (small) patterns? I shall argue that the standard way of seeing IQ in terms of fit-for-purpose is correct but needs to be complemented by a methodology of abstraction, which allows IQ to be indexed to different purposes. This fundamental step can be taken by adopting a bi-categorical approach. This means distinguishing between purpose/s for which some information is produced (P-purpose) and purpose/s for which the same information is consumed (C-purpose). Such a bi-categorical approach in turn allows one to analyse a variety of so-called IQ dimensions, such as accuracy, completeness, consistency, and timeliness. I shall show that the bi-categorical approach lends itself to simple visualisations in terms of radar charts.
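As a rough illustration of the bi-categorical idea, and not the chapter's own method, the sketch below scores one body of information on the four IQ dimensions named above, once relative to a P-purpose and once relative to a C-purpose, and computes the polygon vertices that a radar chart would draw for each. The scores, the variable names, and the helper functions `radar_vertices` and `gaps` are all hypothetical and chosen only for illustration.

```python
import math

# IQ dimensions named in the abstract.
DIMENSIONS = ["accuracy", "completeness", "consistency", "timeliness"]

# Hypothetical scores in [0, 1]; illustrative values, not taken from the chapter.
# The same information is scored once per purpose.
p_purpose_scores = {"accuracy": 0.9, "completeness": 0.8,
                    "consistency": 0.9, "timeliness": 0.7}
c_purpose_scores = {"accuracy": 0.6, "completeness": 0.4,
                    "consistency": 0.8, "timeliness": 0.3}


def radar_vertices(scores, dimensions=DIMENSIONS):
    """Return (x, y) vertices of a radar-chart polygon.

    Each dimension gets one axis; axes are evenly spaced around the origin
    and the score is the distance from the origin along that axis.
    """
    n = len(dimensions)
    vertices = []
    for i, name in enumerate(dimensions):
        angle = 2 * math.pi * i / n
        radius = scores[name]
        vertices.append((radius * math.cos(angle), radius * math.sin(angle)))
    return vertices


def gaps(p_scores, c_scores, dimensions=DIMENSIONS):
    """Per-dimension difference between P-purpose and C-purpose scores."""
    return {d: p_scores[d] - c_scores[d] for d in dimensions}


p_polygon = radar_vertices(p_purpose_scores)
c_polygon = radar_vertices(c_purpose_scores)
purpose_gaps = gaps(p_purpose_scores, c_purpose_scores)

for dimension, gap in purpose_gaps.items():
    print(f"{dimension}: P minus C = {gap:.2f}")
```

Overlaying the two polygons on the same axes would show at a glance where information that is of high quality for the purpose it was produced for falls short for the purpose it is consumed for.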
Notes
- 1. Fracking (hydraulic fracturing) is a technique in which a liquid (usually water), mixed with sand and chemicals, is injected underground at high pressure in order to cause small fractures (typically less than 1 mm), along which fluids such as gas (especially shale gas), petroleum and brine water can surface.
- 2.
- 3.
- 4. See more recently United States. Congress. House. Committee on Government Reform. Subcommittee on Regulatory Affairs (2006).
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11. Borges, “The Analytical Language of John Wilkins”, originally published in 1952, English translation in Borges (1964).
- 12.
- 13.
References
Al-Hakim, L. (2007). Information quality management: Theory and applications. Hershey: Idea Group Pub.
Batini, C., & Scannapieco, M. (2006). Data quality: Concepts, methodologies and techniques. Berlin/New York: Springer.
Borges, J. L. (1964). Other inquisitions, 1937–1952. Austin: University of Texas Press.
Cajochen, C., Altanay-Ekici, S., Münch, M., Frey, S., Knoblauch, V., & Wirz-Justice, A. (2013). Evidence that the lunar cycle influences human sleep. Current Biology, 23(15), 1485–1488.
Census. (2011). Census data quality assurance strategy. http://www.ons.gov.uk/ons/guide-method/census/2011/the-2011-census/processing-the-information/data-quality-assurance/2011-census---data-quality-assurance-strategy.pdf
English, L. (2009). Information quality applied: Best practices for improving business information, processes, and systems. Indianapolis: Wiley.
Floridi, L. (2008). The method of levels of abstraction. Minds and Machines, 18(3), 303–329.
Floridi, L. (2011). The philosophy of information. Oxford: Oxford University Press.
Herzog, T. N., Scheuren, F., & Winkler, W. E. (2007). Data quality and record linkage techniques. New York: Springer.
Lee, Y. W., Pipino, L. L., Funk, J. D., & Wang, R. Y. (2006). Journey to data quality. Cambridge, MA: MIT Press.
Luebke, D. M., & Milton, S. (1994). Locating the victim: An overview of census-taking, tabulation technology and persecution in Nazi Germany. IEEE Annals of the History of Computing, 16(3), 25–39.
Maydanchik, A. (2007). Data quality assessment. Bradley Beach: Technics Publications.
McGilvray, D. (2008). Executing data quality projects: Ten steps to quality data and trusted information. Amsterdam/Boston: Morgan Kaufmann/Elsevier.
Olson, J. E. (2003). Data quality: The accuracy dimension. San Francisco: Morgan Kaufmann Publishers.
Raper, J. F., Rhind, D., & Shepherd, J. F. (1992). Postcodes: The new geography. Harlow: Longman.
Redman, T. C. (1996). Data quality for the information age. Boston: Artech House.
Theys, P. P. (2011). Quest for quality data. Paris: Editions TECHNIP.
Tozer, G. V. (1994). Information quality management. Oxford: Blackwell.
United States Federal Trade Commission. (2010). Social Security numbers and ID theft. New York: Nova.
United States. Congress. House. Committee on Government Reform. Subcommittee on Regulatory Affairs. (2006). Improving information quality in the federal government: Hearing before the Subcommittee on Regulatory Affairs of the Committee on Government Reform, House of Representatives, One Hundred Ninth Congress, first session, July 20, 2005. Washington, DC: U.S. G.P.O.
Wang, R. Y. (1998). A product perspective on total data quality management. Communications of the ACM, 41(2), 58–65.
Wang, Y. R., & Kon, H. B. (1992). Toward quality data: An attributes-based approach to data quality. Cambridge, MA: MIT Press.
Wang, R. Y., Pierce, E. M., Madnick, S. E., Zwass, V., & Fisher, C. W. (Eds.). (2005). Information quality. Armonk/London: M.E. Sharpe.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Floridi, L. (2014). Big Data and Information Quality. In: Floridi, L., Illari, P. (eds) The Philosophy of Information Quality. Synthese Library, vol 358. Springer, Cham. https://doi.org/10.1007/978-3-319-07121-3_15
DOI: https://doi.org/10.1007/978-3-319-07121-3_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07120-6
Online ISBN: 978-3-319-07121-3
eBook Packages: Humanities, Social Sciences and Law; Philosophy and Religion (R0)