Big Data and Information Quality

Floridi, Luciano

doi:10.1007/978-3-319-07121-3_15

Part of the book series: Synthese Library (SYLI, volume 358)

Abstract

This paper is divided into two parts. In the first, I shall briefly analyse the phenomenon of “big data”, and argue that the real epistemological challenge posed by the zettabyte era is small patterns. The valuable undercurrents in the ocean of data that we are accumulating are invisible to the computationally-naked eye, so more and better technology will help. However, because the problem with big data is small patterns, ultimately, the game will be won by those who “know how to ask and answer questions” (Plato, Cratylus, 390c). This introduces the second part, concerning information quality (IQ): which data may be useful and relevant, and so worth collecting, curating, and querying, in order to exploit their valuable (small) patterns? I shall argue that the standard way of seeing IQ in terms of fit-for-purpose is correct but needs to be complemented by a methodology of abstraction, which allows IQ to be indexed to different purposes. This fundamental step can be taken by adopting a bi-categorical approach. This means distinguishing between purpose/s for which some information is produced (P-purpose) and purpose/s for which the same information is consumed (C-purpose). Such a bi-categorical approach in turn allows one to analyse a variety of so-called IQ dimensions, such as accuracy, completeness, consistency, and timeliness. I shall show that the bi-categorical approach lends itself to simple visualisations in terms of radar charts.
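The bi-categorical idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's own formalism: the `IQProfile` class, the `gaps` helper, the 0-to-1 scoring scale, and every numeric value are hypothetical. Only the P-purpose/C-purpose distinction, the four dimension names, and the radar-chart presentation come from the abstract.

```python
import math
from dataclasses import dataclass

# The four IQ dimensions named in the abstract; each becomes one radar axis.
DIMENSIONS = ("accuracy", "completeness", "consistency", "timeliness")


@dataclass
class IQProfile:
    """IQ scores for one purpose (hypothetical 0-to-1 scale)."""
    purpose: str   # free-text label for the purpose
    kind: str      # "P" = production purpose, "C" = consumption purpose
    scores: dict   # maps each name in DIMENSIONS to a score

    def radar_vertices(self):
        """Return the (x, y) polygon vertices of this profile's radar chart."""
        step = 2 * math.pi / len(DIMENSIONS)
        return [
            (self.scores[d] * math.cos(i * step), self.scores[d] * math.sin(i * step))
            for i, d in enumerate(DIMENSIONS)
        ]


def gaps(produced, consumed):
    """Per-dimension difference between the P-purpose and C-purpose scores."""
    return {d: produced.scores[d] - consumed.scores[d] for d in DIMENSIONS}


# Invented scores, for illustration only: the same information assessed
# against the purpose it was produced for and a purpose it is reused for.
produced = IQProfile(
    "original purpose", "P",
    {"accuracy": 1.0, "completeness": 0.75, "consistency": 0.75, "timeliness": 0.5},
)
consumed = IQProfile(
    "reuse purpose", "C",
    {"accuracy": 0.5, "completeness": 0.75, "consistency": 0.75, "timeliness": 0.25},
)

for dimension, gap in gaps(produced, consumed).items():
    print(f"{dimension}: {gap:+.2f}")
```

Passing the vertices of both profiles to any plotting library would overlay two polygons on the same axes. Under the assumed scale, a positive gap marks a dimension on which the information serves its production purpose better than the purpose for which it is being consumed.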

Notes

1. Fracking (hydraulic fracturing) is a technique in which a liquid (usually water), mixed with sand and chemicals, is injected underground at high pressure in order to cause small fractures (typically less than 1 mm), along which fluids such as gas (especially shale gas), petroleum and brine water can surface.
2. The body of literature on IQ is growing; see, for example, Olson (2003), Wang et al. (2005), Batini and Scannapieco (2006), Lee et al. (2006), Al-Hakim (2007), Herzog et al. (2007), Maydanchik (2007), McGilvray (2008), and Theys (2011).
3. http://www.whitehouse.gov/omb/fedreg_reproducible
4. See more recently United States. Congress. House. Committee on Government Reform. Subcommittee on Regulatory Affairs (2006).
5. http://webarchive.nationalarchives.gov.uk/20090811143745/http://www.bristol-inquiry.org.uk
6. http://webarchive.nationalarchives.gov.uk/+/www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_4125508
7. http://mitiq.mit.edu/ICIQ/2013/
8. http://jdiq.acm.org/
9. http://www.dataqualitysummit.com/
10. http://mitiq.mit.edu/
11. Borges, “The Analytical Language of John Wilkins”, originally published in 1952, English translation in Borges (1964).
12. On the method of abstraction and LoA see Floridi (2008) and Floridi (2011).
13. http://www.ons.gov.uk/ons/guide-method/census/2011/how-our-census-works/how-we-took-the-2011-census/how-we-processed-the-information/data-quality-assurance/index.html

References

Al-Hakim, L. (2007). Information quality management: Theory and applications. Hershey: Idea Group Pub.
Batini, C., & Scannapieco, M. (2006). Data quality: Concepts, methodologies and techniques. Berlin/New York: Springer.
Borges, J. L. (1964). Other inquisitions, 1937–1952. Austin: University of Texas Press.
Cajochen, C., Altanay-Ekici, S., Münch, M., Frey, S., Knoblauch, V., & Wirz-Justice, A. (2013). Evidence that the lunar cycle influences human sleep. Current Biology, 23(15), 1485–1488.
Census. (2011). Census data quality assurance strategy. http://www.ons.gov.uk/ons/guide-method/census/2011/the-2011-census/processing-the-information/data-quality-assurance/2011-census---data-quality-assurance-strategy.pdf
English, L. (2009). Information quality applied: Best practices for improving business information, processes, and systems. Indianapolis: Wiley.
Floridi, L. (2008). The method of levels of abstraction. Minds and Machines, 18(3), 303–329.
Floridi, L. (2011). The philosophy of information. Oxford: Oxford University Press.
Herzog, T. N., Scheuren, F., & Winkler, W. E. (2007). Data quality and record linkage techniques. New York: Springer.
Lee, Y. W., Pipino, L. L., Funk, J. D., & Wang, R. Y. (2006). Journey to data quality. Cambridge, MA: MIT Press.
Luebke, D. M., & Milton, S. (1994). Locating the victim: An overview of census-taking, tabulation technology and persecution in Nazi Germany. IEEE Annals of the History of Computing, 16(3), 25–39.
Maydanchik, A. (2007). Data quality assessment. Bradley Beach: Technics Publications.
McGilvray, D. (2008). Executing data quality projects: Ten steps to quality data and trusted information. Amsterdam/Boston: Morgan Kaufmann/Elsevier.
Olson, J. E. (2003). Data quality: The accuracy dimension. San Francisco: Morgan Kaufmann Publishers.
Raper, J. F., Rhind, D., & Shepherd, J. F. (1992). Postcodes: The new geography. Harlow: Longman.
Redman, T. C. (1996). Data quality for the information age. Boston: Artech House.
Theys, P. P. (2011). Quest for quality data. Paris: Editions TECHNIP.
Tozer, G. V. (1994). Information quality management. Oxford: Blackwell.
United States Federal Trade Commission. (2010). Social security numbers and ID theft. New York: Nova.
United States. Congress. House. Committee on Government Reform. Subcommittee on Regulatory Affairs. (2006). Improving information quality in the federal government: Hearing before the Subcommittee on Regulatory Affairs of the Committee on Government Reform, House of Representatives, One Hundred Ninth Congress, first session, July 20, 2005. Washington, DC: U.S. G.P.O.
Wang, R. Y. (1998). A product perspective on total data quality management. Communications of the ACM, 41(2), 58–65.
Wang, Y. R., & Kon, H. B. (1992). Toward quality data: An attributes-based approach to data quality. Cambridge, MA: MIT Press.
Wang, R. Y., Pierce, E. M., Madnick, S. E., Zwass, V., & Fisher, C. W. (Eds.). (2005). Information quality. Armonk/London: M.E. Sharpe.

Author information

Authors and Affiliations

Oxford Internet Institute, University of Oxford, 1 St Giles, Oxford, OX1 3JS, UK
Luciano Floridi

Corresponding author

Correspondence to Luciano Floridi.

Editor information

Editors and Affiliations

Oxford Internet Institute, University of Oxford, Oxford, Oxfordshire, UK
Luciano Floridi
Department of Science and Technology Studies, University College London, London, UK
Phyllis Illari

About this chapter

Cite this chapter

Floridi, L. (2014). Big Data and Information Quality. In: Floridi, L., Illari, P. (eds) The Philosophy of Information Quality. Synthese Library, vol 358. Springer, Cham. https://doi.org/10.1007/978-3-319-07121-3_15

DOI: https://doi.org/10.1007/978-3-319-07121-3_15
Published: 15 July 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07120-6
Online ISBN: 978-3-319-07121-3
eBook Packages: Humanities, Social Sciences and Law; Philosophy and Religion (R0)
