Abstract
Although true in some respects, the characterization of today’s science as a dichotomy between traditional science and data-driven science misses some of the nuance, complexity, and possibility that exist between the two positions. Part of the problem is the claim that data science works without theories. There are many theories behind the data that are used in science. For data science, however, the only theories that matter are those of mathematics, statistics, and computer science. In this conceptual paper, we add two other tenets of the philosophy of science, experiments and data, to the discussion to create a more nuanced view of how data science uses theories. Following Ihde’s concept of technoscience and the incessant quest for more precision, magnification, and resolution, we argue that technology-driven science created a need for more technology-driven science, culminating in data science. Further, we adapt Hacking’s and Galison’s views on physics to argue that data science is also an experimental science, one that uses data objects in its experiments. Drawing from Heelan (The Journal of Philosophy 85:515–524, 1988), we call these objects “data-objects-for-knowing”. Finally, we conclude that data science is a science that studies artificially created phenomena: the data manipulated by the equations and operations of AI. It disregards the connections between data and the real world that were carefully built by the theories of other sciences. In the experiments of data science, data are the world itself. The knowledge created by data science is purposely disconnected from any theory from other sciences; it is knowledge for its own sake.
Notes
Slota et al. define prospecting as “the work of rendering data, knowledge, expertise, and practices of worldly domains available or amenable to engagement with data scientific method and epistemology, including mapping available data sources and tools, surveying potential organizational connections, and reasoning about future resources. Prospecting precedes data analysis or visualization, and is constituted by the activities of discovering disordered or inaccessible data resources, thereafter to be ordered and rendered available for data scientific work.” (2020, p.1).
References
Ackermann RJ (2014) Data, instruments, and theory: a dialectical approach to understanding science. Princeton Legacy Library. Princeton University Press, Princeton
Agarwal R, Dhar V (2014) Big data, data science, and analytics: the opportunity and challenge for IS research. Inf Syst Res 25(3):443–448. https://doi.org/10.1287/isre.2014.0546
Anderson C (2008) The end of theory: the data deluge makes the scientific method obsolete. Wired 16(07). https://www.wired.com/2008/06/pb-theory/. Accessed 30 Jan 2021
Appenzeller T (2017) The scientists’ apprentice. Science 357(6346):16–17
Bohannon J (2017) The cyberscientist. Science 357(6346):18–21
Canali S (2016) Big data, epistemology and causality: knowledge in and knowledge out in EXPOsOMICS. Big Data Soc 3(2):2053951716669530
Cao L (2017) Data science: a comprehensive overview. ACM Comput Surv 50(3):2–42. https://doi.org/10.1145/3076253
Dhar V (2013) Data science and prediction. Commun ACM 56(12):64–73
Dourish P (2017) The stuff of bits: an essay on the materialities of information. The MIT Press, Cambridge, MA
Feyerabend P (1970) Consolations for the specialist. Criticism and the Growth of Knowledge 4:197–230
Feyerabend P (1993) Against method. Verso, London
Fonseca F, Marcinkowski M, Davis C (2019) Cyber-human systems of thought and understanding. J Assoc Inf Sci Technol 70(4):402–411
Gadamer H-G, Fantel H (1975) The problem of historical consciousness. Graduate Fac Philos J 5(1):8–52
Galison P (1987) How experiments end. University of Chicago Press, Chicago
Galison P (1997) Image and logic: a material culture of microphysics. University of Chicago Press, Chicago
Galison P (2011) Computer simulations and the trading zone. In: Gramelsberger G (ed) From science to computational science. Diaphanes, Zürich, pp 118–157
Galison P, Stump D (1996) The disunity of science: boundaries, contexts, and power. Writing science. Stanford University Press, Stanford
Glazebrook T (2000) Heidegger’s philosophy of science. Perspectives in continental philosophy. Fordham University Press, New York
Hacking I (1982) Experimentation and scientific realism. Philosophical Topics 13(1):71–87
Hacking I (1988) On the stability of the laboratory sciences. J Philos 85(10):507–514
Hacking I (1990) Book review: how experiments end. J Philos 87(2):103–106
Heelan P (1988) Experiment and theory: constitution and reality. J Philos 85(10):515–524
Heelan P (1989a) After experiment: realism and research. Am Philos Q 26(4):297–308
Heelan P (1989b) Space-perception and the philosophy of science. University of California Press, California
Heidegger M (1977) The question concerning technology, and other essays. Garland Pub, New York
Hey T, Hey AJG, Tansley S, Tolle KM (2009) The fourth paradigm: data-intensive scientific discovery. Microsoft Research, Redmond, WA
Horvitz E (2017) AI, people, and society. Science 357(6346):7
Humphreys P (2009) The philosophical novelty of computer simulation methods. Synthese 169(3):615–626
Ihde D (1991) Instrumental realism: the interface between philosophy of science and philosophy of technology. Indiana University Press, Bloomington, IN
Ihde D (2016) Husserl’s missing technologies. Perspectives in continental philosophy. Fordham University Press, New York
Kitchin R (2014) Big data, new epistemologies and paradigm shifts. Big Data Soc 1(1):1–12
Kuhn TS (1970) The structure of scientific revolutions. University of Chicago Press, Chicago
Latour B (1987) Science in action: how to follow scientists and engineers through society. Harvard University Press, Cambridge, MA
Mayer-Schonberger V, Cukier K (2013) Big data: a revolution that will transform how we live, work, and think. Houghton Mifflin Harcourt, Boston
Naur P (1974) Concise survey of computer methods. Petrocelli Books
Norvig P (2012) Colorless green ideas learn furiously: Chomsky and the two cultures of statistical learning. Significance 9(4):30–33
Pietsch W (2015) Aspects of theory-ladenness in data-intensive science. Philosophy of Science 82(5):905–916
Popper K (2002) The logic of scientific discovery. Classics series. Routledge, UK
Radder H (2003) The philosophy of scientific experimentation. University of Pittsburgh Press, Pittsburgh
Ratti E (2015) Big data biology: between eliminative inferences and exploratory experiments. Philosophy of Science 82(2):198–218
Ribes D, Hoffman AS, Slota SC, Bowker GC (2019) The logic of domains. Soc Stud Sci 49(3):281–309
Ruse M (2005) Theory. In: Honderich T (ed) The Oxford companion to philosophy, 2nd edn. Oxford University Press, Oxford
Slota SC, Hoffman AS, Ribes D, Bowker GC (2020) Prospecting (in) the data sciences. Big Data Soc 7(1):2053951720906849
Susser D (2016) Ihde’s missing sciences: postphenomenology, big data, and the human sciences. Techné: Res Philos Technol 20(2):137–152
Tuana N (2013) Embedding philosophers in the practices of science: bringing humanities to the sciences. Synthese 190(11):1955–1973
Waters CK (2007) The nature and context of exploratory experimentation: an introduction to three case studies of exploratory research. Hist Philos Life Sci 29(3):275–284
Winther RG (2016) The structure of scientific theories. In: Zalta EN (ed) The Stanford encyclopedia of philosophy. Metaphysics Research Lab, Stanford University, California
Wittgenstein L, Pears D, Russell B, McGuinness B (2001) Tractatus logico-philosophicus. Routledge classics. Routledge, London
Acknowledgments
We gratefully acknowledge the Rock Ethics Institute at The Pennsylvania State University for supporting this work under the Faculty Fellows program.
Funding
Funded by Rock Ethics Institute Faculty Fellowship at The Pennsylvania State University.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Fonseca, F. Data objects for knowing. AI & Soc 37, 195–204 (2022). https://doi.org/10.1007/s00146-021-01150-y
DOI: https://doi.org/10.1007/s00146-021-01150-y