Abstract
Predicting the future state of a system has always been a natural motivation for science and practical applications. Such a topic, beyond its obvious technical and societal relevance, is also interesting from a conceptual point of view. This owes to the fact that forecasting lends itself to two equally radical, yet opposite, methodologies: a reductionist one, based on first principles, and a naïve-inductivist one, based only on data. The latter view has recently gained attention in response to the availability of unprecedented amounts of data and increasingly sophisticated algorithmic analytic techniques. The purpose of this note is to assess critically the role of big data in reshaping the key aspects of forecasting, and in particular the claim that bigger data leads to better predictions. Drawing on the representative example of weather forecasts, we argue that this is not generally the case. We conclude by suggesting that a clever and context-dependent compromise between modelling and quantitative analysis stands out as the best forecasting strategy, as anticipated nearly a century ago by Richardson and von Neumann.
Notes
See, e.g., Casacuberta and Vallverdú (2014) for an appraisal of how experiments of this kind may lead to a paradigm shift in the philosophy of science.
The Risk to Civil Liberties of Fighting Crime With Big Data, 6 November 2016
Quoted in Lewis Campbell and William Garnett, The Life of James Clerk Maxwell, Macmillan, London (1882); reprinted by Johnson Reprint, New York (1969), p. 440.
In its original version the Poincaré recurrence theorem states that:
Given a Hamiltonian system with a bounded phase space Γ and a set A ⊂ Γ, all the trajectories starting from x ∈ A will return to A repeatedly, infinitely many times, except for those starting from a set of zero probability.
Actually, though this is seldom stressed in elementary courses, the theorem can be easily extended to dissipative ergodic systems provided one only considers initial conditions on the attractor, and “zero probability” is interpreted with respect to the invariant probability on the attractor (Collet and Eckmann 2006).
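As a toy illustration of the recurrence statement, the following sketch iterates Arnold's cat map on a finite integer lattice and records when each point of a region A first comes back to A. The choice of map, the lattice size `n`, the square region of side `side`, and all function names are assumptions made for illustration; they are not taken from the article. Because the map is an invertible, area-preserving transformation of a finite set, every point recurs here, so the zero-probability exceptional set is empty in this discrete setting.

```python
def cat_map(point, n):
    """One step of Arnold's cat map on an n x n integer lattice (invertible, area-preserving)."""
    x, y = point
    return ((2 * x + y) % n, (x + y) % n)


def first_return_time(start, region, n):
    """Number of steps until the orbit of `start` (a point of `region`) re-enters `region`."""
    max_steps = n * n  # a permutation of n*n points has no cycle longer than n*n
    point = start
    for step in range(1, max_steps + 1):
        point = cat_map(point, n)
        if point in region:
            return step
    return None  # not reached for an invertible map on a finite lattice


def return_times(n=101, side=10):
    """First-return times for every point of the square region A = [0, side) x [0, side)."""
    region = {(x, y) for x in range(side) for y in range(side)}
    return {p: first_return_time(p, region, n) for p in region}


if __name__ == "__main__":
    times = return_times()
    print("points in A:", len(times))
    print("all returned:", all(t is not None for t in times.values()))
    print("longest first return:", max(times.values()))
```

The lattice version is used only because it makes recurrence exact and checkable; a continuous Hamiltonian flow would require numerical integration and a tolerance for deciding when a trajectory has re-entered A.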
To be precise, if the system is dissipative, D is the fractal dimension D_A of the attractor (Cecconi et al. 2012).
References
Calude, C.S., & Longo, G. (2016). The deluge of spurious correlations in big data. Foundations of Science, 21, 1–18.
Casacuberta, D., & Vallverdú, J. (2014). E-science and the data deluge. Philosophical Psychology, 27(1), 126–140.
Canali, S. (2016). Big data, epistemology and causality: knowledge in and knowledge out in EXPOsOMICS. Big Data & Society, 3(2), 1–11.
Cecconi, F., Cencini, M., Falcioni, M., & Vulpiani, A. (2012). The prediction of future from the past: an old problem from a modern perspective. American Journal of Physics, 80(11), 1001–1008.
Chibbaro, S., Rondoni, L., & Vulpiani, A. (2014). Reductionism, emergence and levels of reality. Berlin: Springer.
Collet, P., & Eckmann, J.-P. (2006). Concepts and results in chaotic dynamics: A short course. Berlin: Springer.
Coveney, P.V., Dougherty, E.R., & Highfield, R.R. (2016). Big data need big theory too. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2080), 1–11.
Crutchfield, J.P. (2014). The dreams of theory. Wiley Interdisciplinary Reviews: Computational Statistics, 6, 75–79.
Dahan Dalmedico, A. (2001). History and epistemology of models: meteorology as a case study. Archive for the History of Exact Sciences, 55, 395–422.
de Finetti, B. (1974). Theory of probability Vol. 1. New York: Wiley.
de Finetti, B. (2008). Philosophical lectures on probability. In A. Mura (Ed.), Translated by H. Hosni. Berlin: Springer.
Domingos, P. (2015). The master algorithm: How the quest for the ultimate learning machine will remake our world. New York: Basic Books.
Halmos, P. R. (1956). Lectures on Ergodic Theory. London: Chelsea Publishing.
Kac, M. (1947). On the notion of recurrence in discrete stochastic processes. Bulletin of the American Mathematical Society, 53, 1002–1010.
Kitchin, R. (2014). Big data, new epistemologies and paradigm shifts. Big Data & Society, 1, 1–12.
Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). The Parable of Google Flu: Traps in Big Data Analysis. Science, 343(6167), 1203–1205.
Leonelli, S. (2016). Data-Centric Biology: A philosophical study. Chicago: Chicago University Press.
Lorenz, E. N. (1996). Predictability - A problem partly solved. Proceedings of the Seminar on Predictability (pp. 1–18). Reading: ECMWF.
Lynch, P. (2006). The Emergence of Numerical Weather Prediction: Richardson’s Dream. Cambridge: Cambridge University Press.
Ma, S.K. (1985). Statistical mechanics. Singapore: World Scientific.
Mayer-Schönberger, V., & Cukier, K. (2013). Big Data: A Revolution That Will Transform How We Live, Work, and Think. New York: Houghton Mifflin.
Nowotny, H. (2016). The Cunning of Uncertainty. London: Polity.
Nural, M., Cotterell, M.E., & Miller, J. (2015). Using Semantics in Predictive Big Data Analytics. Proceedings - 2015 IEEE International Congress on Big Data, BigData Congress (Vol. 2015, pp. 254–261).
Onsager, L., & Machlup, S. (1953). Fluctuations and irreversible processes. Physical Review, 91, 1505–1512.
Parisi, G. (1999). Complex Systems: A Physicist’s Viewpoint. Physica A, 263, 557–564.
Pasquale, F. (2015). The Black Box Society Vol. 36. Harvard: Harvard University Press.
Perry, W.L., McInnes, B., Price, C.C., Smith, S.C., & Hollywood, J.S. (2013). Predictive Policing: The role of crime forecasting in law enforcement operations. Santa Monica: RAND Corporation.
Poincaré, H. (1890). Sur le problème des trois corps et les équations de la dynamique. Acta Mathematica, 13, 1–270.
Richardson, L.F. (1922). Weather Prediction by Numerical Methods. Cambridge: Cambridge University Press.
Robbins, M. (2016). Has a rampaging AI algorithm really killed thousands in Pakistan? The Guardian. http://www.theguardian.com/science/the-lay-scientist/2016/feb/18/has-a-rampaging-ai-algorithm-really-killed-thousands-in-pakistan.
SKYNET (2015). Applying Advanced Cloud-based Behavior Analytics. The Intercept. https://theintercept.com/document/2015/05/08/skynet-applying-advanced-cloud-based-behavior-analytics/.
Saunders, J., Hunt, P., & Hollywood, J.S. (2016). Predictions put into practice: A quasi-experimental evaluation of Chicago’s predictive policing pilot. Journal of Experimental Criminology, 12, 1–25.
Takens, F. (1981). Detecting strange attractors in turbulence. In Rand, D., & Young, L.-S. (Eds.), Dynamical Systems and Turbulence, Lecture Notes in Mathematics (Vol. 898, pp. 366–381).
Weigend, A.S., & Gershenfeld, N.A. (Eds.) (1994). Time Series Prediction: Forecasting the Future and Understanding the Past. Reading: Addison-Wesley.
Cite this article
Hosni, H., Vulpiani, A. Forecasting in Light of Big Data. Philos. Technol. 31, 557–569 (2018). https://doi.org/10.1007/s13347-017-0265-3
DOI: https://doi.org/10.1007/s13347-017-0265-3