Machine learning methods have recently raised high expectations in the climate modelling context with a view to addressing climate change, but they are often regarded as non-physics-based ‘black boxes’ that may not provide any understanding. However, in many ways, understanding seems indispensable for appropriately evaluating climate models and building confidence in climate projections. Drawing on two case studies, we compare how machine learning and standard statistical techniques affect our ability to understand the climate system. For that purpose, we put five evaluative criteria of understanding to work: intelligibility, representational accuracy, empirical accuracy, coherence with background knowledge, and assessment of the domain of validity. We argue that the two families of methods lie on the same continuum, where these various criteria of understanding come in degrees, and that machine learning methods therefore do not necessarily constitute a radical departure from standard statistical tools as far as understanding is concerned.
Such identification may actually be constrained by a form of holism of confirmation and refutation that generally characterizes (complex) climate models (Lenhard and Winsberg 2010).
The approach here is similar to the framework proposed in Knüsel and Baumberger (2020), although our aims are different (and complementary): whereas they want to show that climate models involving machine learning can provide some understanding in certain cases (they discuss a case study, on which part of their argument crucially relies), we aim to emphasise that this is all a matter of degree, already within ‘standard’ climate modelling without machine learning.
Adequacy and intelligibility are commonly considered the two central ‘pillars’ of understanding. Thus, de Regt distinguishes understanding a phenomenon—that is, having an adequate explanation of the phenomenon—and understanding a theory—that is, being able to use the theory (2017, p. 23). When Wilkenfeld (2017) introduces his Multiple Understanding Dimensions (MUD) theory as “a natural synthesis of existing views” of understanding, he argues that “representational-accuracy (of which we assume truth is one kind) and intelligibility (which we will define so as to entail abilities) are good-making features of a state of understanding” (Wilkenfeld 2017, p. 1274). Following this lead, Knüsel and Baumberger (2020, §3) offer three “dimensions” of understanding: representational accuracy, representational depth, and graspability.
In Knüsel and Baumberger (2020), empirical accuracy and representational accuracy are also closely related notions: they see the first as an evaluative criterion for the second. What we mean by representational accuracy here also partly includes what they call (but set aside) representational depth. Note that we do not distinguish between dimensions of understanding and evaluative criteria for understanding, since we reckon that the former, as we define them here, are also directly evaluative.
Biases in model outputs can have different origins, one of the most obvious being the finite resolution of climate models, which leads to various types of model errors at the global, regional and local scales. It is important to emphasise that bias correction “cannot overcome errors from a substantial misrepresentation of relevant processes” (Maraun and Widmann 2018, p. 117), such as global-scale circulation biases or missing (or misrepresented) local-scale processes (e.g. linked to complex orography, as in the example discussed below); evaluating precisely when relevant processes are being substantially misrepresented can of course be a tricky issue (especially in the climate change context) and actually lies at the heart of the discussion below.
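To make concrete what bias correction does—and what it cannot do—here is a minimal quantile-mapping sketch in Python. This is an illustrative toy, not the specific method used in CH2018: the correction matches the model's output distribution to observations over a calibration period, but a process the model structurally misrepresents would simply be remapped, not repaired.

```python
import numpy as np

def quantile_map(model_hist, obs, model_out):
    """Empirical quantile mapping: send each model value to the observed
    value at the same quantile of the calibration-period distributions."""
    q = np.linspace(0.0, 1.0, 101)
    return np.interp(model_out, np.quantile(model_hist, q), np.quantile(obs, q))

# Demo: a model run with a warm bias and inflated variance is pulled
# back onto the observed distribution over the calibration period.
rng = np.random.default_rng(0)
obs = rng.normal(10.0, 2.0, 5000)                 # "observed" temperatures
model = 1.2 * rng.normal(10.0, 2.0, 5000) + 2.0   # biased model output
corrected = quantile_map(model, obs, model)
```

After correction, the mean and spread of `corrected` closely match those of `obs`; the point of the footnote is that this distributional fix says nothing about whether the underlying processes are represented correctly.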
We closely follow CH2018 here, to which we refer for more details.
CMIP5 is the fifth phase of the Coupled Model Intercomparison Project, which provides a standardised framework for comparing GCM simulations; EURO-CORDEX is the European branch of the Coordinated Regional Climate Downscaling Experiment, which is the regional counterpart of CMIP5 for RCM simulations.
From the point of view of understanding, regional climate modelling involving ‘only’ dynamical downscaling raises issues similar to those of global climate modelling (e.g. about model complexity, parameterisation and opacity); in contrast, and to a certain extent, statistical downscaling involves some different—typically statistical—issues for understanding, akin to those encountered in machine learning approaches (see Sect. 4).
Various limitations to statistical downscaling (and to the intelligibility of statistically downscaled models) arise in particular in “topographically structured terrain” such as the Alpine region of Switzerland.
For instance, empirical accuracy is not the same for all variables; e.g., it is in general better for temperature than for precipitation. It should be noted that, overall, empirical accuracy is better at the global scale than at the regional and local scales.
One reason has to do with the role of calibration of parameter values in achieving empirical accuracy with past and current observations. Another important reason relates to the criteria concerning representational accuracy and the domain of validity (see below): in the context of radically different boundary conditions, such as high forcing scenarios, certain empirical parameterisation procedures may not be valid anymore and important feedbacks may be missing.
It could be asked whether this case is representative of the use of machine learning in climate science. This question is difficult to answer because the use of DNNs in climate science is still relatively novel (see Reichstein et al. 2019). The results of Gentine et al. (2018) can be interpreted as a proof of concept: they show that DNNs have the potential to address some computational problems. Accordingly, climate projections under different forcings are not considered.
Gentine et al. (2018) do not examine or discuss how CBRAIN would perform with respect to different forcings, or how well it is suited to address the issue of climate change in general. In principle, it is possible to evaluate how well CBRAIN performs in comparison to SPCAM for different boundary conditions, because SPCAM can generate test data for a variety of boundary conditions, and one could evaluate CBRAIN on these test sets.
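The logic of such an out-of-domain evaluation can be sketched in a toy setting (a hypothetical setup: CBRAIN is a deep network trained on SPCAM output, not the polynomial fit used here). An emulator fitted under one boundary condition is tested on data generated under a shifted one, mimicking the comparison against reference-model test data described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data generator: the "true" response depends on a forcing
# parameter standing in for a boundary condition.
def make_data(forcing, n=2000):
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + forcing * x**2
    return x, y

x_train, y_train = make_data(forcing=0.0)  # training regime
x_test, y_test = make_data(forcing=3.0)    # shifted boundary condition

# Simple polynomial emulator fitted on the training regime only
# (a stand-in for a neural-network emulator).
A = np.vander(x_train, 6)
coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)

def rmse(x, y):
    return float(np.sqrt(np.mean((np.vander(x, 6) @ coef - y) ** 2)))

in_domain_error = rmse(x_train, y_train)    # small: interpolation
out_of_domain_error = rmse(x_test, y_test)  # large: validity exceeded
```

The gap between the two errors is what an evaluation across boundary conditions would quantify; for CBRAIN, SPCAM runs under the altered forcing would play the role of the shifted test set.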
The precise extent to which statistical downscaling methods allow for some manipulability (and hence for some intelligibility) is an open (and a case-by-case) issue; there is actually a call in the climate modelling community for designing “ensembles of statistical downscaling methods or even ensembles combining GCMs, RCMs and a range of statistical methods” (Maraun and Widmann 2018, p. 284)—such ensembles would help to get a clearer picture on the manipulability issue.
Downscaling techniques have been applied in weather forecasting since the late 1950s (see Maraun and Widmann 2018, ch. 3).
Alain, G., & Bengio, Y. (2016). Understanding intermediate layers using linear classifier probes. arXiv:1610.01644v4.
Baumberger, C. (2019). Explicating objectual understanding: Taking degrees seriously. Journal for General Philosophy of Science, 50, 367–388.
Baumberger, C., Knutti, R., & Hirsch Hadorn, G. (2017). Building confidence in climate model projections: An analysis of inferences from fit. WIREs Climate Change, 8, e454.
CH2018. (2018). Climate Scenarios for Switzerland. Technical Report. National Centre for Climate Services, Zurich.
de Regt, H. W. (2017). Understanding scientific understanding. New York: Oxford University Press.
de Regt, H. W., & Dieks, D. (2005). A contextual approach to scientific understanding. Synthese, 144, 133–170.
Gentine, P., Pritchard, M., Rasp, S., Reinaudi, G., & Yacalis, G. (2018). Could machine learning break the convection parameterization deadlock? Geophysical Research Letters, 45, 5742–51.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. Cambridge: MIT Press.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd ed.). Springer series in statistics. Berlin: Springer.
Held, I. M. (2005). The gap between simulation and understanding in climate modeling. Bulletin of the American Meteorological Society, 86(11), 1609–1614.
Hempel, C. G., & Oppenheim, P. (1948). Studies in the logic of explanation. Philosophy of Science, 15(2), 135–175.
Hewitson, B. C., Daron, J., Crane, R. G., Zermoglio, M. F., & Jack, C. (2014). Interrogating empirical-statistical downscaling. Climatic Change, 122, 539–554.
IPCC. (2013). Climate change 2013: The physical science basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge: Cambridge University Press.
Kawamleh, S. (2021). Can machines learn how clouds work? The epistemic implications of machine learning methods in climate science. Philosophy of Science, 88(5).
Khairoutdinov, M., Randall, D., & Demott, C. (2005). Simulations of the atmospheric general circulation using a cloud-resolving model as a superparameterization of physical processes. Journal of the Atmospheric Sciences, 62, 2136–54.
Knüsel, B., & Baumberger, C. (2020). Understanding climate phenomena with data-driven models. Studies in History and Philosophy of Science Part A. https://doi.org/10.1016/j.shpsa.2020.08.003.
Knutti, R. (2018). Climate model confirmation: From philosophy to predicting climate in the real world. In E. A. Lloyd & E. Winsberg (Eds.), Climate modelling: Philosophical and conceptual issues (pp. 325–359). Cham: Palgrave Macmillan.
Kuorikoski, J. (2011). Simulation and the sense of understanding. In P. Humphreys & C. Imbert (Eds.), Models, simulations, and representations, Chapter 8 (pp. 250–273). London: Routledge.
Kuorikoski, J., & Ylikoski, P. (2015). External representations and scientific understanding. Synthese, 192, 3817–3837.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521, 436–44.
Lenhard, J., & Winsberg, E. (2010). Holism, entrenchment, and the future of climate model pluralism. Studies in History and Philosophy of Science Part B, 41(3), 253–262.
López-Rubio, E., & Ratti, E. (2019). Data science and molecular biology: Prediction and mechanistic explanation. Synthese. https://doi.org/10.1007/s11229-019-02271-0.
Maraun, D., & Widmann, M. (2018). Statistical downscaling and bias correction for climate research. Cambridge: Cambridge University Press.
Maraun, D., et al. (2017). Towards process-informed bias correction of climate change simulations. Nature Climate Change, 7, 764–773.
Meiburg, E. (1986). Comparison of the molecular dynamics method and the direct simulation Monte Carlo technique for flows around simple geometries. Physics of Fluids, 29, 3107–3113.
Parker, W. S. (2014). Simulation and understanding in the study of weather and climate. Perspectives on Science, 22(3), 336–356.
Parker, W. S. (2020). Model evaluation: An adequacy-for-purpose view. Philosophy of Science, 87(3), 457–477. https://doi.org/10.1086/708691.
Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., Carvalhais, N., et al. (2019). Deep learning and process understanding for data-driven earth system science. Nature, 566, 195–204.
Rummukainen, M. (2016). Added value in regional climate modeling. WIREs Climate Change, 7, 145–159.
Sullivan, E. (2019). Understanding from machine learning models. British Journal for the Philosophy of Science. https://doi.org/10.1093/bjps/axz035.
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2014). Intriguing properties of neural networks. arXiv:1312.6199v4.
Trout, J. (2002). Scientific explanation and the sense of understanding. Philosophy of Science, 69, 212–233.
Vidal, R., Bruna, J., Giryes, R., & Soatto, S. (2017). Mathematics of deep learning. arXiv:1712.04741.
Wilkenfeld, D. A. (2017). Muddy understanding. Synthese, 194(4), 1273–93.
We thank the participants of the philosophy of science research colloquium in the Spring semester 2020 at the University of Bern for valuable feedback on an earlier draft of the paper. We also wish to thank the participants of the seminar ‘Philosophy of science perspectives on the climate challenge’ and the workshop ‘Big data, machine learning, climate modelling and understanding’ in the Fall semester 2019 at the University of Bern and supported by the Oeschger Centre for Climate Change Research. JJ and VL are grateful to the Swiss National Science Foundation for financial support (Grant PP00P1_170460). TR was funded by the cogito foundation.
Jebeile, J., Lam, V. & Räz, T. Understanding climate change with statistical downscaling and machine learning. Synthese (2020). https://doi.org/10.1007/s11229-020-02865-z
- Climate models
- Dynamical and statistical downscaling
- Deep neural networks
- Machine learning
- Climate change