Abstract
In empirical modeling, an important desideratum for deeming theoretical entities and processes real is that they be reproducible in a statistical sense. Current-day crises regarding replicability in science intertwine with the question of how statistical methods link data to statistical and substantive theories and models. Different answers to this question have important methodological consequences for inference, which are intertwined with a contrast between the ontological commitments of the two types of models. The key to untangling them is the realization that behind every substantive model there is a statistical model that pertains exclusively to the probabilistic assumptions imposed on the data. It is not that the methodology determines whether to be a realist about entities and processes in a substantive field. It is rather that the substantive and statistical models refer to different entities and processes, and therefore call for different criteria of adequacy.
References
Baggerly, K. A., & Coombes, K. R. (2009). Deriving chemosensitivity from cell lines: Forensic bioinformatics and reproducible research in high-throughput biology. Annals of Applied Statistics, 3, 1309–1334.
Box, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, 71, 791–799.
Cox, D. R., & Hinkley, D. V. (1974). Theoretical statistics. London: Chapman & Hall.
Cox, D. R., & Mayo, D. G. (2010). Objectivity and conditionality in frequentist inference. In D. G. Mayo & A. Spanos (Eds.), Error and inference (pp. 276–304). Cambridge: Cambridge University Press.
Fama, E. F., & French, K. R. (2004). The capital asset pricing model: Theory and evidence. The Journal of Economic Perspectives, 18, 25–46.
Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society A, 222, 309–368.
Fisher, R. A. (1935). The design of experiments. Edinburgh: Oliver and Boyd.
Jensen, M. C. (1968). The performance of mutual funds in the period 1945–1964. Journal of Finance, 23, 389–416.
Lai, T. L., & Xing, H. (2008). Statistical models and methods for financial markets. New York: Springer.
Lintner, J. (1965). The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. Review of Economics and Statistics, 47, 13–37.
McGuirk, A., & Spanos, A. (2008). Revisiting error autocorrelation correction: Common factor restrictions and Granger non-causality. Oxford Bulletin of Economics and Statistics, 71, 273–294.
Mayo, D. G. (1996). Error and the growth of experimental knowledge. Chicago: The University of Chicago Press.
Mayo, D. G. (1997). Duhem’s problem, the Bayesian way, and error statistics, or “What’s belief got to do with it?”. Philosophy of Science, 64, 222–244.
Mayo, D. G. (2010a). Learning from error, severe testing, and the growth of theoretical knowledge. In D. G. Mayo & A. Spanos (Eds.), Error and inference: Recent exchanges on experimental reasoning, reliability, and the objectivity and rationality of science (pp. 28–57). Cambridge: Cambridge University Press.
Mayo, D. G. (2010b). Learning from error: The theoretical significance of experimental knowledge. The Modern Schoolman, 87(3/4), 191–217. (Special issue: Experimental and theoretical knowledge, The 9th Henle conference in the history of philosophy; guest editor, Kent Staley.)
Mayo, D. G., & Cox, D. R. (2010). Frequentist statistics as a theory of inductive inference. In D. G. Mayo & A. Spanos (Eds.), Error and inference: Recent exchanges on experimental reasoning, reliability, and the objectivity and rationality of science (Vol. 7, pp. 247–275). Cambridge: Cambridge University Press.
Mayo, D. G., & Spanos, A. (2004). Methodology in practice: Statistical misspecification testing. Philosophy of Science, 71, 1007–1025.
Mayo, D. G., & Spanos, A. (2006). Severe testing as a basic concept in a Neyman-Pearson philosophy of induction. British Journal for the Philosophy of Science, 57, 323–357.
Mayo, D. G., & Spanos, A. (2010). Error and inference: Recent exchanges on experimental reasoning, reliability, and the objectivity and rationality of science. Cambridge: Cambridge University Press.
Mayo, D. G., & Spanos, A. (2011). Error statistics. In D. Gabbay, P. Thagard, & J. Woods (Eds.), Philosophy of statistics, handbook of philosophy of science. Amsterdam: Elsevier.
Potti, A., Dressman, H. K., Bild, A., Riedel, R. F., Chan, G., Sayer, R., et al. (2006). Genomic signatures to guide the use of chemotherapeutics. Nature Medicine, 12, 1294–1300.
Senn, S. J. (2001). Two cheers for P-values. Journal of Epidemiology and Biostatistics, 6(2), 193–204.
Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance, 19, 425–442.
Spanos, A. (1990). The simultaneous equations model revisited: Statistical adequacy and identification. Journal of Econometrics, 44, 87–108.
Spanos, A. (1999). Probability theory and statistical inference: Econometric modeling with observational data. Cambridge: Cambridge University Press.
Spanos, A. (2006). Where do statistical models come from? Revisiting the problem of specification. In J. Rojo (Ed.), Optimality: The second Erich L. Lehmann symposium (Lecture notes-monograph series, Vol. 49, pp. 98–119). Institute of Mathematical Statistics.
Spanos, A. (2007). Curve-fitting, the reliability of inductive inference and the error-statistical approach. Philosophy of Science, 74(5), 1046–1066.
Spanos, A. (2010a). Theory testing in economics and the error statistical perspective. In D. G. Mayo & A. Spanos (Eds.), Error and inference: Recent exchanges on experimental reasoning, reliability, and the objectivity and rationality of science (pp. 202–246). Cambridge: Cambridge University Press.
Spanos, A. (2010b). Akaike-type criteria and the reliability of inference: Model selection versus statistical model specification. Journal of Econometrics, 158, 204–220.
Spanos, A. (2010c). Statistical adequacy and the trustworthiness of empirical evidence: Statistical vs. substantive information. Economic Modelling, 27, 1436–1452.
Spanos, A. (2010d). The discovery of argon: A case for learning from data? Philosophy of Science, 77(3), 359–380.
Spanos, A. (2010e). Is frequentist testing vulnerable to the base-rate fallacy? Philosophy of Science, 77, 565–583.
Spanos, A. (2013). A frequentist interpretation of probability for model-based inductive inference. Synthese, 190, 1555–1585.
Spanos, A., & McGuirk, A. (2001). The model specification problem from a probabilistic reduction perspective. American Journal of Agricultural Economics, 83, 1168–1176.
Additional information
Thanks are due to two anonymous reviewers for many constructive and valuable comments and suggestions.
Appendix: M-S testing and auxiliary regressions
In light of the fact that the Linear Regression model (Table 1) is specified in terms of the conditional mean and variance:
one can test for any departures from the linear regression assumptions: [1] Normality, [2] linearity, [3] homoskedasticity, [4] independence, and [5] t-invariance, by expanding the orthogonal decompositions stemming from (14) (Spanos 1999):
to include additional terms representing potential violations of these assumptions. Whereas the adequacy of the model assumes that \(E\left( u_{t} \mid X_{t}=x_{t}\right) =0\), the true conditional mean of the error might be non-zero when any of the assumptions [2]-[5] are invalid; similarly for \(E\left( u_{t}^{2} \mid X_{t}=x_{t}\right) =\sigma ^{2}\). A particular example of such auxiliary regressions, whose terms are only indicative of the kind one could use to seek out any remaining systematic information in the residuals, is:
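For illustration only, one possible pair of auxiliary regressions of this kind is sketched here. The particular extra regressors (trend terms \(t\), \(t^{2}\), the square \(x_{t}^{2}\) and the lag \(x_{t-1}\)), the residual notation \(\widehat{u}_{t}\) and the error terms \(\varepsilon _{1t}\), \(\varepsilon _{2t}\) are assumptions made for this sketch and are not taken from the text:

\[
\widehat{u}_{t}=\gamma _{10}+\gamma _{11}x_{t}+\gamma _{12}t+\gamma _{13}t^{2}+\gamma _{14}x_{t}^{2}+\gamma _{15}x_{t-1}+\varepsilon _{1t},
\]

\[
\widehat{u}_{t}^{2}=\gamma _{20}+\gamma _{21}x_{t}+\gamma _{22}t+\gamma _{23}x_{t}^{2}+\varepsilon _{2t}.
\]

In a sketch of this form, the model assumptions for the conditional mean would correspond to \(\gamma _{12}=\gamma _{13}=\gamma _{14}=\gamma _{15}=0\), and those for the conditional variance to \(\gamma _{21}=\gamma _{22}=\gamma _{23}=0\).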
In each case the null hypotheses \(H_{0}\) assert that the model assumptions hold, taking us back to (15). The terms beyond \(\gamma _{10}+\gamma _{11}x_{t}\) in (16) and beyond \(\gamma _{20}\) in (17) represent different types of systematic statistical information that the original model might have overlooked. The upshot is that the additional terms represent potential violations expressed in generic terms: they capture systematic statistical information already in \(\mathbf {Z}\) and do not directly refer to any specific substantive factors. Their statistical significance, however, raises the question of how generic terms such as \(t\) and \(t^{2}\), which represent substantive ignorance, can be replaced by relevant explanatory variables for substantive adequacy purposes; see Spanos (2010c).
One has reduced the problem of probing for model violations to testing the statistical significance of these additional terms, individually or in groups, using simple t-tests and F-tests (Spanos 1999). A rejection of a null hypothesis indicates departures from the underlying model assumption(s).
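The reduction of M-S testing to F-tests on added terms can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names (`ols_residuals`, `aux_f_stat`, `ms_tests`), the choice of extra regressors (a scaled trend, its square, and \(x_{t}^{2}\)) and the simulated data are all hypothetical.

```python
import numpy as np


def ols_residuals(X, y):
    """Return the OLS residuals and residual sum of squares of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid, float(resid @ resid)


def aux_f_stat(dep, base, extra):
    """F statistic for H0: all coefficients on the `extra` columns are zero.

    `base` holds the regressors kept under H0; `extra` holds the added
    terms that stand for potential departures from the assumptions.
    """
    full = np.column_stack([base, extra])
    _, rss_restricted = ols_residuals(base, dep)
    _, rss_unrestricted = ols_residuals(full, dep)
    q = extra.shape[1]                    # number of restrictions tested
    dof = dep.shape[0] - full.shape[1]    # residual degrees of freedom
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / dof)


def ms_tests(x, y):
    """Auxiliary-regression F statistics for the mean and the variance."""
    n = x.shape[0]
    t = np.arange(1, n + 1) / n           # scaled trend (assumed extra term)
    base = np.column_stack([np.ones(n), x])
    u, _ = ols_residuals(base, y)         # residuals of the original model

    # Mean: residuals on (1, x) plus the assumed extra terms t, t^2, x^2.
    f_mean = aux_f_stat(u, base, np.column_stack([t, t**2, x**2]))

    # Variance: squared residuals on a constant plus assumed terms x, x^2, t.
    f_var = aux_f_stat(u**2, np.ones((n, 1)), np.column_stack([x, x**2, t]))
    return f_mean, f_var


# Simulated example: the second data set hides a trend that the fitted
# model y = b0 + b1*x ignores, so the mean test should flag it.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
noise = rng.normal(size=n)
y_ok = 1.0 + 2.0 * x + noise
y_trend = y_ok + 0.05 * np.arange(n)

print("well specified:", ms_tests(x, y_ok))
print("omitted trend: ", ms_tests(x, y_trend))
```

Under the null, each statistic would be referred to an F distribution with `q` and `dof` degrees of freedom; a large value for the mean regression in the second data set would point to a departure such as a violation of t-invariance, without saying which substantive factor is responsible.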
Cite this article
Spanos, A., Mayo, D.G. Error statistical modeling and inference: Where methodology meets ontology. Synthese 192, 3533–3555 (2015). https://doi.org/10.1007/s11229-015-0744-y
DOI: https://doi.org/10.1007/s11229-015-0744-y