Quantitative Geosciences: Data Analytics, Geostatistics, Reservoir Characterization and Modeling, pp 123-149

# Regression-Based Predictive Analytics

## Abstract

Regression is one of the most commonly used multivariate statistical methods. Multivariate linear regression can integrate many explanatory variables to predict a target variable. However, collinearity caused by intercorrelations among the explanatory variables can produce counterintuitive results in multivariate regression, such as unstable coefficient estimates. This chapter presents both basic and advanced regression methods, including standard least-squares linear regression, ridge regression, and principal component regression. Pitfalls in using these methods for geoscience applications are also discussed.
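As an illustration of the three methods named above, the following is a minimal NumPy sketch, not code from the chapter. It fits least-squares, ridge, and principal component regression to synthetic data in which two explanatory variables are nearly identical. The data, the function names, the ridge penalty `lam=10.0`, and the choice of `k=2` retained components are all assumptions made for the example.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for centered X and y (no intercept)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, lam):
    """Ridge coefficients: solve (X'X + lam * I) b = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def pcr(X, y, k):
    """Principal component regression on the k leading components of X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                      # loadings of the retained components
    scores = X @ V_k                    # component scores
    gamma = np.linalg.lstsq(scores, y, rcond=None)[0]
    return V_k @ gamma                  # map back to the original variables

def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

# Synthetic data (hypothetical): x2 is almost a copy of x1, so the two are collinear.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 * x1 + 1.0 * x2 + 0.5 * x3 + rng.normal(scale=0.5, size=n)

# Center the variables so that no intercept term is needed.
X = X - X.mean(axis=0)
y = y - y.mean()

b_ols = ols(X, y)
b_ridge = ridge(X, y, lam=10.0)
b_pcr = pcr(X, y, k=2)
vifs = vif(X)

print("VIF:  ", vifs)
print("OLS:  ", b_ols)
print("Ridge:", b_ridge)
print("PCR:  ", b_pcr)
```

In this setup the variance inflation factors flag `x1` and `x2` as collinear. Ridge shrinks the coefficient vector relative to least squares, and dropping the smallest principal component forces the two collinear variables to share nearly equal coefficients. Both remedies trade some bias for lower variance, so the penalty and the number of components would normally be chosen by validation rather than fixed as here.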

## Copyright information

© Springer Nature Switzerland AG 2019