Abstract
Multicollinearity exists when some explanatory variables of a multiple linear regression model are highly correlated. High correlation among explanatory variables reduces the reliability of the analysis. To eliminate multicollinearity from a linear regression model, we consider how to select a subset of significant variables by means of the variance inflation factor (VIF), which is the most common indicator used in detecting multicollinearity. In particular, we adopt the mixed integer optimization (MIO) approach to subset selection. The MIO approach was proposed in the 1970s, and recently it has received renewed attention due to advances in algorithms and hardware. However, none of the existing studies have developed a computationally tractable MIO formulation for eliminating multicollinearity on the basis of VIF. In this paper, we propose mixed integer quadratic optimization (MIQO) formulations for selecting the best subset of explanatory variables subject to the upper bounds on the VIFs of selected variables. Our two MIQO formulations are based on the two equivalent definitions of VIF. Computational results illustrate the effectiveness of our MIQO formulations by comparison with conventional local search algorithms and MIO-based cutting plane algorithms.
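As an illustration of the indicator discussed above, the following minimal sketch computes VIFs in two ways. The abstract does not spell out its two definitions, so the sketch assumes the two standard textbook forms: \(1/(1-R_j^2)\), where \(R_j^2\) is the coefficient of determination from regressing variable \(j\) on the remaining variables, and the \(j\)-th diagonal entry of the inverse correlation matrix. The function names, the synthetic data, and the upper bound of 10 are hypothetical choices for this example and are not taken from the paper. The sketch only evaluates VIFs for a fixed set of variables; it is not the MIQO formulation itself.

```python
import numpy as np

def vif_from_r2(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the other columns."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize so no intercept is needed
    p = Z.shape[1]
    vifs = np.empty(p)
    for j in range(p):
        others = np.delete(Z, j, axis=1)
        coef, *_ = np.linalg.lstsq(others, Z[:, j], rcond=None)
        resid = Z[:, j] - others @ coef
        r2 = 1.0 - (resid @ resid) / (Z[:, j] @ Z[:, j])
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

def vif_from_corr(X):
    """VIF_j = j-th diagonal entry of the inverse of the correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

# Synthetic data (assumption): x3 is almost a linear combination of x1 and x2,
# while x4 is independent of the other variables.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + 0.1 * rng.normal(size=n)
x4 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3, x4])

v_r2 = vif_from_r2(X)
v_corr = vif_from_corr(X)

# Hypothetical upper bound on VIF; 10 is a common rule of thumb, not the paper's setting.
alpha = 10.0
violating = np.flatnonzero(v_corr > alpha)

print("VIF (R^2 definition):       ", np.round(v_r2, 2))
print("VIF (inverse correlation):  ", np.round(v_corr, 2))
print("Variables exceeding the bound:", violating)
```

Both functions should agree up to floating-point error, which reflects the equivalence of the two definitions. A subset satisfies the VIF constraint only if no variable in it exceeds the bound once the VIFs are recomputed on that subset alone.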
Acknowledgements
This work was partially supported by JSPS KAKENHI Grant Nos. JP17K01246 and JP17K12983.
Cite this article
Tamura, R., Kobayashi, K., Takano, Y. et al. Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor. J Glob Optim 73, 431–446 (2019). https://doi.org/10.1007/s10898-018-0713-3
DOI: https://doi.org/10.1007/s10898-018-0713-3