Journal of Global Optimization, Volume 73, Issue 2, pp 431–446

Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor

  • Ryuta Tamura
  • Ken Kobayashi
  • Yuichi Takano
  • Ryuhei Miyashiro
  • Kazuhide Nakata
  • Tomomi Matsui


Multicollinearity exists when some explanatory variables of a multiple linear regression model are highly correlated. High correlation among explanatory variables reduces the reliability of the analysis. To eliminate multicollinearity from a linear regression model, we consider how to select a subset of significant variables by means of the variance inflation factor (VIF), which is the most common indicator used in detecting multicollinearity. In particular, we adopt the mixed integer optimization (MIO) approach to subset selection. The MIO approach was proposed in the 1970s, and recently it has received renewed attention due to advances in algorithms and hardware. However, none of the existing studies have developed a computationally tractable MIO formulation for eliminating multicollinearity on the basis of VIF. In this paper, we propose mixed integer quadratic optimization (MIQO) formulations for selecting the best subset of explanatory variables subject to the upper bounds on the VIFs of selected variables. Our two MIQO formulations are based on the two equivalent definitions of VIF. Computational results illustrate the effectiveness of our MIQO formulations by comparison with conventional local search algorithms and MIO-based cutting plane algorithms.
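As an illustration of the quantities the abstract refers to, the sketch below computes the VIF in two ways and then selects a subset under a VIF upper bound. It rests on several assumptions that are not stated in the abstract: the "two equivalent definitions" are taken to be the textbook ones, VIF_j = 1/(1 - R_j^2) (with R_j^2 from regressing variable j on the other selected variables) and the j-th diagonal entry of the inverse correlation matrix; the function names, the synthetic data, and the bound of 10 are hypothetical; and the subset search is plain exhaustive enumeration, used here as a stand-in for the MIQO formulations, which are not reproduced.

```python
import itertools
import numpy as np


def vif_from_regression(X):
    """VIF_j = 1 / (1 - R_j^2), R_j^2 from regressing column j on the other columns."""
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + remaining columns
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        tss = np.sum((target - target.mean()) ** 2)
        r2 = 1.0 - (resid @ resid) / tss
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


def vif_from_correlation(X):
    """VIF_j as the j-th diagonal entry of the inverse correlation matrix."""
    R = np.atleast_2d(np.corrcoef(X, rowvar=False))  # atleast_2d handles one column
    return np.diag(np.linalg.inv(R))


def best_subset_with_vif_bound(X, y, max_size, vif_bound):
    """Exhaustive search (NOT the paper's MIQO): minimize the residual sum of
    squares over subsets whose VIFs all stay at or below vif_bound."""
    n, p = X.shape
    best_rss, best_subset = np.inf, None
    for size in range(1, max_size + 1):
        for subset in itertools.combinations(range(p), size):
            XS = X[:, list(subset)]
            if np.max(vif_from_correlation(XS)) > vif_bound:
                continue  # violates the VIF upper bound
            A = np.column_stack([np.ones(n), XS])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            rss = resid @ resid
            if rss < best_rss:
                best_rss, best_subset = rss, subset
    return best_rss, best_subset


# Hypothetical synthetic data: column 2 is nearly a copy of column 0.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.05 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 - x2 + rng.normal(size=n)

print("VIF (regression): ", vif_from_regression(X))
print("VIF (correlation):", vif_from_correlation(X))
print("Selected subset:  ", best_subset_with_vif_bound(X, y, max_size=3, vif_bound=10.0)[1])
```

Exhaustive enumeration visits every subset and so is only practical for a handful of candidate variables; it is meant to make the constrained problem concrete, not to suggest how the formulations in the paper are solved.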


Keywords: Integer programming, Subset selection, Multicollinearity, Variance inflation factor, Multiple linear regression, Statistics



This work was partially supported by JSPS KAKENHI Grant Nos. JP17K01246 and JP17K12983.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Ryuta Tamura (1, 2)
  • Ken Kobayashi (3)
  • Yuichi Takano (4, 5)
  • Ryuhei Miyashiro (6), corresponding author
  • Kazuhide Nakata (7)
  • Tomomi Matsui (7)

  1. Graduate School of Engineering, Tokyo University of Agriculture and Technology, Koganei-shi, Japan
  2. October Sky Co., Ltd., Fuchu-shi, Japan
  3. Artificial Intelligence Laboratory, Fujitsu Laboratories Ltd., Kawasaki-shi, Japan
  4. School of Network and Information, Senshu University, Kawasaki-shi, Japan
  5. Faculty of Engineering, Information and Systems, University of Tsukuba, Tsukuba-shi, Japan
  6. Institute of Engineering, Tokyo University of Agriculture and Technology, Koganei-shi, Japan
  7. School of Engineering, Tokyo Institute of Technology, Meguro-ku, Japan
