Multi-step methods for choosing the best set of variables in regression analysis

Computational Optimization and Applications

Abstract

In a recent article (Konno and Yamamoto, ISE 07-01, Department of Industrial and Systems Engineering, Chuo University, February 2007), one of the authors formulated the problem of choosing the best set of explanatory variables from a large number of candidate variables in a linear regression model as a mixed 0–1 integer linear programming problem and showed that it can be solved by state-of-the-art integer programming software.
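To make the phrase "mixed 0–1 integer linear programming problem" concrete, the following is a minimal sketch of how a variable selection problem can be written in that form. The abstract does not state the objective or the constraints, so everything here is an assumption for illustration: the absolute-deviation objective (one standard way to keep the objective linear), the observation index t = 1, ..., T, the k candidate variables x_{jt}, the number s of variables to select, and the coefficient bounds \underline{\beta}_j and \overline{\beta}_j.

```latex
\begin{align*}
\min_{\beta,\,u,\,v,\,z}\quad & \sum_{t=1}^{T} (u_t + v_t) \\
\text{s.t.}\quad
  & u_t - v_t = y_t - \beta_0 - \sum_{j=1}^{k} \beta_j x_{jt}, && t = 1,\dots,T, \\
  & \underline{\beta}_j z_j \le \beta_j \le \overline{\beta}_j z_j, && j = 1,\dots,k, \\
  & \sum_{j=1}^{k} z_j = s, \\
  & u_t \ge 0,\quad v_t \ge 0, && t = 1,\dots,T, \\
  & z_j \in \{0,1\}, && j = 1,\dots,k.
\end{align*}
```

In this sketch, z_j = 0 forces \beta_j = 0, so exactly s variables enter the model, and u_t + v_t equals the absolute residual of observation t at an optimal solution.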

In this paper, we propose multi-step methods for computing a near-optimal solution to instances of the problem that may not be solvable by the single-step method presented in Konno and Yamamoto (ISE 07-01, Department of Industrial and Systems Engineering, Chuo University, February 2007). We show that a multi-step method can generate a nearly optimal solution in a fraction of the computation time of the single-step method.
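The general idea of replacing one hard search with several easier ones can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not describe the individual steps, so the screening rule (rank variables by their single-variable fit), the function names `sse`, `best_subset`, and `two_step`, and the use of exhaustive enumeration with squared error in place of an integer programming solver are all assumptions made for illustration.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a small square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def sse(columns, y, subset):
    """Residual sum of squares of y regressed on an intercept plus `subset`."""
    design = [[1.0] + [columns[j][t] for j in subset] for t in range(len(y))]
    p = len(subset) + 1
    # Normal equations: (X'X) beta = X'y
    gram = [[sum(row[a] * row[b] for row in design) for b in range(p)]
            for a in range(p)]
    rhs = [sum(row[a] * yt for row, yt in zip(design, y)) for a in range(p)]
    beta = solve_linear(gram, rhs)
    return sum((yt - sum(bc * xc for bc, xc in zip(beta, row))) ** 2
               for row, yt in zip(design, y))


def best_subset(columns, y, candidates, size):
    """Single-step search: enumerate every subset of `size` candidates."""
    return min(combinations(candidates, size),
               key=lambda sub: sse(columns, y, sub))


def two_step(columns, y, keep, size):
    """Hypothetical two-step search.

    Step 1 shrinks the candidate pool to `keep` variables using a cheap
    single-variable screen; step 2 runs the exhaustive search on that pool.
    """
    pool = sorted(range(len(columns)), key=lambda j: sse(columns, y, (j,)))[:keep]
    return tuple(sorted(best_subset(columns, y, pool, size)))
```

With k candidates and s variables to select, the single-step search examines C(k, s) subsets, while the second step here examines only C(keep, s). The price is that a variable discarded by the screen can never be recovered, which is why such a scheme yields a near-optimal rather than a guaranteed optimal subset.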

We also demonstrate that the best set of variables in terms of squared error can be recovered under a normality assumption.

References

  1. Arthanari, T.S., Dodge, Y.: Mathematical Programming in Statistics. Wiley, New York (1981)

  2. Balas, E.: Disjunctive programming: properties of the convex hull of feasible points. Discrete Appl. Math. 89, 3–44 (1998)

  3. Crowder, H., Johnson, E.L., Padberg, M.: Solving large-scale zero-one linear programming problems. Oper. Res. 31, 803–834 (1983)

  4. Efroymson, M.A.: Multiple regression analysis. In: Ralston, A., Wilf, H.S. (eds.) Mathematical Methods for Digital Computers. Wiley, New York (1960)

  5. Furnival, G.M., Wilson, R.W. Jr.: Regressions by leaps and bounds. Technometrics 16, 499–511 (1974)

  6. Galindo, J., Tamayo, P.: Credit risk assessment using statistical and machine learning: basic methodology and risk modeling applications. Comput. Econ. 15, 107–143 (2000)

  7. Judge, G.R., Hill, R.C., Griffiths, W.E., Lutkepohl, H., Lee, T.-C.: Introduction to the Theory and Practice of Econometrics. Wiley, New York (1998)

  8. Konno, H., Kawadai, N., Wu, D.: Estimation of failure probability using semi-definite logit model. Comput. Manag. Sci. 1, 59–73 (2004)

  9. Konno, H., Koshizuka, T.: Mean-absolute deviation model. IIE Trans. 37, 893–900 (2005)

  10. Konno, H., Yamamoto, R.: Choosing the best set of variables in regression analysis using integer programming. ISE 07-01, Department of Industrial and Systems Engineering, Chuo University (February 2007)

  11. Konno, H., Yamazaki, H.: Mean-absolute deviation portfolio optimization model and its applications to Tokyo stock market. Manag. Sci. 37, 519–531 (1991)

  12. Osborne, M.R.: On the computation of stepwise regression. Aust. Comput. J. 8, 61–68 (1976)

  13. Padberg, M., Rinaldi, G.: A branch-and-cut algorithm for the resolution of large-scale symmetric traveling salesman problems. SIAM Rev. 33, 60–100 (1991)

  14. Pardalos, P., Boginski, V., Vazacopoulos, A.: Data Mining in Biomedicine. Springer, Berlin (2007)

  15. Wolsey, L.A.: Integer Programming. Wiley, New York (1998)

Author information

Corresponding author

Correspondence to Hiroshi Konno.

About this article

Cite this article

Konno, H., Takaya, Y. Multi-step methods for choosing the best set of variables in regression analysis. Comput Optim Appl 46, 417–426 (2010). https://doi.org/10.1007/s10589-008-9193-6

  • DOI: https://doi.org/10.1007/s10589-008-9193-6
