Journal of Global Optimization, Volume 49, Issue 1, pp 49–67

Model building using bi-level optimization

  • G. K. D. Saharidis
  • I. P. Androulakis
  • M. G. Ierapetritou
Article

Abstract

In many problems from disciplines such as engineering, physics, medicine, and biology, a set of experimental data is used to generate a model that describes a system with minimum noise. The model-building procedure provides a description of the behavior of the system under study and can be used to predict its future behavior. Herein, a novel hierarchical bi-level implementation of the cross-validation method is presented. In this bi-level scheme, the leader optimization problem builds (trains) the model and the follower checks (tests) the developed model. The problem of synthesis and analysis of regulatory networks is used to compare the classical cross-validation method to the proposed methodology, referred to as bi-level cross-validation. In all the examples considered, bi-level cross-validation results in a better model than the classical cross-validation approach.
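To make the leader/follower (train/test) hierarchy concrete, the sketch below illustrates the idea on a deliberately simple stand-in problem: selecting a polynomial regression model on synthetic data. This is only an illustrative sketch under stated assumptions; the synthetic data, the candidate polynomial degrees, and the fold layout are all assumptions and do not reproduce the paper's mixed-integer regulatory-network formulation.

```python
# Minimal sketch of the leader/follower idea behind bi-level cross-validation.
# Assumptions: polynomial regression on synthetic noisy data (not the paper's
# regulatory-network problem); 5 folds; candidate degrees 1..7.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)  # noisy observations

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k folds."""
    return np.array_split(np.arange(n), k)

def bilevel_cv(x, y, candidate_degrees, k=5):
    """Leader: fit each candidate model on the training folds.
    Follower: score the fitted model on the held-out (test) fold.
    The candidate with the best follower score is selected."""
    folds = k_fold_indices(len(x), k)
    best_deg, best_err = None, np.inf
    for deg in candidate_degrees:                      # leader's discrete choice
        test_err = 0.0
        for i, test_idx in enumerate(folds):
            train_idx = np.hstack([f for j, f in enumerate(folds) if j != i])
            coeff = np.polyfit(x[train_idx], y[train_idx], deg)   # leader: training
            pred = np.polyval(coeff, x[test_idx])
            test_err += np.mean((pred - y[test_idx]) ** 2)        # follower: testing
        test_err /= k
        if test_err < best_err:
            best_deg, best_err = deg, test_err
    return best_deg, best_err

deg, err = bilevel_cv(x, y, candidate_degrees=range(1, 8))
print(f"selected polynomial degree: {deg}, mean test MSE: {err:.4f}")
```

In this toy setting the leader's decision is simply the model order, whereas in the paper the leader solves a training optimization problem and the follower's testing problem constrains it within a single hierarchical (bi-level) program.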

Keywords

Model building · Bi-level optimization · Cross-validation · Regulatory networks



Copyright information

© Springer Science+Business Media, LLC. 2010

Authors and Affiliations

  • G. K. D. Saharidis (1)
  • I. P. Androulakis (2)
  • M. G. Ierapetritou (3)
  1. Center for Advanced Infrastructure and Transportation, Rutgers, The State University of New Jersey, Piscataway, USA
  2. Department of Biomedical Engineering, Rutgers, The State University of New Jersey, Piscataway, USA
  3. Department of Chemical and Biochemical Engineering, Rutgers, The State University of New Jersey, Piscataway, USA
