Abstract
In many problems from disciplines such as engineering, physics, medicine, and biology, a series of experimental data is used to generate a model that describes a system with minimum noise. The resulting model describes the behavior of the system under study and can be used to predict its future behavior. Herein, a novel hierarchical bi-level implementation of the cross validation method is presented. In this bi-level scheme, the leader optimization problem builds (trains) the model and the follower problem checks (tests) the developed model. The problem of synthesis and analysis of regulatory networks is used to compare the classical cross validation method with the proposed methodology, referred to as bi-level cross validation. In all the examples considered, bi-level cross validation yields a better model than the classical cross validation approach.
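For orientation, the classical cross validation baseline mentioned in the abstract can be sketched as below. This is a minimal Python illustration, assuming a one-variable least-squares model and made-up data; the function names (`fit_line`, `mean_squared_error`, `k_fold_cv`) and the sample values are hypothetical. It does not reproduce the bi-level formulation, in which training would act as the leader problem and testing as the follower problem.

```python
# Classical k-fold cross validation for a one-variable least-squares line.
# Illustrative baseline only; this is NOT the bi-level method of the article.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mean_squared_error(model, xs, ys):
    """Mean squared residual of the fitted line on the given points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_cv(xs, ys, k):
    """Average test error over k folds; fold f holds out indices i with i % k == f."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # Training step: build the model on the retained points.
        model = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        # Testing step: check the model on the held-out points.
        fold_errors.append(
            mean_squared_error(
                model, [xs[i] for i in test_idx], [ys[i] for i in test_idx]
            )
        )
    return sum(fold_errors) / k


# Hypothetical noisy data scattered around y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8, 11.2, 12.9, 15.1]
cv_error = k_fold_cv(xs, ys, k=4)
print(f"4-fold CV mean squared error: {cv_error:.4f}")
```

In this classical scheme the training and testing steps run one after the other inside each fold, whereas the abstract describes coupling them hierarchically as a leader and a follower optimization problem.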
Cite this article
Saharidis, G.K.D., Androulakis, I.P. & Ierapetritou, M.G. Model building using bi-level optimization. J Glob Optim 49, 49–67 (2011). https://doi.org/10.1007/s10898-010-9533-9
DOI: https://doi.org/10.1007/s10898-010-9533-9