Ensemble Modeling for Bio-medical Applications
In this paper we propose the use of ensembles of models constructed with methods of statistical learning. The input data for model construction consist of real measurements taken in the physical system under consideration. We further propose a software toolbox that allows the construction of single models as well as heterogeneous ensembles of linear and nonlinear model types. Several well-performing model types, among them ridge regression, k-nearest neighbor models and neural networks, have been implemented. Ensembles of heterogeneous models typically yield better generalization performance than homogeneous ensembles. Also provided are methods for model validation and assessment, as well as adaptor classes that perform transparent feature selection or random subspace training on a large number of input variables. The toolbox is implemented in Matlab and C++ and is available under the GPL. Several applications of the described methods and of the numerical toolbox itself are described. These include ECG modeling, classification of activity in drug design and ...
Keywords: Ensemble Modeling; Multivariate Adaptive Regression Spline; Generalization Error; Ensemble Class; Stochastic Gradient Descent
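As a rough illustration of the heterogeneous-ensemble idea described in the abstract, the following Python sketch averages a linear model (one-dimensional ridge regression) with a nonlinear one (k-nearest-neighbor regression). It is not the toolbox's API, which is implemented in Matlab and C++. All function names (`fit_ridge_1d`, `fit_knn_1d`, `make_ensemble`), the equal-weight averaging, and the toy data are assumptions made for this example only.

```python
# Illustrative sketch only: every name and value here is hypothetical,
# not taken from the toolbox described in the abstract.

def fit_ridge_1d(xs, ys, lam=1.0):
    """Fit y ~ a*x + b with an L2 penalty lam on the slope a.

    The intercept b is left unpenalized by centering the data first.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / (sxx + lam)  # the penalty shrinks the slope towards zero
    b = my - a * mx
    return lambda x: a * x + b


def fit_knn_1d(xs, ys, k=3):
    """k-nearest-neighbor regression: average the targets of the k closest points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def make_ensemble(models):
    """Combine member models by equal-weight averaging of their outputs."""
    return lambda x: sum(m(x) for m in models) / len(models)


# Made-up training data, roughly following y = x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.9, 4.2, 8.8, 16.1, 24.9]

ridge = fit_ridge_1d(xs, ys, lam=1.0)
knn = fit_knn_1d(xs, ys, k=2)
ensemble = make_ensemble([ridge, knn])

for x in (1.5, 4.5):
    print(x, ridge(x), knn(x), ensemble(x))
```

Equal weights are the simplest combination rule. A validation-based weighting or member selection step could replace `make_ensemble` without changing the member models.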