Boosting of Tree-Based Classifiers for Predictive Risk Modeling in GIS
Boosting of tree-based classifiers has been interfaced to the Geographical Information System (GIS) GRASS to create predictive classification models from digital maps. On a risk management problem in landscape ecology, the boosted tree model outperforms both a single classifier and bagging. The result is an improved digital map of the risk of human exposure to tick-borne diseases in Trentino (Italian Alps), based on sampling at 388 sites and on several overlaid georeferenced databases. Margin distributions are compared for bagging and boosting. Boosting is confirmed to give the most accurate model on two additional, independent test sets: reported cases of tick bites on humans, and infestation measured on roe deer. An interesting feature of combining classification models within a GIS is that the single elements of the combination can be visualized as maps: the map for each boosting step focuses on different details of the data distribution. In this problem, the best performance is obtained without controlling tree sizes, which indicates a strong interaction between input variables.
Keywords: Geographical Information System, Margin Distribution, Lyme Disease, Maximal Tree, Geographical Information System Data
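The margin distributions mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual voting margin for a two-class ensemble, that is, the label times the weighted vote divided by the total vote weight. It is not the paper's GRASS interface: the base learners are one-dimensional threshold stumps rather than the maximal trees the abstract reports as best, and the toy data and all function names (`fit_stump`, `adaboost`, `margins`) are hypothetical.

```python
import math

def stump_predict(thr, sign, x):
    """Threshold stump: predicts `sign` when x >= thr, otherwise -sign."""
    return sign if x >= thr else -sign

def fit_stump(xs, ys, w):
    """Return (weighted_error, thr, sign) of the best stump under weights w."""
    best = None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(wi for wi, x, y in zip(w, xs, ys)
                      if stump_predict(thr, sign, x) != y)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best

def adaboost(xs, ys, rounds):
    """Discrete AdaBoost on labels in {-1, +1}; returns [(alpha, thr, sign)]."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thr, sign = fit_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, sign))
        # Up-weight misclassified examples, down-weight correct ones.
        w = [wi * math.exp(-alpha * y * stump_predict(thr, sign, x))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def margins(ensemble, xs, ys):
    """Normalized voting margin of each example, in [-1, 1]."""
    norm = sum(abs(a) for a, _, _ in ensemble) or 1.0
    return [y * sum(a * stump_predict(t, s, x) for a, t, s in ensemble) / norm
            for x, y in zip(xs, ys)]

# Hypothetical toy data: one feature, labels that no single stump separates.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 1, -1, -1, -1, 1, 1]

m1 = margins(adaboost(xs, ys, 1), xs, ys)  # single stump
m3 = margins(adaboost(xs, ys, 3), xs, ys)  # three boosting steps
print(sorted(m1))
print(sorted(m3))
```

A negative margin marks a misclassified example, so comparing the sorted margins of two ensembles gives a rough picture of how confidently each one votes. For a bagged ensemble the same `margins` function applies with all weights `alpha` set equal.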