Abstract
Ensemble learning, which combines multiple base learners to improve prediction accuracy, is widely used in statistical science and data mining. However, because of their “black box” nature, ensemble learning models are difficult to interpret. A recently proposed rule ensemble method known as RuleFit expresses each base learner as a production rule and also provides a measure of how each input variable influences the response. For a binary response, the RuleFit method applies a squared-error ramp loss function, and the base learners are weighted by shrinkage regression using the lasso method; RuleFit is therefore not constructed within a logistic regression model. Moreover, highly correlated pairs of base learners may be excessively pruned by the lasso method. In this study, we address this excess-pruning problem by constructing RuleFit within a logistic regression framework and weighting the base learners by the elastic net. The effectiveness of our proposed RuleFit model is illustrated through a real data set. In small-scale simulations, this method demonstrated higher predictive performance than the original RuleFit model.
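The idea sketched in the abstract — represent base learners as rules derived from tree leaves, then weight them with an elastic-net-penalized logistic regression so that correlated rules are shrunk together rather than pruned away — can be illustrated with a minimal example. This is an illustrative sketch using scikit-learn on synthetic data, not the authors' implementation; the choice of forest, penalty strength `C`, and mixing weight `l1_ratio` are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic binary-response data standing in for a real application.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Step 1: grow shallow trees; each leaf corresponds to a conjunctive rule.
forest = RandomForestClassifier(n_estimators=50, max_depth=3, random_state=0)
forest.fit(X_tr, y_tr)

# Step 2: encode each observation by the leaves (rules) it satisfies.
# apply() gives the leaf index per tree; one-hot encoding turns each
# leaf membership into a 0/1 rule feature.
enc = OneHotEncoder(handle_unknown="ignore")
R_tr = enc.fit_transform(forest.apply(X_tr))
R_te = enc.transform(forest.apply(X_te))

# Step 3: weight the rules by elastic-net logistic regression, so highly
# correlated rules are shrunk jointly instead of being excessively
# pruned, as tends to happen under the lasso alone.
clf = LogisticRegression(penalty="elasticnet", solver="saga",
                         l1_ratio=0.5, C=0.1, max_iter=5000)
clf.fit(R_tr, y_tr)
acc = clf.score(R_te, y_te)
print("held-out accuracy:", round(acc, 2))
```

Setting `l1_ratio=1.0` recovers a pure lasso penalty (closer to the original RuleFit weighting), which makes the contrast between the two shrinkage schemes easy to explore on the same rule features.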
References
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Breiman, L., Friedman, J.H., Olshen, R. and Stone, C.J. (1984). Classification and regression trees. Wadsworth International Group.
Freund, Y. and Schapire, R. E. (1996). Experiments with a new boosting algorithm. Proceedings of the Thirteenth International Conference on Machine Learning, 148–156.
Friedman, J.H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29, 1189–1232.
Friedman, J. H., Hastie, T., Höfling, H., and Tibshirani, R. (2007). Pathwise coordinate optimization. Annals of Applied Statistics, 1(2), 302–332.
Friedman, J. H. and Popescu, B. E. (2003). Importance sampled learning ensembles. Technical Report, Statistics Department, Stanford University.
Friedman, J. H. and Popescu, B. E. (2004). Gradient directed regularization for linear regression and classification. Technical Report, Statistics Department, Stanford University.
Friedman, J. H. and Popescu, B. E. (2008). Predictive learning via rule ensembles. Annals of Applied Statistics, 2(3), 916–954.
Hastie, T., Tibshirani, R., and Friedman, J. H. (2009). The elements of statistical learning (2nd edition). New York: Springer-Verlag.
Li, L., Yan, K., Shimokawa, T., Oyama, I., and Kitamura, S. (2013). Investigation of factors affecting the evaluation of street scapes in Japan and China, International Journal of Affective Engineering, 12(1), 1–10.
Ridgeway, G. (2007). Generalized boosted models: a guide to the gbm package. http://cran.r-project.org/web/packages/gbm/vignettes/gbm.pdf
Sexton, J. and Laake, P. (2007). Boosted regression trees with errors in variables, Biometrics, 63(2), 586–592.
Shimokawa, T., Tsuji, M., and Goto, M. (2011). Modified rule ensemble method and its application for bioceutical data. Japanese Journal of Applied Statistics, 40(1), 19–40 (in Japanese).
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(2), 267–288.
Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, 67(2), 301–320.
Cite this article
Shimokawa, T., Li, L., Yan, K. et al. Modified Rule Ensemble Method for Binary Data and Its Applications. Behaviormetrika 41, 225–244 (2014). https://doi.org/10.2333/bhmk.41.225