Abstract
Ensemble learning, which combines multiple base learners to improve prediction accuracy, is widely used in statistical science and data mining. However, because of their “black box” nature, ensemble learning models are difficult to interpret. A recently proposed rule ensemble method known as RuleFit expresses each base learner as a production rule and also provides a measure of how each input variable influences the response. For a binary response, the RuleFit method applies a squared-error ramp loss function, and the base learners are weighted by shrinkage regression using the lasso method; RuleFit is therefore not constructed within a logistic regression model. Moreover, highly correlated pairs of base learners may be excessively pruned by the lasso method. In this study, we address this excess-pruning problem by constructing RuleFit within a logistic regression framework and weighting the base learners by the elastic net. The effectiveness of our proposed RuleFit model is illustrated through a real data set. In small-scale simulations, this method demonstrated higher predictive performance than the original RuleFit model.
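The idea sketched in the abstract — represent base learners as rules derived from tree leaves, then weight them with an elastic-net-penalized logistic regression so that correlated rules are shrunk together rather than pruned away — can be illustrated with a minimal example. This is an illustrative sketch using scikit-learn on synthetic data, not the authors' implementation; the choice of forest, penalty strength `C`, and mixing weight `l1_ratio` are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic binary-response data standing in for a real application.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Step 1: grow shallow trees; each leaf corresponds to a conjunctive rule.
forest = RandomForestClassifier(n_estimators=50, max_depth=3, random_state=0)
forest.fit(X_tr, y_tr)

# Step 2: encode each observation by the leaves (rules) it satisfies.
# apply() gives the leaf index per tree; one-hot encoding turns each
# leaf membership into a 0/1 rule feature.
enc = OneHotEncoder(handle_unknown="ignore")
R_tr = enc.fit_transform(forest.apply(X_tr))
R_te = enc.transform(forest.apply(X_te))

# Step 3: weight the rules by elastic-net logistic regression, so highly
# correlated rules are shrunk jointly instead of being excessively
# pruned, as tends to happen under the lasso alone.
clf = LogisticRegression(penalty="elasticnet", solver="saga",
                         l1_ratio=0.5, C=0.1, max_iter=5000)
clf.fit(R_tr, y_tr)
acc = clf.score(R_te, y_te)
print("held-out accuracy:", round(acc, 2))
```

Setting `l1_ratio=1.0` recovers a pure lasso penalty (closer to the original RuleFit weighting), which makes the contrast between the two shrinkage schemes easy to explore on the same rule features.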
References
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Breiman, L., Friedman, J.H., Olshen, R. and Stone, C.J. (1984). Classification and regression trees. Wadsworth International Group.
Freund, Y. and Schapire, R. E. (1996). Experiments with a new boosting algorithm. Proceedings of the Thirteenth International Conference on Machine Learning, 148–156.
Friedman, J.H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29, 1189–1232.
Friedman, J. H., Hastie, T., Höfling, H., and Tibshirani, R. (2007). Pathwise coordinate optimization. Annals of Applied Statistics, 1(2), 302–332.
Friedman, J. H. and Popescu, B. E. (2003). Importance sampled learning ensembles. Technical Report, Statistics Department, Stanford University.
Friedman, J. H. and Popescu, B. E. (2004). Gradient directed regularization for linear regression and classification. Technical Report, Statistics Department, Stanford University.
Friedman, J. H. and Popescu, B. E. (2008). Predictive learning via rule ensembles. Annals of Applied Statistics, 2(3), 916–954.
Hastie, T., Tibshirani, R., and Friedman, J. H. (2009). The elements of statistical learning (2nd edition). New York: Springer-Verlag.
Li, L., Yan, K., Shimokawa, T., Oyama, I., and Kitamura, S. (2013). Investigation of factors affecting the evaluation of street scapes in Japan and China, International Journal of Affective Engineering, 12(1), 1–10.
Ridgeway, G. (2007). Generalized boosted models: a guide to the gbm package. http://cran.r-project.org/web/packages/gbm/vignettes/gbm.pdf
Sexton, J. and Laake, P. (2007). Boosted regression trees with errors in variables, Biometrics, 63(2), 586–592.
Shimokawa, T., Tsuji, M., and Goto, M. (2011). Modified rule ensemble method and its application for bioceutical data. Japanese Journal of Applied Statistics, 40(1), 19–40 (in Japanese).
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(2), 267–288.
Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, 67(2), 301–320.
Cite this article
Shimokawa, T., Li, L., Yan, K. et al. Modified Rule Ensemble Method for Binary Data and Its Applications. Behaviormetrika 41, 225–244 (2014). https://doi.org/10.2333/bhmk.41.225