
Ordinal Forests

  • Roman Hornung
Article

Abstract

The ordinal forest method is a random forest-based prediction method for ordinal response variables. Ordinal forests allow prediction using both low-dimensional and high-dimensional covariate data and can additionally be used to rank covariates with respect to their importance for prediction. An extensive comparison study reveals that ordinal forests tend to outperform competitors in terms of prediction performance. Moreover, the covariate importance measure currently used by ordinal forests discriminates influential covariates from noise covariates about as well as, or better than, the measures used by competitors. Several further important properties of the ordinal forest algorithm are studied in additional investigations. The rationale underlying ordinal forests, namely using optimized score values in place of the class values of the ordinal response variable, is in principle applicable to any regression method for continuous outcomes, not only to the random forests used in the ordinal forest method.
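
The idea of replacing ordinal class values with numeric scores can be illustrated with a minimal Python sketch. It is not the author's implementation: it assumes that each class corresponds to an interval of [0, 1], that the class score is the standard normal quantile of the interval midpoint, and that a continuous prediction is mapped back to the class whose interval contains it. The function names and the example borders are hypothetical, and the optimization of the scores and the regression forest itself are left out.

```python
from bisect import bisect_right
from statistics import NormalDist

# Standard normal distribution, used to move between (0, 1) and the real line.
_STD_NORMAL = NormalDist()


def scores_from_borders(borders):
    """Turn interval borders 0 = d_1 < ... < d_{J+1} = 1 into one score per class.

    Assumption of this sketch: the score of class j is the standard normal
    quantile of the midpoint of its interval [d_j, d_{j+1}].
    """
    if borders[0] != 0.0 or borders[-1] != 1.0:
        raise ValueError("borders must start at 0 and end at 1")
    if any(lo >= hi for lo, hi in zip(borders, borders[1:])):
        raise ValueError("borders must be strictly increasing")
    return [
        _STD_NORMAL.inv_cdf((lo + hi) / 2)
        for lo, hi in zip(borders, borders[1:])
    ]


def class_from_prediction(value, borders):
    """Map a continuous prediction back to a 1-based ordinal class index."""
    position = _STD_NORMAL.cdf(value)  # back to the (0, 1) scale
    inner = borders[1:-1]              # borders between adjacent classes
    return bisect_right(inner, position) + 1


# Three ordered classes with hypothetical interval borders.
borders = [0.0, 0.2, 0.7, 1.0]
scores = scores_from_borders(borders)

# A regression method for continuous outcomes would be trained on these
# scores in place of the class labels; its predictions are then mapped back.
predicted_class = class_from_prediction(0.0, borders)
```

Any regression method for continuous outcomes could be trained on such scores in place of the class labels, which is the sense in which the rationale extends beyond random forests.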

Keywords

Prediction; Ordinal response variable; Covariate importance ranking; Random forest

Notes

Acknowledgments

The author thanks Giuseppe Casalicchio for proofreading and comments and Jenny Lee for language corrections. This work was supported by the German Science Foundation (DFG-Einzelförderung BO3139/6-1 to Anne-Laure Boulesteix).

Supplementary material

357_2018_9302_MOESM1_ESM.pdf (PDF, 1.01 MB)
357_2018_9302_MOESM2_ESM.zip (ZIP, 1.58 MB)

Copyright information

© Classification Society of North America 2019

Authors and Affiliations

  1. Institute for Medical Information Processing, Biometry and Epidemiology, University of Munich, Munich, Germany
