A comparison of random forest based algorithms: random credal random forest versus oblique random forest

Abstract

Random forest (RF) is an ensemble learning method that is widely regarded as a reference because of its excellent performance. Several improvements to RF have been published. One line of improvement uses multivariate decision trees with a local optimization process (oblique RF). Another provides additional diversity for the univariate decision trees through the use of imprecise probabilities (random credal random forest, RCRF). The aim of this work is to compare these two improvements of the RF algorithm experimentally. The results show that RF with additional diversity and imprecise probabilities achieves better results than RF with multivariate decision trees.

Notes

  1. Normally, the value used for m is the integer part of \(\log_2(\text{number of features}) + 1\).
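
As a small illustration of this note, the sketch below computes m for a few feature counts. It assumes that m denotes the number of features randomly considered at each split, as is usual for random forests; the function name `features_per_split` is hypothetical and not taken from the article.

```python
import math


def features_per_split(n_features: int) -> int:
    """Integer part of log2(n_features) + 1, as described in the note."""
    if n_features < 1:
        raise ValueError("n_features must be a positive integer")
    # Adding the integer 1 after truncation gives the same value as
    # truncating log2(n_features) + 1, so the note is unambiguous.
    return int(math.log2(n_features)) + 1


if __name__ == "__main__":
    for n in (4, 10, 16, 100):
        print(f"{n} features -> m = {features_per_split(n)}")
```

For example, a data set with 16 features gives m = 5 under this rule.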

Acknowledgements

This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by “Fondo Europeo de Desarrollo Regional” (FEDER) under Project TEC2015-69496-R.

Author information

Corresponding author

Correspondence to Carlos J. Mantas.

Ethics declarations

Conflict of interest

Carlos J. Mantas, Javier G. Castellano, Serafín Moral-García and Joaquín Abellán declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by V. Loia.

Appendix A: Tables about accuracy results

Tables 6, 7, 8, 9 and 10 show the accuracy results obtained by the ensemble methods when they classify data sets with different levels of added noise.
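
The exact noise procedure is not described in this appendix. As a rough sketch only, the snippet below shows one common way to add label noise at a given percentage: a random subset of instances has its class label replaced by a different class. The function name `add_label_noise`, the seed handling, and the choice to always switch to a different class are assumptions made for illustration.

```python
import random


def add_label_noise(labels, noise_pct, seed=0):
    """Return a copy of `labels` in which noise_pct percent of the entries
    have been replaced by a different, randomly chosen class label.

    Assumption: this is an illustrative scheme, not necessarily the one
    used to produce the tables.
    """
    classes = sorted(set(labels))
    if len(classes) < 2:
        raise ValueError("at least two distinct classes are required")
    rng = random.Random(seed)
    noisy = list(labels)
    n_changed = round(len(labels) * noise_pct / 100)
    # Sample distinct positions so that exactly n_changed labels change.
    for i in rng.sample(range(len(labels)), n_changed):
        alternatives = [c for c in classes if c != noisy[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


if __name__ == "__main__":
    clean = ["a", "b", "c"] * 20  # toy labels, 60 instances
    for pct in (5, 10, 20, 30):
        noisy = add_label_noise(clean, pct)
        changed = sum(x != y for x, y in zip(clean, noisy))
        print(f"{pct}% noise: {changed} of {len(clean)} labels changed")
```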

Tables 11, 12, 13, 14 and 15 show the p values of the Nemenyi test for the pairwise comparisons when the methods are applied to data sets with different percentages of added noise. In all cases, Nemenyi's procedure rejects the hypotheses that have a corresponding p value \(\le 0.01\). When there is a significant difference, the best algorithm is shown in bold.
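
To make the pairwise comparison concrete, the following sketch computes average ranks over data sets and, for each pair of algorithms, an unadjusted p value from the normal approximation of the rank difference, which is then compared against the 0.01 threshold stated above. This is one common formulation of rank-based pairwise tests and is an assumption here; the article's exact computation is not given in this appendix. The accuracy values are toy numbers, not results from the tables, and the function names are hypothetical.

```python
import math
from itertools import combinations


def average_ranks(scores):
    """scores[d][a] is the accuracy of algorithm a on data set d.
    Rank 1 is the best; tied algorithms share the mean of their ranks."""
    k = len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda a: -row[a])
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
            for t in range(i, j + 1):
                totals[order[t]] += mean_rank
            i = j + 1
    return [t / len(scores) for t in totals]


def pairwise_p_values(scores):
    """Unadjusted two-sided p values for every pair of algorithms."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    std_err = math.sqrt(k * (k + 1) / (6 * n))
    p_values = {}
    for i, j in combinations(range(k), 2):
        z = abs(ranks[i] - ranks[j]) / std_err
        p_values[(i, j)] = math.erfc(z / math.sqrt(2))  # 2 * (1 - Phi(z))
    return p_values


if __name__ == "__main__":
    toy = [  # hypothetical accuracies: 4 data sets x 3 algorithms
        [0.90, 0.85, 0.80],
        [0.88, 0.86, 0.70],
        [0.95, 0.90, 0.91],
        [0.80, 0.80, 0.75],
    ]
    print("average ranks:", average_ranks(toy))
    for pair, p in pairwise_p_values(toy).items():
        verdict = "reject" if p <= 0.01 else "do not reject"
        print(pair, f"p = {p:.4f}", verdict)
```

With so few data sets the toy example is only meant to show the mechanics; meaningful comparisons need many more data sets.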

Table 6 Accuracy results of the ensemble methods when they are used on data sets without added noise
Table 7 Accuracy results of the ensemble methods when they are used on data sets with a percentage of added label noise equal to 5%
Table 8 Accuracy results of the ensemble methods when they are used on data sets with a percentage of added label noise equal to 10%
Table 9 Accuracy results of the ensemble methods when they are used on data sets with a percentage of added label noise equal to 20%
Table 10 Accuracy results of the ensemble methods when they are used on data sets with a percentage of added label noise equal to 30%
Table 11 p Values of the Nemenyi test about the accuracy on data sets without added noise
Table 12 p Values of the Nemenyi test about the accuracy on data sets with \(5\%\) of added noise
Table 13 p Values of the Nemenyi test about the accuracy on data sets with \(10\%\) of added noise
Table 14 p Values of the Nemenyi test about the accuracy on data sets with \(20\%\) of added noise
Table 15 p Values of the Nemenyi test about the accuracy on data sets with \(30\%\) of added noise

About this article

Cite this article

Mantas, C.J., Castellano, J.G., Moral-García, S. et al. A comparison of random forest based algorithms: random credal random forest versus oblique random forest. Soft Comput 23, 10739–10754 (2019). https://doi.org/10.1007/s00500-018-3628-5

  • DOI: https://doi.org/10.1007/s00500-018-3628-5

Keywords

  • Classification
  • Ensemble schemes
  • Random forest
  • Imprecise probabilities
  • Credal sets