# On the effectiveness of heuristics for learning nested dichotomies: an empirical analysis

## Abstract

In machine learning, so-called nested dichotomies are utilized as a reduction technique, i.e., to decompose a multi-class classification problem into a set of binary problems, which are solved using a simple binary classifier as a base learner. The performance of the (multi-class) classifier thus produced strongly depends on the structure of the decomposition. In this paper, we conduct an empirical study, in which we compare existing heuristics for selecting a suitable structure in the form of a nested dichotomy. Moreover, we propose two additional heuristics as natural completions. One of them is the Best-of-K heuristic, which picks the (presumably) best among *K* randomly generated nested dichotomies. Surprisingly, and in spite of its simplicity, it turns out to outperform the state of the art.
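The Best-of-K idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. The random sampling scheme (shuffle the classes, then cut at a random position) is an assumption and is not guaranteed to be uniform over all nested dichotomies. The helper names `random_dichotomy`, `leaves`, `depth`, and `best_of_k` are hypothetical, as is the `score` callback, which stands in for whatever quality estimate is used to judge a candidate (for example, validation accuracy of the resulting multi-class classifier).

```python
import random


def random_dichotomy(classes, rng):
    """Recursively split a collection of classes into a random binary tree.

    A leaf is a single class label; an inner node is a (left, right) tuple.
    Class labels are assumed not to be tuples themselves.
    """
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)
    cut = rng.randint(1, len(classes) - 1)  # keeps both sides non-empty
    return (random_dichotomy(classes[:cut], rng),
            random_dichotomy(classes[cut:], rng))


def leaves(tree):
    """Return the class labels at the leaves of a nested dichotomy."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def depth(tree):
    """Return the number of binary decisions on the longest root-to-leaf path."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[0]), depth(tree[1]))


def best_of_k(classes, k, score, seed=0):
    """Generate k random nested dichotomies and keep the highest-scoring one."""
    rng = random.Random(seed)
    candidates = [random_dichotomy(classes, rng) for _ in range(k)]
    return max(candidates, key=score)


if __name__ == "__main__":
    # Toy score (an assumption for illustration only): prefer shallow trees.
    # In practice, score would train the binary base learners and evaluate
    # the resulting classifier on held-out data.
    tree = best_of_k(["a", "b", "c", "d", "e"], k=10, score=lambda t: -depth(t))
    print(tree)
```

Each inner node of the returned tree corresponds to one binary problem: separate the classes under its left child from those under its right child. The toy score only demonstrates the selection step; replacing it with a data-driven estimate is what makes the heuristic meaningful.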

## Keywords

Nested dichotomies, Multi-class classification, Decomposition method

## Notes

### Acknowledgements

This work has been conducted as part of the Collaborative Research Center “On-the-Fly Computing” (SFB 901) at Paderborn University, which is supported by the German Research Foundation (DFG).
