
Greedy hierarchical binary classifiers for multi-class classification of biological data


Multi-class classification is an important and challenging problem in the classification of biological data. A typical approach uses a single powerful classifier, such as a neural network, to assign each sample to one of many classes. Alternatively, binary classifiers can be combined in one-versus-one (OVO) or one-versus-all (OVA) schemes. However, it is not clear whether OVO or OVA yields good performance. In this paper, we propose a greedy method for building a hierarchical classifier in which each node corresponds to a binary classifier. An advantage of this greedy hierarchical classifier is that any type of classifier can be used at the nodes. We analyze the performance of the proposed technique using neural networks and naive Bayesian classifiers, and compare our results with OVO, OVA, and exhaustive methods. In general, our greedy technique provided better and more robust accuracy than the other methods on biological data sets with 3 to 8 classes.
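
As a rough illustration of the idea, the sketch below builds a hierarchy of binary classifiers greedily. The abstract does not state how the split at each node is chosen, so the selection rule used here (peel off the single class that is best separated from the remaining classes, judged by training accuracy) is an assumption. The nearest-centroid node classifier, the names `CentroidBinary`, `build`, and `predict`, and the toy data are likewise hypothetical stand-ins, not the authors' implementation.

```python
def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidBinary:
    """Stand-in binary classifier (nearest centroid).

    Assumption: any binary model (neural network, naive Bayes, ...)
    could be placed at a node instead.
    """

    def fit(self, X, y):
        # y holds booleans: True marks the class being separated out.
        self.pos = centroid([x for x, t in zip(X, y) if t])
        self.neg = centroid([x for x, t in zip(X, y) if not t])
        return self

    def predict(self, x):
        return sq_dist(x, self.pos) <= sq_dist(x, self.neg)


def build(X, y, classes):
    """Greedily build a chain of binary nodes (assumed selection rule).

    Returns either a class label (leaf) or a tuple
    (classifier, separated_class, subtree_for_remaining_classes).
    """
    if len(classes) == 1:
        return classes[0]
    best = None
    for c in classes:  # candidate split: {c} versus all other classes
        target = [label == c for label in y]
        clf = CentroidBinary().fit(X, target)
        hits = sum(clf.predict(x) == t for x, t in zip(X, target))
        acc = hits / len(X)
        if best is None or acc > best[0]:
            best = (acc, c, clf)
    _, c, clf = best
    # Recurse on the samples and classes that were not separated out.
    rest_X = [x for x, label in zip(X, y) if label != c]
    rest_y = [label for label in y if label != c]
    rest_classes = [k for k in classes if k != c]
    return (clf, c, build(rest_X, rest_y, rest_classes))


def predict(node, x):
    """Walk down the hierarchy until a class label is reached."""
    while isinstance(node, tuple):
        clf, c, child = node
        if clf.predict(x):
            return c
        node = child
    return node


# Toy data (hypothetical): three well-separated 2-D classes.
X = [[0, 0], [0, 1], [1, 0],
     [10, 10], [10, 11], [11, 10],
     [0, 10], [1, 10], [0, 11]]
y = ["a"] * 3 + ["b"] * 3 + ["c"] * 3

tree = build(X, y, ["a", "b", "c"])
print(predict(tree, [9, 9]), predict(tree, [1, 9]))
```

For k classes, a hierarchy of this shape retains k - 1 node classifiers, compared with k classifiers for OVA and k(k - 1)/2 for OVO.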





We would like to acknowledge Marc Pusey, Ph.D., of iXpressGenes, Inc. for providing the Protein Crystallization dataset and Madhav Sigdel for extracting features from this dataset.

Author information


Corresponding author

Correspondence to Salma Begum.

About this article

Cite this article

Begum, S., Aygun, R.S. Greedy hierarchical binary classifiers for multi-class classification of biological data. Netw Model Anal Health Inform Bioinforma 3, 53 (2014).


  • Hierarchical binary classifiers
  • Neural networks
  • Error-correcting output codes
  • Biological data