Greedy hierarchical binary classifiers for multi-class classification of biological data

Begum, Salma; Aygun, Ramazan S.

doi:10.1007/s13721-014-0053-2

Original Article
Published: 15 February 2014

Volume 3, article number 53, (2014)
Network Modeling Analysis in Health Informatics and Bioinformatics

Salma Begum¹ & Ramazan S. Aygun¹

Abstract

Multi-class classification is an important and challenging problem in the analysis of biological data. Typical approaches use a single powerful classifier, such as a neural network, to assign each sample to one of many classes. Alternatively, binary classifiers can be combined in one-versus-one (OVO) or one-versus-all (OVA) schemes for multi-class classification. However, it is not clear whether OVO or OVA yields the better performance. In this paper, we propose a greedy method for building a hierarchical classifier in which each node corresponds to a binary classifier. An advantage of our greedy hierarchical classifier is that any type of classifier can be used at the nodes. We analyze the performance of the proposed technique using neural networks and naive Bayesian classifiers and compare our results with OVO, OVA, and exhaustive methods. In general, our greedy technique provided better and more robust accuracy than the other methods on biological data sets with 3 to 8 classes.
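To make the idea of a hierarchy of binary classifiers concrete, the following is a minimal sketch and not the authors' algorithm. The abstract does not state the greedy selection criterion or the tree shape, so this sketch assumes a simple chain: at each node the class whose one-versus-rest split is most accurate is separated first, and the procedure repeats on the remaining classes. The names `CentroidBinary`, `build_greedy_chain`, and `predict` are hypothetical, and the nearest-centroid node classifier is only a stand-in for the neural networks and naive Bayesian classifiers used in the paper.

```python
from statistics import mean


class CentroidBinary:
    """Stand-in binary node classifier (assumption): nearest class centroid."""

    def fit(self, X, y):
        # One centroid per label; labels here are True (target class) / False (rest).
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


def accuracy(clf, X, y):
    return sum(clf.predict_one(x) == t for x, t in zip(X, y)) / len(y)


def build_greedy_chain(X, y, make_clf=CentroidBinary):
    """Greedily order one-vs-rest nodes (assumed criterion: best split accuracy).

    Returns (nodes, last_class), where nodes is a list of (class_label, classifier).
    Accuracy is measured on the training data for brevity; a held-out
    validation set would be the safer choice in practice.
    """
    remaining = sorted(set(y))
    nodes = []
    while len(remaining) > 1:
        best = None
        for c in remaining:
            targets = [t == c for t in y]  # True: class c, False: all other classes
            clf = make_clf().fit(X, targets)
            score = accuracy(clf, X, targets)
            if best is None or score > best[0]:
                best = (score, c, clf)
        _, chosen, clf = best
        nodes.append((chosen, clf))
        # Remove the separated class and continue with the rest.
        keep = [i for i, t in enumerate(y) if t != chosen]
        X = [X[i] for i in keep]
        y = [y[i] for i in keep]
        remaining.remove(chosen)
    return nodes, remaining[0]


def predict(chain, x):
    """Walk the chain; the first node that claims the sample decides its class."""
    nodes, last_class = chain
    for label, clf in nodes:
        if clf.predict_one(x):
            return label
    return last_class
```

Under this chain assumption, K classes need K - 1 binary nodes, compared with K classifiers for OVA and K(K - 1)/2 for OVO. Passing a different factory as `make_clf` swaps the node classifier, which mirrors the abstract's point that any type of classifier can be used at the nodes.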

Acknowledgments

We would like to acknowledge Marc Pusey, Ph.D., of iXpressGenes, Inc. for providing the Protein Crystallization dataset and Madhav Sigdel for extracting features from this dataset.

Author information

Authors and Affiliations

Computer Science Department, University of Alabama in Huntsville, Huntsville, AL, USA
Salma Begum & Ramazan S. Aygun

Corresponding author

Correspondence to Salma Begum.

About this article

Cite this article

Begum, S., Aygun, R.S. Greedy hierarchical binary classifiers for multi-class classification of biological data. Netw Model Anal Health Inform Bioinforma 3, 53 (2014). https://doi.org/10.1007/s13721-014-0053-2

Received: 09 July 2013
Revised: 18 December 2013
Accepted: 29 January 2014
Published: 15 February 2014
DOI: https://doi.org/10.1007/s13721-014-0053-2
