g-MARS: Protein Classification Using Gapped Markov Chains and Support Vector Machines

  • Xiaonan Ji
  • James Bailey
  • Kotagiri Ramamohanarao
Conference paper

DOI: 10.1007/978-3-540-88436-1_15

Part of the Lecture Notes in Computer Science book series (LNCS, volume 5265)
Cite this paper as:
Ji X., Bailey J., Ramamohanarao K. (2008) g-MARS: Protein Classification Using Gapped Markov Chains and Support Vector Machines. In: Chetty M., Ngom A., Ahmad S. (eds) Pattern Recognition in Bioinformatics. PRIB 2008. Lecture Notes in Computer Science, vol 5265. Springer, Berlin, Heidelberg

Abstract

Classifying protein sequences has important applications in areas such as disease diagnosis, treatment development and drug design. In this paper we present a highly accurate classifier called the g-MARS (gapped Markov Chain with Support Vector Machine) protein classifier. It models the structure of a protein sequence by measuring the transition probabilities between pairs of amino acids. This results in a Markov chain style model for each protein sequence. Then, to capture the similarity among non-exactly matching protein sequences, we show that this model can be generalized to incorporate gaps in the Markov chain. We perform a thorough experimental study and compare g-MARS to several other state-of-the-art protein classifiers. Overall, we demonstrate that g-MARS has superior accuracy and operates efficiently on a diverse range of protein families.
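The abstract describes the feature idea only at a high level, so the following is a minimal sketch of what gapped transition probabilities between amino-acid pairs might look like, not the authors' implementation. The function names (`gapped_transition_features`, `feature_vector`), the `max_gap` parameter, the per-residue normalisation, and the choice to concatenate one 20×20 table per gap size are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def gapped_transition_features(sequence, gap=0):
    """Estimate P(b | a) for residue b occurring gap+1 positions after residue a.

    gap=0 gives an ordinary first-order Markov chain; gap>0 skips `gap`
    intermediate residues (an assumed reading of "gapped" in the abstract).
    Returns a flat list of 400 probabilities ordered by (a, b).
    """
    step = gap + 1
    pair_counts = Counter()
    source_counts = Counter()
    for a, b in zip(sequence, sequence[step:]):
        # Ignore non-standard symbols such as X or B.
        if a in AMINO_ACIDS and b in AMINO_ACIDS:
            pair_counts[(a, b)] += 1
            source_counts[a] += 1
    return [
        pair_counts[(a, b)] / source_counts[a] if source_counts[a] else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


def feature_vector(sequence, max_gap=2):
    """Concatenate the transition tables for gaps 0..max_gap (hypothetical layout)."""
    vec = []
    for gap in range(max_gap + 1):
        vec.extend(gapped_transition_features(sequence, gap))
    return vec
```

Under these assumptions, each protein becomes a fixed-length numeric vector regardless of sequence length, which is the kind of input a support vector machine can be trained on. The kernel and training setup used in the paper are not given in the abstract and are not shown here.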

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Xiaonan Ji (1)
  • James Bailey (1)
  • Kotagiri Ramamohanarao (1)
  1. NICTA Victoria Laboratory, Department of Computer Science and Software Engineering, University of Melbourne, Australia
