VOGUE: A Novel Variable Order-Gap State Machine for Modeling Sequences

  • Bouchra Bouqata
  • Christopher D. Carothers
  • Boleslaw K. Szymanski
  • Mohammed J. Zaki
Conference paper

DOI: 10.1007/11871637_9

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4213)
Cite this paper as:
Bouqata B., Carothers C.D., Szymanski B.K., Zaki M.J. (2006) VOGUE: A Novel Variable Order-Gap State Machine for Modeling Sequences. In: Fürnkranz J., Scheffer T., Spiliopoulou M. (eds) Knowledge Discovery in Databases: PKDD 2006. PKDD 2006. Lecture Notes in Computer Science, vol 4213. Springer, Berlin, Heidelberg

Abstract

We present VOGUE, a new state machine that combines two separate techniques for modeling long-range dependencies in sequential data: data mining and data modeling. VOGUE relies on a novel Variable-Gap Sequence mining method (VGS) to mine frequent patterns with different lengths and different gaps between elements. It then uses these mined sequences to build the state machine. We applied VOGUE to the task of protein sequence classification on real data from the PROSITE protein families. We show that VOGUE yields significantly better scores than higher-order Hidden Markov Models. Moreover, we show that VOGUE's classification sensitivity outperforms that of HMMER, a state-of-the-art method for protein classification.
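
The abstract does not spell out how VGS works, so the following is only a minimal sketch of the general idea of mining frequent patterns with variable gaps, not the authors' algorithm. It assumes patterns of exactly two symbols, defines a gap as the number of symbols skipped between them, and counts support as the number of sequences that contain the pattern. The function name `mine_gapped_pairs`, the parameters `max_gap` and `min_support`, and the toy sequences are all hypothetical.

```python
from collections import Counter


def mine_gapped_pairs(sequences, max_gap, min_support):
    """Return frequent (first, second, gap) patterns with their support.

    Assumption: gap is the number of symbols skipped between the two
    elements, and support is the number of sequences that contain the
    pattern at least once. This is an illustrative simplification.
    """
    support = Counter()
    for seq in sequences:
        seen = set()  # patterns found in this sequence, counted once
        for i, first in enumerate(seq):
            # Look ahead, skipping 0..max_gap symbols.
            for gap in range(max_gap + 1):
                j = i + gap + 1
                if j >= len(seq):
                    break
                seen.add((first, seq[j], gap))
        support.update(seen)
    # Keep only patterns that meet the minimum support threshold.
    return {p: c for p, c in support.items() if c >= min_support}


# Hypothetical toy data, not taken from PROSITE.
toy_sequences = ["ACBDAHCBD", "ABCD", "AXB"]
patterns = mine_gapped_pairs(toy_sequences, max_gap=2, min_support=2)
```

In a model of the kind the abstract describes, mined patterns like these would then inform the states and transitions of the state machine; how VOGUE actually does this is not covered by the abstract.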

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Bouchra Bouqata (1)
  • Christopher D. Carothers (1)
  • Boleslaw K. Szymanski (1)
  • Mohammed J. Zaki (1)
  1. CS Department, Rensselaer Polytechnic Institute, Troy, USA
