Skip to main content

Locality Kernels for Protein Classification

  • Conference paper
  • 1090 Accesses

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 4645))

Abstract

We propose kernels that take advantage of local correlations in sequential data and present their application to the protein classification problem. Our locality kernels measure protein sequence similarities within a small window constructed around matching amino acids. The kernels incorporate positional information of the amino acids inside the window and allow a range of position dependent similarity evaluations. We use these kernels with regularized least-squares algorithm (RLS) for protein classification on the SCOP database. Our experiments demonstrate that the locality kernels perform significantly better than the spectrum and the mismatch kernels. When used together with RLS, performance of the locality kernels is comparable with some state-of-the-art methods of protein classification and remote homology detection.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Jaakkola, T., Diekhans, M., Haussler, D.: A discriminative framework for detecting remote protein homologies. Journal of Computational Biology 7, 95–114 (2000)

    Article  Google Scholar 

  2. Liao, L., Noble, W.S.: Combining pairwise sequence similarity and support vector machines for detecting remote protein evolutionary and structural relationships. Journal of Computational Biology 10, 857–868 (2003)

    Article  Google Scholar 

  3. Leslie, C., Eskin, E., Noble, W.S.: The spectrum kernel: A string kernel for svm protein classification. In: Pacific Symposium on Biocomputing, pp. 566–575 (2002)

    Google Scholar 

  4. Tsivtsivadze, E., Pahikkala, T., Boberg, J., Salakoski, T.: Locality-convolution kernel and its application to dependency parse ranking. In: Ali, M., Dapoigny, R. (eds.) IEA/AIE 2006. LNCS (LNAI), vol. 4031, pp. 610–618. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  5. Hubbard, T.J.P., Murzin, A.G., Brenner, S.E., Chothia, C.: Scop: a structural classification of proteins database. Nucleic Acids Research 25, 236–239 (1997)

    Article  Google Scholar 

  6. Leslie, C.S., Eskin, E., Cohen, A., Weston, J., Noble, W.S.: Mismatch string kernels for discriminative protein classification. Bioinformatics 20, 467–476 (2004)

    Article  Google Scholar 

  7. Leslie, C., Kuang, R.: Fast string kernels using inexact matching for protein sequences. J. Mach. Learn. Res. 5, 1435–1455 (2004)

    MathSciNet  Google Scholar 

  8. Poggio, T., Smale, S.: The mathematics of learning: Dealing with data. Amer. Math. Soc. Notice 50, 537–544 (2003)

    MATH  MathSciNet  Google Scholar 

  9. Shawe-Taylor, J., Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge University Press, New York, USA (2004)

    Google Scholar 

  10. Pahikkala, T., Pyysalo, S., Ginter, F., Boberg, J., Järvinen, J., Salakoski, T.: Kernels incorporating word positional information in natural language disambiguation tasks. In: Russell, I., Markov, Z. (eds.) Proceedings of the Eighteenth International Florida Artificial Intelligence Research Society Conference, Menlo Park, Ca., pp. 442–447. AAAI Press, Stanford, California, USA (2005)

    Google Scholar 

  11. Zien, A., Ratsch, G., Mika, S., Scholkopf, B., Lengauer, T., Muller, K.-R.: Engineering support vector machine kernels that recognize translation initiation sites. Bioinformatics 16, 799–807 (2000)

    Article  Google Scholar 

  12. Haussler, D.: Convolution kernels on discrete structures. Technical Report UCSC-CRL-99-10, UC Santa Cruz (1999)

    Google Scholar 

  13. Gribskov, M., Robinson, N.L.: Use of receiver operating characteristic (roc) analysis to evaluate sequence matching. Computers & Chemistry 20, 25–33 (1996)

    Article  Google Scholar 

  14. Fawcett, T.: Roc graphs: Notes and practical considerations for data mining researchers. Technical Report HPL-2003-4, HP Labs (2003)

    Google Scholar 

  15. Kuang, R., Ie, E., Wang, K., Wang, K., Siddiqi, M., Freund, Y., Leslie, C.: Profile-based string kernels for remote homology detection and motif extraction. J. Bioinform. Comput. Biol. 3, 527–550 (2005)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Raffaele Giancarlo Sridhar Hannenhalli

Rights and permissions

Reprints and permissions

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tsivtsivadze, E., Boberg, J., Salakoski, T. (2007). Locality Kernels for Protein Classification. In: Giancarlo, R., Hannenhalli, S. (eds) Algorithms in Bioinformatics. WABI 2007. Lecture Notes in Computer Science(), vol 4645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74126-8_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-74126-8_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74125-1

  • Online ISBN: 978-3-540-74126-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics