Featureless Pattern Recognition in an Imaginary Hilbert Space and Its Application to Protein Fold Classification

  • Vadim Mottl
  • Sergey Dvoenko
  • Oleg Seredin
  • Casimir Kulikowski
  • Ilya Muchnik
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2123)

Abstract

The featureless pattern recognition methodology, based on measuring numerical characteristics of similarity between pairs of entities, is applied to the problem of protein fold classification. In computational biology, a commonly adopted way of measuring the likelihood that two proteins have the same evolutionary origin is to calculate the so-called alignment score between two amino acid sequences, which shows properties of an inner product rather than those of a similarity measure. Therefore, in determining the membership of a protein given by its amino acid sequence (primary structure) in one of the preset fold classes (spatial structure), we treat the set of all feasible amino acid sequences as a subset of isolated points in an imaginary space in which the linear operations and inner product are defined in an arbitrary unknown manner, but without any conjecture on the dimension, i.e. as a Hilbert space.
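
The idea of classifying sequences using only pairwise inner-product-like scores can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract does not specify the classifier, so a plain kernel perceptron is used as a stand-in, and `kmer_score` is a hypothetical toy replacement for a real alignment score (it is the inner product of k-mer count vectors, chosen only because it is easy to compute). The toy sequences and labels are invented.

```python
from collections import Counter


def kmer_score(a, b, k=2):
    """Hypothetical stand-in for an alignment score.

    Returns the inner product of the k-mer count vectors of a and b,
    so it is symmetric and behaves like an inner product.
    """
    ca = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    cb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    return sum(ca[m] * cb[m] for m in ca)


def train_kernel_perceptron(seqs, labels, score, epochs=20):
    """Learn dual coefficients from pairwise scores only (no feature vectors)."""
    n = len(seqs)
    # Matrix of pairwise scores, treated as a Gram matrix of inner products.
    gram = [[score(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    alpha = [0] * n
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            f = sum(alpha[j] * labels[j] * gram[j][i] for j in range(n))
            if labels[i] * f <= 0:  # misclassified or on the boundary
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alpha


def decision_value(x, seqs, labels, alpha, score):
    """Signed decision value for a new sequence; the sign is the predicted class."""
    return sum(a * y * score(s, x) for a, y, s in zip(alpha, labels, seqs))


# Invented toy data: two "fold classes" labeled +1 and -1.
train_seqs = ["AAAAAA", "AAAAGA", "CCCCCC", "CCCTCC"]
train_labels = [1, 1, -1, -1]
alpha = train_kernel_perceptron(train_seqs, train_labels, kmer_score)

print(decision_value("AAGAAA", train_seqs, train_labels, alpha, kmer_score))
print(decision_value("CCTCCC", train_seqs, train_labels, alpha, kmer_score))
```

The classifier never sees coordinates of a sequence, only its scores against the training sequences, which is the sense in which the approach is featureless. A real alignment score would replace `kmer_score` without changing the rest of the sketch.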

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Vadim Mottl (1)
  • Sergey Dvoenko (1)
  • Oleg Seredin (1)
  • Casimir Kulikowski (2)
  • Ilya Muchnik (2)
  1. Tula State University, Tula, Russia
  2. Rutgers University, Piscataway, USA
