Functional Classification of G-Protein Coupled Receptors, Based on Their Specific Ligand Coupling Patterns

  • Burcu Bakir
  • Osman Ugur Sezerman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3907)

Abstract

Functional identification of G-Protein Coupled Receptors (GPCRs) is one of the current focus areas of pharmaceutical research. Although thousands of GPCR sequences are known, many of them remain orphan sequences (the activating ligand is unknown). Therefore, classification methods for automated characterization of orphan GPCRs are imperative. In this study, for predicting Level 2 subfamilies of Amine GPCRs, a novel method for obtaining fixed-length feature vectors, based on the existence of activating-ligand-specific patterns, has been developed and utilized for a Support Vector Machine (SVM)-based classification. Exploiting the fact that there is a non-promiscuous relationship between the specific binding of GPCRs to their ligands and their functional classification, our method classifies Level 2 subfamilies of Amine GPCRs with a high predictive accuracy of 97.02% in a ten-fold cross validation test. The presented machine learning approach bridges the gulf between the excess amount of GPCR sequence data and their poor functional characterization.
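The idea of a fixed-length feature vector built from pattern existence can be illustrated with a minimal sketch. It assumes each pattern can be written as a regular expression and that a feature is 1 when the pattern occurs anywhere in the sequence and 0 otherwise. The patterns, the toy sequences, and the name `pattern_features` are hypothetical and chosen only for illustration; the abstract does not list the actual ligand-specific patterns or the exact encoding used in the study.

```python
import re

# Hypothetical patterns for illustration only; the actual
# ligand-specific patterns are not given in the abstract.
PATTERNS = [
    r"D.{2}C",    # 'D', any two residues, then 'C'
    r"W.P[FY]",   # 'W', any residue, 'P', then 'F' or 'Y'
    r"N.{3,4}Y",  # 'N', three or four residues, then 'Y'
]


def pattern_features(sequence, patterns=PATTERNS):
    """Return a 0/1 vector with one entry per pattern.

    The vector length equals the number of patterns, so it does not
    depend on the length of the sequence.
    """
    return [1 if re.search(p, sequence) else 0 for p in patterns]


# Toy sequences (made up, not real receptors).
toy_sequences = {
    "seq_a": "MDAACLLWAPFG",
    "seq_b": "MNKLAYGGG",
    "seq_c": "MGGGGGGGG",
}

vectors = {name: pattern_features(seq) for name, seq in toy_sequences.items()}
for name, vec in vectors.items():
    print(name, vec)
```

Because every sequence maps to a vector of the same length regardless of how long the sequence is, vectors of this kind can be handed directly to a standard SVM implementation together with subfamily labels.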

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Burcu Bakir (1)
  • Osman Ugur Sezerman (2)
  1. School of Biology, Georgia Institute of Technology, Atlanta, USA
  2. Sabanci University, Istanbul, Turkey
