Finding Class C GPCR Subtype-Discriminating N-grams through Feature Selection

  • Caroline König
  • René Alquézar
  • Alfredo Vellido
  • Jesús Giraldo
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 294)


G protein-coupled receptors (GPCRs) are a large and heterogeneous superfamily of receptors that are key cell players for their role as extracellular signal transmitters. Class C GPCRs, in particular, are of great interest in pharmacology. The lack of knowledge about their full 3-D structure prompts the use of their primary amino acid sequences for the construction of robust classifiers, capable of discriminating their different subtypes. In this paper, we describe the use of feature selection techniques to build Support Vector Machine (SVM)-based classification models from selected receptor subsequences described as n-grams. We show that this approach to classification is useful for finding class C GPCR subtype-specific motifs.
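The idea of describing sequences as n-grams and then selecting the most discriminating ones can be illustrated with a minimal sketch. The abstract does not specify which feature selection technique is used, so the simple filter score below (difference in the fraction of sequences containing each n-gram) is an assumption chosen for brevity, and the SVM step is omitted. The function names and the toy sequences are hypothetical and are not real receptor data.

```python
from collections import Counter


def ngrams(sequence, n):
    """Count all overlapping n-grams in an amino acid sequence."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def rank_discriminating_ngrams(sequences, labels, n, target):
    """Rank n-grams by how specific they are to the `target` subtype.

    Score = fraction of target sequences containing the n-gram
            minus fraction of the remaining sequences containing it.
    This is an illustrative filter criterion, not the paper's method.
    Assumes at least one target and one non-target sequence.
    """
    in_target, in_rest = Counter(), Counter()
    n_target = sum(1 for lab in labels if lab == target)
    n_rest = len(labels) - n_target
    for seq, lab in zip(sequences, labels):
        present = set(ngrams(seq, n))  # presence/absence per sequence
        (in_target if lab == target else in_rest).update(present)
    scores = {
        g: in_target[g] / n_target - in_rest[g] / n_rest
        for g in set(in_target) | set(in_rest)
    }
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up toy sequences with two hypothetical subtypes "A" and "B".
toy_sequences = ["ACDKLM", "GGDKLA", "ACPQRS", "TTPQRA"]
toy_labels = ["A", "A", "B", "B"]
ranking = rank_discriminating_ngrams(toy_sequences, toy_labels, n=3, target="A")
```

In this toy data the 3-gram `DKL` occurs in every "A" sequence and in no "B" sequence, so it is ranked first. In a fuller pipeline, the top-ranked n-grams would form the feature vectors passed to a classifier such as an SVM.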


Keywords: G protein-coupled receptors; pharmaco-proteomics; feature selection; n-grams; support vector machines





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Caroline König (1, corresponding author)
  • René Alquézar (1, 2)
  • Alfredo Vellido (1, 3)
  • Jesús Giraldo (4)

  1. Departament de Llenguatges i Sistemes Informàtics, Univ. Politècnica de Catalunya, BarcelonaTech, Barcelona, Spain
  2. Institut de Robòtica i Informàtica Industrial, CSIC-UPC, Barcelona, Spain
  3. Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), Cerdanyola del Vallès, Spain
  4. Institut de Neurociències - Unitat de Bioestadística, Univ. Autònoma de Barcelona, Cerdanyola del Vallès, Spain
