Prediction of protein amidation sites by feature selection and analysis

  • Original Paper
Molecular Genetics and Genomics

Abstract

Carboxy-terminal α-amidation is a post-translational modification of proteins that is widespread in vertebrates and invertebrates. The α-amide group is required for full biological activity: by preventing ionization of the C-terminus, it may render a peptide more hydrophobic and thus better able to bind to other proteins. However, C-terminal amidation is very difficult to detect because experimental methods are often labor-intensive, time-consuming and expensive. In silico methods, which are highly efficient, may therefore complement experimental approaches. In this study, a computational method was developed to predict protein amidation sites by combining the maximum relevance minimum redundancy (mRMR) method with the incremental feature selection (IFS) method, both built on the nearest neighbor algorithm. From a total of 735 features, 41 optimal features were selected and used to construct the final predictor, which achieved an overall Matthews correlation coefficient (MCC) of 0.8308. Feature analysis showed that PSSM conservation scores and amino acid factors played the most important roles in α-amidation site prediction. Site-specific feature analysis showed that features derived from the amidation site itself and from adjacent sites were the most significant. The presented method could be used as an efficient tool for predicting amidated peptides in silico. The selected features could also shed light on the mechanisms of the amidation modification and provide guidelines for experimental validation.
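The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that a feature ranking is already available (for example from mRMR, which is not implemented here), and it evaluates each top-k feature subset with a 1-nearest-neighbor classifier scored by the Matthews correlation coefficient. The squared Euclidean distance, the leave-one-out evaluation, the function names and the toy data are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def nn_predict(train_x, train_y, query, feats):
    """Return the label of the nearest training sample, using only `feats`."""
    def dist(a, b):
        # Squared Euclidean distance (an assumed metric, see lead-in).
        return sum((a[f] - b[f]) ** 2 for f in feats)
    best = min(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return train_y[best]


def loo_mcc(x, y, feats):
    """Leave-one-out MCC of the 1-NN classifier restricted to `feats`."""
    tp = tn = fp = fn = 0
    for i in range(len(x)):
        pred = nn_predict(x[:i] + x[i + 1:], y[:i] + y[i + 1:], x[i], feats)
        if pred == 1 and y[i] == 1:
            tp += 1
        elif pred == 0 and y[i] == 0:
            tn += 1
        elif pred == 1:
            fp += 1
        else:
            fn += 1
    return mcc(tp, tn, fp, fn)


def incremental_feature_selection(x, y, ranked):
    """Score every top-k prefix of `ranked`; return (best_k, best_mcc, curve)."""
    curve = [loo_mcc(x, y, ranked[:k]) for k in range(1, len(ranked) + 1)]
    best_k = max(range(len(curve)), key=curve.__getitem__) + 1
    return best_k, curve[best_k - 1], curve


# Toy data: column 0 separates the classes, column 1 is noise.
x = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 1.2], [1.1, 8.8], [1.2, 5.1]]
y = [0, 0, 0, 1, 1, 1]
ranked = [0, 1]  # hypothetical ranking, e.g. produced by mRMR

best_k, best_mcc, curve = incremental_feature_selection(x, y, ranked)
print(best_k, best_mcc, curve)
```

On the toy data the noisy second column degrades the nearest-neighbor predictions, so the best subset is the single top-ranked feature. In the study, the same idea is applied to the 735 ranked features, and the subset with the best performance (41 features) defines the final predictor.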

Acknowledgments

This work was supported by grants from the National Basic Research Program of China (2011CB510102, 2011CB510101), the National Science Foundation of China (81171342, 81201148), the Innovation Program of Shanghai Municipal Education Commission (12ZZ087), the Independent Innovation Foundation of Tianjin University (60302069, 60302064), the National Research Foundation for the Doctoral Program of Higher Education of China (20120032120073) and the grant of “The First-class Discipline of Universities in Shanghai”.

Conflict of interest

The authors declare no competing interests.

Author information

Corresponding authors

Correspondence to Kaiyan Feng, Ning Zhang, Yudong Cai or Yixue Li.

Additional information

Communicated by S. Hohmann.

W. Cui, S. Niu and L. Zheng contributed equally to this work.

About this article

Cite this article

Cui, W., Niu, S., Zheng, L. et al. Prediction of protein amidation sites by feature selection and analysis. Mol Genet Genomics 288, 391–400 (2013). https://doi.org/10.1007/s00438-013-0760-x

  • DOI: https://doi.org/10.1007/s00438-013-0760-x
