Protein acetylation sites with complex-valued polynomial model

  • Research Article
  • Frontiers of Computer Science

Abstract

Protein acetylation is the addition of acetyl groups (CH3CO-) to lysine residues on protein chains. As one of the most common protein post-translational modifications, lysine acetylation plays an important role in different organisms. In this study, we developed a human-specific prediction method that uses a cascade classifier of complex-valued polynomial models (CVPM), combined with sequence and structural feature descriptors, to address the imbalance between positive and negative samples. Complex-valued gene expression programming and differential evolution are used to search for the optimal CVPM. We also carried out a systematic and comprehensive analysis of the acetylation data and the prediction results. The proposed method achieves 79.15% in Sp, 78.17% in Sn, 78.66% in ACC, 78.76% in F1, and 0.5733 in MCC, outperforming other state-of-the-art methods.
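
For readers unfamiliar with the reported measures, the following is a minimal sketch of how specificity (Sp), sensitivity (Sn), accuracy (ACC), F1 and the Matthews correlation coefficient (MCC) are conventionally computed from a binary confusion matrix. It assumes the standard textbook definitions; the function name `binary_metrics` and the counts in the usage example are hypothetical and are not taken from the article's data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return Sn, Sp, ACC, F1 and MCC for binary confusion-matrix counts.

    tp/fn count positive (acetylated) sites predicted correctly/incorrectly;
    tn/fp count negative (non-acetylated) sites predicted correctly/incorrectly.
    Ratios with an empty denominator are reported as 0.0 (a convention chosen
    for this sketch).
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sn = ratio(tp, tp + fn)                 # sensitivity (recall)
    sp = ratio(tn, tn + fp)                 # specificity
    acc = ratio(tp + tn, tp + fp + tn + fn)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sn, precision + sn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {"Sn": sn, "Sp": sp, "ACC": acc, "F1": f1, "MCC": mcc}


# Hypothetical counts for illustration only (not the article's results).
metrics = binary_metrics(tp=80, fp=20, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike ACC, MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside Sn and Sp when positive and negative samples are imbalanced.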

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61902337), the Xuzhou Science and Technology Plan Project (KC21047), the Jiangsu Provincial Natural Science Foundation (No. SBK2019040953), the Natural Science Fund for Colleges and Universities in Jiangsu Province (No. 19KJB520016), the Young Talents of Science and Technology in Jiangsu, the Key Research Program of the Science Foundation of Shandong Province (ZR2020KE001), the talent project of “Qingtan Scholar” of Zaozhuang University, the PhD research startup foundation of Zaozhuang University (No. 2014BS13), and the Zaozhuang University Foundation (No. 2015YY02).

Author information

Corresponding author

Correspondence to Bin Yang.

Additional information

Wenzheng Bao received his PhD degree in Computer Science from Tongji University, China in 2018. He is an associate professor and master’s supervisor at the School of Information Engineering, Xuzhou University of Technology, China. His research interests include bioinformatics and machine learning.

Bin Yang received his PhD degree in Computer Science from Shandong University, China in 2014. He is a professor and master’s supervisor at the School of Information Science and Engineering, Zaozhuang University, China. His research interests include bioinformatics and machine learning.

About this article

Cite this article

Bao, W., Yang, B. Protein acetylation sites with complex-valued polynomial model. Front. Comput. Sci. 18, 183904 (2024). https://doi.org/10.1007/s11704-023-2640-9

  • DOI: https://doi.org/10.1007/s11704-023-2640-9