Abstract
Apoptosis proteins play key roles in programmed cell death. Knowledge of the subcellular localization of an apoptosis protein helps in understanding the mechanism of programmed cell death, characterizing the function of the protein, screening candidates in drug design, and selecting proteins for relevant studies. The information required to determine the subcellular localization of a protein is also thought to reside in its amino acid sequence. In this work, a new biological feature, the class pattern frequency of physicochemical descriptors, was used together with amino acid composition, a protein similarity measure, CTD (composition, transition, and distribution) of physicochemical descriptors, and sequence similarity to predict the subcellular localization of apoptosis proteins. AdaBoost with Random Forest as the weak learner was designed for the five modules, and the final prediction is made by a weighted voting system. On a benchmark dataset of 317 apoptosis proteins, the system achieved accuracies of 100.0, 92.4, and 90.1 % in the self-consistency test, jackknife test, and tenfold cross-validation test, respectively, which is 0.9 % higher than that of other existing methods. In addition, prediction on the independent datasets (N151 and ZW98) gave accuracies of 90.7 and 87.7 %, respectively. These results show that a combined feature vector together with the AdaBoost algorithm is effective for predicting the subcellular localization of apoptosis proteins. A user-friendly web interface, “APSLAP”, has been constructed and is freely available at http://apslap.bicpu.edu.in; it is anticipated that this tool will play a significant role in reliably determining the specific role of apoptosis proteins.
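As a rough illustration of two ideas named in the abstract, the sketch below computes an amino acid composition vector and combines per-module predictions by weighted voting. It is a minimal sketch under stated assumptions, not the APSLAP implementation: the function names, the location labels, and the module weights are all hypothetical, and the abstract does not say how the published system derives its voting weights.

```python
from collections import defaultdict

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fractional frequency of each standard residue (20 values)."""
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)  # non-standard symbols are ignored
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def weighted_vote(module_predictions, module_weights):
    """Pick the label whose supporting modules have the largest summed weight.

    Assumes one predicted label and one weight per module (hypothetical setup).
    """
    if len(module_predictions) != len(module_weights):
        raise ValueError("need exactly one weight per module prediction")
    scores = defaultdict(float)
    for label, weight in zip(module_predictions, module_weights):
        scores[label] += weight
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy sequence: A occurs twice, C and D once each.
    composition = amino_acid_composition("AACD")
    print(composition[:3])

    # Hypothetical labels from five feature modules with made-up weights.
    labels = ["cytoplasmic", "nuclear", "cytoplasmic", "membrane", "nuclear"]
    weights = [0.9, 0.8, 0.3, 0.2, 0.7]
    print(weighted_vote(labels, weights))
```

In this toy run the two "nuclear" votes carry a combined weight of 1.5 against 1.2 for "cytoplasmic", so the weighted vote can differ from the label chosen by the single most confident module.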
Acknowledgments
Saravanan Vijayakumar is supported by a DBT-BINC junior research fellowship. The authors thank Borhan Kazimipour of Shiraz University for suggestions on algorithm design. The authors also thank Dr. Archana Pan (Centre for Bioinformatics, Pondicherry University) and Dr. Sivasathya (Department of Computer Science, Pondicherry University) for their valuable suggestions, and P. Elavarasan, E. Malanco, and S. Manikandan of the Centre for Bioinformatics, Pondicherry University, for their technical support in hosting the tool on the web server.
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Saravanan, V., Lakshmi, P.T.V. APSLAP: An Adaptive Boosting Technique for Predicting Subcellular Localization of Apoptosis Protein. Acta Biotheor 61, 481–497 (2013). https://doi.org/10.1007/s10441-013-9197-1
DOI: https://doi.org/10.1007/s10441-013-9197-1