APSLAP: An Adaptive Boosting Technique for Predicting Subcellular Localization of Apoptosis Protein
Apoptosis proteins play key roles in the mechanism of programmed cell death. Knowledge of their subcellular localization helps in understanding that mechanism, characterizing protein function, screening candidates in drug design, and selecting proteins for relevant studies. It has also been proposed that the information required to determine the subcellular localization of a protein resides in its amino acid sequence. In this work, a new biological feature, the class pattern frequency of physicochemical descriptors, was used together with the amino acid composition, a protein similarity measure, the CTD (composition, transition, and distribution) of physicochemical descriptors, and sequence similarity to predict the subcellular localization of apoptosis proteins. AdaBoost with Random Forest as the weak learner was designed for the five modules, and the final prediction is made by a weighted voting system. A benchmark dataset of 317 apoptosis proteins was subjected to prediction by our system, and the accuracy was found to be 100.0 %, 92.4 %, and 90.1 % for the self-consistency test, jack-knife test, and tenfold cross-validation test, respectively, which is 0.9 % higher than that of other existing methods. Besides this, prediction on the independent datasets (N151 and ZW98) resulted in accuracies of 90.7 % and 87.7 %, respectively. These results show that a protein representation based on a combined feature vector, together with the AdaBoost algorithm, is effective for predicting the subcellular localization of apoptosis proteins. The user-friendly web interface “APSLAP” has been constructed and is freely available at http://apslap.bicpu.edu.in; it is anticipated that this tool will play a significant role in reliably determining the specific role of apoptosis proteins.
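Two of the ingredients named above, the amino acid composition feature and the weighted vote across modules, can be illustrated with a minimal Python sketch. This is not the APSLAP implementation: the function names, module names, class labels, and module weights below are hypothetical placeholders, and the sketch assumes each module has already produced a single class label for the query protein.

```python
from collections import Counter, defaultdict

# The 20 standard amino acids, in a fixed order so the feature vector is stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector with the fraction of each standard residue."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def weighted_vote(module_predictions, module_weights):
    """Combine one predicted label per module; each module votes with its weight."""
    scores = defaultdict(float)
    for module, label in module_predictions.items():
        scores[label] += module_weights[module]
    # The label with the highest total weight wins; ties fall back to
    # alphabetical order so the result is deterministic.
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical labels and weights for five feature modules (illustrative only).
predictions = {
    "composition": "cytoplasmic",
    "ctd": "nuclear",
    "class_pattern": "cytoplasmic",
    "protein_similarity": "membrane",
    "sequence_similarity": "nuclear",
}
weights = {
    "composition": 0.9,
    "ctd": 0.8,
    "class_pattern": 0.7,
    "protein_similarity": 0.6,
    "sequence_similarity": 0.5,
}

composition = amino_acid_composition("AACD")
final_label = weighted_vote(predictions, weights)
```

In this toy case the "cytoplasmic" label collects the largest total weight (0.9 + 0.7) and is returned as the final prediction. How the actual module weights are derived is not stated here, so the values above should be read purely as an assumption.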
Keywords: AdaBoost, Apoptosis protein, Jack-knife test, Physio-chemical parameters, Random forest, Subcellular localization, Web-server