NcPred for Accurate Nuclear Protein Prediction Using n-mer Statistics with Various Classification Algorithms
Prediction of nuclear proteins is one of the major challenges in genome annotation. A method, NcPred, is described for predicting nuclear proteins with higher accuracy by exploiting n-mer statistics with different classification algorithms, namely Alternating Decision (AD) Tree, Best First (BF) Tree, Random Tree and Adaptive (Ada) Boost. On the BaCello dataset, NcPred improves accuracy by about 20% with Random Tree and sensitivity by about 10% with Ada Boost for animal proteins compared to existing techniques. It also increases the accuracy of fungal protein prediction by 20% and recall by 4% with AD Tree. For human proteins, accuracy is improved by about 25% and sensitivity by about 10% with BF Tree. Performance analysis of NcPred demonstrates its suitability relative to contemporary in-silico nuclear protein classification methods.
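As a rough illustration of what n-mer statistics can look like as classifier input, the sketch below turns a protein sequence into a fixed-length vector of relative n-mer frequencies. The abstract does not state the value of n, the normalization, or how non-standard residues are handled, so the function names (`nmer_vocabulary`, `nmer_frequencies`), the default n = 2, and the choice to ignore n-mers containing non-standard residues are all assumptions made for this example, not the NcPred implementation.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def nmer_vocabulary(n=2):
    """All possible n-mers over the standard alphabet, in a fixed order."""
    return ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]


def nmer_frequencies(sequence, n=2):
    """Relative frequency of each overlapping n-mer in `sequence`.

    Assumption for this sketch: n-mers containing a non-standard
    residue (e.g. 'X') are ignored, and frequencies are normalized
    by the number of n-mers that remain.
    """
    sequence = sequence.upper()
    counts = Counter(
        sequence[i:i + n] for i in range(len(sequence) - n + 1)
    )
    vocab = nmer_vocabulary(n)
    total = sum(counts[k] for k in vocab)
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[k] / total for k in vocab]


# Example: a hypothetical short sequence, dipeptide (n = 2) features.
features = nmer_frequencies("MKKA", n=2)
```

A vector of this kind (400 entries for n = 2) could then be passed to any tree-based or boosting classifier, with one vector per protein and a nuclear or non-nuclear label.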
Keywords: Subcellular Localization; Nuclear Protein; Fanconis Anaemia; Random Tree; Alternate Decision