Abstract
Following the endosymbiotic events, all mitochondrial aminoacyl-tRNA synthetase (AARS) genes were either lost or transferred from the ancestral mitochondrial genome to the nucleus. The canonical pattern is that the genes for both cytosolic and mitochondrial AARSs coexist in the nuclear genome. In present-day eukaryotic cells, all mitochondrial AARSs are nucleus-encoded, synthesized on cytosolic ribosomes and post-translationally imported from the cytosol into the mitochondria. Discriminating between such similar enzymes according to their site of action is challenging because they have almost the same physico-chemical properties. Predicting the sub-cellular location of AARSs is important for understanding mitochondrial protein synthesis. We have analyzed and optimized the patterns that distinguish cytosolic from mitochondrial AARSs. Firstly, support vector machine (SVM)-based modules were developed using amino acid and dipeptide compositions and achieved Matthews correlation coefficients (MCC) of 0.82 and 0.73, respectively. Secondly, we developed SVM modules using the position-specific scoring matrix and achieved a maximum MCC of 0.78. Thirdly, we developed SVM modules using the N-terminal, intermediate and C-terminal residues and the split amino acid composition (SAAC), and achieved MCCs of 0.82, 0.70, 0.39 and 0.86, respectively. Finally, an SVM module was developed using selected attributes of the split amino acid composition (SA-SAAC) and achieved an MCC of 0.92 with an accuracy of 96.00%. All modules were trained and tested on a non-redundant data set and evaluated using fivefold cross-validation. On the independent data sets, the SA-SAAC-based prediction model achieved an MCC of 0.95 with an accuracy of 97.77%. The web server ‘MARSpred’, based on this study, is available at http://www.imtech.res.in/raghava/marspred/.
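As a rough illustration of the sequence features and the evaluation measure named in the abstract, the following Python sketch computes amino acid composition, dipeptide composition, a split amino acid composition and the MCC. It is not the authors' implementation: the function names are hypothetical, and the terminal window sizes (`n_term=25`, `c_term=25`) are assumed values chosen only for the example, since the abstract does not state them. The resulting vectors would then be passed to an SVM, which is not shown here.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def aa_composition(seq):
    """Percentage of each standard amino acid (20 features)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)  # ignore non-standard symbols
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [100.0 * counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Percentage of each ordered residue pair (400 features)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(pairs)
    n = len(pairs)
    return [100.0 * counts[a + b] / n if n else 0.0
            for a in AMINO_ACIDS for b in AMINO_ACIDS]


def split_composition(seq, n_term=25, c_term=25):
    """Composition of N-terminal, middle and C-terminal parts (60 features).

    The window sizes are assumptions made for this sketch only.
    """
    if len(seq) < n_term + c_term:
        raise ValueError("sequence shorter than the two terminal windows")
    head = seq[:n_term]
    middle = seq[n_term:len(seq) - c_term]
    tail = seq[len(seq) - c_term:]
    return aa_composition(head) + aa_composition(middle) + aa_composition(tail)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Splitting the sequence before computing the composition keeps some positional information, for example an N-terminal targeting signal, that a whole-sequence composition averages away, which is consistent with the higher MCC reported for SAAC than for the whole-sequence composition.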
Acknowledgments
The authors are thankful to the Council of Scientific and Industrial Research (CSIR) and Department of Biotechnology (DBT), Government of India for financial assistance.
Cite this article
Panwar, B., Raghava, G.P.S. Predicting sub-cellular localization of tRNA synthetases from their primary structures. Amino Acids 42, 1703–1713 (2012). https://doi.org/10.1007/s00726-011-0872-8