Abstract
Bacterial proteins are important in cell biology because of their relevance to drug design and antibiotics research. Localizing bacterial proteins matters because the function of a protein is closely linked to its location. A single gram-negative bacterial protein can reside in multiple locations within a cell, which makes predicting the subcellular locations of gram-negative bacterial proteins more difficult. In this paper, we propose a novel method for subcellular localization of gram-negative bacterial proteins. Our method uses a label powerset encoding scheme for the associated multi-label classification problem. Using a set of effective features also used in the literature, our encoding significantly improves over the traditional approaches on several base classifiers. Our method was tested on a standard benchmark dataset and showed promising results.
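As a rough illustration of the label powerset idea named in the abstract, the sketch below maps each distinct set of locations to one class id, so that any single-label base classifier can be trained on the resulting targets. The helper names (`fit_label_powerset`, `encode`, `decode`) and the toy annotations are hypothetical and are not taken from the paper or its benchmark dataset.

```python
from typing import Dict, FrozenSet, List, Sequence


def fit_label_powerset(label_sets: Sequence[Sequence[str]]) -> Dict[FrozenSet[str], int]:
    """Assign one integer class id to each distinct label set, in first-seen order."""
    mapping: Dict[FrozenSet[str], int] = {}
    for labels in label_sets:
        key = frozenset(labels)  # order of labels within a set does not matter
        if key not in mapping:
            mapping[key] = len(mapping)
    return mapping


def encode(label_sets: Sequence[Sequence[str]], mapping: Dict[FrozenSet[str], int]) -> List[int]:
    """Turn multi-label targets into single-label class ids."""
    return [mapping[frozenset(labels)] for labels in label_sets]


def decode(class_ids: Sequence[int], mapping: Dict[FrozenSet[str], int]) -> List[List[str]]:
    """Map predicted class ids back to (sorted) label sets."""
    inverse = {class_id: key for key, class_id in mapping.items()}
    return [sorted(inverse[class_id]) for class_id in class_ids]


# Toy, made-up annotations (illustrative only, not benchmark data).
proteins = [
    ["cytoplasm"],
    ["inner membrane", "periplasm"],
    ["cytoplasm"],
    ["outer membrane"],
    ["periplasm", "inner membrane"],  # same set as the second protein
]

mapping = fit_label_powerset(proteins)
y = encode(proteins, mapping)   # single-label targets for a base classifier
decoded = decode(y, mapping)    # round trip back to label sets
print(y)
print(decoded)
```

One consequence of this design is that a classifier trained on `y` can only predict label combinations seen during training, which is the usual trade-off of label powerset encoding.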
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ferdous, H., Uddin, R., Shatabda, S. (2019). Subcellular Localization of Gram-Negative Proteins Using Label Powerset Encoding. In: Abraham, A., Dutta, P., Mandal, J., Bhattacharya, A., Dutta, S. (eds) Emerging Technologies in Data Mining and Information Security. Advances in Intelligent Systems and Computing, vol 755. Springer, Singapore. https://doi.org/10.1007/978-981-13-1951-8_48
DOI: https://doi.org/10.1007/978-981-13-1951-8_48
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1950-1
Online ISBN: 978-981-13-1951-8
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)