Subcellular Localization of Gram-Negative Proteins Using Label Powerset Encoding

  • Hasnaeen Ferdous
  • Raihan Uddin
  • Swakkhar Shatabda (email author)
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 755)


Bacterial proteins play an important role in cell biology because of their relevance to drug design and antibiotics research. Localizing bacterial proteins is important because the function of a protein is closely linked to its location in the cell. A single gram-negative bacterial protein can reside in multiple locations in the cell, so predicting its subcellular locations is a multi-label problem and thus more difficult than the single-location case. In this paper, we propose a novel method for subcellular localization of gram-negative bacterial proteins. Our method uses a label powerset encoding scheme for the associated multi-label classification problem. Using a set of effective features also used in the literature, our encoding significantly improves over the traditional approaches on several base classifiers. Our method was tested on a standard benchmark dataset and showed promising results.
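As a rough illustration of the label powerset idea mentioned in the abstract, the sketch below maps each distinct set of locations to a single class id, so that any ordinary single-label classifier can be trained on the encoded targets. It is a minimal sketch under stated assumptions, not the implementation used in the paper: the function names (`fit_label_powerset`, `encode`, `decode`) and the toy location names are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set


def fit_label_powerset(label_sets: Iterable[Set[str]]) -> Dict[FrozenSet[str], int]:
    """Assign one integer class id to each distinct label set, in order of first appearance."""
    mapping: Dict[FrozenSet[str], int] = {}
    for labels in label_sets:
        key = frozenset(labels)
        if key not in mapping:
            mapping[key] = len(mapping)
    return mapping


def encode(label_sets: Iterable[Set[str]], mapping: Dict[FrozenSet[str], int]) -> List[int]:
    """Turn multi-label targets into single-label class ids."""
    return [mapping[frozenset(labels)] for labels in label_sets]


def decode(class_ids: Iterable[int], mapping: Dict[FrozenSet[str], int]) -> List[FrozenSet[str]]:
    """Map predicted class ids back to the label sets they stand for."""
    inverse = {class_id: key for key, class_id in mapping.items()}
    return [inverse[class_id] for class_id in class_ids]


# Toy targets (hypothetical): each protein has one or more locations.
train_locations = [
    {"cytoplasm"},
    {"inner membrane"},
    {"cytoplasm", "periplasm"},
    {"inner membrane"},
    {"periplasm", "cytoplasm"},  # same set as the third protein
]

mapping = fit_label_powerset(train_locations)
y = encode(train_locations, mapping)
print(y)  # [0, 1, 2, 1, 2]
```

In this encoding a protein found in both the cytoplasm and the periplasm gets its own class rather than two separate binary targets. One known limitation of the scheme is that a label combination never seen during fitting has no class id and therefore cannot be predicted.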


Supervised learning, Classification problem, Label encoding, Protein subcellular localization



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Hasnaeen Ferdous (1)
  • Raihan Uddin (1)
  • Swakkhar Shatabda (1, email author)
  1. Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
