Abstract
Identifying protein-protein interactions (PPIs) is one of the most important tasks for understanding biological functions and disease mechanisms. Although numerous databases of biological interactions have been published thanks to advances in high-throughput technology, the study of inter-species protein-protein interactions, especially between human and bacterial pathogens, remains an active yet challenging area in which to apply computational models to complex analysis and prediction tasks. In this paper, we comprehensively revisit the prediction of human-bacterium protein-protein interactions (HB-PPI), which is the first endeavour to report an empirical evaluation of machine learning models for learning and predicting HB-PPI. Firstly, we summarise the literature on human-bacterium interaction (HBI) studies and carefully examine the large number of databases published in recent decades. Secondly, we design a broader and deeper experimental framework for the HB-PPI prediction task, which explores a variety of feature representation algorithms and computational models that learn from the curated HB-PPI dataset and perform predictions. Furthermore, we propose a bidirectional LSTM-based model for the prediction task, which performs more effectively than the other models. Finally, we discuss opportunities for improving the performance and robustness of machine learning models for HB-PPI prediction, laying a foundation for future work.
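As a rough illustration of what a bidirectional LSTM-based predictor for protein pairs can look like, the following minimal sketch encodes each sequence with a forward and a backward LSTM pass and scores the concatenated pair with a logistic output. It is not the authors' implementation: the one-hot residue encoding, the shared encoder for both proteins, the hidden size, the random untrained weights, and all function names (`one_hot`, `bilstm_encode`, `predict_interaction`, `make_model`) are assumptions made for this example, since the abstract does not specify the architecture or features.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq):
    """Encode a sequence as 20-dim one-hot vectors (unknown residues become all zeros)."""
    return [[1.0 if aa == a else 0.0 for a in AMINO_ACIDS] for aa in seq]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_dim, hidden_dim, rng):
    """Random weights for the four gates, stacked as input, forget, cell, output."""
    rows, cols = 4 * hidden_dim, input_dim + hidden_dim
    W = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return W, [0.0] * rows


def lstm_last_state(xs, params, hidden_dim):
    """Run a single LSTM over xs and return the final hidden state."""
    W, b = params
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in xs:
        z = x + h  # concatenate the input with the previous hidden state
        pre = [sum(w * v for w, v in zip(row, z)) + bias for row, bias in zip(W, b)]
        i = [sigmoid(p) for p in pre[:hidden_dim]]
        f = [sigmoid(p) for p in pre[hidden_dim:2 * hidden_dim]]
        g = [math.tanh(p) for p in pre[2 * hidden_dim:3 * hidden_dim]]
        o = [sigmoid(p) for p in pre[3 * hidden_dim:]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def bilstm_encode(seq, fwd, bwd, hidden_dim):
    """Concatenate the final states of a forward pass and a reversed pass."""
    xs = one_hot(seq)
    return lstm_last_state(xs, fwd, hidden_dim) + lstm_last_state(xs[::-1], bwd, hidden_dim)


def predict_interaction(human_seq, bacterial_seq, model):
    """Return a score in (0, 1) for a human-bacterium protein pair."""
    hd = model["hidden_dim"]
    pair = (bilstm_encode(human_seq, model["fwd"], model["bwd"], hd)
            + bilstm_encode(bacterial_seq, model["fwd"], model["bwd"], hd))
    logit = sum(w * v for w, v in zip(model["w_out"], pair)) + model["b_out"]
    return sigmoid(logit)


def make_model(hidden_dim=8, seed=0):
    """Build an untrained model with reproducible random weights (illustrative only)."""
    rng = random.Random(seed)
    return {
        "hidden_dim": hidden_dim,
        "fwd": init_lstm(len(AMINO_ACIDS), hidden_dim, rng),
        "bwd": init_lstm(len(AMINO_ACIDS), hidden_dim, rng),
        "w_out": [rng.uniform(-0.1, 0.1) for _ in range(4 * hidden_dim)],
        "b_out": 0.0,
    }
```

Calling `predict_interaction("MKTAYIAK", "MSDNE", make_model())` returns a score between 0 and 1; because the weights are untrained, the value carries no biological meaning until the parameters are fitted to labelled interaction data.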
Notes
1. The code and data are available at: https://huaming-chen.com/Bi-LSTM-Predictor/.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, H., Shen, J., Wang, L., Jin, Y. (2021). Towards A More Effective Bidirectional LSTM-Based Learning Model for Human-Bacterium Protein-Protein Interactions. In: Panuccio, G., Rocha, M., Fdez-Riverola, F., Mohamad, M., Casado-Vara, R. (eds) Practical Applications of Computational Biology & Bioinformatics, 14th International Conference (PACBB 2020). PACBB 2020. Advances in Intelligent Systems and Computing, vol 1240. Springer, Cham. https://doi.org/10.1007/978-3-030-54568-0_10
DOI: https://doi.org/10.1007/978-3-030-54568-0_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54567-3
Online ISBN: 978-3-030-54568-0
eBook Packages: Intelligent Technologies and Robotics (R0)