Computational Prediction of Host-Pathogen Interactions Through Omics Data Analysis and Machine Learning

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2017)

Abstract

The emergence and rapid worldwide dissemination of antibiotic resistance threatens medical progress and calls for innovative approaches to managing multidrug-resistant infections. Phage therapy, i.e., the use of viruses (phages) that specifically infect and kill bacteria during their life cycle, is a re-emerging and promising alternative for addressing this problem. The success of phage therapy relies mainly on an exact match between the target pathogenic bacterium and the therapeutic phage. Currently, only a few tools or methodologies efficiently predict phage-bacterium interactions suitable for phage therapy, so phage-bacterium pairs are tested empirically in the laboratory. In this paper we present an original methodology, based on an ensemble-learning approach, to predict whether a given phage-bacterium pair would interact. Using publicly available information from GenBank and phagesdb.org, we assembled a dataset containing more than two thousand phage-bacterium interactions with their corresponding genomes. A set of informative features extracted from these genomes forms the basis of the quantitative datasets used to train our predictive models. These features include the distribution of predicted protein-protein interaction scores, as well as the amino acid frequency, chemical composition, and molecular weight of the proteins involved. Evaluated on an independent test dataset, our approach achieves encouraging performance, with accuracy, specificity, and sensitivity all above 90%.
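
As a rough illustration of two ingredients named in the abstract, the sketch below computes amino acid frequencies for a protein sequence and derives accuracy, sensitivity, and specificity from binary predictions. It is not the authors' pipeline: the function names `amino_acid_frequencies` and `binary_metrics`, the 0/1 label encoding (1 = interaction), and the toy inputs are assumptions made for this example only.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(protein):
    """Return the relative frequency of each standard amino acid.

    Characters outside the 20 standard residues (e.g. 'X', '*') are ignored.
    Hypothetical helper; the paper's actual feature extraction is not shown.
    """
    counts = Counter(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and specificity for 0/1 labels.

    Assumed encoding: 1 = the phage-bacterium pair interacts, 0 = it does not.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of true interactions that were predicted.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of non-interactions correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    freqs = amino_acid_frequencies("AAGC")  # toy sequence, not real data
    print(freqs["A"], freqs["G"], freqs["C"])
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

Reporting sensitivity and specificity alongside accuracy matters when interacting and non-interacting pairs are unevenly represented, since accuracy alone can hide poor performance on the rarer class.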

A. Neves and C. Peña-Reyes contributed equally to this work.

Author information

Corresponding authors

Correspondence to Diogo Manuel Carvalho Leite, Xavier Brochet or Carlos Peña-Reyes.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Carvalho Leite, D.M., Brochet, X., Resch, G., Que, YA., Neves, A., Peña-Reyes, C. (2017). Computational Prediction of Host-Pathogen Interactions Through Omics Data Analysis and Machine Learning. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2017. Lecture Notes in Computer Science, vol 10209. Springer, Cham. https://doi.org/10.1007/978-3-319-56154-7_33

  • DOI: https://doi.org/10.1007/978-3-319-56154-7_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56153-0

  • Online ISBN: 978-3-319-56154-7

  • eBook Packages: Computer Science, Computer Science (R0)
