Recipes for Translating Big Data Machine Reading to Executable Cellular Signaling Models

  • Khaled Sayed
  • Cheryl A. Telmer
  • Adam A. Butchy
  • Natasa Miskov-Zivanov (corresponding author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10710)


Biological literature is rich in mechanistic information that can be utilized to construct executable models of complex systems to increase our understanding of health and disease. However, the literature is vast and fragmented, and therefore, automation of information extraction from papers and of model assembly from the extracted information is necessary. We describe here our approach for translating machine reading outputs, obtained by reading biological signaling literature, to discrete models of cellular networks. We use outputs from three different reading engines, and demonstrate the translation of different features using examples from cancer literature. We also outline several issues that still arise when assembling cellular network models from state-of-the-art reading engines. Finally, we illustrate the details of our approach with a case study in pancreatic cancer.


Machine reading · Big data in literature · Text mining · Cell signaling networks · Automated model generation



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Khaled Sayed (1)
  • Cheryl A. Telmer (2)
  • Adam A. Butchy (3)
  • Natasa Miskov-Zivanov (1, 3, 4)

  1. Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, USA
  2. Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, USA
  3. Department of Bioengineering, University of Pittsburgh, Pittsburgh, USA
  4. Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, USA
