Setup of a scientific computing environment for computational biology: Simulation of a genome-scale metabolic model of Escherichia coli as an example

Jeon, Junhyeok; Kim, Hyun Uk

doi:10.1007/s12275-020-9516-6


Protocol
Published: 27 February 2020

Journal of Microbiology, Volume 58, pages 227–234 (2020)

Junhyeok Jeon¹ & Hyun Uk Kim¹,²,³


Abstract

Computational analysis of biological data is becoming increasingly important, especially in this era of big data. It makes it possible to derive biological insights from data efficiently, sometimes including counterintuitive ones that challenge existing knowledge. Experimental researchers without prior exposure to computer programming have often regarded such analysis as a task reserved for computational biologists. However, thanks to the increasing availability of user-friendly computational resources, experimental researchers can now easily access a scientific computing environment and the packages necessary for data analysis. In this regard, we describe here the process of accessing Jupyter Notebook, the most popular Python coding environment, to conduct computational biology. Python is currently a mainstream programming language for biology and biotechnology. In particular, Anaconda and Google Colaboratory are introduced as two representative options for easily launching Jupyter Notebook. Finally, the Python package COBRApy is demonstrated as an example to simulate 1) the specific growth rate of Escherichia coli, as well as the compounds consumed or generated, in a minimal medium with glucose as the sole carbon source, and 2) the theoretical production yield of succinic acid, an industrially important chemical, using E. coli. This protocol should serve as a guide to further, more extended computational analyses of biological data for experimental researchers without a computational background.
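The two simulations named in the abstract can be sketched in a Jupyter Notebook cell roughly as follows. This is a minimal illustration and not the protocol's own notebook, which is not part of this preview. It assumes COBRApy is installed (for example with `pip install cobra`), and it uses the small "textbook" E. coli core model bundled with COBRApy as a stand-in for a genome-scale model. The function names `molar_yield` and `simulate` are hypothetical, and the reaction identifiers `EX_glc__D_e` and `EX_succ_e` are the BiGG-style exchange IDs assumed for glucose and succinate in that model.

```python
def molar_yield(product_flux, substrate_uptake_flux):
    """Return mol of product per mol of substrate.

    Both fluxes are in mmol/gDW/h. Uptake through an exchange reaction is
    negative by convention, so the absolute value of the ratio is returned.
    """
    if substrate_uptake_flux == 0:
        raise ValueError("substrate uptake flux must be non-zero")
    return abs(product_flux / substrate_uptake_flux)


def simulate(model_id="textbook",
             glucose_id="EX_glc__D_e",
             succinate_id="EX_succ_e"):
    """Run both simulations on a bundled model (illustrative helper)."""
    # Imported here so molar_yield() can be used without COBRApy installed.
    from cobra.io import load_model

    # Assumption: the bundled core model, whose default medium is a
    # glucose minimal medium, stands in for a genome-scale model.
    model = load_model(model_id)

    # 1) Maximize biomass formation to estimate the specific growth rate.
    growth = model.optimize()
    growth_rate = growth.objective_value

    # Exchange fluxes: negative values are consumed, positive are secreted.
    exchange_fluxes = {
        rxn.id: growth.fluxes[rxn.id]
        for rxn in model.exchanges
        if abs(growth.fluxes[rxn.id]) > 1e-9
    }

    # 2) Maximize succinate secretion for the theoretical yield. The
    #    context manager restores the original objective on exit.
    with model:
        model.objective = succinate_id
        production = model.optimize()
        succinate_yield = molar_yield(
            production.fluxes[succinate_id],
            production.fluxes[glucose_id],
        )

    return growth_rate, exchange_fluxes, succinate_yield
```

Calling `growth_rate, exchange_fluxes, succinate_yield = simulate()` in a notebook cell returns the growth rate in 1/h, a dictionary of non-zero exchange fluxes, and the succinate yield in mol per mol of glucose. Lazy import of COBRApy inside `simulate` is a design choice made here so that the yield helper stays usable on its own.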



Acknowledgments

We thank Mohammad Rifqi Ghiffary and Komal for their kind review of the manuscript. This work was supported by the Bio-Synergy Research Project (NRF-2018M3A9C4076475) of the Ministry of Science and ICT through the National Research Foundation.

Author information

Authors and Affiliations

Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
Junhyeok Jeon & Hyun Uk Kim
KAIST Institute for Artificial Intelligence, KAIST, Daejeon, 34141, Republic of Korea
Hyun Uk Kim
BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon, 34141, Republic of Korea
Hyun Uk Kim


Corresponding author

Correspondence to Hyun Uk Kim.


About this article

Cite this article

Jeon, J., Kim, H.U. Setup of a scientific computing environment for computational biology: Simulation of a genome-scale metabolic model of Escherichia coli as an example. J Microbiol. 58, 227–234 (2020). https://doi.org/10.1007/s12275-020-9516-6


Received: 01 November 2019
Revised: 29 January 2020
Accepted: 10 February 2020
Published: 27 February 2020
Issue Date: March 2020
DOI: https://doi.org/10.1007/s12275-020-9516-6
