
Eat, sleep, code, repeat: tips for early-career researchers in computational science

  • Tutorial
The European Physical Journal Plus


This article is intended as a guide for new graduate students entering the field of computational science. As students from increasingly diverse backgrounds join this popular field, this short guide aims to help them navigate the computational techniques they are likely to encounter during their studies. We cover a broad spectrum of techniques, including Bash scripting, scientific programming, and machine learning, among others. The paper is structured into nine sections, each introducing a different computational method. To enhance readability, we have adopted a casual and instructive tone throughout and included relevant code snippets. Given its introductory nature, this article is not intended to be exhaustive; instead, we direct readers to a list of references to expand their knowledge of the techniques discussed. Finally, this article serves as an extension to our student-led seminar series, for which additional resources and videos are available.


Data availability

Not applicable.

Code availability

Not applicable.


  1. Conda is a widely used package and environment manager that allows users to install specific software packages and their dependencies. It facilitates the replication of software environments by creating isolated, self-contained environments, which prevents conflicts between distinct projects.




I.I., D.M., J.M.T., Z.F., C.M., and C.D.W. acknowledge funding from the Engineering and Physical Sciences Research Council (EPSRC) Centre for Doctoral Training in Modelling of Heterogeneous Systems [EP/S022848/1]. S.C. acknowledges funding from the EPSRC Centre for Doctoral Training in Diamond Science and Technology [EP/L015315/1] and the Research Development Fund of the University of Warwick. C.P. acknowledges funding from the EPSRC Mathematics for Real-World Systems Centre for Doctoral Training [EP/S022244/1]. In addition, we would like to acknowledge the valuable early efforts and input of several colleagues in the computational toolkit seminar series on which this paper is based: Peter Lewin-Jones, Kyle Fogarty, Lakshmi Shenoy, Matthew Harrison, and Charlotte Rogerson. Finally, we would like to thank Professors James Kermode and Julie Staunton (University of Warwick) for their time in reading our drafts and for their valuable advice and comments on our manuscript.


This study was funded by the Engineering and Physical Sciences Research Council (EPSRC) [EP/S022848/1].

Author information




All authors have contributed equally to this manuscript.

Corresponding author

Correspondence to Idil Ismail.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ismail, I., Chaudhuri, S., Morgan, D. et al. Eat, sleep, code, repeat: tips for early-career researchers in computational science. Eur. Phys. J. Plus 138, 1094 (2023).
