Learnings from developing an applied data science curricula for undergraduate and graduate students

  • Article
MRS Advances

Abstract

Data science has advanced significantly in recent years, allowing scientists to apply large-scale data analysis techniques using open-source coding frameworks. It is a tool that science and engineering students should be taught alongside their chosen domain knowledge. An applied data science minor gives students an understanding of data and data handling as well as statistics and model development. Teaching these skills will improve the reproducibility and openness of research, allow for greater interdisciplinarity, and enable more analyses focused on critical scientific challenges.

About this article

Cite this article

French, R.H., Bruckman, L.S. Learnings from developing an applied data science curricula for undergraduate and graduate students. MRS Advances 5, 347–353 (2020). https://doi.org/10.1557/adv.2020.135

  • DOI: https://doi.org/10.1557/adv.2020.135
