Abstract
As data become more available and integrated into daily life, there has been growing interest in developing data science curricula for youth in conjunction with scientific practices and classroom technologies. However, there is not yet consensus on the what and how of data science in pre-collegiate education. This paper analyzes two prominent self-identified data science curricula, Introduction to Data Science (Gould et al. 2018) and Bootstrap: Data Science (Krishnamurthi et al. 2020), in order to ascertain what is thus far being presented to schools as data science. We highlight content and practices that the two curricula share while noting some key differences between the curricula and from professional practice. Moreover, we examine how lessons are structured and what kinds of data sets are used, and we introduce a measure of data set proximity. We conclude with recommended areas for further coverage or elaboration in future iterations and future curricular efforts.
Notes
Consider, for example, the difference in scale between genomic and astronomical data science.
Indeed, since this analysis was completed, the CourseKata Statistics online textbook has received investments to present itself as a statistics and data science curricular resource, although at the time of this writing it is neither a National Science Foundation investment nor focused exclusively on pre-collegiate instruction. Similarly, the youcubed mathematics education organization at Stanford is preparing data science curricula with support that does not yet involve National Science Foundation funding.
References
Arnold, P., & Pfannkuch, M. (2019). Posing comparative statistical investigative questions. In G. Burrill & D. Ben-Zvi (Eds.). Topics and trends in current statistics education research: International perspectives (pp. 173–195). Springer International Publishing.
Bakker, A., Biehler, R., & Konold, C. (2005). Should young students learn about box plots? In G. Burrill & M. Camden (Eds.), Curricular development in statistics education (pp. 163–173). International Statistical Institute.
Bargagliotti, A., Franklin, C., Arnold, P., Gould, R., Johnson, S., Perez, L., & Spangler, D. (2020). Pre-K-12 Guidelines for Assessment and Instruction in Statistics Education (GAISE) report II. American Statistical Association.
Ben-Zvi, D., & Arcavi, A. (2001). Junior high school students’ construction of global views of data and data representations. Educational Studies in Mathematics, 45(1), 35–65. https://doi.org/10.1023/A:1013809201228
Bowen, G. A. (2009). Document analysis as a qualitative research method. Qualitative Research Journal, 9(2), 27–40. https://doi.org/10.3316/QRJ0902027
Burke, J., Estrin, D., Hansen, M., Parker, A., Ramanathan, N., Reddy, S., & Srivastava, M. B. (2006). Participatory Sensing. In Proceedings of WSW’06 at SenSys ’06. Boulder, CO: ACM.
CCSSM. (2010). Common core state standards for mathematics. Washington, DC: Author. Retrieved from Common Core State Standards website: http://www.corestandards.org/assets/CCSSI_Math%20Standards.pdf
Chinn, C. A., & Brewer, W. F. (1993). The role of anomalous data in knowledge acquisition: A theoretical framework and implications for science instruction. Review of Educational Research, 63(1), 1–49.
Conway, D. (2013). The data science Venn diagram. Retrieved from http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
De Mauro, A., Greco, M., Grimaldi, M., & Ritala, P. (2018). Human resources for Big Data professions: A systematic classification of job roles and required skill sets. Information Processing & Management, 54(5), 807–817. https://doi.org/10.1016/j.ipm.2017.05.004
D’Ignazio, C., & Klein, L. F. (2020). Data feminism. MIT Press.
diSessa, A. A. (2004). Metarepresentation: Native competence and targets for instruction. Cognition and Instruction, 22(3), 293–331.
delMas, R., & Liu, Y. (2005). Exploring students’ conceptions of the standard deviation. Statistics Education Research Journal, 4(1), 55–82.
Duhigg, C. (2012). How companies learn your secrets. New York Times Magazine. Retrieved from https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html
Edelson, D. C., & Reiser, B. J. (2006). Making authentic practices accessible to learners: Design challenges and strategies. In R. K. Sawyer (Ed.), The Cambridge Handbook of the Learning Sciences (pp. 335–354). Cambridge University Press.
Enyedy, N., & Mukhopadhyay, S. (2007). They don’t show nothing I didn’t know: Emergent tensions between culturally relevant pedagogy and mathematics pedagogy. Journal of the Learning Sciences, 16(2), 139–174. https://doi.org/10.1080/10508400701193671
Erickson, T., Wilkerson, M., Finzer, W., & Reichsman, F. (2019). Data moves. Technology Innovations in Statistics Education, 12(1).
Felleisen, M., Findler, R. B., Flatt, M., & Krishnamurthi, S. (2018). How to design programs: An introduction to programming and computing. MIT Press.
Finzer, W. (2013). The data science education dilemma. Technology Innovations in Statistics Education, 7(2). Retrieved from https://escholarship.org/uc/item/7gv0q9dc
Franklin, C., Kader, G., Mewborn, D., Moreno, J., Peck, R., Perry, M., & Scheaffer, R. (2007). Guidelines for assessment and instruction in statistics education (GAISE) report: A pre-k–12 curriculum framework. Alexandria, VA: American Statistical Association.
Gabernet, A. R., & Limburn, J. (2017). Breaking the 80/20 rule: How data catalogs transform data scientists’ productivity. Retrieved from https://www.ibm.com/cloud/blog/ibm-data-catalog-data-scientists-productivity
Garfield, J., delMas, R. C., & Chance, B. (2007). Using students’ informal notions of variability to develop an understanding of formal measures of variability. In M. C. Lovett & P. Shah (Eds.), Thinking With Data (pp. 117–148). Lawrence Erlbaum Associates.
Gebre, E. H., & Polman, J. L. (2016). Developing young adults' representational competence through infographic-based science news reporting. International Journal of Science Education, 38(18), 2667–2687. https://doi.org/10.1080/09500693.2016.1258129
Goode, J., Margolis, J., & Chapman, G. (2014). Curriculum is not enough: The educational theory and research foundation of the exploring computer science professional development model. In Proceedings of the 45th ACM Technical Symposium on Computer Science Education (pp. 493–498).
Gould, R., Bargagliotti, A., & Johnson, T. (2017). An analysis of secondary teachers’ reasoning with participatory sensing data. Statistics Education Research Journal, 16(2), 305–334.
Gould, R., Machado, S., Johnson, T. A., & Molynoux, J. (2018). Introduction to Data Science v 5.0. Los Angeles: UCLA Center X.
Hardin, J., Hoerl, R., Horton, N. J., Nolan, D., Baumer, B., Hall-Holt, O., … & Ward, M. D. (2015). Data science in statistics curricula: Preparing students to “think with data.” The American Statistician, 69(4), 343–353. https://doi.org/10.1080/00031305.2015.1077729
Hardy, L., Dixon, C., & Hsi, S. (2020). From data collectors to data producers: Shifting students’ relationship to data. Journal of the Learning Sciences, 29(1), 104–126. https://doi.org/10.1080/10508406.2019.1678164
Hazzan, O., Ragonis, N., & Lapidot, T. (2020). Data science and computer science education. Guide to Teaching Computer Science: An Activity-Based Approach (pp. 95–117). Springer International Publishing.
Kahn, J. (2020). Learning at the intersection of self and society: The family geobiography as a context for data science education. Journal of the Learning Sciences, 29(1), 57–80. https://doi.org/10.1080/10508406.2019.1693377
Konold, C., Higgins, T., Russell, S. J., & Khalil, K. (2015). Data seen through different lenses. Educational Studies in Mathematics, 88(3), 305–325. https://doi.org/10.1007/s10649-013-9529-8
Krishnamurthi, S., Schanzer, E., Politz, J. G., Lerner, B. S., Fisler, K., & Dooman, S. (2020). Data science as a route to AI for middle- and high-school students. arXiv preprint arXiv:2005.01794.
Lehrer, R., & Schauble, L. (2004). Modeling natural variation through distribution. American Educational Research Journal, 41(3), 635–679.
Lehrer, R., & Schauble, L. (2000). Inventing data structures for representational purposes: Elementary grade students’ classification models. Mathematical Thinking and Learning, 2(1 & 2), 51–74.
Lehrer, R., & Schauble, L. (2007). Contrasting emerging conceptions of distribution in contexts of error and natural variation. In M. Lovett & P. Shah (Eds.), Thinking with data (pp. 149–176). Lawrence Erlbaum.
Lee, V. R., & Delaney, V. (2021). Aesthetics of authenticity for teachers’ data set preferences. In E. de Vries, Y. Hod, & J. Ahn (Eds.), 15th International Conference of the Learning Sciences (ICLS) (pp. 259–266). ISLS.
Lee, V. R., & Dubovi, I. (2020). At home with data: Family engagements with data involved in Type 1 Diabetes management. Journal of the Learning Sciences, 29(1), 11–31. https://doi.org/10.1080/10508406.2019.1666011
Lee, V. R., Drake, J., Cain, R., & Thayne, J. (in press). Remembering what produced the data: Reflective reconstruction in the context of a ‘quantified self’ elementary data and statistics unit. Cognition & Instruction. https://doi.org/10.1080/07370008.2021.1936529
Lee, V. R., Drake, J. R., & Thayne, J. L. (2016). Appropriating quantified self technologies to improve elementary statistical teaching and learning. IEEE Transactions on Learning Technologies, 9(4), 354–365. https://doi.org/10.1109/TLT.2016.2597142
Lee, V. R., & Wilkerson, M. (2018). Data use by middle and secondary students in the digital age: A status report and future prospects. Retrieved from
Levitt, S. (2019). America’s math curriculum doesn’t add up [Audio Podcast]. Freakonomics. https://freakonomics.com/podcast/math-curriculum/
Makar, K., & Rubin, A. (2018). Learning About Statistical Inference. In D. Ben-Zvi, K. Makar, & J. Garfield (Eds.), International Handbook of Research in Statistics Education (pp. 261–294). Springer International Publishing.
Matuk, C., DesPortes, K., Amato, A., Silander, M., Vacca, R., Vasudevan, V., & Woods, P. J. (2021). Challenges and opportunities in teaching and learning data literacy through art. In E. de Vries, Y. Hod, & J. Ahn (Eds.), 15th International Conference of the Learning Sciences (ICLS) (pp. 681–684). ISLS.
Metcalf, S. J., & Tinker, R. (2004). Probeware and handhelds in elementary and middle school science. Journal of Science Education and Technology, 13(1), 43–49.
Meyer, B. (1992). Applying “design by contract.” Computer, 25(10), 40–51. https://doi.org/10.1109/2.161279
Mokros, J., & Russell, S. J. (1995). Children’s concepts of average and representativeness. Journal for Research in Mathematics Education, 26(1), 20–39.
National Academies of Sciences, Engineering, and Medicine. (2018). Data science for undergraduates: Opportunities and options. National Academies Press.
Noble, S. (2018). Algorithms of oppression: How search engines reinforce racism. New York University Press.
O’Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. Broadway Books.
Papert, S. (1996). An exploration in the space of mathematics educations. International Journal of Computers for Mathematical Learning, 1(1), 95–123.
Pearson Scott Foresman. (2008). Investigations in number, data, and space. Northbrook, IL: Pearson Scott Foresman.
Quintana, C., Reiser, B. J., Davis, E. A., Krajcik, J., Fretz, E., Duncan, R. G., … & Soloway, E. (2004). A scaffolding design framework for software to support science inquiry. The Journal of the Learning Sciences, 13(3), 337–386.
Roberts, J., & Lyons, L. (2020). Examining spontaneous perspective taking and fluid self-to-data relationships in informal open-ended data exploration. Journal of the Learning Sciences, 29(1), 32–56. https://doi.org/10.1080/10508406.2019.1651317
Rosenberg, J. M., Borchers, C., Dyer, E. B., Anderson, D., & Fischer, C. (2021). Understanding public sentiment about educational reforms: The Next Generation Science Standards on Twitter. AERA Open, 7, 23328584211024261. https://doi.org/10.1177/23328584211024261
Rosenberg, J. M., Lawson, M., Anderson, D. J., Jones, R. S., & Rutherford, T. (2020). Making data science count in and for education. In E. Romero-Hall (Ed.), Research Methods in Learning Design and Technology (pp. 94–110). Routledge.
Rubel, L. H., Hall-Wieckert, M., & Lim, V. Y. (2017). Making space for place: Mapping tools and practices to teach for spatial justice. Journal of the Learning Sciences, 26(4), 643–687. https://doi.org/10.1080/10508406.2017.1336440
Rubin, A. (2019). Facebook or Instagram? Teens explore data about technology use. Retrieved from https://www.terc.edu/facebook-or-instagram-teens-explore-data-about-technology-use/
Rubin, A. (2020). Learning to reason with data: How did we get here and what do we know? Journal of the Learning Sciences, 29(1), 154–164. https://doi.org/10.1080/10508406.2019.1705665
Sandoval, W. A., & Millwood, K. A. (2005). The quality of students’ use of evidence in written scientific explanations. Cognition and Instruction, 23(1), 23–55. https://doi.org/10.1207/s1532690xci2301_2
Schanzer, E., Fisler, K., Krishnamurthi, S., & Felleisen, M. (2015). Transferring skills at solving word problems from computing to algebra through bootstrap. Paper presented at the Proceedings of the 46th ACM Technical Symposium on Computer Science Education, Kansas City, Missouri, USA. https://doi.org/10.1145/2676723.2677238
Schultheis, E. H., & Kjelvik, M. K. (2015). Data nuggets: Bringing real data into the classroom to unearth students’ quantitative & inquiry skills. The American Biology Teacher, 77(1), 19–29. https://doi.org/10.1525/abt.2015.77.1.4
Schwarz-Ballard, J. (2005). Content and curriculum coherence in middle school science. Unpublished Doctoral Dissertation. Northwestern University.
Siemens, G., & Baker, R. S. D. (2012). Learning analytics and educational data mining: Towards communication and collaboration. Paper presented at the Proceedings of the 2nd International Conference on Learning Analytics and Knowledge, Vancouver, British Columbia, Canada. https://doi.org/10.1145/2330601.2330661
Simoneau, E. J. (2015). STATS4STEM: Data, computing, and assessment resources for high-school statistics students. Chance, 28(4), 4–11.
Stornaiuolo, A. (2020). Authoring data stories in a media makerspace: Adolescents developing critical data literacies. Journal of the Learning Sciences, 29(1), 81–103. https://doi.org/10.1080/10508406.2019.1689365
Taylor, D. (2016). Battle of the data science Venn diagrams. Retrieved from https://www.kdnuggets.com/2016/10/battle-data-science-venn-diagrams.html
Van Wart, S., Lanouette, K., & Parikh, T. S. (2020). Scripts and counterscripts in community-based data science: Participatory digital mapping and the pursuit of a third space. Journal of the Learning Sciences, 29(1), 127–153. https://doi.org/10.1080/10508406.2019.1693378
Weintrop, D., Beheshti, E., Horn, M., Orton, K., Jona, K., Trouille, L., & Wilensky, U. (2016). Defining computational thinking for mathematics and science classrooms. Journal of Science Education and Technology, 25(1), 127–147. https://doi.org/10.1007/s10956-015-9581-5
Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. O'Reilly Media, Inc.
Wilkerson, M. H., & Polman, J. L. (2020). Situating data science: Exploring how relationships to data shape learning. Journal of the Learning Sciences, 29(1), 1–10. https://doi.org/10.1080/10508406.2019.1705664
Zimmermann-Niefield, A., Turner, M., Murphy, B., Kane, S. K., & Shapiro, R. B. (2019). Youth learning machine learning through building models of athletic moves. Proceedings of the 18th ACM International Conference on Interaction Design and Children, Boise, ID, USA. https://doi.org/10.1145/3311927.3323139
Ethics declarations
Ethics Approval
This is a study of documents and online materials. The Stanford University Institutional Review Board has confirmed no ethical approval is required.
Consent to Participate
There were no human participants involved in this study.
Conflict of Interest
The authors declare no conflicts of interest.
Cite this article
Lee, V.R., Delaney, V. Identifying the Content, Lesson Structure, and Data Use Within Pre-collegiate Data Science Curricula. J Sci Educ Technol 31, 81–98 (2022). https://doi.org/10.1007/s10956-021-09932-1
DOI: https://doi.org/10.1007/s10956-021-09932-1