
Abstract

In this paper, we review algorithmic bias in education, discussing the causes of that bias and reviewing the empirical literature on the specific ways that algorithmic bias is known to have manifested in education. While other recent work has reviewed mathematical definitions of fairness and expanded algorithmic approaches to reducing bias, our review focuses instead on solidifying the current understanding of the concrete impacts of algorithmic bias in education: which groups are known to be impacted, and which stages and agents in the development and deployment of educational algorithms are implicated. We discuss theoretical and formal perspectives on algorithmic bias, connect those perspectives to the machine learning pipeline, and review metrics for assessing bias. Next, we review the evidence around algorithmic bias in education, beginning with the most heavily studied categories of race/ethnicity, gender, and nationality, and moving to the available evidence of bias for less-studied categories, such as socioeconomic status, disability, and military-connected status. Acknowledging the gaps in what has been studied, we propose a framework for moving from unknown bias to known bias and from fairness to equity. We discuss obstacles to addressing these challenges and propose four areas of effort for mitigating and resolving the problems of algorithmic bias in AIED systems and other educational technology.
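As a concrete illustration of the kind of metrics the review surveys, the sketch below (our illustration, not code or data from the article) computes two group-fairness measures commonly applied to educational predictors: the demographic parity difference (the gap in positive-prediction rates across groups) and per-group AUC, a simple form of slicing analysis. All data, group labels, and function names here are hypothetical.

```python
# A minimal sketch of two common group-fairness metrics for a binary
# educational predictor (e.g., dropout or graduation prediction).
# All data and group labels below are synthetic and hypothetical.
import numpy as np
from sklearn.metrics import roc_auc_score

def demographic_parity_difference(y_pred, groups):
    # Largest gap in positive-prediction rates between any two groups.
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

def auc_by_group(y_true, y_score, groups):
    # AUC computed separately within each group (a simple slicing analysis).
    return {g: roc_auc_score(y_true[groups == g], y_score[groups == g])
            for g in np.unique(groups)}

# Synthetic scores from a hypothetical student-success model.
rng = np.random.default_rng(0)
groups = rng.choice(["A", "B"], size=1000)           # demographic group labels
y_true = rng.integers(0, 2, size=1000)               # observed outcomes
y_score = np.clip(rng.normal(0.55, 0.2, size=1000)
                  + 0.05 * (groups == "A"), 0, 1)    # scores skewed toward group A
y_pred = (y_score >= 0.5).astype(int)                # thresholded decisions

print("Demographic parity difference:", demographic_parity_difference(y_pred, groups))
print("AUC by group:", auc_by_group(y_true, y_score, groups))
```

A nonzero parity difference or an AUC gap across groups flags potential bias at the thresholding or modeling stage, though no single metric captures every formal definition of fairness at once.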


Data Availability

Not applicable.

Code Availability

Not applicable.


Funding

This work was funded by Schmidt Futures and the University of Pennsylvania. All opinions and perspectives presented in this article are those of the authors and do not necessarily represent the funders’ opinions or perspectives.

Author information


Corresponding author

Correspondence to Ryan S. Baker.

Ethics declarations

Ethics Approval

Not applicable.

Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Conflicts of Interest/Conflicting Interests

None to report.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Baker, R.S., Hawn, A. Algorithmic Bias in Education. Int J Artif Intell Educ 32, 1052–1092 (2022). https://doi.org/10.1007/s40593-021-00285-9
