Abstract
Ensuring fair access to quality education is essential if an education system is to fully realize every student’s potential. Machine learning (ML) is transforming education: it enables educators to develop personalized learning strategies for students, provides important information on student progression, supports early identification of potential points of struggle, and allows more efficient grading systems to be developed. The role of Educational Data Science (EDS) in educational activities is therefore becoming increasingly important for both teachers and learners. However, ML-driven decision-making can be biased, resulting in underperforming ML models and/or ML models that discriminate against individuals or groups of students based on protected attributes such as gender or race. Mitigating bias and discrimination in ML is therefore of paramount importance. In this work, we focus on one of the most effective ML tasks, clustering, which is widely used in EDS both as an exploratory tool for understanding student characteristics and behavior and as a stand-alone tool for, e.g., group assignments. Traditionally, clustering algorithms focus on finding groups or clusters of similar students and ignore aspects of fairness and discrimination, although both the quality and the fairness of the resulting clusters are needed. This chapter provides a comprehensive review of clustering models in EDS, with greater emphasis on fair clustering models. Among the fair clustering models, we mainly focus on those that have been proposed and/or applied in educational activities, to ensure their usefulness and applicability for fairness-aware EDS.
Notes
- 7. Students’ confidence entropy is computed by the Shannon equation.
- 8. Because the objects or data points for clustering in the related work are mainly students, we use the terms “objects,” “data points,” and “students” interchangeably.
- 12. We only take into account the fairness notions introduced in the published papers. Because fairness notions may be turned into measures [157], we use the terms “fairness notion” and “fairness measure” interchangeably in this review.
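As an illustration of the Shannon equation mentioned in Note 7, the following minimal sketch computes the entropy H = -Σ p_i log2(p_i) of a list of discrete values. The notes do not say over which distribution the confidence entropy is taken, so the function name `shannon_entropy`, the use of base-2 logarithms, and the idea of passing a student's recorded confidence levels as a list are all assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`.

    Hypothetical helper: `values` is assumed to be a sequence of discrete
    labels, e.g. the confidence levels one student reported across items.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # convention chosen here for empty input
    entropy = 0.0
    for count in counts.values():
        p = count / total          # empirical probability of this label
        entropy -= p * math.log2(p)
    return entropy


# Example: a student who always reports the same confidence has zero entropy,
# while one who splits evenly between two levels has one bit of entropy.
steady = shannon_entropy(["high", "high", "high", "high"])
mixed = shannon_entropy(["high", "low", "high", "low"])
```

A higher value indicates that the reported confidence levels are spread more evenly across the available options.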
Abbreviations
- ACC: Clustering accuracy
- AI: Artificial intelligence
- AIED: Artificial intelligence in education
- ANOVA: Analysis of variance
- ARI: Adjusted Rand index
- BIRCH: Balanced iterative reducing and clustering using hierarchies
- BMI: Body mass index
- BMU: Best matching unit
- CFSFDP-HD: Clustering by fast search and finding of density peaks via heat diffusion
- CHI: Calinski–Harabasz index
- CLARA: Clustering in LARge Applications
- CLARANS: Clustering Large Applications based on RANdomized Search
- CORE: Computing Research and Education Association of Australasia
- DBI: Davies–Bouldin index
- DBLP: Database systems and logic programming
- DBSCAN: Density-based spatial clustering of applications with noise
- DI: Dunn index
- DP: Dirichlet process
- EDM: Educational data mining
- EDS: Educational data science
- EM: Expectation–maximization
- EMT: Ensemble meta-based tree
- FCM: Fuzzy c-means
- FIE: Frontiers in Education
- ICALT: International Conference on Advanced Learning Technologies
- ITS: Intelligent tutoring system
- KPCA: Kernel-based principal component analysis
- LA: Learning analytics
- LD: Learning design
- LMS: Learning management system
- MIT: Massachusetts Institute of Technology
- ML: Machine learning
- MOOC: Massive open online course
- NMI: Normalized mutual information
- OPTICS: Ordering points to identify the clustering structure
- OULAD: Open University Learning Analytics dataset
- PAM: Partitioning around medoids
- PISA: Program for International Student Assessment
- RQ: Research question
- SJR: Scimago Journal & Country Rank
- SOM: Self-organizing map
- SSE: Sum of squared errors
- SVM: Support vector machine
References
Dorans, N.J., Cook, L.L.: Fairness in Educational Assessment and Measurement. Routledge, New York (2016)
Zlatkin-Troitschanskaia, O., Schlax, J., Jitomirski, J., Happ, R., Kühling-Thees, C., Brückner, S., Pant, H.: Ethics and fairness in assessing learning outcomes in higher education. High Educ. Pol. 32(4), 537–556 (2019). https://doi.org/10.1057/s41307-019-00149-x
Ford, M., Morice, J.: How fair are group assignments? A survey of students and faculty and a modest proposal. J. Inform. Technol. Educ. Res. 2(1), 367–378 (2003)
Miles, J.A., Klein, H.J.: The fairness of assigning group members to tasks. Group Org. Manag. 23(1), 71–96 (1998). https://doi.org/10.1177/1059601198231005
Rezaeinia, N., Góez, J.C., Guajardo, M.: Efficiency and fairness criteria in the assignment of students to projects. Ann. Oper. Res., 1–19 (2021). https://doi.org/10.1007/s10479-021-04001-7
Song, X.: The fairness of a graduate school admission test in China: voices from administrators, teachers, and test-takers. Asia Pac. Educ. Res. 27(2), 79–89 (2018). https://doi.org/10.1007/s40299-018-0367-4
Xiao, W., Ji, P., Hu, J.: A survey on educational data mining methods used for predicting students’ performance. Eng. Rep. (2021). https://doi.org/10.1002/eng2.12482
Meyer, K.: Education, Justice and the Human Good: Fairness and Equality in the Education System. Routledge, London (2014)
McFarland, D.A., Khanna, S., Domingue, B.W., Pardos, Z.A.: Education data science: past, present, future. AERA Open. 7 (2021). https://doi.org/10.1177/23328584211052055
Romero, C., Ventura, S.: Educational data science in massive open online courses. Wiley Interdisc. Rev. Data Min. Know. Discov. 7(1), e1187 (2017). https://doi.org/10.1002/widm.1187
Dutt, A., Ismail, M.A., Herawan, T.: A systematic review on educational data mining. IEEE Access. 5, 15991–16005 (2017). https://doi.org/10.1109/ACCESS.2017.2654247
Peña-Ayala, A.: Educational data mining: a survey and a data mining-based analysis of recent works. Expert Syst. Appl. 41(4), 1432–1462 (2014). https://doi.org/10.1016/j.eswa.2013.08.042
Romero, C., Ventura, S.: Educational data mining and learning analytics: an updated survey. Wiley Interdisc. Rev. Data Min. Know. Discov. 10(3), e1355 (2020). https://doi.org/10.1002/widm.1355
Del Bonifro, F., Gabbrielli, M., Lisanti, G., Zingaro, S.P.: Student dropout prediction. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 129–140 (2020). https://doi.org/10.1007/978-3-030-52237-7_11
Kemper, L., Vorhoff, G., Wigger, B.U.: Predicting student dropout: a machine learning approach. Eur. J. High. Educ. 10(1), 28–47 (2020). https://doi.org/10.1080/21568235.2020.1718520
Hutt, S., Gardner, M., Duckworth, A.L., D’Mello, S.K.: Evaluating fairness and generalizability in models predicting on-time graduation from college applications. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM), pp. 79–88 (2019)
Livieris, I.E., Tampakas, V., Karacapilidis, N., Pintelas, P.: A semi-supervised self-trained two-level algorithm for forecasting students’ graduation time. Intel. Decis. Technol. 13(3), 367–378 (2019). https://doi.org/10.3233/IDT-180136
Fenu, G., Galici, R., Marras, M.: Experts’ view on challenges and needs for fairness in artificial intelligence for education. In: International Conference on Artificial Intelligence in Education, pp. 243–255. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-11644-5_20
Vasquez Verdugo, J., Gitiaux, X., Ortega, C., Rangwala, H.: FairEd: a systematic fairness analysis approach applied in a higher educational context. In: LAK22: 12th International Learning Analytics and Knowledge Conference, pp. 271–281 (Mar 2022). https://doi.org/10.1145/3506860.3506902
Ntoutsi, E., et al.: Bias in data-driven artificial intelligence systems—an introductory survey. Wiley Interdisc. Rev. Data Mining Know. Discov. 10(3), e1356 (2020). https://doi.org/10.1002/widm.1356
Le Quy, T., Roy, A., Iosifidis, V., Zhang, W., Ntoutsi, E.: A survey on datasets for fairness-aware machine learning. Wiley Interdisc. Rev. Data Min. Know. Discov., e1452 (2022). https://doi.org/10.1002/widm.1452
Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., Galstyan, A.: A survey on bias and fairness in machine learning. ACM Comput. Surv. (CSUR). 54(6), 1–35 (2021). https://doi.org/10.1145/3457607
Bayer, V., Hlosta, M., Fernandez, M.: Learning analytics and fairness: do existing algorithms serve everyone equally? In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 71–75 (2021). https://doi.org/10.1007/978-3-030-78270-2_12
Gardner, J., Brooks, C., Baker, R.: Evaluating the fairness of predictive student models through slicing analysis. In: Proceedings of the 9th International Conference on Learning Analytics & Knowledge, pp. 225–234 (2019). https://doi.org/10.1145/3303772.3303791
Riazy, S., Simbeck, K., Schreck, V.: Systematic literature review of fairness in learning analytics and application of insights in a case study. In: Proceedings of the International Conference on Computer Supported Education, pp. 430–449 (2020). https://doi.org/10.1007/978-3-030-86439-2_22
Baker, R.S., Hawn, A.: Algorithmic bias in education. Int. J. Artif. Intell. Educ., 1–41 (2021). https://doi.org/10.1007/s40593-021-00285-9
Kizilcec, R.F., Lee, H.: Algorithmic fairness in education. In: Ethics in Artificial Intelligence in Education (2022)
Liu, S., d’Aquin, M.: Unsupervised learning for understanding student achievement in a distance learning setting. In: Proceedings of the IEEE Global Engineering Education Conference (EDUCON), pp. 1373–1377 (2017). https://doi.org/10.1109/EDUCON.2017.7943026
Zhang, N., Biswas, G., Dong, Y.: Characterizing students’ learning behaviors using unsupervised learning methods. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 430–441 (2017). https://doi.org/10.1007/978-3-319-61425-0_36
Le Quy, T., Roy, A., Friege, G., Ntoutsi, E.: Fair-capacitated clustering. In: Proceedings of the 14th International Conference on Educational Data Mining (EDM21), pp. 407–414 (2021)
Chang, W., Ji, X., Liu, Y., Xiao, Y., Chen, B., Liu, H., Zhou, S.: Analysis of university students’ behavior based on a fusion k-means clustering algorithm. Appl. Sci. 10(18), 6566 (2020). https://doi.org/10.3390/app10186566
Fang, Y., et al.: Clustering the learning patterns of adults with low literacy skills interacting with an intelligent tutoring system. In: Proceedings of the 11th International Conference on Educational Data Mining (EDM), pp. 348–354. ERIC (2018)
Mai, T.T., Bezbradica, M., Crane, M.: Learning behaviours data in programming education: community analysis and outcome prediction with cleaned data. Futur. Gener. Comput. Syst. 127, 42–55 (2022). https://doi.org/10.1016/j.future.2021.08.026
Varela, N., et al.: Student performance assessment using clustering techniques. In: Proceedings of the International Conference on Data Mining and Big Data, pp. 179–188 (2019). https://doi.org/10.1007/978-981-32-9563-6_19
Zhang, S., Shen, M., Yu, Y.: Research on student big data portrait method based on improved k-means algorithm. In: Proceedings of the 3rd International Academic Exchange Conference on Science and Technology Innovation (IAECST), pp. 146–150 (2021). https://doi.org/10.1109/IAECST54258.2021.9695501
Ding, D., Li, J., Wang, H., Liang, Z.: Student behavior clustering method based on campus big data. In: Proceedings of the 13th International Conference on Computational Intelligence and Security (CIS), pp. 500–503 (2017). https://doi.org/10.1109/CIS.2017.00116
Waspada, I., Bahtiar, N., Wibowo, A.: Clustering student behavior based on quiz activities on Moodle LMS to discover the relation with a final exam score. J. Phys. Conf. Ser. 1217, 012118 (2019). https://doi.org/10.1088/1742-6596/1217/1/012118
Esnashari, S., Gardner, L., Watters, P.: Clustering student participation: implications for education. In: Proceedings of the 32nd International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 313–318 (2018). https://doi.org/10.1109/WAINA.2018.00104
Jia, L., Cheng, H.N., Liu, S., Chang, W.C., Chen, Y., Sun, J.: Integrating clustering and sequential analysis to explore students’ behaviors in an online Chinese reading assessment system. In: Proceedings of the 6th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), pp. 719–724 (2017). https://doi.org/10.1109/IIAI-AAI.2017.55
Howlin, C.P., Dziuban, C.D.: Detecting outlier behaviors in student progress trajectories using a repeated fuzzy clustering approach. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM), pp. 742–747 (2019)
McBroom, J., Yacef, K., Koprinska, I.: DETECT: a hierarchical clustering algorithm for behavioural trends in temporal educational data. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 374–385 (2020). https://doi.org/10.1007/978-3-030-52237-7_30
Shen, S., Chi, M.: Clustering student sequential trajectories using dynamic time warping. In: Proceedings of the 10th International Conference on Educational Data Mining (EDM), pp. 266–271 (2017)
Ruipérez-Valiente, J.A., Muñoz-Merino, P.J., Delgado Kloos, C., et al.: Detecting and clustering students by their gamification behavior with badges: a case study in engineering education. Int. J. Eng. Educ. 33(2-B), 816–830 (2017)
López, S.L.S., Redondo, R.P.D., Vilas, A.F.: Discovering knowledge from student interactions: clustering vs classification. In: Proceedings of the 5th International Conference on Technological Ecosystems for Enhancing Multiculturality, pp. 1–8 (2017). https://doi.org/10.1145/3144826.3145390
Mengoni, P., Milani, A., Li, Y.: Clustering students interactions in e-learning systems for group elicitation. In: Proceedings of the International Conference on Computational Science and Its Applications, pp. 398–413. Springer (2018). https://doi.org/10.1007/978-3-319-95168-3_27
Orji, F., Vassileva, J.: Using machine learning to explore the relation between student engagement and student performance. In: Proceedings of the 24th International Conference Information Visualisation (IV), pp. 480–485. IEEE (2020). https://doi.org/10.1109/IV51561.2020.00083
Güvenç, E., Çetin, G.: Clustering of participation degrees of distance learning students to course activity by using fuzzy c-means algorithm. In: Proceedings of the 26th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2018). https://doi.org/10.1109/SIU.2018.8404292
Khalil, M., Ebner, M.: Clustering patterns of engagement in massive open online courses (MOOCs): the use of learning analytics to reveal student categories. J. Comput. High. Educ. 29(1), 114–132 (2017). https://doi.org/10.1007/s12528-016-9126-9
Oladipupo, O.O., Olugbara, O.O.: Evaluation of data analytics based clustering algorithms for knowledge mining in a student engagement data. Intell. Data Anal. 23(5), 1055–1071 (2019). https://doi.org/10.3233/IDA-184254
Palani, K., Stynes, P., Pathak, P.: Clustering techniques to identify low-engagement student levels. In: Proceedings of the 13th International Conference on Computer Supported Education (CSEDU), pp. 248–257 (2021). https://doi.org/10.5220/0010456802480257
Roy, D., Bermel, P., Douglas, K.A., Diefes-Dux, H.A., Richey, M., Madhavan, K., Shah, S.: Synthesis of clustering techniques in educational data mining. In: Proceedings of the ASEE Annual Conference & Exposition (2017)
Huang, J.B., Huang, A.Y., Lu, O.H., Yang, S.J.: Exploring learning strategies by sequence clustering and analysing their correlation with student’s engagement and learning outcome. In: Proceedings of the International Conference on Advanced Learning Technologies (ICALT), pp. 360–362. IEEE (2021). https://doi.org/10.1109/ICALT52272.2021.00115
Moubayed, A., Injadat, M., Shami, A., Lutfiyya, H.: Student engagement level in an e-learning environment: clustering using k-means. Am. J. Dist. Educ. 34(2), 137–156 (2020). https://doi.org/10.1080/08923647.2020.1696140
Hartnett, M.: The importance of motivation in online learning. In: Motivation in Online Education, pp. 5–32. Springer (2016). https://doi.org/10.1007/978-981-10-0700-2_2
Nen-Fu, H., et al.: The clustering analysis system based on students’ motivation and learning behavior. In: Proceedings of the Learning with MOOCS (LWMOOCS), pp. 117–119 (2018). https://doi.org/10.1109/LWMOOCS.2018.8534611
Gunawan, I., et al.: Hidden curriculum and character building on self-motivation based on k-means clustering. In: Proceedings of the 4th International Conference on Education and Technology (ICET), pp. 32–35 (2018). https://doi.org/10.1109/ICEAT.2018.8693931
Wang, Z., Wang, J.: Analysis of emotional education infiltration in college physical education based on emotional feature clustering. Wirel. Commun. Mob. Comput. 2022 (2022). https://doi.org/10.1155/2022/7857522
Ashkanasy, N.M.: Emotion and performance. Human Perform. 17(2), 137–144 (2004). https://doi.org/10.1207/s15327043hup1702_1
Muñoz-Merino, P.J., Molina, M.F., Muñoz-Organero, M., Kloos, C.D.: Motivation and emotions in competition systems for education: an empirical study. IEEE Trans. Educ. 57(3), 182–187 (2014). https://doi.org/10.1109/TE.2013.2297318
Guo, H., Wang, M.: Analysis on the penetration of emotional education in college physical education based on emotional feature clustering. Sci. Program. 2022 (2022). https://doi.org/10.1155/2022/2389453
Salwana, E., Hamid, S., Yasin, N.M.: Student academic streaming using clustering technique. Malays. J. Comput. Sci. 30(4), 286–299 (2017). https://doi.org/10.22452/mjcs.vol30no4.2
Thilagaraj, T., Sengottaiyan, N.: Implementation of fuzzy clustering algorithms to analyze students performance using R-tool. In: Intelligent Computing and Innovation on Data Science, pp. 287–294. Springer, Berlin (2020). https://doi.org/10.1007/978-981-15-3284-9_31
Vo, C.T.N., Nguyen, P.H.: A weighted object-cluster association-based ensemble method for clustering undergraduate students. In: Proceedings of the Asian Conference on Intelligent Information and Database Systems (ACIIDS), pp. 587–598 (2018). https://doi.org/10.1007/978-3-319-75417-8_55
Bharara, S., Sabitha, S., Bansal, A.: Application of learning analytics using clustering data mining for students’ disposition analysis. Educ. Inf. Technol. 23(2), 957–984 (2018). https://doi.org/10.1007/s10639-017-9645-7
Yin, X.: Construction of student information management system based on data mining and clustering algorithm. Complexity. 2021 (2021). https://doi.org/10.1155/2021/4447045
Hooshyar, D., Pedaste, M., Yang, Y.: Mining educational data to predict students’ performance through procrastination behavior. Entropy. 22(1), 12 (2019). https://doi.org/10.3390/e22010012
Park, J., Yu, R., Rodriguez, F., Baker, R., Smyth, P., Warschauer, M.: Understanding student procrastination via mixture models. In: Proceedings of the 11th International Conference on Educational Data Mining (EDM), pp. 187–197 (2018)
Preetha, V.: Data analysis on student’s performance based on health status using genetic algorithm and clustering algorithms. In: Proceedings of the 5th International Conference on Computing Methodologies and Communication (ICCMC), pp. 836–842 (2021). https://doi.org/10.1109/ICCMC51019.2021.9418235
Aghababyan, A., Lewkow, N., Baker, R.S.: Enhancing the clustering of student performance using the variation in confidence. In: Proceedings of the International Conference on Intelligent Tutoring Systems, pp. 274–279 (2018). https://doi.org/10.1007/978-3-319-91464-0_27
Effenberger, T., Pelánek, R.: Interpretable clustering of students’ solutions in introductory programming. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 101–112 (2021). https://doi.org/10.1007/978-3-030-78292-4_9
Gao, L., Wan, B., Fang, C., Li, Y., Chen, C.: Automatic clustering of different solutions to programming assignments in computing education. In: Proceedings of the ACM Conference on Global Computing Education, pp. 164–170 (2019). https://doi.org/10.1145/3300115.3309515
Chang, L.H., Rastas, I., Pyysalo, S., Ginter, F.: Deep learning for sentence clustering in essay grading support. In: The 14th International Conference on Educational Data Mining (EDM) (2021)
Sobral, S.R., de Oliveira, C.F.: Clustering algorithm to measure student assessment accuracy: a double study. Big Data Cognit. Comput. 5(4), 81 (2021). https://doi.org/10.3390/bdcc5040081
Khan, A., Ghosh, S.K.: Student performance analysis and prediction in classroom learning: a review of educational data mining studies. Educ. Inf. Technol. 26(1), 205–240 (2021). https://doi.org/10.1007/s10639-020-10230-3
Adjei, S., Ostrow, K., Erickson, E., Heffernan, N.T.: Clustering students in ASSISTments: exploring system- and school-level traits to advance personalization. In: Proceedings of the 10th International Conference on Educational Data Mining (EDM), pp. 340–341 (2017)
Ramanathan, L., Parthasarathy, G., Vijayakumar, K., Lakshmanan, L., Ramani, S.: Cluster-based distributed architecture for prediction of student’s performance in higher education. Clust. Comput. 22(1), 1329–1344 (2019). https://doi.org/10.1007/s10586-017-1624-7
Hassan, Y.M., Elkorany, A., Wassif, K.: Utilizing social clustering-based regression model for predicting student’s GPA. IEEE Access. 10, 48948–48963 (2022). https://doi.org/10.1109/ACCESS.2022.3172438
Casalino, G., Castellano, G., Mencar, C.: Incremental and adaptive fuzzy clustering for virtual learning environments data analysis. In: Proceedings of the 23rd International Conference Information Visualisation (IV), pp. 382–387 (2019). https://doi.org/10.1109/IV.2019.00071
Almasri, A., Alkhawaldeh, R.S., Çelebi, E.: Clustering-based EMT model for predicting student performance. Arab. J. Sci. Eng. 45(12), 10067–10078 (2020). https://doi.org/10.1007/s13369-020-04578-4
Iatrellis, O., Savvas, I.K., Fitsilis, P., Gerogiannis, V.C.: A two-phase machine learning approach for predicting student outcomes. Educ. Inf. Technol. 26(1), 69–88 (2021). https://doi.org/10.1007/s10639-020-10260-x
Francis, B.K., Babu, S.S.: Predicting academic performance of students using a hybrid data mining approach. J. Med. Syst. 43(6), 1–15 (2019). https://doi.org/10.1007/s10916-019-1295-4
Chu, Y.W., Tenorio, E., Cruz, L., Douglas, K., Lan, A.S., Brinton, C.G.: Click-based student performance prediction: a clustering guided meta-learning approach. In: Proceedings of the IEEE International Conference on Big Data (BigData), pp. 1389–1398 (2021). https://doi.org/10.1109/BigData52589.2021.9671729
Iam-On, N., Boongoen, T.: Generating descriptive model for student dropout: a review of clustering approach. HCIS. 7(1), 1–24 (2017). https://doi.org/10.1186/s13673-016-0083-0
Iam-On, N., Boongoen, T.: Improved student dropout prediction in Thai university using ensemble of mixed-type data clusterings. Int. J. Mach. Learn. Cybern. 8(2), 497–510 (2017). https://doi.org/10.1007/s13042-015-0341-x
Purba, W., Tamba, S., Saragih, J.: The effect of mining data k-means clustering toward students profile model drop out potential. J. Phys. Conf. Ser. 1007, 012049 (2018). https://doi.org/10.1088/1742-6596/1007/1/012049
Hung, J.-L., Wang, M.C., Wang, S., Abdelrasoul, M., Li, Y., He, W.: Identifying at-risk students for early interventions—a time-series clustering approach. IEEE Trans. Emerg. Top. Comput. 5(1), 45–55 (2017). https://doi.org/10.1109/TETC.2015.2504239
Nguyen, P., Vo, C.: Early in-trouble student identification based on temporal educational data clustering. In: Proceedings of the International Conference on Information Technology (ICIT), pp. 313–318 (2019). https://doi.org/10.1109/ICIT48102.2019.00062
Yotaman, N., Osathanunkul, K., Khoenkaw, P., Pramokchon, P.: Teaching support system by clustering students according to learning styles. In: Proceedings of the Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON), pp. 137–140 (2020). https://doi.org/10.1109/ECTIDAMTNCON48261.2020.9090729
Khayi, N.A., Rus, V.: Clustering students based on their prior knowledge. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM), pp. 246–251 (2019)
Qoiriah, A., et al.: Application of k-means algorithm for clustering student’s computer programming performance in automatic programming assessment tool. In: Proceedings of the International Joint Conference on Science and Engineering (IJCSE 2020), pp. 421–425 (2020). https://doi.org/10.2991/aer.k.201124.075
Silva, D.B., Silla, C.N.: Evaluation of students programming skills on a computer programming course with a hierarchical clustering algorithm. In: Proceedings of the IEEE Frontiers in Education Conference (FIE), pp. 1–9 (2020). https://doi.org/10.1109/FIE44824.2020.9274130
Urbina Nájera, A.B., De La Calleja, J., Medina, M.A.: Associating students and teachers for tutoring in higher education using clustering and data mining. Comput. Appl. Eng. Educ. 25(5), 823–832 (2017). https://doi.org/10.1002/cae.21839
Chang, M.H., Kuo, R., Essalmi, F., Chang, M., Kumar, V., Kung, H.Y.: Usability evaluation plan for online annotation and student clustering system—a tunisian university case. In: Proceedings of the International Conference on Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management, pp. 241–254 (2017). https://doi.org/10.1007/978-3-319-58463-8_21
Kylvaja, M., Kumpulainen, P., Konu, A.: Application of data clustering for automated feedback generation about student well-being. In: Proceedings of the 1st ACM SIGSOFT International Workshop on Education Through Advanced Software Engineering and Artificial Intelligence, pp. 21–26 (2019). https://doi.org/10.1145/3340435.3342720
Li, Y., Sun, X.: Data analysis and feedback system construction of university students’ psychological fitness based on fuzzy clustering. Wirel. Commun. Mob. Comput. 2022 (2022). https://doi.org/10.1155/2022/6019803
Gulwani, S., Radiček, I., Zuleger, F.: Automated clustering and program repair for introductory programming assignments. ACM SIGPLAN Not. 53(4), 465–480 (2018). https://doi.org/10.1145/3296979.3192387
Masala, M., Ruseti, S., Dascalu, M., Dobre, C.: Extracting and clustering main ideas from student feedback using language models. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 282–292 (2021). https://doi.org/10.1007/978-3-030-78292-4_23
Guo, Y., Chen, Y., Xie, Y., Ban, X.: An effective student grouping and course recommendation strategy based on big data in education. Information. 13(4), 197 (2022). https://doi.org/10.3390/info13040197
Wang, M., Lv, Z.: Construction of personalized learning and knowledge system of chemistry specialty via the internet of things and clustering algorithm. J. Supercomput. 78(8), 10997–11014 (2022). https://doi.org/10.1007/s11227-022-04315-8
Liu, H., Ding, J., Yang, L.T., Guo, Y., Wang, X., Deng, A.: Multi-dimensional correlative recommendation and adaptive clustering via incremental tensor decomposition for sustainable smart education. IEEE Trans. Sustainable Comput. 5(3), 389–402 (2019). https://doi.org/10.1109/TSUSC.2019.2954456
Fasanya, B.K., Fathizadeh, M.: Clustering from grouping: a key to enhance students’ classroom active engagement. In: Proceedings of the ASEE Annual Conference & Exposition (2019). https://doi.org/10.18260/1-2-32511
Wu, Y., Nouri, J., Li, X., Weegar, R., Afzaal, M., Zia, A.: A word embeddings based clustering approach for collaborative learning group formation. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 395–400 (2021). https://doi.org/10.1007/978-3-030-78270-2_70
Pratiwi, O.N., Rahardjo, B., Supangkat, S.H.: Clustering multiple mix data type for automatic grouping of student system. In: Proceedings of the International Conference on Information Technology Systems and Innovation (ICITSI), pp. 172–176 (2017). https://doi.org/10.1109/ICITSI.2017.8267938
Shelly, Z., Burch, R.F., Tian, W., Strawderman, L., Piroli, A., Bichey, C.: Using k-means clustering to create training groups for elite American football student-athletes based on game demands. Int. J. Kinesiol. Sports Sci. 8(2), 47–63 (2020). https://doi.org/10.7575//aiac.ijkss.v.8n.2p.47
Akbar, S., Gehringer, E., Hu, Z.: Poster: improving formation of student teams: a clustering approach. In: Proceedings of the IEEE/ACM 40th International Conference on Software Engineering: Companion (ICSE-Companion), pp. 147–148 (2018)
Wang, Y., Wang, Q.: A student grouping method for massive online collaborative learning. Int. J. Emerg. Technol. Learn. 17(3), 18–33 (2022). https://doi.org/10.3991/ijet.v17i03.29429
Yang, Y.: Evaluation model and application of college students’ physical fitness based on clustering extraction algorithm. In: Proceedings of the 4th International Conference on Information Systems and Computer Aided Education, pp. 547–552 (2021). https://doi.org/10.1145/3482632.3482748
Dovgan, E., Leskošek, B., Jurak, G., Starc, G., Sorić, M., Luštrek, M.: Enhancing BMI-based student clustering by considering fitness as key attribute. In: Proceedings of the International Conference on Discovery Science, pp. 155–165 (2019). https://doi.org/10.1007/978-3-030-33778-0_13
Natilli, M., Monreale, A., Guidotti, R., Pappalardo, L.: Exploring students eating habits through individual profiling and clustering analysis. In: Proceedings of the MIDAS/PAP@PKDD/ECML 2018, pp. 156–171 (2018). https://doi.org/10.1007/978-3-030-13463-1_12
Chu, Y., Yin, X.: Data analysis of college students’ mental health based on clustering analysis algorithm. Complexity. 2021 (2021). https://doi.org/10.1155/2021/9996146
Li, Y., Liu, C., Zhao, X.: Research on the integration of college students’ mental health education and career planning based on feature fuzzy clustering. In: Proceedings of the 4th International Conference on Information Systems and Computer Aided Education, pp. 56–59 (2021). https://doi.org/10.1145/3482632.3482644
Wang, C., Zha, Q.: Measuring systemic diversity of Chinese universities: a clustering-method approach. Qual. Quant. 52(3), 1331–1347 (2018). https://doi.org/10.1007/s11135-017-0524-5
Nazaretsky, T., Hershkovitz, S., Alexandron, G.: Kappa learning: a new item-similarity method for clustering educational items from response data. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM), pp. 129–138 (2019)
Huang, L., Wang, X., Wu, Z., Wang, F.: Feature selection for clustering online learners. In: Proceedings of the 8th International Conference on Educational Innovation Through Technology (EITT), pp. 1–6 (2019). https://doi.org/10.1109/EITT.2019.00009
Liu, F.: Design and implementation of intelligent educational administration system using fuzzy clustering algorithm. Sci. Program. 2021 (2021). https://doi.org/10.1155/2021/9485654
Rahmat, A.: Clustering in education. Eur. Res. Stud. J. 20(3) (2017)
Ahmed, A., Zualkernan, I., Elghazaly, H.: Unsupervised clustering of skills for an online learning platform. In: Proceedings of the International Conference on Advanced Learning Technologies (ICALT), pp. 200–202 (2021). https://doi.org/10.1109/ICALT52272.2021.00066
Pamungkas, A.A.P., Maryono, D., Budiyanto, C.W.: Cluster analysis for student grouping based on index of learning styles. J. Phys. Conf. Ser. 1808, 012023 (2021). https://doi.org/10.1088/1742-6596/1808/1/012023
Du, H., Chen, S., Niu, H., Li, Y.: Application of DBSCAN clustering algorithm in evaluating students’ learning status. In: Proceedings of the 17th International Conference on Computational Intelligence and Security (CIS), pp. 372–376 (2021). https://doi.org/10.1109/CIS54983.2021.00084
MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Math., Stat., and Prob., p. 281 (1965). http://projecteuclid.org/euclid.bsmsp/1200512992
Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory. 28(2), 129–137 (1982). https://doi.org/10.1109/TIT.1982.1056489
Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall, Inc., Hoboken (1988). https://doi.org/10.1080/00401706.1990.10484648
Pelleg, D., Moore, A.W., et al.: X-means: extending k-means with efficient estimation of the number of clusters. In: Proceedings of the International Conference on Machine Learning (ICML), vol. 1, pp. 727–734 (2000)
Li, X., Zhang, Y., Cheng, H., Zhou, F., Yin, B.: An unsupervised ensemble clustering approach for the analysis of student behavioral patterns. IEEE Access. 9, 7076–7091 (2021). https://doi.org/10.1109/ACCESS.2021.3049157
Zhang, T., Yin, C., Pan, L.: Improved clustering and association rules mining for university student course scores. In: Proceedings of the 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), pp. 1–6 (2017). https://doi.org/10.1109/ISKE.2017.8258808
Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, New York (1990). https://doi.org/10.1002/9780470316801
Schubert, E., Rousseeuw, P.J.: Fast and eager k-medoids clustering: O(k) runtime improvement of the PAM, CLARA, and CLARANS algorithms. Inf. Syst. 101, 101804 (2021). https://doi.org/10.1016/j.is.2021.101804
Vasuki, M., Revathy, S.: Analyzing performance of placement students record using different clustering algorithm. Indian J. Comput. Sci. Eng. 13(2), 410–419 (2022). https://doi.org/10.21817/indjcse/2022/v13i2/221302083
Furr, D.: Visualization and clustering of learner pathways in an interactive online learning environment. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM) (2019)
Kausar, S., Huahu, X., Hussain, I., Wenhao, Z., Zahid, M.: Integration of data mining clustering approach in the personalized e-learning system. IEEE Access. 6, 72724–72734 (2018). https://doi.org/10.1109/ACCESS.2018.2882240
Patel, S., Sihmar, S., Jatain, A.: A study of hierarchical clustering algorithms. In: Proceedings of the 2nd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 537–541 (2015)
Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: an efficient data clustering method for very large databases. ACM SIGMOD Rec. 25(2), 103–114 (1996). https://doi.org/10.1145/235968.233324
Ward Jr., J.H.: Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58(301), 236–244 (1963). https://doi.org/10.1080/01621459.1963.10500845
Li, S., Chen, G., Xing, W., Zheng, J., Xie, C.: Longitudinal clustering of students’ self-regulated learning behaviors in engineering design. Comput. Educ. 153, 103899 (2020). https://doi.org/10.1016/j.compedu.2020.103899
Zhang, T., Taub, M., Chen, Z.: A multi-level trace clustering analysis scheme for measuring students’ self-regulated learning behavior in a mastery-based online learning environment. In: Proceedings of the 12th International Learning Analytics and Knowledge Conference (LAK), pp. 197–207 (2022). https://doi.org/10.1145/3506860.3506887
Dunn, J.C.: A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J. Cybernet. 3(3), 32–57 (1973). https://doi.org/10.1080/01969727308546046
Zhang, P., Shen, Q.: Fuzzy c-means based coincidental link filtering in support of inferring social networks from spatiotemporal data streams. Soft. Comput. 22(21), 7015–7025 (2018). https://doi.org/10.1007/s00500-018-3363-y
Tang, Q., Zhao, Y., Wei, Y., Jiang, L.: Research on the mental health of college students based on fuzzy clustering algorithm. Secur. Commun. Net. 2021 (2021). https://doi.org/10.1155/2021/3960559
Amalia, N., et al.: Determination system of single tuition group using a combination of fuzzy c-means clustering and simple additive weighting methods. In: IOP Conference Series: Materials Science and Engineering, vol. 536, p. 012148 (2019). https://doi.org/10.1088/1757-899X/536/1/012148
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B. 39(1), 1–22 (1977). https://doi.org/10.1111/j.2517-6161.1977.tb01600.x
Jin, X., Han, J.: Expectation maximization clustering. In: Sammut, C., Webb, G.I. (eds.) Encyclopedia of Machine Learning. Springer US, Boston, MA (2010). https://doi.org/10.1007/978-0-387-30164-8_289
Tan, P.N., Steinbach, M., Kumar, V.: Introduction to Data Mining. Pearson Education India (2016)
Kohonen, T.: Self-organized formation of topologically correct feature maps. Biol. Cybern. 43(1), 59–69 (1982). https://doi.org/10.1007/BF00337288
Bação, F., Lobo, V., Painho, M.: Self-organizing maps as substitutes for k-means clustering. In: Proceedings of the International Conference on Computational Science, pp. 476–483 (2005). https://doi.org/10.1007/11428862_65
Natita, W., Wiboonsak, W., Dusadee, S.: Appropriate learning rate and neighborhood function of self-organizing map (SOM) for specific humidity pattern classification over southern Thailand. Int. J. Model. Optimiz. 6(1), 61 (2016). https://doi.org/10.7763/IJMO.2016.V6.504
Melka, J., Mariage, J.J.: Efficient implementation of self-organizing map for sparse input data. In: International Joint Conference on Computational Intelligence (IJCCI), pp. 54–63 (2017). https://doi.org/10.5220/0006499500540063
Delgado, S., Morán, F., San José, J.C., Burgos, D.: Analysis of students’ behavior through user clustering in online learning settings, based on self organizing maps neural networks. IEEE Access. 9, 132592–132608 (2021). https://doi.org/10.1109/ACCESS.2021.3115024
Tasdemir, K., Merényi, E.: A validity index for prototype-based clustering of data sets with complex cluster structures. IEEE Trans. Syst. Man Cybern. B Cybern. 41(4), 1039–1053 (2011). https://doi.org/10.1109/TSMCB.2010.2104319
Alias, U.F., Ahmad, N.B., Hasan, S.: Mining of e-learning behavior using SOM clustering. In: Proceedings of the 6th ICT International Student Project Conference (ICT-ISPC), pp. 1–4 (2017). https://doi.org/10.1109/ICT-ISPC.2017.8075350
Bara, M.W., Ahmad, N.B., Modu, M.M., Ali, H.A.: Self-organizing map clustering method for the analysis of e-learning activities. In: Majan International Conference (MIC), pp. 1–5 (2018). https://doi.org/10.1109/MINTC.2018.8363155
Ahmad, N.B., Alias, U.F., Mohamad, N., Yusof, N.: Principal component analysis and self-organizing map clustering for student browsing behaviour analysis. Procedia Comput. Sci. 163, 550–559 (2019). https://doi.org/10.1016/j.procs.2019.12.137
Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: analysis and an algorithm. Adv. Neural Inf. Proces. Syst. 14 (2001)
Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000). https://doi.org/10.1109/34.868688
Yan, D., Huang, L., Jordan, M.I.: Fast approximate spectral clustering. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 907–916 (2009)
Ester, M., Kriegel, H.P., Sander, J., Xu, X., et al.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), vol. 96, pp. 226–231 (1996)
Chhabra, A., Masalkovaite, K., Mohapatra, P.: An overview of fairness in clustering. IEEE Access. (2021). https://doi.org/10.1109/ACCESS.2021.3114099
Žliobaitė, I.: Measuring discrimination in algorithmic decision making. Data Min. Knowl. Disc. 31(4), 1060–1089 (2017). https://doi.org/10.1007/s10618-017-0506-1
Chierichetti, F., Kumar, R., Lattanzi, S., Vassilvitskii, S.: Fair clustering through fairlets. In: Neural Information Processing Systems, pp. 5036–5044 (2017)
Ahmadian, S., Epasto, A., Kumar, R., Mahdian, M.: Clustering without over-representation. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 267–275 (2019). https://doi.org/10.1145/3292500.3330987
Bera, S., Chakrabarty, D., Flores, N., Negahbani, M.: Fair algorithms for clustering. In: Proceedings of the Neural Information Processing Systems Conference (NIPS 2019), vol. 32 (2019)
Ghadiri, M., Samadi, S., Vempala, S.: Socially fair k-means clustering. In: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT), pp. 438–448 (2021). https://doi.org/10.1145/3442188.3445906
Chakrabarti, D., Dickerson, J.P., Esmaeili, S.A., Srinivasan, A., Tsepenekas, L.: A new notion of individually fair clustering: 𝛼-equitable 𝑘-center. In: Proceedings of the International Conference on Artificial Intelligence and Statistics, pp. 6387–6408 (2022)
Jones, M., Nguyen, H., Nguyen, T.: Fair k-centers via maximum matching. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 4940–4949 (2020)
Schmidt, M., Schwiegelshohn, C., Sohler, C.: Fair coresets and streaming algorithms for fair k-means. In: Proceedings of the International Workshop on Approximation and Online Algorithms, pp. 232–251 (2019). https://doi.org/10.1007/978-3-030-39479-0_16
Abraham, S.S., Padmanabhan, D., Sundaram, S.S.: Fairness in clustering with multiple sensitive attributes. In: EDBT/ICDT 2020 Joint Conference, pp. 287–298 (2020). https://doi.org/10.5441/002/edbt.2020.26
Xia, X., Hui, Z., Chunming, Y., Xujian, Z., Bo, L.: Fairness constraint of fuzzy c-means clustering improves clustering fairness. In: Proceedings of the Asian Conference on Machine Learning (ACML), pp. 113–128 (2021)
Ahmadian, S., et al.: Fair hierarchical clustering. Adv. Neural Inf. Proces. Syst. 33, 21050–21060 (2020)
Kleindessner, M., Samadi, S., Awasthi, P., Morgenstern, J.: Guarantees for spectral clustering with fairness constraints. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 3458–3467 (2019)
Battaglia, O.R., Di Paola, B., Fazio, C.: K-means clustering to study how student reasoning lines can be modified by a learning activity based on Feynman’s unifying approach. Eurasia J. Math. Sci. Technol. Educ. 13(6), 2005–2038 (2017). https://doi.org/10.12973/eurasia.2017.01211a
Maylawati, D.S., Priatna, T., Sugilar, H., Ramdhani, M.A.: Data science for digital culture improvement in higher education using k-means clustering and text analytics. Int. J. Electr. Comput. Eng. 10(5), 4569–4580 (2020). https://doi.org/10.11591/ijece.v10i5.pp4569-4580
Šarić-Grgić, I., Grubišić, A., Šerić, L., Robinson, T.J.: Student clustering based on learning behavior data in the intelligent tutoring system. Int. J. Dist. Educ. Technol. 18(2), 73–89 (2020). https://doi.org/10.4018/IJDET.2020040105
Talebinamvar, M., Zarrabi, F.: Clustering students’ writing behaviors using keystroke logging: a learning analytic approach in EFL writing. Lang. Test. Asia. 12(1), 1–20 (2022). https://doi.org/10.1186/s40468-021-00150-5
Kurniawan, C., Setyosari, P., Kamdi, W., Ulfa, S.: Electrical engineering student learning preferences modelled using k-means clustering. Global J. Eng. Educ. 20(2), 140–145 (2018)
Rijati, N., Sumpeno, S., Purnomo, M.H.: Multi-attribute clustering of student’s entrepreneurial potential mapping based on its characteristics and the affecting factors: preliminary study on Indonesian higher education database. In: Proceedings of the 10th International Conference on Computer and Automation Engineering, pp. 11–16 (2018). https://doi.org/10.1145/3192975.3193014
Mishler, A., Nugent, R.: Clustering students and inferring skill set profiles with skill hierarchies. In: Proceedings of the 11th International Conference on Educational Data Mining (EDM) (2018)
Mojarad, S., Essa, A., Mojarad, S., Baker, R.S.: Data-driven learner profiling based on clustering student behaviors: learning consistency, pace and effort. In: Proceedings of the International Conference on Intelligent Tutoring Systems, pp. 130–139 (2018). https://doi.org/10.1007/978-3-319-91464-0_13
Davies, D.L., Bouldin, D.W.: A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. 2, 224–227 (1979). https://doi.org/10.1109/TPAMI.1979.4766909
Dunn, J.C.: Well-separated clusters and optimal fuzzy partitions. J. Cybernet. 4(1), 95–104 (1974). https://doi.org/10.1080/01969727408546059
Tempelaar, D., Rienties, B., Mittelmeier, J., Nguyen, Q.: Student profiling in a dispositional learning analytics application using formative assessment. Comput. Hum. Behav. 78, 408–420 (2018). https://doi.org/10.1016/j.chb.2017.08.010
Švábenský, V., Vykopal, J., Čeleda, P., Tkáčik, K., Popovič, D.: Student assessment in cybersecurity training automated by pattern mining and clustering. Educ. Inf. Technol. 1–32 (2022). https://doi.org/10.1007/s10639-022-10954-4
Bradley, P.S., Bennett, K.P., Demiriz, A.: Constrained K-Means Clustering, p. 20. Microsoft Research, Redmond (2000)
Mulvey, J.M., Beck, M.P.: Solving capacitated clustering problems. Eur. J. Oper. Res. 18(3), 339–348 (1984). https://doi.org/10.1016/0377-2217(84)90155-3
Moshkovitz, M., Dasgupta, S., Rashtchian, C., Frost, N.: Explainable k-means and k-medians clustering. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 7055–7065 (2020)
Bandyapadhyay, S., Fomin, F., Golovach, P.A., Lochet, W., Purohit, N., Simonov, K.: How to find a good explanation for clustering? In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 3904–3912 (2022). https://doi.org/10.1609/aaai.v36i4.20306
Wang, D.-Y., Lin, S.S., Sun, C.-T.: DIANA: a computer-supported heterogeneous grouping system for teachers to conduct successful small learning groups. Comput. Hum. Behav. 23(4), 1997–2010 (2007). https://doi.org/10.1016/j.chb.2006.02.008
Watson, S.B., Marshall, J.E.: Heterogeneous grouping as an element of cooperative learning in an elementary education science course. Sch. Sci. Math. 95(8), 401–405 (1995). https://doi.org/10.1111/j.1949-8594.1995.tb10192.x
Flanagan, B., Majumdar, R., Ogata, H.: Fine grain synthetic educational data: challenges and limitations of collaborative learning analytics. IEEE Access. 10, 26230–26241 (2022). https://doi.org/10.1109/ACCESS.2022.3156073
Vie, J.-J., Rigaux, T., Minn, S.: Privacy-preserving synthetic educational data generation. In: Proceedings of the EC-TEL 2022 (2022)
Backurs, A., Indyk, P., Onak, K., Schieber, B., Vakilian, A., Wagner, T.: Scalable fair clustering. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 405–413 (2019)
Fahad, A., et al.: A survey of clustering algorithms for big data: taxonomy and empirical analysis. IEEE Trans. Emerg. Top. Comput. 2(3), 267–279 (2014). https://doi.org/10.1109/TETC.2014.2330519
Assent, I.: Clustering high dimensional data. WIREs Data Min. Knowl. Discov. 2(4), 340–350 (2012). https://doi.org/10.1002/widm.1062
Le Quy, T., Nguyen, T.H., Friege, G., Ntoutsi, E.: Evaluation of group fairness measures in student performance prediction problems. In: Machine Learning and Principles and Practice of Knowledge Discovery in Databases: International Workshops of ECML PKDD 2022, pp. 119–136 (2022). https://doi.org/10.1007/978-3-031-23618-1_8
Rihák, J., Pelánek, R.: Measuring similarity of educational items using data on learners’ performance. In: Proceedings of the 10th International Conference on Educational Data Mining (EDM), pp. 16–23 (2017)
Ninrutsirikun, U., Watanapa, B., Arpnikanondt, C., Watananukoon, V.: A unified framework for student cluster grouping with learning preference associative detection for enhancing students’ learning outcomes in computer programming courses. In: Proceedings of 2018 Global Wireless Summit (GWS), pp. 266–271 (2018). https://doi.org/10.1109/GWS.2018.8686665
Phanniphong, K., Nuankaew, P., Teeraputon, D., Nuankaew, W., Boontonglek, M., Bussaman, S.: Clustering of learners performance based on learning outcomes for finding significant courses. In: Proceedings of the Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT-NCON), pp. 192–196 (2019). https://doi.org/10.1109/ECTI-NCON.2019.8692263
Wang, X., Zhang, Y., Yang, Y., Liu, K., Gao, B.: Research on relevance analysis and clustering algorithms in college students’ academic performance. In: Proceedings of the 10th International Conference on Information Technology in Medicine and Education (ITME), pp. 730–733 (2019). https://doi.org/10.1109/ITME.2019.00167
Chaves, V.E.J., García-Torres, M., Alonso, D.B., Gómez-Vela, F., Divina, F., Vázquez-Noguera, J.L.: Analysis of student achievement scores via cluster analysis. In: Proceedings of the International Conference on European Transnational Education, pp. 399–408 (2020). https://doi.org/10.1007/978-3-030-57799-5_41
Kosztyán, Z.T., Orbán-Mihálykó, É., Mihálykó, C., Csányi, V.V., Telcs, A.: Analyzing and clustering students’ application preferences in higher education. J. Appl. Stat. 47(16), 2961–2983 (2020). https://doi.org/10.1080/02664763.2019.1709052
Pradana, C., Kusumawardani, S., Permanasari, A.: Comparison clustering performance based on moodle log mining. IOP Conf. Ser. Mater. Sci. Eng. 722, 012012 (2020). https://doi.org/10.1088/1757-899X/722/1/012012
Tang, P., Wang, Y., Shen, N.: Prediction of college students’ physical fitness based on k-means clustering and SVR. Comput. Syst. Sci. Eng. 35(4), 237–246 (2020). https://doi.org/10.32604/csse.2020.35.237
Rijati, N., Purwitasari, D., Sumpeno, S., Purnomo, M.: A decision making and clustering method integration based on the theory of planned behavior for student entrepreneurial potential mapping in Indonesia. Int. J. Intell. Eng. Syst. 13(4), 129–144 (2020). https://doi.org/10.22266/ijies2020.0831.12
Chi, D.: Research on the application of k-means clustering algorithm in student achievement. In: Proceedings of the IEEE International Conference on Consumer Electronics and Computer Engineering (ICCECE), pp. 435–438 (2021). https://doi.org/10.1109/ICCECE51280.2021.9342164
Li, G., Alfred, R., Wang, X.: Student behavior analysis and research model based on clustering technology. Mob. Inf. Syst. 2021 (2021). https://doi.org/10.1155/2021/9163517
Putra, A.A.N.K., Nasucha, M., Hermawan, H.: K-means clustering algorithm in web-based applications for grouping data on scholarship selection results. In: Proceedings of the International Symposium on Electronics and Smart Devices (ISESD), pp. 1–6 (2021). https://doi.org/10.1109/ISESD53023.2021.9501716
Susanto, R., Husen, M.N., Lajis, A., Lestari, W., Hasanah, H.: Clustering of student perceptions on developing a physics laboratory based on information technology and local wisdom. In: Proceedings of the 8th International Conference on Information Technology, Computer and Electrical Engineering (ICITACEE), pp. 68–73 (2021). https://doi.org/10.1109/ICITACEE53184.2021.9617483
Rauthan, A., et al.: Impact on higher education in pandemic: analysis k-means clustering using urban & rural areas. In: Proceedings of the 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), pp. 1974–1980 (2021). https://doi.org/10.1109/ICAC3N53548.2021.9725709
Wang, Q.: Application of the intra cluster, characteristic of k-means clustering method in English score analysis in colleges. J. Phys. Conf. Ser. 1941, 012001 (2021). https://doi.org/10.1088/1742-6596/1941/1/012001
Cheng, W., Shwe, T.: Clustering analysis of student learning outcomes based on education data. In: 2019 IEEE Frontiers in Education Conference (FIE), pp. 1–7 (2019). https://doi.org/10.1109/FIE43999.2019.9028400
Singelmann, L., Alvarez, E., Swartz, E., Pearson, M., Striker, R., Ewert, D.: Innovators, learners, and surveyors: clustering students in an innovation-based learning course. In: IEEE Frontiers in Education Conference (FIE), pp. 1–9 (2020). https://doi.org/10.1109/FIE44824.2020.9274235
Popov, A., Ovsyankin, A., Emomaliev, M., Satsuk, M.: Application of the clustering algorithm in an automated training system. J. Phys. Conf. Ser. 1691, 012120 (2020). https://doi.org/10.1088/1742-6596/1691/1/012120
Supianto, A.A., et al.: Improvements of fuzzy c-means clustering performance using particle swarm optimization on student grouping based on learning activity in a digital learning media. In: Proceedings of the 5th International Conference on Sustainable Information Engineering and Technology, pp. 239–243 (2020). https://doi.org/10.1145/3427423.3427449
Yadav, R.S.: Application of hybrid clustering methods for student performance evaluation. Int. J. Inf. Technol. 12(3), 749–756 (2020). https://doi.org/10.1007/s41870-018-0192-2
Parvathavarthini, S., Sharvanthika, K., Jagadeesh, M., Kishore, B.: Analysis of student performance in e-learning environment using crow search based fuzzy clustering. In: Proceedings of the 2nd International Conference on Smart Electronics and Communication (ICOSEC), pp. 1784–1787 (2021). https://doi.org/10.1109/ICOSEC51865.2021.9591920
Premalatha, N., Sujatha, S.: Prediction of students’ employability using clustering algorithm: a hybrid approach. Int. J. Model. Simul. Sci. Comput. 2250049 (2022). https://doi.org/10.1142/S1793962322500490
Waluyo, E., Djeni, D., Pratama, L., Anggraini, V.: Clustering based on sociometry in Pythagoras theorem. J. Phys. Conf. Ser. 1211, 012058 (2019). https://doi.org/10.1088/1742-6596/1211/1/012058
Purbasari, I., Puspaningrum, E., Putra, A.: Using self-organizing map (SOM) for clustering and visualization of new students based on grades. J. Phys. Conf. Ser. 1569, 022037 (2020). https://doi.org/10.1088/1742-6596/1569/2/022037
Rakhmawati, N.A., Faiz, N., Hafidz, I., Raditya, I., Dinatha, P., Suwignyo, A.: Clustering student Instagram accounts using author-topic model. Int. J. Bus. Intell. Data Min. 19(1), 70–79 (2021). https://doi.org/10.1504/IJBIDM.2021.115954
Yan, Q., Su, Z.: Evaluation of college students’ English performance considering Roche multiway tree clustering. Int. J. Electric. Eng. Educ. (2021). https://doi.org/10.1177/00207209211004207
Acknowledgments
The work of the first author is supported by the Ministry of Science and Culture of Lower Saxony, Germany, within the PhD program “LernMINT: Data-assisted teaching in the MINT subjects.”
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Le Quy, T., Friege, G., Ntoutsi, E. (2023). A Review of Clustering Models in Educational Data Science Toward Fairness-Aware Learning. In: Peña-Ayala, A. (eds) Educational Data Science: Essentials, Approaches, and Tendencies. Big Data Management. Springer, Singapore. https://doi.org/10.1007/978-981-99-0026-8_2
DOI: https://doi.org/10.1007/978-981-99-0026-8_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-0025-1
Online ISBN: 978-981-99-0026-8
eBook Packages: Computer Science, Computer Science (R0)