A Review of Clustering Models in Educational Data Science Toward Fairness-Aware Learning

Educational Data Science: Essentials, Approaches, and Tendencies

Abstract

Ensuring fair access to quality education is essential if an education system is to fully realize every student’s potential. Machine learning (ML) is transforming education: it enables educators to develop personalized learning strategies for students, provides important information on student progression, supports early identification of potential points of struggle, and helps develop more efficient grading systems. The Educational Data Science (EDS) domain is therefore becoming increasingly important in educational activities for both teachers and learners. However, ML-driven decision-making can be biased, resulting in underperforming ML models and/or ML models that discriminate against individuals or groups of students based on protected attributes like gender or race. Mitigating bias and discrimination in ML is of paramount importance. In this work, we focus on one of the most effective ML tasks, clustering, which is widely used in EDS both as an exploratory tool to understand student characteristics and behavior and as a stand-alone tool for, e.g., group assignments. Traditionally, clustering algorithms focus on finding groups or clusters of similar students and ignore aspects of fairness and discrimination. However, both cluster quality and fairness of the resulting clusters are needed. This chapter provides a comprehensive review of different clustering models in EDS, with greater emphasis on fair clustering models. Among the fair clustering models, we mainly focus on models that have been proposed and/or applied in educational activities to ensure their usefulness and applicability for fairness-aware EDS.

Notes

  1. https://scholar.google.com/

  2. https://dblp.org/

  3. https://www.scopus.com/

  4. http://portal.core.edu.au/conf-ranks/

  5. https://www.scimagojr.com/

  6. https://moodle.org/

  7. Students’ confidence entropy is computed with the Shannon entropy formula.

  8. Because the objects or data points for clustering in the related work are mainly students, we use the terms “objects,” “data points,” and “students” interchangeably.

  9. https://cran.r-project.org/web/packages/NbClust/index.html

  10. https://cran.r-project.org/web/packages/fclust/index.html

  11. https://cran.r-project.org/web/packages/mixsmsn/index.html

  12. We only take into account the fairness notions introduced in the published papers. Because fairness notions can be turned into measures [157], we use the terms “fairness notion” and “fairness measure” interchangeably in this review.

Abbreviations

ACC: Clustering accuracy
AI: Artificial intelligence
AIED: Artificial intelligence in education
ANOVA: Analysis of variance
ARI: Adjusted Rand index
BIRCH: Balanced iterative reducing and clustering using hierarchies
BMI: Body mass index
BMU: Best matching unit
CFSFDP-HD: Clustering by fast search and finding of density peaks via heat diffusion
CHI: Calinski–Harabasz index
CLARA: Clustering in LARge Applications
CLARANS: Clustering Large Applications based on RANdomized Search
CORE: Computing Research and Education Association of Australasia
DBI: Davies–Bouldin index
DBLP: Database systems and logic programming
DBSCAN: Density-based spatial clustering of applications with noise
DI: Dunn index
DP: Dirichlet process
EDM: Educational data mining
EDS: Educational data science
EM: Expectation–maximization
EMT: Ensemble meta-based tree
FCM: Fuzzy c-means
FIE: Frontiers in Education
ICALT: International Conference on Advanced Learning Technologies
ITS: Intelligent tutoring system
KPCA: Kernel-based principal component analysis
LA: Learning analytics
LD: Learning design
LMS: Learning management system
MIT: Massachusetts Institute of Technology
ML: Machine learning
MOOC: Massive open online course
NMI: Normalized mutual information
OPTICS: Ordering points to identify the clustering structure
OULAD: Open University Learning Analytics Dataset
PAM: Partitioning around medoids
PISA: Program for International Student Assessment
RQ: Research question
SJR: Scimago Journal & Country Rank
SOM: Self-organizing map
SSE: Sum of squared error
SVM: Support vector machine

References

  1. Dorans, N.J., Cook, L.L.: Fairness in Educational Assessment and Measurement. Routledge, New York (2016)

  2. Zlatkin-Troitschanskaia, O., Schlax, J., Jitomirski, J., Happ, R., Kühling-Thees, C., Brückner, S., Pant, H.: Ethics and fairness in assessing learning outcomes in higher education. High Educ. Pol. 32(4), 537–556 (2019). https://doi.org/10.1057/s41307-019-00149-x

  3. Ford, M., Morice, J.: How fair are group assignments? A survey of students and faculty and a modest proposal. J. Inform. Technol. Educ. Res. 2(1), 367–378 (2003)

  4. Miles, J.A., Klein, H.J.: The fairness of assigning group members to tasks. Group Org. Manag. 23(1), 71–96 (1998). https://doi.org/10.1177/1059601198231005

  5. Rezaeinia, N., Góez, J.C., Guajardo, M.: Efficiency and fairness criteria in the assignment of students to projects. Ann. Oper. Res., 1–19 (2021). https://doi.org/10.1007/s10479-021-04001-7

  6. Song, X.: The fairness of a graduate school admission test in China: voices from administrators, teachers, and test-takers. Asia Pac. Educ. Res. 27(2), 79–89 (2018). https://doi.org/10.1007/s40299-018-0367-4

  7. Xiao, W., Ji, P., Hu, J.: A survey on educational data mining methods used for predicting students’ performance. Eng. Rep. (2021). https://doi.org/10.1002/eng2.12482

  8. Meyer, K.: Education, Justice and the Human Good: Fairness and Equality in the Education System. Routledge, London (2014)

  9. McFarland, D.A., Khanna, S., Domingue, B.W., Pardos, Z.A.: Education data science: past, present, future. AERA Open. 7 (2021). https://doi.org/10.1177/23328584211052055

  10. Romero, C., Ventura, S.: Educational data science in massive open online courses. Wiley Interdisc. Rev. Data Min. Know. Discov. 7(1), e1187 (2017). https://doi.org/10.1002/widm.1187

  11. Dutt, A., Ismail, M.A., Herawan, T.: A systematic review on educational data mining. IEEE Access. 5, 15991–16005 (2017). https://doi.org/10.1109/ACCESS.2017.2654247

  12. Peña-Ayala, A.: Educational data mining: a survey and a data mining-based analysis of recent works. Expert Syst. Appl. 41(4), 1432–1462 (2014). https://doi.org/10.1016/j.eswa.2013.08.042

  13. Romero, C., Ventura, S.: Educational data mining and learning analytics: an updated survey. Wiley Interdisc. Rev. Data Min. Know. Discov. 10(3), e1355 (2020). https://doi.org/10.1002/widm.1355

  14. Del Bonifro, F., Gabbrielli, M., Lisanti, G., Zingaro, S.P.: Student dropout prediction. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 129–140 (2020). https://doi.org/10.1007/978-3-030-52237-7_11

  15. Kemper, L., Vorhoff, G., Wigger, B.U.: Predicting student dropout: a machine learning approach. Eur. J. High. Educ. 10(1), 28–47 (2020). https://doi.org/10.1080/21568235.2020.1718520

  16. Hutt, S., Gardner, M., Duckworth, A.L., D’Mello, S.K.: Evaluating fairness and generalizability in models predicting on-time graduation from college applications. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM), pp. 79–88 (2019)

  17. Livieris, I.E., Tampakas, V., Karacapilidis, N., Pintelas, P.: A semi-supervised self-trained two-level algorithm for forecasting students’ graduation time. Intel. Decis. Technol. 13(3), 367–378 (2019). https://doi.org/10.3233/IDT-180136

  18. Fenu, G., Galici, R., Marras, M.: Experts’ view on challenges and needs for fairness in artificial intelligence for education. In: International Conference on Artificial Intelligence in Education, pp. 243–255. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-11644-5_20

  19. Vasquez Verdugo, J., Gitiaux, X., Ortega, C., Rangwala, H.: FairEd: a systematic fairness analysis approach applied in a higher educational context. In: LAK22: 12th International Learning Analytics and Knowledge Conference, pp. 271–281 (2022). https://doi.org/10.1145/3506860.3506902

  20. Ntoutsi, E., et al.: Bias in data-driven artificial intelligence systems—an introductory survey. Wiley Interdisc. Rev. Data Mining Know. Discov. 10(3), e1356 (2020). https://doi.org/10.1002/widm.1356

  21. Le Quy, T., Roy, A., Iosifidis, V., Zhang, W., Ntoutsi, E.: A survey on datasets for fairness-aware machine learning. Wiley Interdiscip. Rev. Data Min. Knowl. Disc., e1452 (2022). https://doi.org/10.1002/widm.1452

  22. Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., Galstyan, A.: A survey on bias and fairness in machine learning. ACM Comput. Surv. (CSUR). 54(6), 1–35 (2021). https://doi.org/10.1145/3457607

  23. Bayer, V., Hlosta, M., Fernandez, M.: Learning analytics and fairness: do existing algorithms serve everyone equally? In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 71–75 (2021). https://doi.org/10.1007/978-3-030-78270-2_12

  24. Gardner, J., Brooks, C., Baker, R.: Evaluating the fairness of predictive student models through slicing analysis. In: Proceedings of the 9th International Conference on Learning Analytics & Knowledge, pp. 225–234 (2019). https://doi.org/10.1145/3303772.3303791

  25. Riazy, S., Simbeck, K., Schreck, V.: Systematic literature review of fairness in learning analytics and application of insights in a case study. In: Proceedings of the International Conference on Computer Supported Education, pp. 430–449 (2020). https://doi.org/10.1007/978-3-030-86439-2_22

  26. Baker, R.S., Hawn, A.: Algorithmic bias in education. Int. J. Artif. Intell. Educ., 1–41 (2021). https://doi.org/10.1007/s40593-021-00285-9

  27. Kizilcec, R.F., Lee, H.: Algorithmic fairness in education. In: Ethics in Artificial Intelligence in Education (2022)

  28. Liu, S., d’Aquin, M.: Unsupervised learning for understanding student achievement in a distance learning setting. In: Proceedings of the IEEE Global Engineering Education Conference (EDUCON), pp. 1373–1377 (2017). https://doi.org/10.1109/EDUCON.2017.7943026

  29. Zhang, N., Biswas, G., Dong, Y.: Characterizing students’ learning behaviors using unsupervised learning methods. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 430–441 (2017). https://doi.org/10.1007/978-3-319-61425-0_36

  30. Le Quy, T., Roy, A., Friege, G., Ntoutsi, E.: Fair-capacitated clustering. In: Proceedings of the 14th International Conference on Educational Data Mining (EDM21), pp. 407–414 (2021)

  31. Chang, W., Ji, X., Liu, Y., Xiao, Y., Chen, B., Liu, H., Zhou, S.: Analysis of university students’ behavior based on a fusion k-means clustering algorithm. Appl. Sci. 10(18), 6566 (2020). https://doi.org/10.3390/app10186566

  32. Fang, Y., et al.: Clustering the learning patterns of adults with low literacy skills interacting with an intelligent tutoring system. In: Proceedings of the 11th International Conference on Educational Data Mining (EDM), pp. 348–354. ERIC (2018)

  33. Mai, T.T., Bezbradica, M., Crane, M.: Learning behaviours data in programming education: community analysis and outcome prediction with cleaned data. Futur. Gener. Comput. Syst. 127, 42–55 (2022). https://doi.org/10.1016/j.future.2021.08.026

  34. Varela, N., et al.: Student performance assessment using clustering techniques. In: Proceedings of the International Conference on Data Mining and Big Data, pp. 179–188 (2019). https://doi.org/10.1007/978-981-32-9563-6_19

  35. Zhang, S., Shen, M., Yu, Y.: Research on student big data portrait method based on improved k-means algorithm. In: Proceedings of the 3rd International Academic Exchange Conference on Science and Technology Innovation (IAECST), pp. 146–150 (2021). https://doi.org/10.1109/IAECST54258.2021.9695501

  36. Ding, D., Li, J., Wang, H., Liang, Z.: Student behavior clustering method based on campus big data. In: Proceedings of the 13th International Conference on Computational Intelligence and Security (CIS), pp. 500–503 (2017). https://doi.org/10.1109/CIS.2017.00116

  37. Waspada, I., Bahtiar, N., Wibowo, A.: Clustering student behavior based on quiz activities on Moodle LMS to discover the relation with a final exam score. J. Phys. Conf. Ser. 1217, 012118 (2019). https://doi.org/10.1088/1742-6596/1217/1/012118

  38. Esnashari, S., Gardner, L., Watters, P.: Clustering student participation: implications for education. In: Proceedings of the 32nd International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 313–318 (2018). https://doi.org/10.1109/WAINA.2018.00104

  39. Jia, L., Cheng, H.N., Liu, S., Chang, W.C., Chen, Y., Sun, J.: Integrating clustering and sequential analysis to explore students’ behaviors in an online Chinese reading assessment system. In: Proceedings of the 6th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), pp. 719–724 (2017). https://doi.org/10.1109/IIAI-AAI.2017.55

  40. Howlin, C.P., Dziuban, C.D.: Detecting outlier behaviors in student progress trajectories using a repeated fuzzy clustering approach. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM), pp. 742–747 (2019)

  41. McBroom, J., Yacef, K., Koprinska, I.: DETECT: a hierarchical clustering algorithm for behavioural trends in temporal educational data. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 374–385 (2020). https://doi.org/10.1007/978-3-030-52237-7_30

  42. Shen, S., Chi, M.: Clustering student sequential trajectories using dynamic time warping. In: Proceedings of the 10th International Conference on Educational Data Mining (EDM), pp. 266–271 (2017)

  43. Ruipérez-Valiente, J.A., Muñoz-Merino, P.J., Delgado Kloos, C., et al.: Detecting and clustering students by their gamification behavior with badges: a case study in engineering education. Int. J. Eng. Educ. 33(2-B), 816–830 (2017)

  44. López, S.L.S., Redondo, R.P.D., Vilas, A.F.: Discovering knowledge from student interactions: clustering vs classification. In: Proceedings of the 5th International Conference on Technological Ecosystems for Enhancing Multiculturality, pp. 1–8 (2017). https://doi.org/10.1145/3144826.3145390

  45. Mengoni, P., Milani, A., Li, Y.: Clustering students interactions in e-learning systems for group elicitation. In: Proceedings of the International Conference on Computational Science and Its Applications, pp. 398–413. Springer (2018). https://doi.org/10.1007/978-3-319-95168-3_27

  46. Orji, F., Vassileva, J.: Using machine learning to explore the relation between student engagement and student performance. In: Proceedings of the 24th International Conference Information Visualisation (IV), pp. 480–485. IEEE (2020). https://doi.org/10.1109/IV51561.2020.00083

  47. Güvenç, E., Çetin, G.: Clustering of participation degrees of distance learning students to course activity by using fuzzy c-means algorithm. In: Proceedings of the 26th Signal Processing and Communications Applications Conference (SIU), pp. 1–4 (2018). https://doi.org/10.1109/SIU.2018.8404292

  48. Khalil, M., Ebner, M.: Clustering patterns of engagement in massive open online courses (MOOCs): the use of learning analytics to reveal student categories. J. Comput. High. Educ. 29(1), 114–132 (2017). https://doi.org/10.1007/s12528-016-9126-9

  49. Oladipupo, O.O., Olugbara, O.O.: Evaluation of data analytics based clustering algorithms for knowledge mining in a student engagement data. Intell. Data Anal. 23(5), 1055–1071 (2019). https://doi.org/10.3233/IDA-184254

  50. Palani, K., Stynes, P., Pathak, P.: Clustering techniques to identify low-engagement student levels. In: Proceedings of the 13th International Conference on Computer Supported Education (CSEDU), pp. 248–257 (2021). https://doi.org/10.5220/0010456802480257

  51. Roy, D., Bermel, P., Douglas, K.A., Diefes-Dux, H.A., Richey, M., Madhavan, K., Shah, S.: Synthesis of clustering techniques in educational data mining. In: Proceedings of the ASEE Annual Conference & Exposition (2017)

  52. Huang, J.B., Huang, A.Y., Lu, O.H., Yang, S.J.: Exploring learning strategies by sequence clustering and analysing their correlation with student’s engagement and learning outcome. In: Proceedings of the International Conference on Advanced Learning Technologies (ICALT), pp. 360–362. IEEE (2021). https://doi.org/10.1109/ICALT52272.2021.00115

  53. Moubayed, A., Injadat, M., Shami, A., Lutfiyya, H.: Student engagement level in an e-learning environment: clustering using k-means. Am. J. Dist. Educ. 34(2), 137–156 (2020). https://doi.org/10.1080/08923647.2020.1696140

  54. Hartnett, M.: The importance of motivation in online learning. In: Motivation in Online Education, pp. 5–32. Springer (2016). https://doi.org/10.1007/978-981-10-0700-2_2

  55. Nen-Fu, H., et al.: The clustering analysis system based on students’ motivation and learning behavior. In: Proceedings of the Learning with MOOCS (LWMOOCS), pp. 117–119 (2018). https://doi.org/10.1109/LWMOOCS.2018.8534611

  56. Gunawan, I., et al.: Hidden curriculum and character building on self-motivation based on k-means clustering. In: Proceedings of the 4th International Conference on Education and Technology (ICET), pp. 32–35 (2018). https://doi.org/10.1109/ICEAT.2018.8693931

  57. Wang, Z., Wang, J.: Analysis of emotional education infiltration in college physical education based on emotional feature clustering. Wirel. Commun. Mob. Comput. 2022 (2022). https://doi.org/10.1155/2022/7857522

  58. Ashkanasy, N.M.: Emotion and performance. Human Perform. 17(2), 137–144 (2004). https://doi.org/10.1207/s15327043hup1702_1

  59. Muñoz-Merino, P.J., Molina, M.F., Muñoz-Organero, M., Kloos, C.D.: Motivation and emotions in competition systems for education: an empirical study. IEEE Trans. Educ. 57(3), 182–187 (2014). https://doi.org/10.1109/TE.2013.2297318

  60. Guo, H., Wang, M.: Analysis on the penetration of emotional education in college physical education based on emotional feature clustering. Sci. Program. 2022 (2022). https://doi.org/10.1155/2022/2389453

  61. Salwana, E., Hamid, S., Yasin, N.M.: Student academic streaming using clustering technique. Malays. J. Comput. Sci. 30(4), 286–299 (2017). https://doi.org/10.22452/mjcs.vol30no4.2

  62. Thilagaraj, T., Sengottaiyan, N.: Implementation of fuzzy clustering algorithms to analyze students performance using R-tool. In: Intelligent Computing and Innovation on Data Science, pp. 287–294. Springer, Berlin (2020). https://doi.org/10.1007/978-981-15-3284-9_31

  63. Vo, C.T.N., Nguyen, P.H.: A weighted object-cluster association-based ensemble method for clustering undergraduate students. In: Proceedings of the Asian Conference on Intelligent Information and Database Systems (ACIIDS), pp. 587–598 (2018). https://doi.org/10.1007/978-3-319-75417-8_55

  64. Bharara, S., Sabitha, S., Bansal, A.: Application of learning analytics using clustering data mining for students’ disposition analysis. Educ. Inf. Technol. 23(2), 957–984 (2018). https://doi.org/10.1007/s10639-017-9645-7

  65. Yin, X.: Construction of student information management system based on data mining and clustering algorithm. Complexity. 2021 (2021). https://doi.org/10.1155/2021/4447045

  66. Hooshyar, D., Pedaste, M., Yang, Y.: Mining educational data to predict students’ performance through procrastination behavior. Entropy. 22(1), 12 (2019). https://doi.org/10.3390/e22010012

  67. Park, J., Yu, R., Rodriguez, F., Baker, R., Smyth, P., Warschauer, M.: Understanding student procrastination via mixture models. In: Proceedings of the 11th International Conference on Educational Data Mining (EDM), pp. 187–197 (2018)

  68. Preetha, V.: Data analysis on student’s performance based on health status using genetic algorithm and clustering algorithms. In: Proceedings of the 5th International Conference on Computing Methodologies and Communication (ICCMC), pp. 836–842 (2021). https://doi.org/10.1109/ICCMC51019.2021.9418235

  69. Aghababyan, A., Lewkow, N., Baker, R.S.: Enhancing the clustering of student performance using the variation in confidence. In: Proceedings of the International Conference on Intelligent Tutoring Systems, pp. 274–279 (2018). https://doi.org/10.1007/978-3-319-91464-0_27

  70. Effenberger, T., Pelánek, R.: Interpretable clustering of students’ solutions in introductory programming. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 101–112 (2021). https://doi.org/10.1007/978-3-030-78292-4_9

  71. Gao, L., Wan, B., Fang, C., Li, Y., Chen, C.: Automatic clustering of different solutions to programming assignments in computing education. In: Proceedings of the ACM Conference on Global Computing Education, pp. 164–170 (2019). https://doi.org/10.1145/3300115.3309515

  72. Chang, L.H., Rastas, I., Pyysalo, S., Ginter, F.: Deep learning for sentence clustering in essay grading support. In: The 14th International Conference on Educational Data Mining (EDM) (2021)

  73. Sobral, S.R., de Oliveira, C.F.: Clustering algorithm to measure student assessment accuracy: a double study. Big Data Cognit. Comput. 5(4), 81 (2021). https://doi.org/10.3390/bdcc5040081

  74. Khan, A., Ghosh, S.K.: Student performance analysis and prediction in classroom learning: a review of educational data mining studies. Educ. Inf. Technol. 26(1), 205–240 (2021). https://doi.org/10.1007/s10639-020-10230-3

  75. Adjei, S., Ostrow, K., Erickson, E., Heffernan, N.T.: Clustering students in ASSISTments: exploring system- and school-level traits to advance personalization. In: Proceedings of the 10th International Conference on Educational Data Mining (EDM), pp. 340–341 (2017)

  76. Ramanathan, L., Parthasarathy, G., Vijayakumar, K., Lakshmanan, L., Ramani, S.: Cluster-based distributed architecture for prediction of student’s performance in higher education. Clust. Comput. 22(1), 1329–1344 (2019). https://doi.org/10.1007/s10586-017-1624-7

  77. Hassan, Y.M., Elkorany, A., Wassif, K.: Utilizing social clustering-based regression model for predicting student’s GPA. IEEE Access. 10, 48948–48963 (2022). https://doi.org/10.1109/ACCESS.2022.3172438

  78. Casalino, G., Castellano, G., Mencar, C.: Incremental and adaptive fuzzy clustering for virtual learning environments data analysis. In: Proceedings of the 23rd International Conference Information Visualisation (IV), pp. 382–387 (2019). https://doi.org/10.1109/IV.2019.00071

  79. Almasri, A., Alkhawaldeh, R.S., Çelebi, E.: Clustering-based EMT model for predicting student performance. Arab. J. Sci. Eng. 45(12), 10067–10078 (2020). https://doi.org/10.1007/s13369-020-04578-4

  80. Iatrellis, O., Savvas, I.K., Fitsilis, P., Gerogiannis, V.C.: A two-phase machine learning approach for predicting student outcomes. Educ. Inf. Technol. 26(1), 69–88 (2021). https://doi.org/10.1007/s10639-020-10260-x

  81. Francis, B.K., Babu, S.S.: Predicting academic performance of students using a hybrid data mining approach. J. Med. Syst. 43(6), 1–15 (2019). https://doi.org/10.1007/s10916-019-1295-4

  82. Chu, Y.W., Tenorio, E., Cruz, L., Douglas, K., Lan, A.S., Brinton, C.G.: Click-based student performance prediction: a clustering guided meta-learning approach. In: Proceedings of the IEEE International Conference on Big Data (BigData), pp. 1389–1398 (2021). https://doi.org/10.1109/BigData52589.2021.9671729

  83. Iam-On, N., Boongoen, T.: Generating descriptive model for student dropout: a review of clustering approach. HCIS. 7(1), 1–24 (2017). https://doi.org/10.1186/s13673-016-0083-0

  84. Iam-On, N., Boongoen, T.: Improved student dropout prediction in Thai university using ensemble of mixed-type data clusterings. Int. J. Mach. Learn. Cybern. 8(2), 497–510 (2017). https://doi.org/10.1007/s13042-015-0341-x

  85. Purba, W., Tamba, S., Saragih, J.: The effect of mining data k-means clustering toward students profile model drop out potential. J. Phys. Conf. Ser. 1007, 012049 (2018). https://doi.org/10.1088/1742-6596/1007/1/012049

  86. Hung, J.-L., Wang, M.C., Wang, S., Abdelrasoul, M., Li, Y., He, W.: Identifying at-risk students for early interventions—a time-series clustering approach. IEEE Trans. Emerg. Top. Comput. 5(1), 45–55 (2017). https://doi.org/10.1109/TETC.2015.2504239

  87. Nguyen, P., Vo, C.: Early in-trouble student identification based on temporal educational data clustering. In: Proceedings of the International Conference on Information Technology (ICIT), pp. 313–318 (2019). https://doi.org/10.1109/ICIT48102.2019.00062

  88. Yotaman, N., Osathanunkul, K., Khoenkaw, P., Pramokchon, P.: Teaching support system by clustering students according to learning styles. In: Proceedings of the Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON), pp. 137–140 (2020). https://doi.org/10.1109/ECTIDAMTNCON48261.2020.9090729

  89. Khayi, N.A., Rus, V.: Clustering students based on their prior knowledge. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM), pp. 246–251 (2019)

  90. Qoiriah, A., et al.: Application of k-means algorithm for clustering student’s computer programming performance in automatic programming assessment tool. In: Proceedings of the International Joint Conference on Science and Engineering (IJCSE 2020), pp. 421–425 (2020). https://doi.org/10.2991/aer.k.201124.075

  91. Silva, D.B., Silla, C.N.: Evaluation of students programming skills on a computer programming course with a hierarchical clustering algorithm. In: Proceedings of the IEEE Frontiers in Education Conference (FIE), pp. 1–9 (2020). https://doi.org/10.1109/FIE44824.2020.9274130

  92. Urbina Nájera, A.B., De La Calleja, J., Medina, M.A.: Associating students and teachers for tutoring in higher education using clustering and data mining. Comput. Appl. Eng. Educ. 25(5), 823–832 (2017). https://doi.org/10.1002/cae.21839

  93. Chang, M.H., Kuo, R., Essalmi, F., Chang, M., Kumar, V., Kung, H.Y.: Usability evaluation plan for online annotation and student clustering system—a Tunisian university case. In: Proceedings of the International Conference on Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management, pp. 241–254 (2017). https://doi.org/10.1007/978-3-319-58463-8_21

  94. Kylvaja, M., Kumpulainen, P., Konu, A.: Application of data clustering for automated feedback generation about student well-being. In: Proceedings of the 1st ACM SIGSOFT International Workshop on Education Through Advanced Software Engineering and Artificial Intelligence, pp. 21–26 (2019). https://doi.org/10.1145/3340435.3342720

  95. Li, Y., Sun, X.: Data analysis and feedback system construction of university students’ psychological fitness based on fuzzy clustering. Wirel. Commun. Mob. Comput. 2022 (2022). https://doi.org/10.1155/2022/6019803

  96. Gulwani, S., Radiček, I., Zuleger, F.: Automated clustering and program repair for introductory programming assignments. ACM SIGPLAN Not. 53(4), 465–480 (2018). https://doi.org/10.1145/3296979.3192387

  97. Masala, M., Ruseti, S., Dascalu, M., Dobre, C.: Extracting and clustering main ideas from student feedback using language models. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 282–292 (2021). https://doi.org/10.1007/978-3-030-78292-4_23

  98. Guo, Y., Chen, Y., Xie, Y., Ban, X.: An effective student grouping and course recommendation strategy based on big data in education. Information. 13(4), 197 (2022). https://doi.org/10.3390/info13040197

  99. Wang, M., Lv, Z.: Construction of personalized learning and knowledge system of chemistry specialty via the internet of things and clustering algorithm. J. Supercomput. 78(8), 10997–11014 (2022). https://doi.org/10.1007/s11227-022-04315-8

  100. Liu, H., Ding, J., Yang, L.T., Guo, Y., Wang, X., Deng, A.: Multi-dimensional correlative recommendation and adaptive clustering via incremental tensor decomposition for sustainable smart education. IEEE Trans. Sustainable Comput. 5(3), 389–402 (2019). https://doi.org/10.1109/TSUSC.2019.2954456

  101. Fasanya, B.K., Fathizadeh, M.: Clustering from grouping: a key to enhance students’ classroom active engagement. In: Proceedings of the ASEE Annual Conference & Exposition (2019). https://doi.org/10.18260/1-2-32511

  102. Wu, Y., Nouri, J., Li, X., Weegar, R., Afzaal, M., Zia, A.: A word embeddings based clustering approach for collaborative learning group formation. In: Proceedings of the International Conference on Artificial Intelligence in Education (AIED), pp. 395–400 (2021). https://doi.org/10.1007/978-3-030-78270-2_70

  103. Pratiwi, O.N., Rahardjo, B., Supangkat, S.H.: Clustering multiple mix data type for automatic grouping of student system. In: Proceedings of the International Conference on Information Technology Systems and Innovation (ICITSI), pp. 172–176 (2017). https://doi.org/10.1109/ICITSI.2017.8267938

  104. Shelly, Z., Burch, R.F., Tian, W., Strawderman, L., Piroli, A., Bichey, C.: Using k-means clustering to create training groups for elite American football student-athletes based on game demands. Int. J. Kinesiol. Sports Sci. 8(2), 47–63 (2020). https://doi.org/10.7575/aiac.ijkss.v.8n.2p.47

  105. Akbar, S., Gehringer, E., Hu, Z.: Poster: improving formation of student teams: a clustering approach. In: Proceedings of the IEEE/ACM 40th International Conference on Software Engineering: Companion (ICSE-Companion), pp. 147–148 (2018)

  106. Wang, Y., Wang, Q.: A student grouping method for massive online collaborative learning. Int. J. Emerg. Technol. Learn. 17(3), 18–33 (2022). https://doi.org/10.3991/ijet.v17i03.29429

  107. Yang, Y.: Evaluation model and application of college students’ physical fitness based on clustering extraction algorithm. In: Proceedings of the 4th International Conference on Information Systems and Computer Aided Education, pp. 547–552 (2021). https://doi.org/10.1145/3482632.3482748

  108. Dovgan, E., Leskošek, B., Jurak, G., Starc, G., Sorić, M., Luštrek, M.: Enhancing BMI-based student clustering by considering fitness as key attribute. In: Proceedings of the International Conference on Discovery Science, pp. 155–165 (2019). https://doi.org/10.1007/978-3-030-33778-0_13

  109. Natilli, M., Monreale, A., Guidotti, R., Pappalardo, L.: Exploring students eating habits through individual profiling and clustering analysis. In: Proceedings of the MIDAS/PAP@PKDD/ECML 2018, pp. 156–171 (2018). https://doi.org/10.1007/978-3-030-13463-1_12

  110. Chu, Y., Yin, X.: Data analysis of college students’ mental health based on clustering analysis algorithm. Complexity. 2021 (2021). https://doi.org/10.1155/2021/9996146

  111. Li, Y., Liu, C., Zhao, X.: Research on the integration of college students’ mental health education and career planning based on feature fuzzy clustering. In: Proceedings of the 4th International Conference on Information Systems and Computer Aided Education, pp. 56–59 (2021). https://doi.org/10.1145/3482632.3482644

  112. Wang, C., Zha, Q.: Measuring systemic diversity of Chinese universities: a clustering-method approach. Qual. Quant. 52(3), 1331–1347 (2018). https://doi.org/10.1007/s11135-017-0524-5

  113. Nazaretsky, T., Hershkovitz, S., Alexandron, G.: Kappa learning: a new item-similarity method for clustering educational items from response data. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM), pp. 129–138 (2019)

  114. Huang, L., Wang, X., Wu, Z., Wang, F.: Feature selection for clustering online learners. In: Proceedings of the 8th International Conference on Educational Innovation Through Technology (EITT), pp. 1–6 (2019). https://doi.org/10.1109/EITT.2019.00009

  115. Liu, F.: Design and implementation of intelligent educational administration system using fuzzy clustering algorithm. Sci. Program. 2021 (2021). https://doi.org/10.1155/2021/9485654

  116. Rahmat, A.: Clustering in education. Eur. Res. Stud. J. 20(3) (2017)

  117. Ahmed, A., Zualkernan, I., Elghazaly, H.: Unsupervised clustering of skills for an online learning platform. In: Proceedings of the International Conference on Advanced Learning Technologies (ICALT), pp. 200–202 (2021). https://doi.org/10.1109/ICALT52272.2021.00066

  118. Pamungkas, A.A.P., Maryono, D., Budiyanto, C.W.: Cluster analysis for student grouping based on index of learning styles. J. Phys. Conf. Ser. 1808, 012023 (2021). https://doi.org/10.1088/1742-6596/1808/1/012023

  119. Du, H., Chen, S., Niu, H., Li, Y.: Application of DBSCAN clustering algorithm in evaluating students’ learning status. In: Proceedings of the 17th International Conference on Computational Intelligence and Security (CIS), pp. 372–376 (2021). https://doi.org/10.1109/CIS54983.2021.00084

  120. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Math., Stat., and Prob, p. 281 (1965). http://projecteuclid.org/euclid.bsmsp/1200512992

  121. Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory. 28(2), 129–137 (1982). https://doi.org/10.1109/TIT.1982.1056489

  122. Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall, Inc., Hoboken (1988). https://doi.org/10.1080/00401706.1990.10484648

  123. Pelleg, D., Moore, A.W., et al.: X-means: extending k-means with efficient estimation of the number of clusters. In: Proceedings of the International Conference on Machine Learning (ICML), vol. 1, pp. 727–734 (2000)

  124. Li, X., Zhang, Y., Cheng, H., Zhou, F., Yin, B.: An unsupervised ensemble clustering approach for the analysis of student behavioral patterns. IEEE Access. 9, 7076–7091 (2021). https://doi.org/10.1109/ACCESS.2021.3049157

  125. Zhang, T., Yin, C., Pan, L.: Improved clustering and association rules mining for university student course scores. In: Proceedings of the 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), pp. 1–6 (2017). https://doi.org/10.1109/ISKE.2017.8258808

  126. Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, New York (1990). https://doi.org/10.1002/9780470316801

  127. Schubert, E., Rousseeuw, P.J.: Fast and eager k-medoids clustering: O(k) runtime improvement of the PAM, CLARA, and CLARANS algorithms. Inf. Syst. 101, 101804 (2021). https://doi.org/10.1016/j.is.2021.101804

  128. Vasuki, M., Revathy, S.: Analyzing performance of placement students record using different clustering algorithm. Indian J. Comput. Sci. Eng. 13(2), 410–419 (2022). https://doi.org/10.21817/indjcse/2022/v13i2/221302083

  129. Furr, D.: Visualization and clustering of learner pathways in an interactive online learning environment. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM) (2019)

  130. Kausar, S., Huahu, X., Hussain, I., Wenhao, Z., Zahid, M.: Integration of data mining clustering approach in the personalized e-learning system. IEEE Access. 6, 72724–72734 (2018). https://doi.org/10.1109/ACCESS.2018.2882240

  131. Patel, S., Sihmar, S., Jatain, A.: A study of hierarchical clustering algorithms. In: Proceedings of the 2nd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 537–541 (2015)

  132. Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: an efficient data clustering method for very large databases. ACM SIGMOD Rec. 25(2), 103–114 (1996). https://doi.org/10.1145/235968.233324

  133. Ward Jr., J.H.: Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58(301), 236–244 (1963). https://doi.org/10.1080/01621459.1963.10500845

  134. Li, S., Chen, G., Xing, W., Zheng, J., Xie, C.: Longitudinal clustering of students’ self-regulated learning behaviors in engineering design. Comput. Educ. 153, 103899 (2020). https://doi.org/10.1016/j.compedu.2020.103899

  135. Zhang, T., Taub, M., Chen, Z.: A multi-level trace clustering analysis scheme for measuring students’ self-regulated learning behavior in a mastery-based online learning environment. In: Proceedings of the 12th International Learning Analytics and Knowledge Conference (LAK), pp. 197–207 (2022). https://doi.org/10.1145/3506860.3506887

  136. Dunn, J.C.: A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J. Cybernet. 3(3), 32–57 (1973). https://doi.org/10.1080/01969727308546046

  137. Zhang, P., Shen, Q.: Fuzzy c-means based coincidental link filtering in support of inferring social networks from spatiotemporal data streams. Soft. Comput. 22(21), 7015–7025 (2018). https://doi.org/10.1007/s00500-018-3363-y

  138. Tang, Q., Zhao, Y., Wei, Y., Jiang, L.: Research on the mental health of college students based on fuzzy clustering algorithm. Secur. Commun. Net. 2021 (2021). https://doi.org/10.1155/2021/3960559

  139. Amalia, N., et al.: Determination system of single tuition group using a combination of fuzzy c-means clustering and simple additive weighting methods. In: IOP Conference Series: Materials Science and Engineering, vol. 536, p. 012148 (2019). https://doi.org/10.1088/1757-899X/536/1/012148

  140. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B. 39(1), 1–22 (1977). https://doi.org/10.1111/j.2517-6161.1977.tb01600.x

  141. Jin, X., Han, J.: Expectation maximization clustering. In: Sammut, C., Webb, G.I. (eds.) Encyclopedia of Machine Learning. Springer US, Boston, MA (2010). https://doi.org/10.1007/978-0-387-30164-8_289

  142. Tan, P.N., Steinbach, M., Kumar, V.: Introduction to Data Mining. Pearson Education India (2016)

  143. Kohonen, T.: Self-organized formation of topologically correct feature maps. Biol. Cybern. 43(1), 59–69 (1982). https://doi.org/10.1007/BF00337288

  144. Bação, F., Lobo, V., Painho, M.: Self-organizing maps as substitutes for k-means clustering. In: Proceedings of the International Conference on Computational Science, pp. 476–483 (2005). https://doi.org/10.1007/11428862_65

  145. Natita, W., Wiboonsak, W., Dusadee, S.: Appropriate learning rate and neighborhood function of self-organizing map (SOM) for specific humidity pattern classification over southern Thailand. Int. J. Model. Optimiz. 6(1), 61 (2016). https://doi.org/10.7763/IJMO.2016.V6.504

  146. Melka, J., Mariage, J.J.: Efficient implementation of self-organizing map for sparse input data. In: International Joint Conference on Computational Intelligence (IJCCI), pp. 54–63 (2017). https://doi.org/10.5220/0006499500540063

  147. Delgado, S., Morán, F., San José, J.C., Burgos, D.: Analysis of students’ behavior through user clustering in online learning settings, based on self organizing maps neural networks. IEEE Access. 9, 132592–132608 (2021). https://doi.org/10.1109/ACCESS.2021.3115024

  148. Tasdemir, K., Merényi, E.: A validity index for prototype-based clustering of data sets with complex cluster structures. IEEE Trans. Syst. Man Cybern. B Cybern. 41(4), 1039–1053 (2011). https://doi.org/10.1109/TSMCB.2010.2104319

  149. Alias, U.F., Ahmad, N.B., Hasan, S.: Mining of e-learning behavior using SOM clustering. In: Proceedings of the 6th ICT International Student Project Conference (ICT-ISPC), pp. 1–4 (2017). https://doi.org/10.1109/ICT-ISPC.2017.8075350

  150. Bara, M.W., Ahmad, N.B., Modu, M.M., Ali, H.A.: Self-organizing map clustering method for the analysis of e-learning activities. In: Majan International Conference (MIC), pp. 1–5 (2018). https://doi.org/10.1109/MINTC.2018.8363155

  151. Ahmad, N.B., Alias, U.F., Mohamad, N., Yusof, N.: Principal component analysis and self-organizing map clustering for student browsing behaviour analysis. Procedia Comput. Sci. 163, 550–559 (2019). https://doi.org/10.1016/j.procs.2019.12.137

  152. Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: analysis and an algorithm. Adv. Neural Inf. Proces. Syst. 14 (2001)

  153. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000). https://doi.org/10.1109/34.868688

  154. Yan, D., Huang, L., Jordan, M.I.: Fast approximate spectral clustering. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 907–916 (2009)

  155. Ester, M., Kriegel, H.P., Sander, J., Xu, X., et al.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), vol. 96, pp. 226–231 (1996)

  156. Chhabra, A., Masalkovaite, K., Mohapatra, P.: An overview of fairness in clustering. IEEE Access. (2021). https://doi.org/10.1109/ACCESS.2021.3114099

  157. Žliobaitė, I.: Measuring discrimination in algorithmic decision making. Data Min. Knowl. Disc. 31(4), 1060–1089 (2017). https://doi.org/10.1007/s10618-017-0506-1

  158. Chierichetti, F., Kumar, R., Lattanzi, S., Vassilvitskii, S.: Fair clustering through fairlets. In: Neural Information Processing Systems, pp. 5036–5044 (2017)

  159. Ahmadian, S., Epasto, A., Kumar, R., Mahdian, M.: Clustering without over-representation. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 267–275 (2019). https://doi.org/10.1145/3292500.3330987

  160. Bera, S., Chakrabarty, D., Flores, N., Negahbani, M.: Fair algorithms for clustering. In: Proceedings of the Neural Information Processing Systems Conference (NIPS 2019), p. 32 (2019)

  161. Ghadiri, M., Samadi, S., Vempala, S.: Socially fair k-means clustering. In: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT), pp. 438–448 (2021). https://doi.org/10.1145/3442188.3445906

  162. Chakrabarti, D., Dickerson, J.P., Esmaeili, S.A., Srinivasan, A., Tsepenekas, L.: A new notion of individually fair clustering: 𝛼-equitable 𝑘-center. In: Proceedings of the International Conference on Artificial Intelligence and Statistics, pp. 6387–6408 (2022)

  163. Jones, M., Nguyen, H., Nguyen, T.: Fair k-centers via maximum matching. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 4940–4949 (2020)

  164. Schmidt, M., Schwiegelshohn, C., Sohler, C.: Fair coresets and streaming algorithms for fair k-means. In: Proceedings of the International Workshop on Approximation and Online Algorithms, pp. 232–251 (2019). https://doi.org/10.1007/978-3-030-39479-0_16

  165. Abraham, S.S., Padmanabhan, D., Sundaram, S.S.: Fairness in clustering with multiple sensitive attributes. In: EDBT/ICDT 2020 Joint Conference, pp. 287–298 (2020). https://doi.org/10.5441/002/edbt.2020.26

  166. Xia, X., Hui, Z., Chunming, Y., Xujian, Z., Bo, L.: Fairness constraint of fuzzy c-means clustering improves clustering fairness. In: Proceedings of the Asian Conference on Machine Learning (ACML), pp. 113–128 (2021)

  167. Ahmadian, S., et al.: Fair hierarchical clustering. Adv. Neural Inf. Proces. Syst. 33, 21050–21060 (2020)

  168. Kleindessner, M., Samadi, S., Awasthi, P., Morgenstern, J.: Guarantees for spectral clustering with fairness constraints. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 3458–3467 (2019)

  169. Battaglia, O.R., Di Paola, B., Fazio, C.: K-means clustering to study how student reasoning lines can be modified by a learning activity based on Feynman’s unifying approach. Eurasia J. Math. Sci. Technol. Educ. 13(6), 2005–2038 (2017). https://doi.org/10.12973/eurasia.2017.01211a

  170. Maylawati, D.S., Priatna, T., Sugilar, H., Ramdhani, M.A.: Data science for digital culture improvement in higher education using k-means clustering and text analytics. Int. J. Electr. Comput. Eng. 10(5), 4569–4580 (2020). https://doi.org/10.11591/ijece.v10i5.pp4569-4580

  171. Šarić-Grgić, I., Grubišić, A., Šerić, L., Robinson, T.J.: Student clustering based on learning behavior data in the intelligent tutoring system. Int. J. Dist. Educ. Technol. 18(2), 73–89 (2020). https://doi.org/10.4018/IJDET.2020040105

  172. Talebinamvar, M., Zarrabi, F.: Clustering students’ writing behaviors using keystroke logging: a learning analytic approach in EFL writing. Lang. Test. Asia. 12(1), 1–20 (2022). https://doi.org/10.1186/s40468-021-00150-5

  173. Kurniawan, C., Setyosari, P., Kamdi, W., Ulfa, S.: Electrical engineering student learning preferences modelled using k-means clustering. Global J. Eng. Educ. 20(2), 140–145 (2018)

  174. Rijati, N., Sumpeno, S., Purnomo, M.H.: Multi-attribute clustering of student’s entrepreneurial potential mapping based on its characteristics and the affecting factors: preliminary study on Indonesian higher education database. In: Proceedings of the 10th International Conference on Computer and Automation Engineering, pp. 11–16 (2018). https://doi.org/10.1145/3192975.3193014

  175. Mishler, A., Nugent, R.: Clustering students and inferring skill set profiles with skill hierarchies. In: Proceedings of the 11th International Conference on Educational Data Mining (EDM) (2018)

  176. Mojarad, S., Essa, A., Mojarad, S., Baker, R.S.: Data-driven learner profiling based on clustering student behaviors: learning consistency, pace and effort. In: Proceedings of the International Conference on Intelligent Tutoring Systems, pp. 130–139 (2018). https://doi.org/10.1007/978-3-319-91464-0_13

  177. Davies, D.L., Bouldin, D.W.: A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. 2, 224–227 (1979). https://doi.org/10.1109/TPAMI.1979.4766909

  178. Dunn, J.C.: Well-separated clusters and optimal fuzzy partitions. J. Cybernet. 4(1), 95–104 (1974). https://doi.org/10.1080/01969727408546059

  179. Tempelaar, D., Rienties, B., Mittelmeier, J., Nguyen, Q.: Student profiling in a dispositional learning analytics application using formative assessment. Comput. Hum. Behav. 78, 408–420 (2018). https://doi.org/10.1016/j.chb.2017.08.010

  180. Švábenský, V., Vykopal, J., Čeleda, P., Tkáčik, K., Popovič, D.: Student assessment in cybersecurity training automated by pattern mining and clustering. Educ. Inf. Technol. 1–32 (2022). https://doi.org/10.1007/s10639-022-10954-4

  181. Bradley, P.S., Bennett, K.P., Demiriz, A.: Constrained K-Means Clustering, p. 20. Microsoft Research, Redmond (2000)

  182. Mulvey, J.M., Beck, M.P.: Solving capacitated clustering problems. Eur. J. Oper. Res. 18(3), 339–348 (1984). https://doi.org/10.1016/0377-2217(84)90155-3

  183. Moshkovitz, M., Dasgupta, S., Rashtchian, C., Frost, N.: Explainable k-means and k-medians clustering. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 7055–7065 (2020)

  184. Bandyapadhyay, S., Fomin, F., Golovach, P.A., Lochet, W., Purohit, N., Simonov, K.: How to find a good explanation for clustering? In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 3904–3912 (2022). https://doi.org/10.1609/aaai.v36i4.20306

  185. Wang, D.-Y., Lin, S.S., Sun, C.-T.: DIANA: a computer-supported heterogeneous grouping system for teachers to conduct successful small learning groups. Comput. Hum. Behav. 23(4), 1997–2010 (2007). https://doi.org/10.1016/j.chb.2006.02.008

  186. Watson, S.B., Marshall, J.E.: Heterogeneous grouping as an element of cooperative learning in an elementary education science course. Sch. Sci. Math. 95(8), 401–405 (1995). https://doi.org/10.1111/j.1949-8594.1995.tb10192.x

  187. Flanagan, B., Majumdar, R., Ogata, H.: Fine grain synthetic educational data: challenges and limitations of collaborative learning analytics. IEEE Access. 10, 26230–26241 (2022). https://doi.org/10.1109/ACCESS.2022.3156073

  188. Vie, J.-J., Rigaux, T., Minn, S.: Privacy-preserving synthetic educational data generation. In: Proceedings of the EC-TEL 2022 (2022)

  189. Backurs, A., Indyk, P., Onak, K., Schieber, B., Vakilian, A., Wagner, T.: Scalable fair clustering. In: Proceedings of the International Conference on Machine Learning (ICML), pp. 405–413 (2019)

  190. Fahad, A., et al.: A survey of clustering algorithms for big data: taxonomy and empirical analysis. IEEE Trans. Emerg. Top. Comput. 2(3), 267–279 (2014). https://doi.org/10.1109/TETC.2014.2330519

  191. Assent, I.: Clustering high dimensional data. Wires Data Mining Know. Discov. 2(4), 340–350 (2012). https://doi.org/10.1002/widm.1062

  192. Le Quy, T., Nguyen, T.H., Friege, G., Ntoutsi, E.: Evaluation of group fairness measures in student performance prediction problems. In: Machine Learning and Principles and Practice of Knowledge Discovery in Databases: International Workshops of ECML PKDD 2022, pp. 119–136 (2022). https://doi.org/10.1007/978-3-031-23618-1_8

  193. Rihák, J., Pelánek, R.: Measuring similarity of educational items using data on learners’ performance. In: Proceedings of the 10th International Conference on Educational Data Mining (EDM), pp. 16–23 (2017)

  194. Ninrutsirikun, U., Watanapa, B., Arpnikanondt, C., Watananukoon, V.: A unified framework for student cluster grouping with learning preference associative detection for enhancing students’ learning outcomes in computer programming courses. In: Proceedings of 2018 Global Wireless Summit (GWS), pp. 266–271 (2018). https://doi.org/10.1109/GWS.2018.8686665

  195. Phanniphong, K., Nuankaew, P., Teeraputon, D., Nuankaew, W., Boontonglek, M., Bussaman, S.: Clustering of learners performance based on learning outcomes for finding significant courses. In: Proceedings of the Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT-NCON), pp. 192–196 (2019). https://doi.org/10.1109/ECTI-NCON.2019.8692263

  196. Wang, X., Zhang, Y., Yang, Y., Liu, K., Gao, B.: Research on relevance analysis and clustering algorithms in college students’ academic performance. In: Proceedings of the 10th International Conference on Information Technology in Medicine and Education (ITME), pp. 730–733 (2019). https://doi.org/10.1109/ITME.2019.00167

  197. Chaves, V.E.J., García-Torres, M., Alonso, D.B., Gómez-Vela, F., Divina, F., Vázquez-Noguera, J.L.: Analysis of student achievement scores via cluster analysis. In: Proceedings of the International Conference on European Transnational Education, pp. 399–408 (2020). https://doi.org/10.1007/978-3-030-57799-5_41

  198. Kosztyán, Z.T., Orbán-Mihálykó, É., Mihálykó, C., Csányi, V.V., Telcs, A.: Analyzing and clustering students’ application preferences in higher education. J. Appl. Stat. 47(16), 2961–2983 (2020). https://doi.org/10.1080/02664763.2019.1709052

  199. Pradana, C., Kusumawardani, S., Permanasari, A.: Comparison clustering performance based on moodle log mining. IOP Conf. Ser. Mater. Sci. Eng. 722, 012012 (2020). https://doi.org/10.1088/1757-899X/722/1/012012

  200. Tang, P., Wang, Y., Shen, N.: Prediction of college students’ physical fitness based on k-means clustering and SVR. Comput. Syst. Sci. Eng. 35(4), 237–246 (2020). https://doi.org/10.32604/csse.2020.35.237

  201. Rijati, N., Purwitasari, D., Sumpeno, S., Purnomo, M.: A decision making and clustering method integration based on the theory of planned behavior for student entrepreneurial potential mapping in Indonesia. Int. J. Intell. Eng. Syst. 13(4), 129–144 (2020). https://doi.org/10.22266/ijies2020.0831.12

  202. Chi, D.: Research on the application of k-means clustering algorithm in student achievement. In: Proceedings of the IEEE International Conference on Consumer Electronics and Computer Engineering (ICCECE), pp. 435–438 (2021). https://doi.org/10.1109/ICCECE51280.2021.9342164

  203. Li, G., Alfred, R., Wang, X.: Student behavior analysis and research model based on clustering technology. Mob. Inf. Syst. 2021 (2021). https://doi.org/10.1155/2021/9163517

  204. Putra, A.A.N.K., Nasucha, M., Hermawan, H.: K-means clustering algorithm in web-based applications for grouping data on scholarship selection results. In: Proceedings of the International Symposium on Electronics and Smart Devices (ISESD), pp. 1–6 (2021). https://doi.org/10.1109/ISESD53023.2021.9501716

  205. Susanto, R., Husen, M.N., Lajis, A., Lestari, W., Hasanah, H.: Clustering of student perceptions on developing a physics laboratory based on information technology and local wisdom. In: Proceedings of the 8th International Conference on Information Technology, Computer and Electrical Engineering (ICITACEE), pp. 68–73 (2021). https://doi.org/10.1109/ICITACEE53184.2021.9617483

  206. Rauthan, A., et al.: Impact on higher education in pandemic: analysis k-means clustering using urban & rural areas. In: Proceedings of the 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), pp. 1974–1980 (2021). https://doi.org/10.1109/ICAC3N53548.2021.9725709

  207. Wang, Q.: Application of the intra cluster, characteristic of k-means clustering method in English score analysis in colleges. J. Phys. Conf. Ser. 1941, 012001 (2021). https://doi.org/10.1088/1742-6596/1941/1/012001

  208. Cheng, W., Shwe, T.: Clustering analysis of student learning outcomes based on education data. In: 2019 IEEE Frontiers in Education Conference (FIE), pp. 1–7 (2019). https://doi.org/10.1109/FIE43999.2019.9028400

  209. Singelmann, L., Alvarez, E., Swartz, E., Pearson, M., Striker, R., Ewert, D.: Innovators, learners, and surveyors: clustering students in an innovation-based learning course. In: IEEE Frontiers in Education Conference (FIE), pp. 1–9 (2020). https://doi.org/10.1109/FIE44824.2020.9274235

  210. Popov, A., Ovsyankin, A., Emomaliev, M., Satsuk, M.: Application of the clustering algorithm in an automated training system. J. Phys. Conf. Ser. 1691, 012120 (2020). https://doi.org/10.1088/1742-6596/1691/1/012120

  211. Supianto, A.A., et al.: Improvements of fuzzy c-means clustering performance using particle swarm optimization on student grouping based on learning activity in a digital learning media. In: Proceedings of the 5th International Conference on Sustainable Information Engineering and Technology, pp. 239–243 (2020). https://doi.org/10.1145/3427423.3427449

  212. Yadav, R.S.: Application of hybrid clustering methods for student performance evaluation. Int. J. Inf. Technol. 12(3), 749–756 (2020). https://doi.org/10.1007/s41870-018-0192-2

  213. Parvathavarthini, S., Sharvanthika, K., Jagadeesh, M., Kishore, B.: Analysis of student performance in e-learning environment using crow search based fuzzy clustering. In: Proceedings of the 2nd International Conference on Smart Electronics and Communication (ICOSEC), pp. 1784–1787 (2021). https://doi.org/10.1109/ICOSEC51865.2021.9591920

  214. Premalatha, N., Sujatha, S.: Prediction of students’ employability using clustering algorithm: a hybrid approach. Int. J. Model. Simul. Sci. Comput. 2250049 (2022). https://doi.org/10.1142/S1793962322500490

  215. Waluyo, E., Djeni, D., Pratama, L., Anggraini, V.: Clustering based on sociometry in Pythagoras theorem. J. Phys. Conf. Ser. 1211, 012058 (2019). https://doi.org/10.1088/1742-6596/1211/1/012058

  216. Purbasari, I., Puspaningrum, E., Putra, A.: Using self-organizing map (SOM) for clustering and visualization of new students based on grades. J. Phys. Conf. Ser. 1569, 022037 (2020). https://doi.org/10.1088/1742-6596/1569/2/022037

  217. Rakhmawati, N.A., Faiz, N., Hafidz, I., Raditya, I., Dinatha, P., Suwignyo, A.: Clustering student Instagram accounts using author-topic model. Int. J. Bus. Intell. Data Min. 19(1), 70–79 (2021). https://doi.org/10.1504/IJBIDM.2021.115954

  218. Yan, Q., Su, Z.: Evaluation of college students’ English performance considering Roche multiway tree clustering. Int. J. Electric. Eng. Educ. (2021). https://doi.org/10.1177/00207209211004207

Acknowledgments

The work of the first author is supported by the Ministry of Science and Culture of Lower Saxony, Germany, within the PhD program “LernMINT: Data-assisted teaching in the MINT subjects.”

Author information

Corresponding author

Correspondence to Tai Le Quy.

Appendix

Table 2.8 Clustering models used in EDS

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Le Quy, T., Friege, G., Ntoutsi, E. (2023). A Review of Clustering Models in Educational Data Science Toward Fairness-Aware Learning. In: Peña-Ayala, A. (eds) Educational Data Science: Essentials, Approaches, and Tendencies. Big Data Management. Springer, Singapore. https://doi.org/10.1007/978-981-99-0026-8_2

  • DOI: https://doi.org/10.1007/978-981-99-0026-8_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-0025-1

  • Online ISBN: 978-981-99-0026-8

  • eBook Packages: Computer Science, Computer Science (R0)
