Peer assessment using soft computing techniques

Abstract

In this paper, we applied a peer assessment scenario at the Technical University of Manabí (Ecuador). Students and professors evaluated assignments using rubrics, assigned a numerical score, and provided textual feedback justifying that score, so that inaccuracies between both assessments could be detected. The proposed model uses soft computing techniques to reduce the professor's workload in the correction process. Experiments were carried out with a data set in the Spanish language. We applied a supervised machine learning approach to obtain a sentiment score for each piece of textual feedback, and a fuzzy logic approach to detect inaccuracies between the numerical and sentiment scores and to obtain the assessment score. The results showed that the support vector machine model performed better, at low computational cost, when the feedback was represented as a vector of 1-grams and 2-grams weighted by term frequency-inverse document frequency; moreover, the validity of the grader's critical judgment was inferred from the similarity between the numerical and sentiment scores. Overall, the outcomes indicate that the model is reliable and guarantees a fair peer assessment procedure.
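To make the feature representation concrete, the following is a minimal sketch of a unigram and bigram vector weighted by term frequency-inverse document frequency. It is not the authors' implementation: the function names, the whitespace tokenizer, the unsmoothed `ln(N / df)` variant of the inverse document frequency, and the two feedback strings are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_values=(1, 2)):
    """Return the 1-gram and 2-gram features of a token list as strings."""
    feats = []
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


def tfidf_vectors(documents):
    """Weight each n-gram by term frequency times inverse document frequency.

    Assumption: idf = ln(N / df), one common textbook variant. Libraries
    usually add smoothing, so their absolute values will differ.
    """
    feature_lists = [ngrams(doc.lower().split()) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter()
    for feats in feature_lists:
        doc_freq.update(set(feats))  # count each n-gram once per document
    vectors = []
    for feats in feature_lists:
        counts = Counter(feats)
        total = len(feats)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical feedback snippets, not taken from the paper's data set.
feedback = [
    "los actores no están especificados",
    "los actores están especificados",
]
vectors = tfidf_vectors(feedback)
```

In this toy example, n-grams shared by both snippets receive zero weight, while the negation and the bigrams around it carry the signal that a classifier such as a support vector machine would then use.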

Notes

  1. https://gitlab.com/peerassessment/soft_computing

Author information

Corresponding author

Correspondence to Maricela Pinargote-Ortega.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A. Rubric

Rubrics were designed, considering the following parameters:

  • Activity objective.

  • Determination of criteria and features.

  • Definition of the level of resolution. A Likert scale was used; the first three levels indicate that the criterion has been fully or mostly met, and the last two levels that it has been met little or not at all:

(5) highly adequate, (4) fairly adequate, (3) adequate, (2) not very adequate, and (1) not at all adequate.

  • Numerical score (level of resolution) and textual feedback for each criterion.

Table 11 Example of holistic type rubric of activity-2 (exercises of use case diagram)

Table 11 shows a holistic type rubric description containing four criteria.

Appendix B. Examples of labeled Spanish language feedbacks

Table 12 Examples of labeled Spanish language feedbacks from activity-2 (exercises of use case diagram)

Table 12 shows examples of Spanish language feedbacks labeled according to the rules of step 4 (data labeling) in section three. Feedback F1 was labeled (−1, negative) because it contained the phrase “not very adequate (poco adecuada).” Feedback F2 was labeled (1, positive) because it contained “are specified (están especificados).” Feedbacks F3 and F4 were labeled (−1, negative) because they contained the word “not (no).”
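The cue-based reasoning in these examples can be sketched as a small rule function. This is only an illustration: the full rule set of step 4 is not reproduced here, the cue lists contain only the three phrases quoted above, and the function names, the whole-word matching, and the choice to check negative cues first are assumptions of this sketch.

```python
import re

# Cue phrases quoted in the paragraph above. The complete labeling rules
# are not available here, so these lists are illustrative only.
NEGATIVE_CUES = ("poco adecuada", "no")
POSITIVE_CUES = ("están especificados",)


def contains_cue(text, cue):
    """True when the cue appears as whole words in the text."""
    pattern = r"(?<!\w)" + re.escape(cue) + r"(?!\w)"
    return re.search(pattern, text.lower()) is not None


def label_feedback(text):
    """Return -1 (negative), 1 (positive), or None when no cue matches.

    Assumption: negative cues take priority over positive ones.
    """
    if any(contains_cue(text, cue) for cue in NEGATIVE_CUES):
        return -1
    if any(contains_cue(text, cue) for cue in POSITIVE_CUES):
        return 1
    return None
```

Matching on whole words keeps a word such as "notable" from triggering the "no" cue; a feedback with no known cue is left unlabeled rather than guessed.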

Appendix C. Data set for model training

Table 13 Some examples of the data set (D1) in Spanish language for model training

Table 13 shows examples from the data set (D1) in the Spanish language used for model training with machine and deep learning. Each record contains the activity code, grader code, evaluated code, criterion, numerical score, feedback, and sentiment polarity labeled by the annotator.
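As a rough illustration of the fields listed above, one row of the data set could be modeled as follows. The class name, field names, and types are hypothetical; the 1 to 5 range follows the rubric levels in Appendix A, and restricting the polarity to −1 or 1 is an assumption based on the labels shown in Appendix B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackRecord:
    """One row of the training data set (field names are hypothetical)."""
    activity_code: str
    grader_code: str
    evaluated_code: str
    criterion: str
    numerical_score: int     # rubric level, assumed to be 1..5
    feedback: str            # textual feedback in Spanish
    sentiment_polarity: int  # annotator label, assumed to be -1 or 1

    def __post_init__(self):
        # Reject values outside the assumed ranges.
        if not 1 <= self.numerical_score <= 5:
            raise ValueError("numerical_score must be between 1 and 5")
        if self.sentiment_polarity not in (-1, 1):
            raise ValueError("sentiment_polarity must be -1 or 1")
```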

Appendix D. Stop-Words in Spanish language

The Spanish stop-words were stored in a text file, separated by commas with no spaces between the words; some examples are listed below:

el,la,las,los.

yo,tu,ella,ellas,ellos,usted,ustedes.

nosotros,nosotras,vosotros,vosotras.

nuestro,nuestra,nuestros,nuestras,vuestro,vuestra,vuestros,vuestras.

me,ti,te,nos,le,se,les.

mí,mis,tu,su,sus.

mío,mía,míos,mías,tuyo,tuya,tuyos,tuyas,suyo,suya,suyos,suyas.

quien,quienes,cuyo,cuya,cuyos,cuyas,cuanto,cuanta,cuantos,cuantas.

este,esto,esta,estos,estas,ese,eso,esa,esos,esas,aquel,aquella,aquello,aquellas,aquellos.

a,ante,cabe,con,contra,de,desde,en,entre,hacia,hasta,para,por,según,sobre,tras.

uno,una,unos,unas.
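A file in this format could be read and applied roughly as shown below. This is a sketch under assumptions: the function names are hypothetical, and the trailing period on each line of the listing is treated as punctuation rather than as part of a word.

```python
def parse_stop_words(text):
    """Parse comma-separated stop-words, one group per line."""
    words = set()
    for line in text.splitlines():
        # Assumption: a trailing period closes the group and is not a word.
        line = line.strip().rstrip(".")
        if line:
            words.update(word for word in line.split(",") if word)
    return words


def remove_stop_words(tokens, stop_words):
    """Drop tokens whose lowercase form is a stop-word."""
    return [token for token in tokens if token.lower() not in stop_words]


# Two groups copied from the listing above.
sample = "el,la,las,los.\nuno,una,unos,unas.\n"
stop_words = parse_stop_words(sample)
```

For instance, `remove_stop_words("Los actores de una clase".split(), stop_words)` keeps "actores", "de", and "clase", because "de" is not part of the two sample groups.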

Appendix E. Parameters of algorithms

Table 14 Machine learning algorithms parameter settings

Table 14 shows the parameter settings of the machine learning algorithms; the best settings are highlighted in light blue.

Table 15 Deep learning algorithms parameter settings

Table 15 shows the parameter settings of the deep learning algorithms.

Table 16 Bi-LSTM architecture

Table 16 shows the architecture applied in the Bi-LSTM algorithm.

Appendix F. Performance results of algorithms

Table 17 and Table 18 show the performance results of the machine learning algorithms using the parameter settings of model-1 and model-2 (see Table 14).

Table 17 Performance of machine learning algorithms with model-1 parameters settings
Table 18 Performance of machine learning algorithms with model-2 parameters settings

Appendix G. Data set for detecting inaccuracies and obtaining the assessment score

Table 19 Some examples of the data set (D1) in Spanish language with the numerical score and the sentiment score generated by the predictive model from the textual feedback, used to detect inaccuracies and compute the assessment score

Table 19 shows examples of the data set (D1) in the Spanish language used to detect inaccuracies and compute the assessment score through fuzzy logic.
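The idea of comparing a numerical score with a sentiment score through fuzzy rules can be sketched as follows. The membership functions, the two rules, and the score ranges (rubric level 1 to 5, sentiment score −1 to 1) are assumptions made for this illustration and are not the rule base used in the paper; the sketch only covers the detection of inaccuracies, not the computation of the assessment score.

```python
def low(x):
    """Membership in 'low' on [0, 1]: 1 at 0, falling linearly to 0 at 1."""
    return max(0.0, min(1.0, 1.0 - x))


def high(x):
    """Membership in 'high' on [0, 1]: 0 at 0, rising linearly to 1 at 1."""
    return max(0.0, min(1.0, x))


def inaccuracy_degree(numerical_score, sentiment_score):
    """Degree in [0, 1] to which the two scores contradict each other.

    Hypothetical rules, with min as fuzzy AND and max as fuzzy OR:
      A: numerical score is high AND sentiment is low  -> inaccurate
      B: numerical score is low  AND sentiment is high -> inaccurate
    """
    num = (numerical_score - 1) / 4    # rubric level 1..5 -> [0, 1]
    sent = (sentiment_score + 1) / 2   # sentiment -1..1   -> [0, 1]
    rule_a = min(high(num), low(sent))
    rule_b = min(low(num), high(sent))
    return max(rule_a, rule_b)
```

With these assumed memberships, a top rubric level paired with fully negative feedback yields a degree of 1, while matching extremes yield 0.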

About this article

Cite this article

Pinargote-Ortega, M., Bowen-Mendoza, L., Meza, J. et al. Peer assessment using soft computing techniques. J Comput High Educ 33, 684–726 (2021). https://doi.org/10.1007/s12528-021-09296-w

Keywords

  • Peer assessment
  • Supervised machine learning
  • Natural language processing
  • Sentiment analysis
  • Fuzzy logic
  • Higher education