
Predicting academic performance using tree-based machine learning models: A case study of bachelor students in an engineering department in China

Education and Information Technologies

Abstract

Educational data mining (EDM) provides valuable educational information by applying data mining tools and techniques to data collected at educational institutions. In this paper, tree-based machine learning algorithms are used to predict students’ overall academic performance in their bachelor’s program. Transcript data were collected for students in a single department of a Chinese university. All courses in the bachelor’s program were then divided into six typical categories, and the mean GPA of each category was taken as a primary input feature for prediction. Three tree-based machine learning models were established: decision tree (DT), gradient boosting decision tree (GBDT), and random forest (RF). Results show that, at the end of the second semester, the RF model successfully identifies more than 80% of the students at risk of low performance. This is meaningful because the department can use the model's predictions to take timely, targeted measures and so improve its overall quality of teaching and learning. Feature importance and the structure of the decision tree were also analyzed to extract knowledge that is valuable for both students and teachers. The results of this case study can be used as a reference for other engineering departments in China.
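As a rough illustration of the feature construction and the "more than 80% identified" measure described above, the sketch below computes the mean grade point per course category for one student and the recall over at-risk students. It is not the authors' code: the category names, the course-to-category mapping, and the function names `category_mean_gpas` and `recall_at_risk` are hypothetical, since the abstract does not list the six categories or the labelling rule.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical course-to-category mapping. The paper uses six categories,
# but their names are not given in the abstract, so these are placeholders.
COURSE_CATEGORY = {
    "Calculus I": "math",
    "Linear Algebra": "math",
    "College English": "language",
    "Engineering Mechanics": "core",
}


def category_mean_gpas(transcript, course_category=COURSE_CATEGORY):
    """Return {category: mean grade point} for one student's transcript.

    transcript: list of (course_name, grade_point) pairs.
    Courses missing from the mapping are skipped.
    """
    by_category = defaultdict(list)
    for course, grade_point in transcript:
        category = course_category.get(course)
        if category is not None:
            by_category[category].append(grade_point)
    return {cat: mean(points) for cat, points in by_category.items()}


def recall_at_risk(y_true, y_pred):
    """Fraction of truly at-risk students (label 1) that the model flagged."""
    flagged = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not flagged:
        raise ValueError("no at-risk students in y_true")
    return sum(flagged) / len(flagged)
```

Feature dictionaries of this kind could then be turned into rows of a feature matrix and passed to tree-based classifiers, for example scikit-learn's `DecisionTreeClassifier`, `GradientBoostingClassifier`, and `RandomForestClassifier`. Whether the authors used that library is not stated in the abstract.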

Acknowledgments

This work is funded by the 2021 Project of Higher Education Teaching Quality and Teaching Reform sponsored by the Department of Education of Guangdong Province (Grant No. 2021-8.2).

Author information

Corresponding author

Correspondence to Wei Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, W., Wang, Y. & Wang, S. Predicting academic performance using tree-based machine learning models: A case study of bachelor students in an engineering department in China. Educ Inf Technol 27, 13051–13066 (2022). https://doi.org/10.1007/s10639-022-11170-w

  • DOI: https://doi.org/10.1007/s10639-022-11170-w
