An intelligent college English level 4 pass rate forecasting model using machine learning

Chen, Yu

doi:10.1007/s00500-023-09221-6

Data analytics and machine learning
Published: 25 September 2023

Soft Computing, Volume 27, pages 17585–17601 (2023)

Yu Chen¹


Abstract

Over the last few decades, technological advances have had a considerable impact on education, as they have on many other areas of society. Cutting-edge technology has many applications in education, and researchers have recently focused on using artificial intelligence (AI) and machine learning (ML) to predict student performance before an exam. Forecasting student success helps all participants in the educational process; for students, it can inform course selection and academic scheduling. With this in mind, this paper predicts the college English level 4 pass rate using random forest (RF), a powerful ML algorithm. RF is based on the decision tree technique and combines many classifiers (an “ensemble”) rather than relying on a single one. We constructed input and output variables and collected data for them, including basic student information (gender, ethnicity, major), English scores on college admission exams, college English scores (total over four semesters), and students’ extracurricular activities. The collected data were preprocessed by removing unnecessary attributes, handling outliers, normalizing, and cleaning. After preprocessing, features were extracted and reduced in dimensionality with the locality preserving projection (LPP) algorithm. From the extracted features, only the most relevant were selected as input to the RF model. The model was implemented in MATLAB, and its effectiveness was verified experimentally.
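The ensemble idea described above can be illustrated with a short sketch. The Python code below is not the paper's MATLAB implementation; it is a minimal illustration under simplifying assumptions, in which each "tree" is a one-split decision stump trained on a bootstrap sample and a random feature subset, and the forest predicts by majority vote. The function names (`fit_stump`, `fit_forest`, `predict`), the two toy features, and all data values are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature_ids):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            left_label = majority(left)
            right_label = majority(right) if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]  # (feature, threshold, left_label, right_label)


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # common heuristic: sqrt(#features) per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps in the forest."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Hypothetical toy data: (admission English score, four-semester English total).
rows = [(40, 210), (45, 220), (50, 230), (55, 235), (48, 225),
        (70, 280), (75, 300), (80, 310), (85, 320), (78, 305)]
labels = ["fail"] * 5 + ["pass"] * 5

forest = fit_forest(rows, labels)
print(predict(forest, (30, 200)), predict(forest, (95, 340)))
```

A real random forest grows full decision trees rather than stumps; the stump keeps the sketch short while still showing bootstrap sampling, random feature selection, and voting.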
The RF-based college English level 4 pass rate prediction model is evaluated by computing the prediction accuracy, recall rate, and hit rate of its classification results. It achieved a prediction accuracy of 96.5%, a recall rate of 89.5%, and a hit rate of 93.3%. These results indicate that the RF-based prediction model for college English level 4 classifies well and that its predictions are accurate.
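The three reported measures can be computed from confusion-matrix counts. The Python sketch below assumes that "pass" is the positive class and that the paper's "hit rate" corresponds to precision; the abstract does not define the term, so that mapping is an assumption. The labels are invented for illustration and are not the paper's data.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="pass"):
    """Count true/false positives and negatives for a binary task."""
    counts = Counter()
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive="pass"):
    c = confusion_counts(y_true, y_pred, positive)
    total = c["tp"] + c["fp"] + c["fn"] + c["tn"]
    return {
        # share of all students classified correctly
        "accuracy": safe_div(c["tp"] + c["tn"], total),
        # share of actual passes that the model found
        "recall": safe_div(c["tp"], c["tp"] + c["fn"]),
        # share of predicted passes that were real (assumed meaning of "hit rate")
        "hit_rate": safe_div(c["tp"], c["tp"] + c["fp"]),
    }


# Hypothetical labels: 50 students who passed, then 50 who failed.
y_true = ["pass"] * 50 + ["fail"] * 50
y_pred = ["pass"] * 45 + ["fail"] * 5 + ["pass"] * 15 + ["fail"] * 35

scores = evaluate(y_true, y_pred)
print(scores)
```

Reporting recall and precision alongside accuracy matters when passes and fails are imbalanced, because accuracy alone can look high for a model that rarely predicts the minority class.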

Data availability

Not applicable.

Funding

No funding was provided for the completion of this study.

Author information

Authors and Affiliations

Xinyang Vocational and Technical College, Xinyang, 464000, Henan, China
Yu Chen

Corresponding author

Correspondence to Yu Chen.

Ethics declarations

Conflict of interest

The authors have no financial or proprietary interests in any material discussed in this article. The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, Y. An intelligent college English level 4 pass rate forecasting model using machine learning. Soft Comput 27, 17585–17601 (2023). https://doi.org/10.1007/s00500-023-09221-6

Accepted: 09 September 2023
Published: 25 September 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s00500-023-09221-6
