1 Introduction

Since higher education plays an essential role in the development of a society (Pinheiro et al. 2015), increasing student success is a long-term goal for academic institutions. In order to increase students’ success rate, it is vital to understand and define academic success. The definition of academic success is rather complex and wide-ranging; therefore, it is frequently misused within educational research. However, the study of York et al. (2015) suggests a theoretically grounded definition of academic success that is made up of six components: (1) academic achievement, which is nearly entirely measured with course grades and grade point average (GPA), (2) satisfaction, which is often captured either by course evaluation or institutional surveys, (3) persistence, which is measured by retention between particular years of college and degree attainment rates, (4) acquisition of skills and competencies, which can be measured by assignments and course evaluations, (5) attainment of learning objectives, which can also be measured by assignments and course evaluations, and finally (6) career success, which can be determined by job attainment rates, promotion histories, career satisfaction and professional goal attainment.

A second crucial requirement for maximizing students’ success is the identification of the factors that affect academic performance. Awareness of students’ success factors could assist in achieving the highest level of quality education (Yassein et al. 2017). It can potentially help in providing a clear and strong description of the types of knowledge and behaviour that are associated with adequate performance. Such awareness can be gained by applying data mining (DM) methods to educational records. The practice of applying DM methods to educational data is known as Educational Data Mining (EDM) (Baker and Yacef 2009). It draws on a variety of domains, including DM and machine learning, psychometrics and other areas of statistics, information visualization, and computational modelling (Romero and Ventura 2007). Generally, EDM refers to techniques and tools designed for automatically extracting useful information and patterns from huge data repositories related to people’s learning activities in an educational environment (Nithya et al. 2016). Those tools employ machine learning algorithms, database systems, statistical analysis, and artificial intelligence. The DM techniques include regression, clustering, classification, association, and prediction.

Figure 1 by Romero and Ventura (2007) shows the application of DM in educational systems, and we use it to position the approaches analyzed in this survey within the EDM landscape. The use of DM in educational systems is represented as an iterative cycle of hypothesis formation, testing, and refinement, in which the systems can be directed to support the specific needs of every participant in the educational process. EDM offers an array of DM techniques to (1) support relationship analysis, classification, and clustering, (2) elaborate educational hypotheses, and (3) provide learning assistance (Baker and Yacef 2009).

Fig. 1 The cycle of applying data mining in educational systems (Romero and Ventura 2007)

The scope of this paper lies within the DM phase in the EDM cycle. There is an extensive variety of methods within DM. However, as this survey covers predictions in EDM, we address only supervised DM methods (also known as predictive or directive). Supervised techniques such as classification and regression predict the value of the output variables based on the inputs. To do this, a model is developed from training data where the values of input and output are previously labelled. The model generalizes the relationship between the inputs and outputs and uses it to predict other datasets where only inputs are known (Witten et al. 2017).
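
The train-then-predict workflow described above can be illustrated with a minimal sketch. It uses a nearest-centroid classifier, chosen here only for brevity and not taken from the surveyed studies; the two input features (previous GPA and attendance rate), the labels, and all values are invented for illustration.

```python
from statistics import mean


def fit_centroids(inputs, labels):
    """Learn one mean vector (centroid) per class from labelled training data."""
    by_class = {}
    for x, y in zip(inputs, labels):
        by_class.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*rows)] for y, rows in by_class.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist(centroids[y]))


# Hypothetical labelled training data: (previous GPA, attendance rate) -> outcome.
train_X = [(3.6, 0.95), (3.2, 0.90), (2.1, 0.60), (1.8, 0.40)]
train_y = ["pass", "pass", "fail", "fail"]

model = fit_centroids(train_X, train_y)

# New students for whom only the inputs are known.
new_students = [(3.4, 0.85), (2.0, 0.50)]
predictions = [predict(model, x) for x in new_students]
```

The model is built once from records whose outcome is known and is then applied to records where only the inputs are available, which is the defining trait of the supervised setting.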

Having introduced the EDM concept and the supervised techniques, we organize the rest of the paper as follows. We start by giving an overview of existing EDM surveys, along with the research questions and research methodology. Afterwards, we provide an outline of the task of prediction in EDM. This task involves essential decisions regarding the DM tool to be used in performing the predictions, the prediction methods and techniques, and the selection of features (or predictors). We then continue with a detailed literature review on predicting students’ academic achievement over the past 12 years. Lastly, we present a summary of the main results and conclusions, as well as future lines of research.

2 Overview

The EDM literature is rapidly growing and needs to be brought up to date frequently to take new studies into account. Romero and Ventura (2007) conducted a literature survey on EDM covering work published between 1995 and 2005. They reviewed different types of educational systems (traditional classrooms and distance learning) and how EDM can be applied to them. They also described the DM techniques that have been applied in educational systems. Another EDM survey, conducted by Peña-Ayala (2014), analyzes EDM studies published between 2010 and 2013. It provides an analysis of the weaknesses, strengths, threats, and opportunities of EDM. Dutt et al. (2017) performed a systematic literature review covering research between 1983 and 2016 on clustering algorithms only, which are unsupervised techniques. Unsupervised techniques uncover hidden patterns in unlabeled data; there are no output variables to predict. In their study, they examined the applicability and usability of clustering algorithms in the context of EDM. They concluded that the key benefit of clustering algorithms is that they provide a fairly explicit schema of students’ learning styles when a number of attributes (e.g., time spent to complete tasks, students’ behavior in class, and students’ motivation towards learning) are used to cluster. Kumar et al. (2017) conducted a literature survey covering publications between 2007 and 2016 on students’ performance prediction. They report the DM methods used and their accuracies, but not the DM tools used. They found that GPA and students’ internal marks are important attributes for predicting academic achievement. Our literature survey covers different and more recently published papers (2007–2018) and focuses on prediction tasks in EDM using supervised methods.
In this section of the paper, we outline our research questions and the research methodology used to collect relevant literature.

2.1 Research Questions

The research questions proposed in this literature survey are as follows:

  • What are the measurable aspects of the prediction of student academic achievement in higher education?

  • Which are the best DM methods to predict students’ academic achievement in higher education?

2.2 Research Methodology

To answer our research questions, we followed a quantitative approach by collecting information from 22 individual published studies regarding predictions performed in higher education institutes and universities. We displayed the collected information using tables to allow random access and to simplify comparison between the different data.

2.2.1 Search Strategy

Three databases were used to search for and filter the papers relevant to our investigation: SpringerLink, ScienceDirect, and ACM Digital Library. We searched different types of publications, including journal articles as well as conference and workshop proceedings. Our search strings were written using Boolean operators such as AND and OR to produce more relevant results, e.g., (student success) AND (factors OR features OR attributes OR characteristics OR aspects) AND (educational data mining) AND (prediction OR estimation). We also hand-searched journals to identify papers that might not yet have been included in electronic databases or indexed. Hand-searching involves the page-by-page inspection of relevant conference proceedings or journal issues to find studies relevant to our research.
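
As a small illustration of how such a search string is composed, the sketch below joins synonyms with OR inside each group and joins the groups with AND. The helper `build_query` is hypothetical, and each database has its own query syntax, so the resulting string may need adapting before use.

```python
def build_query(term_groups):
    """Join synonyms with OR inside a group, then join the groups with AND."""
    return " AND ".join("(" + " OR ".join(group) + ")" for group in term_groups)


# Term groups taken from the example search string in the text.
term_groups = [
    ["student success"],
    ["factors", "features", "attributes", "characteristics", "aspects"],
    ["educational data mining"],
    ["prediction", "estimation"],
]

query = build_query(term_groups)
print(query)
```

Keeping the synonyms in a list makes it easy to widen or narrow one concept without rewriting the whole string.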

2.2.2 Inclusion Criteria

The inclusion criteria to determine relevant literature are as follows:

  • Studies that have been conducted between 2007 and 2018.

  • Studies that reported the used data mining method for performing the prediction.

  • Studies that reported the features used for performing the prediction.

2.2.3 Exclusion Criteria

The following principles were used to exclude the literature that was not relevant for this research:

  • Studies that focused on unsupervised analysis.

  • Studies that did not include analysis or prediction of academic success.

  • Studies that were not performed at the higher education level, e.g., studies of elementary or secondary schools.
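
The inclusion and exclusion criteria above can be read as a single screening predicate. The following sketch is only one possible encoding: the `Study` record, its field names, and the candidate entries are hypothetical and do not correspond to actual papers in the survey.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # Hypothetical screening record; field names are illustrative only.
    title: str
    year: int
    reports_method: bool
    reports_features: bool
    supervised: bool
    predicts_academic_success: bool
    higher_education: bool


def is_relevant(s: Study) -> bool:
    """Apply the three inclusion criteria, then the three exclusion criteria."""
    included = 2007 <= s.year <= 2018 and s.reports_method and s.reports_features
    excluded = (
        not s.supervised
        or not s.predicts_academic_success
        or not s.higher_education
    )
    return included and not excluded


candidates = [
    Study("A", 2013, True, True, True, True, True),
    Study("B", 2005, True, True, True, True, True),   # outside 2007-2018
    Study("C", 2016, True, True, False, True, True),  # unsupervised analysis only
    Study("D", 2015, True, False, True, True, True),  # features not reported
]
selected = [s.title for s in candidates if is_relevant(s)]
```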

3 Overview of Features, Algorithms and Software Used for Prediction in EDM

Being able to predict a student’s academic success serves as an essential research topic in many academic disciplines due to its benefits to both teaching and learning activities (Holland and Nichols 1964). Performing predictions in educational environments generally includes three steps: (1) selecting the right features (predictors), (2) selecting the right method or technique, and (3) selecting the proper data mining software. As is the case with most supervised machine learning tasks, these choices are essential for the reliability of the findings, for reducing the likelihood of errors, and ultimately for the overall performance of the solution. In the following, we give a short account of the features, methods and tools that we have frequently encountered in our survey, in the DM phase of the EDM process.

3.1 Overview of Features Used in EDM

Many features have been researched with the scope of determining their ability to predict students’ academic performance. Based on the studies that have been reviewed in our research investigation, these features can be classified into three general categories:

  • Demographical features, which as the title suggests, include gender, age, marital status, background, income, occupation, mobility (transportation), disability, parents’ education level, etc.

  • Pre-enrollment features, which are related to students’ achievements before their enrollment such as their GPA from previous studies, previous major (field of study), previous institute, language proficiency, grades earned in prerequisite courses, as well as pre-enrollment exams, e.g., Scholastic Aptitude Test (SAT) and Graduate Record Examinations (GRE), etc.

  • Post-enrollment features, which are related to students’ attitude after enrollment and during the course such as attendance, assignments, scores earned in quizzes and final exams, lab work, writing notes in class, boredom level during lectures, the number of credit hours per semester, etc.

We provide details on the commonality as well as performance of these features as revealed by our literature survey in Sect. 4.1.

3.2 Overview of Data Mining Methods Used in EDM

In EDM, multiple prediction methods have been researched. Since there is no definite answer to the question of which is the best DM method as every method has its advantages and limitations, most researchers often explore two or more techniques to reveal which method generates the best accuracy in their specific case and adopt it. Table 1 summarizes the advantages and weaknesses of the most common DM methods used today in predicting students’ academic achievement. We provide details on the usage of these algorithms in EDM in Sect. 4.2.

Table 1 Pros and cons of the most common data mining methods

3.3 Overview of Data Mining Software Used in EDM

In recent years, a number of tools have been developed for conducting DM research. According to Slater et al. (2017), the 7 tools presented in Table 2 offer algorithms that can be used to model and predict processes and relationships in educational data. They are all well documented and cross-platform, running on Microsoft Windows, Linux, and macOS. We return to the use of these tools in the context of EDM in Sect. 4.3.

Table 2 Pros and cons of the different data mining tools and packages

4 Comprehensive Review of Academic Achievement Prediction Literature

In this section, we survey the different types of predictions that have been performed in higher education institutions. We summarize 22 recent studies regarding the prediction of academic achievement in higher education. These studies have been conducted in different countries around the world between the year 2007 and 2018. We also demonstrate the most significant findings out of the literature study by discussing the outcomes of the previous works.

4.1 Features Used in Predicting Students’ Academic Achievement

Figure 2 shows the prevalence of the most commonly used features for predicting academic achievement in higher education, as encountered in the studied literature. As can be seen, gender and the GPA are used in more than half of the studies: 14 (63%) and 12 (54%) respectively. Their frequencies are followed by those of age (40%) and language proficiency (31%). The other features such as income, nationality, marital status, employment status and attendance are each used in less than 30% of the publications.
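
The quoted shares follow from the study counts over the 22 surveyed studies. The short calculation below assumes the figures were truncated rather than rounded, since that is what reproduces 63% and 54%; only the two counts stated in the text are used.

```python
TOTAL_STUDIES = 22

# Counts stated in the text for the two most common features.
feature_counts = {"gender": 14, "GPA": 12}

shares = {name: 100 * count / TOTAL_STUDIES for name, count in feature_counts.items()}
truncated = {name: int(share) for name, share in shares.items()}

for name in feature_counts:
    print(f"{name}: {shares[name]:.1f}% (truncated: {truncated[name]}%)")
```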

Fig. 2 Mostly used features in predicting students’ academic achievement between 2007 and 2018

In the following, we analyze these features in more detail.

4.1.1 Demographics

4.1.1.1 Gender

As Fig. 2 illustrates, it can be concluded that gender has been used the most compared to other demographics in predicting academic achievement. This should come as no surprise since the relationship between gender and academic achievement of students has been discussed for decades (Eitle 2005), resulting in a substantial body of literature. In the literature that this study targets, while some of the researchers found no significant gender difference between students (Goni et al. 2015), others found a significant difference with either the male (Chang 2008) or the female (Simsek and Balaban 2010) performing better, based on the specific subject. Unfortunately, none of these aforementioned studies also attempted the prediction task. Regarding the studies that actually used gender for predicting academic outcomes, we identify 14 studies. However, only 2 studies (Kovačić 2010; Osmanbegović et al. 2012) reported its impact on the overall prediction. Both concluded that gender does not have a significant impact on this task.

4.1.1.2 Age

The second most frequently used demographical feature for predicting academic success is age. A potential explanation for the prevalence of this feature is that many earlier researchers found a positive relationship between age and performance (Sturman 2003; Watkins and Hattie 1985). These studies explain the positive correlation between age and academic achievement by suggesting that older students are more highly motivated, more experienced, and possess more effective study habits. Unfortunately, most of the studies that we target do not report the individual impact of this feature. An exception is the study by Kovačić (2010), who found that age does not have a significant impact on predicting academic success.

4.1.1.3 Marital Status

The relationship between marital status and the academic achievement of students has also been widely discussed in the literature, specifically in 18% of the studies we surveyed. Yess (2009) investigated the influence of marital status on the scholastic achievement of 240 community college students in the US. The results revealed that it was a significant predictor of achievement. Another study, by Ma and Wooster (2009), investigated how the marital status of college students can affect their academic performance using a sample of 374 students. Their investigation revealed that married students had higher grades than unmarried students.

4.1.1.4 Other Demographic Features

There are also other demographics, such as income, that have often been used as predictors (Daud et al. 2017; Nghe et al. 2007; Pal and Pal 2013; Ali et al. 2013; Villwock et al. 2015; Yadav and Pal 2012). Among them, Ali et al. (2013) examined the individual features affecting the academic performance of graduate students, including students’ socioeconomic status. Using a sample of 100 randomly selected students, they found that income contributes significantly to students’ success. Students’ employment status has also been used several times as a predictor of academic achievement (Daud et al. 2017; Kovačić 2010; Nghe et al. 2007; Mohamadian et al. 2015). Among them, the study of Mohamadian et al. (2015) investigates the relationship between employment status and academic achievement using a sample of 235 students. Their results showed that unemployed students had significantly higher academic achievement than employed students. They believe that working students devote less time to study and, as a consequence, achieve less success.

We conclude that although the demographical features are heavily used, the extent to which they are useful in the academic achievement prediction task is not yet clear: multiple studies either do not report the individual contribution of the features or reach opposing conclusions, particularly with respect to gender and age. Since previous researchers claim that gender, age, marital status, etc. have an effect on students’ success, the latest EDM research tends to use them as features for predicting academic success, so far with unclear results. In our analysis, it has become apparent that the use of demographical features, as well as their choice, might be strongly influenced by the cultural background of the countries where the study is held. For instance, when the study is performed in a collectivistic country (e.g., India and Malaysia), we observe features related to the family of the student, such as family support (Sembiring et al. 2011), family income (Yadav and Pal 2012; Pal and Pal 2013; Villwock et al. 2015; Abu Saa 2016; Daud et al. 2017), family size (Yadav and Pal 2012), and parents’ qualifications (Abu Saa 2016). This was not the case in studies performed in individualistic countries (e.g., the United States and Europe). While individualistic cultures tend to focus on personal achievement, collectivist cultures prioritize family and team goals over individual requirements (Kim 1995). This might mean that students from individualistic cultures could be more competitive than those from collectivistic cultures. This observation deserves further investigation in order to improve our understanding of the cultural impact on academic performance.

4.1.2 Pre-enrollment Features

4.1.2.1 GPA

With respect to using students’ previous qualifications for predicting academic achievement, the most used feature is GPA (Nghe et al. 2007; Kovačić 2010; Osmanbegović et al. 2012; Huang and Fang 2013; Pal and Pal 2013; Kabakchieva 2013; Abu Saa 2016). In fact, Ibrahim and Rusli (2007) found that GPA is the most significant feature (with an 87% correlation) for predicting students’ success compared to some demographical and pre-enrollment features.

4.1.2.2 Academic Language Skills

Academic language skills have also been used frequently as a feature to predict student achievement (Nghe et al. 2007; Abu Saa 2016; Badr et al. 2016; Asif et al. 2017). Academic language is the language used in textbooks, spoken in classrooms, and presented on tests and examinations. While some researchers (Arsad et al. 2014) found that academic language skills do not affect students’ success in “knowledge courses” or “non-linguistic courses”, other researchers (Wait and Gressel 2009) found a significant relationship and concluded that students who are proficient in the teaching language will be much better equipped to acquire new knowledge through reading and listening, and will also be better at expressing their ideas in oral discussions and oral exams. Most of the reviewed literature did not report the significance of academic language skills as a predictor. However, Badr et al. (2016) reported that they achieved better accuracy (67.33%) when their prediction model depended only on language skills and no other feature.

To conclude, we believe that using pre-enrollment features to predict students’ academic achievement is significant, especially if the prediction is to be performed at a very early stage (i.e., before the start of the program), as there are no other measurable features available at that point in time. As for choosing the predictor features, although the literature evaluating the impact of the individual features is scarce, the few studies that do so agree that both GPA and academic language skills have a positive impact on the prediction.

4.1.3 Post-enrollment Features

4.1.3.1 Grades

When it comes to using students’ post-enrollment features in predicting academic achievement, the grades earned in quizzes and examinations have been used most often (Al luhaybi et al. 2018; Aulck et al. 2017; Badr et al. 2016; Huang and Fang 2013; Kemper 2018; Pradeep and Thomas 2015; Shakeel and Anwer Butt 2015; Villwock et al. 2015; Yadav et al. 2011; Yassein et al. 2017). In the study of Huang and Fang (2013), the grade earned in a mid-term exam was found to be the most important feature affecting prediction accuracy.

4.1.3.2 Results in Previous Semester

The success rate of the previous semester, which was usually measured by GPA, has also been used often (Nghe et al. 2007; Kabakchieva 2013; Alemu Yehuala 2015; Abu Saa 2016; Asif et al. 2017; Al luhaybi et al. 2018; Kemper 2018) in the studies we have reviewed. This is because students’ success is highly dependent on previously acquired knowledge or skills. Asif et al. (2017) found that the marks of the first- and second-year courses of a four-year program play a role in predicting the graduation performance in a program. Likewise, Al luhaybi et al. (2018) found that the results of the core modules of the first year of the academic program have a high impact on the prediction of students at high risk of failure.

4.1.3.3 Attendance

A number of studies have used attendance in predicting academic achievement (Al luhaybi et al. 2018; Pradeep and Thomas 2015; Yadav et al. 2011; Yassein et al. 2017) as increased attendance could be seen as a direct indicator of students’ success. Lukkarinen et al. (2016) investigated the relationship between university students’ class attendance and learning performance by using data from a course in a university in which attendance to classes was not mandatory. They found that attendance is positively and significantly related to students’ performance. Another study by Alija (2013) used binary logistic regression to study the relationship between attendance and students’ achievement. They found that students who regularly attend the lectures have more chances to receive passing grades.
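
To make the binary logistic regression setting concrete, the sketch below fits a one-feature model of passing as a function of attendance rate using plain batch gradient descent. It is not the procedure or data of Alija (2013): the attendance values, outcomes, learning rate, and number of epochs are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit p(pass) = sigmoid(w * x + b) by batch gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: attendance rate (0..1) and outcome (1 = pass, 0 = fail).
attendance = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
passed = [0, 0, 1, 0, 1, 0, 1, 1]

w, b = fit_logistic(attendance, passed)
p_low = sigmoid(w * 0.3 + b)   # estimated pass probability at 30% attendance
p_high = sigmoid(w * 0.9 + b)  # estimated pass probability at 90% attendance
```

A positive fitted weight `w` corresponds to the reported finding that regular attendance increases the chance of a passing grade; on real data the sign and size of the weight would have to be estimated and tested for significance.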

4.1.3.4 Other Post-enrollment Features

Balancing the academic load is vital to academic achievement. It is measured in terms of credit hours and course difficulty (Szafran and Austin 2002). Therefore, the choice of registered courses (Alemu Yehuala 2015; Aulck et al. 2017) and the total credit hours per semester (Alemu Yehuala 2015; Abu Saa 2016) have been used as predictors for academic success. In fact, Alemu Yehuala (2015) found that the number of credit hours is one of the most significant attributes for predicting academic achievement.

We conclude that using post-enrollment features for predicting students’ academic achievement can contribute to maximizing the accuracy of the prediction. This is due to the fact that such features represent students’ current situation in the program rather than depending on their previous condition only.

4.2 Mostly Used Data Mining Methods in Predicting Students’ Achievement

In the reviewed studies, most researchers explored several methods to predict students’ success and did not rely on the results of just one method. They often compared the results of each method to determine the best-fit technique for the specific dataset and consequently ensure the highest accuracy rates when deploying the model.
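
This compare-and-select practice can be sketched as follows. The two classifiers (a majority-class baseline and a 1-nearest-neighbour classifier) are stand-ins picked for brevity, and the data are invented; a real comparison would use the methods discussed below and a proper validation scheme such as cross-validation.

```python
from collections import Counter


def majority_fit(X, y):
    """Baseline: always predict the most frequent training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label


def one_nn_fit(X, y):
    """1-nearest neighbour: predict the label of the closest training point."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in X]
        return y[dists.index(min(dists))]
    return predict


def accuracy(predict, X, y):
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


# Hypothetical data: (previous GPA, attendance rate) -> outcome.
train_X = [(3.6, 0.95), (3.2, 0.90), (3.0, 0.80), (2.1, 0.60), (1.8, 0.40)]
train_y = ["pass", "pass", "pass", "fail", "fail"]
test_X = [(3.4, 0.85), (2.0, 0.50), (2.9, 0.75), (1.5, 0.30)]
test_y = ["pass", "fail", "pass", "fail"]

methods = {"majority baseline": majority_fit, "1-nearest neighbour": one_nn_fit}
scores = {
    name: accuracy(fit(train_X, train_y), test_X, test_y)
    for name, fit in methods.items()
}
best_method = max(scores, key=scores.get)
```

All candidates are trained on the same training split and scored on the same held-out split, so the accuracies are directly comparable.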

As seen in Fig. 3, the most used DM methods in the covered studies are decision trees. Due to their usability and efficiency, they have become one of the most effective and popular methods in machine learning since their introduction in the 1960s (Song and Lu 2015). Knowledge models under this paradigm can be directly transformed into a set of IF–THEN rules. CHAID, CART, C4.5, and ID3 (Jain et al. 2017) are all decision tree algorithms. However, C4.5 (J48 in Weka) appears to be more popular than the rest of the decision tree algorithms. It has been used in 15 studies, with accuracies ranging between 0.364 and 0.945. It was assessed to be the best scoring method in five studies (Alemu Yehuala 2015; Kabakchieva 2013; Nghe et al. 2007; Yadav and Pal 2012) and the second best scoring method in two studies (Osmanbegović et al. 2012; Abu Saa 2016). CART has been used in 5 of the reviewed studies, with accuracies ranging between 0.40 and 0.622. It was found to be the best scoring method in three cases (Kovačić 2010; Yadav et al. 2011; Abu Saa 2016). ID3 was applied in 4 studies and was assessed to be the best method in the study of Pal and Pal (2013) with 0.78 accuracy, and the worst method in the study of Abu Saa (2016) with 0.333 accuracy. ADT was used in 2 studies only. In the first study (Pal and Pal 2013), it produced 0.6950 accuracy, while in the second (Pradeep and Thomas 2015), it obtained an accuracy of 0.995 and was assessed as the best scoring method. CHAID was also used in two studies only. It produced an accuracy of 0.594 in the first study (Kovačić 2010) and an accuracy of 0.341 in the second (Abu Saa 2016).
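
The remark that a decision tree can be directly transformed into IF–THEN rules can be illustrated by walking a tree from the root to each leaf. The nested-dictionary tree format, the function `tree_to_rules`, and the features and labels below are hypothetical and are not taken from any of the surveyed models.

```python
def tree_to_rules(node, conditions=()):
    """Walk a decision tree and return one IF-THEN rule per leaf."""
    if not isinstance(node, dict):  # a leaf holds a class label
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN class = {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['feature']} = {value}"
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules


# Hypothetical tree; features and labels are invented for illustration.
tree = {
    "feature": "attendance",
    "branches": {
        "high": "pass",
        "low": {
            "feature": "midterm",
            "branches": {"passed": "pass", "failed": "fail"},
        },
    },
}

rules = tree_to_rules(tree)
for rule in rules:
    print(rule)
```

Each root-to-leaf path becomes one rule whose premise is the conjunction of the tests along that path, which is what makes tree models comparatively easy to read.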

Fig. 3 Mostly used data mining methods in educational predictions between 2007 and 2018

Rule-based classifiers such as JRip, NNge, OneR, and Ridor (Lakshmi 2012) have also been used several times by researchers. They often produced good results, with an accuracy of 0.545 (Kabakchieva 2013) in the worst case and 0.982 (Pradeep and Thomas 2015) in the best. Naïve Bayes also produced good results, with accuracy above 0.75 in most cases.
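
Of the rule-based classifiers named above, OneR is simple enough to sketch in a few lines: for every feature it builds a rule that maps each feature value to its majority class, then keeps the feature whose rule makes the fewest training errors. The version below is a simplified illustration (categorical features only, no tie-breaking policy) and not Weka’s implementation; the data are invented.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Return (feature, rule, errors) for the feature with the fewest training errors."""
    best = None
    features = [k for k in rows[0] if k != target]
    for feature in features:
        # Count class labels per feature value.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[feature]][row[target]] += 1
        # Map each value to its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(row[target] != rule[row[feature]] for row in rows)
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    return best


# Hypothetical training records.
rows = [
    {"gender": "f", "attendance": "high", "result": "pass"},
    {"gender": "m", "attendance": "high", "result": "pass"},
    {"gender": "f", "attendance": "low", "result": "fail"},
    {"gender": "m", "attendance": "low", "result": "fail"},
    {"gender": "f", "attendance": "high", "result": "pass"},
    {"gender": "m", "attendance": "low", "result": "pass"},
]

feature, rule, errors = one_r(rows, "result")
```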

Even though sophisticated techniques, like neural networks or support vector machines, may outperform logistic regression and decision trees regarding prediction accuracy (HoYu et al. 2010), they are deemed to be less suitable for EDM purposes. Knowledge models obtained under these paradigms are considered to be black-box mechanisms, i.e., they can achieve high accuracy rates but can be difficult for people to comprehend.

4.3 Mostly Used Data Mining Tools in Predicting Students’ Achievement

Based on the studies we have viewed in this paper, the open-source Weka tool appears to be the most widely used DM tool for predicting academic results (see Fig. 4). It is intended for machine learning and DM and was developed at the University of Waikato in New Zealand. Weka supports several standard DM tasks like data clustering, classification, regression, pre-processing, visualization and feature selection. Weka has become popular with academic researchers in recent years due to its highly active community.

Fig. 4 Mostly used data mining tools between 2007 and 2018

SPSS and RapidMiner have also been used by EDM researchers quite often in comparison to the rest of the DM tools. The advantage of the IBM SPSS tool is that it offers the user much control and enables them to develop predictive models quickly using business expertise (Brahmeswara Kadaru and Umamaheswararao 2017). Likewise, RapidMiner, formerly called “Yale”, has many benefits, including multiple deployment options based on the user’s preferences.

After overviewing the features, methods and tools used in the target literature, we now analyze in detail the same literature, by shifting the focus to the sub-tasks that comprise the academic achievement task.

4.4 Per-Task Analysis

From the viewed literature, the prediction of students’ performance in higher education can be broadly classified into three areas based on the prediction of (1) academic performance or GPA at a degree level, (2) failure or drop out of a degree, and (3) academic performance at a course level. In this section of the paper, we present the viewed literature using bullet points and tables. The bullet points show only the studies which have reported the significance of certain features on the prediction, whereas the tables show all the viewed studies, including the prediction task, where the study was held, the features used for the prediction, the DM tool, the DM method, and the accuracy of the prediction.

4.4.1 Prediction of Students’ Academic Performance or GPA at a Degree Level

One of the best-known standards for assessing the quality of universities is based on the excellence records of their students’ academic outcomes. A primary application area of prediction in EDM is predicting students’ GPA or overall academic performance, e.g., excellent, very good, good, etc. This type of prediction is useful in different contexts in universities, for instance, identifying excellent students for allocating scholarships. The following studies have reported the impact of the success features on the prediction of students’ academic performance or GPA at a degree level:

  • Sembiring et al. (2011) sampled 300 students to predict the final grade of students from the faculty of computer systems and software engineering. They used innovative features that were not visible in the rest of the studies. The significance of each feature was tested using multi-variant analysis methods. They found that family support had the most impact (52.6% contribution) on the prediction, followed by engaging time, then study behaviour, and finally study interest. On the other hand, students’ personal beliefs did not have any impact.

  • Kabakchieva (2013) used a dataset of 10,330 students to predict their performance using 5 classes (Bad, average, good, very good and excellent). They found out that the classifiers perform differently for the five classes. Another finding is that the post-enrollment features related to students’ university admission score, and numbers of failures at the first year exams are among the most influencing features in the classification.

  • Abu Saa (2016) collected data from 270 students using a survey distributed in daily classes and online with the aim of predicting students’ performance in an IT Department. They found that students’ performance is not totally dependent on post-enrollment features, such as their academic efforts; on the contrary, many other features have an equal or more significant influence. These include demographical features, such as gender and mother’s occupation, as well as pre-enrollment features, such as high school grade and university fees discount.

  • Asif et al. (2017) predicted students’ performance using a sample of 210 undergraduate students. The features they used to perform the prediction are marks only. The results of their study showed that it is possible to predict the graduation performance in a four-year university program using only pre-university marks and marks of first and second-year courses with a reasonable accuracy.

Table 3 summarizes the studies we analyzed with respect to the features, tools, and methods used and the accuracy achieved.

Table 3 Prediction of students’ academic performance or GPA at a degree level

4.4.2 Prediction of Students’ Failure or Drop Out of a Degree

Student failure or dropout is a significant concern in the education and policy-making communities (Demetriou and Schmitz-Sciborski 2011). High dropout rates and poor academic performance are among the most common issues that affect the reputation of an educational institution. The negative consequences of dropping out of the educational system are considerable, both for the individuals and for the teaching institutions. Preventing dropout therefore poses a significant challenge to institutions of higher education. One way to prevent it is to identify at-risk students at an early stage. The following studies predict students’ failure or dropout from a degree:

  • Pradeep and Thomas (2015) predicted bachelor student dropout using the records of 150 students enrolled in a Technology program. Interestingly, the number of features was reduced from 67 to the best 13 using the attribute selection algorithms provided in the WEKA tool. The selected features were mostly post-enrollment features, such as attendance, taking notes in class, and scores in some courses. Features such as age, gender, and religion were discarded because they had no effect on the overall prediction.

  • Alemu Yehuala (2015) used 11,873 student records to predict university students who are at risk of failure. They found that the six main features determining the failure or success of students are: number of students in a class, number of courses given in a semester, higher education, entrance certificate, examination result of a student, and gender.

  • Villwock et al. (2015) investigated the factors that may influence students’ decision to drop out of a Mathematics major. They observed that the courses contributing to dropout differ from year to year. Considering only the subjects taken in the first year, the course that contributed most to dropout was “Differential and Integral Calculus I”; considering the first two years, it was “Finite Mathematics”. They also concluded that the work factor is the feature that contributed most to the decision to drop out, which they attribute to working students having little time to devote to extracurricular study. Marital status and age also contributed to the decision to drop out.

  • Daud et al. (2017) used 776 student instances to predict the completion or dropout of students from multiple universities in Pakistan. The experiment used 23 features chosen by a feature extraction process. They concluded that the most influential features for predicting students’ performance are students’ natural gas expenditure, electricity expenditure, self-employment, and location.

  • Aulck et al. (2017) used a dataset of over 32,500 students to predict student dropout in an Electrical Engineering department. Examining individual features revealed that the strongest predictors of dropout are GPA in math, English, chemistry, and psychology courses, year of enrollment, and birth year.
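
Several of the studies above reduce a large feature set to a smaller one before prediction (for example, from 67 features to 13). As a rough illustration of how such a reduction can work, the sketch below ranks categorical features by information gain and keeps the top k. Information gain is only one common ranking criterion and is assumed here for illustration; the studies above do not necessarily use it. The records, feature names, and function names are invented.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy of the target minus its expected entropy after splitting on `feature`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def top_k_features(rows, features, target, k):
    """Rank features by information gain and keep the k highest-scoring ones."""
    ranked = sorted(features, key=lambda f: information_gain(rows, f, target), reverse=True)
    return ranked[:k]


# Invented toy records (assumption): four students, three candidate features.
rows = [
    {"attendance": "low", "takes_notes": "no", "gender": "f", "dropout": "yes"},
    {"attendance": "low", "takes_notes": "no", "gender": "m", "dropout": "yes"},
    {"attendance": "high", "takes_notes": "no", "gender": "f", "dropout": "no"},
    {"attendance": "high", "takes_notes": "yes", "gender": "m", "dropout": "no"},
]

selected = top_k_features(rows, ["attendance", "takes_notes", "gender"], "dropout", k=2)
print(selected)
```

In this toy dataset gender carries no information about the outcome and is therefore dropped, which mirrors the way uninformative demographic features can be discarded by a selection step.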

Table 4 summarizes the studies we analyzed regarding the prediction of students’ failure or dropout from a degree, with respect to the features, tools, and methods used and the accuracy of the prediction.

Table 4 Prediction of students’ failure or drop out of a degree

4.4.3 Prediction of Students’ Results on Particular Courses

Predicting a student’s achievement at course level can help instructors understand how well the students in their classes are performing and, as a result, take proactive measures to improve students’ learning experience. For instance, if the prediction shows that some of the students in the class are “at risk of failing the course”, educators may consider specific measures to help those students perform better in that course, for example by adopting a variety of active and cooperative learning strategies. The following studies predict students’ results on particular courses:

  • Kovačić (2010) collected data from 453 students to predict their performance in an “Information Systems” course. They investigated whether successful and unsuccessful students can be distinguished in terms of demographic features (such as gender, age, ethnicity, and disability) or study environment (such as course program, faculty, or course block). Their results suggest that the information gathered during the enrolment process (demographics, secondary school, working status, and early enrolment) is not sufficient to distinguish accurately between successful and unsuccessful students.

  • Osmanbegović et al. (2012) used a dataset of 257 student records to predict their performance in a “Business Informatics” course. They analyzed the importance of each feature individually. The analysis revealed that GPA has the greatest impact on the output, followed by the entrance exam, the study material, and the average weekly hours dedicated to studying. The number of household members, the distance of residence from the faculty, and gender had the smallest impact.

  • Huang and Fang (2013) used the data of 323 undergraduate students to predict their performance in a “Dynamics” course. One of their interesting findings is that the grades students earn in pre-requisite courses might not truly reflect their knowledge of those topics. The reason is that students may have taken the pre-requisite courses years earlier, and by the time they take the dependent course, their knowledge of the pre-requisite topics may have improved.

  • Badr et al. (2016) used 203 students’ records to predict their performance in a “Programming” course. They analyzed the relationship between the programming course and the other courses and found that only the English courses have a direct effect on the prediction.

  • Al luhaybi et al. (2018) collected data from 129 students to predict the students at high risk of failure in four computer science core modules. The predicted class feature is the “Overall Grade”, which is the final grade obtained by the student in the targeted module. The overall grade has five possible values (A: Excellent, B: Very Good, C: Good, D: Acceptable, and F: Fail), which were merged into three classes (Low risk, Medium risk, and High risk of failure) to improve the classification results. A significant finding of their study was that students’ qualifications on program entry have a high impact on their academic performance. They also found that some of the final grades in previous modules influence students’ academic results in the current modules.
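
Merging a fine-grained grade scale into a few coarser classes, as in the last study above, is a simple relabelling step that gives each class more training examples. The sketch below shows the idea. The exact grouping of the five grades into the three risk levels is not given in the text above, so the mapping used here is an assumption made purely for illustration, as are the sample grades and the function name `merge_grades`.

```python
from collections import Counter

# ASSUMED grouping (illustrative only): the reviewed text says five grades were
# merged into three risk levels but does not state which grade maps to which level.
GRADE_TO_RISK = {
    "A": "Low risk",
    "B": "Low risk",
    "C": "Medium risk",
    "D": "Medium risk",
    "F": "High risk",
}


def merge_grades(grades):
    """Replace each letter grade with its (assumed) risk level."""
    return [GRADE_TO_RISK[g] for g in grades]


# Invented sample of final grades (assumption).
grades = ["A", "B", "B", "C", "D", "F", "F", "C"]
risk = merge_grades(grades)

print(Counter(grades))  # five sparsely populated classes
print(Counter(risk))    # three better-populated classes
```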

Table 5 summarizes the studies we analyzed regarding the prediction of students’ results on particular courses, with respect to the features, tools, and methods used and the accuracy of the prediction.

Table 5 Prediction of students’ results on particular courses

As can be seen from Tables 3, 4 and 5, no particular features, tools or methods are tied to particular tasks; rather, the same methodologies are used across the three tasks.

5 Current Trends and Future Work

While this survey covers academic prediction studies performed between 2007 and 2018, more recent studies were published in 2019 and 2020. Many of these approaches (Adekitan and Salau 2019; Berens et al. 2019; Bhutto et al. 2020) still rely on traditional machine learning methods, such as SVM, decision trees, logistic regression, and Naïve Bayes. However, some new data mining methods have also been explored, such as Structural Equation Modeling (Nabizadeh et al. 2019) and probabilistic neural networks (Adekitan and Salau 2019). Although deep neural networks have seen growing popularity in the machine learning community, particularly in applications to natural language processing, they have not yet been adopted in the EDM literature. This is probably due to their need for very large training datasets, which are difficult to source in educational contexts.

With respect to the type of features used to perform the predictions, demographic features (Berens et al. 2019; Bhutto et al. 2020; Nabizadeh et al. 2019), previous GPA (Adekitan and Salau 2019; Berens et al. 2019; Bhutto et al. 2020; Nabizadeh et al. 2019), and students’ satisfaction and interaction with the system (Bhutto et al. 2020) are still commonly used. However, some new features are also being investigated, such as cognitive and metacognitive learning strategies (Nabizadeh et al. 2019). These strategies include students’ effort in rehearsal, elaboration, organization, critical thinking, and metacognitive self-regulation. Interestingly, using such features shows promising results. This confirms that the prediction of students’ performance is still a very actively researched problem whose current solutions can be improved, and that the factors that most influence academic outcomes, and hence can be used to predict future performance, are still not widely understood.

With respect to opportunities for further research in the domain of EDM, we identify two critical gaps in the previous literature. The first relates to the investigation of the relation between personality and academic achievement. In light of recent personality measures such as IPIP-NEO (Goldberg et al. 2006), we see an opportunity to use such measures for academic performance prediction. However, the complexity of collecting such data is an important challenge to overcome. The second gap relates to student self-assessment. None of the reviewed studies reports an analysis of the relationship between self-assessment and actual performance. We expect that including such measures has the potential to further improve the accuracy of academic performance prediction.

6 Summary and Conclusion

Educational data mining is an area that holds exciting opportunities for researchers and practitioners all around the world. The field helps improve institutional effectiveness by supporting decision making and enhancing student learning to reach visible and measurable targets. This paper provides a rich literature review on the prediction of academic achievement in higher education over the past 12 years, with the final aim of providing researchers and educational planners with information to assist them when implementing an EDM solution.

This paper revealed that a considerable amount of work has been performed in analyzing and predicting academic performance. It showed that classification and regression algorithms can be used successfully to predict students’ academic achievement at both course and degree level. Most of the reviewed EDM research in the past decade was carried out using the open source Weka machine learning software. We found that the most used methods for predicting academic achievement are decision tree algorithms, with C4.5 being the most popular among them, most likely because such white-box classification algorithms produce models that can explain predictions by IF–THEN rules. These rules can be easily interpreted by non-expert DM users, such as teachers, and can be directly applied in decision making. On the other hand, neural networks, support vector machines and K-nearest neighbour were used less frequently than the rest of the DM methods. Such methods are not well suited to EDM purposes because of their black-box mechanisms.
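
To illustrate why decision trees are considered white-box models, the following minimal sketch turns each root-to-leaf path of a small tree into one IF-THEN rule. The tree is hand-written for illustration: its features, thresholds, and labels are hypothetical, it was not learned from data, and it does not come from any of the reviewed studies or from C4.5 itself.

```python
# Hypothetical hand-written tree (assumption): internal nodes test
# `feature <= threshold`, leaves carry a class label.
TREE = {
    "feature": "previous_gpa",
    "threshold": 2.5,
    "left": {"label": "at risk"},   # branch taken when previous_gpa <= 2.5
    "right": {                      # branch taken when previous_gpa > 2.5
        "feature": "attendance_rate",
        "threshold": 0.7,
        "left": {"label": "at risk"},
        "right": {"label": "pass"},
    },
}


def extract_rules(node, conditions=()):
    """Walk the tree depth-first and return one IF-THEN rule per leaf."""
    if "label" in node:
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN {node['label']}"]
    feature, threshold = node["feature"], node["threshold"]
    return (
        extract_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
        + extract_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    )


rules = extract_rules(TREE)
for rule in rules:
    print(rule)
```

Each printed rule can be read and checked by a teacher without any knowledge of how the tree was built, which is not possible in the same direct way for the weights of a neural network or the support vectors of an SVM.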

We also found that the features used differ widely depending on the specific settings of the institute, culture, and country. However, gender, age, previous GPA and proficiency in the academic language are the features that most researchers agree on when predicting students’ academic achievement in higher education, regardless of the students’ environment, i.e., where they come from and what they believe in. Nevertheless, an essential limitation of the surveyed literature is that only a few studies investigate and report the significance of each predictor. The vast majority report only the final results, making it difficult to judge the value of each feature, even the very widely used ones. We therefore conclude that more research is needed, first, to deepen our understanding of the contribution of each traditionally used feature and, second, to extend the set of features and methodologies in order to further improve current prediction accuracies.