Introduction

The study of learning analytics has grown rapidly since the early 2010s in the areas of education and psychology as well as computing and data science (Prieto et al. 2019). While the concept of learning analytics remains vaguely defined, a plethora of conceptual variations has emerged, including school analytics (Sergis and Sampson 2016), teacher or teaching analytics (Sergis and Sampson 2017), academic analytics (Long and Siemens 2011), assessment analytics (Nouira et al. 2019), social learning analytics (Buckingham Shum and Ferguson 2012), and multimodal learning analytics (Blikstein and Worsley 2016). For this systematic literature review, with its specific focus on learning analytics in higher education and its link to study success, learning analytics are defined as “the use, assessment, elicitation and analysis of static and dynamic information about learners and learning environments, for the near real-time modelling, prediction and optimisation of learning processes, and learning environments, as well as for educational decision-making” (Ifenthaler 2015, p. 447).

The success of learning analytics in improving students’ learning has yet to be proven systematically and empirically (Lodge and Corrin 2017). A number of research efforts have focused on learning analytics tools (Atif et al. 2013), on practices (Sclater et al. 2016) and policies (Tsai et al. 2018), and on the adoption of learning analytics systems at school level, within higher education institutions, and at national level (Buckingham Shum and McKay 2018; Ifenthaler 2017). The increased importance of data in education has led to an upsurge in primary research publications in learning analytics (Prieto et al. 2019), which indicate that the analysis of digital traces of learning and teaching may reveal benefits for learners, teachers, learning environments, or the organisation (Gašević et al. 2015). Although there are initial systematic reviews on learning analytics, such as on policy recommendations for learning analytics (Ferguson et al. 2016), the identification of learning analytics research objectives and challenges (Papamitsiou and Economides 2014), learning analytics in the context of distance education (Kilis and Gulbahar 2016), and, more recently, the efficiency of learning analytics interventions (Larrabee Sønderlund et al. 2018), there exists no current and comprehensive systematic review focusing on learning analytics for supporting study success.

At its broadest, study success refers to the successful completion of a first degree in higher education; at its narrowest, to the successful completion of individual learning tasks (Sarrico 2018). Factors affecting study success range from individual dispositions and characteristics, such as age, gender, motivation or prior academic performance, to features of the educational environment, such as curriculum design, learning tasks or social components (Bijsmans and Schakel 2018; Tinto 2005). The essence of study success is to capture any positive learning satisfaction, improvement, or experience during learning. Pistilli and Arnold (2010) were among the first researchers to identify the potential of learning analytics for supporting study success.

However, it is difficult for educational researchers, practitioners, and decision makers to develop and implement learning analytics strategies and systems that best support student success (Gašević et al. 2016). Therefore, the purpose of this article is to identify empirical evidence demonstrating how learning analytics have been successful in facilitating study success in higher education.

Theoretical framework

Study success

Even though many academic support programmes have been implemented (Padgett et al. 2013), and research on study success is extensive (Attewell et al. 2006; Bijsmans and Schakel 2018; Morosanu et al. 2010; Schmied and Hänze 2015), dropout rates in higher education remain at about 30% in the Organization for Economic Cooperation and Development member countries (OECD 2019). Student dropout has consequences on different levels, such as for the individual, the higher education institution, and for society (Larsen et al. 2013). For example, dropouts often represent a waste of resources for the individual and society, as well as reflecting poorly on the quality of the higher education institution (In der Smitten and Heublein 2013).

Hence, the success of students at higher education institutions has been a global concern for many years (Tinto 2005). The factors that contribute to student success, and that may influence a student’s decision to discontinue higher education, are various and complex (Tinto 1982, 2005). Important factors for dropout that have been consistently found in international studies include the choice of the wrong study programme, lack of motivation, personal circumstances, an unsatisfying first-year experience, lack of university support services, and academic unpreparedness (Heublein 2014; Thomas 2002; Willcoxson et al. 2011; Yorke and Longden 2008). Moreover, there are several theoretical perspectives and models of student success in higher education (Bean and Metzner 1985; Rovai 2003; Tinto 1982), and many share common factors, even though their emphasis varies. Such common factors related to study success include students’ sociodemographic factors (e.g., gender, ethnicity, family background), cognitive capacity or prior academic performance (e.g., grade point average [GPA]), and individual attributes (e.g., personal traits, and motivational or psychosocial contextual influences), as well as course-related factors such as active learning and attention, or environmental factors related to supportive academic and social embeddedness (Bijsmans and Schakel 2018; Brahm et al. 2017; Remedios et al. 2008; Tinto 2017). To sum up, the essence of study success is to capture any positive learning satisfaction, academic improvement, or social experience in higher education. The possibility of collecting and storing data on the above-mentioned factors and combining them in (near) real-time analysis opens up advanced evidence-based opportunities to support study success through meaningful interventions (Pistilli and Arnold 2010).

Learning analytics

Early learning analytics approaches were limited to analysing trace data or web statistics in order to describe learner behaviour in online learning environments (Veenman 2013). With the increased investigation of educational data, the potential for a broader educational context has been recognised, such as the identification of potential dropouts from study programmes (Sclater et al. 2016). Meanwhile, an extensive diversification of the initial learning analytics approaches can be documented (Prieto et al. 2019). These approaches apply various methodologies, such as descriptive, predictive, and prescriptive analytics, to offer different insights into learning and teaching (Berland et al. 2014). Descriptive analytics use data obtained from sources such as course assessments, surveys, student information systems, learning management system activities and forum interactions, mainly for reporting purposes. Predictive analytics utilise similar data from those sources and attempt to forecast subsequent learning success or failure. Prescriptive analytics deploy algorithms mainly to predict study success and whether students will complete courses, as well as to recommend any immediate interventions necessary (Baker and Siemens 2015). The main motivations for higher education institutions to utilise learning analytics include (a) improving students’ learning and motivation, thus reducing dropout rates (or inactivity) (Colvin et al. 2015; Glick et al. 2019), and (b) improving the learner’s learning process by providing adaptive learning pathways toward specific goals set by the curriculum, teacher, or student (Ifenthaler et al. 2019). However, the success of learning analytics in improving higher education students’ learning and success has yet to be proven systematically. Only a few studies have tried to address this, and the evidence remains limited (Suchithra et al. 2015). Nevertheless, higher education institutions are collecting and storing educational data (i.e., factors related to study success) which may be utilised through learning analytics systems for supporting study success.
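To make the predictive strand of these methodologies more tangible, the following minimal sketch fits a simple course-completion model on synthetic learning management system activity counts. All data, feature names, and coefficients are invented for illustration and are not drawn from the studies discussed in this review.

```python
# Minimal sketch of predictive learning analytics: estimating the probability of
# course completion from hypothetical LMS activity counts (synthetic data only).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 500
students = pd.DataFrame({
    "logins_per_week": rng.poisson(4, n),            # assumed trace-data feature
    "forum_posts": rng.poisson(2, n),                # assumed forum interaction count
    "assignments_submitted": rng.integers(0, 10, n),
})
# Synthetic outcome: completion becomes more likely with higher engagement.
logit = (0.3 * students["logins_per_week"]
         + 0.4 * students["forum_posts"]
         + 0.5 * students["assignments_submitted"] - 4)
students["completed"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_train, X_test, y_train, y_test = train_test_split(
    students.drop(columns="completed"), students["completed"],
    test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```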

Purpose of this systematic review and research questions

Rigorous empirical evidence on the successful usage of learning analytics for supporting and improving students’ learning and success in higher education is lacking for the large-scale adoption of learning analytics (Marzouk et al. 2016). Higher education institutions still lack the organisational, technical, and staff capabilities for the sustainable and effective implementation of learning analytics systems (Ifenthaler 2017; Leitner et al. 2019), and only very few empirically tested learning analytics systems exist (Rienties et al. 2016). Another serious concern related to learning analytics is the ethically responsible and appropriate use of educational data (Scholes 2016; Slade and Prinsloo 2013; West et al. 2016), respecting data protection regulations (e.g., the EU GDPR) and the privacy principles of all involved stakeholders (Ifenthaler and Schumacher 2016, 2019; Pardo and Siemens 2014). A further well-advanced line of research in learning analytics focuses on the design of dashboards or broader visualisations of information from data analytics for supporting learning and teaching (Park and Jo 2015; Roberts et al. 2017). However, none of these lines of research includes a complete and detailed review of existing evidence on how learning analytics may contribute toward study success at higher education institutions.

Therefore, the purpose of this systematic literature review was to identify empirical evidence demonstrating how learning analytics have been successful in facilitating study success in the continuation and completion of students’ learning. In order to guide the systematic review, the following research questions were formulated:

  1. What study success factors have been operationalised in relation to learning analytics?

  2. What factors from learning analytics systems contribute toward study success?

  3. Are there specific learning analytics interventions for supporting study success?

Method

The preparation of the systematic review followed the eight steps proposed by Okoli and Schabram (2010). In order to produce a scientifically rigorous systematic review, all eight steps are essential (Okoli 2015): (1) identify the purpose; (2) draft protocol and train the team; (3) apply practical screen; (4) search for literature; (5) extract data; (6) appraise quality; (7) synthesise studies; (8) write the review.

Figure 1 shows the flow diagram of conducting the systematic review. The purpose of the systematic review has been presented above, including three research questions. The research team developed a research protocol, which described the individual steps of conducting the systematic review. In order to validate the research protocol, a training session was conducted which focused on database handling, reviewing, and note-taking techniques. The practical screening followed the inclusion criteria previously outlined in the research protocol, namely that studies (a) were situated in the higher education context, (b) were published between January 2013 and December 2018 (2013 marks the rise of learning analytics research publications), (c) were published in English, (d) had an abstract available, (e) presented either qualitative or quantitative analyses and findings, and (f) were peer-reviewed. To ensure overall consistency, the research team exchanged their findings for critical reflection. The literature search strictly followed the pre-defined research protocol, which included several steps:

  (a) Identification of international databases: GoogleScholar, ACM Digital Library, Web of Science, Science Direct, ERIC (Education Resources Information Center), and DBLP (computer science bibliography). The initial search results included peer-reviewed journal articles, peer-reviewed conference papers, and peer-reviewed chapters.

  (b) Specific search in relevant scientific peer-reviewed journals following the top 20 ranked educational technology journals in GoogleScholar: Computers & Education, British Journal of Educational Technology, The International Review of Research in Open and Distributed Learning, The Internet and Higher Education, Journal of Educational Technology & Society, Journal of Computer Assisted Learning, Education and Information Technologies, Educational Technology Research and Development, Language Learning & Technology, Interactive Learning Environments, TechTrends, The Turkish Online Journal of Educational Technology, Learning@Scale, Learning, Media and Technology, International Journal of Artificial Intelligence in Education, Computer Assisted Language Learning, IEEE Transactions on Learning Technologies, International Conference on Technological Ecosystems for Enhancing Multiculturality, Australasian Journal of Educational Technology, as well as three additional pertinent journals: Journal of Learning Analytics, Computers in Human Behavior, and Technology, Knowledge and Learning. In addition, the Proceedings of the International Conference on Learning Analytics And Knowledge were included, as they contain peer-reviewed contributions of the learning analytics community.

  (c) The searches were conducted using the search terms ‘learning analytics’ in combination with ‘study success’, ‘retention’, ‘dropout prevention’, ‘course completion’, and ‘attrition’ in titles, keywords, abstracts, and full texts. A total of N = 6220 publications were located.

  (d) The detailed analysis of all identified publications included the removal of duplicates and of publications with irrelevant topics (N = 3057); an in-depth abstract screening (focusing on relevant concepts, e.g., learning analytics in combination with study success factors) then resulted in a set of N = 374 publications.

  (e) The full-text analysis of the remaining publications focused on the theoretical rigour of the key concepts (i.e., learning analytics, study success factors), the substantiality of the sampling technique and methodological procedure, and the empirical evidence presented; this resulted in a final sample of N = 46 key publications for the systematic review (this corresponds to step six of conducting systematic reviews, i.e., appraise quality). Copies of all publications were stored and organised in a digital literature database.

Fig. 1 Flow diagram of the systematic literature review process

Following the three research questions, relevant information was extracted from the key publications and organised in an annotated table. The research team used quantitative and qualitative content analysis as well as reflective exchange to extract the findings of the key studies. The synthesis of key publications followed a triangulation approach, as the final set included both quantitative and qualitative studies (Okoli 2015). The final step of conducting the systematic review included the dissemination of the findings through the writing of this paper, which documents the findings and discusses implications as well as limitations.

Results

Summary of key publications

The 46 key publications included in this systematic review were conducted in the USA (n = 13), Australia (n = 5), the UK (n = 3), Spain (n = 3), Brazil (n = 3), Ireland (n = 2), Taiwan (n = 2), the Netherlands (n = 2), South Korea (n = 2), China (n = 1), Colombia (n = 1), the Czech Republic (n = 1), France (n = 1), Greece (n = 1), India (n = 1), Israel (n = 1), Japan (n = 1), Pakistan (n = 1), Saudi Arabia (n = 1), and Sweden (n = 1). Most articles were published in 2017 (n = 16), followed by 2018 (n = 8), 2014 (n = 8), 2015 (n = 6), 2016 (n = 5), and 2013 (n = 3). The average sample size of all key studies was M = 15,981.74 (SD = 67,388.72; Min = 29; Max = 447,977).

The key publications utilised data analytics methods such as binary logistic regression, decision tree analysis, support vector machines, and other classification systems. Many of the key studies applied several statistical methods in order to determine which one would achieve the most accurate prediction of the intended outcome variable. The main outcomes predicted in the key publications were course completion, grades to be obtained, and dropout rates. Table 1 provides a summary of the key publications focusing on learning analytics for supporting study success and includes information about the author(s), the country in which the study was conducted, study sample characteristics, measurement variables, the key aim of the study, the operationalisation of the study success measure, and applied interventions. The research team also evaluated the overall research rigour (categorised as weak, moderate, or strong) of each of the key publications with regard to the definition of study success and learning analytics (theoretical rigour), the tested sample, variables and methods (methodological rigour), and the rigour of findings and implications (see Table 1). The average evaluation of the individual categories resulted in the overall research rigour score. None of the key publications was rated as having strong research rigour (11 weak; 35 moderate), mainly because of the missing operationalisation of study success, the lack of precise methodological approaches, or limited sample sizes.
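The model-comparison step reported in many key publications can be illustrated with a brief, hedged sketch: several common classifiers are fitted to the same (here synthetic) student data set and compared by cross-validated accuracy. The feature set is a placeholder and does not reproduce any reviewed study.

```python
# Sketch of comparing classifiers for predicting an outcome such as dropout;
# the synthetic features stand in for variables like prior GPA or LMS activity.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=8, n_informative=5,
                           random_state=0)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "support vector machine": SVC(),
}
for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```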

Table 1 Summary of key publications focusing on learning analytics for supporting study success

Conceptualisation of study success

Study success was the central element of our systematic review, and we included only those papers centred on supporting study success, as reflected in our search terms: ‘study success’ or ‘course completion’ as positive aspects of increasing study success, and ‘attrition’ or ‘dropout’ as negative aspects requiring reduction. Different conceptualisations of study success were provided in the articles. More precise descriptions defined positive factors of study success, for example, as ‘course completion’ (n = 7) and ‘student retention’ (n = 1), and negative factors using terms related to attrition or dropout, such as ‘student at-risk’ or ‘dropout’ (n = 14), ‘loss of academic status’ (n = 2), and ‘attrition’ (n = 1). Other, more generalised or abstract terms indicating study success (or the lack thereof) were also utilised, with predicted factors including ‘student performance’ (n = 9), ‘student learner behaviour’ (n = 1), ‘low performance’ or ‘under-achieving students’ (n = 4), ‘students’ achievements and failures’ (n = 1), ‘student (dis)engagement and learning outcomes’ (n = 6), ‘success’ (n = 1), ‘student online behaviour’ (n = 1), ‘academic achievement’ (n = 1), ‘correctness of answers’ (n = 1), and ‘grades’ (n = 1).

In the key publications, the above-mentioned study success factors are presented mainly in the keywords, or briefly in the abstract and introduction. The remainder of each article focuses on utilising different data analytics methods in order to make accurate predictions. Analyses and evaluations are presented accordingly and conclude with how accurate the algorithms were. Overall, the publications lack an in-depth discussion of the relevance of the obtained findings in relation to study success factors. In the articles’ implications and conclusions, study success is rarely revisited and discussed; instead, the focus tends to be on the accuracy of the algorithms.

Factors contributing toward study success

A number of factors have been identified in the key publications as contributing to study success and can be organised in two categories: (1) predictors and (2) visualisation.

Predictors for study success

The application of predictive algorithms forms a significant part of the key publications (see Table 1). One set of predictors is based on data collected through online behaviour, mainly logfiles and trace-data. This includes forum interactions (e.g., posts, replies, length of posts) (Andersson et al. 2016; Cambruzzi et al. 2015; Guerrero-Higueras et al. 2018; Seidel and Kutieleh 2017), engagement with learning artefacts (e.g., ePortfolio, lecture slides, videos, tasks, self-assessments) (Aguiar et al. 2014; Carter et al. 2015; Conijn et al. 2018; Gong et al. 2018; Okubo et al. 2017), and overall interaction with a digital learning environment based on logfiles (Hu et al. 2014; Labarthe et al. 2016). For example, Chai and Gibson (2015) use login frequency, access of materials, submission of assignments, and enrolment data for predicting a student’s attrition risk. Similarly, data from websites or learning management systems (e.g., event-based timestamps) are used in combination with grades to predict students at risk of dropping out (Cohen 2017; Conijn et al. 2017; Elbadrawy et al. 2015; Jo et al. 2014; Manrique et al. 2018; Nespereira et al. 2015; Nguyen et al. 2017; Saqr et al. 2017), with the detailed analysis of clickstream or trace-data also being used to predict student dropout (Whitehill et al. 2017), student retention (Wolff et al. 2013), or student performance (Yang et al. 2017).

As shown in Table 1, another set of data used for predicting study success is based on students’ background information, such as demographics (e.g., age, gender), socio-economic status (e.g., family income, background, expenditure), prior academic experience and performance (Daud et al. 2017; Djulovic and Li 2013; Guarrin 2013). For example, Lacave et al. (2018) use enrolment age, prior choice of subject and information on scholarships in order to predict student dropouts. In addition to demographic variables (Aulck et al. 2017; Sarker 2014), the student’s academic self-concept, academic history and work-related data are used to predict student performance (Mitra and Goldstein 2015), while others use GPA, academic load, and access to counselling (Rogers et al. 2014), the student’s financial background (Thammasiri et al. 2014), or academic performance history (Bydzovska and Popelinsky 2014; Sales et al. 2016; Srilekshmi et al. 2016) as predictors of students at risk.

Other studies focus on data collected through surveys, such as students’ self-reported expected grades (Zimmerman and Johnson 2017), motivation, or academic and technological preparedness (Bukralia et al. 2014).

Several predictive algorithms are formed on a multimodal basis (Blikstein and Worsley 2016), i.e., they draw data from various sources, such as logfiles or trace-data (non-reactive data collection), assessments, survey data (reactive data collection), as well as from aggregated information or historical data.
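As a purely illustrative sketch (not taken from any reviewed study), such a multimodal feature set can be assembled by merging non-reactive trace data, assessment results, and reactive survey responses on a hypothetical student identifier before any modelling takes place.

```python
# Illustrative assembly of a multimodal feature table; all columns and values
# are hypothetical placeholders for trace, assessment, and survey data.
import pandas as pd

trace = pd.DataFrame({"student_id": [1, 2, 3],
                      "logins": [42, 7, 19],
                      "forum_posts": [5, 0, 2]})
assessments = pd.DataFrame({"student_id": [1, 2, 3],
                            "quiz_mean": [0.82, 0.55, 0.71]})
survey = pd.DataFrame({"student_id": [1, 2, 3],
                       "self_reported_motivation": [4, 2, 3]})

features = (trace
            .merge(assessments, on="student_id")
            .merge(survey, on="student_id"))
print(features)
```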

Visualisation for study success

In a number of studies retrieved for the systematic review (see Table 1), visualisation was noted to be important in supporting study success. Visualisation is realised through optical signals (Arnold and Pistilli 2012) and learning analytics features, which are distinctive elements for supporting individual learning processes (Few 2013) or for facilitating reflection (Dorodchi et al. 2018). Learning analytics features are implemented on web-based dashboards, i.e., customisable control panels that display data adapted to the learning process in (near) real time or on a summative basis.

An experiment conducted by Kim et al. (2016) found that a dashboard can be beneficial for learners of different motivation and achievement levels. For example, students who received dashboard analysis obtained a higher final score than those who did not. However, high academic achievers who received dashboard analysis showed lower satisfaction with the dashboard, i.e., it was less useful for them academically. He et al. (2015) used visualisations to document a student’s individual probability of failure, which could be problematic as this information may be misattributed by learners.

In summary, visualisation for study success is best realised with dashboards including meaningful information about learning tasks and the progress of learning towards specific goals.
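As a minimal sketch of such a dashboard element, the plot below contrasts a learner's (invented) weekly task completions with a hypothetical target pace; real dashboards would draw these values from live course data.

```python
# Dashboard-style progress visualisation with invented values for illustration.
import matplotlib.pyplot as plt

weeks = list(range(1, 11))
completed_tasks = [1, 2, 4, 5, 7, 8, 8, 10, 12, 13]   # hypothetical learner progress
target_tasks = [1.5 * w for w in weeks]                # hypothetical on-time pace

plt.plot(weeks, completed_tasks, marker="o", label="tasks completed")
plt.plot(weeks, target_tasks, linestyle="--", label="target pace")
plt.xlabel("week")
plt.ylabel("learning tasks")
plt.title("Progress towards course goal")
plt.legend()
plt.show()
```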

Learning analytics interventions for supporting study success

A recent systematic review of 11 publications on the efficacy of learning analytics interventions in higher education documents visual signals and other dashboard features as dominant elements (Larrabee Sønderlund et al. 2018). Beyond these findings, our systematic review adds further insight into interventions for supporting study success (see Table 1). For example, alerts to teachers enable them to give more individual attention to students (Darlington 2017; Dawson et al. 2017; Gkontzis et al. 2018; Lu et al. 2017; Yang et al. 2017). Other interventions focus on facilitating peer-to-peer communication (Cohen 2017; Seidel and Kutieleh 2017), as well as on recommendations for adaptive learning materials, prior knowledge building, reduction of test anxiety, or student–teacher perceptions.

In summary, the key publications exhibited only a few intervention strategies (Cambruzzi et al. 2015; Carroll and White 2017; Casey and Azcona 2017). However, the effects found when various interventions for supporting study success were applied may be biased, as other variables may contribute to the overall effects identified.

Discussion

Different learning analytics measures, visualisations, and intervention strategies need to be put in place to individualise support services for various learner needs, as the reasons for study success vary significantly (Tinto 2017). In addition, distinctive measures, visualisations, and intervention strategies may work in specific contexts for some learners and not for others (Mah 2016). While learning analytics are mainly implemented as a data-driven method for detecting at-risk students (Chai and Gibson 2015; Okubo et al. 2017; Rogers et al. 2014), higher education institutions need to supply a complementary support system to encourage and lead these students back onto the track of study success (Viberg et al. 2018). For some students, personal and interactive discussions are necessary to resolve more complex personal obstacles and barriers to their studies. For other students, obstacles may stem from simpler barriers, such as the misunderstanding of previous concepts or topics, missed information due to absence, or the need for more time and effort to complete tasks (Tinto 2005). In such cases, personalised learning paths and encouraging interventions can be the answer to leading these students back onto their learning track (Howell et al. 2018).

Implications

Our systematic review indicates that a wider adoption of learning analytics systems is needed, as well as work towards standardised measures, visualisations, and interventions, which can be integrated into any digital learning environment to reliably predict at-risk students and to provide personalised prevention and intervention strategies. While standards for data models and data collection, such as xAPI (Experience API), exist (Kevan and Ryan 2016), learning analytics research and development need to clearly define standards for reliable and valid measures, informative visualisations, and design guidelines for pedagogically effective learning analytics interventions (Seufert et al. 2019). In particular, personalised learning environments are increasingly in demand and are valued in higher education institutions for creating tailored learning packages optimised for each individual learner based on their personal profile, containing information such as their geo- and socio-demographic backgrounds (Lacave et al. 2018), previous qualifications (Daud et al. 2017), their engagement in the recruitment journey (Berg et al. 2018), activities on websites (Seidel and Kutieleh 2017), and tracking information on their searches (Macfadyen and Dawson 2012).

The key publications of this systematic review indicate that valid factors for learning analytics comprise a combination of learners’ background information, behaviour data from digital platforms (e.g., learning management systems, games and simulations), formative and summative assessment data, and information collected through surveys. Hence, measures for learning analytics need to include reactive and non-reactive data collection, i.e., multimodal data for supporting learning, teaching, and study success (Blikstein and Worsley 2016).

A prerequisite for defining such a multimodal or holistic data set for learning analytics is a strong theoretical foundation of learning analytics (Marzouk et al. 2016). Self-regulated learning, affective (motivation and emotion), and social constructivism theories are widely discussed in the context of learning analytics research (Azevedo et al. 2010; Tabuenca et al. 2015). Initial work, which will form the basis for future investigations, has been conducted on how to facilitate educational research by employing learning theories to guide data collection and examine learning analytics (Prieto et al. 2019). Despite the awareness of a stronger theory-informed learning analytics practice, the findings of this systematic review document an obvious weakness in defining key constructs such as study success or retention, and in operationalising key factors for reliable and valid measurements. Further, the predominant methodological approach identified in this systematic review on learning analytics and study success is correlational in nature (Wong et al. 2019). Hence, a significant weakness of learning analytics research is the lack of large-scale, longitudinal, and experimental research focusing on how learning analytics impact learning and teaching in higher education. In addition, the identified lack of a clearly defined and empirically tested holistic data set remains a major challenge for learning analytics research to come.

From an instructional design perspective, visualisations, features, and interventions currently being implemented in learning analytics dashboards are of a simplistic nature, i.e., they provide statistics which are less informative for supporting learning processes (He et al. 2015). As documented by Larrabee Sønderlund et al. (2018) and in this systematic review, learning analytics interventions need to enable active learning, such as through adaptive scaffolds or by helping teachers to curate and act on data about their students (Arthars et al. 2019; Darlington 2017; Dawson et al. 2017). This includes the provision of a better understanding of students’ expectations of learning analytics features to support learning processes and warrant study success (Schumacher and Ifenthaler 2018).

Challenges ahead

Fully automated adaptive learning analytics support systems may reduce a learner’s self-regulation and perceived autonomy. Therefore, learning analytics need to provide opportunities for personalisation, i.e., learners may adjust (customise) the information and support provided at any time. For example, a learner may not need support in a topic in which she has prior knowledge and a high interest, but may, on the other hand, require scaffolding while working in an unfamiliar domain. At the same time, the learner may want competitive elements (e.g., group comparison of achievement) in some situations, and collaborators to exchange ideas with in others. Other, less frequently observed instructional design components in learning analytics are the social impact on learning (Buckingham Shum and Ferguson 2012) and the means of collaborative learning (Gašević et al. 2019).

Given the promising opportunities of learning analytics for supporting study success, Leitner et al. (2019) present the challenges likely to be faced in further research, including (1) a shortage of learning analytics leadership at higher education institutions, (2) a shortage of equal engagement among all stakeholders, (3) a shortage of pedagogy-based approaches informing learning analytics practice, (4) a shortage of sufficient professional learning for learning analytics, (5) a shortage of rigorous studies empirically validating the impact of learning analytics, and (6) a shortage of policies specific to learning analytics.

To sum up, more educational data does not always lead to better educational decisions. Learning analytics have obvious limitations, and data collected from various educational sources can have multiple meanings (d’Aquin et al. 2014). As not all educational data is relevant and equivalent, the reliability and validity of the data and its accurate and bias-free analysis are critical for generating useful summative, real-time or formative, and predictive or prescriptive insights for learning and teaching. While the key publications identified in this systematic review had access to a wide range of data from students and their associated learning interactions and contexts, limited access to educational data (e.g., from distributed networks outside the institution) may generate disadvantages for the stakeholders involved (e.g., students, teachers). For example, invalid forecasts may lead to inefficient (or false) decisions for pedagogical interventions (Arthars et al. 2019). In addition, ethical and privacy issues are associated with the use of educational data for learning analytics (Prinsloo and Slade 2015). This concerns how personal data is collected and stored as well as the way in which it is analysed and presented to different stakeholders (West et al. 2016). Consequently, higher education institutions need to address the ethics and privacy issues linked to learning analytics: they need to define who has access to which data, where and for how long the data will be stored, and which procedures and algorithms should be implemented if further use is to be made of the available educational data and analysis results (Ifenthaler and Schumacher 2019; Ifenthaler and Tracey 2016).

Limitations and recommendations for future research

Systematic reviews are a good way to synthesise findings from quantitative and qualitative research on a topic, but they cannot include results from all available research. The research methodology of this systematic review followed the eight steps suggested by Okoli (2015). The accurate execution of these steps is indispensable for the production of valid findings of a systematic review. However, even when keywords are applied, databases searched, and specific journals screened, some important research studies may still have been missed. As shown in the initial dataset, although more than 6000 publications were identified, not all qualified for inclusion in this systematic review. Thus, the systematic review does not reflect all research on learning analytics and study success. In addition, this systematic review only included articles published in English. Hence, important findings from articles published in other languages may have been overlooked. However, as learning analytics research matures further, it is expected that meta-analyses will emerge, which may provide further empirical insights, including effect size estimates on how learning analytics impact study success.

Research on learning analytics is evolving fast. Hence, while this systematic review was being written, further studies may have been published which could provide additional insights into the impact of learning analytics on study success. Accordingly, a continuing meta-discussion of findings is required while the research area matures. Thus, further systematic reviews on learning analytics will help to identify important trends in the literature and suggest avenues for future research.

In order to add more rigour to future systematic reviews, experts in the field of learning analytics may be consulted in addition to standardised screening and search procedures. These experts may suggest studies in progress or publications in press which may inform the research questions accordingly and in a more timely manner. Such a procedure will add another step to the eight steps of conducting systematic reviews suggested by Okoli (2015).

Another issue found in the key publications of this systematic review is the theoretical clarity of the key constructs, such as definitions of study success, student retention, or learning analytics. When clear definitions are missing, operationalisations of these constructs become blurred and their valid measurement becomes impossible. This issue has been documented through the evaluated research rigour (i.e., theoretical rigour, methodological rigour, and rigour of findings and implications) for each of the key publications (11 weak; 35 moderate; 0 strong). Future research in learning analytics needs not only to clearly define the key constructs addressed, but also to adopt the standards of empirical research methodology for producing valid findings (Campbell and Stanley 1963). Moreover, in order to produce generalisable and transferable findings, future learning analytics research requires a stronger methodological focus on large-scale, longitudinal, and experimental research designs.

Finally, empirical evidence is available from articles which were not eligible to be included among the key publications of this systematic review (e.g., due to incomplete work or lack of depth). These studies provide additional supporting evidence on the ways in which learning analytics can be used to increase study success. A further investigation of these publications would have overwhelmed the current research group. In order to invite additional researchers, a database including all identified studies and their initial classifications will be created to provide broader access to the findings.

Conclusion

Learning analytics are a socio-technical data-mining and analytic practice in educational contexts. Rigorous large-scale evidence to support the effectiveness of learning analytics in supporting study success is still lacking (Ifenthaler 2017; Ifenthaler et al. 2019). The tested variables, algorithms, and methods can be used as a guide in helping researchers and educators to further improve the design and implementation of learning analytics systems. One suggestion is to leverage existing learning analytics research by designing large-scale, longitudinal, or quasi-experimental studies with well-defined and operationalised constructs, hence connecting learning analytics research with decades of previous research in education. Further documented evidence on learning analytics demonstrates that learning analytics cannot be used as a one-size-fits-all approach, but that they require precise analysis of institutional and individual characteristics to best facilitate learning processes for study success (Ifenthaler 2020). Also, teachers need to be encouraged to further their educational data literacy, i.e., the ethically responsible collection, management, analysis, comprehension, interpretation, and application of data from educational contexts. While further advances in empirical evidence are being achieved, higher education institutions need to address the required change management processes which facilitate the adoption of learning analytics, an institution-wide acceptance of learning analytics, as well as the development of rigorous guidelines and policies focusing on data protection and ethics for learning analytics systems.