From the start, the rationale for learning analytics was clear: to collect, analyze and use student data to understand and inform students’ learning (Gašević et al., 2015) and to improve their chances of success. Student data and learning analytics do not “exist independently of ideas, techniques, technologies, systems, people, and contexts” and should be understood as a “complex sociotechnical assemblage” (Kitchin, 2014, p. 24), and as “a knot of social, political, economic and cultural agendas that is riddled with complications, contradictions and conflicts” (Selwyn, 2014, p. 6). As such, it is important to ask which theoretical understandings, and specifically which learning theories, inform learning analytics. Not only is it the case that “research without theory is blind, and theory without research is empty” (Bourdieu & Wacquant, 1992, p. 162), but the theoretical underpinnings of our research and practices also reveal ontological and epistemological assumptions and points of departure (Biesta et al., 2011).

In one of the earliest mappings of the impact of learning theory on learning analytics, Clow (2012) notes that learning analytics “draws on broader educational theory (including Kolb and Schön)” (p. 134) that not only provides learning analytics practice with a sound theoretical base, but also steers the continuous improvement of learning analytics projects.

Preceding the first Learning Analytics & Knowledge (LAK) conference, MacLure (2010) claimed that greater access to data might not only suggest that theory is unnecessary (because data explains itself), but that theory may actually get in the way of making sense of that data. Atkisson and Wiley (2011) refer to the temptation to “poke about in this data in thoroughly unprincipled ways” (p. 119). And although there may be “seemingly interesting relationships between constructs, without an interpretive framework grounded in specific theoretical commitments, the data tail may come to wag the theory dog” (Atkisson & Wiley, 2011, p. 119; see also Wise & Shaffer, 2015). Despite the claim by Gašević et al. (2017) that initial enthusiasm for data-driven and a-theoretical approaches has waned, misconceptions about data and data-informed approaches persist and may continue to inform learning analytics research and practice, e.g., claims of objectivity, the notion of raw data, that theory will become obsolete and that data are neutral (boyd & Crawford, 2012; Eynon, 2013; Kitchin, 2014; Williamson, 2017).

In the light of the claims about the need for theory in data science and specifically in learning analytics research, and considering the multi- and interdisciplinary nature of learning analytics research and praxis (Lackner et al., 2015; Khalil et al., 2022), it is not clear which theories inform our research and praxis, and whether there is a particular learning theory, such as Self-Regulated Learning (SRL), that dominates learning analytics research. If SRL does indeed dominate learning analytics research, how do we then respond to claims that much of learning analytics is ‘positivist’, dominated by behaviorism, e.g., Rogers (2015) and Wilson et al. (2017), and heralds a “new behaviorism” (Selwyn, 2019, p. 15)?

In establishing the theoretical bases of learning analytics research, and determining whether any theory dominates, we situate this scoping review in the context of the research published under the auspices of the Society for Learning Analytics Research (SoLAR).

In the context of learning analytics research, SoLAR has been instrumental in providing a specific community with a shared history, shared commitments and, to a large extent, a point of reference for the implementation of learning analytics. Under the auspices of SoLAR, there are two main publication outlets: the Journal of Learning Analytics and the annual Learning Analytics and Knowledge Conference Proceedings published by ACM. It should be noted that although a significant amount of learning analytics research is published in these two outlets, there are researchers who may not identify with the history or aims of SoLAR and who publish learning analytics research in other publication outlets.

Despite the limitation of focusing only on these two publication outlets under the auspices of SoLAR and not being able to generalize from these findings to make claims about all learning analytics research, the findings are significant for everyone who has an interest in learning analytics research and practice.

Hence our main question in this study: which learning theories dominate the field of learning analytics within the SoLAR community?

Setting a theoretical context

One might assume, given its nomenclature and the core definition of learning analytics (that learning analytics is about learning), that the field would have an explicit relationship with learning theories in particular, amid a broader range of educational theories. Given its interdisciplinary nature though, the use, choice and role of learning theory are not a given. As we show, even the notion of ‘learning theory’ is less definite than might first be thought. We begin by discussing what qualifies as theory and its role in research, before investigating ‘learning theory’ or ‘theories of learning’ as a basis for determining the role and impact of particular theories in learning analytics research.

Theory at the intersections of research and practice

Determining the theoretical influences in learning analytics research is important since those theories set forth a number of propositions that contribute to understanding and explaining phenomena or make visible causal relationships between variables, as well as allowing us to make rational inferences about future events (Biesta et al., 2011; Carr, 2006). As such, theories attempt to counter and replace practitioners’ contextually dependent, subjective beliefs with “objective knowledge generated by theory” that applies irrespective of context or “parochial practical beliefs” (Carr, 2006, p. 144). The claim to objective, context-less universal principles was later replaced by an acknowledgement that there is no epistemological position which makes it possible to escape or transcend the specificities of context, tradition and culture. In Biesta et al. (2011), the authors refer to Gaston Bachelard’s overview of theory as “a science of the hidden”, making “things visible or intelligible that are not immediately observable” (p. 277). Our position here is that the choice and use of a particular theory suggests a specific “ontology (nature of reality), epistemology (nature of the relationship between knower and known), and method of inquiry itself” (Sandelowski, 1993, p. 214).

In education research, Biesta et al. (2011) point to a lack of capacity building in theory, and specifically, the poor quality of theory use. The role of theory in making sense of phenomena is further impacted by the dominance of evidence in the form of data, and an emphasis on ‘what works’ without considering whether ‘what works’ is also appropriate (and ethical). While the role of theory has traditionally been seen as aiding understanding, Biesta et al. (2011) suggest instead that “theory rather has the task of generating more and different understandings” (p. 233).

Aligned to our objectives here, they moot the need for a “systematic exploration of the forms of theory and ways of theorising that play a role in educational practices, again in order to map usage and generate understanding of what theory and theorising are ‘doing’” (Biesta et al., 2011, p. 234).

An archaeology of learning theory/ies

Mapping the use and role of learning theories in learning analytics should not only consider the range of learning theories, but also their evolution (Illeris, 2018) and the emergence of the learning sciences. Until the 1950s, learning theory was developing independently across four approaches, namely within German Gestalt psychology; American behaviorism; Russian cultural-historical theory; and Piaget’s constructivism (Illeris, 2018). Since the 1980s there have been various attempts to cover the field of learning with a single coherent model or framework. Examples include the experiential learning cycle by Kolb, Engeström’s activity theoretical reconceptualisation, Kegan’s constructive-developmental approach and Jarvis’ approach to lifelong learning (Illeris, 2018).

Complicating any study of the theories informing learning analytics research is the issue that there is no single, accepted definition of what constitutes learning theory, i.e., which theories are included and which do not ‘qualify’ as learning theory. Schunk (2019), for example, proposes five major categories of learning theory (behaviorism, social cognitive theory, information processing theory, cognitive learning processes, and constructivism), while Harasim (2012) suggests adding connectivism to this list. The list suggested by Bates (2020) excludes information processing theory but also adds connectivism, resulting in the following five learning theories: objectivism, behaviorism, cognitivism, constructivism and connectivism. Zhou and Brown (2015) list 12 theories that include, for example, the Theory of Moral Development as an educational learning theory, while Hean et al. (2009) differentiate between two major approaches to understanding learning, i.e., behaviorism (focusing on the outcomes of the learning) and constructivism (focusing on the processes involved in learning). In yet another approach, Woolfolk (2014) discusses behavioral, cognitive and complex cognitive views of learning and suggests that these “focus on the individual and what is happening in his or her ‘head’” (p. 318).

A “relatively recent” development in our understanding of learning is an interdisciplinary approach, namely the learning sciences (Woolfolk, 2014, p. 318). One of the foundations of the learning sciences is constructivism, which foregrounds the social and cultural factors in learning. What differentiates the learning sciences from learning theory, according to Woolfolk (2014), is their interdisciplinary foundations, including computer science, educational psychology, neuroscience and anthropology. Interestingly, in discussing the learning sciences, Woolfolk (2014) then discusses cognitive and social constructivism. Separate from these, but linked, are social cognitive views of learning focusing on concepts such as self-efficacy and SRL. Given their interdisciplinary basis, the learning sciences seem a natural fit with learning analytics (Luckin & Cukurova, 2019; Wise & Cui, 2018).

One of the founding and prominent scholars in learning analytics, George Siemens, acknowledges three major learning theories (behaviorism, cognitivism, and constructivism) but argues that these were developed in periods not impacted by technology (Siemens, 2005). In proposing a new theory, namely connectivism, Siemens opines that, while it is natural that existing theories are revised and adapted to reflect changing environments, there is a limit when “the underlying conditions have altered so significantly, (so) that further modification is no longer sensible” (p. 3). Establishing clear theoretical foundations is not simple, however. The claim of connectivism as a learning theory has also attracted criticism. While it falls outside the scope of this review to discuss this debate, connectivism as a learning theory has been increasingly accepted (e.g., Bates, 2020; Haythornthwaite, 2011).

Given that there is no single accepted definition of what constitutes a learning theory, the question of which learning theories inform learning analytics remains problematic.

Learning theories in learning analytics—an opening introduction

In providing context for this scoping review, we briefly point to some examples of learning analytics’ connection to theory. Consideration of theory as integral to learning analytics research and practice has been evident from the first keynote at the first LAK conference, when Haythornthwaite (2011) situated learning analytics in the context of social network research which “provides a groundwork … in terms of both graph theory and studies of social behavior” (p. 18) as well as the social network perspective as “a strong suite of social theory and analytic techniques that can illuminate interaction processes”. At the second LAK conference (LAK’12), Clow (2012) proposed a Learning Analytics Cycle “as a development of previous theorisations of learning analytics. However, it is also more fundamentally, a development of much older learning theory” (p. 135; emphasis added) including Kolb’s Experiential Learning Cycle, Schön’s work on reflective practice and Laurillard’s Conversational Framework, and their implications for the Learning Analytics Cycle. He also referred to “other educational literature” that “identifies qualitatively different approaches to study” (p. 136). He observed that these educational theories “take inspiration from the cybernetic conception of control theory, and in particular, the closed-loop control system used widely in engineering of all sorts” (p. 136). In conclusion, Clow (2012) again referred to the “theoretical grounding” of the Learning Analytics Cycle.

Four years into the evolution of learning analytics, the Journal of Learning Analytics included a special section dedicated to learning analytics and learning theory. In the introductory article to this section, Wise and Shaffer (2015) frame their intention to “provoke a critical dialogue in the field about the ways in which learning analytics research draws on and contributes to theory” (p. 5). Contrary to the belief that bigger data removes the need for theory, Wise and Shaffer (2015) claimed that “with larger amounts of data, theory plays an ever-more critical role in analysis” (p. 5). Wise and Shaffer (2015) furthermore list several roles of theory in learning analytics research, including but not limited to: providing guidance to the researcher about the variables to include or exclude in a model; guidance about “what potential confounds, subgroups, or covariates in the data to account for”; “which results to attend to”; providing a framework for making sense of the results and making those results actionable; and assisting the researcher to generalize results to different populations and contexts (p. 9).

The need to explore learning theories informing learning analytics appears even more important in the light of recent research by Guzmán-Valenzuela et al (2021). They found that more than half of all learning analytics papers are published in journals in the applied sciences (engineering and technology) and are predominantly quantitative in nature. The authors conclude that their analysis points to “two LA communities—a data driven, practical and management-oriented community focused on interventions, and an academic community more focused on theories and their development—that tend not to work together” (p. 14) and that “educational and learning theories are insufficiently present in LA research” (p.15).

Research design

In choosing an appropriate research design linked to the focus of this article, i.e., to establish which learning theories inform learning analytics research, the researchers considered both systematic and scoping reviews. Though there are similarities between the two, they are performed for different reasons and according to different methodologies (Munn et al., 2018; Tricco et al., 2016). According to Arksey and O’Malley (2005), scoping reviews are appropriate since they allow researchers to find and report on research gaps in a particular field, whereas systematic reviews require a well-defined and clearly articulated research question. Munn et al. (2018) expand on this, adding that a scoping review is a suitable choice when seeking, inter alia, to map “types of available evidence in a given field”, to “clarify key concepts/ definitions in the literature”, and to “identify key characteristics or factors related to a concept” (p. 2). As such, a scoping review can also be a precursor to a systematic review.

Scoping reviews are therefore appropriate when researchers are not sure how to map a particular research question or how to approach the research whether in terms of reporting, mapping and/or discussion. Peters et al. (2015) describe scoping reviews as “‘reconnaissance’ – to clarify working definitions and conceptual boundaries of a topic or field” (p. 141).

From the initial engagement with clarifying the search terms, we recognised the need to clarify the notion of “learning theory”. As discussed later, establishing the boundaries of which theories qualify as learning theory and which will be excluded, posed a considerable challenge. It was also unclear, at the outset of the research, how the choice of a particular learning theory or combination of learning theories might be traceable in the selected articles [See the later discussion].

In opting for a scoping review, the authors also considered the limitations of this type of review, including, for example, a focus on breadth of information rather than depth (Tricco et al., 2016). There is also the possibility that the selection of databases or publication outlets or, indeed, a lack of critical appraisal of included studies (Pham et al., 2014) could result in the loss of other, relevant publications. As such, scoping reviews should be used to inform policy or practice. Having said that, it should be noted that scoping reviews are not less rigorous than, for example, systematic reviews, but are a “different entity” altogether with “a different set of purposes and objectives” (Pham et al., 2014, p. 380).

Scoping reviews are often iterative in nature as researchers refine and map different possibilities (Arksey & O’Malley, 2005). Munn et al. (2018, p. 5) identify the following characteristics of scoping reviews, namely that they are:

  • Informed by an a priori protocol

  • Systematic and often include exhaustive searching for information

  • Transparent and reproducible

  • Designed to reduce error and increase reliability (such as the inclusion of multiple reviewers)

  • Conducted so that data is extracted and presented in a structured way

In this review, we followed the stages proposed by Arksey and O’Malley (2005), namely:

  1. Stage 1: identifying the research question

  2. Stage 2: identifying relevant studies

  3. Stage 3: study selection

  4. Stage 4: charting the data

  5. Stage 5: collating, summarizing and reporting the results

In addition, we took cognisance of a checklist provided by Cooper et al. (2019) and considered the PRISMA guidelines for scoping reviews (Tricco et al., 2018).

Stage 1: Identifying the research question

To reflect the relatively porous boundaries regarding learning theories, the research question was phrased as: How does learning theory shape learning analytics research? As such, the review sought to establish evidence around the following sub-questions:

1. Which learning theories are found in learning analytics research?

2. How are these learning theories informing and impacting on learning analytics research?

3. What are the implications of the prevailing theories in learning analytics research?

Stage 2: Identifying relevant studies

While research on learning analytics is published in a range of peer-reviewed and ‘grey’ literature, the flagships of learning analytics research are the annual Learning Analytics and Knowledge conference (since 2011) and the Journal of Learning Analytics (JLA) (since 2014). Regarding including conference proceedings in fields characterized by rapid developments, Alexander (2020) states that “in particular educational research communities, published conference proceedings are de rigueur” (p. 13). The JLA is furthermore “the first journal dedicated to research into the challenges of collecting, analysing and reporting data with the specific intent to improve learning”. These two outlets are both found under the auspices of the Society for Learning Analytics Research (SoLAR), and as such do not represent the whole field of learning analytics research. We argue that the significant number of publications in the ACM LAK Conference Proceedings and the Journal of Learning Analytics represents a specific community that claims a specific identity and focus, with a broad shared understanding of its claims and value contribution. As such, and recognising the limitations, this provides a justified rationale for selecting these two publication outlets for this scoping review.

Stage 3: Study selection

This stage was particularly important in that search terms and inclusion/exclusion criteria would have a direct impact on the results, and therefore on any conclusions later drawn. The researchers first considered searching for specific learning theories, e.g., behaviorism, etc., but the initial review of literature had suggested no clear agreement on what qualifies as a learning theory. It was therefore decided to search the two publication outlets (LAK and JLA) for the broader term ‘theor’ (allowing for both the singular and plural of theory). In line with Tricco et al. (2016), who emphasized that scoping reviews focus on breadth rather than depth, the researchers opted not to search for specific learning theories which, as explained earlier, would be an impossible task due to the undefined nature of learning theories. Searching for ‘theor’ allowed a breadth in line with the purpose of scoping reviews.

An initial pilot study indicated that searching only the abstracts and keywords would not be sufficient, and so the complete text of articles/papers was searched.
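As an illustration only, the full-text screening for the stem ‘theor’ could be sketched as below. This is a minimal sketch under stated assumptions rather than the tooling used in the review: the function names, the three-document corpus and the choice of a case-insensitive match are all hypothetical.

```python
import re

# Case-insensitive match for the stem "theor" anywhere in the text, so that
# "theory", "theories", "theoretical" and "theorising" all count as hits.
# (Case-insensitivity is an assumption; the review does not state it.)
THEOR_STEM = re.compile(r"theor", re.IGNORECASE)


def mentions_theor(full_text: str) -> bool:
    """Return True if the text contains the stem 'theor' at least once."""
    return THEOR_STEM.search(full_text) is not None


def screen_corpus(corpus: dict) -> tuple:
    """Split document ids into (retained, excluded) by presence of the stem."""
    retained = [doc_id for doc_id, text in corpus.items() if mentions_theor(text)]
    excluded = [doc_id for doc_id, text in corpus.items() if not mentions_theor(text)]
    return retained, excluded


# Hypothetical three-document corpus, for illustration only.
corpus = {
    "paper_a": "We ground the dashboard design in self-regulated learning theory.",
    "paper_b": "Clickstream features were fed to a gradient-boosted classifier.",
    "paper_c": "A Theoretical framing is offered for the clustering results.",
}
retained, excluded = screen_corpus(corpus)
```

A stem match of this kind is deliberately broad: it also retains texts that use the term in passing, which is why a manual reading step is still needed afterwards.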

Searching for ‘theor’ resulted in a wide-ranging list of theories used as background, or applied to the respective research questions. Articles and papers not including ‘theor’ were excluded (first step of the inclusion/exclusion criteria, see Fig. 1). The next phase sought to evaluate whether included theories were, indeed, learning theories (second step of the inclusion/exclusion criteria). As indicated earlier, this proved difficult as many did not fall neatly within the lists of major learning theories mentioned earlier. As there is no agreement as to what constitutes a learning theory, the researchers adopted definitions from a range of internet sites, as well as Google Scholar, to establish whether a theory was presented as a learning theory. It was also impossible, in many cases, to distinguish between learning theories, paradigms or research philosophies such as positivism, hermeneutics, and interpretivism (e.g., Atkisson & Wiley, 2011), and/or reference to epistemologies that may find expression in a particular learning theory (e.g., Knight et al., 2013). The researchers therefore opted to accept authors’ classifications of the theories used in the presented research.

Fig. 1. Overview of the iterative process of the inclusion and exclusion criteria. For a full list of the included papers, see declaration

We acknowledge that this lack of definitive boundaries may impact on the replicability of this research. The selected publishing outlets dictated the time frame: LAK proceedings were searched from 2011 to 2020, as were all available issues of JLA (2014 to 2020, Volume 7, number 2).

Stage 4: Charting the data

Earlier, we acknowledged the criteria used to look for the search term ‘theor’, and the issues faced in defining learning theory. In line with the iterative approaches proposed by Arksey and O’Malley (2005), we considered both the specific delimitations of our search terms and the potential consequences. One example of this is the exclusion of articles/papers which did not contain the text string ‘theor’. An initial analysis provided some evidence that this search term identified references to most listed learning theories, except for behaviorism. Given criticism by Rogers (2015) that learning analytics is, essentially, behaviorist, we searched again for ‘behaviorism’, ‘behaviourism’, ‘behaviourist’ and ‘behaviorist’. In addition, despite the explicit intention of learning analytics to positively impact student success and retention, the initial analysis found no references to student success and/or retention theory, or to well-established works such as Spady (1970) and Tinto (1975). We therefore checked whether searching for the theories proposed by Tinto and/or Spady would surface papers or articles not originally picked up with the term ‘theor’. This verification process found three references to behaviorism not on the initial list, 15 further references to the work of Tinto and one reference to the work of Spady (see declaration).
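The verification pass described above could be sketched as follows, purely for illustration. The patterns and the function name are assumptions, not the authors' procedure: the sketch flags a text only when it lacks the stem ‘theor’ yet mentions one of the behaviorism spellings or the surnames Tinto or Spady.

```python
import re

FIRST_PASS = re.compile(r"theor", re.IGNORECASE)
# Covers behaviorism, behaviourism, behaviorist(s) and behaviourist(s).
BEHAVIORISM = re.compile(r"\bbehaviou?ris(?:m|ts?)\b", re.IGNORECASE)
# Surname match for the two student success and retention authors.
RETENTION_AUTHORS = re.compile(r"\b(?:Tinto|Spady)\b")


def missed_by_first_pass(full_text: str) -> bool:
    """True if the text has no 'theor' hit but matches a verification term."""
    if FIRST_PASS.search(full_text):
        return False  # already picked up by the original search
    return bool(
        BEHAVIORISM.search(full_text) or RETENTION_AUTHORS.search(full_text)
    )


# Hypothetical snippets, for illustration only.
examples = [
    "Dropout is framed using Tinto's integration model.",
    "A behaviourist reading of click data.",
    "Behaviorism is one learning theory among many.",
    "Clickstream features predict grades.",
]
flags = [missed_by_first_pass(text) for text in examples]
```

A surname match will also flag citations that are unrelated to retention, so hits from a pass like this would still need to be read before inclusion.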

In summary then, the scoping review comprised two distinct processes of inclusion and exclusion as illustrated in Fig. 1. The first process involved identifying papers and articles from the two selected publishing outlets that contained the search term ‘theor’. This returned N = 396 texts, of which n = 355 were proceedings papers and n = 41 were journal articles. The first step of the inclusion/exclusion process, which removed texts in which the term(s) theory/theories were used in passing only, resulted in the exclusion of 234 LAK papers and 29 JLA articles.

The second iteration, which verified that the remaining papers made specific reference to, and showed evidence of, learning theories informing the study and/or methodology, resulted in the further exclusion of 52 LAK papers and 7 JLA articles. This left a final selection of 69 LAK papers and 5 JLA articles.
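The reported counts can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names are ours.

```python
# Counts reported in the text, per outlet.
identified = {"LAK": 355, "JLA": 41}
excluded_step_1 = {"LAK": 234, "JLA": 29}  # term used in passing only
excluded_step_2 = {"LAK": 52, "JLA": 7}    # no learning theory informing the study

# Texts remaining after each exclusion step.
after_step_1 = {k: identified[k] - excluded_step_1[k] for k in identified}
final = {k: after_step_1[k] - excluded_step_2[k] for k in identified}

total_identified = sum(identified.values())
total_final = sum(final.values())
```

Under these figures, 121 LAK papers and 12 JLA articles remain after the first step, and 69 and 5 after the second, giving 74 texts in total from the 396 initially identified.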

Stage 5: Collating, summarizing and reporting the results

Any deliberate or unintended inclusion of bias will impact on the robustness and quality of a scoping review study. To address issues of bias in this study, inter-rater reliability (IRR) was calculated. IRR refers to the reproducibility or consistency of decisions between at least two reviewers and is a necessary component to establish the validity of review studies (Cook & Beckman, 2006). Fleiss’ kappa was used as an indicator to evaluate the inter-rater agreement among the authors. Fleiss et al. (2013) suggest that kappa values over 0.75 indicate excellent levels of agreement, values from 0.40 to 0.75 indicate fair to good agreement, and values below 0.40 indicate a poor level of agreement. For this study, two authors scanned all filtered papers and marked each paper for inclusion or exclusion. The IRR value at this stage was κ = 0.421 (p < 0.005). To improve the IRR value, all authors held further discussion on areas of disagreement and resolved contradictions. The final IRR kappa value indicated a good level of agreement (κ = 0.629, p < 0.005).
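For readers unfamiliar with the statistic, the following is a minimal sketch of how Fleiss’ kappa is computed from a table of rating counts, together with the interpretation bands cited above. The ratings are invented for illustration and are not the review's data; the function names are ours, and the p-values reported above are not reproduced here.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = number of raters placing item i in category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall proportion of ratings per category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_cat)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1 - p_e)


def interpret(kappa):
    """Bands as cited in the text from Fleiss et al. (2013)."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.40:
        return "fair to good"
    return "poor"


# Hypothetical ratings: 10 papers, 2 raters, columns = (include, exclude).
ratings = [[2, 0]] * 4 + [[0, 2]] * 4 + [[1, 1]] * 2
kappa = fleiss_kappa(ratings)  # approximately 0.6 for this invented table
```

Applied to the values reported above, `interpret(0.421)` and `interpret(0.629)` both fall in the "fair to good" band.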

Results and discussion

At this stage, we collate, summarize and report on the results using the different sub-questions as structure.

Which learning theories are found in learning analytics research and inform learning analytics practice?

The remaining LAK papers (n = 69) and JLA articles (n = 5) were further reviewed to map the use of learning theories in learning analytics research (Fig. 2) and to develop a trend analysis (Fig. 3). In total, 32 different learning theories were found across the 74 texts, with 94 instances, i.e., some papers included multiple theories.

Fig. 2. List of learning theories identified from the final list of the included papers (Figure is best shown in color)

Fig. 3. Theory use: trend over time. The line shows smoothness along the points, and the standard error of prediction is disabled. (Figure is best shown in color)

Figure 2 clearly shows the dominance of SRL in informing learning analytics research, with Cognitive Load Theory and Constructivism as the second most cited theories, and Cognitive Theory, Connectivism, Meta-cognition, and Situated Learning as the third. Flow Theory, Motivation Theory, Networked Learning, Self-Determination Theory and Social Learning were each referenced three times.

Of these, theories arising from the broader family of Cognitivism are Cognitive Load Theory, Cognitive Theory, Meta-cognition, and Distributed Cognition. Of interest is the number of ‘self’ theories, possibly drawn from a range of broader learning theory categories such as behaviorism, cognitivism, and constructivism. The ‘self’ theories include SRL, Self-Determination Theory, Self-Concept and Self-Efficacy. The only learning theories matching those in the previously discussed lists of Bates (2020), Harasim (2012) and Schunk (2019) are social cognitive theory, cognitive learning processes, constructivism, and connectivism.

To establish whether any theory had become more or less dominant over time, we examined the trend over the 10-year period for theories appearing more than three times. As Figs. 3 and 4 illustrate, there is no single pattern of growth or decline. Some theories show growth in their use over the early years of learning analytics research followed by decline, e.g., Meta-cognition, Social Learning, Constructivism and Networked Learning. Four theories (Cognitive Load Theory, Cognitive Theory, Motivation Theory and SRL) seem to be on the rise, with SRL showing exponential growth in recent years.

Fig. 4. Theory use: proportion over years 2012–2020. (Figure is best shown in color)

Figure 4 depicts the proportion of each theory in use in the included papers, by year. As the graph shows, theories such as constructivism and meta-cognition accounted for a large percentage in particular years. However, SRL and cognitive load theory appear consistently across recent years.

How are these learning theories informing and impacting on learning analytics research?

Analysis of the various learning theories found in the literature has highlighted how some particularly impact on learning analytics research, and raises questions with regard to the lack of explicit mentions of behaviorism. We first discuss selected examples of how theory has impacted on research.

Recognising their embrace of theoretical pluralism, Ferguson and Buckingham Shum (2012) make clear that they do not ground their work “in a single theory of social learning, nor do we think that a techno-centric taxonomy is helpful.” Rather they are “drawing on diverse pedagogical and technological underpinnings” (p. 25). Conversely, there are examples of other researchers who do link to particular learning theories, such as Worsley and Blikstein (2013) who state that their work “fundamentally builds on Piaget’s notion that knowledge is actively and dynamically constructed by the learner based on resources that she already has, and Papert’s Constructionism” (p. 94).

The role of theory in learning analytics research is emphasized by Wise (2014), who states that learning analytics research needs a “theoretical component that explains what concept or construct the analytic represents and what its relevance and relationship to other concepts and constructs is hypothesized to be” (p. 204). Theory allows us to know how to ask the right questions and know whether the answers are relevant. “Practically this means that an analytics user must have an understanding of the pedagogical context in which the data was generated, knowledge of what particular analytics are meant to indicate, and an appreciation of how these relate to the learning goals of the situation” (p. 204). For example, SRL appeared in a number of studies based on observation of learning strategies with a view to either better understanding how students were approaching their learning and/or informing analytics approaches for detecting learning strategies (see, e.g., Matcha et al., 2020). Similarly, cognitive load theory was often applied in seeking additional insight into learning outcomes (see, e.g., Srivastava et al., 2020).

Given the apparent dominance of a handful of theories, we look briefly at the question of how a learning theory might impact learning analytics research. Of particular interest is the claim that behaviorism underpins much of learning analytics (e.g., Rogers, 2015), given the paucity of explicit references to behaviorism in the reviewed papers and articles. For Wilson et al. (2017), learning analytics as behaviorism is based on their assumption that using digital traces as proxies for learning is behaviorist. Rogers (2015) bases his claim that learning analytics understands learning as a new form of behaviorism on the importance of and emphasis on “the manipulation of antecedent conditions to control behaviour” (p. 227). It is interesting that in building his argument of learning analytics as behaviorist, Rogers (2015) refers to three articles: Atkisson (2011), Dietrichson (2013), and Lodge and Lewis (2012). Atkisson (2011) sees very little difference between learning analytics and the assumptions underpinning the ‘radical behaviorism’ philosophy of American psychologist B.F. Skinner, while Dietrichson (2013) refers to the potential of learning analytics to be nothing more than “clickometry” of value in courses founded on behaviorist pedagogies. Lodge and Lewis (2012) state that although there are “notable exceptions”, most of the current approach in learning analytics is “akin to a behavioural theory of learning” and that though “there is some use for a behavioural approach to learning […], these approaches to learning do not provide a complete account of what happens in higher education and through technology-mediated, networked learning environments”.

The only other reference to behaviorism in the context of learning analytics in either LAK papers or JLA articles was by Joksimović et al. (2015), although there it is mentioned only as part of their findings on topics discussed by MOOC participants.

Inherent in the claims of Rogers (2015) and others that learning analytics is, essentially, behaviorist and therefore ‘reductionist’ and ‘positivist’, is an implicit reference to the critiques of the work of Skinner and other behavioral theorists (e.g., Breger & McGaugh, 1965) as underestimating context and the inherent interdependencies and inter-relationships in learning. While it falls outside the scope of this scoping review to evaluate whether all SRL in the context of learning analytics is behaviorist, the review can provide evidence of how SRL functions in learning analytics research.

Self-regulated learning

In the light of the ascendancy and dominance of SRL in learning analytics research (Figs. 2 and 3), we briefly discuss SRL and illustrate, with some examples, how SRL is used in learning analytics research.

Zimmerman (1989) suggests that SRL is linked to seven prominent theoretical perspectives, namely “operant, phenomenological, information processing, social cognitive, volitional, Vygotskian, and cognitive constructivist approaches” (p. 1). Interestingly, in discussing different definitions of SRL, he states that those with cognitive orientations, such as constructivists, prefer “definitions in terms of covert responses” while behaviorists prefer definitions “in terms of overt responses” (Zimmerman, 1989, p. 5). One of the “largest and most influential bodies of research on self-regulation” has been produced by operant researchers following the principles of B.F. Skinner and adapting his behavioral technology for personal use (Zimmerman, 1989, p. 7). Since self-awareness cannot be observed, operant researchers focus on “behavioral manifestations of self-awareness, namely self-reactivity” (Zimmerman, 1989, p. 9). Because operant researchers value how internal processes are manifested in overt behavior in a particular instructional environment, their focus results in the development of instructional interventions such as verbal tuition and reinforcement. At first “external cues and contingencies are imposed, and then self-regulation responses are gradually shaped” (p. 11). As the individual becomes more competent in self-regulation, “external cues are faded, and short-term reinforcers are thinned gradually” (p. 11).

There is ample evidence that the majority of articles referring to SRL expand on that reference in the literature review section and/or provide examples of how the adoption of SRL influenced the research design, methodology and/or analysis. For example, Gašević et al. (2014) use text and video analytics to investigate “students’ use of the tool and the psychometrics and linguistic processes evident in their written annotations” (p. 123) and interpret the findings through the lens of SRL. The authors note that SRL is a “foundational theory in modern educational psychology”, that it “recognizes that learners create knowledge, and select and manage cognitive operations in the form of study tactics they apply to learn”, and that “[d]ynamic feedback loops are the core of self-regulated learning” (p. 124). The analysis of students’ annotations is key to the “understanding of students’ self-regulated learning and self-reflection processes” (p. 125). Another example where SRL is core to learning analytics research is the paper by Molenaar et al. (2015), who investigated the “effects of sequences of socially regulated learning on group performance”. A further example is the article by Azevedo et al. (2017), who report on using data visualizations to illustrate “cognitive, affective, metacognitive, and motivational (CAMM) self-regulated learning (SRL) processes” with the purpose “to foster emotion regulation (ER) with advanced learning technologies” (p. 444).

There are also a number of articles combining SRL with other theories, such as the article by Larmuseau et al. (2018) combining SRL with cognitive load theory. Another example of linking SRL with other (related) learning theories is the article by Crossley et al. (2020) discussing SRL in the context of the work by Bandura (1977) and Bandura and Schunk (1981) on self-efficacy. Also of interest is the article by Milligan and Griffin (2016) combining references to SRL with connectivism, the acquisition of a range of twenty-first century skills and the social construction of learning.

A detailed analysis will be necessary to determine which of the seven prominent theoretical orientations in SRL, as suggested by Zimmerman (1989), is dominant in learning analytics research. It is also important to note the work by Winne (2017), who tracks the “trajectory of scholarship about self-regulated learning” from Descartes's 17th-century writings implying “mental activities consistent with metacognition”, through the work of Skinner, to the notion of agency inherent in the work by Bandura. In the researchers’ view, it would be disingenuous, without a thorough analysis of all the instances of SRL in the corpus of analysis, to claim that the dominant orientation is the operant orientation. Zimmerman’s own assessment that the operant orientation is the most dominant in SRL may, however, point to its dominance in learning analytics research as well.

What are the implications of prevailing theories in learning analytics research?

The analysis has shown links between learning theory and learning analytics research since its emergence as a field, although we recognize also that the omission of any explicit mention of theory does not mean that the research was not informed by theory.

With increasing access to bigger data sets; a greater variety, granularity and velocity of data; the increasing use and prominence of multimodal learning analytics; and the potential and risks of Artificial Intelligence and Machine Learning, we caution against approaches that devalue learning theory. More data may mean more noise and, with underdeveloped theory, there is a real risk that the noise is mistaken for the signal (Silver, 2012).

A recent paper by Wiley et al. (2020) petitions for a grounding of learning analytics in theoretical understandings of learning. Without theory, data can be selected haphazardly, or learning analytics biased towards “data that is simply proximal to rather than consequential for learning” (p. 569); it is theory that distinguishes learning analytics from data analytics. These authors argue for theory-informed or theory-grounded learning analytics on the basis that “theory informs the decision of which data are most appropriate for measuring a particular aspect of learning as well as facilitating the explanation of analytics-identified student outcomes and lighting the path for responsive action” (p. 570). The grounding of learning analytics in theory is also supported by Papamitsiou et al. (2020), who claim that “Failing to include theory and practice (e.g., pedagogical perspective, learning theories) is likely to slow progress, fail to achieve cohesiveness and universality, and might threaten validity” (pp. 567–568).

The analysis also provides evidence of the richness of the field with a wide range of learning theories being used, referred to and applied in learning analytics research. The theoretical plurality of learning analytics research, however, offers both risk and potential. Theoretical pluralism provides more than one heuristic lens with which to engage with and understand a phenomenon and offers flexibility. Midgley (2011) provides a rationale in support of theoretical pluralism by stating that (a) all knowing is inevitably bounded; (b) greater insight can be generated by exploring the boundaries of knowledge than by taking boundaries for granted; (c) different theories assume different boundaries; and (d) exploring multiple boundaries can usefully involve drawing upon multiple theories (p. 6). Multiple theories, each with its own boundary judgment, are a rich resource for undertaking research in complex environments. Acknowledging that paradigmatic pluralism “is a largely unaddressed reality” in educational technology, Kimmons and Johnstun (2019) suggest that professionals have not always been provided with sufficient guidance on how to navigate it. While our analysis pointed to a rich array of educational learning theories being used in learning analytics research, it is not clear to what extent researchers recognize the implications of their choice of learning theory, or how their research designs, methods and analyses might have differed had another learning theory informed the research. This is a pertinent point when one theoretical orientation begins to dominate a research focus and practice, as it may result in an incommensurability with other approaches or designs (e.g., Kuhn, 1996), or a monist approach (Kimmons & Johnstun, 2019).

Considering the inherent multi- and interdisciplinary nature of the field of learning analytics, pluralism (whether paradigmatic, theoretical or methodological) is much more than a given; it is a characteristic that should be intentionally nurtured. No discipline, theory, or methodology has all the answers, nor the definitive answer, to the complexities of student learning. This means that once a particular discipline, paradigm or methodology begins to dominate the field of learning analytics, the field may lose some of its richness, and potentially one of its defining characteristics (Gašević et al., 2017; Selwyn, 2020). McPherson et al. (2016) also raise the concern that “Although learning analytics itself is an interdisciplinary field, it tends to take a ‘one-size-fits-all’ approach to the collection, measurement, and reporting of data, overlooking disciplinary knowledge practices” (p. 158). We should, however, not underestimate the inherent challenge in adopting a pluralist orientation as it means “suspending values, beliefs, or foundational assumptions (e.g., epistemological, ontological) to work within potentially contradictory paradigms at different points in time” (Kimmons & Johnstun, 2019, p. 636). Working within the context of the inherently multi- and interdisciplinary field of learning analytics requires, in the words of Kimmons and Johnstun (2019), a commensurabilist orientation “that various methods can be used to achieve the same goals or to uncover the same underlying phenomena, wherein practical day-to-day activities, such as what data we collect and how we collect them, operate semi-autonomously from our underlying assumptions” (p. 636).

Navigating the multi- and interdisciplinary nature of learning analytics, between monist and pluralist, or commensurabilist and incommensurabilist orientations, will require a commitment to openness and epistemological honesty, recognising one’s own positions, and a willingness to engage with other positions: “situating [ourselves] deeply in disparate paradigmatic communities but struggle with the complexity that such a self-contradicting approach requires” (Kimmons & Johnstun, 2019, p. 640). (See the techniques suggested by Kimmons and Johnstun, 2019.)


We acknowledge that focusing on these two publishing outlets is a limitation and that we cannot use these findings to generalize to the whole field of learning analytics. However, given that our research suggests that a significant percentage of relevant studies would be included within these outlets, and that both publication outlets are found under the auspices of SoLAR, we are confident that our findings are significant, not only for the SoLAR community, but also for those who do not formally align with SoLAR but who have an interest in learning analytics research and practice. Given that this review is centred on uses of learning theory, the use of the initial search term ‘theor’ appears reasonable. However, we recognise that this may have excluded those papers which did not explicitly include this term. There may have been other relevant studies informed by theory which were not picked up by the search. The apparent lack of consensus on the boundaries of what constitutes learning theory inevitably muddies the waters. The authors have made a judgement that those theories identified here are generally accepted as learning theories, and that others, which may later be considered appropriate, were at this time excluded.


Explaining and understanding student learning, and student retention and success in all of its complexities (Braxton, 2000; Spady, 1970; Tinto, 1975, 2006), are central to the value contribution of learning analytics. This scoping review provides ample evidence that learning theory has always been, and continues to be, an inherent part of learning analytics research, despite ongoing examples of studies that appear to rely on allowing data to speak for itself. Having said that, it would be remiss not to mention that learning analytics in practice, for example, as evidenced in administrative and practitioner-oriented applications, may lean on data fishing approaches rather than be led by accepted learning theory. Though we found that not all papers and articles refer to learning theory, we do not make the assumption that the research presented in these papers and articles is uninformed by theory. Indeed, we found theories from a range of disciplines such as, inter alia, education, sociology, computer science and economics. The theoretical pluralism in the field was further evidenced in the range of specifically learning theories, some of which are on the wane, while other theories, such as SRL, are in the ascendancy.

The value contribution of this scoping review is found in providing a glimpse of the range of learning theories used in learning analytics research since its emergence as a discipline, field, research focus and practice. The evidence suggests that the field is rich in different perspectives, paradigms and theoretical influences, some from outside of education. Taking note of the increasing dominance of SRL in learning analytics research, we note that SRL is not homogeneous, but is itself informed by a range of theoretical perspectives, of which the operant version is linked to and supports a particular form of behaviorism as learning theory. In the light of its ascendancy, it is crucial for learning analytics as a field to consider the different theoretical approaches within SRL and explore the richness of this learning theory in all its dimensions.

The theoretical pluralism found here provides a clear rationale for learning analytics to embrace learning theory whilst remaining cautious when any one discipline or theoretical approach starts to dominate the field. Theory has, in its essence, the potential to offend, to unsettle and “open up static fields of habit and practice” (MacLure, 2010, p. 277) and to disrupt and transform (Biesta et al., 2011). The value of theory lies, then, in its ability to “get in the way”, to “interrupt” and “thereby, hopefully, open new possibilities for thinking and doing” (MacLure, 2010, p. 277).