Introduction

As higher education has become increasingly digitalized and datafied, not only have pedagogy, curriculum, and assessment practices changed in response to the availability of synchronous and asynchronous technologies, but institutions have also gained access to more granular and often real-time student and learning data from a variety of sources (Prinsloo, Slade, & Khalil, 2021). Collecting, measuring, and analyzing student data for the purposes of improving the effectiveness of teaching and learning, for use by educators, students, and course development and support teams, came to be known as learning analytics (LA) (Siemens & Long, 2011).

Since the emergence of LA in 2011 as a distinct research focus and practice, the field has matured and become institutionalized predominantly in the Global North and in residential institutions, which also dominate research into LA (Guzmán-Valenzuela, Gómez-González, Tagle, & Lorca-Vyhmeister, 2021). While there is emerging evidence that LA is making inroads in the Global South, its adoption by traditional distance and open distance education institutions remains largely limited to the Open University in the UK (Prinsloo, Slade, & Khalil, 2022).

This chapter first provides a brief overview of LA and its relevance, especially for Open, Distance, and Digital Education (ODDE). The theoretical influences guiding LA are then discussed, followed by a review of selected discourses and research in LA. The chapter then raises open questions and directions for future research before concluding with the implications of this research for ODDE practice.

Central to the value contribution of this chapter is not whether LA can contribute to more effective and successful learning in ODDE contexts, but rather under what conditions LA will become an essential part of teaching and learning in ODDE institutions.

Overview of LA and Its Relevance

It is crucial to understand LA against the historical use of student data in service of improving teaching strategies, student support, and student learning. Collecting and using student data – whether demographic, registration, prior learning experience, or current learning behavior data – has always been essential to teaching, regardless of the modality. Data on students’ performance and progression in their courses and chosen program of study inform not only institutional strategic and operational planning, such as enrolment plans and resource allocation, but also teachers’ choice of pedagogical strategies and assessment approaches, as well as the strategies of student and course support teams. Feedback on progress toward the envisioned outcomes of courses, whether in the form of assignments, tasks, or examinations, is furthermore key to informing students of their progress and their probability of passing or failing, and serves as an essential resource for students to make informed decisions. It is, however, clear that as institutions become digitalized and datafied, they gain access to increasing volumes, variety, and granularity of student data from a range of sources, so that student data, and particularly LA, will only increase in strategic and operational importance.

What Is the Relevance of LA for ODDE?

  • Teachers in ODDE contexts often feel as if they are “teaching in the dark” due to the physical separation between students and teachers that is inherent in distributed learning contexts. Teachers and institutions therefore rely on student learning data (e.g., assignments and online behavioral and engagement patterns) to get a sense of students’ progress, their risk of dropping out, and/or their need for additional guidance and support.

  • Since its emergence in 2011, LA has matured not only as an institutional practice but also as a research focus, and provides a wide range of empirical evidence of its potential to predict student performance, provide decision support for teachers and students, predict retention and dropout, and describe and predict cognitive states and learning interactions (Du, Yang, Shelton, Hung, & Zhang, 2021; see also Bart, Olney, Nichols, & Herodotou, 2020).

  • Student retention and success in distributed learning contexts have always been, and remain, a cause for concern (Kember, 1995; Subotzky & Prinsloo, 2011). The measurement, collection, analysis, and use of student data therefore offer significant potential to increase student retention and success and to inform learning design, assessment strategies, and student support interventions.

In the next section, the major theoretical insights that guide and emerge from LA are discussed.

Guiding Theories

The influence and importance of theory in LA are well established and appreciated (Wise & Shaffer, 2015). For example, Rogers, Gašević, and Dawson (2016) state that theory “is an explicit articulation of the causal forces and mechanisms in a domain of interest that purports to connect empirical findings to each other and to the whole, making sense of what is figure and what is ground” (p. 237). As such, theory lays out the core assumptions and fundamental principles, in general hypothetical terms, that inform research into a particular phenomenon or practice. How theory, and which theories, shaped the evolution of LA and continue to shape research into and the adoption of LA is, however, more difficult to establish. Much of the published research in LA does not explicitly mention any theory, but the mere “absence of explicit theory in [e.g.] predictive analytics research does not mean ‘no theory’” (Rogers et al., 2016, p. 238).

To understand the role of theory in LA, it is important, first, to understand the emergence of LA in the broader context and history of data-informed decision-making in higher education and, second, to understand the interdisciplinary nature of LA.

Student and organizational operational data have always formed part of, and served as the basis for, data-informed decision-making in higher education institutions’ planning and reporting, latterly through what has come to be known as Educational Data Mining (EDM) (Baek & Doleck, 2021; Liñán & Pérez, 2015). The specific focus on student learning data, to inform the decisions teachers and students make, emerged from the intersection of several disciplines such as, but not limited to, psychology, education, computer science, and the broader social sciences (Ferguson, 2012). Though LA and EDM are often used interchangeably, LA is distinct from EDM with regard to its theoretical influences, its purpose, the data it collects, and the users of the analytics (Baker & Inventado, 2014; Ferguson, 2012). The interdisciplinary nature of LA also results in discipline-specific theories and practices shaping how student learning is understood, what data are collected, and how the data are understood.

Though it is impossible to provide a comprehensive overview of all the theoretical insights in LA as an interdisciplinary field, the next section maps selected key theoretical “moments” that shaped the theorizing of LA as well as LA as praxis.

Theoretical Influences

The article by Siemens and Long (2011) was the first specific overview of the expectations and aims of learning analytics, and also the first to clarify the difference between academic and learning analytics. Academic analytics refers to institutional (learning profiles, performance), regional, national, and international analytics used by administrators, funders, and marketing, as well as a range of educational authorities and governments. In contrast, learning analytics refers to course-level (student behavioral and learning) and departmental data used by students and faculty. They warn that confining learning analytics to behavioral data risks “a return to behaviourism as a learning theory,” and they ask, “how can we account for more than behavioural data?” (p. 38). This early warning has continued to haunt LA up to the present day (Rogers et al., 2016; Selwyn, 2019).

The first published article to make specific reference to theories informing LA is that of Clow (2012), in which he refers to the five-step model proposed by Campbell and Oblinger (2007) and the need to put LA on “an established theoretical base” (p. 134). Theories referred to by Clow (2012) include Kolb’s (1984) Experiential Learning Cycle, which draws on the work of Dewey and Piaget, and Schön’s (1983) and Argyris and Schön’s (1974) work on reflective practice. He also refers to Laurillard’s (2002) Conversational Framework. Other educational literature mentioned includes approaches to learning – deep, surface, or strategic – with reference to the work of Richardson (2000) and Trigwell and Prosser (2004), as well as closed- and open-loop control systems from “engineering theory” (Clow, 2012, p. 136).

The role of theory in making sense of access to ever more data was considered very early in the evolution of LA. Higher education institutions and certain forms of delivery (e.g., MOOCs) increasingly have access to more data, not only from the institutional Learning Management System (LMS) but also from a range of other sources, such as geolocation, multimodal, and other forms of data “across sites and multiple identities” (Siemens & Long, 2011, p. 38). With this access to more data, a particular imaginary emerged that theory is no longer needed because “the data deluge makes the scientific method obsolete” (Anderson, 2008) and “with enough data, the numbers speak for themselves” (in Siemens & Long, 2011, p. 34; see also Wise & Shaffer, 2015).

Pointing to the apparent tension between access to more data and theory, Clow (2013) states: “As a field, learning analytics is data-driven and is often atheoretical, or more precisely, is not explicit about its theoretical basis,” and although there are attempts to ground LA in theory, “this is not universal, running the risk of treating the data that have been gathered as the data that matter” (p. 692; emphasis added). Gašević, Dawson, and Siemens (2015) later remark that “learning analytics tools are generally not developed from theoretically established instructional strategies, especially those related to provision of student feedback” (p. 65; emphasis added). Reflecting on the relative success of the Signals project at Purdue University, they note that “the tool design did not have sufficient theoretically informed functionality to encourage adoption of effective instructional and intervention practices” (p. 66). Following Winne (2006, in Gašević et al., 2015, p. 66), the authors reflect on three axioms from the field of educational psychology – “learners construct knowledge, learners are agents, and data includes randomness” (p. 66). Building on the principles of Self-Regulated Learning (SRL), the authors focus on students as agents whose freedom to make choices is nonetheless constrained by internal and/or external conditions.

Recent research by Wang, Mousavi, and Lu (2022) mapped key theoretical constructs found in LA research and found that most studies “were guided by the theories of self-regulated learning and social constructivism; most integrated theories into LA for better interpreting the data analysis results; and most linked theoretical constructs and log variables directly.” The authors also found that researchers employed survey instruments to measure theoretical constructs. (See also Prinsloo et al. for an overview of the complexities in identifying the dominant theories in LA research.)

Though the above is anything but a systematic and comprehensive overview of theoretical underpinnings and emergent theoretical issues in LA since 2011, it does provide a useful and insightful basis from which to consider, in the rest of this chapter, key theoretical moments in LA. In the next section, this chapter outlines selected major research and discourses in LA.

Selected Major Research and Discourses in LA

In selecting major research and discourses in LA of particular importance for ODDE, it is important to note that the field of LA and the scope of its published research are rich and wide. For example, research and discourses include, inter alia, stakeholder perspectives, new developments such as multimodal analytics and Artificial Intelligence (AI), the use of dashboards, measuring the impact of LA, and issues pertaining to student consent, privacy, and ethics. The following sections map research into the adoption and institutionalization of LA, the role of LA in informing learning design and pedagogy, privacy and ethics in LA, and evidence of the impact of LA.

The Adoption and Institutionalization of LA

Since the emergence of learning analytics in 2011, one of the main foci in LA discourses and research has been the factors that may influence the adoption and institutionalization of LA. The first framework, intended not only for understanding LA from an institutional perspective but also for informing its adoption, was provided by Greller and Drachsler (2012). They proposed six interdependent and mandatory critical dimensions encompassing “stakeholders, objectives, data, instruments, external constraints, and internal limitations” (p. 45). Of particular interest is the authors’ foregrounding of “theories of learning, teaching, cognition and knowledge” (p. 55), as it points to a recognition of the theoretical influences that shape the institutionalization of LA. They state that “more empirical evidence is needed to identify which pedagogic theory LA serves best” (p. 53). They further opine that while there is evidence of LA being informed by “behaviourist-instructivist style approaches… […] there is as yet little evidence for the support of constructivist approaches to learning” (p. 53). They conclude that “technologies are not pedagogically neutral” and, as such, moot the need for constant evaluation of the approaches taken.

The Learning Analytics Readiness Instrument (LARI) developed by Arnold, Lonn, and Pistilli (2014) refers to “literature [that] offers would-be practitioners a solid base of theory, process, and research” (p. 163) but does not provide any detail as to what this “solid base of theory” entails. In another article, Arnold et al. (2014) refer briefly to the work of Kotter (2008) on “leading change” and emphasize “using the existing research and theory as the foundation to begin building out new theory and research in system level thinking to support learning analytics” (p. 260). The authors refer to the five stages of Puglise’s student success analytics (2010, in Arnold et al., 2014, p. 258), namely (1) technology infrastructure, analytics tools, and applications; (2) policies, processes, practices, and workflows; (3) values and skills; (4) culture and behavior; and (5) leadership.

The Supporting Higher Education to Integrate Learning Analytics (SHEILA) project (https://sheilaproject.eu/sheila-framework/) cofunded by the European Commission via the Erasmus+ program (Tsai et al., 2018) is one of the more recent, comprehensive, and widely used frameworks for institutionalizing LA. The framework was developed “based on interviews with 78 senior managers from 51 European higher education institutions across 16 countries,” and Tsai et al. (2018) report on findings of the implementation of the framework in four different institutional settings.

The purpose of the SHEILA project was “to guide individual institutions to develop a comprehensive policy that speaks to the needs of their particular contexts and stakeholders therein” (p. 4), and the project focused on the following research questions: (1) What is the state of the art in terms of LA adoption among European HEIs? (2) What are the key drivers for LA from the perspectives of institutional leaders, teaching staff, and students? (3) What are the key challenges for LA from the perspectives of institutional leaders, teaching staff, and students? (4) How can we move toward systematic adoption of LA in higher education? The SHEILA framework highlights four important areas of work in the implementation of LA, namely (1) tool development; (2) policy development; (3) user-centered implementation; and (4) communication with primary stakeholders. As such, the SHEILA framework provides a structured approach to drafting a policy for learning analytics by allowing institutions wanting to implement LA to map the political context, identify key stakeholders, identify desired behavior changes, develop an engagement strategy, analyze the internal capacity to effect change, and establish monitoring and learning frameworks (https://sheilaproject.eu/sheila-framework/).

LA As Informing Learning Design and Pedagogy

Ameliorating the effects of the geographic separation between students and teachers has been central to the evolution of ODDE praxis and theorization, for example, Moore’s (2019) work on transactional distance, the promise of guided didactic conversation (Holmberg, 1999), getting the right mix of different elements and technologies in the design of learning experiences (Anderson, 2003), and the Community of Inquiry Framework (Garrison & Arbaugh, 2007), to mention but a few. Considering that LA and research into LA are about improving the effectiveness of teaching and learning (Gašević et al., 2015), the success of LA in informing learning design and pedagogy is one of the most important themes in LA research.

Reflecting on the alignment of LA with learning design in the context of the Open University (OU) in the United Kingdom (UK), Rienties, Nguyen, Holmes, and Reedy (2017) report that “learning design decisions made by OU teachers seem to have a direct and indirect impact on how students are working online and offline, which in part also influenced their satisfaction and learning outcomes” (p. 147). Learning design at the OU focuses on what students do, in contrast to many approaches that emphasize what teachers do. As such, student digital engagement data across the different activities allow teachers to design formative learning activities that not only address student needs but also ensure that students are retained and supported toward success. LA allows teachers to provide feedback to students on what they do and the progress they make. Rienties et al. (2017) state that while there have been claims that learning design impacts the effectiveness of learning and teaching, LA provides the opportunity to evidence such impact. LA at the OU is based on an institutionally approved learning design taxonomy consisting of a number of learning design activity types, namely (1) assimilative; (2) finding and handling information; (3) communication; (4) productive; (5) experiential; (6) interactive/adaptive; and (7) assessment, as illustrated in the sketch below.
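To make the pairing of learning design and engagement data concrete, the following minimal Python sketch rolls up hypothetical student engagement records by the seven taxonomy categories listed above. The data shape, field names, and figures are illustrative assumptions for this chapter, not the OU’s actual implementation.

```python
# A minimal sketch: aggregate student engagement by learning design
# activity type. The seven categories follow Rienties et al. (2017);
# the event records and minutes are hypothetical.
from collections import defaultdict

TAXONOMY = {
    "assimilative", "finding_and_handling_information", "communication",
    "productive", "experiential", "interactive_adaptive", "assessment",
}

# Each hypothetical engagement event: (student_id, activity_type, minutes)
events = [
    ("s1", "assimilative", 42), ("s1", "assessment", 15),
    ("s2", "communication", 8), ("s2", "assimilative", 30),
]

def engagement_by_activity(events):
    """Total minutes per taxonomy category across all students, e.g., to
    compare the designed workload with actual student engagement."""
    totals = defaultdict(float)
    for _student, activity, minutes in events:
        if activity not in TAXONOMY:
            raise ValueError(f"unknown activity type: {activity}")
        totals[activity] += minutes
    return dict(totals)

print(engagement_by_activity(events))
# -> {'assimilative': 72.0, 'assessment': 15.0, 'communication': 8.0}
```

A rollup of this kind is what allows a teacher to see, for instance, that a week designed to be largely “productive” is in practice consumed by “assimilative” activity, and to adjust the design accordingly.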

Another study providing evidence of the positive correlation between LA, pedagogy, and student engagement is the research by Macfadyen, Lockyer, and Rienties (2020), which foregrounds the impact of the decisions made by educators and “how students are reacting to these decisions” (p. 10). Considering the diversity of students, especially in ODDE environments, LA provides key insights into how to personalize learning experiences depending on student profiles, behavior, and support needs.

Where Is the Evidence? Mapping the Impact of LA

While there is evidence suggesting that LA helps to improve learning design and pedagogy (as discussed above), it is less clear to what extent LA impacts student retention and success.

One of the first published reports on the impact of LA, by Arnold et al. (2012), shares findings on the use of a “predictive student success algorithm (SSA) [that] is run on-demand by instructors” and reports positive outcomes of the Signals project. The algorithm consisted of four components, namely “performance, effort, as defined by interaction …; prior academic history…; and, student characteristics, such as residency, age, or credits attempted.” Each of these components was weighted and then operationalized by the algorithm to determine students’ chances of success, resulting in red, yellow, or green signals displayed on students’ course homepages: “A red light indicates a high likelihood of being unsuccessful; yellow indicates a potential problem of succeeding; and a green signal demonstrates a high likelihood of succeeding in the course.” The authors conclude that “The use of learner analytics through the application of Course Signals to difficult courses has shown great promise with regard to the success of first- and second-year students, as well as their overall retention to the University” (Arnold et al., 2012). Given that the study by Arnold et al. (2012) is the “most frequently cited institutional deployment of learning analytics” (Jisc, 2016), it is important to note that some of the claims of improved student retention have been disputed by Caulfield (2013), Straumsheim (2013), and Clow (2013).
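To illustrate how such a weighted, component-based signal might work, consider the following minimal Python sketch. The four components follow the description in Arnold et al. (2012), but the weights, normalization, and thresholds below are hypothetical assumptions, since the actual SSA parameters were not published.

```python
# Illustrative Course Signals-style risk indicator. Components follow
# Arnold et al. (2012); weights and thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class StudentRecord:
    performance: float      # points earned to date, normalized to 0..1
    effort: float           # LMS interaction relative to peers, 0..1
    prior_history: float    # e.g., normalized prior academic results
    characteristics: float  # composite of residency, age, credits attempted

# Hypothetical component weights (sum to 1.0 so the score stays in 0..1).
WEIGHTS = {"performance": 0.4, "effort": 0.3,
           "prior_history": 0.2, "characteristics": 0.1}

def success_score(s: StudentRecord) -> float:
    """Weighted combination of the four components."""
    return (WEIGHTS["performance"] * s.performance
            + WEIGHTS["effort"] * s.effort
            + WEIGHTS["prior_history"] * s.prior_history
            + WEIGHTS["characteristics"] * s.characteristics)

def signal(s: StudentRecord) -> str:
    """Map the score to the red/yellow/green signal shown on the course
    homepage. Thresholds are illustrative only."""
    score = success_score(s)
    if score >= 0.7:
        return "green"   # high likelihood of succeeding
    if score >= 0.4:
        return "yellow"  # potential problem of succeeding
    return "red"         # high likelihood of being unsuccessful

print(signal(StudentRecord(0.55, 0.30, 0.60, 0.50)))  # -> yellow
```

The pedagogical appeal of the traffic light lies in its simplicity: students and instructors see a single, interpretable indicator rather than the underlying weighted model.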

Moving forward to 2017, Ferguson and Clow (2017) point to a number of problems with the evidence pertaining to the impact of LA, such as a lack of geographical spread; gaps in the knowledge base of LA (e.g., no evidence of LA in informal learning contexts); “little evaluation of commercially available tools”; a lack of attention to the learning analytics cycle (Clow, 2012); limited attention to ethics; issues pertaining to sample selection and access to research findings; and an “over-representation of LAK conference papers.” The authors conclude that “there is considerable scope for improving the evidence base for learning analytics.” The quest to find evidence of the impact of LA on student learning is also addressed by Kitto, Shum, and Gibson (2018), who opine that the lack of evidence could be attributed to the “mistake of concentrating development in LA upon a concept that is easy to define and track, but not particularly useful to learning,” combined with an overemphasis “upon valuing what we can measure, instead of measuring what we value — a longstanding concern in educational assessment” (p. 454).

A more recent attempt to map evidence of the efficacy of LA is found in the systematic review by Larrabee Sønderlund, Hughes, and Smith (2019), which found only 11 studies that evaluated the effectiveness of interventions based on LA. The authors conclude: “While there is plenty of research on the forecasting of student performance and retention, there is very little on the effectiveness of LA interventions” (p. 2613). They further note that “The LA interventions that we have identified centre on the idea that alerting students to their risk status, and engaging them on this basis, will change their performance for the better,” an assumption that, according to the authors, carries several caveats. These findings are also confirmed by Ifenthaler, Mah, and Yau (2019). Viberg and Gronlund (2021) confirm that there is still “very little existing evidence” that LA improves teaching, learning, and student support at scale, and Guzmán-Valenzuela et al. (2021) propose, based on their bibliometric and content analysis, that there is a “preponderance of analytics but very little learning.” These authors venture that two communities exist within the LA landscape, namely “a practice-based community led by management units within higher education institutions and an academic community whose object of research study is LA as such” (p. 16). Of specific importance to this chapter is their finding that “there is a shortage of papers devoted to developing or expanding educational theories about students’ learning” (p. 16).

Early research on understanding student retention and success in distance education environments from a socio-critical perspective (Subotzky & Prinsloo, 2011) points to the role that the lack of, and inefficiencies in, administrative support plays in student frustration and dropout. In a recent article, Herodotou, Naydenova, Boroowa, Gilmour, and Rienties (2020) explore how predictive learning analytics and motivational interventions increase student retention and enhance administrative support in distance education. The research used a Student Probabilities Model (SPM) that “produced predictions of whether an individual student would reach specific milestones (different points in a course presentation or between courses), such as completing and passing a course, or returning in the next academic year” (p. 75). Based on the outcomes of the SPM, proactive interventions such as texts, mail, and phone calls were executed, which “likely helped students remain engaged and progress through their studies” (p. 78) and confirmed that “interpersonal contact and communication with support and academic teams is more likely to contribute to a sense of belonging and social integration with the university, connecting students with the institution from a distance” (p. 80). (See also Herodotou et al., 2020.)
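The following minimal Python sketch illustrates the general shape of such a workflow: a probabilistic model estimates whether each student will reach a milestone, and students with low predicted probabilities are flagged for proactive outreach. The model choice (logistic regression via scikit-learn), the features, and the threshold are assumptions made for illustration; Herodotou et al. (2020) do not describe the SPM at this level of detail.

```python
# Sketch of an SPM-style workflow: predict milestone attainment and flag
# low-probability students for proactive contact. All data hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: [assignments_submitted, vle_clicks_per_week]
X_train = np.array([[4, 120], [1, 10], [3, 80], [0, 5], [5, 200], [2, 40]])
y_train = np.array([1, 0, 1, 0, 1, 0])  # 1 = student reached the milestone

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

def plan_outreach(students, threshold=0.5):
    """Return (student_id, p_milestone, action), flagging students with a
    low predicted probability for text/mail/phone contact."""
    ids = [s[0] for s in students]
    X = np.array([s[1:] for s in students])
    probs = model.predict_proba(X)[:, 1]  # P(reaching the milestone)
    return [(sid, round(float(p), 2),
             "contact (text/mail/phone)" if p < threshold else "none")
            for sid, p in zip(ids, probs)]

# Two hypothetical students part-way through a course presentation.
for row in plan_outreach([("s1", 1, 15), ("s2", 4, 150)]):
    print(row)
```

The design point worth noting is that the prediction itself is only half of the intervention: it is the human follow-up (text, mail, phone) triggered by the flag that the study associates with belonging and social integration.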

Ethics and Privacy in LA

Concerns about privacy and ethics in the measurement, analysis, and use of student data have been part and parcel of the evolution of LA from before its official launch in 2011 (Knox, 2010; Slade & Galpin, 2012). Since then, various issues and concerns regarding ethics and privacy in LA have formed part of mainstream discussions in LA. The first comprehensive attempt to map the ethical and privacy concerns and propose pointers for consideration is found in the work of Slade and Prinsloo (2013), who mooted several principles, namely (1) LA as moral practice; (2) students as agents; (3) student identity and performance as temporal dynamic constructs; (4) student success as a complex and multidimensional phenomenon; (5) transparency; and (6) that higher education cannot afford to not use data. Of particular interest in the context of this chapter, this work by Slade and Prinsloo (2013) formed the basis for the first institutional policy for ethics in LA, developed at the Open University (UK) (OUUK, 2014). Other examples of how ethics and privacy claimed a space in LA research, discourses, and practice include an article by Pardo and Siemens (2014) on ethical and privacy principles in LA and a proposal by Willis (2014) to go beyond utilitarianism in thinking about ethics in LA.

Since this early research into the ethical and privacy concerns in LA, numerous initiatives, both institutional and research-based, have followed, such as the Code of Practice for Learning Analytics (Jisc, 2018); the discussion paper “The ethics of learning analytics in Australian higher education” (Corrin et al., 2019); the “Global guidelines: Ethics in learning analytics” developed by the Association for the Advancement of Computing in Education (Alayan, 2019); and codes of practice/ethics for learning analytics in several higher education institutions, such as the University of Leeds (UK) (2019). See Pargman and McGrath (2021) for a systematic review of ethics in LA.

Open Questions and Directions for Future Research

From the preceding, selective overview of some of the major research foci and discourses in LA research, several questions arise that may serve as directions for further research. Probably the most pertinent question pertains to the scattered and incomplete evidence that LA impacts positively on student retention and success. While it falls outside the scope of this chapter to speculate about the lack of unequivocal evidence, the above overview offers glimpses that may hold clues, such as the following:

  • How does LA research build on existing theory? Many authors have pointed to the lack of theoretical grounding in LA research (e.g., Misiejuk & Wasson, 2017). Despite, or amid, the reality that LA is found at the nexus of various disciplines, methodological approaches, and theoretical frameworks, the evident lack of explicit theory is cause for further research, if not concern. Even when reference is made to the work of Tinto (e.g., Arnold et al., 2012), it is as background only. Considering the rich theoretical and empirical history of research into student success and retention (e.g., Spady, 1970; Tinto, 1975, 1982; Kember, 1995; Subotzky & Prinsloo, 2011), it is not clear to what extent LA as field and as practice takes its cues from theory.

  • What does LA research contribute to theory? We also have to ponder the inverse, namely, to what extent does LA contribute to theory development on student success and retention? Tinto (1982) proposes we have to recognize that “current theory cannot do or explain everything” (p. 688). Acknowledging current theoretical limitations should not “constrain us from seeking to improve our existing models or replace them with better ones” (Tinto, 1982, p. 689). We therefore have to contemplate to what extent LA can expand existing theories or provide novel understandings of student persistence and success.

  • What are the practical effects of LA? We also need to heed the words of Tinto (1982) that we should “also recognize that there are deep-rooted limits to what we can do to reduce dropout both at the national and institutional levels of practice” (p. 699). This chapter provided evidence that much of LA focuses on providing students with information to make better choices, institutions with information on how to better support identified at-risk students, and teachers and learning designers with data to design better pedagogical strategies. Considering that student success is multidimensional and emerges from various, interdependent, and often mutually constitutive factors at the nexus of students (habitus, loci of control, self-efficacy, and prior learning experiences), institutions (character, disciplinary domains, efficiencies, and responsiveness), and macrosocietal factors, is LA measuring the wrong things? It is therefore significant that Tinto (1982) proposes that we “need ask not whether we should eliminate dropout (since that is not possible) but for which types of students in which types of settings we should act to reduce it” (p. 699). This may also require that we question some of the defaults and normalized assumptions in LA (Archer & Prinsloo, 2020).

  • What is the student role in LA? Lastly, growing a student-centered approach to LA means that we need to reconsider the role students play, not only in providing data but also in making sense of the data. We need to engage them on the classification systems and categories used (e.g., household) and on the proxies for their (dis)engagement, and we need to prevent LA from becoming a datafied voice-over of the student experience (Broughan & Prinsloo, 2020) that ignores the complexities not only of student learning but also of facilitating learning and providing administrative, affective, and cognitive support.

The above questions and pointers for future research are, of course, neither comprehensive nor neutral. For a more comprehensive analysis of current research in LA, see the systematic review by Du et al. (2021). A number of authors have also mapped alternative research agendas for LA, such as Gunn (2014), Selwyn (2019, 2020), Wise, Sarmiento, and Boothe (2021), and Prinsloo et al. (2021).

Implications for ODDE Practice

The introduction to this chapter made it clear that collecting and using student data has been part and parcel of education, irrespective of the mode of delivery or its openness. With the emergence of LA in 2011 as a distinct research focus and practice, the potential of using the increasing volumes, diversity, and granularity of data from a variety of sources opened opportunities but also raised several ethical and privacy issues. In the light of concerns about student retention and success in ODDE contexts, LA offers scope for critical interrogation and ethical operationalization. Given that most, if not all, open, distributed, and online provision is in one form or another digitalized and therefore datafied, it is somewhat strange that, outside of the adoption and institutionalization of LA at the Open University (UK), there is no evidence of the adoption of LA at scale in other ODDE institutions (Prinsloo et al., 2022).

Following from this, the most important question that emerges is what is preventing ODDE institutions from embracing and operationalizing LA. Considering that LMSs form, to a large extent, the backbone of administrative and teaching systems in ODDE institutions, more research is needed to investigate why more ODDE institutions are not adopting LA. The issue is not whether LA can contribute to more effective and successful learning in ODDE contexts, but rather under what conditions LA will become an essential part of teaching and learning in ODDE institutions.