1 Introduction

Learning analytics (LA) has attracted much attention due to its promise to offer insights into some of the key challenges faced by higher education institutions (HEIs) [17, 45]. Examples of the challenges that LA can address include student retention, adaptive learning, personalised feedback at scale, and quality enhancement. In spite of many reports of positive results from using LA to address these challenges, there have been few examples of systemic adoption of LA in HEIs [8]. One of the key reasons for the limited adoption is the shortage of LA policies that would guide how HEIs address some of the key legal, ethical, privacy, and security issues vis-à-vis LA [42].

This paper contributes to the broader body of LA literature by reporting the findings of a study that solicited expert input on what a LA policy in HEIs should include. Specifically, the contributions of the paper are: (1) a systematically collected list of features that a LA policy in HEIs should include, (2) empirically systematised and rated themes that encompass these features, and (3) suggestions for HEIs on how to proceed with the development of a LA policy.

2 Literature

2.1 Issues in the Adoption of Learning Analytics

LA aims to close feedback loops with real-time data about learners and learning contexts based on learner engagement and performance, e.g. log data collected from virtual learning environments, academic and demographic data held in student information systems, and the social interactions of learners in online forums or social media. Clow [5] illustrates the feedback loop with four elements that form an iterative cycle: learners, data, metrics, and interventions. LA analyses data about learners and produces feedback based on pre-identified metrics, for the purpose of supporting learners with interventions such as feedback dashboards, personal messages, face-to-face meetings, and curriculum adjustment [4, 20, 30]. However, closing a LA feedback loop can be challenging due to various issues associated with each of the four elements.

The learner is the main subject of data in a LA cycle. The large scope and velocity of data being collected from learners could induce a sense of surveillance and intrusion into spaces considered private or personal [31]. There is a prevailing conflict in the LA field: anonymity policies that guide institutional data practices run against the requirement of LA to retain a certain degree of individual linkage in order to deliver customised interventions [36]. The dilemma that HEIs face here is between the duty of care, in terms of protecting students from being data-fied or having their privacy violated, and the opportunity to improve educational quality through a more personalised approach. This has led to calls for more transparency and greater student control over data [32, 33].

However, operating LA on the basis of individual consent can be problematic: not only are the quality and integrity of data threatened, but the consent received is hardly ever fully informed. Prominent issues with informed consent include students’ lack of interest in, or of information that would help them understand, the implications of agreeing to share data about themselves [32, 36]. This has also led to the question of when consent should be sought [32]. In light of this issue, and in response to the consent requirements of the General Data Protection Regulation 2016/679, the UK non-profit educational consultancy Jisc suggests that institutions seek ‘downstream consent’ (consent for personalised intervention), as there is usually clearer information about the consequences for individuals at this stage than in the phase of finding patterns in data [7].

LA relies on data and metrics to provide so-called ‘evidence-based’ insights. However, a number of issues have been raised in relation to these two elements. In terms of data, common issues include the challenges of integrating information systems and different types of data [2], breaking down data silos [3, 38], and embedding data technologies into existing learning environments [13]. In addition to these technical issues, there is a concern that the choice of data sources and metrics for LA narrows learning to activities that happen in the digital domain, ignoring activities that are not ‘capturable’ or ‘perceivable’ but are an integral part of learning processes [28]. This concern has led to criticisms of LA as being driven by behaviourism, which tends to focus on describing rather than explaining actions [35]. It has also resulted in the problem of metrics being disconnected from educational contexts and the broader social and cultural conditions in which learning takes place [25]. As a result, several scholars contend that the design and implementation of LA need to consider educational theories and practice [14, 15, 21, 22, 24]. In particular, the interpretation of analytics results about learners needs to consider learning design choices [27]. In light of these issues, Gašević and colleagues [17] argue that approaches to LA should be question-driven rather than data-driven, and that institutions need to explore creative data sourcing to tackle learning issues, while acknowledging the inherent limitations of data.

A common issue with LA-based interventions is the limited availability of time and skills among key users [43]. The perception of LA as a burden on workload has been observed especially among teaching staff [19, 26], often resulting in resistance to the adoption of new technology, including LA. Moreover, to close the feedback loop effectively, key users are expected to have a certain degree of data literacy that allows them to interpret data and make critical decisions as to whether and how to act on the feedback [2, 31, 42, 46]; insufficient data literacy among students, for instance, can lead to misinterpretation of LA dashboards and, as a consequence, negative emotions [16]. Both time and skill constraints can stall the development of a data-informed culture in decision making, which is arguably a key step towards institutional transformation with LA [17].

Another common issue to consider when designing interventions is the impact on student well-being and the equity of treatment; e.g. nudging students identified as at risk of failing or underperforming could demotivate learners and cause undue anxiety or damage to self-esteem [19]. Similarly, the peer-comparison function of learning dashboards has often attracted polarised views from students [16, 21, 34]. Although LA has been recognised for its potential to enhance learning by personalising educational support, this strength has also been perceived as an issue of equity of treatment, i.e., educational resources being directed to some learners but not others [34, 44]. On the other hand, a highly personalised approach also raises concerns about spoon-feeding students and thereby impeding the development of independent skills [19]. The above-mentioned issues are crucial to the closure of a LA feedback loop and to systemic adoption of LA at an institutional level. In the next section, we discuss approaches that have been suggested in the literature to tackle these prominent challenges.

2.2 LA Adoption Frameworks and Policy

Issues that hamper institutional adoption of LA tend to derive from the interactions of technical, social, and cultural factors in a complex educational system. A LA sophistication model [41] describes five stages of deployment maturity, starting from awareness and moving on to experimentation, implementation, organisational transformation, and finally sector transformation. The current deployment of LA in the higher education landscape is mostly at the first three stages, with no large-scale systemic adoption reported yet. Recent studies have echoed the observation that the field is thriving but yet to mature [8, 42]; e.g. studies by Ferguson et al. [12] and Viberg et al. [45] show that the potential of LA to improve learning and teaching is yet to be verified by stronger empirical evidence. Moreover, in their review of 252 papers on the adoption of LA in higher education, Viberg et al. found that only a small number of the studies (6%) were deemed scalable. Similarly, Dawson et al. [8] examined 522 papers and found that the majority of LA studies focus on small-scale projects or independent courses.

In view of the tangled interactions between technology and the myriad human and social elements in a complex educational system, scholars have proposed strategic frameworks and approaches to guide LA adoption. For example, Greller and Drachsler [18] proposed a framework of critical dimensions of LA processes to highlight technical requirements, key stakeholders, and social constraints that require attention when formulating LA design. Similarly, the Learning Analytics Readiness Instrument (LARI) [2] assesses five readiness components: governance/infrastructure, ability, data, culture, and process. A beta analysis of this instrument revealed that culture in particular plays a key role in institutional readiness for LA [29]. In light of higher education’s culture of resistance to change, Ferguson and colleagues [13] adopted the Rapid Outcome Mapping Approach (ROMA), originally developed to inform policy processes in international development [47], to promote strategic planning that is responsive to the constantly changing environment of higher education. In addition to the elements of objectives, stakeholders, and capacity considered by the two frameworks mentioned above, this approach highlights a context-specific way of identifying drivers for LA and desired changes.

LA adoption frameworks need to work alongside a sound policy that speaks to different stakeholders and takes into consideration issues that derive from the interactions of social, cultural, technological, and educational dimensions. Jisc, for example, developed a code of practice for LA through a series of expert consultation activities, identifying six types of stakeholders and their responsibilities in LA processes [39, 40]. The purpose of the code is to ensure that LA benefits students and is carried out transparently. A similar approach is seen in the wider European context, where an EU-funded project, Learning Analytics Community Exchange (LACE), drove the development of the DELICATE checklist to demystify pervasive uncertainty about legal boundaries and ethical limits when it comes to LA [10]. The checklist’s eight action points are meant to help institutional leaders develop a trusting relationship with key stakeholders in their deployment of LA.

Existing LA policies do not address all the dimensions deemed important in LA processes, as revealed in a study by Tsai and Gašević [42]. In their review of eight policies, including Jisc’s code of practice and the DELICATE checklist [10, 39, 40], they noted the lack of two-way communication channels among stakeholders in a stratified institutional structure and the absence of indications of the skills or training required for LA, despite the fact that stakeholder involvement and data literacy have been highlighted as key elements of capacity building [1, 18, 29, 41]. They also found that while all the reviewed policies clearly stated that enhancing learning and teaching was the ultimate goal of LA, there was no indication of any pedagogy-based approach that teaching staff, technology developers, or decision makers should consider when developing LA metrics or interventions. Similarly, Dawson et al. [8] point out that the attention paid to evaluating LA-based interventions has been insufficient to date. These discrepancies show that existing policies and guidelines tend to focus on ensuring ethical and legally compliant conduct, while giving relatively little attention to other dimensions that are equally important to LA deployment, as identified in the LA adoption frameworks discussed above.

In light of this, we conducted a group concept mapping (GCM) study to explore disparities between what is considered important and what is considered easy to implement in a LA policy context. Other aspects within the domain of LA have already been explored using GCM, e.g. quality indicators of LA [37], specific changes that learning analytics will trigger in Dutch education [11], and the continued impact of learning analytics on learning and teaching [6]. These studies have shown that GCM is an effective method for collecting and clustering grounded data based on participants’ opinions. However, none of these previous studies specifically used GCM to analyse key stakeholders’ views towards policy in the context of learning analytics. An essential part of policy formation is the consultation of experts who have research and practical experience in implementing LA. Hence, we carried out an expert consultation using a GCM approach to identify essential elements of LA policy and directions for policy development in the field.

3 Methods

Group Concept Mapping (GCM) is a common methodology for identifying a group’s shared understanding of a given issue. Making use of quantitative as well as qualitative measures and providing specific analysis and data interpretation methods, GCM is a highly structured approach that creates maps of the involved stakeholders’ ideas on the chosen topic [23]. Our study was conducted using a GCM online tool and consisted of three steps: (1) brainstorming, i.e. collection of ideas about a topic, (2) sorting of the collected ideas into clusters, and (3) rating of the ideas according to their importance and their ease of implementation. The data collected with the GCM tool were analysed with statistical techniques such as multidimensional scaling and hierarchical clustering to reveal shared patterns. The GCM tool also provides visualisations of the analyses to help grasp the emerging structures and to interpret them. The appeal of using GCM is its bottom-up approach: experts are given ideas to sort and rate that were generated by the community itself.
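
For concreteness, the following is a minimal sketch, in Python, of what these analysis steps could look like outside the GCM tool: participants’ sortings are aggregated into a statement co-occurrence matrix, projected into two dimensions with multidimensional scaling, and grouped with hierarchical clustering. The variable names (e.g. sortings, cooc) and the toy data are hypothetical, and the tool’s internal implementation may differ.

```python
# Illustrative sketch of the core GCM analysis steps (not the GCM tool itself).
import numpy as np
from sklearn.manifold import MDS
from scipy.cluster.hierarchy import linkage, fcluster

n_statements = 99
# Each participant's sorting: a list of groups, each group a list of statement indices (toy data).
sortings = [
    [[0, 1, 2], [3, 4], list(range(5, 99))],   # participant 1
    [[0, 3], [1, 2, 4], list(range(5, 99))],   # participant 2
]

# Co-occurrence: how often two statements were sorted into the same group.
cooc = np.zeros((n_statements, n_statements))
for sort in sortings:
    for group in sort:
        for i in group:
            for j in group:
                cooc[i, j] += 1

# Convert similarity to dissimilarity and embed in two dimensions (the point map).
dissim = len(sortings) - cooc
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissim)

# Hierarchical clustering on the 2D coordinates, cut into six clusters (as in the study).
labels = fcluster(linkage(coords, method="ward"), t=6, criterion="maxclust")
```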

Our study was divided into two phases: a community phase and an expert phase. The community phase consisted of the brainstorming step; participation was accessible via a link and conducted openly, i.e. people did not have to register with the GCM tool in order to participate. Calls for participation were circulated among the academic research community via several channels (e.g. Twitter, project websites, Google groups, personal contact, and email), specifically aiming to reach those interested in LA policies. Participants were asked to generate ideas by completing the statement “An essential feature of a higher education institution’s learning analytics policy should be ...”. The brainstorming phase was open for ten days, from October 1, 2016 to October 10, 2016. Sixty-five people participated in the brainstorming phase and generated a total of 136 ideas. Before the ideas were released into the second phase, identical statements were unified and statements containing more than one idea were split, so that each statement contained one possible LA policy feature. After this cleaning process, the 99 remaining ideas were randomised and pushed into the second phase.

The second phase of the study consisted of the sorting and rating steps. Seventy-five experts from the field of LA (including members of the project consortium) were selected for this part of the study based on their specific experience and expertise (i.e. they had been involved in the domain for several years, had published on LA-related topics, were from the higher education sector, and preferably held a PhD) and were personally invited by email to participate. In order to participate, they had to register with the GCM tool. The sorting and rating module of the tool was open for participation for three weeks, from October 27, 2016 to November 18, 2016. Participants first sorted the features according to their similarity in meaning or theme and were also asked to name the resulting clusters. Dissimilar features were not to be put into a ‘miscellaneous’ cluster but rather each into its own one-feature cluster, in order to ensure feature similarity within clusters. Then, the participants rated all features on a scale of 1 to 7 according to their importance and ease of implementation in an institution’s LA policy, with 1 being the lowest and 7 the highest. In the end, the sortings of 30 participants were included in the study, while the importance ratings of 29 participants and the ease ratings of 25 participants were included (the difference in numbers stems from partial responses being excluded from the analysis).

4 Results

For the sorted features, the GCM tool offers multidimensional scaling and hierarchical clustering, while means, standard deviations, and correlation analyses were computed for the ratings. The outcome of the multidimensional scaling analysis is a so-called point map that can be read like a geographic map of a landscape, with semantically similar feature points located close together. Feature points clustered, for instance, in the North are semantically very different from statements clustered in other parts of the map (see the points visible in the cluster map in Fig. 1). The multidimensional scaling analysis also assigns each feature a bridging value between 0 and 1. Features with low bridging values were grouped with other similar features around them; where bridging values were higher, features could still be grouped together, but the distance to the surrounding points on the map was larger.
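
The exact bridging-value formula used by the GCM tool is not given here; the sketch below is a rough, hypothetical approximation consistent with the description above, reusing the coords and cooc arrays from the earlier sketch: a statement that was co-sorted mainly with statements lying far away on the map receives a high bridging value.

```python
# Hypothetical approximation of bridging values (the GCM tool's exact formula
# may differ): for each statement, average its map distances to the statements
# it was co-sorted with, then rescale the results to the range [0, 1].
import numpy as np

def bridging_values(coords, cooc):
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    weights = cooc.astype(float).copy()
    np.fill_diagonal(weights, 0)                        # ignore self co-occurrence
    raw = (weights * dist).sum(axis=1) / np.maximum(weights.sum(axis=1), 1)
    spread = raw.max() - raw.min()
    return (raw - raw.min()) / spread if spread > 0 else np.zeros_like(raw)
```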

Fig. 1. Cluster map with labels

In order to determine the boundaries between groups of features, i.e. to determine clusters, the GCM tool’s hierarchical clustering analysis was used. Making use of cluster replay maps (i.e. the tool’s different cluster solutions for a given point map), we started with a larger number of clusters (e.g., 12) and worked down to a lower number (e.g., two), examining at each cluster-merging step the features of the clusters to be combined and checking whether the merge made sense. In our case, the six-cluster solution best represented the collected data and the purpose of our study. Once the number of clusters was settled, the clusters needed to be labelled meaningfully. Using the suggestions made by the GCM tool is one way of finding these labels; alternatively, one could look for an overarching theme for all features in a cluster, or only for those with low bridging values. Combining all three methods, we labelled our clusters in the following way (see Fig. 1): (1) privacy & transparency, (2) roles & responsibilities (of all stakeholders), (3) objectives of learning analytics (learner and teacher support), (4) risks & challenges, (5) data management, and (6) research & data analysis. The GCM tool also assigned a bridging value to each cluster: the lower the bridging value, the more coherent the cluster. Cluster 1, privacy & transparency, was the most coherent one (0.12), followed by Cluster 3, objectives of LA (0.28). In the middle, with similar coherence values, were Cluster 4, risks & challenges (0.41), and Cluster 2, roles & responsibilities (0.45). The last two clusters, also with similar values, were Cluster 6, research & data analysis (0.60), and Cluster 5, data management (0.64).
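
As an illustrative sketch only, the cluster-replay inspection and cluster coherence could be approximated as follows, reusing the hypothetical coords, cooc, and bridging_values from the earlier sketches; coherence is taken here as the mean bridging value of a cluster’s features (lower means more coherent), mirroring the description above.

```python
# Walk from 12 clusters down to 2 (the "cluster replay") and report each solution
# together with per-cluster coherence for manual review. Reuses `coords`, `cooc`
# and `bridging_values` from the earlier hypothetical sketches.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

link = linkage(coords, method="ward")
bridging = bridging_values(coords, cooc)

for k in range(12, 1, -1):
    labels_k = fcluster(link, t=k, criterion="maxclust")
    coherence = {c: round(float(bridging[labels_k == c].mean()), 2)
                 for c in np.unique(labels_k)}
    print(f"{k} clusters -> coherence per cluster: {coherence}")
```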

With the clusters identified and labelled, the experts’ ratings of the features according to their importance and ease of implementation in a LA policy were taken into account as well. The GCM tool automatically applied the experts’ ratings to the cluster map and indicated the levels of importance and ease of implementation by layering the clusters. The tool always bases its calculations on a maximum of five layers; the actual number of layers per cluster is based on the average ratings provided by the experts for the features in that cluster. The anchors for the map legend are based on the highest and lowest average ratings across all participating experts. One layer indicates an overall low rating, while five layers indicate an overall high rating for a given cluster (see Figs. 2 and 3).
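
Purely as an illustration, and assuming that layers are assigned by linearly binning each cluster’s average rating between the lowest and highest cluster averages (the tool’s exact rule is not documented here), the layering could be computed as follows; the example averages are made up.

```python
# Map cluster-level average ratings to 1-5 layers by linear binning between the
# lowest and highest cluster averages (assumed rule; the GCM tool may differ).
import numpy as np

def layers(cluster_means, n_layers=5):
    means = np.asarray(cluster_means, dtype=float)
    lo, hi = means.min(), means.max()
    scaled = (means - lo) / (hi - lo) if hi > lo else np.zeros_like(means)
    return (np.floor(scaled * (n_layers - 1)) + 1).astype(int)  # values 1..n_layers

# Hypothetical importance averages for the six clusters:
print(layers([5.6, 4.8, 4.5, 4.9, 4.7, 4.4]))  # layer count per cluster, in 1..5
```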

Fig. 2. Rating map on importance (legend shows average ratings of layers)

Fig. 3. Rating map on ease of implementation (legend shows average ratings of layers)

Fig. 4. Ladder graph of the importance and ease of implementation rating values for the six clusters

Fig. 5. Go-zone graph of all 99 features mapped on the two axes of importance and ease of implementation according to their average rating

A visualisation well suited to comparing the clusters’ ratings is a ladder graph; Fig. 4 shows such a graph for the results of our study. The rating values are based on each cluster’s average rating. A Pearson product-moment correlation coefficient of \(r = 0.66\) indicates a moderate positive relationship between the two aspects of importance and ease of implementation. For both aspects, the privacy & transparency cluster received the highest value by far. As was already observable from the two rating maps, the order of the other clusters differs between the two rating aspects. What the ladder graph shows very clearly, however, is that the experts’ importance ratings were considerably higher than their ratings of ease of implementation: all cluster average ratings for importance were higher than those for ease of implementation, except for the privacy & transparency cluster, whose ease rating was at a similar level to the importance ratings.
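
For illustration, the correlation and a ladder-style comparison of cluster averages can be reproduced with a few lines of Python; the averages below are made up and merely stand in for the study’s actual ratings.

```python
# Compute the importance/ease correlation across clusters and print a simple
# ladder-style comparison. The cluster averages are hypothetical placeholders.
from scipy.stats import pearsonr

clusters = ["privacy & transparency", "roles & responsibilities",
            "objectives of LA", "risks & challenges",
            "data management", "research & data analysis"]
importance = [5.8, 5.1, 4.9, 5.0, 5.2, 4.8]   # made-up cluster averages
ease       = [5.6, 4.4, 3.9, 4.0, 4.2, 3.8]   # made-up cluster averages

r, p = pearsonr(importance, ease)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
for name, imp, ea in sorted(zip(clusters, importance, ease), key=lambda t: -t[1]):
    print(f"{name:28s} importance {imp:.1f}   ease {ea:.1f}")
```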

A third visualisation of the rating data offered by the GCM tool is the go-zone graph. These graphs allow us to explore the features in relation to their ratings in more depth. In a go-zone graph, each point, i.e. each feature, is mapped onto the space spanned by the x- and y-axes based on the mean values of the two ratings, importance and ease of implementation. Go-zone graphs can be created for individual clusters or for all features together; Fig. 5 shows the go-zone graph for all 99 features in our study. These graphs make it easy to identify features that are particularly important or particularly easy to implement in a LA policy. They also allow the identification of features with a good balance of importance and ease, and are thus very useful in the selection of features suitable for a LA policy. For example, the results of the GCM have been used to update the first version of the SHEILA framework [43].
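
A go-zone plot of this kind can be recreated with standard plotting libraries; the sketch below uses randomly generated placeholder ratings in place of the study’s data and draws the quadrant lines at the mean importance and mean ease values.

```python
# Go-zone graph sketch: scatter of per-feature mean importance vs. mean ease,
# with quadrant lines at the overall means. Ratings here are random placeholders.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
imp = rng.uniform(3.0, 6.0, 99)    # hypothetical mean importance per feature
ease = rng.uniform(2.5, 5.5, 99)   # hypothetical mean ease per feature

fig, ax = plt.subplots()
ax.scatter(imp, ease, s=15)
ax.axvline(imp.mean(), linestyle="--")    # vertical line at mean importance
ax.axhline(ease.mean(), linestyle="--")   # horizontal line at mean ease
ax.set_xlabel("mean importance rating")
ax.set_ylabel("mean ease of implementation rating")
ax.set_title("Go-zone: top-right quadrant = important and easier to implement")
plt.show()
```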

5 Discussion and Conclusion

The clustering results (see Fig. 1) show that a wide range of topics were considered essential to a LA policy in higher education. In particular, the cluster on objectives of LA forms the base of our cluster landscape; formulating an aim for the use of LA can thus be seen as an entry point. This is in line with Ferguson et al. [13], who propose identifying the overarching policy objectives as the first step of the ROMA model when it is used in the LA context. As can be seen from the ratings (see Fig. 4), features in this cluster were not deemed overly important by the LA experts and not easy (i.e., rather difficult) to implement. This finding suggests that defining the objectives of LA in a HEI’s LA policy is not a straightforward process. It is unclear whether this is due to a data-driven (rather than question-driven) approach to LA, an issue observed in the literature [17], or due to insufficient empirical evidence that LA has reached its ultimate goal of enhancing learning and teaching [12, 42, 45]. However, as the goals set for LA inevitably affect approaches to LA [13], and hence all the issues represented through these clustered themes, a LA policy in HEIs must explicitly state the objectives of LA, despite their low ease of implementation.

Above this quite coherent base layer, a group of clusters forms the intermediate body. At the centre of the map, and thus connecting all the other clusters with one another, is the cluster on risks & challenges. It is flanked by two more technical clusters (data management and research & data analysis) in the West and one stakeholder-related cluster (roles & responsibilities) in the East. This latter cluster was seen by the experts as fairly important and also quite easy to implement (Fig. 4). As exemplified by Jisc’s code of practice [39, 40], LA requires collective effort from a wide range of stakeholders, and it is therefore crucial to clarify the roles and responsibilities of stakeholders ranging from managers to students, a need the LA field has clearly identified [18, 41]. A policy can be seen as something rather prescriptive that is imposed by an institution’s management, but LA adoption needs both top-down and bottom-up approaches, i.e. all stakeholders need to be involved. It has, however, also been observed that current LA policies pay relatively little attention to the skill development of key users and to two-way communication channels [18, 42]. We thus suggest that policy makers address these areas when considering the roles and responsibilities of stakeholders.

At the very top of the map, i.e. in the North, sits the cluster on privacy & transparency. While the bottom cluster about objectives can be seen as a base, this cluster can be seen as the pinnacle, or the lid that rounds out a LA policy; without it, a policy would not be complete. According to the GCM participants, aspects of transparency and privacy are by far the most important ones, but also the easiest to implement in a LA policy. This overall positive rating on ease of implementation caught our attention, as privacy and ethics have so far been considered difficult issues in the literature. Looking more closely at the ratings of this cluster reveals a discrepancy between more theoretical and more practical privacy-related statements. For instance, the statement rated most highly for importance, ‘2. transparency, i.e. clearly informing students of how their data is collected, used and protected’, as well as the statement rated most highly for ease, ‘88. a clear description of data protection measures taken’, can both be considered theoretical statements that can easily be safeguarded by university policy. A more practical privacy-related item such as ‘96. an agreement between learners, teachers and policy makers on regulating a proper use of data’, on the other hand, was rated as less easy to implement in a LA policy, as it points to the difficulty of establishing privacy protection in daily practice.

This finding thus warrants future research, considering that the challenges identified in the literature in relation to transparency and privacy are never straightforward [31, 36]. That is to say, while data policies tend to highlight transparency and privacy procedures, their implementation in the real world tends to meet complex challenges [42] that derive from conflicts of interest among different actors in a social network, the increasing focus on the ‘ownership of data’, student control over data, and issues with informed consent [32, 33]. Therefore, it is important that the development of LA policy involves input from all relevant stakeholders, and that communication channels are clearly indicated in the policy to invite feedback on the implementation of the written policy in the real world, so as to ensure its relevance to institutional practices.

The clustered themes identified in this study coincide with the argument made by Siemens et al. [41] that the main challenges in the deployment of LA are not technical but social. The decline of the average ratings for ease of implementation compared to those for importance also indicates that each of the identified themes is a potential challenge to address in practice. This study has highlighted important aspects to address in LA policy. However, it is not our intention to suggest that policy makers should prioritise one aspect over another on the basis of the experts’ ratings of importance and ease of implementation. Instead, the study reflects the current emphasis on privacy and legal compliance in the deployment of LA, and the views presented in this study are based on one particular stakeholder group only, i.e., LA experts. All the aspects should receive equal attention, as suggested in the literature, though one aspect might be easier to define than another. Involving all the relevant stakeholders in a co-creation process [9] for LA policy could help clarify the ‘foggy areas’ of these identified aspects and ensure their relevance to the experiences of different stakeholders in the institution.