3.1 Introduction and Scope

3.1.1 Scope

The goals of this chapter are to:

  • introduce the basics of methods and tools for analyzing and interpreting online learners’ data to facilitate their personalized support,

  • focus on organizing, analyzing, presenting and interpreting learner-generated data within their learning context, and

  • elaborate on ethical concerns and policies for protecting learner-generated data from mistreatment and misuse.

3.1.2 Chapter Learning Objectives

This chapter’s learning objectives, mapped to the Learn2Analyse Educational Data Literacy Competence Profile, are:

  • Know what the common measurements of learner data and their contexts are, and understand the processes needed to collect both learner and context data in online and/or blended learning settings (competence 1.1)

  • Be able to identify and describe the limitations and quality measures on collecting learners’ data in online and/or blended learning settings (competence 1.2)

  • Know methods for learners’ data analysis and modelling as part of learning analytics methods (competence 3.1)

  • Know and understand learner-generated data presentation methods (competence 3.2)

  • Know and understand learners’ data properties in learning analytics (competence 4.1)

  • Be able to identify and discriminate statistics commonly used for the interpretation of educational data in learning analytics (competence 4.2)

  • Be able to elaborate on the insights from learners’ data analysis (competence 4.3)

  • Know and understand the methods that can be used to protect individuals’ data privacy, confidentiality, integrity and security in learning analytics (competence 6.2)

3.1.3 Introduction

At the heart of this chapter is learning analytics, which has been a hot topic for a while in educational communities, organizations and institutions. There are four essential elements involved in all learning analytics processes: data, analysis, report and action (Fig. 3.1).

Fig. 3.1
A diagram indicates the learning analytics basic elements namely, data, analysis, report, and action.

The basic elements of learning analytics

  1.

    Data, as the primary analytics asset, are the raw material that gets transformed into analytical insights; in the educational domain, they include information that is (usually) gathered as the learning processes are taking place, and is about the learners, the learning environment, the learning interactions, and the learning outcomes. A complete view of educational data has been provided in Chap. 1.

  2.

    Analysis is the process of transforming the collected data to obtain actionable information from them, using, for this purpose, a set of mathematical and statistical algorithms and techniques; during data analysis, the data are cleansed, transformed and modelled with the goal of discovering meaningful information and supporting decision-making and action.

  3.

    Report is used to summarize what the analysis of the collected data can tell about learning and to present this information in a meaningful manner; it is a set of processes for organizing and presenting the results of the analysis of learners’ and learning data into charts and tables. Reporting learners’ and learning data will provide insights about the learners’ states during learning; interpreting those insights can guide data-driven decision making to action taken.

  4.

    Action is the ultimate goal of any learning analytics process; it is the set of the informed decisions and the practical interventions that the educational stakeholders will undertake. The results of follow-up actions will determine the success or failure of the analytical efforts. Learning analytics is useful only if there is “action” as a result of its implementation.
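The four elements above can be illustrated with a minimal, self-contained sketch of a data → analysis → report → action loop. All learner names, scores and the at-risk threshold below are invented for illustration; real learning analytics pipelines operate on far richer data.

```python
from statistics import mean

# Data: hypothetical raw quiz scores per learner (the "data" element).
raw_scores = {"alice": [72, 65, 58], "bob": [90, 88, 93], "carol": [40, 35, 30]}

def analyse(scores):
    """Analysis: transform raw scores into a per-learner average."""
    return {learner: mean(s) for learner, s in scores.items()}

def report(averages, threshold=50):
    """Report: summarize the analysis and flag learners below a threshold."""
    return {learner: {"average": avg, "at_risk": avg < threshold}
            for learner, avg in averages.items()}

def act(summary):
    """Action: an informed intervention, here a simple recommendation."""
    return [f"Contact {learner} and offer extra support"
            for learner, row in summary.items() if row["at_risk"]]

summary = report(analyse(raw_scores))
interventions = act(summary)
print(interventions)  # only carol's average (35) falls below the threshold
```

The point of the sketch is the shape of the loop, not the specific metric: in practice the "action" step would feed back into new data, restarting the cycle.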

The increased need to inform decisions and take actions based on data points to the significance of understanding and adopting learning analytics in everyday educational practice. Moreover, in order to treat educational data in a respectful and protected manner, policies for learning analytics play a major role and need to be explicitly clarified.

3.2 Using Learner-Generated Data and Learning Context for Extracting Learning Analytics

3.2.1 Definition and Objectives of Learning Analytics

Learning analytics is defined by SOLAR as “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs” (SOLAR, 2011). In other words, it is an ecosystem of methods and techniques (in general procedures) that successively gather, process, report and act on machine-readable data on an ongoing basis in order to improve the learning environments and experience.

As described in the Learning Analytics video (in the useful video resources), like any other context-aware process, learning analytics procedures track and record data about learners and their contexts, organize and monitor them, and interpret and map the real current state of those data, to use them for providing “actionable intelligence”, i.e., insights to act upon.

Based on this shared common understanding of learning analytics, it is important to clarify and discuss what learning analytics can do, what they can be used for, and why one needs them; in other words, what the objectives of learning analytics are. Some simple examples from everyday experience can showcase those objectives.

  • In traditional classroom settings, it’s often hard to identify each student’s individual strengths and weaknesses, learning disabilities and prior subject knowledge, and subsequently tailor and personalize instruction accordingly. It’s also hard to recommend personalized learning resources to the individuals.

  • In online learning settings, it’s common that students drop out early. It’s also hard to detect students’ emotions or to enhance students’ social learning skills.

  • In blended learning settings, the students might not know how to self-regulate their learning, and they often procrastinate. It’s also hard to monitor each student’s progress and provide feedback accordingly.

With learning analytics, these deficiencies can be identified promptly. More specifically, learning analytics aim to address the following objectives (Chatti et al., 2012; Papamitsiou & Economides, 2014), illustrated in Fig. 3.2:

  • Monitor learners’ progress

  • Model learners/learners’ behaviour

  • Detect learners’ affect/emotions

  • Predict learning performance/dropout/retention

  • Generate feedback

  • Provide recommendations

  • Guide adaptation

  • Increase self-reflection/self-awareness

  • Facilitate self-regulation

Overall, learning analytics are important because every “trace” within an electronic learning environment may be valuable information that can be tracked, analyzed and combined with external learner data; every simple or more complex action within such environments can be isolated, identified and classified through computational methods into meaningful patterns; every type of interaction can be coded into behavioural schemes and decoded into interpretable guidance for decision making.

Fig. 3.2
A post-customization diagram measures behavior, recognizes emotions, predicts performance, and learns from human input.

The objectives of learning analytics

3.2.2 Measurements as Indicators of Learners’ Current Learning States

Learning analytics seeks to produce “actionable intelligence”; the key is the action that is taken. Campbell and Oblinger (2007) have pointed out five steps in learning analytics: Capture, Report, Predict, Act, Refine. From (a) capturing and gathering the raw data, to (b) introducing metrics for sharing a common understanding of the data in educationally meaningful ways, to (c) analyzing the metrics for predicting the future states of the learners, to (d) gaining insights into the learning processes, and to (e) acting upon the data-based evidence for delivering personalized learning to each individual, the cyclical process of learning analytics is fed with the continuously generated learner data, illustrated in Fig. 3.3.

Fig. 3.3
A cycle diagram indicates the five steps in learning analytics: a. raw data are captured, b. learner metrics are extracted, c. data are analyzed to predict the future states of the learners, d. insights are gained into the learning processes, and e. decisions are taken and implemented.

The cycle of learning analytics

Learning analytics are about learners and their learning. As such, Clow (2012) proposed a cycle for learning analytics that starts with learners. The next step is the generation and capture of data about or by the learners. The third step is the processing of this data into metrics or analytics, which provide some insight into the learning process. The cycle is not complete until these metrics are used to drive one or more interventions (actions) that have some effect on learners.

This learning analytics cycle can provide a data perspective on strong learning theories. For instance, the cycle can be viewed as a data-driven aspect of Kolb’s Experiential Learning Cycle (1984): taking the system as a whole, there is a direct correspondence: actions by or about learners (concrete experience) generate data (observation) from which metrics are derived (abstract conceptualization), which are used to guide an intervention (active experimentation). The role of the learner is fundamental in this process. And, since learning analytics are extracted from the learners’ and learning data, two questions need to be clarified: (a) what learner data will be used in learning analytics, and (b) what types of learning analytics can be formed from those data.

As already explained, learning analytics is a cyclical process. Learners generate data that can be processed into metrics and analyzed for patterns such as success, weakness, overall personal or comparable performance, and learning habits. Educators can administer “interventions” based on the data analyzed, and the process then repeats itself.

Before beginning to analyze data, one should understand what data are collected, and why it is needed to collect them: data collection should have specific objectives and outcomes. The collected data on their own cannot give meaningful insights, unless they are associated with specific measurements, depending on what one wants to measure: learning outcomes, goal attainment, performance, behavioural changes, engagement, motivation, cognition, abilities, emotions, etc. Metrics are what one measures, the measurements.

There are many types of data that support student learning – and they are so much more than test scores. The type of information the educational data often include and the sources the data can be collected from are usually linked in a straightforward way. For instance, student characteristic data and/or contextual information are usually collected from enrolment records, student profiles, or attendance rolls; student perception data can be found in surveys and interviews; student activity data are available in logs from the LMS and interaction records; student achievement data lie within various kinds of assessment data such as rubrics, scores or observation notes; student wellbeing data capture students’ social and emotional development, or school climate, and can be found in sources such as biosignals or social networks. Educational data and the respective data sources are explained in Chap. 1.

But individual data points don’t give the full picture needed to support the incredibly important education goals of parents, students, educators, and policymakers. The What Is Student Data? video (see useful video resources) explains in simple terms what student data are about and when they can be used effectively. As explained in this video and in Chap. 1, there are learner and context data that can be captured within the learning environment (e.g., log-files, quiz scores, login data, content access, file downloads, discussion participation, etc.), and there are also other types of data that are external to the learning environment (e.g., survey-demographic data, biosensor data, online discussion forums, social network data, etc.). In addition, it is important to aggregate/integrate different data sources in order to increase validity and relevance and to reduce biases (i.e., improve reliability). Once one understands what data need to be collected, one will be able to locate and select the most appropriate data sources to extract them from. Those data will feed the learning analytics cycle.
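Aggregating internal and external data sources, as described above, can be sketched in a few lines. The following hedged Python fragment merges hypothetical LMS activity logs with survey responses into one profile per learner; all field names and values are invented for illustration.

```python
# Hypothetical records from two sources: LMS logs (internal to the
# learning environment) and a survey (external to it).
lms_logs = [
    {"student_id": "s1", "logins": 14, "forum_posts": 6},
    {"student_id": "s2", "logins": 3, "forum_posts": 0},
]
survey = [
    {"student_id": "s1", "self_reported_motivation": "high"},
    {"student_id": "s2", "self_reported_motivation": "low"},
]

def merge_on_id(*sources):
    """Aggregate records from several sources into one profile per student."""
    profiles = {}
    for source in sources:
        for record in source:
            profiles.setdefault(record["student_id"], {}).update(record)
    return profiles

profiles = merge_on_id(lms_logs, survey)
print(profiles["s2"])
# {'student_id': 's2', 'logins': 3, 'forum_posts': 0, 'self_reported_motivation': 'low'}
```

Combining the two sources lets the low activity counts be read alongside the self-reported motivation, giving a fuller picture than either source alone.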

It has been explained in previous sections what student data are about and how they can be combined to show the whole picture of student learning, which is deeply related to the context itself. Learning analytics is a context-aware process. Both learner and context data are necessary in this process. Different types of data can come together – under different objectives – to form a full picture of student learning. When used effectively, data empower everyone. The first step is to understand why one is collecting data and to associate the data with metrics according to the learning concept one aims to measure and shed light on. Each of these measurements, referred to as learning analytics metrics, can be associated with one or more learning analytics objectives (see Sect. 3.2.1), summarized in Fig. 3.4.

Fig. 3.4
A diagram indicates that data is collected, then associated with metrics and learning analytics metrics are further associated with objectives.

Associating data to learning analytics metrics and objectives

To understand that, let’s take the following simple and generic example. How many views make an educational YouTube video a success? How about 300 K? That’s how many views a video you posted got. It featured some well-known and successful professionals, who prompted young people to enrol in a Data Science course. It was twice as popular as any video you had posted to date. Success! Then came the data report: only eight viewers had signed up to take the course, and zero actually completed it. Zero completions. From 300 K views. Suddenly, it was clear that views did not equal success. In terms of completion rates, the video was a complete failure. What happened?

Well, not all important things in life can be measured, and not everything that can be measured is important. If one measures something, but not necessarily all the right things, the end result can still be wrong, or one may be relying on the wrong data to make the case. The critical question is which measurements are the “right” ones. There is a difference between numbers and numbers that matter. This is what separates data from metrics. One cannot always control the educational data one collects, but one can control what one measures. When we talk about learning analytics metrics and measurements, we are typically referring to gathering data on three areas: efficiency, effectiveness, and outcome (Robbins, 2017), illustrated in Fig. 3.5.

  • Efficiency is generally thought of as learning-centric activity metrics: number of learners, time on task, frequency of resource downloads, quiz scores, attempts, hint usage, etc.

  • Effectiveness metrics are evaluation-focused and include aspects like learner engagement, quality of deliverables, knowledge acquisition, collaboration, progress, performance, etc.

  • Outcome looks at bottom-line results. To the extent that efficiency and effectiveness metrics matter, they provide validation and explanation for the outcome.

Learning efficiency refers to more granular metrics, closer to the raw data; their objective is to describe learners’ actions at the task or activity level (micro-level), and on their own they cannot reveal much about learning as a more general objective. Combining these metrics can contribute to understanding more complex learning constructs, such as engagement and collaboration. The metrics that refer to this meso-level (activity or course) of more abstract and complex concepts are synopsized under the learning effectiveness metrics, and their objective is to quantify less fine-grained constructs. Finally, the learning outcome can be described with metrics from the previous categories, combined to give insight into and explain the results of the learning processes (macro-level).
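The relation between micro-level efficiency metrics and a meso-level effectiveness construct can be sketched as follows. The engagement formula, its weights and the task data are illustrative assumptions made for this example, not a standard measure from the chapter.

```python
from statistics import mean

# Hypothetical micro-level (efficiency) metrics for one learner, per task.
tasks = [
    {"time_on_task_min": 12, "attempts": 1, "hints_used": 0, "quiz_score": 0.9},
    {"time_on_task_min": 4,  "attempts": 3, "hints_used": 2, "quiz_score": 0.4},
]

def engagement_index(tasks, max_time=15):
    """A meso-level (effectiveness) construct built by combining micro metrics.

    The equal weighting of time and score is an illustrative choice.
    """
    # Cap time on task so one long session cannot dominate the index.
    time_component = mean(min(t["time_on_task_min"], max_time) / max_time
                          for t in tasks)
    score_component = mean(t["quiz_score"] for t in tasks)
    return round(0.5 * time_component + 0.5 * score_component, 3)

print(engagement_index(tasks))
```

No single micro metric here "means" engagement; it is the combination, at the activity level, that approximates the more abstract construct.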

Fig. 3.5
A diagram indicates the objective and granular measurements of learning analytics where data is gathered in three areas: efficiency, effectiveness, and outcome.

Categories of learning analytics metrics

Depending on the goals (i.e., the learning analytics objective), the learning analytics metrics will be obtained from the same or different learner and context data. The types/levels of the metrics will be decided according to their sophistication, the complexity of the analysis method employed, and the value they add for human decision-making (Lang et al., 2017; Scapin, 2015; Soltanpoor & Sellis, 2016):

  • Descriptive analytics: use data aggregation and data mining to provide insight into the past and answer: “What has happened?” (e.g., reports and descriptions).

  • Diagnostic analytics: dissect the data with methods like data discovery, data mining and correlations to answer the question “Why did it happen?” (e.g., interactive visualizations).

  • Predictive analytics: utilize a variety of data to make the prediction and apply sophisticated analysis techniques (such as machine learning) to answer the question “What is likely to happen?” (e.g., trends and predictions).

  • Prescriptive analytics: utilize an understanding of what has happened, why it has happened and a variety of “what-might-happen” analysis to help the user determine the best action to take and answer the question “What do I need to do?” (e.g., alerts, notifications, recommendations).
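The difference between the descriptive and predictive types can be made concrete with a small sketch: the descriptive part summarizes past quiz scores ("What has happened?"), while the predictive part fits a simple least-squares trend line, a modest stand-in for the more sophisticated machine-learning techniques mentioned above. The weekly scores are invented for illustration.

```python
from statistics import mean

# Hypothetical weekly quiz scores for one learner.
weekly_scores = [55, 58, 62, 61, 67]

# Descriptive analytics: "What has happened?" -- summarize the past.
descriptive = {"mean": mean(weekly_scores), "last": weekly_scores[-1]}

# Predictive analytics: "What is likely to happen?" -- extrapolate a
# least-squares trend one step ahead.
def predict_next(scores):
    n = len(scores)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(scores)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope * n + intercept  # value at the next time step

print(descriptive)
print(round(predict_next(weekly_scores), 1))
```

Diagnostic and prescriptive analytics would then ask why the trend looks the way it does and what intervention to recommend, which requires richer data and models than this sketch.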

Figure 3.6 illustrates the types of learning analytics based on their complexity and value for decision-making.

Fig. 3.6
A diagram indicates that learning analytics are evaluated by the complexity of past, present, and future knowledge and their value for decision-making.

Types of learning analytics based on their complexity and value for decision-making

Questions and Teaching Materials

  1. What are the three core questions to ask before using learning analytics?

     (a) What information to use? What has happened? What is likely to happen?

     (b) What information to use? How is it gathered? What is likely to happen?

     (c) What information to use? How is it gathered? How is it combined?

     (d) What information to use? How is it gathered? What has happened?

Correct answer: c.

  2. What can Learning Analytics do, and what can they be used for?

     (a) Monitor progress, predict performance, create content, facilitate self-regulation

     (b) Predict dropout, increase self-awareness, guide adaptation, detect emotions

     (c) Generate feedback, model learners, support game-based learning, predict retention

     (d) Evaluate learning, provide recommendations, assess collaboration, increase effort

Correct answer: b.

  3. How can Learning Analytics (LA) provide a data-driven perspective to strong learning theories?

     (a) LA helps teachers develop more appropriate interventions and learning opportunities for target learners (e.g., experiential learning)

     (b) LA helps learners become aware of their progress on different tasks by combining the learning data that are generated during the process (e.g., self-regulated learning)

     (c) LA make use of data generated by learners’ online activity to identify behaviours and patterns within the learning environment that signify effective process (e.g., social learning)

     (d) All the above

Correct answer: d.

  4. Which of the following are learning data?

     (a) Survey-demographic data, biosensor data

     (b) Gender, socioeconomic status, special education needs

     (c) Test scores, educational file downloads, educational content access

     (d) Enrolment records, emotional development, social network data

Correct answer: c.

  5. What is the difference between data and metrics?

     (a) Data are measurements (numbers/calculations) to help make decisions about how to move forward, whilst metrics are indicators of progress and achievement

     (b) Data is the set of raw numbers or calculations gathered, whilst metrics are proxies for what ultimately matters (i.e., what we measure)

     (c) Data is a mapping of observations into numbers, whilst metrics are numerical approximations of objectives

     (d) Data is the raw measurements, whilst metrics are trends in the data

Correct answer: b.

  6. Consider the following metrics: time on task, frequency of resource downloads, quiz scores, attempts, hint usage. What category of metrics are they?

     (a) Learning efficiency (micro-level)

     (b) Learning effectiveness (meso-level)

     (c) Learning outcome (macro-level)

     (d) Those are not metrics – they are raw data

Correct answer: a.

  7. What type of learning analytics would you use to help you determine that all of the student’s actions—low interaction time, low forum participation, and low scores—point to low engagement in the activity?

     (a) Descriptive analytics

     (b) Diagnostic analytics

     (c) Predictive analytics

     (d) Prescriptive analytics

Correct answer: b.

  8. ACTIVITY/PRACTICE QUESTION (Reflect on)

     We encourage you to elaborate on your response on data collection in the following reflective task.

     You may reflect on:

     1. Can you associate the educational data with the learning analytics objectives? Please provide specific examples of how educational data can be used to address specific learning analytics objectives.

     2. Can you explain how the same or different educational data can be used as different types of learning analytics? Please provide specific examples of data used as learning analytics metrics for descriptive, diagnostic, predictive and prescriptive analytics.

3.2.3 Limitations and Data Quality Issues of Learners’ Data Measurements in Open and Blended Courses

As already explained in Chap. 1, data often suffer from inaccuracies, biases or even manipulation; educational data, apart from being relevant for decision making (fit-for-purpose), should also be reliable and valid. According to Wikipedia (Data Quality, 2022), data is generally considered high quality if it is “fit for [its] intended uses in operations, decision making and planning” and is “deemed of high quality if it correctly represents the real-world construct to which it refers”.

As in all kinds of organizations, data quality is critical for educational institutions as well. In online and blended learning settings, many factors add to the existing difficulty of handling educational data quality – for example, heterogeneous educational data sources, high volumes of learner and learning data, and a myriad of unstructured data types. The Data Quality Matters – Tech Vision 2018 Trend video (see useful video resources) explains the critical issues of data quality from a more general perspective. As discussed in this video, there are many aspects to data quality, including completeness, consistency, accuracy, timeliness, validity, and uniqueness, synopsized as follows (Mihăiloaie, 2015; Pipino et al., 2002) and illustrated in Fig. 3.7:

  • Completeness: there are no gaps in the data from what was expected to be collected and what was actually collected, i.e., there are no missing data – the collected dataset is complete.

  • Consistency: the data types must align and be compatible with the expected versions of the data being collected, i.e., there are no contradictions in the data types and the data are usable.

  • Accuracy: collected data are correct, relevant and accurately represent what they should.

  • Timeliness: the data should be received at the expected time for the information to be utilized efficiently.

  • Validity: a measurement is well-founded and likely corresponds accurately to the real world.

  • Uniqueness: there should be no data duplicates reported.

Among the six dimensions, completeness and validity are usually easy to assess, followed by timeliness and uniqueness; accuracy and consistency are the most difficult to assess. The critical question is how those data limitations relate to learning analytics and why quality matters. Here, we will focus on how these principles/limitations apply in learning analytics.

Fig. 3.7
A diagram indicates data quality's dimensions namely: completeness, consistency, accuracy, timeliness, validity, and uniqueness.

The dimensions of data quality

Specifically, in the learning analytics cycle, learner and contextual data are collected and transformed into metrics (analytics), according to the learning objective that needs to be addressed; the different types of metrics shall next guide human decision-making and interventions. Yet, the higher the need for data-driven decision-making is, the more the integrity and quality of data become critical (National Forum for the Enhancement of Teaching and Learning in Higher Education, 2017). The following example demonstrates in simple terms the impact of data limitations and quality for learning analytics.

Let’s examine the case of an educator who wants to understand learners’ engagement with an activity. To measure engagement at the activity level, it is common practice to use learners’ participation data (e.g., frequency of logins, session duration, posts on the activity forum, etc.). If the learners’ IDs are missing from the data available via the LMS (the data are incomplete), the educator will not be able to identify each learner’s participation. Similarly, if each learner’s data were stored in different formats (e.g., dates as MM/DD/YY vs. DD/MM/YY), this would result in confusion about the data and their interpretation (the data are inconsistent). In the same example, this inconsistency in the data format would also result in inaccurate data – when did the learner really log in to the activity? – i.e., it would be unclear what the correct values of the stored data are. Furthermore, if the learners’ data did not become available in time during the activity, the educator would not gain insight into what the learners are doing (violation of timeliness), making it impossible to intervene in a timely manner. Similarly, if the same learners’ data were stored multiple times (e.g., each time a learner logs in to the activity, the login is duplicated) and all of this information were considered in the analysis, the results would be misleading (violation of uniqueness).
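The quality violations in this example can be sketched in code. The following illustrative Python fragment takes raw login records exhibiting a missing learner ID (completeness), mixed date formats (consistency) and a duplicated entry (uniqueness), and normalizes them; the field layout, format labels and cleaning policy are assumptions made for the sake of the example.

```python
from datetime import datetime

# Hypothetical LMS login export: (learner_id, date_string, source_format).
raw_logins = [
    ("s1", "03/01/2023", "DD/MM/YYYY"),
    ("s1", "03/01/2023", "DD/MM/YYYY"),   # stored twice: uniqueness violation
    ("s2", "01/03/2023", "MM/DD/YYYY"),   # different format: inconsistency
    (None, "02/03/2023", "MM/DD/YYYY"),   # learner ID missing: incomplete
]

FORMATS = {"DD/MM/YYYY": "%d/%m/%Y", "MM/DD/YYYY": "%m/%d/%Y"}

def clean(records):
    """Normalize dates to ISO form, drop unattributable rows, deduplicate."""
    seen, cleaned = set(), []
    for learner_id, date_str, fmt in records:
        if learner_id is None:            # incomplete: cannot attribute, drop
            continue
        iso = datetime.strptime(date_str, FORMATS[fmt]).date().isoformat()
        if (learner_id, iso) in seen:     # duplicate: keep a single copy
            continue
        seen.add((learner_id, iso))
        cleaned.append((learner_id, iso))
    return cleaned

print(clean(raw_logins))
# Both surviving logins normalize to the same ISO date, 2023-01-03
```

Note that once both source formats are mapped to ISO dates, the "03/01 vs. 01/03" ambiguity that confused the educator disappears, but only because the source format of each record was known; without that metadata the inaccuracy could not be repaired.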

It is important to clarify that raw data quality strongly affects the analytics quality; learning analytics metrics are transformations of the raw learner and learning data collected, according to the objectives set. These metrics will next be treated as “data” themselves and will be subjected to further processing. Just as with any kind of data, quality also matters for learning analytics metrics: what the specific metrics can reveal is strongly dependent on their quality. In most cases, limited quality directly results in a lack of trust in the metrics and, consequently, in poor decisions and the gradual abandonment of the data-driven educational decision-support system. Poor quality data is troublesome (The data quality benchmark report, 2015). Educators cannot and will not trust insights acquired by processing corrupted, duplicate, inconsistent, missing, broken, or incomplete data. Learning analytics metrics quality is expected to increase the value of the learner and learning data and the opportunities to use them properly.

Several approaches have been developed to address exactly these concerns about quality in learning analytics metrics. In particular, the LACE project developed a proposal for a framework of quality indicators for learning analytics that contributes towards a standardized and holistic approach for the evaluation of learning analytics tools (Scheffel et al., 2015). It can potentially act as a means of providing evidence on the impact of learning analytics on educational practices. The suggested framework is generic and considers multiple learning analytics aspects, ranging from their objectives to organizational issues. For the measures and data aspects, respectively, the framework highlights comparability, effectiveness, efficiency, and helpfulness, as well as transparency, data standards, data ownership, and privacy (Fig. 3.8).

Fig. 3.8
A quality indicators diagram for learning analytics has objectives, learning support, learning measures and output, data and organizational aspects.

Quality indicators for learning analytics. (Adapted from Scheffel et al., 2015)

From a more “data-oriented” approach to “quality” aspects for learning analytics metrics, the above indicators can be combined and merged with those identified before (illustrated in Fig. 3.7), as follows:

  • Learning analytics metrics quality indicators: Standards (comparability, consistency), Completeness, Accuracy (effectiveness, efficiency), Validity, Timeliness, Uniqueness.

  • Learning analytics metrics ethics considerations: Privacy, Ownership, Transparency, Consent.

The “quality indicators” refer to how appropriate the learning analytics metrics are, how fit-for-purpose they are as data that will be used in the decision-making process in turn; the “condition” of the data themselves – the degree to which a set of characteristics of data fulfils requirements.

The “ethics considerations” refer to systemising, defending, and recommending concepts of right and wrong conduct in relation to data; they are considerations that tackle the potential for data misuse, and issues about the right, legitimate, and proper ways to use data. Ethics considerations are placed on top of quality indicators, since the latter are relevant to the data, whilst the former are relevant to the usage of the data (Fig. 3.9).

Fig. 3.9
A diagram indicates that ethical considerations override quality indicators because the former is related to data usage.

Ethical considerations of learning analytics on top of quality indicators

Like any other kind of data, learning analytics metrics should be protected from misuse, mistreatment, or violations. The quality of the learning analytics metrics themselves matters because it affects the quality of the resulting data-driven decisions. Above all, it is important to control who has access to those metrics, what can and cannot be done with them, and for how long access is granted after the collection and analysis of the raw learning and context data occur. Therefore, along with the learning analytics metrics quality indicators, the ethical limitations should be considered as well.

Questions and Teaching Materials

  1. Data completeness refers to:

     (a) Data that are well-founded and likely correspond accurately to the real world

     (b) Correct and relevant data that accurately represent what they should

     (c) There being no gaps between what was expected to be collected and what was actually collected

     (d) Data types that align and are compatible with the expected versions of the data being collected

Correct answer: c

  2. Assume an LMS’s database is a huge file, which has an important index located 20% of the way through and saves content data at the 75% mark. Consider a scenario where an e-tutor creates new content (e.g., adds new exercises) at the same time a backup is being performed. The backup is made as a simple “file copy” that copies from the beginning to the end of the large file(s), and at the time of the content edit it is 50% complete. The new content is added to the content space (at the 75% mark) and a corresponding index entry is added (at the 20% mark). What data quality problem arises in this scenario?

     (a) Data consistency

     (b) Data completeness

     (c) Data accuracy

     (d) Data timeliness

Correct answer: a.

  1. 3.

    The goal of the quality indicators framework is:

    1. (a)

      To evaluate how appropriate the learning analytics metrics are, i.e., how fit-for-purpose they are as data that will, in turn, be used in the decision-making process

    2. (b)

      To contribute towards a standardized and holistic approach for the evaluation of learning analytics tools.

    3. (c)

      To systemise, defend, and recommend concepts of right and wrong to tackle the potential for data misuse, and issues about the right, legitimate, and proper ways to use data

    4. (d)

      To control who has access to the metrics, what can and cannot be done with the metrics, and for how long access is granted after the collection and analysis of the raw learning and context data occurs.

Correct answer: b.

  1. 4.

    ACTIVITY/PRACTICE QUESTION (Reflect on)

    We encourage you to elaborate on your response about learning analytics metrics limitations and quality, in the following reflective task:

    1. 1.

      Can you provide examples of how the limitations of analytics quality apply when addressing learning objectives using specific learning analytics metrics? You can use the example in this section as guidance.

    2. 2.

      Do you understand the difference between limitations as quality measures for learning analytics and the ethical limitations for learning analytics? Please provide specific examples of each category.

3.2.4 Ethical Treatment of Learner-Generated Data and Measurements

Learning analytics provides tremendous opportunities to assist learners – but it also raises ethical implications that should not be ignored. The practical challenge of learning analytics metrics is the question of learner privacy and of how to protect the learner from potential harm due to data misuse. Questions abound:

  • Who has access to the learner’s data? Who owns individuals’ data?

  • To what degree do you need to inform users that their data are being collected?

  • Do you need learners’ permission to use their data?

  • Where should the data be stored? How secure does it need to be?

  • Is identification of individuals possible from metadata?

  • What about misinterpretation of data, or other data errors?

To address these issues, the Learning Analytics: The need for a code of ethics video (see useful video resources) elaborates on the need to establish a code of ethics for learning analytics. Such a code of practice aims to set out the responsibilities of educational institutions to ensure that learning analytics is carried out responsibly, appropriately and effectively, addressing the key legal, ethical and logistical issues that are likely to arise.

Slade and Prinsloo (2013) identified three broad classes of ethical issues: (a) the location and interpretation of data; (b) informed consent, privacy, and the de-identification of data; and (c) the management, classification, and storage of data. As we have explicitly explained, in the learning analytics cycle, data are collected about individuals and their learning activities, and metrics are constructed; the data will be analysed and interventions (might) take place. This entails opportunities for positive impacts on learning, as well as risks for misunderstandings, misuse of data and adverse impacts on students.

When learners perform learning tasks within a learning environment to increase their knowledge and develop skills and competences, they expect to receive support to overcome gaps in knowledge/competences. They also expect to be in a “safe” environment where their mistakes will be treated with respect, without serious consequences or unfair and unjustified discrimination against them, as individuals. Two critical issues are hidden in the implied “safety” of the learning environments: (a) the learners should feel “secure” and maintain the “privacy” of their data (integrity of the self), and (b) the learners’ data should be treated in an “ethical” manner. Drachsler and Greller (2016) provided a clear differentiation between ethics and privacy: “Ethics is the philosophy of morality that involves systematizing, defending, and recommending concepts of right and wrong conduct […] privacy is a living concept made out of continuous personal negotiations with the surrounding ethical environment”. The main ethics considerations are illustrated in Fig. 3.10 and are outlined as follows:

  • Privacy: the regulation of how personal digital information is being observed by the self or distributed to other observers – protection from unauthorized intrusion. Anonymize and de-identify individuals.

  • Ownership: the act of having legal rights and complete control over a single piece or set of data – information about the rightful owner of data assets and the acquisition, use and distribution policy implemented by the data owner.

  • Consent: documentation that clearly describes the processes involved in data collection and analysis. Explain how the data will be used, and why – and how it won’t be used – and get consent from each individual before any data are collected.

  • Transparency: the regulation about the purposes for which data will be collected and used, under which conditions, who will have access to data, the measures through which individuals’ identity will be protected, and how sensitive data will be handled.
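The privacy and anonymisation considerations above can be made concrete with a small sketch. The snippet below pseudonymises learner records by replacing the direct identifier with a salted hash; the record fields and salt are hypothetical, and note that salted hashing is pseudonymisation rather than full anonymisation – re-identification may still be possible when pseudonymised data are matched with other data sources.

```python
import hashlib

def pseudonymise(record, salt, id_field="student_id"):
    """Replace the direct identifier with a salted hash, so analysts can
    still link a learner's events together without seeing who the learner is."""
    record = dict(record)  # copy, so the caller's data are not mutated
    token = hashlib.sha256((salt + str(record[id_field])).encode()).hexdigest()[:12]
    record[id_field] = token
    return record

# Hypothetical learner records
events = [{"student_id": "s001", "logins": 14},
          {"student_id": "s002", "logins": 3}]
safe = [pseudonymise(e, salt="keep-this-secret") for e in events]
```

Because the same salt maps the same learner to the same token, longitudinal analysis remains possible; rotating or discarding the salt breaks the link, which is one way to honour data retention limits.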

Ethics provides us with guides on what is the right thing to do in all aspects of life, while the law generally provides more specific rules so that societies and their institutions can be maintained (Tsachuridou, 2015).

Fig. 3.10
A diagram indicates the learning analytics' ethical considerations namely privacy, ownership, consent, and transparency.

Ethical considerations in learning analytics

Over the past 5 years or so, a number of guidelines, codes of practice and policies have been developed in response to this. Slade and Prinsloo (2013) established one of the earliest frameworks with a focus on ethics in learning analytics. Others have followed, including JISC’s code of practice in 2015, the Learning Analytics Community Exchange (LACE) framework in 2016 (Drachsler & Greller, 2016) and a learning analytics policy development framework for the EU by the SHEILA project (Tsai & Gasevic, 2017). More recently, in light of the rapid development of learning analytics on a global basis, the International Council for Open and Distance Education (ICDE) has taken the initiative to produce a set of guidelines for ethically-informed practice that would be valuable to all regions of the world (March 2019).

To address the issues raised earlier in this section and demystify the ethics and privacy limitations around learning analytics, the LACE project published the DELICATE instrument to be used by any educational institution. The instrument includes policies and guidelines regarding privacy, legal protection rights or other ethical implications that address learning analytics. The DELICATE checklist helps to investigate the obstacles that could impede the rollout of learning analytics and the implementation of trusted learning analytics for higher education. The eight points are shown in Fig. 3.11 and include:

  1. 1.

    D-etermination: Decide on the purpose of learning analytics for your institution.

  2. 2.

    E-xplain: Define the scope of data collection and usage.

  3. 3.

    L-egitimate: Explain how you operate within the legal frameworks, refer to essential legislation.

  4. 4.

    I-nvolve: Talk to stakeholders and give assurances about the data distribution and use.

  5. 5.

    C-onsent: Seek consent through clear consent questions.

  6. 6.

    A-nonymise: De-identify individuals as much as possible.

  7. 7.

    T-echnical aspects: Monitor who has access to data, especially in areas with high staff turn-over.

  8. 8.

    E-xternal partners: Make sure externals provide highest data security standards.

The EU SHEILA project focused on developing a learning analytics policy development framework for the EU under the six dimensions of the Rapid Outcome Mapping Approach (ROMA) (Ferguson et al., 2014; Macfadyen et al., 2014); the framework consists of 49 action points, 69 challenges, and 63 policy questions. The ROMA dimensions, as considered by the SHEILA framework, are: (1) the political context of an institution, i.e., identifying the ‘purposes’ for adopting learning analytics in a specific context; (2) the involvement of stakeholders, i.e., recognising that the implementation of learning analytics in a social environment requires collective effort; (3) a vision of behavioural change and potential impacts; (4) strategic planning, including resources, ethics & privacy, and stakeholder engagement and buy-in; (5) institutional capacity to effect change, i.e., assessing the availability of existing resources; (6) a framework to monitor and evaluate the efficacy and continue learning.

Fig. 3.11
A delicate diagram indicates eight points which are Determination, Explain, Legitimate, Involve, Consent, Anonymise, Technical aspects and External partners.

The delicate checklist

In addition, the ICDE report on Ethics in Learning Analytics identified several core issues that are important on a global basis for the use and development of Learning Analytics in ethics-informed ways. Those issues are shown in Fig. 3.12 and include:

Fig. 3.12
A diagram indicates the ethics in learning analytics which identified several core issues including data ownership and control, communications, and responsibility.

Ethics in learning analytics based on ICDE report

  • Transparency: how learners’ data are collected, analysed and used to shape learners’ paths.

  • Data ownership and control: the presumption is often that data collected are owned by the institution. However, “data are not considered as something a student owns but rather is. Students do not own their data but are constituted by their data” (Prinsloo & Slade, 2017). Therefore, institutions do not own the student data that they hold but have temporary stewardship.

  • Accessibility of data: can relate to both the determination of who has access to raw and analysed data, and to the ability of students to access and correct their own data. Within a learning analytics context, we might expect that data are accessed on a ‘need-to-know’ basis to facilitate the provision of academic and other support services.

  • Validity and reliability of data: Datasets should be kept valid, reliable, accurate, representative of the issue being measured, current, complete and sufficient.

  • Institutional responsibility and obligation to act: knowing and understanding more about how students learn brings with it a moral obligation to act.

  • Communications: care should be taken when communicating directly with students on the basis of their analytics.

  • Cultural values: measures established as being correlated with successful or unsuccessful outcomes are likely to differ in different geographies and cultures.

  • Inclusion: Learning Analytics should primarily be used to support students, in student-centred ways that minimize the risk of legitimising exclusion.

  • Consent: In line with GDPR, consent is not required for the use of non-sensitive data for analytics, is required for use of sensitive data, and would be required to take interventions directly with students on the basis of the analytics.

  • Student agency and responsibility: it is recommended that institutions seek to engage students in applications of learning analytics, so that students can be actively involved in helping the institution design and shape the interventions that will support them.

Questions and Teaching Materials

  1. 1.

    Carrie is an instructional designer and a book author. She creates content for online courses and also prepares a printed version of her book. From the student data available on the online platform that she uses for the courses, she can identify students who are in need of additional learning support, and she decides to promote and sell her book to those students to make a profit. What are the ethical/legal issues raised here with regard to student data?

    1. (a)

      The location and secure storage of data

    2. (b)

      Misinterpretation of data, or other data errors

    3. (c)

      Informed consent, privacy, and ownership of student data

    4. (d)

      The regulation about the purposes for which data will be collected and used

Correct answer: d

  1. 2.

    What is the difference between “data ownership” and “data privacy”?

    1. (a)

      Data privacy requires us to, at least conceptually, agree that you as the data subject own your data and the data you generate. Data ownership in itself does not necessitate that privacy be respected by default.

    2. (b)

      Data privacy is the regulation of how personal data will be collected and used, under which conditions, and who will have access to data. Data ownership is the act of having legal rights and complete control over a single piece or set of data.

    3. (c)

      Data privacy is the right of a citizen to have control over how personal information is collected and used. Data ownership is the regulation of how personal data will be collected and used, under which conditions, and who will have access to data.

    4. (d)

      Data privacy is the act of having legal rights and complete control over a single piece or set of data. Data ownership defines and provides information about the rightful owner of data assets and the acquisition, use and distribution policy implemented by the data owner.

Correct answer: a.

  1. 3.

    Match the appropriate definition (from the right column), to the respective “point” in the left column

1. External partners     a. Define the scope of data collection and usage.
2. Determination         b. Seek consent through clear consent questions.
3. Anonymise             c. Make sure externals provide highest data security standards.
4. Technical aspects     d. Explain how you operate within the legal frameworks, refer to essential legislation.
5. Explain               e. Talk to stakeholders and give assurances about the data distribution and use.
6. Legitimate            f. De-identify individuals as much as possible.
7. Consent               g. Decide on the purpose of learning analytics for your institution.
8. Involve               h. Monitor who has access to data, especially in areas with high staff turn-over.

Correct answer: 1.c / 2.g / 3.f / 4.h / 5.a / 6.d / 7.b / 8.e.

  1. 4.

    Read the paper Tsai et al. (2018) and focus on Sect. 4 (Results). What are the challenges identified for stakeholders in the three case studies?

    1. (a)

      It was difficult to define ownership and responsibilities among professional groups within the university.

    2. (b)

      The provision of opt-out options conflicts with the goal to tackle institutional challenges that involve all institutional members.

    3. (c)

      Anonymised data could potentially be reidentified when matched with other pieces of data.

    4. (d)

      All the above

Correct answer: d.

  1. 5.

    Which of the following statements is correct?

    1. (a)

      The SHEILA framework is used to inform the development of policies for learning analytics, but strategies are not covered

    2. (b)

      The DELICATE checklist addresses issues of power-relationship, data ownership, anonymity, data security, privacy, data identity, transparency, and trust.

    3. (c)

      The ICDE report on Ethics in Learning Analytics identifies which core principles relating to ethics are core to all, unless there is legitimate differentiation due to separate legal or more broadly cultural environments.

    4. (d)

      All the above

Correct answer: b.

  1. 6.

    ACTIVITY/PRACTICE QUESTION (Reflect on)

    We encourage you to elaborate on your response about learning analytics ethical considerations and policies, in the following reflective task:

    1. 1.

      Read the SHEILA-research-report and choose 2 action points, 2 challenges and 2 policy questions that you find most interesting. Please, elaborate on your choices.

    2. 2.

      Study the DELICATE framework and the ICDE framework and discuss the overlap between them.

3.3 Analyzing Data and Presenting Learning Analytics

3.3.1 Methods for Analyzing the Learner-Generated Data and the Measurements Over Them

As already explained, the learning analytics cycle describes the whole process from collecting the learner and context data to taking data-driven actions and interventions. The raw learner and context data reveal little on their own, but when converted to metrics, they have the potential to reveal what we do not know about our learners.

Good metrics have three key attributes: their data are consistent, clean, and valid to use (see Sect. 3.2.3). Data cleaning and management is a demanding task (see Chap. 2). Given that good and clean data are available, the data analysis method needs to be selected next. Here we explain the methods that can be used for analysing educational data and computing learning analytics. This step is the core “game” of Data Science: the blend of various tools, algorithms, and machine learning principles whose goal is to discover hidden patterns in raw data (Sharma, 2019). The main generic categories of methods for this step are shown in Fig. 3.13 and include (but are not limited to):

Fig. 3.13
A basic data analysis methods diagram has statistical analysis, data mining, machine learning, qualitative and social network analysis, and visualization.

The basic data analysis methods in learning analytics

  • Statistical methods

  • Data mining

  • Machine learning

  • Qualitative methods

  • Social Network Analysis

  • Visualization – This method concerns the presentation of the output and will be covered extensively in the next section.

However, not all data analysis methods can yield the results one is seeking. To choose well, a number of criteria need to be specified, e.g., the learning analytics objective you want to address (modelling learners, prediction of performance, adaptation, recommendation, etc., see Sect. 3.2.1), the metrics you have to compute (effectiveness, efficiency, outcome, see Sect. 3.2.2), and the type of analytics you want to use (descriptive, diagnostic, predictive, etc., see Sect. 3.2.2). The analysis methods are then used to form a better understanding of the educational settings and learners: learning analytics focuses on the application of known methods and models to address issues affecting student learning and the environments in which it occurs.

Before we explain how the appropriate analysis method can be chosen according to the needs, we briefly introduce (in simple terms) the approaches commonly used in learning analytics. Specifically, the learning analytics metrics come from data related to learners’ interactions with course content, other learners, and instructors. Different techniques are applied to detect interesting patterns hidden in the educational data sets.

Among the analysis techniques, some have received increased attention in the last couple of years, namely statistics, data mining, machine learning, qualitative analysis, social network analysis, and visualizations (Chatti et al., 2012; Khalil & Ebner, 2016; Papamitsiou & Economides, 2014). In a recent report on the current state-of-the-art in learning analytics, a corpus of 100 studies was considered (Misiejuk & Wasson, 2017). Figure 3.14 shows the frequency of the data analysis methods used in the corpus.

Fig. 3.14
A horizontal bar graph for Method versus Frequency. The data are arranged from top to bottom as lowest to highest bar values, in which descriptive statistics is the highest with 43.

Frequency of data analysis methods in learning analytics. (Data Source: http://bora.uib.no/handle/1956/17740)

By far, statistics is the most commonly used method, including descriptive statistics (43%), correlation analysis (36%), ANOVA (10%) and T-Test (10%). Data mining methods like regression analysis (24%) and cluster analysis (13%) are also common techniques, followed by network analysis (16%) and data visualisations (13%). The remainder of the methods were reported 1–5 times. Some of these less used approaches are machine learning methods such as neural networks and support vector machines. More recently, multimodal analysis uses more sophisticated data such as video, gaze, gestures, and combines various methods such as computer vision, machine learning, etc.

Although the different analysis methods are inherently technical, they can provide pedagogical insights if properly used. For example, descriptive statistics (such as the mean, median and standard deviation) can be used to showcase the students’ interaction with a learning system (the usage), as captured by efficiency metrics (see Sect. 3.2.2) such as time online, total number of visits, distribution of visits over time, frequency of students’ postings/replies, percentage of material read, etc. Statistical methods can also be used to establish the significance of the analysis results (e.g., analysis of variance – ANOVA, and t-tests), or to explain more complex constructs of learning (effectiveness metrics), such as engagement (e.g., Principal Component Analysis – PCA). Data mining methods like classification and clustering can be used to model and explain learner performance (outcome metrics), and machine learning techniques can be successfully applied to detect learners’ affective states (effectiveness metrics) during the learning activities.
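As a minimal sketch of how such efficiency metrics are derived, the snippet below computes the total number of visits and the time online per student from a login/logout event log. The log format and field names are assumptions for illustration, not a real LMS API:

```python
from collections import Counter
from datetime import datetime

# Hypothetical LMS event log: (student, action, ISO timestamp)
log = [
    ("s1", "login",  "2023-03-01T09:00:00"),
    ("s1", "logout", "2023-03-01T09:45:00"),
    ("s2", "login",  "2023-03-01T10:00:00"),
    ("s2", "logout", "2023-03-01T10:20:00"),
    ("s1", "login",  "2023-03-02T09:00:00"),
    ("s1", "logout", "2023-03-02T09:30:00"),
]

# Efficiency metric 1: total number of visits per student
visits = Counter(s for s, a, _ in log if a == "login")

# Efficiency metric 2: total time online per student, in minutes
time_online = Counter()
open_sessions = {}
for student, action, ts in log:
    t = datetime.fromisoformat(ts)
    if action == "login":
        open_sessions[student] = t
    elif action == "logout" and student in open_sessions:
        time_online[student] += (t - open_sessions.pop(student)).seconds // 60
```

With this toy log, s1 has two visits totalling 75 minutes online, and s2 one visit of 20 minutes; descriptive statistics over such per-student values are exactly what the methods below summarize.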

Next, we focus on how the most commonly used statistical methods can tell the story in the data. Statistics is used for measuring, controlling, communicating and understanding the data (Davidian & Louis, 2012). It is a mathematical science comprising methods of collecting, organizing and analyzing data in such a way that meaningful conclusions can be drawn from them. In general, statistics begins with data collection using a sampling method (you learned about that in Chap. 1); the investigation and analysis of the collected data then fall into two broad categories, called descriptive and inferential statistics. Descriptive statistics deals with the processing of data without attempting to draw any inferences from it (Kenton, 2018). Inferential statistics uses mathematical tools to make forecasts and generalizations about the larger population of subjects by analyzing the given data (Kuhar, 2010).

The Statistics – Introduction to Statistics video (see useful video resources) presents a brief introduction to statistics. Before advancing to more sophisticated techniques, we elaborate more on the fundamentals of statistical analysis and how they can tell the story in learning data analytics.

As already explained, descriptive statistics are used to summarize data in a way that makes sense. Descriptive statistics are, as their name suggests, descriptive: they illustrate what the data shows but do not generalize beyond the data considered. Here is a list of commonly used descriptive statistics (Dillard, 2017):

  • Frequencies – a count of the number of times a particular score or value is found in the data set. For example, how many students (within all participants) have scored 5 out of 10 on a test.

  • Percentages – used to express a set of scores or values as a percentage of the whole.

  • Mean – numerical average of the scores or values for a particular variable, e.g., the average score that the students achieved on a test. Taken alone, the mean can be misleading. In some data sets, the mean is closely related to the mode and the median (the two other measures of central tendency). However, in a data set with a high number of outliers or a skewed distribution, the mean alone does not provide the accuracy needed for a nuanced decision.

  • Median – the numerical midpoint of the scores or values that is at the center of the distribution of the scores.

  • Mode – the most common score or value for a particular variable, e.g., the most common score that was achieved among all students.

  • Minimum and maximum values (range) – the highest and lowest values or scores for any variable.

  • Standard deviation (σ) – quantifies the amount of variation or dispersion of a set of data values, or otherwise, how close the data points are to the mean – the measure of a spread of data around the mean. A low standard deviation indicates that the data points tend to be close to the mean of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.

Mean, median and mode are measures of central tendency, while range and standard deviation are measures of dispersion.
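These descriptive statistics can all be computed with Python’s standard statistics module. A minimal sketch, using a hypothetical set of test scores:

```python
import statistics

scores = [5, 7, 4, 4, 6, 7, 5, 9, 7, 3]  # hypothetical test scores (0-10)

summary = {
    "n":      len(scores),
    "mean":   statistics.mean(scores),     # central tendency: 5.7
    "median": statistics.median(scores),   # central tendency: 5.5
    "mode":   statistics.mode(scores),     # most frequent score: 7
    "range":  (min(scores), max(scores)),  # dispersion: lowest and highest
    "stdev":  statistics.pstdev(scores),   # population std dev around the mean
}
```

Appending a single outlier (e.g., a score of 100 from a data-entry error) would drag the mean far from the median, illustrating the caveat about outliers noted above.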

Descriptive statistics may be sufficient if the results do not need to be generalized to a larger population, e.g., outside the specific assignment; when comparing the percentage of students that solved an assignment correctly versus incorrectly, descriptive statistics may be sufficient. Most analytics fall into this basic data evaluation category, and there is tremendous value, and opportunity for substantial wins, at this level.

However, using only this kind of statistics entails the risk of ‘picking the low-hanging fruit’ of learning analytics – descriptive information or simple statistics that value what can be easily measured rather than measure what is valued. If it matters to understand not only what happened, but also why it happened, the data must be used to make inferences or predictions about learners, and inferential statistics are required.

Inferential statistics can be used to generalize the findings from sample data to a broader population, and to examine the differences and relationships between two or more samples of the population (Kuhar, 2010). These more complex analyses look for significant differences between variables and between sample groups of the population. Inferential statistics allow testing hypotheses and generalizing results to the population as a whole. Following is a list of basic inferential statistical tests (Rathi, 2018):

  • Correlation – seeks to describe the nature of a relationship between two variables, such as strong or weak, positive or negative, or statistically significant. If a correlation is found, it indicates a relationship or pattern, but keep in mind that it does not indicate or imply causation.

  • Analysis of Variance (ANOVA) – determines whether the difference in the means of two or more sampled groups is statistically significant or due to random chance. For example, the test scores of two groups of students are examined and found to be significantly different. The ANOVA will tell you whether the difference is significant, but it does not speculate as to “why”.

  • Regression – used to determine whether one variable is a predictor of another variable. For example, a regression analysis may indicate whether or not participating in a test preparation program results in higher ACT scores for high school students. It is important to note that regression analyses are like correlations in that causation cannot be inferred from them.

Questions and Teaching Materials

  1. 1.

    Which of the following statements best explains the generic role of Data Science in Learning Analytics?

    1. (a)

      Data Science is used to convert the raw student data into learning analytics metrics

    2. (b)

      Data Science uses mathematical tools to make forecasts about the larger student population by analyzing their data

    3. (c)

      Data Science is a blend of various tools, algorithms, and machine learning principles with the goal to discover hidden patterns from the raw student data and to understand learning in online and blended learning environments

    4. (d)

      Data Science uses complex analyses and is looking for significant differences between variables for the sample groups of the student population

Correct answer: c.

  1. 2.

    What data analysis method would you use to signify the differences in on-task effort exertion (e.g., in time-spent to complete the task) between different student groups?

    1. (a)

      Median and standard deviation

    2. (b)

      t-tests and/or ANOVA

    3. (c)

      Principal Component Analysis

    4. (d)

      Machine Learning

Correct answer: b

  1. 3.

    Which are the generic categories of statistical methods?

    1. (a)

      Simple statistics, Complex statistics, Inferential statistics

    2. (b)

      Sampling methods, Simple statistics, Complex statistics

    3. (c)

      Sampling methods, Descriptive statistics, Inferential statistics

    4. (d)

      Descriptive statistics, Complex statistics, Inferential statistics

Correct answer: c.

  1. 4.

    Match the appropriate definition (from the right column), to the respective “descriptive statistic” in the left column

1. Mode                  a. Numerical average of the scores or values for a particular variable, e.g., the average score that the students achieved on a test.
2. Median                b. A count of the number of times a particular score or value is found in the data set.
3. Mean                  c. Quantifies the amount of variation or dispersion of a set of data values – how close the data points are to the mean.
4. Standard deviation    d. The most common score or value for a particular variable, e.g., the most common score that was achieved among all students.
5. Frequencies           e. Used to express a set of scores or values as a percentage of the whole.
6. Percentages           f. The numerical midpoint of the scores or values that is at the center of the distribution of the scores.

Correct answer: 1.d / 2.f / 3.a / 4.c / 5.b / 6.e.

  1. 5.

    Using the table below, calculate the frequency of students (within all participants) who scored above 5 (>5) out of 10 on all five assignments.

    1. (a)

      3

    2. (b)

      7

    3. (c)

      8

    4. (d)

      12

 

Student | Assign.1 | Assign.2 | Assign.3 | Mid-term Test | Assign.4 | Assign.5 | Final Test
Stud1   | 5 | 7 | 4 | 4 | 6  | 7 | 5
Stud2   | 7 | 9 | 7 | 9 | 8  | 8 | 8
Stud3   | 5 | 5 | 7 | 5 | 6  | 5 | 5
Stud4   | 4 | 5 | 3 | 4 | 4  | 3 | 2
Stud5   | 7 | 8 | 6 | 7 | 4  | 5 | 4
Stud6   | 5 | 6 | 5 | 3 | 4  | 5 | 3
Stud7   | 7 | 7 | 7 | 6 | 7  | 5 | 5
Stud8   | 7 | 7 | 3 | 7 | 6  | 7 | 8
Stud9   | 3 | 7 | 6 | 5 | 3  | 3 | 2
Stud10  | 6 | 8 | 7 | 4 | 6  | 7 | 6
Stud11  | 6 | 9 | 5 | 4 | 4  | 7 | 5
Stud12  | 6 | 6 | 6 | 5 | 6  | 5 | 5
Stud13  | 7 | 8 | 7 | 6 | 6  | 4 | 4
Stud14  | 7 | 7 | 4 | 6 | 5  | 2 | 4
Stud15  | 5 | 6 | 8 | 7 | 7  | 4 | 5
Stud16  | 6 | 7 | 8 | 5 | 5  | 4 | 4
Stud17  | 4 | 7 | 4 | 6 | 4  | 5 | 3
Stud18  | 7 | 6 | 9 | 5 | 3  | 3 | 3
Stud19  | 5 | 5 | 7 | 8 | 8  | 7 | 6
Stud20  | 7 | 9 | 8 | 9 | 10 | 8 | 9
Stud21  | 6 | 7 | 4 | 3 | 4  | 2 | 3
Stud22  | 6 | 5 | 4 | 2 | 4  | 5 | 4
Stud23  | 4 | 4 | 4 | 4 | 5  | 5 | 6
Stud24  | 7 | 6 | 7 | 6 | 5  | 3 | 6
Stud25  | 6 | 5 | 5 | 7 | 5  | 4 | 5

Correct answer: a.
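The count behind answer (a) can be verified with a short script. Reading “all assignments” as covering only the five assignment columns (the mid-term and final tests excluded), a minimal sketch with the scores transcribed from the table:

```python
# Columns: Assign.1, Assign.2, Assign.3, Mid-term, Assign.4, Assign.5, Final
rows = {
    "Stud1":  [5, 7, 4, 4, 6, 7, 5],   "Stud2":  [7, 9, 7, 9, 8, 8, 8],
    "Stud3":  [5, 5, 7, 5, 6, 5, 5],   "Stud4":  [4, 5, 3, 4, 4, 3, 2],
    "Stud5":  [7, 8, 6, 7, 4, 5, 4],   "Stud6":  [5, 6, 5, 3, 4, 5, 3],
    "Stud7":  [7, 7, 7, 6, 7, 5, 5],   "Stud8":  [7, 7, 3, 7, 6, 7, 8],
    "Stud9":  [3, 7, 6, 5, 3, 3, 2],   "Stud10": [6, 8, 7, 4, 6, 7, 6],
    "Stud11": [6, 9, 5, 4, 4, 7, 5],   "Stud12": [6, 6, 6, 5, 6, 5, 5],
    "Stud13": [7, 8, 7, 6, 6, 4, 4],   "Stud14": [7, 7, 4, 6, 5, 2, 4],
    "Stud15": [5, 6, 8, 7, 7, 4, 5],   "Stud16": [6, 7, 8, 5, 5, 4, 4],
    "Stud17": [4, 7, 4, 6, 4, 5, 3],   "Stud18": [7, 6, 9, 5, 3, 3, 3],
    "Stud19": [5, 5, 7, 8, 8, 7, 6],   "Stud20": [7, 9, 8, 9, 10, 8, 9],
    "Stud21": [6, 7, 4, 3, 4, 2, 3],   "Stud22": [6, 5, 4, 2, 4, 5, 4],
    "Stud23": [4, 4, 4, 4, 5, 5, 6],   "Stud24": [7, 6, 7, 6, 5, 3, 6],
    "Stud25": [6, 5, 5, 7, 5, 4, 5],
}
ASSIGNMENT_COLS = [0, 1, 2, 4, 5]  # the five assignments; tests excluded
passed = [s for s, r in rows.items()
          if all(r[i] > 5 for i in ASSIGNMENT_COLS)]
# passed == ["Stud2", "Stud10", "Stud20"], i.e., a frequency of 3
```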

  1. 6.

    Mariana is an instructional designer, and she needs to redesign the educational material for a course which the majority of students failed. She has student data available from the previous time the course ran. What statistical method should she use to predict students’ scores from their participation variables (e.g., time on assignments, number of assignments completed, frequency of logins, etc.)?

    1. (a)

      Mean and standard deviation of the participation variables: they illustrate what the data shows

    2. (b)

      ANOVA of the participation variables: determine whether or not the difference in the means of the sampled groups is statistically significant or due to random chance

    3. (c)

      Regression: determine whether the participation variables can explain the scores

    4. (d)

      None of the above: more advanced data analysis methods are required

Correct answer: c.

  7. ACTIVITY/PRACTICE QUESTION (Reflect on)

    We encourage you to elaborate on the analysis methods employed in learning analytics, in the following reflection task:

    1. Provide 2 examples of learning analytics metrics and explain why you would use the mean and standard deviation to describe their values. Please elaborate on your choices.

    2. Provide examples of learning analytics metrics that could be used to explain a learning outcome, and elaborate on the statistical method you would use to explore the relationship.

3.3.2 Presentation Methods for Reporting on Learner Data Analytics

Now the educational data that were collected have been analyzed. How did students perform in an assignment? How did they perform compared to the previous assignment? How many of them downloaded the material that was made available online? How much time did the students spend studying the online material compared to the score they achieved on the assignments?

These are common questions that can be answered when the educational data that have been collected are analyzed using the respective metrics. The collected learner and context data can be presented in many different ways to make them easier to understand and more interesting to read. After collecting and organizing data, the next step is to display them in an easy-to-read manner – highlighting similarities, disparities, trends, and other relationships, or the lack thereof, in the dataset.

Data can be used to make data-driven and informed educational decisions, but all the data in the world won’t help if one cannot understand what an insightful analysis presents. The first step to presenting data is to understand that how data are presented matters (Kiss, 2018). Take these two visuals. They display the scores that 250 students achieved on the five assignments and the mid-term exams during one semester, on a scale of 0–100. The first one (infographic style – Fig. 3.15) is “prettier.” However, the visual is difficult to understand unless one actually reads the information on it. Pretty, but not helpful…

Fig. 3.15
A diagram indicates the score ranges of 5 assignments and the mid-term exams during one semester, on a scale of 0–100.

Infographic style visualization of learning data

On the other hand, the second one (Fig. 3.16) uses simple bars to display the same information. Helpful, and still pretty…

Fig. 3.16
A bar graph for 5 assignments and the mid-term exams against the score range from 0 to 100.

Simple visualization of learning data

In this section we elaborate on the different ways used to represent educational data and learning analytics metrics in a meaningful manner.

As already explained, displaying the analysis results and what is within the educational dataset in a clear way is helpful in telling the story and making sense of the data that have been collected. Data reports present the data, analyses, conclusions and recommendations in an easy-to-decipher-and-digest format (Lebied, 2016).

The methods commonly used to display data include tables, charts, bar graphs, pie graphs, and line plots. Other commonly used ways to present data are histograms, box-plots, scatterplots, and stem-and-leaf plots. Sometimes, a combination of graphical representations is used as a dashboard: presenting data results together should tell a story or reveal insights that would not emerge if the results were kept apart.

Why do we use tables, diagrams or charts to display the learner/learning information?

  • Displaying data visually (with pictures) can make it easier to understand.

  • It makes the information stand out on a page.

  • It is easier to display using pictures, rather than lots of words. For example, it is easier to show someone the layout of a town using a map, rather than describing it in words.

Data can be presented in various forms depending on the type of data collected. For example, a frequency distribution table shows how often each value (or set of values) of a variable occurs in a dataset. A frequency table is used to summarize categorical or numerical data. Frequencies are also presented as relative frequencies, that is, as percentages of the total number in the sample. Apart from tables, there are other, graphical ways to present data. Analytics presented visually make it easier for decision makers to grasp difficult concepts or identify new patterns.
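The frequency-table step described above can be sketched in a few lines of Python. The grades below are illustrative, not taken from the chapter:

```python
from collections import Counter

# Illustrative final grades for a small class (hypothetical data).
grades = [5, 7, 4, 4, 6, 7, 5, 8, 8, 5, 6, 7]

freq = Counter(grades)                               # absolute frequencies
total = len(grades)
rel_freq = {g: n / total for g, n in freq.items()}   # relative frequencies

# Print one row of the frequency distribution table per grade value.
for g in sorted(freq):
    print(f"grade {g}: {freq[g]} students ({rel_freq[g]:.0%})")
```

The relative frequencies always sum to 1, i.e., 100% of the sample, which is a useful sanity check on any frequency table.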

The Value of Data Visualization video (see useful video resources) provides a quick introduction to the value of data visualization. Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data. Data visualization is a powerful tool, especially in a world desperate for hard facts. When it comes to making sense of learning analytics and understanding learning patterns in the educational data, one can start from simple graphs that can demonstrate this information. For example, quiz submission data, discussion interaction data (e.g., participation in the forum), data on access to the learning management system, and assignment completion data have been gathered and analyzed. What’s next is to answer questions like the following:

  • How well did an individual student do in comparison to the entire class?

  • What was the overall performance on a quiz?

  • Is there a relationship between quiz performance and content access?

To address these questions, graphic representations that are easy to interpret are needed (Blits, 2017). Figure 3.17 illustrates the most common data visualization types.

Fig. 3.17
A diagram indicates the most frequent types of data visualization. These are bar graphs, line graphs, pie charts, histograms, and scatter plots.

Data visualization types

A bar graph is a way of summarizing a set of categorical data. It displays the data using a number of rectangles, of the same width, each of which represents a particular category. Bar graphs can be displayed horizontally or vertically, and they are usually drawn with a gap between the bars (rectangles). For example, to answer how well an individual student did in comparison to the entire class, a bar graph can be used, where each student in the classroom is represented by a bar.

A line graph is particularly useful when we want to show the trend of a variable over time. Time is displayed on the horizontal axis (x-axis) and the variable is displayed on the vertical axis (y-axis). In the above example, a line graph can be used to showcase the overall performance on a quiz.

A pie chart is used to display a set of categorical data. It is a circle, which is divided into segments. Each segment represents a particular category. The area of each segment is proportional to the number of cases in that category. For example, a pie chart can be used to display the successful completion of an assignment.

A histogram is a way of summarizing data that are measured on an interval scale (either discrete or continuous). It is often used in Exploratory Data Analysis (EDA) to illustrate the features of the distribution of the data in a convenient form. In the above example, a histogram can be used to show the distribution of scores of students on the final exams.
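The core of a histogram is just counting values per bin. A minimal sketch of that binning step, without any plotting library (the exam scores below are hypothetical):

```python
from collections import Counter

# Hypothetical final-exam scores on a 0-100 scale.
scores = [42, 55, 61, 67, 70, 71, 74, 78, 81, 83, 88, 91, 95]

bin_width = 10
# Map each score to the start of its bin, then count scores per bin.
bins = Counter((s // bin_width) * bin_width for s in scores)

# Render a simple text histogram, one row per bin.
for start in range(0, 100, bin_width):
    count = bins.get(start, 0)
    print(f"{start:3d}-{start + bin_width - 1:<3d} | {'#' * count}")
```

A plotting library would draw the same counts as adjacent rectangles; the statistical content is entirely in the bin counts.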

A scatter-plot displays values for typically two variables for a set of data. The data are a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis. The scatter-plot is usually used to determine if a correlation exists between the data, and how strong it is. For example, a scatter-plot can show if there is a relationship between quiz performance and content access, or if there is a relationship between assignment completion and quiz performance.
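Before (or alongside) a scatter-plot, the strength of such a relationship can be quantified with Pearson's correlation coefficient. A minimal sketch, computed from first principles; the paired values are hypothetical:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: number of content pages accessed vs. quiz score.
content_access = [3, 5, 8, 10, 12, 15, 18, 20]
quiz_score     = [4, 5, 5, 6, 7, 7, 8, 9]

r = pearson(content_access, quiz_score)
print(f"r = {r:.2f}")  # a value close to +1 suggests a strong positive relationship
```

Values of r near +1 or −1 indicate a strong linear relationship, and values near 0 indicate little or no linear relationship; correlation alone does not establish causation.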

It needs to be clarified that, in statistics, exploratory data analysis (EDA) is a preliminary data analysis approach to summarize the main characteristics of a given dataset, often with visual methods. EDA refers to a critical process of performing initial investigations on data to discover patterns, spot anomalies, test hypotheses and check assumptions with the help of summary statistics and graphical representations. It is good practice to understand the data first and try to gather as many insights from them as possible.

In most cases, a single graph does not contain all the information that is hidden in the data, cannot provide all the insights that might be needed to understand students’ learning behaviour or outcomes, and is not sufficient for informed decision-making. The solution is to use combined graphs of the learning analytics metrics that all together can tell the story in the data. These combined graphs are called dashboards. “A dashboard is a visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance” (Few, 2004). Here are five examples of learning analytics dashboard implementations, in relation to the educational objective they aim to address.

LAPA – Learning Analytics for Prediction & Action

The goal of the LAPA dashboard is to report learners’ online learning behaviour to the learners themselves and to the instructor, and to guide their learning in a smart, personalized way. The first version of LAPA (Fig. 3.18) consists of 7 graphs. The graph chosen for the online activity summary is the scatterplot, where individual learners can choose the X-axis and Y-axis to locate their position in class. The other 6 graphs provide a trend line of each learner’s activity every week, along with the average activity information of their peers. All graphs in LAPA are updated every week until the end of the semester (Park & Jo, 2015).

Fig. 3.18
A dashboard of online learning activity status in statistics depicts the frequent types of data visualizations.

The LAPA dashboard. (Source: Park & Jo, 2015)

LADA – Learning Analytics Dashboard for Advisers

LADA is a learning analytics dashboard that supports academic advisers in compiling a semester plan for students based on their academic history. LADA also includes a prediction of the academic risk of the student (Gutiérrez et al., 2018). LADA visualizes two categories of information: (a) the chance-of-success and prediction-quality components, and (b) the various information card components designed to support the adviser (Fig. 3.19).

Fig. 3.19
A L A D A dashboard depicts the various information card components designed to support the adviser.

The LADA dashboard. (Source: Gutiérrez et al., 2018)

LISSA – Learning Dashboard for Insights and Support during Study Advice

LISSA provides an overview of every key moment in chronological order up until the period in which the advising sessions are held: the grades of the positioning test (a type of entry-exam without consequence), mid-term tests, January exams, and June exams. A general trend of performance is visualised at the top: the student path consists of histograms showing the position of the student among their peers per key moment (Charleer et al., 2018). LISSA is shown in Fig. 3.20.

Fig. 3.20
A L I S S A dashboard depicts histograms highlighting the student's route and representing the position of the student among their peers per key moment.

The LISSA dashboard. (Source: Charleer et al., 2018)

SmartKlass (Moodle)

SmartKlass™ is a learning analytics dashboard for institutions, teachers and students. By analyzing students’ behavioural data, SmartKlass™ creates a rich picture of the evolution of the students in an online course: it can help teachers identify the students lagging behind, identify the students for whom the content is not challenging enough, and compare participation and results to other courses, so the teachers can take action (Fig. 3.21). Students can also learn about their performance, individually and compared with the group.

Fig. 3.21
A SmartKlass dashboard indicates that teachers may use online students' progress to identify lagging students and help teachers to compare participation and results to other courses so that they can take action accordingly.

The SmartKlass. (Source: https://moodle.org/plugins/local_smart_klass)

Acrobatiq

The Learning Dashboard (Fig. 3.22) generates summary graphs, tables and reports and dynamically displays student learning estimates, engagement data and activity data in real time. It enables faculty, students, and other stakeholders to visualize and act on student learning performance. It can be used for revealing what students did or did not learn, quantifying how well students have learned each skill, identifying consequential patterns in students’ learning behaviours, and measuring the effectiveness of instructional and design choices.

Fig. 3.22
An Acrobatiq dashboard depicts real-time estimations of student learning and activity. It helps teachers, students, and others to see and act on student learning performance.

The Acrobatiq. (Source: https://www.acrobatiq.us/products/the-learning-dashboard.html)

Signals

Course Signals was developed to allow instructors the opportunity to employ the power of learner analytics to provide real-time feedback to a student. Course Signals relies not only on grades to predict students’ performance, but also demographic characteristics, past academic history, and students’ effort as measured by interaction with Blackboard Vista, Purdue’s learning management system (Arnold & Pistilli, 2012). The Course Signals Explanation video (see useful video resources) is a brief introduction to Signals.

KlassData

The learning process in virtual environments is more complex to analyze, but the generated data unlocks the power of learning analytics and opens the door to personalized paths in education. The KlassData: Learning Analytics for Education video (see useful video resources) explains how KlassData works.

Questions and Teaching Materials

  1. Match the visualizations (from the left column) to the respective evaluation of data presentation clarity (i.e., “Easy to understand” / “Difficult to understand” in the right column).

(a) Easy to understand

(b) Difficult to understand

Correct answer: 1.b / 2.b / 3.b / 4.a.

  2. What is the purpose of dashboards?

    (a) To present data results together so that they tell a story or reveal insights that would not emerge if the results were kept apart

    (b) To summarize categorical or numerical data

    (c) To grasp difficult concepts or identify new patterns

    (d) To produce and deliver richly interactive visualizations

Correct answer: a.

  3. What type of graph is more appropriate to present all students’ scores on monthly assignments together with the average class performance, and what type for the distribution of the scores?

    (a) A scatter plot to visualize the students’ scores on monthly assignments and the average class performance, and a histogram for the distribution of the scores

    (b) A bar graph with a line to visualize the students’ scores on monthly assignments and the average class performance, and a histogram for the distribution of the scores

    (c) A bar graph with a line to visualize the students’ scores on monthly assignments and the average class performance, and a pie chart for the distribution of the scores

    (d) A scatter plot to visualize the students’ scores on monthly assignments and the average class performance, and a pie chart for the distribution of the scores

Correct answer: b.

  4. Select the visualization that best illustrates the performance of all students on all assignments.

a.

b.

c.

d.

Correct answer: c.

  5. Match the objective (from the right column) to the respective visualization dashboard in the left column.

1. SmartKlass

a. Supports academic advisers in compiling a semester plan for students based on their academic history.

2. LISSA

b. Reveals what students learn, quantifies how well students have learned each skill, identifies patterns in students’ learning behaviours, and measures the effectiveness of instructional and design choices.

3. LAPA

c. Helps teachers identify the students lagging behind, identify the students for whom the content is not challenging enough, and compare participation and results to other courses, so the teachers can take action.

4. Acrobatiq

d. Reports learners’ online learning behaviour to the learners themselves and the instructor, and guides their learning in a smart and personalized way.

5. LADA

e. Provides an overview of every key moment in chronological order up until the period in which the advising sessions are held.

Correct answer: 1.c / 2.e / 3.d / 4.b / 5.a.

  6. What is the main focus of the visualization dashboard systems that have been developed?

    (a) To capture moment-by-moment learning and students’ achievements

    (b) To increase students’ awareness of their own progress, guide self-learning, and support self-regulation of learning

    (c) To predict students’ progress during the semester and make content recommendations

    (d) To monitor individual students’ learning and reveal gaps, misunderstandings, or difficulties, and help teachers tailor their instruction to the students’ needs

Correct answer: d.

  7. ACTIVITY/PRACTICE QUESTION (Reflect on)

    We encourage you to elaborate on the data representation techniques in learning analytics, in the following reflective task:

    1. Provide 2 examples of learning analytics metrics and explain what type of representation method you would employ to demonstrate their role. Please elaborate on your choices.

    2. Assume that you want to gain insight into learners’ engagement in an online activity. What learning analytics metrics would you consider, and what visualizations would you provide on a dashboard to monitor how these metrics change? Please elaborate on your decisions/suggestions.

3.4 Interpreting Learning Analytics and Inferring Learning Changes

3.4.1 Making Sense of Learners’ Data Analytics and Analysis Results

The intersection of learning science with data and analytics enables more sophisticated ways of making meaning to support student learning. All these available learner and context data “carry” so much knowledge about the learners and the learning processes that remains hidden and waits to be revealed. But data from tracking systems are not inherently intelligent. Hit counts and access patterns do not really explain anything. The intelligence is in the interpretation of the data: what all those statistics about the learners’ data and measurements can tell us. For example, login frequencies, time spent on tasks or numbers of forum posts do not measure the impact on students’ learning. However, data analysis techniques can reveal potential relationships between metrics that otherwise, from a human-analysis perspective, would remain undiscovered or even ignored. In the above example, learning analytics metrics such as time spent or frequency of attempts can be used to identify specific units of study or assignments in a course that are difficult (or trivial) for most of the students, and reveal the correlation between task difficulty and student behaviour. Ideally, data analysis techniques enable the visualization of interesting data that in turn sparks further investigation of these data. Figure 3.23 illustrates the path from learners and their data to the interpretation of learning analytics.

Fig. 3.23
A diagram indicates the path of flow from learners to the interpretation of learning analytics.

The path from learners to knowledge

The statistical analysis uses a combination of potentially actionable metrics to predict an outcome that needs attention and improvement. For example, to predict the successful completion of an assignment, metrics can include measurable events, such as time spent on-task, on-task mental effort, number of attempts to solve a task, frequency of question posing, frequency of help-seeking, etc. Less obvious data can also be used, such as non-cognitive variables like stress levels, emotional intensity, attention, etc. Analyses provide a score for each student, so students can be grouped objectively into categories needing high, medium or no intervention to successfully complete the assignment. The analysis cannot say that the learning analytics metrics caused the outcome, but it can show what combination of indicators is related to the outcome. Your data reports and visualizations will help you to identify historical trends and correlations, which you can use to understand what happened and (probably) why.
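The grouping step described above can be sketched as follows. The student names, predicted success probabilities and thresholds are illustrative assumptions, not values from the chapter:

```python
# Illustrative predicted probabilities of successfully completing an assignment
# (hypothetical output of some prediction model).
predicted = {"Ana": 0.35, "Ben": 0.62, "Cleo": 0.88, "Dan": 0.55, "Eva": 0.91}

def intervention_level(p, high=0.5, medium=0.75):
    """Map a predicted success probability to an intervention band."""
    if p < high:
        return "high"     # low predicted success: needs high intervention
    if p < medium:
        return "medium"   # moderate predicted success: needs some support
    return "none"         # on track: no intervention needed

groups = {name: intervention_level(p) for name, p in predicted.items()}
print(groups)
```

The thresholds are a design decision the educator makes; the analysis supplies the scores, the banding turns them into an actionable grouping.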

Behavioural data can also be used to track students’ approaches to study. For example, frequency and sequence of interactions can be tracked, as students engage with learning tasks. While this may not directly measure student learning, it can provide insights on the student’s on-task activity and help to identify strategies that could improve how they plan and regulate their study.

Data science promises to have a substantial influence on the understanding of learning in online and blended learning environments. This, of course, implies a shift in the typical role of educators, from being instructors and facilitators to performing some of the tasks data analysts usually perform (Fig. 3.24). They need to be able to discover the patterns in the data and convey their meaning in educational terms, that is, to interpret the analysis results into meaningful learning schemas.

Fig. 3.24
A diagram depicts the roles of the educator as a facilitator, instructor, and analyst.

The different roles of the educator in relation to data

The more an educator uses the learning analytics metrics, tools and visualization dashboards, the more she will understand the story that the data can tell, and which patterns in the data matter most for explaining students’ engagement, progress and outcomes. The analysis might reveal correlations between metrics that the educator had never thought of before, and behavioural patterns that are repeated from student to student and from class to class.

As the educator moves from efficiency metrics to effectiveness metrics to outcomes (see Sect. 3.2.2), she should keep in mind that all metrics are proxies for what ultimately matters. The different types of analytics facilitate the selection of the most appropriate metrics and guide their interpretation. Next, we elaborate on how the analysis outcomes associate with the learning analytics objectives and the analytics types.

As already discussed, the common objectives of learning analytics include monitoring learners’ progress, modelling learners/learners’ behaviour, detecting learner’s emotions, predicting learning performance/dropout/retention, generating feedback, providing recommendations, guiding adaptation, increasing self-reflection/self-awareness, and facilitating self-regulation. To address these objectives, four types of learning analytics can be used, namely descriptive, diagnostic, predictive and prescriptive analytics. The infographic by CommLabIndia and the article by eLearningIndustry give a comprehensive overview of different levels of learning analytics and of how bases of and approaches to using analytics can lead to deeper insights.

Each analytics type can be supported and facilitated by specific data analysis methods that are appropriate for that type of data transformation. For example, descriptive statistics and simple visualizations (using bar graphs, histograms, etc.) are suitable analysis techniques for descriptive analytics. Similarly, correlation analysis better facilitates diagnostic analytics, whereas regression analysis is commonly used for prediction purposes and is thus an indicative analysis technique for predictive analytics. When it comes to prescriptive analytics, more sophisticated analysis techniques can be employed (e.g., heuristics, machine learning), which, however, require a strong background in data science and are beyond the scope of this chapter. Depending on the objectives and the types of analytics used, the interpretation of the analysis results can vary from gaining insights, to making decisions, to taking actions (Fig. 3.25).
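For the predictive-analytics case, the simplest instance of regression is a straight line fitted by ordinary least squares. A minimal sketch; the participation/score pairs are hypothetical:

```python
# Hypothetical data: hours spent on assignments vs. final-exam score (0-10).
hours  = [2, 4, 5, 6, 8, 10]
scores = [3, 4, 6, 6, 8, 9]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Ordinary least squares: slope and intercept of the best-fit line.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
         / sum((x - mean_x) ** 2 for x in hours))
intercept = mean_y - slope * mean_x

def predict(x):
    """Predicted exam score for a given number of hours."""
    return intercept + slope * x

print(f"score \u2248 {intercept:.2f} + {slope:.2f} * hours")
print(f"predicted score for 7 hours: {predict(7):.1f}")
```

A positive slope indicates that students who spend more time on assignments tend to score higher; as noted earlier, this shows association, not causation.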

Fig. 3.25
A diagram indicates the aims and types of analytics utilized, and the analysis results may be interpreted as insights, decisions, and actions.

The learning analytics types with respect to the objectives and actions

For example, let’s assume that an educator wants to predict students’ success in the final exams early, in order to provide proactive feedback and recommendations, support their self-regulated learning strategies, and prevent failure or drop-out. Let’s also assume that the educator has available all the data from the students’ activity during the semester (online participation, assignment completion, quiz scores, etc.). The learning management system the educator is using can provide all the descriptive statistics about students’ misconceptions, engagement, achievement, progress, etc., and deliver this information using multiple visualizations of the different learning analytics metrics, demonstrating some critical interrelationships between them and facilitating some diagnostic operations. The dashboard can also present, in graphical form, the result of a regression analysis that considers the most critical metrics, forecasts the evolution of the prediction variable (e.g., success in the final exams) and displays the tendencies in the metrics. If the educator combines all this graphical information, which is the result of the analytics processing, she will be able to associate the numerical facts with each student’s progress and learning needs.

Questions and Teaching Materials

  1. How can learning analytics contribute to human learning?

    (a) Learning analytics can measure the impact on learning

    (b) Learning analytics can directly measure human learning

    (c) Learning analytics can show what combination of indicators is related to the outcome

    (d) Learning analytics metrics can show what has caused the learning outcome

Correct answer: c.

  2. Steven is an e-tutor. For his online course, he wants to identify areas that require improvement – e.g., learner engagement or the effectiveness of course delivery – and he also wants to identify gaps and performance issues early, before they become problems. What type of analysis methods and analytics should he use?

    (a) Descriptive statistics (e.g., mean, standard deviation, min, max) – descriptive analytics (e.g., course enrolments, course compliance rates, what learning resources are accessed and how often)

    (b) Correlation analysis (e.g., ANOVA, t-test) – descriptive analytics (e.g., course enrolments, course compliance rates, what learning resources are accessed and how often)

    (c) Regression analysis – predictive analytics (e.g., high/low performance, high/low engagement)

    (d) Machine learning (e.g., classification) – predictive analytics (e.g., high/low performance, high/low engagement)

Correct answer: a.

  3. ACTIVITY/PRACTICE QUESTION (Reflect on)

    We encourage you to elaborate on the interpretation of learning analytics, in the following reflective task:

    1. Provide 2 examples of learning analytics objectives and explain what learning analytics type you would employ to achieve those objectives. Please elaborate on your choices.

3.4.2 Explaining the Data Analysis Results in an Educationally Meaningful Manner to Understand Learners and the Environment they Learn In

What analytics cannot do by themselves is improve instruction. While they can point to areas in need of improvement and identify engaging practices, the numbers cannot make suggestions for improvement. That requires human intervention.

Intervention should be personalized to the learner, based on their engagement and/or performance data and any personal information available. For example, if the educator notices that a student stopped participating in online forums just before their performance began to drop, it would be appropriate to encourage the student to resume their involvement in the forums. At the same time, it could be helpful to get feedback from the student to find out why they stopped participating. There may have been an event in the course or some other obstacle that the educator should address in order to facilitate the student’s involvement in the online forums.

Effective intervention may involve adapting teaching styles. If students tend to do better with certain kinds of media, interactivity, or assessments, the course design should be adapted to enable better learning. However, some learning professionals are hesitant to initiate a learning analytics practice for two reasons: the perception that they must address everything at once, and the concern that leadership will use the insights in a penalizing way. The Learning Analytics to inform teaching practice video (see useful video resources) explains how learning analytics can be used to inform teaching practice.

If a metric is not informing a decision, there is no need to keep gathering it. If it is, optimize the specific data and learn how to turn them into insights that inform decisions that matter. Over time, add more metrics, always keeping in mind the decisions they inform. The data one collects should be a combination of engagement and performance data – but it is important to make sure that one is not collecting information that one will not use. The Jisc Learning Analytics: Making data useful video (see useful video resources) demonstrates an example of how data can be effectively used and how one can give meaning to data.

Questions and Teaching Materials

  1. What is the first step teachers should consider for using learning analytics?

    (a) How to design the feedback and intervention using learning analytics?

    (b) What data should they collect to transform into learning analytics metrics?

    (c) What learning analytics metrics should they use to solve the problem at hand?

    (d) What kind of problem or aspect of learning do they want to detect and act on in the learning environment?

Correct answer: d.

3.5 Concluding Self-Assessed Assignment

3.5.1 Introduction

In order to proceed, you are requested to complete a concluding self-assessed assignment. This self-assessed assignment is a real-life scenario activity (based on the use case of the instructional designer David), using a rubric across three proficiency levels and an exemplary solution for rating. When you have completed this assignment, you will assess it yourself, following the rubric, which lists the required criteria and gives guidelines for the assessment.

This self-assessed assignment procedure consists of 5 steps:

  • Step 1. Real life scenario

  • Step 2. Getting familiar with the assessment rubric

  • Step 3. Prepare your answer

  • Step 4. Review a sample solution

  • Step 5. Self-evaluate your answer

3.5.2 Step 1. Real Life Scenario

David is an instructional designer. He always aims to create engaging learning activities and compelling course content. Recently he has been organizing the educational material and learning and assessment activities for a new course, and he wants to design a dashboard to monitor progress, engagement, and performance, both for individual students and for the whole class, that will advance the learning experience. He has available several types of student data tracked by the LMS during students’ activities (e.g., login data, content/ educational material access, timestamp for each activity, file downloads, assignments completed, correctness of assignments, grades on assignments, posting on online forums, quiz scores, discussion participation, etc.), as well as demographic and enrolment data (e.g., age, gender, socioeconomic status, special education needs, course enrolment, etc.). It is important for David to deliver a dashboard that will increase students’ self-awareness about their progress, motivate them to self-reflect and identify their needs, and finally enhance their retention and performance.

However, David is new to learning analytics and educational data literacy. Help David design a dashboard that integrates students’ needs and addresses the above learning objectives.
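As a starting point, the raw LMS events listed in the scenario can be aggregated into per-student indicators before any visualization is chosen. The following is a minimal sketch of that aggregation step; the event log, the event-type names, and the `engagement_indicators` function are all illustrative assumptions, not part of any particular LMS API.

```python
from collections import Counter

# Hypothetical LMS event log: (student_id, event_type) pairs.
# The event types mirror the data David has available; the names are illustrative.
events = [
    ("s01", "login"), ("s01", "content_access"), ("s01", "assignment_submit"),
    ("s01", "forum_post"), ("s02", "login"), ("s02", "content_access"),
    ("s02", "content_access"), ("s02", "quiz_attempt"),
]

def engagement_indicators(events, student_id):
    """Count each event type for one student: a simple engagement profile."""
    return dict(Counter(etype for sid, etype in events if sid == student_id))

print(engagement_indicators(events, "s02"))
# → {'login': 1, 'content_access': 2, 'quiz_attempt': 1}
```

Counts like these (logins, content accesses, submissions, forum posts) are the kind of engagement indicators a dashboard widget can then display per student or averaged over the class.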

3.5.3 Step 2. Getting Familiar with the Assessment Rubric

David has searched the Internet for samples of learning analytics dashboards to get some design inspiration, and has designed an Initial ExampleDB.

Please help David to evaluate this Initial ExampleDB using the Rubrics for assessing the dashboard and to identify potential issues.

ACTIVITY/PRACTICE QUESTION (Discussion)

We encourage you to share your evaluation of the Initial ExampleDB created by David in the following discussion task, by posting your thoughts on the discussion board. You may discuss:

  1. Does this example dashboard comply with the dashboard design criteria in the rubric?

  2. If not, what would you advise David to modify so that this dashboard serves the learning objectives he has set?

3.5.3.1 Initial Example DB

The Initial ExampleDB is a dashboard of common graphical representations of data, indicating the results for 131 assignments, of which 20 are pending and 76 are approved.

3.5.3.2 Rubric for Assessing the Example DB

Each criterion is rated at one of three levels: Unacceptable (1), Good/solid (3), or Exemplary (5).

Clarity: Graphs and charts answer the specific question/address the specific objective.

  • Unacceptable (1): Graphs and charts do not have clearly defined topics and fail to address specific questions.

  • Good/solid (3): Graphs and charts have somewhat clearly defined topics but fail to address specific questions.

  • Exemplary (5): Graphs and charts have concise and clearly defined topics that address specific questions.

Information quality: Graphs and charts complement each other – there is no information redundancy.

  • Unacceptable (1): Graphs and charts are not relevant to each other and there is information redundancy.

  • Good/solid (3): Graphs and charts are relevant to each other but there is information redundancy.

  • Exemplary (5): Graphs and charts complement each other well, without redundant information.

Appropriateness: Graph and chart types are appropriate for the data types and scale.

  • Unacceptable (1): None or only a few of the graphic types used are suited to the type and scale of the data they represent.

  • Good/solid (3): Most graphic types used are well suited to the type and scale of the data they represent.

  • Exemplary (5): All graphic types used are well suited to the type and scale of the data they represent.

Interpretability: Graphs and charts convey meaningful information to the viewer and facilitate decision making.

  • Unacceptable (1): Graphs and charts are overwhelmed by text, color, and symbolism that are irrelevant to the question the visualization seeks to answer.

  • Good/solid (3): Graphs and charts contain some color, symbolism, or text that is irrelevant to the question the visualization seeks to answer.

  • Exemplary (5): Graphs and charts contain no color, symbolism, or text that is irrelevant to the question the visualization seeks to answer.

Organization: Graphs and charts are well organized and easy to follow.

  • Unacceptable (1): Graphs and charts are disorganized; the dashboard is not easy to follow.

  • Good/solid (3): Graphs and charts are visually appealing and somewhat well organized; the dashboard is somewhat easy to follow.

  • Exemplary (5): Graphs and charts are visually appealing and well organized; the dashboard is easy to follow.

Usability: Legends describe and explain every graphic variable type employed.

  • Unacceptable (1): Either there is no legend, or it does not describe any of the graphic variable types present in the visualization.

  • Good/solid (3): The legend describes only some of the graphic variable types present in the visualization.

  • Exemplary (5): The legend describes every graphic variable type present in the visualization.

Aesthetics: Visualization makes appropriate use of color.

  • Unacceptable (1): More than 12 colors are used, and similar colors are adjacent.

  • Good/solid (3): Fewer than 12 colors are used, and similar colors are not adjacent.

  • Exemplary (5): Fewer than 8 colors are used, and the colors are clearly distinguishable.

3.5.4 Step 3. Prepare Your Answer

Please assist David in designing a prototype of the dashboard that integrates students’ needs and addresses the above learning objectives. For this purpose, you will have to design a detailed prototype of the dashboard (using pen and paper and/or any tool of your preference). Please consider that David (and you!) has available all the types of student data he might need; help him select the most appropriate ones for each learning objective, mapping each learning analytics metric to the most suitable type of graph and/or chart.

ACTIVITY/PRACTICE QUESTION (Reflect on)

We encourage you to elaborate on your prototype of the dashboard, which should increase students’ self-awareness of their progress, motivate them to self-reflect and identify their needs, and ultimately enhance their retention and performance, in the following reflective task:

  1. What key indicators should the dashboard visualize to help students monitor their progress in the course?

  2. What key indicators should the dashboard visualize to help students monitor their performance on assignments and quizzes?

  3. What key indicators should the dashboard visualize to help students monitor their engagement with the course, its materials, and its tools?
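One way to make the metric-to-chart mapping explicit while prototyping is to record it as a small lookup table. The sketch below is a hypothetical example of such a mapping; the indicator names and chart pairings follow common visualization practice and are not prescribed by the rubric.

```python
# Hypothetical mapping from a learning analytics indicator to a suitable
# chart type, based on the shape of the underlying data (illustrative only).
INDICATOR_CHARTS = {
    "course_progress_pct":  "progress bar",             # single ratio
    "grades_over_time":     "line chart",               # ordered time series
    "quiz_score_vs_class":  "bar chart with class mean",# comparison to a reference
    "activity_by_weekday":  "heatmap",                  # two categorical axes
    "assignment_status":    "stacked bar chart",        # parts of a whole
}

def suggest_chart(indicator):
    """Return a suggested chart type, falling back to a plain table."""
    return INDICATOR_CHARTS.get(indicator, "table (no chart suggestion)")

print(suggest_chart("grades_over_time"))  # → line chart
```

Writing the mapping down this way forces each widget on the prototype to justify its chart choice against the data type and scale, which is exactly what the Appropriateness criterion in the rubric checks.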

3.5.5 Step 4. Review a Sample Solution

Please review the Exemplary Sample Solution below, which follows the criteria specified in the rubric for assessing the dashboard.

ACTIVITY/PRACTICE QUESTION (Reflect on)

We encourage you to reflect on the Exemplary Sample Solution, which follows the criteria specified in the rubric for assessing the dashboard, in the following reflective task: Do you identify any design requirements that you did not take into consideration when creating your dashboard prototype?

3.5.5.1 Exemplary Sample Solution

The Exemplary Sample Solution is a dashboard of common graphical representations of data, indicating a particular student's progress, performance, and engagement.

3.5.6 Step 5. Self-Evaluate Your Answer

Now that you have seen the Exemplary Sample Solution, please rate your initial answer (evaluate the dashboard you created), using the Rubric table below.

Each criterion is rated at one of three levels: Unacceptable (1), Good/solid (3), or Exemplary (5).

Clarity: Graphs and charts answer the specific question/address the specific objective.

  • Unacceptable (1): Graphs and charts do not have clearly defined topics and fail to address specific questions.

  • Good/solid (3): Graphs and charts have somewhat clearly defined topics but fail to address specific questions.

  • Exemplary (5): Graphs and charts have concise and clearly defined topics that address specific questions.

Information quality: Graphs and charts complement each other – there is no information redundancy.

  • Unacceptable (1): Graphs and charts are not relevant to each other and there is information redundancy.

  • Good/solid (3): Graphs and charts are relevant to each other but there is information redundancy.

  • Exemplary (5): Graphs and charts complement each other well, without redundant information.

Appropriateness: Graph and chart types are appropriate for the data types and scale.

  • Unacceptable (1): None or only a few of the graphic types used are suited to the type and scale of the data they represent.

  • Good/solid (3): Most graphic types used are well suited to the type and scale of the data they represent.

  • Exemplary (5): All graphic types used are well suited to the type and scale of the data they represent.

Interpretability: Graphs and charts convey meaningful information to the viewer and facilitate decision making.

  • Unacceptable (1): Graphs and charts are overwhelmed by text, color, and symbolism that are irrelevant to the question the visualization seeks to answer.

  • Good/solid (3): Graphs and charts contain some color, symbolism, or text that is irrelevant to the question the visualization seeks to answer.

  • Exemplary (5): Graphs and charts contain no color, symbolism, or text that is irrelevant to the question the visualization seeks to answer.

Organization: Graphs and charts are well organized and easy to follow.

  • Unacceptable (1): Graphs and charts are disorganized; the dashboard is not easy to follow.

  • Good/solid (3): Graphs and charts are visually appealing and somewhat well organized; the dashboard is somewhat easy to follow.

  • Exemplary (5): Graphs and charts are visually appealing and well organized; the dashboard is easy to follow.

Usability: Legends describe and explain every graphic variable type employed.

  • Unacceptable (1): Either there is no legend, or it does not describe any of the graphic variable types present in the visualization.

  • Good/solid (3): The legend describes only some of the graphic variable types present in the visualization.

  • Exemplary (5): The legend describes every graphic variable type present in the visualization.

Aesthetics: Visualization makes appropriate use of color.

  • Unacceptable (1): More than 12 colors are used, and similar colors are adjacent.

  • Good/solid (3): Fewer than 12 colors are used, and similar colors are not adjacent.

  • Exemplary (5): Fewer than 8 colors are used, and the colors are clearly distinguishable.
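If it helps to keep score while self-evaluating, the rubric's seven criteria and three levels can be totalled as in the sketch below. The criterion keys and the `rubric_total` function are illustrative assumptions; only the 1/3/5 levels and the list of criteria come from the rubric itself.

```python
# The seven criteria from the rubric above; each is scored 1 (Unacceptable),
# 3 (Good/solid), or 5 (Exemplary). Keys are shortened here for convenience.
CRITERIA = ["clarity", "information_quality", "appropriateness",
            "interpretability", "organization", "usability", "aesthetics"]

def rubric_total(scores):
    """Sum the per-criterion scores, checking coverage and valid levels."""
    assert set(scores) == set(CRITERIA), "score every criterion exactly once"
    assert all(v in (1, 3, 5) for v in scores.values()), "levels are 1, 3, or 5"
    return sum(scores.values())

# Example self-assessment: six criteria at Good/solid, Clarity at Exemplary.
example = {c: 3 for c in CRITERIA}
example["clarity"] = 5
print(rubric_total(example))  # → 23 (out of a maximum of 35)
```

The total ranges from 7 (all Unacceptable) to 35 (all Exemplary); tracking it across revisions of your prototype gives a rough sense of improvement, though the per-criterion feedback matters more than the sum.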