1 Introduction

Nowadays, with the abundance of data and information that can be harvested from online and offline learning environments, educators and instructors in higher education institutions typically face several challenges before such data can be utilised to improve the teaching and learning processes. Several steps are required before the data can be deployed, including storing, analysing, and anonymising it. With a better understanding of such data and of how it can be used to improve education, problems can be forecasted and detected before they arise, provided that the relevant measures are taken and acted upon (Hlosta et al., 2017; Waddington et al., 2014). Alongside this upsurge of data, Open Educational Resources (OER) have also grown rapidly in recent years, as many initiatives have been created to share, reuse, and standardise online materials (Sinclair et al., 2013). To tackle these challenges, Learning Analytics (LA) plays a significant role in making the data comprehensible. LA encompasses several steps, including data harvesting, storing, cleaning, anonymisation, mining, analysis, and visualisation (Drachsler & Greller, 2016).

LA is a relatively young field of research that first appeared around 2010 (Call for Papers of the 1st International Conference on Learning Analytics and Knowledge (LAK 2011), 2011). It takes an evidence-based approach to analysing and understanding data in order to support and improve the complexities of both the learning and the teaching processes. LA is defined as “the measurement, collection, analysis, and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs” (Call for Papers of the 1st International Conference on Learning Analytics and Knowledge (LAK 2011), 2011). In its early stages, organisations such as the Society for Learning Analytics Research (SoLAR) and the International Educational Data Mining Society were founded to steer the research community towards data analytics and data mining in education (Siemens, 2013). As a result, LA caught the attention of educational institutions, especially those involved in distance higher education (e.g., open universities). This interest stems from the need to improve learning and teaching by moving towards personalisation and customisation in order to meet the challenges of delivering distance education to large cohorts of online students.

Fostering LA is no easy task. One of the challenges is that LA is a very data-intensive field. For a long time, the available technology did not allow web-based data volumes to be processed at sufficient speed, since special hardware was required; until a few years ago, only large data centres could handle this. Since then, the field and the application of LA in the educational domain have been growing steadily (Ebner & Markus, 2018). With the rise of more capable processors and architectures that allow large amounts of data to be processed in a reasonable time, scientists around the globe are attempting to make use of this vast amount of information and to exploit the advantages and benefits that LA can bring to the educational scene (Greller & Drachsler, 2012).

The aim of this chapter is to shed light on OER repositories, LA, and LA dashboards, and to present the implementation of a research-driven LA dashboard for displaying OER and their repositories that visualises educational data in a way that is understandable for both educators and learners. This chapter is structured as follows. Section 2 presents an overview of OER and their problems and challenges. Section 3 presents our proposed Learning Analytics Dashboard (LAD), its methodology, indicators, and components. Finally, in Sect. 4, we present the conclusion, future work, and limitations of this work.

2 An Overview of Open Educational Resources (OER)

In today’s world, education has expanded from purely face-to-face learning to include a hybrid mode of learning, i.e., blended learning, which combines face-to-face learning (for example, in a classroom or university setting) with online learning via digital technologies. This transformation is the result of learners’ growing need for flexibility and mobility when undertaking their education due to work and other commitments, as well as of technologies becoming more advanced, so that online education now works well for an increasingly large number of individuals. Nowadays, the ease of access to information and knowledge makes it possible for a large majority to obtain a quality education anytime and anywhere around the globe. This is why online learning resources, including those previously known as learning objects (LOs) and reusable learning objects (RLOs), are rapidly increasing in number and size. Currently, there are plenty of OER repositories that provide educational resources to teachers and students, and it is essential to have a platform that provides information on the current state of Open Educational Resources (OER) in order to keep educators informed (Sinclair et al., 2013).

The United Nations Educational, Scientific and Cultural Organization (UNESCO) defines OER as learning and teaching materials in any form that either reside in the public domain or are released under an open copyright licence, and that can be shared, adapted, and reused with no or limited restrictions (UNESCO, xxxx). UNESCO, a specialised agency of the United Nations, has created a dedicated OER programme; in 2002, it organised the forum at which the term “Open Educational Resources” was first introduced. UNESCO emphasises free and easy access to high-quality content for every individual, including those in minority and disadvantaged groups, and believes that universal access to information through high-quality education contributes to peace, sustainable social and economic development, and intercultural dialogue. OER provide a strategic opportunity to improve the quality of learning, as well as policy dialogue, knowledge sharing, and capacity building globally.

2.1 OER Mapping Problems and Challenges

In our research study, we harvested data mostly from the German OER repositories Econis,Footnote 1 OpenClipArt,Footnote 2 UniWeimar,Footnote 3 ZOERR,Footnote 4 Lecture2Go,Footnote 5 and HOOU.Footnote 6 We encountered a number of problems and challenges while analysing the data. Specifically, the received data had to be organised and prepared before it could be presented in the way the learning resources were intended to be used. We therefore decided that an LA dashboard would help users organise and visualise the large number of learning resources appropriately. Each OER is tagged with metadata, and hence a tool was needed to enable the exploration of these resources as well as to search for and locate them.

To analyse the data and explore which metadata elements could be used for our LAD (presented in Sect. 3), we created a Python script. After narrowing down the number of elements, we investigated what information is stored in the remaining fields. The metadata elements include URL, title, description, language, contribute, copyright and other restrictions, location, source repository, datetime, catalogue entry, keywords, generated subjects, meta-metadata, classification, learning resource type, technical, and abstract. A total of 17 metadata elements are reported in the dataset, 14 of which are present in all six repositories. While the “classification” element is present throughout, it is empty in every single instance, and the “meta-metadata” element duplicates the content of the “contribute” field throughout. The “catalogue entry” element has non-empty values only in the Econis dataset. As for the remaining metadata elements, “abstract” contains data only in the UniWeimar and Econis datasets, while “technical” and “learning resource type” each contain entries in four of the datasets.

After cleaning and removing the non-shared metadata fields, 14 metadata fields remained: title, description, location, source repository, datetime, abstract, and learning resource type (single values); keywords, URL, and language (arrays containing multiple values); and catalogue entry, copyright and other restrictions, technical, and contribute (complex data structures of arrays and nested keys and values).

The “contribute” metadata element contains information about the authors, publishers, and providers of the resource. The “catalogue entry” element contains the ID of the resource in a catalogue, such as an International Standard Book Number (ISBN) or an Online Computer Library Center (OCLC) number. “Copyright and other restrictions” contains information on intellectual property restrictions, and the “technical” element describes the format of the resource and its size.
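To make the structure of a harvested record more concrete, the following is a minimal TypeScript sketch of how such a record could be typed. All property names and the nesting are illustrative assumptions derived from the metadata elements described above, not the exact schema produced by our harvester.

```typescript
// Illustrative shape of a harvested OER metadata record; the property names
// are assumptions based on the metadata elements described in Sect. 2.1.
interface Contribute {
  role: 'author' | 'publisher' | 'provider';
  entity: string;                 // person or organisation name
  date?: string;                  // ISO 8601 contribution date
}

interface OerRecord {
  url: string[];                  // one or more locations of the resource
  title: string;
  description: string;
  language: string[];             // ISO 639-2/T codes, e.g. ["deu", "eng"]
  location: string;
  sourceRepository: 'Econis' | 'OpenClipArt' | 'UniWeimar' | 'ZOERR' | 'Lecture2Go' | 'HOOU';
  datetime: string;               // publication date
  keywords: string[];
  abstract?: string;              // present only in some repositories
  learningResourceType?: string;  // e.g. "book", "image", "video"
  catalogueEntry?: { catalogue: string; entry: string };    // e.g. an ISBN or OCLC id
  copyrightAndOtherRestrictions?: { restricted: boolean; licence?: string };
  technical?: { format?: string; size?: number };           // size in bytes
  contribute: Contribute[];
}
```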

To tackle the challenges and successfully present the information in a meaningful manner, we developed an LA dashboard for displaying OER that shows information about the existing German OER repositories as part of our EduArc project located in Germany (presented in Sect. 3).

3 Our Proposed Learning Analytics Dashboard (LAD)

This section presents a short overview of the Learning Analytics Dashboard (LAD), metrics, and indicators created by our research team as part of the EduArc project. Typically, LA applications collect data from users’ interactions with system resources. To make sense of these captured data, they need to be categorised into corresponding units of measurement, which are referred to as metrics (Ahmad et al., 2022).

Our LAD consists of multiple adopted indicators (see Fig. 1 below). Metrics are used to create these indicators (Ahmad et al., 2022). Metrics are measurements of the activities a learner performs in a learning environment (Ahmad et al., 2022), e.g., the number of reading sessions, the duration of reading sessions, the number of reading interruptions (Sadallah et al., 2015), the number of learning activities, student attendance, or student grades (Ruiz et al., 2016). An indicator shows whether and to what extent a particular concept can be derived from the metrics, e.g., reading analytics, ideal reading material (Sadallah et al., 2015), self-regulation, or emotional state (Ruiz et al., 2016). The LAD therefore consists of educational foundations, indicators, metrics, and visualisations.
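As a minimal illustration of this relationship, an indicator can be thought of as a named collection of metrics together with a chosen visualisation. The following TypeScript sketch uses illustrative type and property names only, not the project's actual schema.

```typescript
// Minimal sketch of the metric/indicator relationship described above;
// names are illustrative, not the project's actual data model.
interface Metric {
  name: string;   // e.g. "number of reading sessions" or "resources per licence"
  value: number;  // the measured quantity
}

interface Indicator {
  name: string;                                             // e.g. "licence distribution"
  metrics: Metric[];                                        // measurements it is derived from
  visualisation: 'bar' | 'line' | 'donut' | 'table' | 'wordCloud';
}
```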

Fig. 1 Our Learning Analytics Dashboard (LAD)

3.1 Research Methodology

In our literature review, we searched the following Technology Enhanced Learning (TEL) publication outlets related to the focus of this study: the Learning Analytics and Knowledge (LAK) conference series since 2011, the Journal of Learning Analytics (JLA), the European Conference on Technology Enhanced Learning (ECTEL) since 2012, and IEEE Transactions on Learning Technologies.

To create our LAD for displaying OER, we harvested 170 publications. From this selection, we first evaluated the abstracts and excluded theory and policy papers as irrelevant to this study. We further excluded papers that presented no specific LA concept or any information relating to data visualisation. This left 126 articles, which were read in full by the research team and from which we extracted a total of 153 indicators. We used OpenLAIR, a Learning Analytics Indicator Repository that helps practitioners and educational researchers make informed decisions when selecting LA indicators for their course design or LAD, to search for relevant indicators for our dashboard (Ahmad et al., 2022).

From this extensive list of 153 indicators, we selected 33 for our study, because the data we harvested from the OER repositories is limited to the information on those OER presented in Sect. 2.1. Subsequently, we removed the indicators that focused on user behaviour and interaction, both because OER repositories usually do not track such data and because of the General Data Protection Regulation (GDPR) (Voigt & Bussche, 2017). Eventually, we were left with 13 indicators in total, as follows:

  1. Clickstream analysis (Park et al., 2017)

  2. Keystroke analytics (Casey, 2017)

  3. Resource usage awareness (Santos et al., 2013)

  4. Curriculum/Resource usage (Ferguson, 2012)

  5. Clustering/Distribution (Bogarín et al., 2014)

  6. Engagement and disengagement (Feild et al., 2018; Papoušek et al., 2016)

  7. Performance (resource/user) (Agnihotri et al., 2017; Aljohani et al., 2019; Iandoli et al., 2014; Park & Jo, 2015; Syed et al., 2019)

  8. Word count (Purday, 2009; Zancanaro et al., 2015)

  9. Ideal resources (Sadallah et al., 2015)

  10. Long-term engagement (Zhu et al., 2016)

  11. Authors’ self-reflection (Schumacher & Ifenthaler, 2018)

  12. Licence distribution (Europe PMC Consortium, 2015)

  13. Language distribution (Vogias et al., xxxx; OpenDOAR, xxxx)

The selection of these 13 indicators was based on our project requirements, scope, and the nature of the OER datasets. We further harvested the metrics or measurements for each indicator, since implementing the indicators required taking their metrics into consideration in the initial stages of the development of our LAD. These 13 selected indicators were examined further and adapted to our project use case. Our proposed dashboard consists of 16 indicators inspired by the 13 indicators listed and cited above.

3.2 Our Dashboard’s Indicators

Our proposed dashboard includes 16 indicators or visualisations in total. To organise these indicators, we divided the dashboard into three sections: (1) the upper panel, (2) the repositories panel, and (3) the keywords distribution panel.

3.2.1 Upper Panel of the LAD

The upper panel of our dashboard includes seven indicators (see Fig. 2) and shows the total number of resources overall and in each OER repository. This section aims to give users a quick understanding of the scale and size of the repositories. The seven indicators are Total OER, Total OER in Econis, Total OER in OpenClipArt, Total OER in UniWeimar, Total OER in ZOERR, Total OER in Lecture2Go, and Total OER in HOOU (see Fig. 2). The indicator Total OER is the sum of all the resources harvested and presented in our dashboard, while the remaining six indicators show the total number of resources in each individual repository.

Fig. 2 Upper panel indicators of the LAD (Total OER: 114,207; Econis: 100,000; OpenClipArt: 9,998; UniWeimar: 2,570; ZOERR: 866; Lecture2Go: 583; HOOU: 190)

To make these numbers more appealing and informative, we developed a sub-indicator that shows the percentage increase in the number of OER compared to the previous year. It is presented as a green arrow pointing up, with the percentage indicating the increase (see Fig. 2). The indicators in this panel were inspired by the works of Park et al. (2017), Casey (2017), Santos et al. (2013), Ferguson (2012), and Europe PMC Consortium (2015).
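As an illustration, the upper-panel totals and the year-over-year sub-indicator described above could be computed as in the following TypeScript sketch. It assumes each harvested record exposes a source repository and a publication date (as in the illustrative record shape in Sect. 2.1) and interprets the increase as the number of resources published in a year compared to the previous year; it is not our production code.

```typescript
// Sketch of the upper-panel counts and the year-over-year sub-indicator.
// Field names and the interpretation of "increase" are assumptions.
interface CountedRecord {
  sourceRepository: string;
  datetime: string; // publication date, e.g. "2020-05-17"
}

function totalsPerRepository(records: CountedRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    totals.set(r.sourceRepository, (totals.get(r.sourceRepository) ?? 0) + 1);
  }
  return totals;
}

// Percentage increase compared to the previous year, i.e. the value shown
// next to the green arrow in Fig. 2.
function yearOverYearIncrease(records: CountedRecord[], year: number): number {
  const publishedIn = (y: number) =>
    records.filter(r => new Date(r.datetime).getFullYear() === y).length;
  const previous = publishedIn(year - 1);
  const current = publishedIn(year);
  return previous === 0 ? 0 : ((current - previous) / previous) * 100;
}
```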

3.2.2 Repositories Panel of the LAD

In this section, we present six interactive indicators arranged in a three-by-two grid, showing different distributions of the OER repositories in various visualisations: OER licence distribution, OER language distribution, authors and publishers with the most publications, OER distribution by year, and OER distribution by type. The core idea of the repositories panel is to offer the user both an overview based on all resources and the ability to examine each visualisation in detail at the repository level, as studies (Eckerson, 2010; Kirk, 2016) suggest that users should be able to filter and receive more detailed information (see Fig. 3).

Fig. 3 OER repositories panel of the LAD

In Fig. 3, the donut chart is the key visualisation in our LAD. It represents the percentage of resources from each repository: the bigger the donut slice, the higher the number of resources from the given repository. Hovering over a slice shows the total number of resources in the corresponding repository. By clicking on a slice or on the legend at the top, the user switches the information source for the whole panel to the repository represented by that slice, thus examining, filtering, and obtaining repository-specific information (see Fig. 4). All five other visualisations are connected to the donut chart, so after a slice is clicked, the data in all five connected visualisations in the panel is updated automatically. The colours inside the panel change to the colour of the selected donut slice as a visual indication that the user is viewing data for a specific repository, and the titles of the visualisations change accordingly for clarity. The “Global summary” button at the top left of the donut chart lets the user return to the combined information from all repositories. Its light blue colour, contrasting with the pastel colours designated for the individual repositories, is used only for the combined information and never for a specific repository.
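The following TypeScript sketch illustrates how such a linked interaction can be wired with ApexCharts, the library used for these visualisations. The element ids, the /api/licences endpoint, and the response shape are assumptions for illustration only; the resource counts are taken from Fig. 2.

```typescript
import ApexCharts from 'apexcharts';

// Illustrative wiring of the key visualisation: clicking a donut slice
// filters the rest of the repositories panel.
const repositories = ['Econis', 'OpenClipArt', 'UniWeimar', 'ZOERR', 'Lecture2Go', 'HOOU'];

// One of the five connected charts (here: the licence distribution bar chart).
const licenceChart = new ApexCharts(document.querySelector('#licence-bar') as HTMLElement, {
  chart: { type: 'bar' },
  plotOptions: { bar: { horizontal: true } },
  series: [{ name: 'Global summary', data: [] }],
});
licenceChart.render();

const donut = new ApexCharts(document.querySelector('#repository-donut') as HTMLElement, {
  chart: {
    type: 'donut',
    events: {
      // Fired when the user clicks a slice; dataPointIndex identifies the slice.
      dataPointSelection: (_e: unknown, _ctx: unknown, config: { dataPointIndex: number }) => {
        void refreshPanel(repositories[config.dataPointIndex]);
      },
    },
  },
  labels: repositories,
  series: [100000, 9998, 2570, 866, 583, 190], // resources per repository (Fig. 2)
});
donut.render();

// Re-query the back-end and update the connected charts for one repository.
async function refreshPanel(repository: string): Promise<void> {
  const response = await fetch(`/api/licences?repository=${encodeURIComponent(repository)}`);
  const { labels, percentages } = await response.json(); // assumed response shape
  licenceChart.updateOptions({
    xaxis: { categories: labels },
    title: { text: `Licence distribution in ${repository}` }, // dynamic title
  });
  licenceChart.updateSeries([{ name: repository, data: percentages }]);
  // ...the other four visualisations are updated in the same way.
}
```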

Fig. 4 Repositories distribution donut chart (key visualisation) of the LAD, with Econis holding the largest share

In Fig. 5, a horizontal bar chart represents the distribution of licences in the UniWeimar repository; it is the result of clicking on the “UniWeimar” slice in the donut chart. At most, the top 10 licences are displayed. The chart’s x-axis shows the percentage relative to the number of resources in the given repository, and the y-axis lists the licences themselves. The number of ticks and labels on the x-axis is reduced for better readability on smaller screens. Hovering over a bar reveals a tooltip with the complete name of the licence and the percentage of resources tagged with that licence. All the repositories have a metadata element indicating whether a resource is under copyright or other restrictions, but only some of them include more specific information about the type of licence. Where available, this more detailed information is shown upon request; otherwise, the number of resources with/without a licence is shown.
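On the back-end, the top-10 licence distribution for a selected repository could be obtained with an Elasticsearch terms aggregation, as in the following sketch. It assumes the 8.x JavaScript client, and the index name and field paths are illustrative rather than our actual mapping.

```typescript
import { Client } from '@elastic/elasticsearch';

// Sketch of the computation behind Fig. 5; index and field names are assumptions.
const client = new Client({ node: 'http://localhost:9200' });

async function licenceDistribution(repository: string) {
  const result = await client.search({
    index: 'oer',
    size: 0,                    // only the aggregation is needed, not the documents
    track_total_hits: true,     // exact repository size for the percentage calculation
    query: { term: { 'sourceRepository.keyword': repository } },
    aggs: {
      licences: { terms: { field: 'copyright.licence.keyword', size: 10 } }, // top 10 at most
    },
  });

  const total =
    typeof result.hits.total === 'number' ? result.hits.total : result.hits.total?.value ?? 0;
  // Response typing simplified with a cast for brevity.
  const { licences } = result.aggregations as unknown as {
    licences: { buckets: { key: string; doc_count: number }[] };
  };

  // Convert absolute counts into percentages relative to the repository size.
  return licences.buckets.map(b => ({
    licence: b.key,
    percentage: total === 0 ? 0 : (b.doc_count / total) * 100,
  }));
}
```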

Fig. 5 Licence distribution in UniWeimar, shown in the LAD (“In Copyright” is the most frequent licence)

Furthermore, to enhance the user experience, we allow users to download a particular visualisation in SVG, PNG, or CSV format for sharing or closer inspection. To download a specific indicator, the user clicks on the menu button (highlighted in Fig. 5) at the top right to see the download options. The licence distribution indicator is inspired by the works of Santos et al. (2013), Bogarín et al. (2014), and Europe PMC Consortium (2015).

The indicator “language distribution” is visualised with a horizontal bar chart (see Fig. 6). This indicator presents the global summary of OER by language; Fig. 6 shows the ten most used languages in the German OER. Approximately 60% of the German OER are in German and 30% in English, which together account for roughly 90% of all the OER. In Fig. 6, the y-axis displays the ten most used languages, while the x-axis shows the percentage of resources. Each horizontal bar also displays the total number of OER that exist in the relevant language category, and hovering over a bar displays more in-depth information on that category. This indicator is inspired by the works of Ferguson (2012), Bogarín et al. (2014), Iandoli et al. (2014), and OpenDOAR (xxxx).

Fig. 6 Global summary of language distribution shown in the LAD

Furthermore, Fig. 7 presents the language distribution in the UniWeimar resource repository as percentages. This indicator results from clicking on the UniWeimar slice or the UniWeimar legend entry in the donut chart presented in Fig. 4.

Fig. 7 Language distribution in UniWeimar shown in the LAD (approximately 50% German and 45% English)

Figure 8 presents the OER distribution by type, visualised with a horizontal bar chart. This indicator represents the global summary of the resource types used in the repositories; as in the other visualisations in the repositories panel, light blue is always used for the global summary. The metadata element “learning resource type” is not present in all the harvested repositories; where it is missing, a “No Data” label is displayed. This indicator is the outcome of the findings of Schumacher and Ifenthaler (2018), OpenDOAR (xxxx), and Vogias et al. (xxxx).
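A small sketch of how a missing “learning resource type” element could be mapped to the “No Data” state is shown below; the type and function names are illustrative, not our actual implementation.

```typescript
// Sketch of the "No Data" fallback for repositories whose records lack the
// "learning resource type" element; names are illustrative.
interface TypeBucket { type: string; count: number }

type ResourceTypeView =
  | { kind: 'data'; buckets: TypeBucket[] }
  | { kind: 'noData' };           // rendered as the "No Data" label in the LAD

function resourceTypeView(buckets: TypeBucket[]): ResourceTypeView {
  // Repositories without the "learning resource type" element yield no buckets.
  return buckets.length > 0 ? { kind: 'data', buckets } : { kind: 'noData' };
}
```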

Fig. 8 OER resource types (books are the most frequent type)

Figure 9 is a similar example showing the distribution of OER resource types in the UniWeimar repository. The colour of the graph is based on the repository selected via the donut slice presented in Fig. 4, and the title of the visualisation is likewise dynamic and changes to the chosen repository. The indicators shown in Figs. 6, 7, 8, and 9 could help users decide in which repository to search for resources in a given language.

Fig. 9 OER resource types in UniWeimar

To provide authors and publishers with information on their published resources so that they can reflect on their performance, we developed an indicator with two tabs, one dedicated to authors and one to publishers (see Figs. 10 and 11). Figure 10 presents the ranking in tabular form, showing the top 15 authors with the highest number of published open educational resources, while Fig. 11 presents the top 15 publishers with the most published OER. By default, the indicator shows the list of top authors; to see the list of top publishers, one must click on the “publishers with the most publications” tab.

Fig. 10 Authors with the most publications (the author “j4p4n” ranks first)

Fig. 11 Publishers with the most publications (Duncker & Humblot ranks first)

Figures 12 and 13 present the top 15 authors and publishers with the most publications in the Econis repository. Like the other visualisations, the information is based on the donut slice selected. Some repositories, such as UniWeimar and Lecture2Go, do not provide information about the publishers; in that case, an empty table is displayed to the user. There is no colour indication of which repository is currently being displayed here; instead, the tab title at the top is dynamic and changes to the name of the selected repository. These indicators are inspired by the works of Aljohani et al. (2019), Syed et al. (2019), and Schumacher and Ifenthaler (2018).

Fig. 12 Authors with the most publications in Econis (the top author has 52 publications)

Fig. 13 Publishers with the most publications in Econis (Duncker & Humblot ranks first with 2,895 publications)

The final indicator of the “Repositories Panel” in the global summary is a line chart showing the number of resources published each year (see Fig. 14). Again, light blue colour coding is used for the global summary. Not all data points are shown on the x-axis in order to keep the chart readable, but the user can still zoom in and investigate smaller time periods. This indicator is fully interactive: the user can filter the time frame with the mouse cursor or with the zoom-in and zoom-out functions. Several options are also provided at the top right of the indicator (see Fig. 14), with which the user can zoom in, zoom out, select, and pan across the years from left to right or right to left; the home icon resets the view to its default. The indicator also provides the functionality of downloading the visualisation in the format of the user’s choice.
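The zoom, pan, reset, and download controls described above correspond closely to the ApexCharts toolbar and zoom options. The following sketch shows one way such a chart could be configured; the element id, the colour value, and the data points are placeholders, not values from our dataset.

```typescript
import ApexCharts from 'apexcharts';

// Sketch of the "resources published per year" line chart with zoom and
// toolbar controls; data values are placeholders for illustration only.
const yearlyCounts: [number, number][] = [
  [Date.UTC(1977, 0, 1), 1200], // [year as timestamp, number of resources]
  [Date.UTC(1978, 0, 1), 1850],
  [Date.UTC(1979, 0, 1), 2100],
];

const yearChart = new ApexCharts(document.querySelector('#year-line') as HTMLElement, {
  chart: {
    type: 'line',
    zoom: { enabled: true, type: 'x' },          // mouse zoom into a smaller time frame
    toolbar: {
      show: true,                                // download, zoom, pan, and reset ("home") icons
      tools: { download: true, zoomin: true, zoomout: true, pan: true, reset: true },
    },
  },
  xaxis: { type: 'datetime', tickAmount: 10 },   // fewer ticks for readability
  series: [{ name: 'Global summary', data: yearlyCounts }],
  colors: ['#7fc4fd'],                           // light blue for the global summary (hex value assumed)
});
yearChart.render();
```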

Fig. 14 Resource Distribution by Year shown in the LAD

Figure 15 shows a line chart similar to the one discussed previously. This indicator shows the number of resources published per year in the Econis repository; a dark grey line represents Econis, and a dynamic title indicates the selected source repository. This visualisation could be helpful for identifying trends in the number of publications at the repository level. This indicator is the result of the analysis of the studies by Bogarín et al. (2014), Feild et al. (2018), Santos et al. (2013), and Zhu et al. (2016).

Fig. 15 Resource Distribution in the Econis repository by Year shown in the LAD

3.2.3 Keywords Distribution Panel of the LAD

The third and final section is the keywords distribution panel. In this section, we included and developed indicators based on the “keywords” metadata element. Unlike the repositories panel, the data here is always based on the global summary. The indicators in this panel aim to provide an overview of the keywords used in the harvested repositories. After reviewing different methods, we decided to use a word cloud to represent the keywords in the LAD. In Fig. 16, the word cloud visualises the most used keywords in the selected OER datasets. Both colour intensity and size are encoded into the visualisation: the bigger the keyword, the darker its shade of blue, and the more central its position, the more often this keyword is found as a tag in the keyword metadata element. Hovering over a keyword reveals the exact number of times it occurs. In the current implementation, the word cloud shows the top 99 keywords; this was a design decision to avoid overcomplicating the visualisation rather than a technical limitation, and it can easily be changed to show a larger or smaller number of keywords. Further, we connected a line chart to the word cloud, which shows the usage trend of a selected keyword. By default, the trend for the most used keyword is shown. In the case depicted here, the line chart shows the trend for the keyword “Deutschland”, which first appeared just once in the year 1912, but whose number of uses rose to 1853 in the year 1979.
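The coupling between the word cloud and the trend chart can be sketched as follows. The word cloud itself (rendered with amCharts in our dashboard) is abstracted behind an onKeywordClick callback, and the /api/keyword-trend endpoint, its response shape, and the element id are assumptions for illustration.

```typescript
import ApexCharts from 'apexcharts';

// Sketch of the word cloud / line chart coupling; endpoint and ids are assumptions.
const trendChart = new ApexCharts(document.querySelector('#keyword-trend') as HTMLElement, {
  chart: { type: 'line', zoom: { enabled: true, type: 'x' } },
  xaxis: { type: 'datetime' },
  series: [{ name: 'Deutschland', data: [] }],   // default: trend of the most used keyword
});
trendChart.render();

// Called by the word cloud when the user clicks a keyword (e.g. "Frankreich").
async function onKeywordClick(keyword: string): Promise<void> {
  const response = await fetch(`/api/keyword-trend?keyword=${encodeURIComponent(keyword)}`);
  // Assumed response shape: [{ year: 1912, count: 1 }, ...]
  const trend: { year: number; count: number }[] = await response.json();
  trendChart.updateSeries([
    { name: keyword, data: trend.map(t => [Date.UTC(t.year, 0, 1), t.count]) },
  ]);
  trendChart.updateOptions({ title: { text: `Trend of keyword "${keyword}"` } });
}
```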

Fig. 16 A word cloud and a line graph depicting the top keywords shown in the LAD

Another example can be seen in Fig. 17, which presents the word cloud and a line graph for the keyword “Frankreich”. The word cloud allows the user to click on a keyword to see its trend: clicking on any keyword displays its trend on the line graph to the right, which depicts the number of times this keyword was found in publications from a given year. The user can zoom in and investigate a smaller time period. This visualisation could be useful for discovering growing interest in a specific area. It is the outcome of the results and analysis of the works of Purday (2009), Sadallah et al. (2015), and Zancanaro et al. (2015).

Fig. 17 A word cloud and a line graph depicting the keyword “Frankreich” shown in the LAD

Figure 18 shows the last indicator, presented as a stacked horizontal bar chart in which the user sees how the top keywords are distributed by location. The 99 keywords from the word cloud are paginated into pages of 10; the user can go through the pages one by one or jump directly to the first or last page. The idea is to give users an overview of where a resource containing a given keyword might be found. The colour coding of the elements in the chart is not repository-based but follows the order in which the repositories appear in the set of keywords.
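The pagination of the top keywords into pages of 10 can be sketched as follows; the helper and variable names are illustrative.

```typescript
// Sketch of the pagination described above: the top 99 keywords are split
// into pages of 10 for the stacked bar chart.
function paginate<T>(items: T[], pageSize = 10): T[][] {
  const pages: T[][] = [];
  for (let i = 0; i < items.length; i += pageSize) {
    pages.push(items.slice(i, i + pageSize));
  }
  return pages;
}

// 99 keywords -> nine pages of 10 keywords and one page of 9; the user can
// step through them one by one or jump to the first or last page.
const topKeywords: string[] = [/* top 99 keywords from the word cloud */];
const pages = paginate(topKeywords);
```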

Fig. 18 Keyword distribution by location shown in the LAD

The proposed dashboard is connected to a dynamic database server. Currently, given our project scope, we are limited to a small number of repositories, but the tool is flexible enough to accommodate newly added repositories. The LAD is accessible from any device, although accessing it from a desktop or laptop computer is recommended.

3.3 Technologies Utilised for Creating the LAD

This section discusses the technologies used and the system architecture (see Fig. 19). Our LAD is designed and developed for online access, so we primarily deployed technologies such as HTML5, Bootstrap, JavaScript, TypeScript, Angular, and various server-side languages and services. The technologies and services/processes are divided into three sections: (1) front-end, (2) back-end, and (3) data processing.

Fig. 19 Our system architecture (client, Heroku, and Amazon Web Services)

3.3.1 Front-End

The main driver of the front-end is Angular, a TypeScript-based web application framework developed and maintained by Google; according to the 2020 Stack Overflow Developer Survey, it is the third most used web framework (Stack Overflow, 2020). Our LAD consists of multiple visualisations and indicators, and we therefore used two different visualisation libraries: the word cloud visualisation (see Fig. 16) is created with amCharts,Footnote 7 while all other visualisations on the dashboard are built with ApexCharts.Footnote 8
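In the Angular front-end, the charts are fed by data retrieved from the back-end. The following sketch shows a minimal data service for one indicator; the endpoint path and response shape are assumptions rather than our actual API.

```typescript
import { Injectable } from '@angular/core';
import { HttpClient } from '@angular/common/http';
import { Observable } from 'rxjs';

// Illustrative shape of the data behind one indicator; not the project's API.
export interface LicenceDistribution {
  repository: string;                            // "Global summary" or a repository name
  licences: { licence: string; percentage: number }[];
}

// Minimal Angular service sketch: chart components subscribe to these
// observables and feed the results into ApexCharts/amCharts.
@Injectable({ providedIn: 'root' })
export class IndicatorService {
  constructor(private http: HttpClient) {}

  licenceDistribution(repository: string): Observable<LicenceDistribution> {
    // The /api/licences path is an assumption for this sketch.
    return this.http.get<LicenceDistribution>('/api/licences', { params: { repository } });
  }
}
```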

3.3.2 Back-End

To power the back-end of our LAD, we used NodeJS combined with ExpressJS. NodeJS is an open-source, cross-platform, back-end JavaScript runtime environment designed for real-time, push-based architectures (Cantelon et al., 2014). It is a popular server-side technology in the software developer community and is used by large technology companies such as PayPal, Netflix, and eBay.

Angular and NodeJS communicate through the ExpressJS middleware. When NodeJS receives a request through ExpressJS to send the data that feeds the visualisations to the client side, the request first reaches the caching server. This event flow can also be seen in our system architecture (see Fig. 19). To analyse and visualise the data effectively, it first has to be stored in a database. We used Elasticsearch to store our data and Amazon Web Services (AWS) to host and manage it. AWS is a secure cloud services platform and a collection of remote computing services that offer computing power, database storage, content delivery, and the hosting of dynamic websites and databases (Amazon, 2015). Elasticsearch is a powerful and scalable open-source engine that can search and analyse big datasets quickly (Gormley & Tong, 2015); it is used by Wikipedia, Stack Overflow, GitHub, Twitter, and others. As shown in Fig. 19, after receiving the raw data from Elasticsearch in response to the user/client request, NodeJS processes the data and sends it back to the Angular client, where it is displayed to the user.
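The following sketch illustrates this request flow (client, ExpressJS, cache, Elasticsearch) for the keyword-trend data. A simple in-memory map stands in for the caching server, and the index name, field names, endpoint, and port are assumptions for illustration.

```typescript
import express from 'express';
import { Client } from '@elastic/elasticsearch';

// Sketch of the back-end flow in Fig. 19: Express receives the client request,
// a cache is consulted first, and Elasticsearch is queried only on a cache miss.
const app = express();
const es = new Client({ node: process.env.ELASTICSEARCH_URL ?? 'http://localhost:9200' });
const cache = new Map<string, unknown>();       // stand-in for the caching server

app.get('/api/keyword-trend', async (req, res) => {
  const keyword = String(req.query.keyword ?? 'Deutschland');
  if (cache.has(keyword)) {
    res.json(cache.get(keyword));                // cache hit: Elasticsearch is skipped
    return;
  }

  const result = await es.search({
    index: 'oer',
    size: 0,                                     // only the aggregation is needed
    query: { term: { 'keywords.keyword': keyword } },
    aggs: { perYear: { date_histogram: { field: 'datetime', calendar_interval: 'year' } } },
  });

  // Process the raw buckets on the server so the client receives chart-ready data.
  const { perYear } = result.aggregations as unknown as {
    perYear: { buckets: { key_as_string: string; doc_count: number }[] };
  };
  const trend = perYear.buckets.map(b => ({
    year: new Date(b.key_as_string).getFullYear(),
    count: b.doc_count,
  }));

  cache.set(keyword, trend);
  res.json(trend);
});

app.listen(3000, () => console.log('LAD back-end listening on port 3000'));
```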

3.3.3 Data Processing

After NodeJS fetches the data from Elasticsearch, it must be processed into the proper format for loading into the charts. Doing this processing on the back-end saves computational cost on the user’s device. The data we obtained was not ready to use as is, so we had to analyse and transform it into a form the indicators can consume. For example, metadata fields such as “language” and “copyright and other restrictions” cannot be visualised in our LAD in their raw form; preprocessing and transformation of the data are required. The following is an example of how we solved such issues:

The “language” metadata field contains the languages tagged on a resource. The values follow ISO 639-2/T, i.e., a three-letter lowercase code describing each language. To show the user the full language name instead of the ISO 639-2/T code, the language values are filtered after the data is received from Elasticsearch. We wrote a Python script that scrapes a Wikipedia pageFootnote 9 and produces a JSON object containing 129 elements, whose keys are ISO 639-2/T codes and whose values are the full language names. We then placed the resulting JSON in a JavaScript file (utils/languages_filter.js) inside the project folder. All JSON objects that contain language values pass through this filter, which replaces the code values with the full language names.
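The following TypeScript sketch illustrates the effect of this filter with a truncated mapping; in the project, the full 129-entry mapping generated by the Python script lives in utils/languages_filter.js.

```typescript
// Sketch of the language filter described above; only a few of the 129
// ISO 639-2/T entries are shown here.
const languageNames: Record<string, string> = {
  deu: 'German',
  eng: 'English',
  fra: 'French',
  spa: 'Spanish',
  // ...remaining ISO 639-2/T codes omitted
};

// Replace the three-letter codes in a record's language array with full names,
// leaving unknown codes untouched.
function withFullLanguageNames<T extends { language: string[] }>(record: T): T {
  return {
    ...record,
    language: record.language.map(code => languageNames[code] ?? code),
  };
}

// Example: { language: ["deu", "eng"] } becomes { language: ["German", "English"] }.
```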

4 Conclusions, Future Work, and Limitations

In this chapter, we proposed an LAD to assist students, researchers, and teachers in their interactions with OER by displaying useful information according to the users’ needs. The LAD is the outcome of a literature review in which we identified appropriate indicators and implemented them in our dashboard; the research outcomes of various publications inspired this work, resulting in a research-driven LAD. The technical implementation is grounded in established technologies that scale easily to a significantly larger amount of metadata. We therefore plan to extend the dashboard to support a larger number and variety of OER sources, such as MIT OpenCourseWare and Khan Academy, and the technology stack used in this work lends itself as a stepping stone for further development. This study has three main limitations. First, our proposed dashboard is not the only possible approach to providing sophisticated data visualisation. Second, the selected sample of publications was limited, since we mostly focused on tool- or LAD-specific papers; we therefore recognise that we might have missed indicators and metrics designed for other purposes that could have been included in our review and proposed dashboard. Third, there could be a small margin of error in the data harvesting due to human lapses or slips.