1 Digitization in Educational Institutions

School as we know it has barely changed since its invention in the 17th century.Footnote 1 Tests on paper, use of blackboards, lessons with textbooks: despite all methodological and pedagogical reforms, there has long been no sign that these basics would ever change. However, classrooms and lecture halls have undergone gradual changes in recent years. Nowadays, smart boards can be found in many schools, lectures are accompanied by complementary online services,Footnote 2 and students may participate by using real-time feedback and voting apps.Footnote 3 Even most preschool children are already familiar with tablet computers,Footnote 4 smartphones and the like.

Against this background, the whole learning process is expected to undergo significant changes. New technologies—particularly digitization—will most likely transform the educational sector, i.e. schools and universities, considerably.

2 The Future of Education—Predictions and Benchmarking

For decades, if not centuries, individual learning behavior was measured by exams, grades, credit points, or certificates. Nowadays, numerous technologies allow for far more sophisticated insights of a rather informal nature. For example, detailed data can be retrieved not only for individuals (i.e. a single student), but also for groups (e.g. classes) or specific clusters (e.g. multiple institutions in a school district).

  • For instance, when using an e-learning platform, teachers can see how often their latest slides have been downloaded. At the same time, they can retrieve statistics on how long students were logged in and what they were doing during that time.Footnote 5

  • When using massive open online courses (so-called MOOCs), it is possible to analyze each and every participant’s clickstream. That allows conclusions to be drawn about the individual’s learning behavior and her/his potential shortcomings.Footnote 6

  • Where tablets, smartphones and e-books replace conventional textbooks, we can trace the pace at which a student reads and which passages s/he read to prepare herself/himself for the exam. We can even figure out whether s/he actually completed the compulsory reading at all.

These examples show that new technologies and concepts, such as integrated learning (so-called blended learning),Footnote 7 generate huge volumes of data, which might relate to the learning behavior of individual students, the learning progress of an entire class or the success of a teaching concept. However, gathering the data is only the first step. The real promises (and challenges) of education 2.0 lie in the second step: linking and analyzing the data.
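How the same activity data can be read at the level of an individual, a class, or a larger cluster can be illustrated with a minimal sketch. The log format, the field order, and all names and values below are hypothetical and chosen only for illustration; real platforms record far richer data.

```python
from collections import defaultdict

# Hypothetical activity log: (student, class, district, event, minutes).
# Every name and value here is invented for illustration.
EVENTS = [
    ("ana", "7a", "north", "slide_download", 0),
    ("ana", "7a", "north", "reading", 25),
    ("ben", "7a", "north", "reading", 5),
    ("cem", "8b", "north", "reading", 40),
    ("dia", "3c", "south", "slide_download", 0),
    ("dia", "3c", "south", "reading", 15),
]

# Position of the grouping key inside each record.
LEVELS = {"student": 0, "class": 1, "district": 2}


def reading_minutes(events, level):
    """Sum reading time per student, per class, or per district."""
    key_index = LEVELS[level]
    totals = defaultdict(int)
    for record in events:
        if record[3] == "reading":
            totals[record[key_index]] += record[4]
    return dict(totals)


def download_count(events):
    """Count how often slides were downloaded."""
    return sum(1 for record in events if record[3] == "slide_download")
```

Calling `reading_minutes(EVENTS, "class")` groups the same records by class instead of by student, which is the sense in which one dataset serves several levels of observation.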

3 Educational Data Mining and Learning Analytics

For this purpose, techniques such as educational data mining (EDM) and learning analytics (LA) come into play. The first task is to properly organize all information that has been gathered in the course of a completely digitized learning process. That usually involves two types of information: structured and unstructured data. While structured data includes information such as a learner’s IP address or her/his username, unstructured data may comprise texts from internet forums, video clips, or audio files. Beyond that, there is so-called metadata (e.g. activity data or content-related linkage).
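To make the three categories more concrete, the following sketch sorts raw items into structured data, unstructured data, and metadata for each learner. The field names are hypothetical, and treating every unknown field as metadata is a simplifying assumption of this example, not a rule taken from the EDM literature.

```python
from collections import defaultdict

# Hypothetical field names; the categories follow the distinction in the text.
STRUCTURED = {"username", "ip_address"}
UNSTRUCTURED = {"forum_post", "video_clip", "audio_file"}


def categorize(kind):
    """Assign a field to one of the three categories."""
    if kind in STRUCTURED:
        return "structured"
    if kind in UNSTRUCTURED:
        return "unstructured"
    # Simplifying assumption: anything else counts as metadata.
    return "metadata"


def organize(raw_items):
    """Group raw (learner, kind, value) items per learner and category."""
    profiles = defaultdict(lambda: defaultdict(list))
    for learner, kind, value in raw_items:
        profiles[learner][categorize(kind)].append((kind, value))
    return {learner: dict(groups) for learner, groups in profiles.items()}
```

The result is one record per learner in which scattered items are placed in context, regardless of the system that originally produced them.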

First, educational data mining allows relevant information to be extracted, organized and put into context, regardless of its original function. That is a crucial step in preparing the data for further analysis.Footnote 8 In this regard, educational data mining resembles what is known as commercial data mining. Commercial data mining describes the systematic processing of large datasets in order to gain new, particularly economic, insights.Footnote 9 It is often used in financial or industrial contexts.Footnote 10

Subsequently, learning analytics seeks to interpret the collected data and draw conclusions from it.Footnote 11 The underlying idea is to optimize the individual learning process by exploiting the raw data provided. This goes beyond a comprehensive visualization and reproduction of past learning behavior: the aim is to predict future learning behavior. This process is called predictive analytics.Footnote 12 It makes it possible, for instance, to detect and tackle individual learning deficits at an early stage in order to prepare a student for the next assignment.
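The idea of predicting future learning behavior from past data can be sketched with a deliberately simple stand-in: a least-squares trend line fitted to a student's past scores and extended by one step. This is an illustrative assumption, not the method used by any particular learning analytics system; the function names and the threshold of 55 points are hypothetical.

```python
def fit_trend(scores):
    """Fit a least-squares line through scores taken at steps 0, 1, 2, ..."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next(scores):
    """Extrapolate the trend line to the next, not yet taken, assignment."""
    slope, intercept = fit_trend(scores)
    return intercept + slope * len(scores)


def needs_support(scores, threshold=55.0):
    """Flag a student whose predicted next score falls below the threshold."""
    return predict_next(scores) < threshold
```

For a student whose scores fall from 80 to 70 to 60, the trend line predicts 50 for the next assignment, so `needs_support([80, 70, 60])` flags the student before the assignment takes place.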

4 Stakeholders

Who benefits? On the one hand, students and teachers, of course. Teachers can not only retrieve summarized reports for entire classes but also track the learning behavior of individual students. This allows them to respond at an individual level and with measures that are tailored to the student’s particular needs. Lecturers, too, are able to receive in-depth and real-time feedback regarding their teaching behavior and skills.

In addition, educational data might be useful for scientific or administrative purposes. One might want to evaluate institutions, lecturers, curricula or student profiles, for example. When it comes to structural reforms, educational data might also come in handy for political actors.Footnote 13

Last but not least, there are quite a number of economic players. As a matter of fact, it is not only software developers or hardware manufacturers that long to know more about the use of their products. Schoolbook publishers or market actors that provide learning-related services such as private lessons, coaching or retraining are among the interested parties, too. Employment agencies and HR departments are certainly interested in educational data as well.

5 Data Sources and Data Protection

While private sector companies discovered big data as an emerging technology some time ago (think of buzzwords like industry 4.0 or the internet of things), in education big data does not yet seem to be considered worthy of discussion. This is quite surprising given that students, teachers, and lecturers generate a considerable amount of data. That, in turn, raises the question: Where does educational data come from?

Because the US educational system has been more liberal in implementing new technologies, it offers some answers. Teachers in the US increasingly rely on so-called classroom management systems (CMS) or mobile apps to organize their classes. However, very few of these applications are actually tested and/or approved by supervisory authorities. As a result, the use of educational software is hardly regulated. Besides, many apps lack common IT security standards. From a quality point of view,Footnote 14 very few apps guarantee sufficient accuracy standards.

Apps are just one source of educational data. Additional information might come from e-learning platforms, laptops and tablets (sometimes provided by schools) or student IDs with RFID functionality.Footnote 15 Data from social network services (SNS) might be involved as well.Footnote 16 A common problem with these sources relates to unauthorized access. In some cases, third parties are likely to have access to more data than schools and universities have. At least, that is what one provider of adaptive learning systems claims.Footnote 17

After all, it is the student who provides most educational data.Footnote 18 That is crucial since s/he is often a minor and is therefore usually protected by specific provisions for minors. As s/he (involuntarily) discloses personal and highly sensitive data, privacy activists fear the Orwellian “transparent student”.

Against this background, it is astonishing that big data in education is not yet a controversial topic in Europe, particularly since European jurisdictions have considerably higher standards regarding privacy and data protection. In Germany, for instance, there are not only supranational (GDPR) and federal provisions (BDSG) but also school-specific regulations at state level (e.g. sections 120–122 SchulG NRW).Footnote 19

6 Summary and Challenges

In a nutshell, big data promises revolutionary changes in education. It is true, as Slade and Prinsloo point out, that “[h]igher education cannot afford to not use data”.Footnote 20 However, the difference between the US educational system and European (particularly German) schools and universities results not only from a different level of technological implementation but also from different standards of privacy legislation.

Since educational big data technologies are still in their infancy in Europe, all relevant stakeholders should take the chance to enter into a joint discussion as early as possible. Such a dialogue should focus on a critical reflection of promises and risks. First and foremost, it needs to take into account aspects like privacy, data protection, transparency and individual freedoms. After all, tracking and analyzing an entire educational career has unforeseeable implications for both individuals and society as a whole.Footnote 21