According to the United Nations Educational, Scientific and Cultural Organization (UNESCO), global investment in Open Educational Resources (OER) is needed to improve access to education. The term OER was coined at the 2002 UNESCO Forum on the Impact of Open Courseware (UNESCO, 2011). Open education is anchored both in the history of new education and in the experiences of free schools, such as the Summerhill School in England. The focus has since moved to promoting the use of OER, that is, from resources to practices: providing these resources is not enough to ensure that they are used (D’Antoni, 2009).

By definition, open educational resources are educational materials, in the form of presentations, videos, etc., accessible through the Internet under an open license. A well-known example of an organization that develops open educational resources is Khan Academy, established in 2006. It is a non-profit organization whose mission is to provide free education for all around the world (Khan Academy, 2006).

The demand for training is increasing in higher education. For example, in Africa, the number of students increased from 200,000 in 1970 to 5,000,000 in 2014, an increase of 9% per year (Bateman, P, & Moon, 2012). We can therefore speak of both a growth in numbers and a massification of the student population.

More recently, another movement close to OER has emerged: the Massive Open Online Course (MOOC), one of the modalities of e-learning. A MOOC is a course distributed digitally. It was one of the dominant events in the world of higher education in 2012 (Caramel, 2015). However, in massive open online courses, open learning has raised new challenges. For instance, the problem of learner retention in online courses, and in MOOCs in particular, should be carefully considered (Koller, Ng, & Chen, 2013). The data analysis and observations of (Khalil et al., 2014) identify the most significant factors behind the high attrition rate of MOOCs: lack of time, lack of learner motivation, lack of interactivity in MOOCs, and lack of knowledge and skills.

Each learner is unique, has individual potential and learns differently, including by learning from his/her peers. A common characteristic of educational games is that they allow learners to be active, reflective and engaged. Furthermore, educational games allow learners to learn individually or in groups. They also make it possible to collect rich traces about the learners, which help in the process of learner modeling and course personalization.

The massive attribute is common to MOOCs and massive games. In MOOCs, many thousands, even tens or hundreds of thousands, of learners can register. But beyond the numbers, it is a new experience that is offered to MOOC participants, just like the massive character of some online games (such as the famous World of Warcraft), which makes it possible to observe player behaviors such as emulation and team building. Also, the relationship between MOOCs and educational games has revolutionized the contacts and relationships between students and the way information and knowledge are recommended. The massive dimension of MOOCs makes it possible to develop mutual aid, which allows some learners to learn better by helping their peers, by asking questions more freely, or by solving together a puzzle that allows everyone to progress in their learning.

The use of open educational games may be a solution to enhance learner motivation and interactivity in MOOCs. Educational games also help learners practice solving many problems, such as arithmetic, sorting and searching. Computers can even be programmed to have superhuman problem-solving abilities (Amory et al., 1999) and to play some board games better than any human being. For most learners, computer games have become a major part of their lives. Using games for learning would therefore be a good way to encourage learners to learn while having fun. In this paper, we focus on the engineering side: creating games that can be applied in MOOCs and that help learners learn.

The question we ask here is: how can we optimize the retention rate in MOOCs? This question leads to two sub-questions: (1) What is the optimal personalization strategy in MOOCs? (2) How can educational games be used in MOOCs to enhance the learners’ retention rate?

This study examines the problem of students who fail their MOOCs and provides strategies that can be implemented to increase the overall retention rate of students. Our objective is to automatically select the appropriate personalization strategy in MOOCs. Our approach aims to improve the retention rate by personalizing learning according to that strategy. To achieve this goal, we use the K-means algorithm to classify learners based on the traces they produce when they play open (open source) educational games. Educational games have a high level of interactivity and make it possible to collect rich traces about the players. This allows the appropriate strategy for personalizing MOOCs to be selected for each group of learners, which minimizes the number of student groups and reduces the complexity of combining personalization parameters.

The rest of the paper is organized as follows. The Related works section contextualizes the contribution by analyzing research on e-learning personalization. The Proposed solutions for optimizing the personalization strategy section introduces our approach for selecting the appropriate personalization strategy in MOOCs. The Validation method section presents the experimentation method, the participants, the procedure and the instruments. The Results and interpretation section describes and discusses the experimental results. Finally, the Discussion, conclusion and future research section presents the conclusion and gives an overview of future work.

Related works

This section presents work related to e-learning personalization and MOOCs. It also explains e-learning personalization.

MOOCs personalization

In general terms, personalization means changing the behavior and characteristics of a system according to the user who interacts with it. The customization proposed to a specific user is based on his/her profile. The profile of a user contains information that characterizes him/her and is the instantiation of the user model. Personalization in MOOCs is a well-established research topic that is becoming increasingly important. Several definitions and explanations have been proposed to describe personalization in the educational setting. In (Verpoorten et al., 2009), personalization is defined as the automatic structuring of learning paths to meet the needs of the learner. In particular, (Baguley et al., 2014) describe individual learning as the adaptation of pedagogy, curricula and learning environments to meet the learning needs and styles of each learner. The personalization of learning environments aims to change the traditional teacher-centered perspective into a learner-centered one. Klašnja-Milicevic et al. (2017) define the aspects that can be personalized in a learning environment: (1) the content delivered to learners during the learning process, (2) the presentation and order in which the content is presented and (3) the method used to evaluate learners.

E-learning and MOOCs personalization approach

Researchers have presented several approaches to e-learning and MOOC personalization (Essalmi, Ayed, Jemni, Graf, et al., 2015). Table 1 presents examples of personalization approaches in e-learning and MOOCs.

Table 1 Examples of personalization approaches for the E-learning and MOOCs

There are other approaches that provide learners with recommendations in MOOCs. This is the case, for example, of the solution presented by (Gutiérrez-Rojas, 2015) to help learners achieve their learning objectives in MOOCs. The learner model contains the MOOCs already followed by the learner. This information is retrieved by asking the learner explicitly. The recommendation is based on the assessments that have been assigned in the current MOOC and on a similarity calculation with the other MOOCs. In (Iftene, 2016), a recommendation system is proposed in the form of a conversational agent that recommends MOOCs.

Some personalization strategies have been identified to increase retention rates in MOOCs and online learning. These research works aim to improve the quality of open education. In this paper, we examine the problem of learners who fail their MOOCs and propose an optimized strategy that can be implemented to increase the overall retention rate of learners.

E-learning personalization strategy

The e-learning personalization strategy involves students in deciding their own learning process, as discussed below. This teaches students vital skills that will serve them throughout their lives. For example, sharing in goal-setting helps students develop motivation and reliability, and engaging in self-assessment helps them develop self-reflective abilities. In addition, personalized learning strategies are the ways that learners use to acquire, integrate and remember the knowledge they are taught (Essalmi et al., 2010).

There is a large number of possible personalization strategies (Essalmi et al., 2010). For example, suppose that a course includes N concepts and that each personalization parameter includes K different learner characteristics (Essalmi, Ayed, & Jemni, 2007). In this case, the teacher responsible for the course must prepare N * K learning scenarios if he/she considers only one personalization parameter. Take the example of the personalization parameter Felder–Silverman learning style, which includes the learner characteristics {Sensing / Intuiting, Visual / Verbal, Active / Reflective and Sequential / Global} (Felder & Silverman, 1989). For N = 19 and K = 4, this gives 19 * 4 = 76 different learning scenarios.
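The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `count_scenarios` and the constant `FSLSM_DIMENSIONS` are hypothetical names introduced here, and the count covers a single personalization parameter, as in the example.

```python
# Hypothetical helper illustrating the N * K count for ONE personalization parameter.
FSLSM_DIMENSIONS = [
    "Sensing / Intuiting",
    "Visual / Verbal",
    "Active / Reflective",
    "Sequential / Global",
]


def count_scenarios(n_concepts: int, n_characteristics: int) -> int:
    """Return the number of learning scenarios a teacher must prepare (N * K)."""
    if n_concepts < 0 or n_characteristics < 0:
        raise ValueError("counts must be non-negative")
    return n_concepts * n_characteristics


if __name__ == "__main__":
    # N = 19 concepts, K = 4 characteristics of the learning-style parameter.
    print(count_scenarios(19, len(FSLSM_DIMENSIONS)))
```

Even with one parameter the teacher's workload grows linearly with the number of concepts, which is why the selection of parameters matters for large courses.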

The problem of selecting the personalization parameters becomes more complex as the number of students in MOOCs increases. In the same context, according to (Essalmi et al., 2015), a multi-parameter personalization approach is needed to combine several personalization parameters, including the pedagogical approach. This multi-parameter combination, together with the massive number of learners, represents an important issue in MOOCs.

Parameters of personalization

Our purpose is to personalize MOOCs in order to motivate students and enhance their attention. A large number of personalization systems use different personalization parameters (Chorfi & Jemni, 2004; Essalmi et al., 2007, 2010, 2015). Each of them aims to generate a personalized course according to a set of learner characteristics.

Table 2 presents a set of personalization parameters as well as their potential values reported in the literature and used by teachers for personalizing their courses.

Table 2 Examples of values for the personalization parameters (Essalmi et al., 2010)

Proposed solutions for optimizing the personalization strategy

This section presents our architecture for optimizing the selection of the personalization strategy in MOOCs. It also presents the clustering algorithm used to apply our approach.

MOOCs personalization system

In this sub-section, we present our architecture for the MOOCs personalization strategy. The presented architecture focuses on the analysis of the personalization strategy.

Figure 1 presents a general view of the personalization strategy system. It contains a simple user interface for communicating with the teacher. The system requests information from the Database component when needed. The Database stores all the information about the traces, which are used to update the user profile. Then, the classification algorithm is applied and the appropriate personalization strategy is generated.

Fig. 1 Proposed architecture of the MOOCs personalization strategy

The principle shown in Fig. 1 is simple: learners play open educational games (free to use and modifiable, like open source software), during which all their actions are traced. Thanks to the traces of the learners’ interactions with the platform, the learners’ profiles can be generated. Our application, based on the k-Means algorithm, classifies the learners and selects the appropriate personalization strategy. It is then possible to automatically determine new activities and new paths for each learner, according to the information contained in his/her profile. The cycle can then begin again, since new traces are generated when the learners play new educational games. For each learner, the system generates activities that correspond to his/her characteristics (learner profile).

The objective of our approach is to automatically optimize the personalization strategy for each learner. One of the challenges that MOOCs will have to meet is providing personalized paths so that learners can better follow the course. To respond to this major issue of personalization in the field of MOOCs, we propose a complete model of MOOC personalization. Our approach is based on following up the learners and analyzing their traces in order to build a new learning path for each student.

In the first step, the learners play the educational game. Each player must log in to the application with his/her own login and password. Thereafter, the players play one of the games provided by the system, such as the Pacman mini-game versions (action and puzzle). The players can then respond to the multiple-choice questions (MCQ) presented in the game. Traces, such as the player’s choices in each game and his/her path, are collected and recorded in a MySQL database connected to these mini-games.

The second step is the classification of traces. We apply the unsupervised K-Means algorithm, which classifies the players into groups based on the similarity of their traces. The classification results are visualized as a graph. Players with common characteristics are assigned to the same group and players with different characteristics are placed in different groups.

The process of optimizing the personalization strategy in MOOCs is presented in Fig. 2.

Fig. 2 Process of personalization strategy optimization in MOOCs

System of trace extraction

In the context of the main question addressed by this paper (How can we optimize the retention rate in MOOCs?), our first sub-question is: what is the optimal personalization strategy in MOOCs? To answer this sub-question, our contribution consists in proposing an approach based on the analysis of the learner’s traces for recommending courses that take into consideration both the learner’s characteristics and his/her preferences. Trace-based systems have been used as the main building block in a recommendation model (Develay, 1996), which makes it possible to use all the information collected on a learner to show him/her the activities to follow in order to succeed in the learning process. A good recommendation also requires a good definition of the learner’s profile (Salomon, 1992).

K-means algorithm and its application on traces of open educational games

In unsupervised classification algorithms, unsorted information is grouped according to similarities and differences even though no categories are provided. Partitioning data is an important task in data analysis; it divides a set of data into several subsets, which are called clusters (Ng, Jordan, & Weiss, 2002). In the clustering problem (Larose, 2005), a set of unlabeled data is given and the algorithm is expected to group it automatically into coherent subsets.

K-means, defined by MacQueen, is one of the simplest automatic data classification algorithms and the most widely used clustering algorithm (MacQueen, 1967). The main idea is to choose a fixed number of centers at random and to seek the optimal partition iteratively. Each individual is assigned to the nearest center. Once all the data have been assigned, the average of each group is calculated and becomes the new representative of that group. The groups are ideally characterized by a strong internal similarity and a strong dissimilarity between the members of different groups (Kogan, 2007). When a stable state is reached and no data point changes group, the algorithm stops. The k-means algorithm is popular due to its simplicity and its ability to process large datasets (Kogan, 2007). The pseudo code of the K-means algorithm applied to the classification of students based on their traces in open educational games is as follows:

figure a
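As an illustration of the steps described above (random choice of centers, assignment to the nearest center, recomputation of the averages, stop when no group changes), a minimal Python sketch might look as follows. It is not the authors' implementation: the function name `kmeans`, the optional `initial_centers` argument and the two-feature trace vectors in the usage example are assumptions made here for illustration.

```python
import math
import random


def kmeans(points, k, initial_centers=None, max_iter=100, seed=0):
    """Group trace vectors into k clusters; return (labels, centers)."""
    if initial_centers is None:
        # Choose k distinct learners at random as the initial centers.
        rng = random.Random(seed)
        centers = [list(p) for p in rng.sample(points, k)]
    else:
        centers = [list(c) for c in initial_centers]

    labels = None
    for _ in range(max_iter):
        # Assignment step: each learner is assigned to the nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Stable state: no learner changed group, so the algorithm stops.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center becomes the average of its group.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty group keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


if __name__ == "__main__":
    # Hypothetical traces: (normalized score, normalized play time) per learner.
    traces = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
    labels, centers = kmeans(traces, k=2)
    print(labels)
    print(centers)
```

Because the initial centers are drawn at random, different seeds can label the groups differently; passing `initial_centers` makes a run reproducible.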

In addition, (Ruwet & Haesbroeck, 2012) studied the classification efficiency of the k-means rule with respect to that of logistic discrimination. Figure 3 presents a simple example of k-means (red points are centroids, and blue points are students):

Note that, to make the k-means algorithm applicable to our classification, we adapted it with some modifications during implementation. The main idea illustrated in Fig. 3 is to set k to the number of characteristics, fix a set of k centers a priori, and search iteratively for the optimal partition. Each student is assigned to the nearest center. Once all the data have been assigned, the average of each group is calculated and becomes the new representative of that group. The algorithm stops when it reaches a stable state in which no group changes.

Fig. 3 Example of results of the k-means algorithm

The implementation is divided into two parts. In the first part, two open educational games were implemented. The first game is an educational version of the Pacman game in the form of a puzzle. It lets the learners use drag and drop in an amusing way to construct a program step by step. The second game is a learning version of the Pacman game (Khenissi et al., 2012). It is considered an action game: it keeps the learners moving and involved in order to steer Pacman correctly. The learners’ choices and paths in the two games are automatically saved in a database. Thereafter, the learners answer the Felder-Silverman Learning Style Model (FSLSM) questionnaire, which contains questions corresponding to each of the four dimensions of the FSLSM. The aim of the questionnaire is to determine the learning style preferred by each learner.

The second part is the implementation of the k-means algorithm for the student classification based on their traces collected during the use of the educational games.

Figure 4 presents the main menu interface of the web application which is available in, where several options are displayed according to the learners’ preferences, such as playing games or answering multiple-choice questions (MCQ). The display of the learner groups can be used only by teachers.

Fig. 4 Menu of the Web application for playing open educational games

The first part presents the game scenario. The goal of this game is to teach the SQL database language through questions for which the player chooses exactly one answer. When Pacman eats a gold kiwi, a question is displayed. The player answers the question and the correct answer is then displayed.

In the second part, when the player finishes the game, he/she answers the questions and his/her traces are saved in the database.

When a learner is playing an open educational game, his/her traces are tracked. Currently, we have two open educational games.

  • The first one is an educational version of the PacMan action game, which is presented in Fig. 5.

  • The second is an educational version of the PacMan puzzle game, which is presented in Fig. 6.

Fig. 5 Educational version of PacMan action game

Fig. 6 Educational version of PacMan puzzle game

The action game, PacMan 1, is an advanced Pacman game composed of three levels.

The simple game, PacMan 2, is the classic Pacman game to which we added some learning scenarios.

Game integration in MOOCs is increasingly supported by researchers (Huangxing Zeng, 2016), who recognize that games can inspire learners, increase productivity and foster creativity. A clever integration of games into MOOCs is expected to bring significant innovation and development to computer-assisted teaching.

In addition, the online educational game is developed as an educational website that uses game elements in MOOCs to make it more interactive for users. Therefore, implementing game elements in MOOCs increases user motivation, improves engagement during the learning process, and allows users to use the MOOCs for a long time.

Validation method

To verify the validity of the proposed solution, we conducted the following experiment.

Participants and procedure

Fifty-seven learners from the Raccada secondary school in Kairouan, studying computer science, participated in the experiment. They came from different educational levels (2nd-year and 3rd-year secondary of the computer science branch) and were between 15 and 18 years old. They were randomly assigned to the control group and the experimental group: 27 learners (17 girls, 10 boys) in the control group and 30 learners (19 girls, 11 boys) in the experimental group. The learners of both groups were asked to answer a pre-test given by the same teacher using paper and pencil. The pre-test was composed of questions testing the learner’s knowledge. Then, the control group learned through the traditional method (reading) and the experimental group learned through the learning games. Finally, a post-test evaluating the students’ level of knowledge was administered.

It is important to mention that the experiment faced a major equipment problem: the classroom computers did not work properly and there was no internet connection to play the game online. Under these circumstances, we were obliged to lend our own laptop to the students, who used it one by one, which cost precious time and energy. In addition, the experiment was conducted in a high school, in the field of computer science. Since most existing MOOCs are university-level courses, we adapted the experiment to the secondary level by introducing the MOOC and giving the learners a quick explanation of it.

To verify the validity of the hypothesis that the classification of learners minimizes the complexity of the personalization parameters problem, we used the following instruments:

We used pre- and post-tests that focus on verifying the students’ level of knowledge. After answering the pre-test, the 30 students of the experimental group played the educational games. The 27 students of the control group learned from a text containing the same information as presented in the games.

The results of the post-test let the research team know whether or not the MOOCs, as well as the learning process, are useful. Our research team has the following two hypotheses:

  • H1: The proposed process of personalized learning with MOOCs makes students learn better.

  • H2: The traditional learning makes students learn better.

To identify the appropriate personalization parameter, learners had to answer the Index of Learning Styles (ILS), a questionnaire containing 44 questions, 11 for each of the four dimensions of the Felder-Silverman learning style model (FSLSM). The ILS is a validated questionnaire presented in the literature (Felder & Silverman, 1989).
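A minimal sketch of how the 44 ILS answers could be turned into one score per dimension is given below. The paper does not describe the scoring, so everything here is an assumption: each question is taken to have two options ("a" and "b"), questions are assumed to cycle through the four dimensions in order, and the score of a dimension is assumed to be the number of "a" answers minus the number of "b" answers. The names `DIMENSIONS` and `score_ils` are hypothetical.

```python
# Assumed order of dimensions: question i (1-based) maps to DIMENSIONS[(i - 1) % 4].
DIMENSIONS = [
    "Active / Reflective",
    "Sensing / Intuiting",
    "Visual / Verbal",
    "Sequential / Global",
]


def score_ils(answers):
    """Return a signed score per dimension from 44 answers, each 'a' or 'b'."""
    answers = list(answers)
    if len(answers) != 44 or any(a not in ("a", "b") for a in answers):
        raise ValueError("expected 44 answers, each 'a' or 'b'")
    scores = {}
    for d, name in enumerate(DIMENSIONS):
        picks = answers[d::4]  # the 11 answers assumed to belong to this dimension
        # Positive score leans toward the first pole, negative toward the second.
        scores[name] = picks.count("a") - picks.count("b")
    return scores


if __name__ == "__main__":
    # Hypothetical learner who alternates answers a, b, a, b, ...
    print(score_ils(["a", "b"] * 22))
```

Under these assumptions each score is an odd number between -11 and 11, so every learner leans toward one pole of each dimension.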

The learners also had to answer the Technology Acceptance Model (TAM) questionnaire, which includes 20 questions evaluating the learner’s satisfaction. Each of the 20 items is rated on a scale ranging from 5 for “strongly agree” to 1 for “strongly disagree”. To obtain a valid and reliable questionnaire, there should be at least three items for each variable. Specifically, students of the experimental group were asked to answer the TAM questionnaire. It includes 5 items for Usefulness (U), 5 items for Ease of Use (EOU), 5 items for attitude toward using the system (ATT) and 5 items for behavioral intention to use the system (INT), as presented in (Davis, 1989).

This study also used a test of learners’ preferences. The preference values were divided into three groups, namely prefer, neutral, and do not prefer to use the game.

The learners of the two groups were asked to answer the pre- and post-tests given by the same teacher using paper and pencil. Additionally, both groups had the same amount of time. The pre-test and post-test were taken in the classroom, and the post-test was used to evaluate the effectiveness of each learning method. We integrated an implementation of the K-means algorithm in our system; it takes as input the students with their traces. The output is groups of students with maximum intra-class similarity.

Results and interpretation

This section presents the experiment results.

Pre-test and post test results

In this sub-section, the students’ results in the pre-test and the post-test are compared to draw conclusions about the effectiveness of the educational games compared with the traditional learning method. We calculated the averages of the students’ scores on the pre-test and the post-test. These averages show that, on average, students of the experimental and control groups had similar levels of knowledge before starting the experiment (12.91 for the experimental group and 11.90 for the control group). After the experiment, however, the level of the students in the experimental group was higher than that of the students in the control group (10.86 for the experimental group and 10.22 for the control group). Table 3 presents the average learner scores in the pre-test and the post-test:

Table 3 Average of learners score

From Table 3 we can make two observations. First, the average post-test score of the experimental group, which used the educational games, is higher than that of the control group, which learned with the classical method, although both groups had approximately the same average in the pre-test. Second, the post-test was more difficult than the pre-test, which explains why the average scores in the post-test are lower.

In Table 3 above, the results for the two groups are presented. As we can observe, there is a difference between the learning outcomes of the two groups. However, is this difference significant? To answer this question, an ANOVA test (see Table 4) is used.

Table 4 ANOVA test outputs: single factor

Two hypotheses are discussed. H0: there is no significant difference between the control and the experimental groups. H1: there is a significant difference between the two groups. After running the ANOVA test, we obtained the results presented in Table 4.

The alpha value is 0.05. Since p-value = 0.00000000231 < alpha = 0.05, H0 is rejected and we conclude that there is a significant difference between the two groups.
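To make the single-factor ANOVA concrete, the sketch below computes the F statistic and its degrees of freedom from raw group scores. It is an illustration only: the function name `one_way_anova` and the scores in the usage example are hypothetical and are not the data behind Table 4.

```python
from statistics import mean


def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation explained by group membership.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation remaining inside each group.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical post-test scores, NOT the experimental data of this study.
    experimental = [12.0, 11.0, 13.0, 10.5]
    control = [9.0, 10.0, 8.5, 9.5]
    print(one_way_anova(experimental, control))
```

The p-value is then read from the F distribution with these degrees of freedom and compared with alpha; statistical libraries such as SciPy (`scipy.stats.f_oneway`) return it directly.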

We also used a t-test to analyze the data by comparing the two groups. We have two hypotheses.

  • H0: There is no significant difference between the learning outcomes in the two groups (Control and experimental groups).

  • H1: There is a significant difference between the learning outcomes in the control and experimental groups (Table 5).

Table 5 Results of the t-test comparing the experimental group and the control group (t-Test: two-sample assuming equal variances)

We observe that the experimental group shows a significant gain in knowledge compared to the control group, although the learners of the two groups had approximately the same average in the pre-test. The experiment thus suggests that the use of games in MOOCs is more efficient and useful than traditional methods of learning.
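For readers who want to reproduce this kind of comparison, a minimal sketch of the two-sample t-test assuming equal variances is shown below. The function name `pooled_t_test` and the scores in the usage example are hypothetical; they are not the values reported in Table 5.

```python
import math
from statistics import mean, variance


def pooled_t_test(a, b):
    """Return (t, degrees_of_freedom) for a two-sample t-test with pooled variance."""
    n1, n2 = len(a), len(b)
    # Pooled estimate of the common variance of the two groups.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t_stat = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, n1 + n2 - 2


if __name__ == "__main__":
    # Hypothetical post-test scores, NOT the experimental data of this study.
    experimental = [12.0, 11.0, 13.0, 10.5]
    control = [9.0, 10.0, 8.5, 9.5]
    print(pooled_t_test(experimental, control))
```

With exactly two groups, the square of this t statistic equals the F statistic of a single-factor ANOVA on the same data, so the two tests lead to the same decision.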

Technology acceptance results

To measure the level of acceptance of the educational games, the experimental group answered the Technology Acceptance Model (TAM) questionnaire. We adapted the TAM (Masrom, 2007) since it is considered the most widely used model for the validation of information systems.

Furthermore, to assess the reliability of the questionnaire, we use Cronbach’s alpha coefficient, a statistic used to measure the internal reliability of the questions asked. Its value normally lies between 0 and 1, and values of 0.7 or above are considered acceptable.
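Cronbach’s alpha can be computed from the item scores with the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items in the construct. The sketch below illustrates this; the function name `cronbach_alpha` and the responses in the usage example are hypothetical and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Compute Cronbach's alpha.

    responses: one list of item scores per respondent, all of the same length.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Variance of each item across respondents.
    item_variances = [variance(item) for item in zip(*responses)]
    # Variance of the respondents' total scores.
    total_variance = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical answers of four learners to a 3-item construct (scale 1 to 5).
    answers = [
        [5, 4, 5],
        [4, 4, 4],
        [3, 2, 3],
        [5, 5, 4],
    ]
    alpha = cronbach_alpha(answers)
    print(alpha, alpha >= 0.7)
```

In practice the coefficient would be computed separately for each of the four TAM constructs (U, EOU, ATT and INT).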

Two participants from the experimental group did not answer the whole questionnaire; hence, their responses were eliminated. The final valid sample therefore includes 55 students.

Table 6 displays the results of reliability analysis. All constructs have acceptable measures of reliability since their values exceed 0.7.

Table 6 Reliability analysis results of the proposed TAM

A reliable TAM questionnaire with four constructs and 20 items was thus obtained to measure students’ attitude toward the developed educational game.

Table 7 presents the averages and medians of the learner satisfaction.

Table 7 Averages and medians of learner’s satisfaction

Table 7 shows that the average and median values for the TAM questionnaire are close to 5, which means that the learners were satisfied with using the educational games.

Classification of students and interpretation

Figure 7 presents the results of applying the K-means algorithm for classifying students in MOOCs.

Fig. 7 A classification of learners in MOOCs

In Fig. 7, dots with the same color represent students who have similar characteristics and belong to the same group. The K-Means algorithm aggregates the data into coherent groups of learners. Groups are ideally characterized by strong internal similarity and strong dissimilarity between members of different groups of learners.

The results displayed in Table 3 show that the learners benefited from the open educational games, which represent a fun way of learning and have a positive impact on learning outcomes. Table 7 shows that learners are satisfied with the open educational games. Furthermore, Fig. 7 shows that the k-means algorithm can be used to classify students in MOOCs.

Discussion, conclusion and future research

This section discusses the obtained results and their similarities with other research work regarding preferences for using the web application, including the open educational games. The use of personalization parameters and of the k-means algorithm is also discussed.

The obtained results showed that learners have positive attitudes towards using the educational versions of the PacMan game. These results are similar to those of recent studies (Khenissi et al., 2012, 2016), in which the personalization of learning games according to learning styles is discussed.

The k-means algorithm is implemented to classify students. This allows personalization parameters to be used for personalizing the open educational games.

When we need to personalize MOOCs, the question is: which personalization parameters should we use?

Assume that we have to consider all the personalization parameters for personalizing open content for a massive number of learners.

This proposition would apply a large number of personalization parameters to the personalization of MOOCs. Consequently, the combination of personalization parameters would be more complex, since the generated learning scenarios would have to fit all the characteristics of the learners. This problem is discussed in (Essalmi et al., 2010) in the e-learning context. We have integrated two components to minimize the complexity of this problem. The first is the open educational games, which make it possible to model the learners implicitly. This spares learners the heavy task of answering many explicit questionnaires about their level of knowledge, motivation, learning styles and so on. The second component is the classification algorithm (k-means in our case), which classifies learners and thus helps teachers to select the personalization parameters and combine them flexibly to define different personalization strategies.

The proposed open educational games combined with the classification algorithm could constitute a solution to improve learner retention in MOOCs. For instance, the open educational games help learners to enjoy learning. Also, the classification algorithm allows learning to be personalized according to the learners’ profiles.

This study aims to improve personalization in MOOCs. It is proposed for individualizing the interactions with students and for improving their retention rate in MOOCs. Optimized personalization strategies enhance the learning process and help the teacher to select the appropriate combination of personalization parameters in MOOCs.

The open educational games (the educational version of the PacMan action game and the educational version of the PacMan puzzle game) bring a great learning experience and call on different skills and abilities. They meet the learner’s basic learning needs by providing pleasure, motivation and creativity. Also, these open educational games make it possible to collect learners’ traces and then classify the learners. The results of the experiment showed that the proposed system could help learners and improve their skills and quality of learning.

Future directions of this research could focus on generalizing our e-learning system to support different MOOCs and satisfy the different needs of teachers. Furthermore, it is possible to test other classification algorithms in MOOCs.

This study still has some limitations, as the evaluation of the MOOCs platform has not been fully completed. The prototype is developed as an educational website using gamification elements to make the site more interactive for users. Thus, the level of success and effectiveness of the proposed game elements has not yet been proven. Therefore, in the next study, the prototype of the MOOCs platform will be tested both by direct use and by collecting student feedback to analyze the performance of the platform.