Abstract
The personalization of curriculum plays a pivotal role in supporting students in achieving their unique learning goals. In recent years, researchers have dedicated efforts to address the challenge of personalizing curriculum through diverse techniques and approaches. However, it is crucial to acknowledge the phenomenon of student forgetting, as individuals exhibit variations in limitations, backgrounds, and goals, as evidenced by studies in the field of learning sciences. This paper introduces the complex issue of fully individualizing a curriculum while considering the impact of student forgetting, presenting a comprehensive framework to tackle this problem. Moreover, we conduct two experiments to explore this issue, aiming to assess the difficulty of identifying relevant curricula within this context and uncover behavioral patterns associated with the problem. The findings from these experiments provide valuable prescriptive recommendations for educational stakeholders seeking to implement personalized approaches. Furthermore, we demonstrate the complexity of this problem, highlighting the need for our framework as an initial decision-making tool to address this challenging endeavor.
Introduction and Context
Education in universities and engineering schools is undergoing significant changes, mostly to reflect the use of new technologies, breakthroughs in the Technology Enhanced Learning (TEL) field, and society’s ever-changing needs Kryukov and Gorin (2017); Daniela et al. (2018). Therefore, it is important to understand these changes to provide reliable assistance to education stakeholders (i.e. students, teachers, and institutions). A significant observable change in these institutions is the gradual substitution of predefined curricula with more modular alternatives University of Reading (2019).
In this paradigm, emphasis is given to personalizing the learning experience to better match students’ career expectations and goals. Students can then choose courses at each academic term (e.g. semesters) to build their sequence of courses (namely a curriculum) according to their objective (e.g. career goal). Yet, an inadequately structured curriculum presents several challenges to students. The absence of a coherent and adapted progression may hinder their assimilation of skills, competencies, behaviors, attitudes, abilities, or knowledge and impede their effective application Tetzlaff et al. (2021); Aleven et al. (2016). Difficulties stemming from insufficient prerequisites or unpreparedness for advanced coursework may also precipitate disengagement Walkington and Bernacki (2014). Erroneous sequencing can also extend degree completion timelines, thus affecting institutional graduation rates. Considering that the average time-to-degree for a Bachelor’s degree in Europe is approximately 3.5 years Vossensteyn et al. (2015) in a traditional learning environment, these challenges could substantially prolong students’ time-to-degree. Furthermore, such badly structured curricula could be hard for teachers and institutions to identify Caputi and Garrido (2015) and may yield inefficiencies in resource allocation, with certain courses witnessing disproportionate demand while others are underutilized.
Currently, to circumvent these risks, in the vast majority of cases, institutions implementing this kind of approach decide which time periods and courses can be personalized by students. Individualizing curricula consists of relaxing these constraints for students: they have to fully define their entire curriculum which will, potentially, better match their objective. We call such curricula fully individualized curricula.
Student forgetting, a phenomenon observed in the field of learning science, refers to the gradual loss or decay of previously acquired knowledge or skills over time. It is influenced by various factors such as the passage of time, lack of reinforcement or practice, and interference from new information Arthur et al. (1998); Ebbinghaus (2013). The decay of knowledge significantly impacts the outcome of a student’s curriculum, as it can lead to the loss of essential prerequisites for future courses within the curriculum determining the success or failure of a student’s educational journey. Recognizing and predicting the potential impact of this decay is needed to optimize the learning experience and ensure students have a strong foundation for continuous academic growth. Yet, to the best of our knowledge, works personalizing curricula do not take into account the decay aspect and its effect on the generation of solutions.
However, educational stakeholders involved in the individualization of curricula may already be confronting challenges stemming from knowledge decay. Students, as an example, are put in a rather challenging situation Daniela et al. (2018) as they must plan courses for the coming years without actual prior knowledge about the courses they have to choose, ensure that the sequencing is well made in terms of prerequisites, and have to assess the relevance of each course concerning their objectives. Furthermore, students should engage in self-examination to recognize the potential decay of their knowledge over time, an inherently difficult task. For teachers, this context makes some practices harder, such as multi-modal teaching, re-take exams, or one-to-one attention, because it tends to favor a considerable heterogeneity of students’ backgrounds.
In such a context, institutions should guarantee the quality and equity of the educational journey of each student, as some curricula could end up being more difficult than others. Consequently, institutions have to assess whether a curriculum is well-formed depending on several factors, such as the fulfillment of course prerequisites, timetable scheduling Loo et al. (1986), or teachers’ availability. Institutions also need to support the individualization of a curriculum according to the student’s profile Klinkenberg et al. (2011); Desmarais and Baker (2012); Papousek et al. (2014). It is also necessary for institutions to ensure that the curriculum aligns with the student’s objective and is attainable. This implies that courses have to be properly cataloged by institutions, including the knowledge they teach and their prerequisites. However, all these tasks are typically carried out manually by institutional staff members, as no study explicitly emphasizes the challenges posed by the decay of knowledge concerning the individualization of curricula.
In this paper, we address the Fully Individualized Curriculum with a Decaying Knowledge Problem (FIC/DK-P). The objectives of this paper are threefold: 1) propose a theoretical and reproducible framework for the problem; 2) study the effect of decaying knowledge in the individualization of a curriculum and its complexity; 3) formulate several actionable recommendations and warnings for education stakeholders who are considering or currently implementing curriculum individualization. The remainder of this paper is structured as follows. In Section 2, we provide an overview of related works on curricula personalization, including the major challenges and techniques employed, and the theory of student forgetting. Section 3 formalizes the problem by introducing the FIC/DK-P framework. Section 4 covers the experiments conducted to study the problem, as well as the data generated for these experiments. Based on the analysis of the experimental results, we put forth nine recommendations for education stakeholders in Section 5. We then conclude the paper in Section 6. Finally, in Appendix A, interested readers can explore the mathematical foundation of the FIC/DK-P framework.
Literature Study
FIC/DK-P consists of recommending to a student a sequence of courses until graduation that matches his/her objective, whether personal and/or professional, while considering that knowledge can decay over time. Although this issue has not been addressed directly in the literature to our knowledge, related works in learning path personalization are worth mentioning. Learning path personalization refers to approaches that generate learning paths by taking into account the individuality of a student and his/her learning preferences Deng et al. (2017). This personalization operates at various levels, mostly at the learning object level Belacel et al. (2014), the topic level, the lesson level Nabizadeh et al. (2020) and the course level Nabizadeh et al. (2017); Parameswaran et al. (2011). Duval and Hodgins (2003) introduced a modular content hierarchy based on these levels, used to promote the sequencing of contents. Yet, from an institution’s point of view, there is no consensus over this hierarchy and its explicit or implicit implementation changes across institutions, hence the need for a flexible and adaptable learning path recommendation system.
In the literature, two main methods of personalizing a learning path can be observed. Either 1) computing and recommending an entire learning path for a student (or a group of students), such as in Kardan et al. (2014); Feng et al. (2011); Belacel et al. (2014) or 2) recommending a path for a student learning content by learning content, as shown in Govindarajan et al. (2016); Salahli et al. (2013) for example. The second approach may offer significant computational speed advantages but is inherently limited in capturing certain unique aspects and broader contextual nuances that emerge when taking into account the entirety of a student’s learning journey, including factors like knowledge decay.
The algorithms used in these methods are numerous. Among them, we can cite machine learning techniques, such as clustering or tree classifiers Kardan et al. (2014); Lin et al. (2013). Recently, notable studies have incorporated semi-supervised learning and unsupervised learning approaches to analyze student data, aiming to predict students’ performance and provide recommendations for personalized curricula Backenköhler et al. (2018); Wong (2018). However, these machine learning techniques tend to merge student profiles, resulting in a loss of precision concerning the individuality of each student, which is essential in addressing our specific problem. Additionally, they tend to require a large amount of data to be efficient.
Among the other algorithms used are greedy algorithms Durand et al. (2013), graph theory Li et al. (2016); Belacel et al. (2014), Markov decision processes Durand et al. (2011) or Bayesian networks Zhang and Koren (2007). While these algorithms generally yield high-quality results in terms of recommendation, they tend to be highly dependent on the problem and data, often requiring extensive fine-tuning and optimization techniques. As a result, they may not be well-suited for exploring novel problems. In the e-learning literature, we observe that genetic algorithms are a widely used technique Seki et al. (2005); da Silva Lopes et al. (2009); De-Marcos et al. (2009); Al-Muhaideb and Menai (2011); Benmesbah et al. (2021); de-Marcos et al. (2008) that can also produce high-quality, locally optimal solutions. As a meta-heuristic approach, genetic algorithms demonstrate problem-agnostic characteristics, making them a promising candidate for initial exploration and analysis of our specific problem.
All of the above works are implicitly based on the hypothesis of an ideal memory model of students Georghiades (2000). Nevertheless, compelling evidence suggests that students experience forgetting, and their knowledge retention curve exhibits a distinct pattern that is unique to each individual Bahrick (2000). This forgetting process is thought to be driven by a core set of major factors Arthur et al. (1998); Bacon and Stewart (2006) such as 1) length of the non-use interval, 2) degree of overlearning, 3) task characteristics, 4) cognitive interference, 5) retrieval conditions, 6) training and instructional strategies and methods and 7) spontaneous loss of knowledge. Within forgetting curve theory, which considers that students’ knowledge proficiency follows a declining curve, several works model this decay over time.
Nonetheless, the modeling of forgetting remains one of the longstanding unresolved issues in the field of experimental psychology and is not the subject of consensus Klammer and Gueldenberg (2019). In Averell and Heathcote (2011), the authors propose a general memory model based on a study of a large dataset and Bayesian model selection to account for the student’s capacity to forget, where a power function seemed to be favored Anderson and Schunn (2013). Another popular model of forgetting is the exponential forgetting curve of Ebbinghaus Ebbinghaus (2013), although it originated from an incomplete study conducted by Ebbinghaus himself. Murre and Dros (2015) later attempted to replicate Ebbinghaus’ experiment; their results were similar to Ebbinghaus’ curve, thereby supporting the relevance of his model.
Hence, a question remains concerning how the decay of knowledge can impact actual learning algorithms, especially recommender systems. The literature gives evidence that taking this decay phenomenon into account can lead to better solutions but tends to make the problem harder. In Lindsey et al. (2014), the authors incorporated memory models into factor analysis (Item Response Theory van der Linden and Hambleton (2013) is a canonical model of factor analysis). The authors’ model performed better than models that did not implement memory models. However, optimal solutions for personalized scheduling were found to be intractable. This evidence is also supported by Choffin et al. (2019); Huang et al. (2020).
Considering the challenges associated with manually personalizing a curriculum, which demands extensive technical and pedagogical expertise Vanitha and Krishnan (2019), it would be unwise to expect education stakeholders to undertake such a task without the aid of suitable decision-making tools or recommendations, especially given that the decay phenomenon has a significant impact on the hardness of the problem.
Problem Definition
In this section, we model FIC/DK-P. This problem being related to other hard problems, such as scheduling problems, we had to make the following four hypotheses to study it and to give initial points of comparison:
1. Logistical aspects (e.g. room availability, teacher availability) have no impact on the quality of a curriculum;
2. The course catalog used is considered complete and sound: all the information is at our disposal and there is no implicit information;
3. Courses cannot overlap two or more academic terms: they are always confined to a single academic term (and last this entire academic term);
4. The learning process is perfect, meaning that at the end of the course, a student has acquired everything that a course should provide, so that we do not introduce probabilistic learning models in the study of FIC/DK-P.
FIC/DK-P is the problem, for a student, of selecting for a specific time range a sequence of courses to acquire the necessary skills, competencies, behaviors, attitudes, abilities, or knowledge such that he/she becomes qualified for his/her objective (most often a professional one), while these elements are subject to a decay effect. This decay effect makes it more complex to plan a coherent sequence of courses, as the student may no longer be qualified to attend specific courses when academic terms are distant. We illustrate FIC/DK-P in Fig. 1. In the following subsection, we model the components of FIC/DK-P.
Fig. 1 Illustration of the fully individualized curriculum with decaying knowledge problem. A student has to choose from the course catalog of an institution a sequence of courses (namely a curriculum) to reach a (professional) objective while being sure to be qualified to attend each planned course. A rectangular box represents a course, and its ordinate position in a line \(k_i\) represents the level of prerequisite expected of this knowledge. A color corresponds to a unique course. The curve in each line illustrates the evolution of the knowledge. The hourglasses represent the decay of knowledge through time. Here, the second course in the \(k_2\) line and the first in the \(k_3\) line constitute a risk for the student as the prerequisites are not met.
Modeling Knowledge and Mastery
The essence of the complexity of FIC/DK-P lies in the order in which skills, competencies, behaviors, attitudes, abilities, or knowledge are learned during courses, to what extent, and how they evolve throughout a curriculum. As multiple definitions of these properties exist, we define a surjective mapping of the mastery, whether it be of skill, competency, behavior, attitude, ability, or knowledge, into a set that we call Knowledge for convenience.Footnote 1 A member of this set encodes the mastery information as a continuous value in [0; 1] for a specific knowledge, where 0 signifies that the corresponding knowledge has not been encountered by a student and 1 indicates complete mastery of the knowledge. Such a set allows for a generic representation of mastery and can be used in a wide range of educational situations and paradigms, such as in more classic learning Bloom et al. (1956); Mandin and Guin (2014), constructivist learning Bada and Olusegun (2015) or with works considering knowledge mastery as a binary property Huang et al. (2020). This set only requires institutions to agree on a knowledge decomposition according to their epistemological, didactic and/or practical standpoints and to agree on the mapping of their knowledge grading into our interval, which can be quite straightforward (e.g. dividing the [0; 1] interval by the number of possible grades assignable to a student for each knowledge).
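The grade-to-interval mapping mentioned above can be sketched in a few lines. This is only an illustration under our own assumptions: the function name `grade_to_mastery` and the five-step letter scale are hypothetical, and an institution may well prefer a non-uniform mapping.

```python
def grade_to_mastery(grade_index: int, n_grades: int) -> float:
    """Map an ordinal grade onto [0, 1] by splitting the interval uniformly.

    grade_index = 0 means the knowledge was never encountered,
    grade_index = n_grades means complete mastery.
    """
    if not 0 <= grade_index <= n_grades:
        raise ValueError("grade_index must lie in [0, n_grades]")
    return grade_index / n_grades


# Hypothetical scale: "none" (never encountered) followed by five letter grades.
labels = ["none", "E", "D", "C", "B", "A"]
scale = {label: grade_to_mastery(i, len(labels) - 1) for i, label in enumerate(labels)}
```

With this scale, `scale["none"]` is 0 and `scale["A"]` is 1, which matches the two endpoints of the mastery interval described above.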
Modeling Courses
Courses are the basic building blocks of a curriculum, as they can be considered the main vector of knowledge Hill et al. (2005) for students. Consequently, to keep our modeling generic, we consider a course as a macro entity that provides knowledge and that can potentially be aligned with any learning material level and hierarchy Duval and Hodgins (2003). Each course is considered to provide the student with at least one knowledge; we did not define an upper bound on the number of knowledge that can exist in a course, as we did not find a formal threshold in the literature. The amount of knowledge taught by a course is expressed as a mastery value.
Additionally, courses can also have prerequisites. The importance of prerequisites in a curriculum is highlighted in works such as Molontay et al. (2020). Prerequisites ensure that a student has the minimum background to fully acquire the knowledge provided by the course. Attending a course without meeting all the prerequisites can pose risks in a student’s educational pathway, such as failure, cognitive overload, and increased stress. This is an important decision factor regarding the quality of a curriculum. We express these prerequisites as a knowledge mastery value threshold, meaning that a student should have more, or at least equal, knowledge mastery to guarantee success in the concerned courses.
Another property of courses is their temporal availability. Typically, courses within educational institutions adhere to specific scheduling constraints, operating during designated academic terms for various reasons. To capture this characteristic generically, each course is associated with a set of temporal availability terms, designated as academic terms.
Furthermore, we take into account the attendance and the involvement of students with the courses they take. We define a notion of credit that works both with the American credit system and with the European Credit Transfer and Accumulation System (ECTS) Herrero and Algarrada (2010). Consequently, each course is assigned a credit value which is earned by the student at the end of the course. One can consider a specific threshold of credit value for a student to graduate (e.g. 180 ECTS, which represents a bachelor’s degree).
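To make the four course properties concrete (knowledge taught, prerequisites, temporal availability, credits), here is a minimal data-structure sketch. The class name `Course`, its field names, and the helper `is_qualified` are our own illustrative choices, not notation from the framework; the formal definitions are in Appendix A.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Course:
    name: str
    credits: int                     # credit value earned on completion (e.g. ECTS)
    terms: Set[int]                  # academic terms in which the course is offered
    taught: Dict[str, float] = field(default_factory=dict)         # knowledge -> mastery gained
    prerequisites: Dict[str, float] = field(default_factory=dict)  # knowledge -> mastery threshold


def is_qualified(course: Course, mastery: Dict[str, float]) -> bool:
    """A student is qualified when every prerequisite threshold is met or exceeded."""
    return all(mastery.get(k, 0.0) >= threshold
               for k, threshold in course.prerequisites.items())


# Hypothetical course requiring a mastery of at least 0.5 in "algebra".
ml = Course("machine learning", credits=6, terms={1, 3},
            taught={"ml": 0.5}, prerequisites={"algebra": 0.5})
```

Knowledge that is absent from the student's mastery map is treated as 0, in line with the convention that 0 means the knowledge has not been encountered.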
Modeling Curriculum and Student’s Objective
A fully individualized curriculum is a sequence of courses that spans over one or more academic terms. Each of these academic terms is designed to accommodate a dedicated number of courses, and this number can be different for each academic term. This implies that a student is limited in the number of courses he/she can take both in an academic term and his/her entire curriculum. This limit is theoretically different from one institution to another, which makes the computational nature of the FIC/DK-P more complex. In addition, please remember that, via our third hypothesis, a course cannot overlap two academic terms.
Furthermore, a curriculum should be designed to qualify a student for his/her objective – most often a professional one. Consequently, we had to model the objective of a student. Again, for generic purposes, we based the modeling of student objectives on our knowledge representation. An objective is therefore expressed as a set of knowledge mastery values, indicating which knowledge is expected and to what extent. One can see these objective mastery values as the final requisites of the entire curriculum. In real-life scenarios, the identification of these final requisites will most likely be conducted by the institutions themselves, especially by collaborating with the professional world.
Ideally, the sequence of courses should be defined so that no prerequisites are missed at any time. We consider such a sequence as a good fully individualized curriculum. We also consider that a curriculum only serves the purpose of only one student objective at a time – yet the objective can hold any amount of knowledge.
Modeling Student Profile and Decay
What sets FIC/DK-P apart in the literature dedicated to learning path personalization is its consideration of the decay of knowledge over time when formulating individualized curricula for students. This consideration aligns with findings in educational psychology, underscoring the significance of this factor in students’ educational experiences and their reception of course materials. Several works Arthur et al. (1998); Ebbinghaus (2013); Heller et al. (2006) have shown that the mastery of knowledge is not stationary in time: it can decrease when the knowledge is not used over a certain period, and vice-versa, according to the pedagogical context. Predicting such variations is an important challenge as it can greatly improve the learning experience of the students, as shown by the works related to the spaced repetition system (SRS) Settles and Meeder (2016), even if monitoring the mobilization of knowledge outside a pedagogical context is difficult. Yet, it also adds complexity to the curriculum design process. As the knowledge acquired by a student during a specific academic term can diminish over time, it may reach a point where some of the prerequisites for future courses in subsequent academic terms are no longer fulfilled. Therefore, it becomes essential to predict and mitigate this decay effect to determine the optimal sequence of courses that ensures all prerequisites and the requisites of the objective are satisfied for the student, thus reducing the student’s risk of failure.
We model the decay of knowledge, given a student, as a function of the time elapsed since this knowledge was last learned. The codomain of the function is [0; 1], representing the amount (i.e. mastery value) of the knowledge lost during this period. This function impacts how the mastery evolves throughout the curriculum. The explicit mapping of the function should be defined according to one’s psychological understanding of decay, for example by using Ebbinghaus’ forgetting curve Ebbinghaus (2013).
In accordance with prior research works such as Howe (1980), learning is commonly regarded as a cumulative process. In our model, the evolution of mastery is characterized by the accumulationFootnote 2 of previous mastery levels and the acquisition of additional knowledge mastery from relevant courses. Alternatively, mastery may experience decay if the acquired knowledge is not utilized during the academic term. Thus, to predict the mastery of a knowledge for a student in an academic year, the amount of decay is subtracted from the accumulated mastery value of this knowledge. Each knowledge within the framework may possess a unique decay function tailored to its characteristics. Additionally, a decay function can change based on various properties such as time or specific knowledge thresholds in order to capture the notion that certain knowledge becomes more resistant to forgetting over time (e.g. riding a bicycle once learned). We give two examples in Fig. 2 regarding the evolution of the mastery of a knowledge according to a decay function. In our model, it is possible to observe a mastery overflow if, after attending a course, the mastery would be greater than 1 (see Fig. 2b). In that case, the value of one’s knowledge is expressed as \(\min(1, m_{k,t})\).Footnote 3
Finally, we introduce the concept of a student’s profile. At each academic term, the knowledge mastery values of a student are stored. In addition, the profile of a student is also composed of a set of decay functions, one for each knowledge. Indeed, different knowledge may be forgotten differently by a student: this phenomenon is strongly dependent on the student Brewer and Unsworth (2012); Mozer and Lindsey (2016). These decay functions may also change over time. This allows us to fine-tune the prediction of forgetting if needed, as well as the individualization: given two students with different profiles but the same objective, the best fully individualized curriculum will potentially be different.
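The two mechanisms described here, accumulation capped at 1 and subtraction of the decay, can be sketched as follows. This is a simplified reading of the prose, not the formal update rule of Appendix A: the function names, the lower clamp at 0, and the linear toy decay are all assumptions made for illustration.

```python
from typing import Callable


def mastery_after_course(current: float, gain: float) -> float:
    """Accumulate the mastery brought by a course, capping the overflow at 1."""
    return min(1.0, current + gain)


def mastery_after_gap(base: float, elapsed_terms: int,
                      decay: Callable[[int], float]) -> float:
    """Subtract the decay accumulated over `elapsed_terms` terms without practice.

    Clamping at 0 is our assumption; the text only specifies the upper cap.
    """
    return max(0.0, base - decay(elapsed_terms))


def toy_decay(t: int) -> float:
    """Toy linear decay used only for this example (not the paper's function)."""
    return min(1.0, 0.125 * t)


m = mastery_after_course(0.0, 0.5)    # first course brings 0.5
m = mastery_after_gap(m, 2, toy_decay)  # two idle terms: 0.5 - 0.25 = 0.25
m = mastery_after_course(m, 0.875)    # 0.25 + 0.875 overflows, capped at 1.0
```

The last step reproduces the overflow case: the raw sum exceeds 1, so the stored mastery is capped at 1.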
Fig. 2 Example of the evolution of the mastery of a knowledge k through time t in two different scenarios, illustrating how the decay \(\delta _k\) affects the mastery, as well as the mastery brought by a course \(\alpha _{c,k}\). Notations are detailed in Appendix A.
Experimentation
In this section, we outline two experiments that were conducted to address our problem: one utilizing an exact method and the other employing a meta-heuristic approach. A meta-heuristic is an agnostic problem-solving strategy designed to find approximate solutions across a wide range of optimization problems – for a comprehensive view on the subject please refer to Sörensen (2015). The objectives of these experiments were to gain a deeper understanding of the problem, explore its complexity, assess the impact of decay on problem difficulty, establish initial benchmarks for the research community, and derive preliminary recommendations based on our findings. Before presenting the experiments, we provide a comprehensive overview of the experimental context.
Experimental Context
Academic Background
Below, we present the setup we used for instantiating FIC/DK-P from the presented model. The problem assumptions are inspired by the academic background of a French engineering school. Since the FIC/DK-P modeling is generic, it allows for flexible assumptions to accommodate various backgrounds and requirements.
Assumption 1
An academic year is divided into two academic terms, also known as semesters. Therefore, a five-year curriculum consists of ten academic terms.
Assumption 2
An academic term should always bring to the student 30 credits once completed. These 30 credits represent the ECTS credits earned by students.
Assumption 3
It is not possible for a student to take the same course more than once during his/her entire curriculum.
Assumption 4
We consider the following epistemological model for the decay function \(\delta \), inspired by the works done in the neuroscience field Averell and Heathcote (2011); Ebbinghaus (2013):
with t representing the difference between the last time a student saw this knowledge and the current academic term. This function illustrates that the less a student uses a given knowledge, the more he/she forgets it. The function was designed mainly for academic curricula of 3 and 5 years with two academic terms per year; for any other duration, one should modify the coefficient of memory stability s introduced by Ebbinghaus (here, \(s=2\)).
It implies that the decay function associated with each knowledge in a student’s profile is the same: each knowledge will develop in the same way.Footnote 4
Assumption 5
There is no decay regarding the mastery of a knowledge within an academic term (here, a semester). This can be expressed as \(\delta (0)=0\).
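As an illustration of Assumptions 4 and 5, the sketch below uses one Ebbinghaus-style form that is compatible with the stated constraints (values in [0; 1], memory stability \(s=2\), and \(\delta (0)=0\)): the lost mastery is taken as \(1 - e^{-t/s}\). This specific expression is our assumption for the example and is not necessarily the exact function used in the experiments.

```python
import math


def decay(t: int, s: float = 2.0) -> float:
    """Assumed decay: share of mastery lost after t academic terms without use.

    Derived from an exponential retention curve exp(-t / s); the exact form
    is an illustrative assumption. s is the memory stability coefficient.
    """
    if t < 0:
        raise ValueError("elapsed time must be non-negative")
    return 1.0 - math.exp(-t / s)


# Lost mastery after 0 to 4 idle terms with s = 2.
losses = [decay(t) for t in range(5)]
```

Under this assumed form, nothing is lost within a term (`decay(0)` is 0), the loss grows with every idle term, and it never reaches 1.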
Assumption 6
The student for whom we solve FIC/DK-P has no prior knowledge at the beginning of his/her curriculum. That means every knowledge mastery that composes his/her profile is set to 0. Note that, in real-life applications, it is more than likely that some of his/her knowledge mastery will be different from zero.
Data Generation Background
To the best of our knowledge, no sufficiently comprehensive public catalog of courses exists in our community, at least publicly. This is arguably because the elaboration of such catalogs is an important task for both teachers and institutional stakeholders: each course must be properly described, as well as all its properties. In the absence of strong incentives, their creation seems not to have been a priority. For example, in France, the Commission des Titres d’Ingénieurs (CTI) recently acts for the creation of a detailed syllabus for each course of engineering schools, describing the knowledge taught to the students and the adoption of an approach by competencies Commission des Titres d’Ingénieurs (2023): these institutions have started the elaboration of some kind of course catalog alongside a knowledge catalog.
It is in this context that we produced our datasets for the following two experiments. Having at our disposal the syllabuses of courses taught during the two years of a master’s degree at IMT Nord Europe, a French engineering school, we identified the number of courses available, their temporal availability, the number of knowledge taught in each course, the prerequisites for each one of them, the overall distributions of knowledge and how many courses at average a student should attend. Nonetheless, some information was missing from these syllabuses and we had to presuppose them. This was the case for the knowledge mastery that each course brings to a student. To approximate this value, we use our prior knowledge of the courses we knew and contact some of the referees of the other courses.
Thus, we have extrapolated the gathered data to define a data generation model that is representative of a five-year curriculum to produce simulated datasets. The size of the course catalog \(\mathcal {C}\) was set in between [300; 500] courses, and the size of the catalog of knowledge \(\mathcal {K}\) was set in between [200; 600]. The maximal amount of prerequisites a course can have \(\mathcal {P}\) was set in between [4; 5] and the maximal amount of knowledge taught by a course \(\mathcal {T}\) was set up to 5. The selection of knowledge taught and used as prerequisites followed a uniform distribution and the mastery \(\mathcal {M}\) was set in between [0.25; 0.75]. Finally, we set the number of courses \(\mathcal {S}\) a student has to choose at each academic term in between [10; 20].
A dataset can therefore be expressed as the combination of \(\langle \mathcal {C}, \mathcal {K}, \mathcal {T}, \mathcal {P},\mathcal {S}, \mathcal {M}\rangle \), plus the random seed used during computation. We produce, for the same configuration, 30 different datasets by changing the problem seed. To reduce the combinatoric, we select \(\mathcal {C}\) in increments of 10, \(\mathcal {K}\) in increments of 50 and \(\mathcal {S}\) in increments of 1. By doing so, we have produced 124740 different instances of the problem (which is roughly equal to 15 GB of experimental data).
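The instance count can be checked with a short computation. The sketch assumes that \(\mathcal {P}\) takes the two values 4 and 5 and that \(\mathcal {T}\) and \(\mathcal {M}\) are held fixed across configurations; the variable names are ours.

```python
# Number of values taken by each varying parameter (bounds inclusive).
n_catalog_sizes = len(range(300, 501, 10))    # C: 300, 310, ..., 500
n_knowledge_sizes = len(range(200, 601, 50))  # K: 200, 250, ..., 600
n_prerequisite_caps = len(range(4, 6))        # P: 4 or 5 (assumed step of 1)
n_courses_per_term = len(range(10, 21))       # S: 10, 11, ..., 20
n_seeds = 30                                  # datasets per configuration

n_configurations = (n_catalog_sizes * n_knowledge_sizes
                    * n_prerequisite_caps * n_courses_per_term)
n_instances = n_configurations * n_seeds
```

Under these assumptions there are 21 × 9 × 2 × 11 = 4158 configurations, and 4158 × 30 seeds gives the reported number of instances.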
Even though our studies are based on simulated data, we made them as close as possible to real-life scenarios. Nonetheless, we had no prior knowledge of whether a solution exists for a given student profile and objective, since determining this amounts to directly solving FIC/DK-P. Please note that using simulated data also introduces some biases (e.g. the effect of the distribution used), yet this was an essential first step in order to study FIC/DK-P. We are currently working on a real problem instance that could be shared with the community.
We also had to randomly generate the student objective. Essentially, it is defined according to the available dataset: the prerequisites and objective requisites are taken from \(\mathcal {K}\), the set of all knowledge that can be taught in the institution. The amount of prerequisites and requisites was set in between [2; 4] and the expected mastery of each of these was picked in between [0.5; 0.95]. We attempted to convey that students should be fairly proficient in the knowledge essential for their future employment.
Exact Solving Method
In this section, we present our experiment implementing an exact method to find the best curriculum, according to an initial student profile and a student objective. One goal of this experiment was to study how hard the problem of fully individualizing a curriculum is and to identify the point at which educational stakeholders should be assisted in customizing curricula. The experimental results tend to show that FIC/DK-P exhibits a severe combinatorial explosion, which makes the problem probably unsuited both to 1) exact methods, even with a small set of courses, and to 2) manual solving by educational stakeholders.
An exact method for solving FIC/DK-P consists in finding at least one complete assignment (i.e. a sequence of courses chosen from the catalog of courses) that satisfies all the constraints entailed by our problem. As a reminder, these constraints are: 1) all the courses in the sequence must be different; 2) each course prerequisite must be validated, as well as the objective requisites; 3) a course can only be taken when it is available; 4) the number of courses in an academic term must be equal to the theoretical value used; 5) each academic term must bring at least 30 ECTS.
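To make the structural constraints concrete, the following minimal sketch checks constraints 1, 3, 4 and 5 on a candidate curriculum. It is an illustration under our own assumptions and not the solvers described in this paper: the `Course` record, its field names and the `violated_constraints` helper are hypothetical, and constraint 2 is deliberately omitted because it depends on the mastery and decay model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Course:
    """Hypothetical course record; field names are illustrative."""
    name: str
    ects: int
    available_terms: frozenset  # indices of the terms in which the course runs

def violated_constraints(curriculum, courses_per_term, min_ects=30):
    """Return the numbers of the structural constraints (1, 3, 4, 5) that fail.

    `curriculum` is a list of terms, each term being a list of Course objects.
    Constraint 2 (prerequisites and objective requisites) is left out on purpose.
    """
    violated = set()
    all_courses = [c for term in curriculum for c in term]
    if len(set(all_courses)) != len(all_courses):
        violated.add(1)  # a course appears more than once
    for term_index, term in enumerate(curriculum):
        if any(term_index not in c.available_terms for c in term):
            violated.add(3)  # a course is planned in a term where it is not offered
        if len(term) != courses_per_term:
            violated.add(4)  # wrong number of courses in the term
        if sum(c.ects for c in term) < min_ects:
            violated.add(5)  # the term brings fewer than 30 ECTS
    return violated

# Toy data, invented for the example.
algebra = Course("algebra", 15, frozenset({0}))
logic = Course("logic", 15, frozenset({0, 1}))
stats = Course("statistics", 20, frozenset({1}))
networks = Course("networks", 10, frozenset({1}))

valid = violated_constraints([[algebra, logic], [stats, networks]], courses_per_term=2)
invalid = violated_constraints([[algebra, logic], [logic, networks]], courses_per_term=2)
print(valid, invalid)
```

In the second toy curriculum, `logic` is taken twice and the second term only brings 25 ECTS, so constraints 1 and 5 are reported.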
We do not speak of the quality of a solution nor the optimality of a solution, as they should be left to the discretion of the pedagogical stakeholders. Is a solution maximizing the mastery values of knowledge required by an objective to be considered better than one which maximally diversifies the knowledge seen by a student? We do not know.
Experimental Setup
Software
We implemented two constraint-based search algorithms (i.e. solvers) in Prolog to solve FIC/DK-P. Algorithmic details as well as the implementation are available in Appendix B. The first solver, Solver 1, evaluates the validity of a solution after its full assignment. We assume that the behavior of Solver 1 is somewhat representative of an educational stakeholder trying to solve FIC/DK-P: for the sake of convenience, the solution is evaluated only once the assignment is complete.
Yet, we faced a severe combinatorial explosion that forced us to drastically reduce the dimensions of the datasets used. In the end, we used \(\mathcal {C} \in [24;100]\), \(\mathcal {K} \in \{10;20;30\}\) and \(\mathcal {S} \in \{2;3\}\), with the objective of the student expressed as 2 requisites and the number of academic terms T equal to 6 or 10. Above these parameters the problem becomes intractable, exceeding the time limit of 12 hours we fixed (considering that, in a real context, the problem is supposed to be solved for several hundred or thousand students). Thus, we designed a second solver, Solver 2, implementing a search heuristic based on the decay prediction (see Appendix B, Algorithm 2). This heuristic does not by itself prune any path of the exploration tree (it does not prevent forward chaining and backward chaining; see Footnote 5); it prioritizes courses that, once the decay effect is applied, keep the student’s mastery of the learning objective at or above the expected value.
To summarize, Solver 1 is a full exact method that evaluates implicitly all the possible solutions. Solver 2 is built on Solver 1 and uses the heuristic as a predictive model to prioritize some courses over others during the search.
In all our instance configurations, the search starts at the beginning of the first academic term of the first academic year, and the student profile has all its mastery scores set to zero.
Hardware
This experiment was conducted on a 2.0 GHz Intel i5-8250 laptop with 8 GB of RAM. In the following, we consider the logical inferences (LI) made by the solvers rather than the time spent, because LI provide more scalable and representative information about the computational effort required to solve the problem (see Footnote 6).
Results
Figure 3 presents the experimental results obtained from our exact solving attempts. The y-axis is logarithmic and represents the number of logical inferences (LI) made by the solver to solve the problem. The x-axis represents the number of courses available for each academic term.
First of all, let us note that, in terms of LI made, Solver 2, which uses the decay as a selection heuristic, obtains better results than Solver 1 in all the observed cases. This observation suggests that pedagogical models could be useful to design efficient selection heuristics and reduce the computation time of a problem.
In Fig. 3 a), b) and c), we solve the problem for a bachelor curriculum (\(T=6\)), where the student attends two courses each semester (\(\mathcal {S}=2\)). As we increase the pool N of courses available each semester, we quickly observe a combinatorial explosion: around \(N=15\), FIC/DK-P generally becomes intractable in a reasonable time. We also vary the pool of available knowledge: \(\mathcal {K}=10\) for a), \(\mathcal {K}=20\) for b) and \(\mathcal {K}=30\) for c). Interestingly, the configuration used for b) appears to make the problem more difficult than the configuration used for c). Additionally, in a), we can observe a strong advantage for Solver 2: it finds an answer to the problem for \(N=12\), whereas Solver 1 cannot solve these instances within 12 hours.
In d), we compare similarly configured instances (\(\mathcal {S}=2\), \(\mathcal {K}=10\)) over two different durations: a full bachelor degree (\(T=6\)) and a full bachelor and master degree (\(T=10\)). It can be observed that the number of academic terms has a non-negligible effect on the overall tractability of the problem: for \(T=10\), the two solvers could not solve the problem above \(N=10\) (see the dotted plots). Thus, it appears that when the number of academic terms increases, so does the difficulty of the problem.
In e), we study the effect of the number of courses \(\mathcal {S}\) to be taken each semester. The results are unequivocal: the larger \(\mathcal {S}\), the harder the problem. For \(\mathcal {S}=3\), the problem becomes intractable in a reasonable time at \(N=8\). For \(\mathcal {S}=4\) (not plotted), the solver could not solve the problem when \(N=7\). As one would anticipate, \(\mathcal {S}\) also has an important effect on the combinatorics of FIC/DK-P, maybe more important than T.
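A rough count helps to see why N, \(\mathcal {S}\) and T all drive this explosion. The sketch below computes a crude upper bound on the number of complete assignments, assuming (our simplification) that the same N courses are offered every term and ignoring the distinctness and prerequisite constraints, which gives \(\binom{N}{\mathcal {S}}^{T}\) candidates. The function name `raw_assignments` is illustrative.

```python
from math import comb

def raw_assignments(n_available, courses_per_term, n_terms):
    """Crude upper bound on the number of complete assignments: C(N, S) ** T."""
    return comb(n_available, courses_per_term) ** n_terms

# Configurations close to the tractability limits reported above.
for n, s, t in [(15, 2, 6), (10, 2, 10), (8, 3, 6)]:
    print(f"N={n}, S={s}, T={t}: {raw_assignments(n, s, t):.2e}")
```

These three configurations already yield on the order of \(10^{12}\), \(10^{16}\) and \(10^{10}\) raw candidates. The bound says nothing about the cost of checking each candidate, so it indicates a growth rate rather than the measured difficulty.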
Overall, despite the small scale of our experiment, the results show that FIC/DK-P is a very difficult problem to solve, highly demanding computation-wise. Some parameters, such as \(\mathcal {S}\) and T, seem to have a strong effect on the complexity of the problem. Additionally, our results are further evidence of the difficulty that educational stakeholders will face in addressing this problem in real-life scenarios. Furthermore, we are inclined to discourage the utilization of exact methods alone to solve FIC/DK-P, unless accompanied by robust heuristics capable of efficiently exploring the state space. Nonetheless, these first results have been useful to establish some prescriptive recommendations for the educational stakeholders, which are discussed in Section 5.
Meta-Heuristic Solving
As we were not able to scale up to real-case scenarios with an exact solving method, we decided to study the problem further using a meta-heuristic approach. The objective of this experiment was not to find an exact solution, namely a sequence of courses for which all the constraints are satisfied, but good-enough solutions where the constraints are violated as little as possible. Considering the extensive use and the efficiency of genetic algorithms (GA) in the e-learning literature, for problems such as arranging and delivering e-learning materials Al-Muhaideb and Menai (2011); Benmesbah et al. (2021), we developed a GA to solve FIC/DK-P. By doing so, we provide the community with the very first benchmarks and insights for the full-scale problem, at a scale typical of real-life scenarios. We hope that these contributions will give researchers in our community a solid foundation for developing new, more efficient algorithms. Before presenting our experimental results, we present the GA parametrization for the sake of reproducibility and discussion.
GA Parametrization and Reproducibility
GA is known to be multi-parametric: a parameter’s value can have a substantial effect on the quality of a solution Eiben et al. (2003). Yet, identifying the best configurations is computationally intensive (e.g. identifying good crossover, mutation, and tournament combinations), and most of the time the values are chosen empirically Eiben et al. (2003). Below we discuss the parametrization of our GA.
Problem Representation
In our implementation, we opted for a representation widely used in e-learning, especially in learning object recommendation: the fixed-length integer chromosome encoding da Silva Lopes et al. (2009); De-Marcos et al. (2009); Al-Muhaideb and Menai (2011); Benmesbah et al. (2021). A single individual represents a curriculum, and each gene of an individual represents a specific course. The size of the genome equals the total number of courses a student attends during the entire curriculum (the number of academic terms T multiplied by the number of courses \(\mathcal {S}\) that must be attended at each academic term). The genes are ordered and grouped by academic term, which encodes the succession of the courses. Each course encodes its prerequisites, the knowledge it teaches, and the credit value it is worth. Figure 4 illustrates an individual in our implementation.
In our GA, the initial population is pseudo-randomly generated. Instead of randomly picking a course from the entire catalog for each gene of each individual we create, we verify that 1) a course can effectively be taken in the academic term in which it is planned and 2) all the courses are different. This integrity check is computationally straightforward and dramatically improves the overall quality of the initial population, leading to a more efficient convergence. The population size of each generation was set to 100, as it appears to be a good compromise between exploration and computational efficiency; increasing the size could improve the likelihood of finding better learning paths, but at the expense of a higher computational cost Chen (2009) (see Footnote 7).
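The encoding and the pseudo-random initialization can be sketched as follows. This is a minimal illustration and not our C++ implementation: the catalog layout (a mapping from a course identifier to the set of terms in which the course is offered), the toy data and the function name `random_individual` are assumptions made for the example.

```python
import random

def random_individual(catalog, n_terms, courses_per_term, rng):
    """Build one curriculum as a flat list of course ids of length T * S.

    `catalog` maps a course id to the set of term indices in which it is offered
    (hypothetical layout). Genes [t*S, (t+1)*S) hold the courses of term t.
    """
    genome, used = [], set()
    for term in range(n_terms):
        # Integrity check: the course is offered in this term and not yet used.
        candidates = [c for c, terms in catalog.items()
                      if term in terms and c not in used]
        picked = rng.sample(candidates, courses_per_term)  # raises if too few candidates
        genome.extend(picked)
        used.update(picked)
    return genome

rng = random.Random(42)
# Toy catalog: 40 courses, each offered in even or odd terms depending on its id.
catalog = {c: {t for t in range(6) if t % 2 == c % 2} for c in range(40)}
population = [random_individual(catalog, n_terms=6, courses_per_term=2, rng=rng)
              for _ in range(100)]
```

Every individual produced this way already satisfies the availability and distinctness constraints, which is why the corresponding metric starts at its maximum in the initial population.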
During the computation of the next generation, some individuals may become inconsistent due to the stochastic nature of GA. In such a case, we either re-generate the ill-formed individual with a probability of \(p=0.75\) or replace the faulty course(s) with valid ones with a probability of \(p^{'}=1-p\). A GA is not guaranteed to reach an optimal solution during the generational process. To stop it, we used a common two-case stopping criterion Eiben et al. (2003); Samia and Mostafa (2007); De-Marcos et al. (2009): either reaching a fitness threshold of 0, which means that all the constraints are satisfied, or reaching the maximum number of generations. We empirically selected 10000 generations as a maximum, as we noticed a strong convergence of the individuals around \(8\times 10^{3}\) generations.
Fitness
The fitness function f expresses the quality of an individual. It is based on four metrics \(\nu _i \in [0;1]\). \(\nu _1\) expresses the difference of credits between the amount expected and the amount obtained at each academic term. \(\nu _2\) expresses the amount of mastery lacking to entirely match the requisites of the student objective. \(\nu _3\) expresses the quantity of misallocated courses. \(\nu _4\) expresses the amount of mastery lacking to match each of the prerequisites of courses at each academic term. When all the metrics are maxed, \(f(x)=0\), meaning that the individual fully satisfies all the constraints. When \(f(x) = 4\), every constraint is violated. Consistent with these two bounds, f sums the shortfall of each metric: \(f(x) = \sum _{i=1}^{4} \left( 1 - \nu _i(x)\right) \).
Crossover, Mutation and Tournament Selection
GA is driven by three important operators: tournament, crossover, and mutation. Tournament is the selection of the individuals that will contribute to the individuals of the next generation. Crossover is the creation of new offspring from the combination of two selected individuals. Mutation is the modification of an individual’s genotype to introduce some noise in the population. All of these operators are known to have a significant impact on the quality of the solutions found. Consequently, we conducted several preliminary experiments to empirically select the most effective rates.
We used a generational replacement strategy Hovakimyan et al. (2004) coupled with parsimonious elitism by always selecting the best individual to prevent the eventual loss of the best solution. This strategy is driven by a deterministic tournament selection, mostly because it is efficient to code and allows the selection pressure to be easily adjusted Miller et al. (1995). We empirically chose a tournament size of \(\tau =2\).
We implement a one-point crossover operation that produces two new individuals: this is a common operation in our domain Hovakimyan et al. (2004); da Silva Lopes et al. (2009). We empirically chose a crossover rate of \(X=0.75\). As for mutation, we implement a simple binary mutation operation that replaces one course of the concerned individual with another one from the course catalog. This mutation is coupled with an integrity check on the course to pick: we randomly select a course whose prerequisites are all met by the previous genes, if any such course exists; otherwise, we randomly pick one from the entire catalog. We empirically chose a mutation rate of \(M=0.75\).
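The interplay of these operators can be sketched in a few lines. The code below is a simplified illustration and not the Paradiseo-based implementation: it omits the integrity check and the repair of ill-formed individuals, mutation always draws from the whole catalog, the rates are applied per pair and per individual (an assumption on our part), and the toy fitness is a placeholder that only counts repeated courses.

```python
import random

def tournament(population, fitness, rng, size=2):
    """Deterministic tournament: draw `size` individuals, keep the fittest (lowest f)."""
    contestants = rng.sample(population, size)
    return min(contestants, key=fitness)

def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length genomes at a random cut point."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]

def mutate(genome, catalog_ids, rng):
    """Replace one randomly chosen gene with a course drawn from the catalog."""
    child = list(genome)
    child[rng.randrange(len(child))] = rng.choice(catalog_ids)
    return child

def next_generation(population, fitness, catalog_ids, rng,
                    crossover_rate=0.75, mutation_rate=0.75):
    """Generational replacement with elitism: the best individual always survives."""
    new_population = [min(population, key=fitness)]
    while len(new_population) < len(population):
        a = tournament(population, fitness, rng)
        b = tournament(population, fitness, rng)
        if rng.random() < crossover_rate:
            a, b = one_point_crossover(a, b, rng)
        for child in (a, b):
            if rng.random() < mutation_rate:
                child = mutate(child, catalog_ids, rng)
            new_population.append(list(child))
    return new_population[:len(population)]

rng = random.Random(0)
catalog_ids = list(range(30))
toy_fitness = lambda g: len(g) - len(set(g))  # placeholder: counts repeated courses
population = [[rng.choice(catalog_ids) for _ in range(12)] for _ in range(20)]
best_before = min(map(toy_fitness, population))
for _ in range(50):
    population = next_generation(population, toy_fitness, catalog_ids, rng)
best_after = min(map(toy_fitness, population))
```

Because the best individual is copied unchanged into each new generation, the best fitness can never get worse from one generation to the next, which is the purpose of elitism here.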
Experimental Setup
Software
The algorithm was implemented in C++. We used the Paradiseo library, a framework to design parallel and distributed metaheuristics Cahon et al. (2004), to handle all the aspects of the GA.
Hardware
We used a computer cluster consisting of 1388 heterogeneous threads, where each thread ran our GA over a specific instance of the problem. Depending on the configuration of the problem and on the hardware, a run of the algorithm took approximately 40 minutes to 3 hours to terminate. This amounts to a lower bound of roughly 9 years of computation had our entire experimentation been conducted on a single thread at 2.67 GHz. Please note that we did not parallelize individual runs: each thread performed a single run at a time.
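The order of magnitude of this bound can be checked with simple arithmetic. The sketch below assumes one run per problem instance (124740 runs in total) and a 365-day year; the variable names are illustrative.

```python
RUNS = 124_740
MINUTES_PER_YEAR = 60 * 24 * 365

lower_years = RUNS * 40 / MINUTES_PER_YEAR       # every run takes 40 minutes
upper_years = RUNS * 3 * 60 / MINUTES_PER_YEAR   # every run takes 3 hours
print(f"{lower_years:.1f} to {upper_years:.1f} single-thread years")
```

With the fastest run time for every instance, the sequential cost is about 9.5 years, which agrees with the lower bound above; with the slowest run time it would exceed 42 years.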
Results
The first experimental finding, and maybe the most important, is that solving FIC/DK-P seems to be strongly dependent on the problem instance. For the same dataset configuration, \(\mathcal {C}=400, \mathcal {K}=400, \mathcal {S}=10\), we produced 10 different problem instances by changing the random seed and then performed 30 different runs for each one. We present the results in Fig. 5. These results are supplementary evidence that the difficulty of the problem depends on both the course catalog and the student objective. Despite this dependence, it is still interesting to study the overall trends obtained from our experiments to understand critical aspects of FIC/DK-P.
Fitness-wise, over the entirety of the experiments we conducted, we always obtained better solutions than the initial, pseudo-randomized population, as presented in Fig. 6. In some cases, we were even close to fully solving FIC/DK-P, with \(f\approx 0.1\). These results confirm that 1) it is possible to compute nearly faultless curricula despite the intractability faced in our previous experiment, which should give the scientific community incentives to study this problem in depth, and 2) it is possible to provide education stakeholders with decision-making tools to help them assess the quality of a curriculum. The three fitness peaks of the optimized population observable in Fig. 6 (i.e. \(f\approx 0.4\), \(f\approx 0.8\) and \(f\approx 1.1\)) are related to the requisites of the student’s objective being met. We are therefore also able to provide students with a curriculum better fitted to their objective. Nonetheless, the objective appears to act as a local optimum. In any case, the pseudo-randomized population converging toward a single peak illustrates the difficulty of solving FIC/DK-P.
In Fig. 7 we present a part of the 124740 experiments carried out, with the number of courses \(\mathcal {S}\) a student has to attend at each academic term represented along the x-axis. Overall, the best results were obtained when \(\mathcal {C}=400\). We observed that the spread of the fitness is not correlated with \(\mathcal {S}\): the spread remains substantially similar as \(\mathcal {S}\) increases, or does not change linearly (e.g. see Fig. 7c or f). Additionally, the relationship between the spread of the fitness and the problem configuration appears to be non-trivial and even counterintuitive: for smaller problem configurations (first row of Fig. 7), the spread tends to be larger than for larger problem configurations (last row of Fig. 7).
Another interesting finding related to \(\mathcal {S}\) can also be observed in Fig. 7. Intuitively, as \(\mathcal {S}\) increases, we can suppose that so does the likelihood of reinforcing the student’s qualification for future courses, because the student attends more courses and theoretically acquires more knowledge. This behavior is indeed observable in Fig. 7 in the initial population: the overall trend of the fitness decreases as \(\mathcal {S}\) increases. Yet, this intuition does not hold for the final populations. It even appears that greater values of \(\mathcal {S}\) in a highly diversified catalog of courses (see Fig. 7i) tend to produce worse solutions than assigning a more constrained number of courses to the student. Thus, simply feeding students with more courses will not necessarily make it easier to find a good solution.
Figure 8 highlights the mixed difficulty of maximizing the four metrics that compose the fitness. In Fig. 8c, all the individuals of the initial population are such that \(\nu _3(x)=1\), because we built the entire population to satisfy this constraint. What is interesting, however, is that there is no deviation whatsoever for this metric in the final population, suggesting that accommodating this metric is rather simple and/or that it is a local optimum. Maximizing the credit metric \(\nu _1\) does not seem to be very difficult either. As shown in Fig. 8a, about \(90\%\) of the individuals of the initial population maximize \(\nu _1\). The final population improves on this score, with approximately \(99\%\) of the individuals maximizing the metric.
Fig. 8 Cumulative distribution comparison for the four fitness criteria considering the 124740 runs we conducted. The x-axis is the criterion score, where 1 means that the criterion is fully satisfied. \(\nu _1\) expresses the difference of credits between the amount expected and the amount obtained at each academic term. \(\nu _2\) expresses the amount of mastery lacking to entirely match the requisites of the student objective. \(\nu _3\) expresses the quantity of misallocated courses. \(\nu _4\) expresses the amount of mastery lacking to match each of the prerequisites of courses at each academic term.
Figure 8d presents the \(\nu _4\) metric, related to the gap between the prerequisites of each course and the current mastery of a student at a given academic term. Unexpectedly, the score of the initial population is not as low as we anticipated: about half of the individuals have a metric score between 0.5 and 0.6. Yet, this metric is continuous; interpreting it from a discrete standpoint means that the student will not be qualified to attend more than half of the courses composing his/her curriculum. The results are far more encouraging for the final population, where all the individuals range between 0.7 and 1.0 (where all the prerequisites are met). This suggests that the decay of knowledge can be compensated with a thoughtful sequence of courses. Interestingly, the two cumulative distributions share the same kind of compressed shape. It may suggest that the decay of knowledge can create attractive local optima. In any case, this was quite unforeseen.
Finally, the metric \(\nu _2\), related to how well the requisites of the student’s objective are met, seems to be the most difficult to maximize, even though it involves less knowledge than the entire curriculum. As the cumulative distribution of Fig. 8b shows, the initial population performed very poorly: more than \(70\%\) of the individuals are such that \(\nu _2 = 0\). The final population shows a significant improvement, with no more than about \(19\%\) of the individuals scoring zero on the metric, which is still not ideal for real-life scenarios. On average, half of the individuals are above or equal to 0.33, which means that at least one knowledge requisite could be fully met. Some individuals fully met the expected requisites for the objective of the student. The results for \(\nu _2\) therefore suggest that meeting the requisites of the student objective is maybe the hardest part of the problem, and that its importance during the search should perhaps be weighted against the other metrics. In addition, since it is possible to obtain good values for the metric, the results highlight the need to design new, more efficient algorithms.
In Fig. 8e and f we plot the cumulative distributions of these four criteria for the pseudo-randomized population and the optimized population to facilitate the comparison. The final population shows significant improvements compared to the initial one. Coupled with the results of Fig. 6 and the fact that the fitness of the best individuals is about \(f\approx 0.1\), this is an encouraging result for the study of new methods dedicated to solving FIC/DK-P.
Prescriptive Recommendations
From the results of the experiments presented in Section 4, interesting findings and insights emerged that could be helpful for various education stakeholders, either to reflect on current practices or to facilitate the adoption of new practices regarding more flexible curricula. In this section, we present several prescriptive recommendations.
The Importance of the Student Profile
The profile of the student has a significant effect on solving FIC/DK-P. Institutions must therefore be careful to identify the precise profile of their students before proposing customization. Although institutions often conduct written or oral tests to select students, which can later be used to identify their mastery of knowledge, how students forget specific knowledge should also be identified. Otherwise, the risk is to propose sequences of courses irrelevant to students’ needs. In that regard, a way to identify decay functions could be to conduct self-positioning tests. By that means, institutions and teachers could regularly test and compare the evolution of students’ knowledge, helping them to estimate and refine decay functions over the years. Another way to identify students’ decay functions, possibly less efficient, could be to define several clusters based on specific features (e.g. students’ grades, high school specialties) and, for each cluster, assign predefined functions.
Objectives of Students: a Cornerstone
One of the primary challenges in developing a comprehensive and tailored curriculum lies in determining the student’s objective, as it can potentially influence the entire solution. As a result, institutions should meticulously evaluate and review these objectives. Ideally, these objectives should be aligned with the needs and expectations of socioeconomic stakeholders, determining the necessary knowledge and its desired extent, which can potentially lead to the establishment of a shared taxonomy for the entire community, fostering collaboration and coherence in defining and organizing educational objectives. A way to mitigate this difficulty could be for institutions to first propose a catalog of objectives that can then be tailored to the student’s expectations, eventually motivating the student during his/her curriculum.
Additionally, our framework could help institutions to assess the consistency of their current curricula with the professional objectives that are currently reachable by identifying which knowledge is missing or lacking. This could eventually give new education quality indicators.
Inequitable Outcomes
Our experimental results show that some objectives seem harder to satisfy than others. This finding could have important consequences in an educational context. It suggests that, depending on their objective (or on how objectives are described by an institution), some students could have either an advantage or a disadvantage in obtaining their degree compared to their peers. These differences can manifest in several ways, such as a shorter curriculum or an easier one. This could potentially introduce ethical and inclusiveness biases that institutions should be aware of. A way to mitigate this disparity could be to monitor, year by year, several relevant indicators for each objective, such as the failure rate of students, and to refine the objectives that need it. In any case, institutions and researchers should further study these biases.
Pooling Considerations for Curricula
Our research outcomes underscore a robust correlation between curricula and the specific catalogs employed by institutions. Consequently, we wish to emphasize the possible risks linked to the dissemination, exploitation and assimilation of curricular material originating from external institutions. This is particularly applicable in scenarios involving program or student exchanges. Misalignment may occur between how institutions define their course and knowledge catalogs (e.g. knowledge not described with the same granularity, different interpretations of the prerequisite scale, semantic ambiguity regarding knowledge). This can potentially lead to an inappropriate curriculum for students participating in an exchange program, and thus degrade their performance or their engagement. This, in turn, can undermine the credibility and reputation of the diploma, as it may no longer represent a consistent level of knowledge. Our observation serves as a compelling incentive for institutions involved in such programs to maintain a constant dialog. One way for institutions to mitigate these risks could be to engage in the elaboration of common catalogs, ultimately promoting interoperability.
Feeding the Student
To increase the qualification of a student for a specific objective, one could intuitively be inclined to feed the student with numerous courses to attend during each academic term. However, our findings suggest that this assumption does not hold. Increasing the number of courses a student has to attend does not improve the quality of the solution and, overall, tends to decrease it. On average, the best solutions were found when the student had to attend ten courses (\(\mathcal {S}=10\)) per academic term, followed by \(\mathcal {S}=17\), which appears to be an interesting value to study further. In addition, increasing the number of courses raises the likelihood of the student facing a cognitive overload, which will eventually be counterproductive. Although the theoretical reasons for such an observation are currently unknown to us, we believe institutions should be aware of it while designing the constraints of their academic terms.
Towards an Equilibrium
From our observations, there seems to exist a state of balance between 1) the number of available courses, 2) the amount of knowledge, and 3) the quantity of knowledge taught by each course and their prerequisites. Although we were not able to define this equilibrium point exactly, we observed that it may be preferable to maintain a course-to-knowledge ratio greater than or equal to one \(\left( \frac{\mathcal {C}}{\mathcal {K}} \ge 1\right) \). Otherwise, the problem seems unbalanced, less tractable, and harder to solve. This may be an indication for teachers that knowledge should not be described in too fine a detail, or that the granularity of knowledge description should be chosen in accordance with the number of courses available.
Providing People with the Right Tools
Our results suggest that there is a turning point beyond which humans need to be assisted in solving FIC/DK-P to make an informed, high-quality decision. As we have shown that the problem quickly becomes intractable (\(\mathcal {S}=3, T=6, \mathcal {K}=10\) and \(\mathcal {C}=12\)), the need for decision-making tools emerges at an early stage of the process. We strongly advise institutions against adopting a large-scale personalized or individualized curricula approach without the support of appropriate tools, as it would present significant challenges for their employees. The fitness metrics we provide could help education stakeholders to evaluate the quality of a solution they come up with, as well as to recommend some curricula and explain why they are potentially good candidates. This could improve both the discussion with students and the identification of critical periods in their curricula. In the meantime, institutions that currently enable students to customize their curricula to a certain extent should be aware of the difficulty of evaluating the quality of these personalized curricula.
Streamlining Knowledge Taxonomy
Usually, institutions have access to a knowledge taxonomy of all their courses through more or less comprehensive syllabuses written by teachers. This taxonomy can be leveraged to produce the catalogs of courses and knowledge needed to solve FIC/DK-P, although additional work may be required to clean the data and establish numerical prerequisites for courses. For institutions that have not yet implemented syllabuses or have incomplete ones, our model can provide a straightforward method to produce relevant syllabuses, as the two catalogs can be used together to produce a CbKST graph Heller et al. (2006), which is a knowledge taxonomy. Then, from this taxonomy, it is straightforward to produce comprehensive institutional syllabuses, as all the relationships between courses have already been expressed. Institutions can thereby also employ our model to assess the integrity of their pre-existing syllabuses and take appropriate measures as needed.
Conclusion and Discussion
Discussion
The experimental results we present are all based on simulated data extrapolated from one real-life dataset, which may have introduced some biases. The first point to which we wish to draw attention is the representativeness of the simulated data, as we cannot guarantee their soundness or accuracy compared to real-life situations: doing so is equivalent to solving FIC/DK-P. Consequently, we could not guarantee that, for a given objective, a solution exists. This may be one of the reasons for the difficulty of maximizing the fitness metric \(\nu _2\) related to the objective of a student. Yet, this is an interesting point to consider, as it clearly illustrates how difficult it is for institutions to certify that the academic outcomes they propose are aligned with their courses.
Another point that seems important to consider is the difficulty of gathering courses and knowledge information in institutions that are not fully engaged in this kind of practice. This requires raising teachers’ awareness regarding how to describe their teachings. An important future step to further study the problem could involve the creation of a comprehensive real-life dataset that is shared within the community. This effort could be facilitated by developing a collaborative procedure for collectively defining a standardized description of the dataset. Benchmarking such a dataset will be worthwhile as it will allow a common comparison point for the entire community that can therefore be deployed in a real-life context, leading to the very first feedback from the education stakeholders. For now, we are aiming to create a data warehouse containing simulated datasets.
Our objective in this paper was not to find the best solution for a given instance of FIC/DK-P, but to prove that near-good solutions can be computed and to investigate FIC/DK-P. As a consequence, we used a mono-objective method in our GA, which may not be, in retrospect, the best suited for solving the problem: our experimental results show that this problem is multi-dimensional. A better approach could be to work on multi-objective optimization and to identify the Pareto front of the problem, which could help to find the best combination. Incidentally, we also question the independence of the parameters of the GA (e.g. mutation, crossover) from the instance of the problem. In any case, we advocate that it should be the education stakeholders who make the decisions, supported by this kind of decision-making tool, and not the other way around.
Opportunities
Having such a framework to compute improved curricula brings interesting opportunities, such as the explainability of the risks of failure of students. Institutions could also implement interactivity in their current academic offers by showing students the effect of replacing a course in a curriculum. Involving students directly in the decision-making process of the customization or individualization of curricula could have a positive effect on their motivation, and could be a way to mitigate the difficulty of the problem. In addition, institutions could assist students who failed an academic year or students who wish to reorient themselves, by computing better-suited curricula; institutions could also compute new curricula on the fly at each academic term for each student to ensure that the initial recommendations are still valid or to correct them if necessary. It could also help students to make informed choices, by indicating the estimated risk of a curriculum according to their objective.
Another promising opportunity lies in the possibility for an institution to self-assess the quality of its academic offer. By computing, for each of its students, a fully individualized curriculum, an institution could identify which courses are undertaken or not correctly located from an academic terms point of view, and thus modify its offer accordingly. Another self-assessment an institution could perform concerns the quality of its academic offers compared to the proposed academic outcomes. By computing curricula with a specific academic outcome as an objective for all its students, an institution could identify if this objective is effectively reachable and, if not, inform students under what conditions this objective should or should not be pursued.
Finally, we envisage adopting a multi-objective approach based on the \(\nu _2\) and \(\nu _4\) metrics. Studying the Pareto front will give a way to quantify the difficulty of a curriculum based on a student’s objective. This could become a powerful decision-making tool, as it could show the risk of a specific curriculum to a student and help him/her to make an informed choice, especially by identifying some challenging aspirations.
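To illustrate what studying a Pareto front over \(\nu _2\) and \(\nu _4\) could look like, the following minimal sketch filters a list of candidate curricula down to the non-dominated ones. It assumes, purely for illustration, that both scores are normalized so that higher is better; the `Candidate` type, the curriculum labels and the numeric values are hypothetical and do not come from our experiments.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str   # hypothetical curriculum label
    nu2: float  # objective-related score (assumed: higher is better)
    nu4: float  # prerequisite-related score (assumed: higher is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both scores and strictly better on one."""
    at_least_as_good = a.nu2 >= b.nu2 and a.nu4 >= b.nu4
    strictly_better = a.nu2 > b.nu2 or a.nu4 > b.nu4
    return at_least_as_good and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Invented values, for illustration only.
candidates = [
    Candidate("A", 0.90, 0.40),
    Candidate("B", 0.70, 0.70),
    Candidate("C", 0.60, 0.60),  # dominated by B on both scores
    Candidate("D", 0.40, 0.95),
]
front = pareto_front(candidates)
front_names = [c.name for c in front]
```

Each curriculum remaining in `front` represents a different trade-off between the two metrics, which is the kind of information a stakeholder could discuss with a student.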
Conclusion
In this paper, we recognize the impact of knowledge decay over time and incorporate it as a fundamental aspect of the problem of finding a sequence of courses (i.e. a curriculum) that can qualify a given student for their objective. This is typically not considered in curriculum design, which leads to a new problem we called the Fully Individualized Curriculum with Decaying Knowledge Problem (FIC/DK-P). The principal objective of our work was to investigate and study this new problem. For that, we initially proposed a modeling of FIC/DK-P, which we formalize further in the last section of this paper. We conducted two experiments to study the problem: one using an exact solving method and one using a genetic algorithm, a meta-heuristic. The experimental results show that the problem is very difficult to solve and is not suited to being solved manually by education stakeholders (we prove in Appendix A that the problem is NP-Complete).
Our objective during these experiments was not to find the best solution given a student’s profile and his/her objective. Rather, we wanted to prove that it is possible to increase the quality of a given curriculum up to a near-good solution, despite the problem’s high complexity, to encourage future optimization works. Our objective was also to investigate FIC/DK-P and produce the first prescriptive recommendations for the education stakeholders who are currently considering implementing individualization or who have already done so. We were able to identify nine recommendations. We have also illustrated the effects that our proposals could have on the different stakeholders.
In its current state, our framework can be useful in institutions implementing partial or fully individualized curriculum design for monitoring a student’s progress and risks in his/her curriculum. This can be achieved by updating the profile of the students after each assessment. Our approach additionally encourages the adoption of dynamic curricula, allowing students the flexibility to modify their planned course of study in real time. This can occur due to different factors such as a change in personal objective or a failure in completing a particular academic term – which is a common occurrence considering the average time-to-degree for a Bachelor’s degree in Europe is approximately 3.5 years Vossensteyn et al. (2015).
Our future work involves implementing state-of-the-art solving strategies from the literature, such as meta-heuristic hybrids, in the context of FIC/DK-P and assessing their effectiveness through experiments and benchmarks. Because of its importance, we hope that the community will continue to explore this problem, which could lead to the identification of new efficient and innovative techniques and algorithms to better individualize students’ journeys, thus contributing to the United Nations’ sustainable development goals for quality education.
Notes
Therefore, in the rest of the paper, we will use the term “knowledge” as an encompassing term of all the aforementioned ones, for the sake of readability.
It is absolutely possible to modify this accumulation behavior by defining a mastery evolution function to express how a mastery should evolve.
Notations are detailed in Appendix A.
However, we recall that the model is designed to assign a different decay function for each knowledge if needed.
However, it gives additional information to the Prolog engine to automatically eliminate unfeasible solutions. That is why Solver 2 took less logical inferences (LI) than Solver 1.
Note that \(1 \text { s} \approx 9 \times 10^{5} \text { LI}\)
Please remember that, in our context, curriculum needs to be computed reasonably fast as it should be computed for each student of an institution.
References
Acampora, G., Gaeta, M., & Loia, V. (2011). Hierarchical optimization of personalized experiences for e-learning systems through evolutionary models. Neural Computing and Applications,20(5), 641–657
Aleven, V., McLaughlin, E. A., Glenn, R. A., & Koedinger, K. R. (2016). Instruction based on adaptive learning technologies. Handbook of Research on Learning and Instruction, 2, 522–560.
Al-Muhaideb, S., & Menai, M. E. B. (2011). Evolutionary computation approaches to the curriculum sequencing problem. Natural Computing,10(2), 891–920
Anderson, J.R., & Schunn, C.D. (2013). Implications of the act-r learning theory: No magic bullets. In: Advances in Instructional Psychology, pp. 1–33. Routledge
Arthur, W., Bennett, W., Stanush, P. L., & McNelly, T. L. (1998). Factors that influence skill decay and retention: A quantitative review and analysis. Human Performance,11(1), 57–101
Averell, L., & Heathcote, A. (2011). The form of the forgetting curve and the fate of memories. Journal of Mathematical Psychology,55(1), 25–35. https://doi.org/10.1016/j.jmp.2010.08.009. Special Issue on Hierarchical Bayesian Models
Backenköhler, M., Scherzinger, F., Singla, A., & Wolf, V. (2018). Data-driven approach towards a personalized curriculum. International Educational Data Mining Society
Bacon, D. R., & Stewart, K. A. (2006). How fast do students forget what they learn in consumer behavior? a longitudinal study. Journal of Marketing Education, 28(3), 181–192.
Bada, S. O., & Olusegun, S. (2015). Constructivism learning theory: A paradigm for teaching and learning. Journal of Research & Method in Education, 5(6), 66–70.
Bahrick, H.P. (2000). Long-term maintenance of knowledge.
Belacel, N., Durand, G., & Laplante, F. (2014). A binary integer programming model for global optimization of learning path discovery. In: EDM (Workshops). Citeseer
Benmesbah, O., Lamia, M., & Hafidi, M. (2021). An enhanced genetic algorithm for solving learning path adaptation problem. Education and Information Technologies, 1–32
Bloom, B.S., et al. (1956). Taxonomy of educational objectives. vol. 1: Cognitive domain. New York: McKay 20, 24
Boland, N., Bley, A., Fricke, C., Froyland, G., & Sotirov, R. (2012). Clique-based facets for the precedence constrained knapsack problem. Mathematical Programming,133(1–2), 481–511
Brewer, G. A., & Unsworth, N. (2012). Individual differences in the effects of retrieval from long-term memory. Journal of Memory and Language, 66(3), 407–415. https://doi.org/10.1016/j.jml.2011.12.009
Cahon, S., Melab, N., & Talbi, E.-G. (2004). Paradiseo: A framework for the reusable design of parallel and distributed metaheuristics. Journal of Heuristics,10(3), 357–380
Caputi, V., & Garrido, A. (2015). Student-oriented planning of e-learning contents for moodle. Journal of Network and Computer Applications, 53, 115–127. https://doi.org/10.1016/j.jnca.2015.04.001
Chen, C.-M. (2009). Ontology-based concept map for planning a personalised learning path. British Journal of Educational Technology,40(6), 1028–1058
Choffin, B., Popineau, F., Bourda, Y., & Vie, J.-J. (2019). Das3h: modeling student learning and forgetting for optimally scheduling distributed practice of skills. arXiv:1905.06873
Commission des Titres d’Ingénieurs (2023). Références et orientations. https://www.cti-commission.fr/wp-content/uploads/2023/03/RO_Referentiel_2023_VF2023-03-16.pdf. Accessed 24 April 2023
da Silva Lopes, R., & Fernandes, M.A. (2009). Adaptative instructional planning using workflow and genetic algorithms. In: 2009 Eighth IEEE/ACIS International Conference on Computer and Information Science, pp. 87–92. IEEE
Daniela, L., Visvizi, A., Gutiérrez-Braojos, C., & Lytras, M. D. (2018). Sustainable higher education and technology-enhanced learning (tel). Sustainability,10(11), 3883
de-Marcos, L., Barchino, R., Martínez, J.-J., Gutiérrez, J.-A., & Hilera, J.- R. (2008). Competency-based intelligent curriculum sequencing: comparing two evolutionary approaches. In: 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, vol. 3, pp. 339–342. IEEE
De-Marcos, L., Martínez, J.J., Gutiérrez, J.A., Barchino, R., & Gutiérrez, J.M. (2009). A new sequencing method in web-based education. In: 2009 IEEE Congress on Evolutionary Computation, pp. 3219–3225. IEEE
Deng, Y., Huang, D., & Chung, C.-J. (2017). Thoth lab: A personalized learning framework for cs hands-on projects (abstract only). In: Proceedings of the 2017 ACM SIGCSE Technical Symposium on Computer Science Education. SIGCSE ’17, p. 706. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3017680.3022442.
Desmarais, M. C., & Baker, R. S. (2012). A review of recent advances in learner and skill modeling in intelligent learning environments. User Modeling and User-Adapted Interaction, 22(1–2), 9–38.
Durand, G., Laplante, F., & Kop, R. (2011). A learning design recommendation system based on markov decision processes. In: KDD 2011 Workshop: Knowledge Discovery in Educational Data, ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2011) in San Diego, CA
Durand, G., Belacel, N., & LaPlante, F. (2013). Graph theory based model for learning path recommendation. Information Sciences, 251, 10–21.
Duval, E., & Hodgins, W. (2003). A lom research agenda. In: WWW (Alternate Paper Tracks). Citeseer
Ebbinghaus, H. (2013). Memory: A contribution to experimental psychology. Annals of Neurosciences,20(4), 155
Eiben, A.E., Smith, J.E., et al. (2003). Introduction to Evolutionary Computing vol. 53. Springer
Feng, X., Xie, H., Peng, Y., Chen, W., & Sun, H. (2011). Groupized learning path discovery based on member profile. In: New horizons in web-based learning-ICWL 2010 workshops: ICWL 2010 workshops: STEG, CICW, WGLBWS, and IWKDEWL, Shanghai, China, December 7-11, 2010 Revised Selected Papers 9, pp. 301–310. Springer
Garey, M.R., & Johnson, D.S. (1979). Computers and Intractability vol. 174. freeman San Francisco
Georghiades, P. (2000). Beyond conceptual change learning in science education: Focusing on transfer, durability and metacognition. Educational Research, 42(2), 119–139.
Govindarajan, K., Kumar, V.S., et al. (2016). Dynamic learning path prediction–a learning analytics solution. In: 2016 IEEE Eighth International Conference on Technology for Education (T4E), pp. 188–193. IEEE
Heller, J., Steiner, C., Hockemeyer, C., & Albert, D. (2006). Competence-based knowledge structures for personalised learning. International Journal on E-learning,5(1), 75–88
Herrero, I., & Algarrada, I. (2010). Is the new ects system better than the traditional one? an application to the ects pilot-project at the university pablo de olavide. European Journal of Operational Research, 204(1), 164–172.
Hill, H. C., Rowan, B., & Ball, D. L. (2005). Effects of teachers’ mathematical knowledge for teaching on student achievement. American Educational Research Journal, 42(2), 371–406.
Hovakimyan, A., Sargsyan, S., & Barkhoudaryan, S. (2004). Genetic algorithm and the problem of getting knowledge in e-learning systems. In: IEEE International Conference on Advanced Learning Technologies, 2004. Proceedings., pp. 336–339. IEEE
Howe, M.J. (1980). The Psychology of Human Learning. New York: Harper & Row
Huang, Z., Liu, Q., Chen, Y., Wu, L., Xiao, K., Chen, E., Ma, H., & Hu, G. (2020). Learning or forgetting? a dynamic approach for tracking the knowledge proficiency of students. ACM Transactions on Information Systems (TOIS), 38(2), 1–33.
Kardan, A. A., Ebrahim, M. A., & Imani, M. B. (2014). A new personalized learning path generation method: Aco-map. Indian Journal of Scientific Research,5(1), 17–24
Klammer, A., & Gueldenberg, S. (2019). Unlearning and forgetting in organizations: a systematic review of literature. Journal of Knowledge Management, 23(5), 860–888.
Klinkenberg, S., Straatemeier, M., & van der Maas, H. L. (2011). Computer adaptive practice of maths ability using a new item response model for on the fly ability and difficulty estimation. Computers & Education, 57(2), 1813–1824.
Kolliopoulos, S. G., & Steiner, G. (2007). Partially ordered knapsack and applications to scheduling. Discrete Applied Mathematics,155(8), 889–897
Kryukov, V., & Gorin, A. (2017). Digital technologies as education innovation at universities. Australian Educational Computing,32(1), 1
Li, Z., Papaemmanouil, O., & Koutrika, G. (2016). Coursenavigator: interactive learning path exploration. In: Proceedings of the Third International Workshop on Exploratory Search in Databases and the Web, pp. 6–11
Lin, C. F., Yeh, Y.-C., Hung, Y. H., & Chang, R. I. (2013). Data mining for providing a personalized learning path in creativity: An application of decision trees. Computers & Education,68, 199–210
Lindsey, R. V., Shroyer, J. D., Pashler, H., & Mozer, M. C. (2014). Improving students’ long-term knowledge retention through personalized review. Psychological Science, 25(3), 639–647.
Loo, E. H., Goh, T. N., & Ong, H. L. (1986). A heuristic approach to scheduling university timetables. Computers & Education,10(3), 379–388
Mandin, S., & Guin, N. (2014). Basing learner modelling on an ontology of knowledge and skills. In: Sampson, D.G., Spector, J.M., Chen, N.-S., Huang, R., Kinshuk (eds.) IEEE International Conference on Advanced Learning Technologies, pp. 321–323. IEEE Computer Society, Athènes, Greece
Miller, B. L., Goldberg, D. E., et al. (1995). Genetic algorithms, tournament selection, and the effects of noise. Complex Systems,9(3), 193–212
Molontay, R., Horváth, N., Bergmann, J., Szekrényes, D.L., & Szabó, M. (2020). Characterizing curriculum prerequisite networks by a student flow approach. IEEE Transactions on Learning Technologies
Mozer, M.C., & Lindsey, R.V. (2016). Predicting and improving memory retention. Big Data in Cognitive Science 34
Murre, J. M., & Dros, J. (2015). Replication and analysis of ebbinghaus’ forgetting curve. PloS One, 10(7), 0120644.
Nabizadeh, A.H., Mário Jorge, A., & Paulo Leal, J. (2017). Rutico: Recommending successful learning paths under time constraints. In: Adjunct Publication of the 25th Conference on User Modeling, Adaptation and Personalization, pp. 153–158
Nabizadeh, A. H., Leal, J. P., Rafsanjani, H. N., & Shah, R. R. (2020). Learning path personalization and recommendation methods: A survey of the state-of the-art. Expert Systems with Applications, 159, 113596.
Papousek, J., Pelánek, R., & Stanislav, V. (2014). Adaptive practice of facts in domains with varied prior knowledge. In: Educational Data Mining 2014
Parameswaran, A., Venetis, P., & Garcia-Molina, H. (2011). Recommendation systems with complex constraints: A course recommendation perspective. ACM Transactions on Information Systems (TOIS), 29(4), 1–33.
Salahli, M. A., Özdemir, M., & Yasar, C. (2013). Concept based approach for adaptive personalized course learning system. International Education Studies, 6(5), 92–103.
Samia, A., & Mostafa, B. (2007). Re-use of resources for adapted formation to the learner. In: 2007 International Symposium on Computational Intelligence and Intelligent Informatics, pp. 213–217. IEEE
Seki, K., Matsui, T., & Okamoto, T. (2005). An adaptive sequencing method of the learning objects for the e-learning environment. Electronics and Communications in Japan (Part III: Fundamental Electronic Science) 88(3), 54–71
Settles, B., & Meeder, B. (2016). A trainable spaced repetition model for language learning. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1848–1858. Association for Computational Linguistics, Berlin, Germany. https://doi.org/10.18653/v1/P16-1174. https://www.aclweb.org/anthology/P16-1174
Sipser, M. (1996). Introduction to the theory of computation. ACM Sigact News,27(1), 27–29
Sörensen, K. (2015). Metaheuristics-the metaphor exposed. International Transactions in Operational Research,22(1), 3–18
Tetzlaff, L., Schmiedek, F., & Brod, G. (2021). Developing personalized education: A dynamic framework. Educational Psychology Review, 33, 863–882.
University of Reading (2019). The modular system at Reading explained. https://student.reading.ac.uk/essentials/_study/course-and-departments/the-modular-system-at-reading-explained.aspx. Accessed 19 Jan 2020
van der Linden, W.J., & Hambleton, R.K. (2013). Handbook of Modern Item Response Theory. Springer
Vanitha, V., & Krishnan, P. (2019). A modified ant colony algorithm for personalized learning path construction. Journal of Intelligent & Fuzzy Systems, 37(5), 6785–6800.
Vossensteyn, J.J., Kottmann, A., Jongbloed, B.W., Kaiser, F., Cremonini, L., Stensaker, B., Hovdhaugen, E., & Wollscheid, S. (2015). Dropout and completion in higher education in europe: Main report
Walkington, C., & Bernacki, M.L. (2014). Motivating students by “personalizing” learning around individual interests: A consideration of theory, design, and implementation issues. In: Motivational interventions vol. 18, pp. 139–176. Emerald Group Publishing Limited
Wong, C. (2018). Sequence based course recommender for personalized curriculum planning. In: Int. Conference on A.I. in Education, pp. 531–534. Springer
Zhang, Y., & Koren, J. (2007). Efficient bayesian hierarchical user modeling for recommendation system. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 47–54
Funding
This research was funded by APACHES, an I-SITE ULNE project, grant number FIPE18-007-VERMEULEN.
Author information
Contributions
Alexis Lebis is the main author of the research, participating in all stages: bibliographic survey, definition of the problem, elaboration of the case study structure, definition/elaboration of the mathematical model, implementation of the algorithms, theoretical/conceptual analysis, writing of the complexity proof, writing of the manuscript. Jérémie Humeau contributed to the definition of the problem, GA implementation, theoretical/conceptual analysis and mathematical modeling. Anthony Fleury contributed to data acquisition and elaboration of the case study structure and manuscript writing. Flavien Lucas contributed to mathematical soundness/accuracy, figure elaboration and manuscript writing. Mathieu Vermeulen contributed to the figures elaboration, bibliographic survey and conceptual analysis. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing Interest
The authors have no relevant financial or non-financial interests to disclose.
Appendices
Appendix A: Mathematical Foundation
In this section, we lay the mathematical foundation of FIC/DK-P for the sake of reproducibility. We also prove that FIC/DK-P is NP-Complete.
A.1 Notation and glossary
- \(\mathcal {C}\) : a course catalog;
- \(\mathcal {K}\) : a catalog of knowledge;
- \(\alpha _{c,k} \in [0;1]\) : the mastery of a knowledge k that a course c brings to a student;
- \(\tau _{c,k} \in [0;1]\) : the mastery of a knowledge k that a course c requires to be attended without risk (a prerequisite according to our definition);
- \(\tau _{o,k} \in [0;1]\) : the mastery of a knowledge k required by the student objective;
- \(\mathcal {O}=[\tau _{o,k_1}, \dots , \tau _{o,k_{\vert \mathcal {K} \vert }}]\) : a vector representing the state of all the knowledge mastery \(\tau _{o,k}\) expected in order to be qualified for the student objective;
- \(\epsilon _c\) : the amount of credit a course c gives;
- E : a specific threshold of credit required for a student to graduate;
- t : an academic term;
- T : the overall number of academic terms of a curriculum;
- \(\Delta _t\) : the difference between two academic terms t and \(t+1\), with \(t+1 > t\);
- \(\mathcal {S}_t\) : the number of courses c a student must attend in an academic term t;
- \(\lambda =[c_i,\dots ,c_j]\) : a vector of size \(\sum _{t=1}^{T}\mathcal {S}_t\) representing all the courses c of the curriculum planned for a student;
- \(m_{k,t} \in [0;1]\) : the mastery of a knowledge k of the student at a specific academic term t;
- \(P_t=[m_{k_1,t},\dots ,m_{k_{\vert \mathcal {K} \vert },t}]\) : a vector representing the state of all the knowledge mastery \(m_{k,t}\) of a student at a specific academic term t, namely the student profile;
- \(\delta _k(\Delta _t)\) : the value of the decay a knowledge k faces over a period \(\Delta _t\) if it is not mobilized;
- f : the fitness function of an individual (see Section 4.3);
- \(\nu _1\) : a metric used in the fitness function f describing the difference of credits between the amount expected and the one obtained;
- \(\nu _2\) : a metric used in the fitness function f describing the difference of mastery between the requisites from the student objective and his/her profile at the end of the curriculum;
- \(\nu _3\) : a metric used in the fitness function f describing the quantity of repetition of each course in the curriculum;
- \(\nu _4\) : a metric used in the fitness function f describing the difference of mastery between the prerequisites of each course and the mastery of a student at the moment he/she attends it. It can be interpreted as the risk of a student failing his/her curriculum or the curriculum difficulty.
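As a reading aid for the notation above, the following sketch shows one way a course, a student profile \(P_t\) and a single term-to-term mastery update could be represented. It is not the model's actual definition: the exponential form of `decay` (standing in for \(\delta _k\)), its `rate` parameter, and the additive accumulation capped at 1 are all assumptions made for illustration, and the course and knowledge names are invented.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Course:
    name: str
    credits: float                                                  # epsilon_c
    gains: dict[str, float] = field(default_factory=dict)          # alpha_{c,k}
    prerequisites: dict[str, float] = field(default_factory=dict)  # tau_{c,k}


def decay(mastery: float, delta_t: float, rate: float = 0.1) -> float:
    """Hypothetical delta_k: exponential forgetting of an unused knowledge."""
    return mastery * math.exp(-rate * delta_t)


def next_profile(profile: dict[str, float], attended: list[Course],
                 delta_t: float = 1.0) -> dict[str, float]:
    """Sketch of P_{t + delta_t} given P_t and the courses attended at term t."""
    mobilized = {k for course in attended for k in course.gains}
    updated = {}
    for k in set(profile) | mobilized:
        current = profile.get(k, 0.0)
        if k in mobilized:
            # Assumed accumulation rule: add the gains, capped at full mastery.
            gain = sum(course.gains.get(k, 0.0) for course in attended)
            updated[k] = min(1.0, current + gain)
        else:
            # Knowledge not mobilized during the term decays.
            updated[k] = decay(current, delta_t)
    return updated


profile = {"algebra": 0.5, "logic": 0.8}
linear_algebra = Course("Linear algebra", credits=5,
                        gains={"algebra": 0.3}, prerequisites={"algebra": 0.2})
profile_next = next_profile(profile, [linear_algebra])
```

In this example, mastery of "algebra" increases because a course mobilizes it, while mastery of "logic" decreases because nothing mobilizes it during the term.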
A.2 Equations
The mastery of a knowledge k at an academic term \(t+\Delta _t\) of a student is given by the following iterative definition:
As a reminder, the fitness function is defined in (2) in page 17 as:
The first metric \(\nu _1\) used for the fitness function is defined as:
The second metric \(\nu _2\) used for the fitness function is defined as:
The third metric \(\nu _3\) used for the fitness function is defined as:
with \(\mathcal {N}\) corresponding to the number of unique courses selected.
The fourth metric \(\nu _4\) used for the fitness function is defined as:
A.3 Complexity Study of FIC/DK-P
The experimental results presented in Section 4 highlighted that FIC/DK-P is a hard combinatorial problem. We were not able to scale up to real-case scenarios using an exact solving method. At most, an answer could be computed for three courses per semester with a maximum of seven courses available for each period. Consequently, we studied the complexity of FIC/DK-P to better understand this problem and for future reference.
Let \(\alpha \) be a complete assignment. Since we cannot know whether \(\alpha \) is optimal with respect to \(f(\alpha )\) until all the assignments have been visited, one can say that our individualized curriculum problem I is at least strongly NP Sipser (1996). Here, we claim that optimizing our problem is nondeterministic polynomial-time complete (NP-Complete).
Proof
Let us consider the Partially Ordered Knapsack Problem (POK) Kolliopoulos and Steiner (2007); Boland et al. (2012). The knapsack problem consists of finding an assignment of items i from a set of items \(\mathcal {N}\) that maximizes an overall usefulness score, while the total weight of the items does not exceed the knapsack capacity \(b \in \mathbb {Z}^{+}\). Each item \(i \in \mathcal {N}\) has a value \(v_{i} \in \mathbb {Z}\) and a weight \(w_{i} \in \mathbb {Z}^{+}\). In addition, POK introduces a partially ordered set (poset), which is a set of precedence relationships on items, denoted \(\mathcal {S} \subseteq (\mathcal {N} \times \mathcal {N}, \prec _{p})\), where a precedence relationship \((i,j) \in \mathcal {S}\) holds iff i can be placed in the knapsack only if j is already in the knapsack, i.e. \(i \prec _{p} j\). This precedence constraint can be represented by a directed graph \(G=(\mathcal {N},\mathcal {S})\), where each precedence constraint in \(\mathcal {S}\) is represented by a directed arc between items in \(\mathcal {N}\). Garey and Johnson (1979); Kolliopoulos and Steiner (2007) showed that this kind of problem is strongly NP-Complete. A POK formulation can be as follows. Let
Then the POK may be written as:
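A standard integer-programming statement of POK, consistent with the definitions above, can be sketched as follows. The binary decision variable \(x_i\) (equal to 1 if item i is placed in the knapsack and 0 otherwise) is notation assumed here, and the display is given without equation numbers since it is only an illustrative reading of the textual definition.

\[
\begin{aligned}
\max \quad & \sum _{i \in \mathcal {N}} v_{i}\, x_{i} \\
\text {s.t.} \quad & \sum _{i \in \mathcal {N}} w_{i}\, x_{i} \le b, \\
& x_{i} \le x_{j} \quad \forall (i,j) \in \mathcal {S}, \\
& x_{i} \in \{0,1\} \quad \forall i \in \mathcal {N}.
\end{aligned}
\]

The constraint \(x_{i} \le x_{j}\) encodes the precedence relationship: item i can only be selected when item j is selected as well.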
Let us also consider a sub-part of the FIC/DK-P problem, such that \(I' \subset I\). Here, \(I'\) is an idealized representation of I, where there is no decay over time t regarding the learned competencies k of a student, such that \(m_{k,t+\Delta _{t}} = m_{k,t}\), and where once a knowledge is taught to the student, it is fully assimilated, such that \(\forall k, c, \alpha _{c,k} = 1\). Besides, we do not consider any credit system, therefore we can write \(\forall c, \epsilon _{c} = \infty \) as \(\epsilon \) acts as a constraint in FIC/DK-P. Consequently, the dependencies between courses can be expressed using logical operators \(L = \langle \wedge ,\vee \rangle \) to form an and-or directed graph \(G=(\mathcal {N},S)\) with \(\mathcal {N} = \mathcal {C}+1\) nodes, where \(e: S\rightarrow L\) is a labeling function and the \(\mathcal {C}+1\) node is the career goal \(\mathcal {O}\). We can decompose G into n graphs \(G'\), given that each of these graphs is an instance of the problem where precedence relationships are no longer conditional, i.e. each \(\langle \vee \rangle \) choice has been decided, and \(G'_{i} \ne G'_{j}, i,j\in n\). These precedence relationships on the items are expressible as a set of posets, denoted \(\mathfrak {S} = \left\langle (\mathcal {N}\times \mathcal {N}, \prec _{p1}),\ldots ,(\mathcal {N}\times \mathcal {N},\prec _{pn}) \right\rangle \), each poset \((\mathcal {N} \times \mathcal {N}, \prec _{pi})\) being associated with \(G'_{i}, i\in n\). It is clear that an optimal assignment for \(I'\) entails finding the best value (cf. (A.6)) among the subgraphs \(G'\), and can be written as follows:
Here, (A.11) represents the global time constraint (i.e. the fact that a student cannot take more courses than they can physically attend). Now, supposing that \(n=1\), (A.10) and (A.12) respectively collapse to (A.6) and (A.8). Therefore, POK is a specific case of \(I'\) where \(n=1\). Since the decomposition of G into the \(G'\) graphs is done in polynomial time, one can say that \(I'\) is NP-Complete.
Thereafter, we show that the full individualized curriculum problem I is at least as hard as \(I'\). By reintroducing the decay \(\delta \) in the formulation of our problem, (A.10) can be written as:
making the search for an optimal assignment asymptotically equivalent if enough courses are visited, or harder otherwise. Incidentally, the transitivity of the partially ordered set is maintained, as \(\forall (x,y) \in (\mathcal {N}\times \mathcal {N}, \prec )\), we either have xRy or yRx, where \(R = \delta _{k}(t), \forall k\in x \cup y\). Reintroducing credits in the problem formulation means that enough credits must be obtained at each academic term; this could potentially reduce the course candidates in each academic term by a constant factor, but it does not change the combinatorial nature of the problem.
Consequently, I is at least as hard as \(I'\), meaning that the Fully Individualized Curriculum with Decaying Knowledge Problem is NP-Complete. \(\square \)
Please note that several works, such as Acampora et al. (2011); Al-Muhaideb and Menai (2011), have claimed that arranging and delivering e-Learning materials to a specific learner is an NP-hard problem. Taking the decay into account thus makes the problem harder. This is an additional argument for giving education stakeholders appropriate tools.
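To make the Partially Ordered Knapsack Problem used in the proof more concrete, the sketch below solves a tiny instance by exhaustive enumeration. It is an illustration only, unrelated to the solvers used in our experiments; the function name `solve_pok` and the instance values are hypothetical. The enumeration visits all \(2^{\vert \mathcal {N} \vert }\) subsets, which is exactly why this approach does not scale.

```python
from itertools import combinations


def solve_pok(values, weights, capacity, precedences):
    """Exhaustively solve a tiny POK instance.

    precedences is a set of pairs (i, j) meaning that item i may be packed
    only if item j is packed as well.
    Returns (best total value, frozenset of chosen item indices).
    """
    items = range(len(values))
    best_value, best_set = 0, frozenset()  # the empty knapsack is always feasible
    for size in range(len(values) + 1):
        for subset in combinations(items, size):
            chosen = frozenset(subset)
            # Capacity constraint.
            if sum(weights[i] for i in chosen) > capacity:
                continue
            # Precedence constraints.
            if any(i in chosen and j not in chosen for i, j in precedences):
                continue
            value = sum(values[i] for i in chosen)
            if value > best_value:
                best_value, best_set = value, chosen
    return best_value, best_set


# Invented instance: item 0 requires item 1.
values = [10, 6, 4]
weights = [4, 3, 2]
with_order = solve_pok(values, weights, capacity=6, precedences={(0, 1)})
without_order = solve_pok(values, weights, capacity=6, precedences=set())
```

Comparing `with_order` and `without_order` shows how a single precedence relationship can change the optimal selection, in the same way that a prerequisite changes which courses are worth taking.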
Appendix B: Algorithmic Details and Methods Implementation
B.1 Pseudo-code
This is a depth-first search (DFS) based algorithm, where each vertex of G has all the other non-taken courses as neighbors. Note that, to optimize the search time, one can build G such that its depth equals \(\sum _{t=1}^{T}\mathcal {S}_t\) and that, at depth i, only the available courses can be taken.
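The exploration described above can be sketched in Python as follows. This is not the Prolog implementation referenced in B.2.1: the function `dfs_curricula`, its parameters and the `is_solution` predicate are hypothetical names introduced for illustration. The search depth equals the total number of course slots, and at each depth only the courses offered in the corresponding term and not yet taken are explored.

```python
def dfs_curricula(slots_per_term, available, is_solution):
    """Yield every curriculum (tuple of courses) accepted by is_solution.

    slots_per_term: list giving S_t for each academic term t
    available: list of sets, available[t] being the courses offered at term t
    is_solution: hypothetical predicate deciding whether a full curriculum is kept
    """
    # Map each depth (slot index) to its academic term.
    term_of_slot = [t for t, slots in enumerate(slots_per_term) for _ in range(slots)]
    max_depth = len(term_of_slot)  # equals the sum of all S_t

    def explore(path):
        if len(path) == max_depth:
            if is_solution(path):
                yield tuple(path)
            return
        term = term_of_slot[len(path)]
        # Slot order inside a term is not canonicalized, so permutations
        # of the same courses within a term are visited separately.
        for course in sorted(available[term]):
            if course in path:  # a course already taken is not a neighbor
                continue
            path.append(course)
            yield from explore(path)
            path.pop()          # backtrack

    yield from explore([])


# Invented two-term example with one course per term and no filtering.
found = list(dfs_curricula(
    slots_per_term=[1, 1],
    available=[{"a", "b"}, {"b", "c"}],
    is_solution=lambda path: True,
))
```

Because the generator backtracks after each branch, memory use stays proportional to the depth of the search, while the number of visited paths still grows combinatorially with the catalog size.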
This heuristic is employed alongside Algorithm 1 to guide the course exploration process, ensuring that the student’s mastery of the learning objective remains equal to or above the expected value \(\tau _{o,k}\). It is important to note that this heuristic does not restrict the search space. However, if the heuristic results in \(b=\emptyset \), the specific behavior for selecting the next node to explore is not defined and is left to the discretion of the implementation. In our case, it is contingent on the Prolog engine. Finally, one can customize Algorithm 1 to prune specific branches that are sure to lead to non-solutions (e.g. not enough courses that can improve the mastery of k above the \(\tau _{o,k}\) threshold).
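One possible reading of such a guiding heuristic is sketched below, under explicit assumptions: candidate courses that improve a knowledge whose mastery is still below its objective threshold \(\tau _{o,k}\) are explored first, and the remaining courses keep their original order. Since the function only reorders candidates, it does not restrict the search space. The name `order_candidates`, the dictionary-based data layout and the fallback used when no course is helpful are illustrative choices, not the behavior of our Prolog implementation.

```python
def order_candidates(candidates, gains, profile, objective):
    """Reorder candidate courses so that helpful ones are explored first.

    candidates: list of course names in their default exploration order
    gains: gains[c][k] is the mastery of knowledge k brought by course c
    profile: current mastery of each knowledge
    objective: objective[k] is the threshold tau_{o,k}
    """
    # Knowledge whose mastery is still below the objective threshold.
    lacking = {k for k, tau in objective.items() if profile.get(k, 0.0) < tau}
    helpful = [c for c in candidates
               if any(gains[c].get(k, 0.0) > 0 for k in lacking)]
    # Fallback (an implementation choice): when no course is helpful,
    # the default order is returned unchanged.
    others = [c for c in candidates if c not in helpful]
    return helpful + others


# Invented data, for illustration only.
gains = {"algo": {"graphs": 0.5}, "history": {}, "stats": {"probability": 0.4}}
profile = {"graphs": 0.9, "probability": 0.1}
objective = {"graphs": 0.6, "probability": 0.7}
ordered = order_candidates(["algo", "history", "stats"], gains, profile, objective)
```

Here only "probability" is below its threshold, so the course that improves it is moved to the front while no course is discarded.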
B.2 Source Codes
B.2.1 Source Code of the Exact Method
For the actual implementation in Prolog of the exact solving method presented in subsection 4.2, please refer to: https://archive.softwareheritage.org/swh:1:dir:1eba2c7a6dbee16774a59a0988e6101f3e99851f.
B.2.2 Source Code of the Meta-Heuristic Method
For the actual implementation of the meta-heuristic method presented in subsection 4.3, please refer to: https://archive.softwareheritage.org/swh:1:dir:a4dfe03c88df199b431b3482920e6aa70e43f8ef
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lebis, A., Humeau, J., Fleury, A. et al. Fully Individualized Curriculum with Decaying Knowledge, a New Hard Problem: Investigation and Recommendations. Int J Artif Intell Educ (2023). https://doi.org/10.1007/s40593-023-00376-9
DOI: https://doi.org/10.1007/s40593-023-00376-9