The Flat peer learning model (hereinafter \({\mathcal {F}}PL\)) describes the interactions of a population of interconnected agent-learners. Each agent features an internal state that is updated via two stochastic processes: independent learning and peer learning.
Specifics of the model
The model supposes that agents are: (i) autonomous units, free to interact with other agents; (ii) reactive with a form of memory; (iii) heterogeneous regarding their state; (iv) interdependent as they influence others in response to the influence that they receive [31].
Patterns
The learners are represented by a set of M agents; each one occupies a node in a peer network (hereinafter Pn). The learners' positions in the network define a neighbourhood between individuals; for each learner, this is the set of other learners who can possibly help him or whom he can help. The potential influence of a learner is the number of connections he has with others; this number is given by the node degree in Pn. The agents are indexed by integers, so that \(a_i\) is agent number i, \(N_i\) his neighbourhood and \(|N_i|\) his degree.
In the following we will successively consider two patterns for the \({\mathcal {F}}PL\) model, a regular lattice and a scale-free structure; the first models a homogeneous spatial zone such as a classroom, while the second rather represents a relational network.
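As an illustration of these two patterns, the sketch below builds a regular lattice and a scale-free network with the networkx library; the lattice dimensions, the torus (degree-4) neighbourhood and the attachment parameter are assumptions for the example, not taken from the model specification.

```python
# Illustrative sketch (not the authors' NetLogo setup): two candidate peer
# networks Pn -- a regular lattice and a scale-free structure.
import networkx as nx

M = 1024                                    # number of agent-learners (as used later in the text)

# Regular lattice: a 32 x 32 torus, so every node has the same degree (4).
lattice = nx.grid_2d_graph(32, 32, periodic=True)

# Scale-free structure: Barabasi-Albert preferential attachment (parameter 2 is hypothetical).
scale_free = nx.barabasi_albert_graph(M, 2)

for name, G in [("lattice", lattice), ("scale-free", scale_free)]:
    degrees = [d for _, d in G.degree()]
    print(name, "nodes:", G.number_of_nodes(),
          "min/max degree:", min(degrees), max(degrees))
```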
Discrete states
The current state of agent \(a_i\) is his skill level, or knowledge, with respect to an arbitrary task; it is denoted by \(\mathrm{state}_i\). The model allows agents to progress through a fixed number of levels. The set of state values \(\Sigma =\{\)level\(_1,\ldots ,\)level\(_L\}\) is a finite, totally ordered set whose order is induced by the successor function \(\mathrm{succ}\), defined by \(\mathrm{succ}(\)level\(_j)=\) level\(_{j+1}\) for all \(1\le j < L\) and \(\mathrm{succ}(\)level\(_L)=\) level\(_L\). This defines a total order on the states by:
\(\mathrm{state}_i \prec \mathrm{state}_j\) iff \(\exists n \in \mathbb {N^*}\) with \(\mathrm{state}_j\) = \(\mathrm{succ}^n\)(\( \mathrm{state}_i\))
where \(\mathrm{succ}^n\) denotes the n-fold iterate of \(\mathrm{succ}\).
Process overview
At each time step t, each agent \(a_i\) updates his current state \(\mathrm{state}_i^{t}\) according to a local transition rule:
$$\begin{aligned} \Phi : \Sigma ^{|N_i|+1} &\rightarrow \Sigma \\ sN_i \cup \{\mathrm{state}_i^{t}\} &\mapsto \mathrm{state}_i^{t+1} \end{aligned}$$
where \(sN_i\) is the set of states of all the agents in \(N_i\); thus, the function \(\Phi\) takes as input the states of the agent's neighbourhood together with his own state.
For all agents, the updates at a given time step are synchronous. An agent cannot progress by more than one level per time step (i.e. \(\mathrm{state}^{t+1}_i = \mathrm{state}^{t}_i\) or \(\mathrm{succ}(\mathrm{state}^{t}_i)\)); since the level of an agent never decreases, there is an irreversible ratchet effect. The entire process represents the evolution of the system from the initial configuration (all agents have state value level\(_1\)) to a configuration in which all agents have reached the target state value level\(_L\). For each agent, we define his own performance as his learning time, that is, the number of time steps necessary to reach the target skill level.
The model is built on the assumption that knowledge is acquired along a linear path passing through intermediate levels. In real life this is not always the case; even with level\(_1\) and level\(_L\) fixed, there may be many different paths joining them, and each learner may be more comfortable with one path than with another. Although this assumption is a limitation, it is necessary, at least in a first approach, to highlight the exogenous causes of the peer learning process.
Independent learning
This part of the model captures the ability of learners to improve their skill level by studying the material, or practising, on their own. Independent learning thus refers to the ability of any given learner to improve his skill level by one during one time step; the probability p of such an improvement is fixed and identical for every learner.
For instance, if p is set to 0.1, each learner has a one-in-ten chance of improving his skill level during a given time step. This boils down to any given learner improving by one skill level every 10 time steps, on average. So, if we consider 10 time steps to represent the duration of the learning episode (e.g. a semester-long course), then a value of p much lower than 0.10 indicates that many students will not achieve mastery of the skill being taught by the end of the course; conversely, if p is much higher than 0.10, most students will achieve mastery by the end of the learning episode.
To model the independent learning process we present two approaches, based first on a probabilistic equation and then on agent-based modelling.
Probabilistic equation model
For each learner, the learning process is independent and follows a negative binomial distribution. Such a distribution pertains to the trial number t at which the first L successes have been obtained, each trial being the realization of a Bernoulli variable with success probability p. Therefore the probability for a learner to reach the target value level\(_L\) at step t is:
$$\begin{aligned} \binom{t-1}{L-1}\, p^{L} (1-p)^{t-L} \end{aligned}$$
(1)
Following Wolfram [43], we deduce that a learner reaches the target level\(_L\) on average after \(\frac{L}{p}\) time steps, with a standard deviation of \(\frac{\sqrt{L \times (1-p)}}{p}\).
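As a sanity check of Eq. (1) and of these moments, the sketch below compares the formula with scipy's negative binomial distribution, which counts the number of failures before the L-th success (so that t corresponds to that count plus L); the values L = 6 and p = 0.1 are hypothetical.

```python
# Illustrative check (not from the paper's code): compare Eq. (1) with
# scipy's negative binomial parameterization.
from math import comb, sqrt
from scipy.stats import nbinom

L, p = 6, 0.1          # hypothetical values: 6 successes needed, success probability 0.1

def prob_reach_target_at(t, L=L, p=p):
    """Probability that the L-th success occurs exactly at trial t (Eq. 1)."""
    return comb(t - 1, L - 1) * p**L * (1 - p)**(t - L)

dist = nbinom(L, p)     # distribution of the number of failures before the L-th success
for t in range(L, L + 5):
    assert abs(prob_reach_target_at(t) - dist.pmf(t - L)) < 1e-12

print("mean learning time:", L / p)                  # 60.0 time steps
print("standard deviation:", sqrt(L * (1 - p)) / p)  # ~23.24 time steps
```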
Agent-based model
First and foremost, let us recall that with independent learning a single run entails the simulation of M independent processes. For each learner, the neighbourhood is not taken into account, and thus the function \(\Phi\) reduces to a function from \(\Sigma\) to \(\Sigma\) that depends only on the agent's own state. During each time step t, the level of a learner \(a_i\) is updated according to a stochastic rule: a random variable \(x_p\) is drawn uniformly from the interval [0, 1] and compared to the value of p; the probabilistic transition function \(\Phi\) is then fully defined by:
$$\begin{aligned} \mathrm{state}^{t+1}_i=\left\{ \begin{array}{ll} \mathrm{succ}(\mathrm{state}^{t}_i) & \text{if } x_p < p \\ \mathrm{state}^{t}_i & \text{otherwise} \end{array}\right. \end{aligned}$$
(2)
Note that, once a learner reaches the top level, \(\Phi\) is the identity function.
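A minimal sketch of this update rule, under the assumption that skill levels are encoded as integers 1 to L so that \(\mathrm{succ}\) corresponds to min(level + 1, L); this is an illustration, not the authors' NetLogo implementation.

```python
# Minimal sketch of the independent-learning update (Eq. 2); all names and
# parameter values are illustrative.
import random

def succ(level, L):
    """Successor function: advance one level, capped at the target level L."""
    return min(level + 1, L)

def independent_step(states, p, L):
    """One synchronous time step: each agent improves with probability p."""
    return [succ(s, L) if random.random() < p else s for s in states]

# Example: 5 agents, 6 levels, p = 0.1.
states = [1] * 5
t = 0
while any(s < 6 for s in states):
    states = independent_step(states, p=0.1, L=6)
    t += 1
print("time steps until all agents reach level 6:", t)
```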
Peer learning
Now we combine the ability of each learner to improve his skill level by himself with the ability to increase it by interacting with other learners. As peer learning is described as a way of moving beyond independent to mutual learning [10], we are particularly interested in exploring the extent to which interactions can improve the learning process, both from the point of view of each participant and at the level of the group considered as a whole. Obviously, this objective is all the more meaningful when learners have only a moderate capacity to progress by themselves.
Peer learning rules
The \({\mathcal {F}}PL\) model captures interactions between learners. Any given agent \(a_i\) may only engage in peer learning with agents \(a_k\) satisfying the following conditions:
1. \(a_k \in N_i\);
2. \(a_k\) features a higher skill level than \(a_i\) (i.e. \(\mathrm{state}_i \prec \mathrm{state}_k\));
3. \(a_k\) is 'within the reach' of \(a_i\).
More knowledgeable other
To specify the third condition, we will refer to the notion of peer formalised by the developmental psychologist Lev Vygotsky using the concept of More Knowledgeable Other (hereinafter \(\mathrm{MKO}\)) [2, 13, 35, 38].
Vygotsky suggests that knowledge is developed through social contact and that learning takes place through interactions between teachers and students as well as between students themselves [23, 39]. In a real learning scenario, a \(\mathrm{MKO}\) is anyone who can help a learner with regard to a particular task; such a person may be a teacher, a parent, an older adult, a coach or a peer. Following this, the \(\mathrm{MKO}\) of a learner is defined as the subset of all other learners that may help him. We propose to specify the \(\mathrm{MKO}\) according to the following educational strategy: let \(\delta\) be an integer greater than or equal to 1; for each agent-learner \(a_i\), his \(\mathrm{MKO}\) is the set:
$$\begin{aligned} \mathrm{MKO}_i=\{a_k \in N_i | \mathrm{state}_k = \mathrm{succ}^{\delta }(\mathrm{state}_i)\} \end{aligned}$$
(3)
The exponent \(\delta\) denotes the \(\delta\)-fold iterate of \(\mathrm{succ}\); this integer parameter will be referred to as the level gap and, for simplicity, its value is assumed to be identical for all learners. This strategy means that a neighbour (condition 1) can help if he is better (condition 2) with a skill level gap exactly equal to \(\delta\) (condition 3). Although this strategy is logical and easy to understand, once it is applied by all learners its consequences are difficult to predict.
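A minimal sketch of Eq. (3), under the same illustrative assumptions as before (a networkx peer network and integer levels); the network, states and parameter values below are hypothetical.

```python
# Minimal sketch of the MKO set (Eq. 3): neighbours exactly `delta` levels above.
import networkx as nx

def mko(G, states, i, delta):
    """Return the MKO set of agent i."""
    return {k for k in G.neighbors(i) if states[k] == states[i] + delta}

# Example on a tiny hypothetical network of 4 learners: 0 - 1 - 2 - 3.
G = nx.path_graph(4)
states = {0: 1, 1: 3, 2: 2, 3: 4}    # current skill levels
print(mko(G, states, i=0, delta=2))  # {1}: neighbour 1 is exactly 2 levels above agent 0
```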
Peer learning dynamics
During each time step t, if a given agent \(a_i\) has not already improved his skill level via independent learning, he may still do so thanks to peer learning, provided there is at least one other agent in \(\mathrm{MKO}_i^t\). In such a case, his level can be improved via a new process based on both a fixed probability q (assumed to be the same for all learners) and the number of MKOs among his neighbours; the idea is that the more neighbours who can help him a learner has, the more likely he is to progress. This new process supplements the independent learning dynamic described above and captures the idea of peer learning.
The \({\mathcal {F}}\mathrm{PL}\) model is thus based on initially setting up the population of agents, each with the initial state level\(_1\). The dynamics then proceeds in a series of discrete time steps. During each time step t, the agents' states are updated simultaneously on the basis of two stochastic processes: for each agent \(a_i\), two random variables \(x_p\) and \(x_q\) are drawn uniformly from the interval [0, 1] and compared to the values p and q respectively; the probabilistic transition function \(\Phi\) is then fully defined by:
$$\begin{aligned} \mathrm{state}^{t+1}_i=\left\{ \begin{array}{ll} \mathrm{succ}(\mathrm{state}^{t}_i) & \text{if } x_p< p \\ \mathrm{succ}(\mathrm{state}^{t}_i) & \text{if } x_p \ge p \text{ and } x_q < 1 - (1 - q)^{|\mathrm{MKO}_i|} \\ \mathrm{state}^{t}_i & \text{otherwise}\end{array}\right. \end{aligned}$$
(4)
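The second case of Eq. (4) can be read as the probability that at least one potential helper succeeds: if each of the \(|\mathrm{MKO}_i|\) helpers is assumed to act independently and to succeed with probability q, then
$$\begin{aligned} P(\text{peer help succeeds}) = 1 - P(\text{every helper fails}) = 1 - (1-q)^{|\mathrm{MKO}_i|}. \end{aligned}$$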
The simulation of a complete learning episode consists of repeating such an elementary step until all agents reach the target state value level\(_L\). It should be noted that as soon as an agent has reached the maximum level, he can no longer progress, but he can continue to help those who are still learning.
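The sketch below illustrates one synchronous time step implementing Eq. (4), together with a complete learning episode, under the same illustrative assumptions as before (integer levels 1 to L, a networkx peer network); the number of levels L = 10 is hypothetical, while p = 0.3, q = 0.3 and \(\delta = 4\) follow the grid example used later for the sample-size analysis. It is a sketch, not the authors' NetLogo implementation.

```python
# Minimal sketch of the full FPL dynamics (Eq. 4); names and values are illustrative.
import random
import networkx as nx

def fpl_step(G, states, p, q, delta, L):
    """One synchronous update of all agents."""
    new_states = {}
    for i in G.nodes:
        s = states[i]
        mko_size = sum(1 for k in G.neighbors(i) if states[k] == s + delta)
        x_p, x_q = random.random(), random.random()
        if x_p < p:                                   # independent learning
            new_states[i] = min(s + 1, L)
        elif x_q < 1 - (1 - q) ** mko_size:           # peer learning
            new_states[i] = min(s + 1, L)
        else:
            new_states[i] = s
        # when s == L, both improving branches leave the state at L (succ is the identity)
    return new_states

def run_episode(G, p=0.3, q=0.3, delta=4, L=10):
    """Run until every learner reaches level L; return per-agent learning times."""
    states = {i: 1 for i in G.nodes}
    learning_time = {}
    t = 0
    while len(learning_time) < G.number_of_nodes():
        states = fpl_step(G, states, p, q, delta, L)
        t += 1
        for i, s in states.items():
            if s == L and i not in learning_time:
                learning_time[i] = t
    return learning_time

# Example: a 32 x 32 torus of 1024 learners.
times = run_episode(nx.grid_2d_graph(32, 32, periodic=True))
print("mean learning time:", sum(times.values()) / len(times))
```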
To complete the description of the model, Algorithm 1 provides the pseudo-code and, to ensure reproducibility of the results, the full source code will be made available upon request from the authors. Let us note that the model implements the principle of equity well, because (i) initially all agents have the same state value; (ii) the three parameters p, q and \(\delta\), which control the learning processes, are identical for all agents; and (iii) the peer network is static.
Entities and variables
The model parameters can be seen as the variables of a particular entity named the observer; the value of such a variable is fixed before an execution and does not vary during the learning process [22]. Table 1 summarizes the “observer variables”. The other entities are the learners; each has his own variables, which can vary during the learning process. Table 2 summarizes the “learner variables”.
Table 1 Observer variables (global parameters)
Table 2 Learner variables
NetLogo simulations
In the following sections we present simulations of the \({\mathcal {F}}\mathrm{PL}\) model. The experiments are performed with an implementation of the model in the NetLogo multi-agent programmable environment [3, 41]. The observer entity corresponds to NetLogo's observer agent and the learner entities to turtle agents; thus, in NetLogo code, the observer variables are global variables and the learner variables are turtles-own variables.
Scales
We will refer to as a single run the process starting with all skill levels set to level\(_1\) and ending when all learners have reached the target value level\(_L\).
As the model is probabilistic, all quantitative results presented will be averaged over 100 runs, unless otherwise noted. We use the coefficient of variation (hereinafter cv), that is, the ratio of the standard deviation of a sample to its mean, to choose this sample size. As proposed by Lee et al. [26], “the sample size at which the difference between consecutive cv’s falls below a criterion, and remains so is considered a minimum number of runs”. For example, with a grid network and \(p=0.3\), \(q=0.3\) and \(\delta =4\), the outcomes drawn from sample sizes in \(\{10, 100, 500, 1000\}\) yield cv values in \(\{0.0042, 0.0045, 0.0045, 0.0045\}\), so 100 runs is a reasonable choice [27]. To avoid effects due to a small sample size, the number of agent-locations M will be oversized to 1024.
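As a small self-contained illustration of this criterion, the sketch below applies it to the cv values quoted above; the 0.001 threshold is a hypothetical choice, not a value from the text.

```python
# Sketch of the Lee et al. sample-size criterion on the quoted cv values.
def minimum_runs(sizes, cvs, threshold=0.001):
    """Smallest sample size from which consecutive cv differences stay below the threshold."""
    for idx in range(1, len(cvs)):
        if all(abs(cvs[j] - cvs[j - 1]) < threshold for j in range(idx, len(cvs))):
            return sizes[idx]
    return None

print(minimum_runs([10, 100, 500, 1000], [0.0042, 0.0045, 0.0045, 0.0045]))  # 100
```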
Global measures of performance
For each learner, his own performance is defined as his learning time, that is, the number of time steps necessary to reach the target skill value level\(_L\) from level\(_1\).
As the dynamics can lead to groups that are heterogeneous with respect to level, the simulations will focus on monitoring two aggregate measures over the entire population: the mean of the individual learning times, denoted cL, and their standard deviation, denoted cE.
Of course, to know whether the mean and standard deviation are representative, it is necessary to look at the distribution of learning times; if this is indeed the case, the first measure is a good indicator of learning effectiveness for the group considered as a whole, whereas the second is an indicator of the extent of dropping out, related to the number of learners who lag significantly behind their peers, and/or a measure of overachievers. Ideally, we would like to minimize both cL and cE; with self-directed learning, these costs depend on p only, while with peer learning they depend on p, q and \(\delta\).
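A minimal sketch of these two measures, under the assumption (suggested by the preceding paragraph) that cL denotes the mean of the learning times over the population and cE their standard deviation; the sample values are purely illustrative.

```python
# Sketch of the two aggregate measures; the learning times below are hypothetical.
import statistics

def aggregate_measures(learning_times):
    """Return (cL, cE) for a list of per-learner learning times."""
    cL = statistics.mean(learning_times)
    cE = statistics.stdev(learning_times)
    return cL, cE

times = [38, 42, 45, 47, 61, 90]      # learning times (in time steps) of a small group
cL, cE = aggregate_measures(times)
print(f"cL = {cL:.1f} time steps, cE = {cE:.1f}")
```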