Minimal Generators from Positive and Negative Attributes: Analysing the Knowledge Space of a Mathematics Course

Formal concept analysis (FCA) is a data analysis framework based on lattice theory. In this paper, we analyse the use, within this framework, of positive and negative (mixed) attributes of a dataset, which has been shown to capture more information than the use of positive attributes alone. From a theoretical point of view, we show the structure of, and the relationships between, the minimal generators of the simple and mixed concept lattices. From a practical point of view, the theoretical results obtained allow us to ensure a finer granularity in the retrieved information. Furthermore, exploiting the relationship between FCA and Knowledge Space Theory, we analyse the marks of a Mathematics course to establish the knowledge structure of the course and determine the key items, providing new relevant information that is not evident without the proposed tools.


Introduction
We are living in the information era, where almost all data are digitised. Many different techniques of data mining, machine learning and artificial intelligence can be applied to extract useful information in many different contexts. Education is one such context, where the intelligent analysis of this information, applied to academic activities, has led to the emergence of what is known as Educational Data Mining (EDM) [1]. Currently, many approaches have shown the benefits of knowledge extraction techniques to forecast student performance [2], forecast student dropout [3], or recommend educational activities [4], among others. Roughly speaking, the advantage of using computational learning techniques to study these problems is that they can now rely on large amounts of data and are therefore able to extract relevant knowledge about the patterns of students' academic behaviour in a wide variety of situations. This knowledge makes it possible to design different strategies to improve the academic success of students.
Mainly, the artificial intelligence techniques that have been applied in Educational Data Mining belong to the class of supervised machine learning [5]. The most popular techniques have been deep neural networks (deep learning) [6] and decision trees (random forests) [7], although other methods have also been applied satisfactorily, including continuous and logistic regression [8], support vector machines [9], association rules [10] or evolutionary algorithms [11].
We will use formal concept analysis (FCA) [12] as our tool to extract knowledge from a dataset containing the results of students in a mathematics course. Specifically, we propose the use of FCA to model the knowledge structure of an academic course and the dependencies among the different units.
Knowledge Space Theory (KST) [13] proposes the evaluation of a person's knowledge status with respect to a given knowledge domain. The reference knowledge domain is a set of questions or items to be solved in a course, and the knowledge state is the set of items the person is able to solve. The links between KST and FCA, and how to explore KST in terms of FCA, are shown in [14]. Concretely, these authors state that "the FCA community developed a bundle of remarkably fast algorithms which could be exploited" in the field of Technology-Enhanced Learning.
Classical FCA extracts knowledge from a binary dataset that relates objects and attributes: if row $i$, column $j$ of the dataset contains a one or a cross, then object $O_i$ is related to attribute $A_j$. In its general approach, FCA uses only the positive information in the dataset, i.e., the ones in the binary relation.
Our proposal extends this standard approach in FCA by incorporating mixed attributes, that is, by considering positive and negative information in the same framework. If row $k$, column $l$ of the dataset does not contain a one or a cross, then object $O_k$ is not related to attribute $A_l$. This information is overlooked by the classical approach, but we use it in our framework. In the educational context, this means that we use information about passed and failed exams, not only about failed ones, which is the framework established in [15]. Thus, the objective is to identify patterns, seen as relationships between course units, representing relevant hidden knowledge.
In addition, FCA facilitates acquiring a formal logic system based on implications that determine dependencies in the dataset. Therefore, FCA becomes a suitable tool to formally reason with attributes related to the acquisition of concepts, skills and competencies, as well as for the study of academic performance and the knowledge structure of a course, as proposed in the present work. FCA has been recently used to cluster students according to their academic behaviour to guide them in their academic path [16] or to build a decision support system capable of identifying students at risk of dropping out in Massive Online Open Courses (MOOCs) [17].
One of the advantages of using a logical approach to this problem is the possibility of explaining and interpreting the results obtained and, more importantly, the model built. The methods mentioned above in Educational Data Mining, such as neural networks or random forests, are based on the iterated application of statistical techniques and, in some cases, are black boxes, where the model built is not interpretable by the user in a simple way. With FCA, interpretability and explainability are guaranteed because the knowledge is expressed in the form of logical rules, which allow for traceability.
From a theoretical point of view, the contributions of this work are related to mixed formal contexts. Specifically, we establish a relation between the concept lattice of the mixed context and the concept lattices that consider only positive or only negative information. To extract more granular knowledge from the mixed context, we compute the minimal generators [18] of all closed sets from the set of implications computed using FCA.
Minimal generators [19] play a major role in different areas such as databases, graph theory, data mining, etc.
We emphasize that in the theoretical framework, the crisp background presented in [18], about minimal generators, is extended to the mixed paradigm in this paper. Relationships between the minimal generators considering mixed contexts with respect to the minimal generators in positive and negative contexts are presented here. These relationships prove that more knowledge can be retrieved than by analysing individual contexts separately.
As mentioned above, to put the proposed theoretical framework into practice, we have applied it to the specific case of a high school Mathematics course in Spain. From a binary relation with the results of students in a set of relevant items in the course, we extract the implications using FCA techniques [20], and all the minimal generators and their closed sets using Simplification Logic [21]. Simplification Logic is the reasoning tool that we use to draw conclusions from the relationships in the data, remove redundancy in the implications [22], and compute the minimal generators [18]. In our case study, we show the use of the minimal generators as a means of extracting the core items in the course, as well as answering the question "Is it possible for a student that has failed a certain number of units to pass the course?".
The paper is structured as follows. In Sect. 2, we present the basic notions needed for readers from different backgrounds to fully understand the content of the article. Later, in Sect. 3 we present the technical details of our proposal and the main theoretical contributions. In Sect. 4, the case study is presented, and an in-depth analysis is performed using the mentioned tools. The conclusions of this work and the proposal for future research directions are given in Sect. 5.

Background
In this section, we present the basic notions of formal concept analysis and Simplification Logic, which are used to find the minimal generators of the knowledge space.

Formal Concept Analysis
Formal concept analysis (FCA) is a helpful tool to extract knowledge from a dataset (called formal context). A formal context is defined as a triple $\mathbb{K} = (G, M, I)$ where $G$ is a set of objects, $M$ is a set of attributes, and $I$ is a relation between $G$ and $M$ (called incidence) with the following interpretation: if the pair $(g, m) \in I$, then we say that the object $g$ has the attribute $m$. The incidence relation is usually represented by a table where the rows are objects and the columns are attributes. A cross in a cell means that the object of that row has the attribute of that column.
FCA is closely related to Galois connections. A Galois connection between two ordered sets $(P, \le)$ and $(Q, \le)$ is a pair of maps $\varphi\colon P \to Q$ and $\psi\colon Q \to P$ satisfying: (1) $\varphi$ is antitone, i.e., $p_1 \le p_2$ implies $\varphi(p_1) \ge \varphi(p_2)$; (2) $\psi$ is antitone, i.e., $q_1 \le q_2$ implies $\psi(q_1) \ge \psi(q_2)$; (3) for all $p \in P$ and $q \in Q$ we have that $p \le \psi(\varphi(p))$ and $q \le \varphi(\psi(q))$.
Given a formal context $\mathbb{K} = (G, M, I)$, we can define a Galois connection between the sets of objects and attributes as follows. The first map, denoted by $\uparrow$, is defined as
$$A^{\uparrow} = \{m \in M \mid (g, m) \in I \text{ for all } g \in A\}.$$
That is, given a set of objects $A \subseteq G$, $A^{\uparrow}$ is the set of all the attributes shared by all the objects in $A$. The second map of the Galois connection, denoted by $\downarrow$, is defined as
$$B^{\downarrow} = \{g \in G \mid (g, m) \in I \text{ for all } m \in B\}.$$
In other words, given a set of attributes $B \subseteq M$, $B^{\downarrow}$ is the set of all the objects that have all the attributes in $B$. The pair $(\uparrow, \downarrow)$ forms a Galois connection [12,20]. Hence, the compositions $\uparrow \circ \downarrow$ and $\downarrow \circ \uparrow$ are closure operators. For the sake of presentation, hereafter we omit the symbol $\circ$ and write $\uparrow\downarrow$ and $\downarrow\uparrow$, respectively. A set is said to be closed under the Galois connection if it coincides with its closure; for instance, a set of attributes $C \subseteq M$ is closed if $C^{\downarrow\uparrow} = C$.
Once the mappings $\uparrow$ and $\downarrow$ have been introduced, we can define the notion of formal concept, which is a pair $(A, B)$ with $A \subseteq G$ and $B \subseteq M$ such that $A^{\uparrow} = B$ and $B^{\downarrow} = A$. The subset $A$ is said to be the extent of the formal concept and $B$ is said to be its intent. Given a formal concept $(A, B)$, all the objects in $A$ share all the attributes in $B$ and do not share any other attribute. Moreover, we can define an order relation between formal concepts: given two formal concepts $(A, B)$ and $(C, D)$, we say that $(A, B) \le (C, D)$ if and only if $A \subseteq C$ (or, equivalently, if and only if $D \subseteq B$). This order relation defines a complete lattice structure on the set of formal concepts, where the supremum and infimum are given by
$$\bigvee_{j \in J} (A_j, B_j) = \Big(\big(\bigcup_{j \in J} A_j\big)^{\uparrow\downarrow}, \bigcap_{j \in J} B_j\Big), \qquad \bigwedge_{j \in J} (A_j, B_j) = \Big(\bigcap_{j \in J} A_j, \big(\bigcup_{j \in J} B_j\big)^{\downarrow\uparrow}\Big)$$
for any family of formal concepts $\{(A_j, B_j) : j \in J\}$. The complete lattice defined by this order is called the Concept Lattice of the formal context $\mathbb{K} = (G, M, I)$ and we denote it by $\mathbb{B}(\mathbb{K})$.
In addition, it can be proved that every complete lattice $L$ can be seen as the concept lattice of a certain formal context [23, Chapters 3 and 7]. Throughout the paper, we will use the term formal concept for pairs $(A, B)$ with $A \subseteq G$ and $B \subseteq M$, as well as for subsets of attributes which are closed under $\downarrow\uparrow$; that is, we identify formal concepts with their intents.
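The derivation operators and the closure they induce can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the context `CONTEXT`, the names `up`, `down` and `intents`, and the brute-force enumeration are hypothetical and are not the paper's data or algorithms.

```python
from itertools import combinations

# Hypothetical toy context (not the paper's Table 1): object -> attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
    "o4": {"b"},
}
ATTRIBUTES = {"a", "b", "c"}


def up(objects):
    """A^up: the attributes shared by every object in `objects`."""
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return shared


def down(attrs):
    """B^down: the objects that have every attribute in `attrs`."""
    return {g for g, has in CONTEXT.items() if set(attrs) <= has}


def intents():
    """All closed attribute sets B = B^{down up}, found by brute force."""
    found = set()
    for r in range(len(ATTRIBUTES) + 1):
        for subset in combinations(sorted(ATTRIBUTES), r):
            found.add(frozenset(up(down(subset))))
    return found
```

In this toy context, `up(down({"c"}))` adds `a` to `c` because every object with `c` also has `a`. The exhaustive enumeration is exponential in the number of attributes and is meant only for very small examples.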

Simplification Logic and Minimal Generators
One crucial advantage of FCA is that the implications arising from a formal context $\mathbb{K} = (G, M, I)$ can be treated using formal logic. Consequently, we may say that FCA provides a suitable mathematical background to express, represent, reason, deduce and explain the pieces of knowledge extracted from a dataset.
Let us begin by describing the syntax and semantics of the logic that models the dependencies between attributes in a formal context $\mathbb{K} = (G, M, I)$. Syntactically, given two sets of attributes $A, B \subseteq M$, we define an implication "$A$ implies $B$" and denote it by $A \to B$. Semantically, we say that an implication $A \to B$ is true if all objects with all the attributes in $A$ have all the attributes in $B$. Formally, we say that an implication $A \to B$ is valid in a context $\mathbb{K}$ if and only if $B \subseteq A^{\downarrow\uparrow}$, which is equivalent to $A^{\downarrow} \subseteq B^{\downarrow}$. Moreover, we say that a context $\mathbb{K}$ is a model of a set of implications $\Sigma$ if every implication in $\Sigma$ is valid in $\mathbb{K}$.
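As a minimal sketch of this semantics, the validity test $A^{\downarrow} \subseteq B^{\downarrow}$ can be written directly. The context and the function names `down`, `is_valid` and `is_model` are hypothetical and serve only as an illustration.

```python
# Hypothetical toy context (not the paper's Table 1): object -> attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
    "o4": {"b"},
}


def down(attrs):
    """B^down: the objects that have every attribute in `attrs`."""
    return {g for g, has in CONTEXT.items() if set(attrs) <= has}


def is_valid(premise, conclusion):
    """A -> B is valid in the context iff A^down is a subset of B^down."""
    return down(premise) <= down(conclusion)


def is_model(implications):
    """The context is a model of a set of implications iff all of them are valid."""
    return all(is_valid(p, c) for p, c in implications)
```

Here `is_valid({"c"}, {"a"})` holds because the objects having `c` (o2 and o3) all have `a`, whereas `is_valid({"a"}, {"b"})` fails because o2 has `a` but not `b`.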
The benefit of using a logic system to represent dependencies between attributes is that we can perform (syntactic) inferences and define logical consequences. Accordingly, we say that a formula A → B is a logical consequence of a set of implications Σ (denoted Σ ⊧ A → B ) if every model of Σ is also a model of A → B . In other words, logical consequences are implications that must be true if we assume a set of valid implications. Hence, and thanks to formal logic, we may differentiate between correct conclusions (i.e., logical consequences) and fallacious conclusions.
The (syntactic) inferences require the use of axioms and inference rules. It is well known that we can manage this kind of implication through Armstrong's Axioms [24]. In this paper, we consider Simplification Logic instead; this logic is equivalent to Armstrong's Axioms, but it is more appropriate for designing automatic reasoning methods [21]. Simplification Logic considers "Inclusion" as an axiom, [Inc] Inclusion: $\vdash_S AB \to A$, and three inference rules called "fragmentation", "composition" and "simplification" (see [21]). Simplification Logic is sound and complete; i.e., given a set of implications $\Sigma$, any inferred implication is a logical consequence of $\Sigma$ and, conversely, any logical consequence of $\Sigma$ can be deduced through Simplification Logic. It is worth mentioning that syntactic inference obtains logical consequences much more directly than applying the definition. Therefore, Simplification Logic is a powerful tool for performing correct deductions. Another point that supports the use of this logic is that we can represent the knowledge in a context through a set of implications, which is much easier to interpret. That is, given a context $\mathbb{K} = (G, M, I)$, we aim at determining a set of implications $\Sigma$ such that (1) $\mathbb{K}$ is a model of $\Sigma$ and (2) for every implication $A \to B$ valid in $\mathbb{K}$, we have that $\Sigma \vdash_S A \to B$. A set of implications satisfying these properties is called complete. Hereinafter, every set of implications is assumed to be complete.
Given a set of implications $\Sigma$ and a set of attributes $A \subseteq M$, we define the logical closure of $A$ as the maximum (with respect to inclusion) set $Y \subseteq M$ such that $A \to Y$ can be derived from the implications in $\Sigma$ using Armstrong's Axioms or, equivalently, Simplification Logic. We denote this set by $A^{+}_{\Sigma}$. See [21] for an efficient method using Simplification Logic to compute the closure of a set of attributes. Given a complete set of implications $\Sigma$, and since Simplification Logic is sound and complete, the logical closure operator induced by $\Sigma$ coincides with the closure operator of FCA, i.e., for any $A \subseteq M$, we have $A^{+}_{\Sigma} = A^{\downarrow\uparrow}$. Therefore, we can use either Simplification Logic or the FCA derivation operators to compute the same closure.
Then, given a closed set of attributes $A \subseteq M$, we say that $C \subseteq M$ is a minimal generator of $A$ if $C^{+}_{\Sigma} = A$ and $D^{+}_{\Sigma} \neq A$ for every proper subset $D \subsetneq C$. Therefore, given a context $\mathbb{K} = (G, M, I)$, a minimal generator of $M$ determines a minimal set of attributes from which we can infer the rest in $M$. In other words, minimal generators compress all the knowledge into only a few attributes and, in this sense, we may say that they are more valuable than the others. Note that this definition of minimal generator, based on the logical closure associated with a set of implications, is equivalent to the one using the concept-forming operators from FCA, that is, $C$ is a minimal generator of $A$ if and only if $C^{\downarrow\uparrow} = A$ and $D^{\downarrow\uparrow} \neq A$ for every $D \subsetneq C$.
Then, it is easy to prove that both sets {a, b, d} and {a, b, e} are minimal generators of M (note that the same set may have different minimal generators). In this respect, we can ensure that any object with the attributes a, b and d necessarily has the rest of the attributes as well, due to the dependencies given by $\Sigma$. To point out the importance of minimal generators, if we were in a context of employee selection and M were the attributes we are interested in, then we could focus just on verifying the attributes a, b and d (respectively, a, b and e) of the applicants, since the rest are consequences of those. ◻
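The coincidence $A^{+}_{\Sigma} = A^{\downarrow\uparrow}$ can be checked on a small example. The sketch below computes the logical closure by naive fixpoint iteration, which is not the Simplification Logic method of [21]; the context, the set `SIGMA` (assumed to be complete for this toy context) and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy context (not the paper's Table 1): object -> attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
    "o4": {"b"},
}
ATTRIBUTES = {"a", "b", "c"}
# Assumed complete set of implications for this toy context: c -> a.
SIGMA = [({"c"}, {"a"})]


def logical_closure(attrs, implications):
    """Naive fixpoint: fire implications until nothing new is added."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


def galois_closure(attrs):
    """A^{down up}: attributes shared by all objects that have `attrs`."""
    closed = set(ATTRIBUTES)
    for has in CONTEXT.values():
        if set(attrs) <= has:
            closed &= has
    return closed


def closures_coincide():
    """Compare both closures on every subset of attributes."""
    return all(
        logical_closure(subset, SIGMA) == galois_closure(subset)
        for r in range(len(ATTRIBUTES) + 1)
        for subset in combinations(sorted(ATTRIBUTES), r)
    )
```

The naive iteration is quadratic in the size of `SIGMA` in the worst case; it is used here only because it is short and easy to verify.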
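To make the definition of minimal generator concrete, the following sketch enumerates the minimal generators of a closed set by brute force. The implication set `SIGMA` and the attribute names are hypothetical and unrelated to the example above, and the enumeration is a simple illustration, not the Simplification Logic method used in the paper.

```python
from itertools import combinations

# Hypothetical implications over M = {a, b, c, d}: a -> b, b -> a, ac -> d.
SIGMA = [({"a"}, {"b"}), ({"b"}, {"a"}), ({"a", "c"}, {"d"})]


def closure(attrs):
    """Logical closure of `attrs` under SIGMA (naive fixpoint iteration)."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in SIGMA:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return frozenset(closed)


def minimal_generators(closed_set):
    """Subsets C with closure(C) == closed_set and no smaller such subset."""
    target = frozenset(closed_set)
    found = []
    # Candidates are visited by increasing size, so any smaller generator
    # contained in a candidate has already been found.
    for r in range(len(target) + 1):
        for cand in map(frozenset, combinations(sorted(target), r)):
            if closure(cand) == target and not any(g < cand for g in found):
                found.append(cand)
    return found
```

For these implications, `minimal_generators({"a", "b", "c", "d"})` yields the two sets {a, c} and {b, c}, which illustrates that a closed set may have several minimal generators.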

Relationship Between FCA and Knowledge Space Theory
Knowledge Space Theory (KST) is based on the idea that the knowledge acquired by a group of persons about a specific discipline may be represented by the set of questions that they are capable of solving [15]. Accordingly, given a set of questions $Q$ and a subset $K$ of $2^Q$ (i.e., the powerset of $Q$), we say that the pair $(Q, K)$ is a knowledge space if [KST1] $Q \in K$ and $\emptyset \in K$; [KST2] $K$ is closed under union.
Each element in $K$ is called a knowledge state. Roughly speaking, each knowledge state represents the knowledge acquired by a subgroup of people. The condition [KST1] states that the full and null knowledge of a discipline must be knowledge states. On the other hand, [KST2] states that if two groups of people are capable of solving the questions of two knowledge states $K_1$ and $K_2$, then the join of both groups is capable of solving the questions in the knowledge state $K_1 \cup K_2$. Additionally, a knowledge space $(Q, K)$ is called quasi ordinal if $K$ is closed under intersection. Quasi ordinal knowledge spaces are motivated in [13] as the reference structure. The authors prove that if we can impose some dependencies among questions (in the sense that if an individual answers question a, then the individual can answer question b as well), knowledge spaces have the structure of a lattice under the natural ordering in $2^Q$. In this line, [15] relates knowledge spaces and FCA by means of the notion of Knowledge Context, which is a formal context $\mathbb{K} = (P, Q, I)$ such that $P$ is a set of students, $Q$ is a set of questions/evaluations, and $(p, q) \in I$ means that student $p$ failed to solve problem $q$. The complements of the intents of the knowledge context $(P, Q, I)$ form a knowledge space on $Q$, whose knowledge states represent the tested knowledge of the persons of $P$ [15]. Reciprocally, given a knowledge space $(Q, K)$, one can easily define the context $(K, Q, \not\ni)$ that expresses all the knowledge states and is isomorphic to the knowledge context $(P, Q, I)$.
This characterisation of knowledge spaces by means of formal contexts allows us to use FCA techniques to analyse and express the hidden structure of the knowledge. FCA provides two main tools for mining knowledge: the concept lattice $\mathbb{B}(\mathbb{K})$, which is isomorphic to the associated knowledge space $(Q, K)$; and the set of valid implications in the formal context, which determines, in this case, the dependencies among questions. Notably, in [15], it is stated that an irredundant description of a quasi ordinal knowledge space is given by an implication basis of the corresponding knowledge context. Thus, by studying the implication bases, we can obtain deeper insight into the knowledge structure of a field.
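The passage from a knowledge context to its knowledge space can be illustrated with a short sketch: intents are computed by brute force and then complemented. The students, questions and function names below are hypothetical and do not come from the case study.

```python
from itertools import combinations

# Hypothetical knowledge context: student -> questions the student FAILED.
FAILED = {
    "p1": {"q1"},
    "p2": {"q1", "q2"},
    "p3": {"q3"},
    "p4": set(),
}
QUESTIONS = frozenset({"q1", "q2", "q3"})


def closure(attrs):
    """Intent generated by `attrs`: questions failed by all students failing `attrs`."""
    closed = set(QUESTIONS)
    for failed in FAILED.values():
        if set(attrs) <= failed:
            closed &= failed
    return frozenset(closed)


def intents():
    """All intents of the knowledge context, by brute force."""
    return {
        closure(subset)
        for r in range(len(QUESTIONS) + 1)
        for subset in combinations(sorted(QUESTIONS), r)
    }


def knowledge_states():
    """Knowledge states are the complements of the intents."""
    return {QUESTIONS - intent for intent in intents()}


def is_knowledge_space(states):
    """Check [KST1] and [KST2] on a family of subsets of QUESTIONS."""
    has_extremes = frozenset() in states and QUESTIONS in states
    union_closed = all((s | t) in states for s in states for t in states)
    return has_extremes and union_closed
```

In this toy example no question is failed by every student, so the full set of questions appears as a knowledge state; each student's set of solved questions is also one of the states.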

Proposal
The aim of this section is to show that bases of implications and minimal generators in mixed contexts give more information than positive, or negative, contexts. This is due to the fact that mixed contexts provide a finer granularity than the classical ones. This section presents a set of theoretical results that support the statement above. Specifically, we prove that all the implications in the positive and negative contexts can be derived from the basis of implications of the mixed context, but the converse is not true, i.e., there are implications of the mixed context that cannot be derived from the implications in the positive or negative one; and likewise for minimal generators. To illustrate, from a practical point of view, the advantages of retrieving information with mixed contexts, we will apply these results to a real-life example in Sect. 4. Next, we introduce briefly the terminology of mixed contexts.
Classical FCA considers a binary relation in which if an object g is related to an attribute m, then the pair (g, m) appears in the relation (positive information). Taking the proposal a step further, we propose in this article the use of an FCA extension considering not only the positive information but also the negative one, that is, when an object is not related to an attribute. Recently some works have appeared in the literature that deal with positive and negative information [25]. To the best of our knowledge, none of them considers the computation of minimal generators taking into account positive and negative information in a dataset.
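Our approach follows the line of [26], which uses positive and negative attributes. Given a formal context $\mathbb{K} = (G, M, I)$, we define the set of negative attributes as $\overline{M} = \{\overline{m} \mid m \in M\}$ and construct a new formal context $(\mathbb{K}|\overline{\mathbb{K}}) = (G, M \cup \overline{M}, I^{*})$ where the incidence relation $I^{*}$ is defined, for $g \in G$ and $m \in M$, as
$$(g, m) \in I^{*} \iff (g, m) \in I, \qquad (g, \overline{m}) \in I^{*} \iff (g, m) \notin I.$$
Hence, for every $m \in M$, the incidence of $\overline{m}$ takes the opposite value to that of $m$ in the original incidence. Hereafter, we refer to $\mathbb{K}$ and $\overline{\mathbb{K}}$ as the positive and the negative contexts, respectively.
However, this approach doubles the number of columns in the context and, as a consequence, increases the algorithmic and computational cost of working with the dataset. In this work, we follow the line in [26] of defining a new Galois connection over $\mathbb{K}$ that captures both the positive and negative information without the need to duplicate the columns. The new operators are denoted by $\Uparrow$ and $\Downarrow$ to differentiate them from those of the original context $\mathbb{K}$. In the rest of this paper, the context $\mathbb{K}$ equipped with the new Galois connection will be referred to as the mixed context. The new operators $\Uparrow\colon 2^{G} \to 2^{M \cup \overline{M}}$ and $\Downarrow\colon 2^{M \cup \overline{M}} \to 2^{G}$ are defined as follows:
$$A^{\Uparrow} = \{m \in M \mid (g, m) \in I \text{ for all } g \in A\} \cup \{\overline{m} \in \overline{M} \mid (g, m) \notin I \text{ for all } g \in A\},$$
$$B^{\Downarrow} = \{g \in G \mid (g, m) \in I \text{ for all } m \in B \cap M \text{ and } (g, m) \notin I \text{ for all } \overline{m} \in B \cap \overline{M}\}.$$
Since these two operators form a Galois connection, their composition is a closure operator. Therefore, they induce a concept lattice over the mixed context of $\mathbb{K}$. Let us denote by $\mathbb{B}^{\#}(\mathbb{K})$ the lattice formed by using the derivation operators $\Uparrow$ and $\Downarrow$, in contrast to the concept lattice $\mathbb{B}(\mathbb{K})$ built using the concept-forming operators $\uparrow$ and $\downarrow$. Note that, with this definition of the new derivation operators, we have that $\mathbb{B}^{\#}(\mathbb{K}) = \mathbb{B}(\mathbb{K}|\overline{\mathbb{K}})$. This means that handling positive and negative information is more efficient, since there is no need to duplicate the number of columns in the context.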
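As a minimal sketch of the mixed operators, the code below evaluates them directly on the original context and cross-checks the result against the standard operators on the context with duplicated columns. The toy context, the encoding of a negative attribute as a pair `(m, False)` and all function names are hypothetical choices made for this illustration.

```python
from itertools import combinations

# Hypothetical toy context (not the paper's Table 1): object -> attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
    "o4": {"b"},
}
ATTRIBUTES = ["a", "b", "c"]
# A mixed attribute is a pair (name, sign): True stands for m, False for "not m".
MIXED = [(m, sign) for m in ATTRIBUTES for sign in (True, False)]


def up_mixed(objects):
    """Mixed attributes shared by all `objects`, read directly from CONTEXT."""
    return {
        (m, sign)
        for m, sign in MIXED
        if all((m in CONTEXT[g]) == sign for g in objects)
    }


def down_mixed(mixed_attrs):
    """Objects that have every positive and lack every negative attribute."""
    return {
        g
        for g, has in CONTEXT.items()
        if all((m in has) == sign for m, sign in mixed_attrs)
    }


# Apposition with duplicated columns, built only to cross-check the operators.
APPOSITION = {g: {(m, m in has) for m in ATTRIBUTES} for g, has in CONTEXT.items()}


def up_apposition(objects):
    shared = set(MIXED)
    for g in objects:
        shared &= APPOSITION[g]
    return shared


def down_apposition(mixed_attrs):
    return {g for g, row in APPOSITION.items() if set(mixed_attrs) <= row}


def subsets(items):
    items = list(items)
    for r in range(len(items) + 1):
        yield from combinations(items, r)


def operators_agree():
    """Both pairs of operators coincide on every subset of objects/attributes."""
    same_up = all(up_mixed(s) == up_apposition(s) for s in subsets(CONTEXT))
    same_down = all(down_mixed(s) == down_apposition(s) for s in subsets(MIXED))
    return same_up and same_down
```

For instance, `down_mixed({("a", True), ("b", False)})` returns the objects that have `a` and lack `b`, a query that cannot be posed with the positive operators alone.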
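The following functions are introduced to capture the positive and negative information related to a given set of attributes $A \subseteq M \cup \overline{M}$:
$$\mathrm{Pos}(A) = A \cap M, \qquad \mathrm{Neg}(A) = A \cap \overline{M}.$$
The two mappings defined above, namely $\Uparrow$ and $\Downarrow$, are related to the standard derivation operators $\uparrow$ and $\downarrow$. This is illustrated in the following result.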
Lemma 1 [26] For a formal context $\mathbb{K}$ and its complement $\overline{\mathbb{K}}$, the following statements hold:
The example below is a running example which will illustrate the results presented throughout the paper.
Example 2 Let us consider the formal context $\mathbb{K}$ in Table 1 (a). The apposition (concatenation by columns) of $\mathbb{K}$ and its complement $\overline{\mathbb{K}}$ is in Table 1 (b). We follow the notation $\mathbb{K}|\overline{\mathbb{K}}$ for the concatenated formal context. In addition, we define $M = \{a, b, c, d\}$ and $\overline{M} = \{\overline{a}, \overline{b}, \overline{c}, \overline{d}\}$.
We have defined over $\mathbb{K}$ two Galois connections: $(\uparrow, \downarrow)$ and $(\Uparrow, \Downarrow)$. We can show the different information they capture with a simple example. We can check the results of Lemma 1, since $\{c, d\}^{\Downarrow} = \{c, d\}^{\downarrow} = \{o5\}$, that is, the operators $\downarrow$ and $\Downarrow$ coincide when they are applied to a set of positive attributes. However, the operator $\Uparrow$ differs in general from $\uparrow$, since $\{o5\}^{\Uparrow} = \{b, c, d, \overline{a}\} \neq \{b, c, d\} = \{o5\}^{\uparrow}$. Despite this difference, they can be related by means of the operator Pos (see Lemma 1), which keeps only the positive attributes of the result obtained by applying $\Uparrow$. That is: $\mathrm{Pos}(\{o5\}^{\Uparrow}) = \{b, c, d\} = \{o5\}^{\uparrow}$. Notice also that $\downarrow$ is defined on $2^{M}$ whereas $\Downarrow$ is defined on $2^{M \cup \overline{M}}$, so it does not make sense to write $\{c, \overline{b}\}^{\downarrow}$. Instead, we have to rely on $\Downarrow$ to compute the desired extent using mixed attributes: $\{c, \overline{b}\}^{\Downarrow} = \{o4, o6\}$. ◻
The use of mixed contexts, with positive and negative attributes, provides richer information than the one obtained from a context where only positive (or only negative) attributes are considered. In the theoretical results that we show below, we can see how the use of mixed contexts extends the information obtained from simple contexts. Specifically, we present a few properties that relate the closed sets, the concept lattice and the minimal generators of the mixed context with those of both simple contexts (i.e., positive and negative). The work on mixed contexts was initiated in [26], where some interesting results concerning this framework were proved. The result below shows that the concept lattices of the positive and negative contexts are embedded in the mixed concept lattice. As a result, the mixed context gives at least as much information as the positive or negative contexts.

Theorem 1 [26] The maps $\pi_1\colon \mathbb{B}^{\#}(\mathbb{K}) \to \mathbb{B}(\mathbb{K})$ and $\pi_2\colon \mathbb{B}^{\#}(\mathbb{K}) \to \mathbb{B}(\overline{\mathbb{K}})$, which project each mixed concept onto its positive and negative part, respectively, are join-preserving and surjective.
As a consequence, besides the isomorphism between the knowledge space $(Q, K)$ and the concept lattice of the knowledge context $\mathbb{K} = (P, Q, I)$ given by [15], we can say that the concept lattice of the knowledge context is embedded in $\mathbb{B}^{\#}(\mathbb{K})$ by means of the projection operators $\pi_1$ and $\pi_2$. Accordingly, we may consider more information using the positive and negative attributes, i.e., representing the two faces of the same phenomenon.
Example 3 We continue with the same contexts from Example 2. In Fig. 1, we show the concept lattices of the positive ($\mathbb{B}(\mathbb{K})$), negative ($\mathbb{B}(\overline{\mathbb{K}})$) and mixed ($\mathbb{B}^{\#}(\mathbb{K})$) formal contexts. We have used a color code to represent the relationships and the embedding of $\mathbb{B}(\mathbb{K})$ and $\mathbb{B}(\overline{\mathbb{K}})$ into $\mathbb{B}^{\#}(\mathbb{K})$. The backprojection of the concepts in $\mathbb{B}(\mathbb{K})$ via $\pi_1$ is in blue and gray, and the backprojection of $\mathbb{B}(\overline{\mathbb{K}})$ via $\pi_2$ is in orange and gray. Some of the concepts in $\mathbb{B}^{\#}(\mathbb{K})$ are in gray because they are the backprojections of both positive and negative concepts. Note that, although a concept belongs to the positive concept lattice $\mathbb{B}(\mathbb{K})$, it may be the projection of a concept with negative attributes in the mixed concept lattice $\mathbb{B}^{\#}(\mathbb{K})$. For example, the concept $\{a, c\} \in \mathbb{B}(\mathbb{K})$ is the projection of $\{a, c, \overline{b}, \overline{d}\} \in \mathbb{B}^{\#}(\mathbb{K})$. As a result, we can assert that the mixed concept lattice contains the concepts of the other two contexts and provides additional information and granularity. Indeed, note that in Fig. 1 none of the concepts coloured in white in $\mathbb{B}^{\#}(\mathbb{K})$ can be obtained by considering only the positive or only the negative context. ◻
At this point, we can consider applying Simplification Logic to the extended context $\mathbb{K}|\overline{\mathbb{K}}$ as described in Sect. 2.2. However, in such a case, we do not capture the relationship between opposite attributes, since in $\mathbb{K}|\overline{\mathbb{K}}$ the opposite attributes $a$ and $\overline{a}$ are unrelated. That is a drawback. For this reason, the following definition presents semantics for implications where positive and negative attributes are involved. As mentioned above, this semantics allows us to establish new relations between implications that do not appear in the standard logic presented in Sect. 2.2. In particular, the authors in [26] proposed an axiomatic system to capture the relationship between opposite attributes, named Simplification Logic for Mixed Attributes, a sound and complete logic system for implications on mixed formal contexts.
[Ref] Reflexivity: [Inky] Inverse key: Besides, it is convenient to display the following inference rules obtained from the previous logic system since they will be used later to explain some results.
[Cont] Contradiction: Let us briefly comment on the details of the specific rules of this logic, [Cont] and [Rft], since they are the ones showing the relationship between positive and negative attributes. The first inference rule, called Contradiction, is a form of the well-known Ex contradictione quodlibet, i.e., we can infer all the attributes from a set of contradictory attributes. The second one, Reflection, is an inference rule that allows us to interchange attributes from the

Example 4 We now study the behaviour of Simplification Logic on the implications retrieved from the positive, negative and mixed contexts of our running example started in Example 2.
On the one hand, each of the simple formal contexts in Example 2, $\mathbb{K}$ and $\overline{\mathbb{K}}$, has its own basis of implications. On the other hand, the basis of implications for the formal context $\mathbb{K}|\overline{\mathbb{K}}$ in Table 1 (b) consists of 19 implications, which form a minimal set from which all other valid implications in the context $\mathbb{K}|\overline{\mathbb{K}}$ can be deduced. However, thanks to the Simplification Logic for Mixed Attributes, we can infer an equivalent set of implications with lower cardinality, consisting of only four implications. We stress that from these four implications we can derive the 19 from the basis of the duplicated context $\mathbb{K}|\overline{\mathbb{K}}$, i.e., they condense exactly the same knowledge (that is, they are equivalent) with less redundancy. The implications of the bases of $\mathbb{K}$ and $\overline{\mathbb{K}}$ are deduced from them in a similar way.
In summary, we have shown that the set of implications for the mixed formal context contains more information than the union of the two bases of implications for the positive and negative contexts. In other words, there are some relations between attributes that neither of the two simple contexts can capture. Secondly, it can be shown that the information captured by the two implication bases of the positive and negative contexts is contained in the set of implications of the mixed context. ◻
The notions of logical closure and minimal generator are defined as in Sect. 2.2. In this case, since we obtain two simple lattices, $\mathbb{B}(\mathbb{K})$ and $\mathbb{B}(\overline{\mathbb{K}})$, and the lattice $\mathbb{B}^{\#}(\mathbb{K})$ of the mixed context, due to the use of different closure operators, we have different sets of minimal generators associated with each of them.
Notation. From now on, let us denote by $\mathrm{Gen}(\mathbb{K})$ the set of minimal generators for a given formal context $\mathbb{K}$, using the derivation operators $\uparrow$ and $\downarrow$, and by $\mathrm{Gen}^{\#}(\mathbb{K})$ the minimal generators associated with the operators $\Uparrow$ and $\Downarrow$. Note that, since $\mathbb{B}^{\#}(\mathbb{K}) = \mathbb{B}(\mathbb{K}|\overline{\mathbb{K}})$, we have $\mathrm{Gen}^{\#}(\mathbb{K}) = \mathrm{Gen}(\mathbb{K}|\overline{\mathbb{K}})$.
In this work, the minimal generators for the concepts in $\mathbb{B}^{\#}(\mathbb{K})$ are used to characterise the structure of the knowledge space. The following theoretical result states that, by using these generators, we also account for the generators of the original knowledge space.

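Proposition 2 Let $\mathbb{K} = (G, M, I)$ be a formal context and let $\overline{\mathbb{K}}$ be its complemented context. The following holds:

(1) If $A \subseteq M$ is a minimal generator of a set $C \subseteq M$ in $\mathbb{B}(\mathbb{K})$, then $A$ is also a minimal generator in $\mathbb{B}^{\#}(\mathbb{K})$ such that $\mathrm{Pos}(A^{\Downarrow\Uparrow}) = C$.

(2) If $A \subseteq \overline{M}$ is a minimal generator of a set $C \subseteq \overline{M}$ in $\mathbb{B}(\overline{\mathbb{K}})$, then $A$ is also a minimal generator in $\mathbb{B}^{\#}(\mathbb{K})$ such that $\mathrm{Neg}(A^{\Downarrow\Uparrow}) = C$.

Proof Let $A$ be a minimal generator of $C \subseteq M$ in $\mathbb{B}(\mathbb{K})$, and let $Z \subseteq A \subseteq M$ be such that $Z^{\Downarrow\Uparrow} = A^{\Downarrow\Uparrow}$. We will show that $Z = A$, which implies that $A$ is also a minimal generator in $\mathbb{B}^{\#}(\mathbb{K})$. Since $Z^{\Downarrow\Uparrow} = A^{\Downarrow\Uparrow}$, in particular we have that $\mathrm{Pos}(Z^{\Downarrow\Uparrow}) = \mathrm{Pos}(A^{\Downarrow\Uparrow})$. Then, by Lemma 1, we have
$$Z^{\downarrow\uparrow} = \mathrm{Pos}(Z^{\Downarrow\Uparrow}) = \mathrm{Pos}(A^{\Downarrow\Uparrow}) = A^{\downarrow\uparrow} = C.$$
That is, we obtain that $Z$ is a generator of $C \subseteq M$. Then, by minimality of $A$, we have that $Z = A$. Note that the same equation proves that $\mathrm{Pos}(A^{\Downarrow\Uparrow}) = A^{\downarrow\uparrow} = C$. The second statement is proved analogously. ◻
As a direct consequence of the previous result, we have the following corollary, which states that the minimal generators of the mixed context contain the minimal generators of both simple contexts.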
So when we use the positive and negative information, we capture all the generators, i.e., the positive generators, the negative and the mixed ones.
A question arises from the last corollary: are there elements in $\mathrm{Gen}^{\#}(\mathbb{K})$ which are neither in $\mathrm{Gen}(\mathbb{K})$ nor in $\mathrm{Gen}(\overline{\mathbb{K}})$? The answer is affirmative, and we devote the rest of this section to showing it.

(1) If $A$ is a minimal generator in the mixed context, with $A \subseteq M$ (i.e., $A$ consists only of positive attributes), whose closure is $A^{\Downarrow\Uparrow} = C \subseteq M \cup \overline{M}$, then $A$ is also a minimal generator in $\mathbb{K}$ and $A^{\downarrow\uparrow} = \mathrm{Pos}(C)$.

(2) If $A$ is a minimal generator in the mixed context, with $A \subseteq \overline{M}$ (i.e., $A$ consists only of negative attributes), whose closure is $A^{\Downarrow\Uparrow} = C \subseteq M \cup \overline{M}$, then $A$ is also a minimal generator in $\overline{\mathbb{K}}$ and $A^{\downarrow\uparrow} = \mathrm{Neg}(C)$.
Proof We will prove (1), since (2) follows by analogous reasoning. Let us consider a minimal generator $A \subseteq M$ in the mixed context. Let us take $B \subseteq A$ such that $B^{\downarrow\uparrow} = A^{\downarrow\uparrow}$; we will show that $B = A$, which means that $A \in \mathrm{Gen}(\mathbb{K})$. As $B^{\downarrow\uparrow} = A^{\downarrow\uparrow}$, applying the operator $\downarrow$, we obtain $B^{\downarrow\uparrow\downarrow} = A^{\downarrow\uparrow\downarrow}$. Using that $X^{\downarrow\uparrow\downarrow} = X^{\downarrow}$ for any $X \subseteq M$ (see Proposition 1 in [27]), we arrive at $B^{\downarrow} = A^{\downarrow}$. Applying Lemma 1, we obtain that $B^{\Downarrow} = A^{\Downarrow}$. Applying now the derivation operator $\Uparrow$, we have that $B^{\Downarrow\Uparrow} = A^{\Downarrow\Uparrow}$. Since $A$ is a minimal generator in the mixed context and $B \subseteq A$, it must be that $A \subseteq B$. Thus $B = A$. Therefore, $A$ is a minimal generator in $\mathbb{B}(\mathbb{K})$. Moreover, $A^{\downarrow\uparrow} = \mathrm{Pos}(C)$ is a consequence of Lemma 1. ◻
In what follows, our purpose is to characterise the structure of the minimal generators of $\mathbb{B}^{\#}(\mathbb{K})$. To this end, we denote by $\mathrm{Gen}^{\pm}(\mathbb{K})$ the set of minimal generators of $\mathbb{B}^{\#}(\mathbb{K})$ that contain both positive and negative attributes. With this notation, we have
$$\mathrm{Gen}^{\#}(\mathbb{K}) = \mathrm{Gen}(\mathbb{K}) \cup \mathrm{Gen}(\overline{\mathbb{K}}) \cup \mathrm{Gen}^{\pm}(\mathbb{K}).$$
This means that the mixed context yields not only the generators of the positive and negative contexts, but also others that arise from mixing positive and negative attributes.
Example 5 Now, we proceed to study the minimal generators of the formal contexts of Example 2 and to check that the previous theoretical results hold in this running example. For the sake of simplicity, we only consider non-trivial minimal generators, that is, those in which the minimal generator is not the concept itself. For example, in the positive lattice $\mathbb{B}(\mathbb{K})$, the minimal generator $\{a, c\}$ is trivial because it generates the concept $\{a, c\}$ again. Listing the non-trivial minimal generators for $\mathbb{B}(\mathbb{K})$ and $\mathbb{B}(\overline{\mathbb{K}})$, we find only four in each simple context. In contrast, in the mixed context, we can find new patterns relating positive and negative attributes that do not appear among them.
Next, the Hasse diagrams of the minimal generators for the mixed and simple contexts are depicted in Fig. 2, where we have used the same colour code as in Fig. 1. For the sake of presentation, we have not included in the figure those generators that explicitly describe a contradiction: $\{a, \overline{a}\}$, $\{b, \overline{b}\}$, $\{c, \overline{c}\}$ and $\{d, \overline{d}\}$. Note that Fig. 2 makes it easy to visualise the new knowledge extracted through the use of mixed attributes: firstly, none of the minimal generators of the mixed context coloured in white can be obtained from the minimal generators of the positive and negative contexts (coloured in blue and orange, respectively); and secondly, all the minimal generators of the positive and negative contexts are embedded in the set of minimal generators of the mixed context. ◻

With the theoretical results and the running example in this section, we have illustrated that working with mixed contexts achieves a greater level of granularity in the knowledge extracted from the context and a more exhaustive exploration of the relationships between attributes. New hidden patterns arise with the computation of the minimal generators for mixed formal contexts.
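The inclusions discussed in this section can be checked by brute force on a small context. The following Python sketch is only illustrative: the four-object context, the `~` prefix used to write negated attributes, and the function names `closure` and `minimal_generators` are assumptions introduced here; they are not the context of Example 2 nor the fcaR implementation. The enumeration is exponential in the number of attributes, so it is only suitable for tiny contexts.

```python
from itertools import combinations

def closure(attrs, context, universe):
    """Attributes shared by every object that has all of `attrs`.
    If no object has them, the closure is the whole attribute set."""
    result = set(universe)
    for row in context.values():
        if attrs <= row:
            result &= row
    return frozenset(result)

def minimal_generators(context, universe):
    """All sets whose closure changes as soon as one attribute is removed.
    Checking single removals suffices because closure is monotone."""
    gens = set()
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            a = frozenset(combo)
            target = closure(a, context, universe)
            if all(closure(a - {x}, context, universe) != target for x in a):
                gens.add(a)
    return gens

# Hypothetical toy context (assumption): objects o1..o4, attributes a, b, c.
M = {"a", "b", "c"}
positive = {"o1": {"a", "b"}, "o2": {"b", "c"}, "o3": {"a", "c"}, "o4": {"a"}}
M_neg = {"~" + m for m in M}
negative = {o: {"~" + m for m in M - row} for o, row in positive.items()}
mixed = {o: positive[o] | negative[o] for o in positive}

gen_pos = minimal_generators(positive, M)
gen_neg = minimal_generators(negative, M_neg)
gen_mixed = minimal_generators(mixed, M | M_neg)

# Generators of the mixed context that come from neither simple context.
gen_pm = gen_mixed - gen_pos - gen_neg
```

In this toy context, `gen_pos` and `gen_neg` are both contained in `gen_mixed`, and `gen_pm` contains, for instance, the set `{b, ~c}`, which combines a positive and a negative attribute.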

Case Study and Results
In this section, we present a case study where we use the previous theoretical results and show their advantages with respect to the classical approach. For the sake of reproducibility, all the material (dataset and code, as well as a script to replicate the results) has been collected and presented in a public GitHub repository at https://github.com/Malaga-FCA-group/demo-elearning.

Introduction to the Case
The object of study is a class of 47 students in the third year of compulsory secondary education in Spain (3º ESO). The analysis considers the marks of those students in the subject of Mathematics in all the partial exams and coursework completed during the 2020/2021 course. All those exams can be split into three terms (see Table 2). The analysis aims to exploit the information obtained through the use of mixed contexts (with positive and negative attributes) in two ways:
• On the one hand, we propose the use of the mixed lattice $\mathbb{B}^{\#}(\mathbb{K})$ to establish the possible learning paths of a student;
• On the other hand, we describe the knowledge space generated by the students by means of the minimal generators.
The analysis has been done in R. The package used for the study is fcaR, which was developed by our team and is available at the CRAN package repository [28]. This package allows the user to compute the main operations of FCA, such as concept lattices and minimal generators. The dataset, provided by the teacher in a spreadsheet, is turned into a formal context where the students are the objects and each unit is an attribute. A student is related to an attribute if they passed the exam of the respective unit. Note that, in the mixed context, we also have the negated attribute, which states that a student has failed the respective exam. The names of the (positive) attributes are specified in Table 2 next to the unit, and the negated attribute is represented by an overline; e.g., the attributes $I$ and $\overline{I}$ represent "the student has passed the exam of Integers and fractions" and "the student has failed the exam of Integers and fractions", respectively.
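The construction of the mixed context from the marks can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the fcaR code used in the study: the pass mark of 5 on a 0 to 10 scale, the `~` prefix standing for the overline, the function name `to_mixed_context` and the two sample rows are all hypothetical.

```python
PASS_MARK = 5.0  # assumption: marks on a 0-10 scale, passing at 5 or above

def to_mixed_context(marks, pass_mark=PASS_MARK):
    """Map each student to a set holding, for every unit, either the
    positive attribute (passed) or its negation "~unit" (failed)."""
    mixed = {}
    for student, row in marks.items():
        mixed[student] = {
            unit if mark >= pass_mark else "~" + unit
            for unit, mark in row.items()
        }
    return mixed

# Hypothetical marks for two students and three units (not the real dataset).
marks = {
    "s1": {"I": 7.5, "D": 4.0, "P": 6.0},
    "s2": {"I": 3.0, "D": 2.5, "P": 5.0},
}
mixed = to_mixed_context(marks)
```

By construction, every student carries exactly one of each pair of attributes, so no row of the mixed context is contradictory.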

First Analysis: Exploration of the Knowledge Space
For the first study, the mixed lattice $\mathbb{B}^{\#}(\mathbb{K})$ has been constructed using the NextClosure algorithm [27], yielding a total of 403 rules in the implication basis and 1769 concepts. This lattice can be navigated to determine the path that a learner has followed during the course to either pass or fail an exam. For example, let us pose a practical question: what has to happen for a student who has failed the Polynomials and Functions exams to pass the course? In Fig. 3, we show the subsemilattice formed by those concepts (from $\mathbb{B}^{\#}(\mathbb{K})$) that contain the negations of the attributes P and F, i.e., the attributes stating that a student failed the exams related to the units of Polynomials and Functions. Note that the top concept also contains the negation of 2ndT, indicating that all the students that have failed those two exams have also failed the second term. The same colour code as in Example 3 has been used to mark those concepts that also appear in the corresponding simple lattice. To improve the readability of this graph, only those attributes that do not appear in the nodes immediately above are shown in each node. A node marked with ⋆ indicates that its attributes are exactly those formed by the union of its upper neighbours. To reinforce this fact, the arrows have been labelled with the symbols + and ∪, indicating that the immediate subconcept (below) either contains new attributes (specified in boxes) or is the union of its upper neighbours, respectively. For example, the concept at the bottom, represented by a ⋆, is exactly the concept containing the attributes {P, F, 2ndT, I, d, 1stT, 3rdT, M, Final, G, Cwk, S} .
To answer the posed question, we can observe in Fig. 3 that there is a path, marked in blue, corresponding to students that reach a knowledge state (concept) in which the positive attribute Final is present, i.e., students that pass the course. This learning path includes passing the coursework (Cwk), the Statistics exam (S) and the third term (3rdT), among other possibilities. In contrast, we can see that many other paths end with the negation of Final, indicating that students have not completed the course successfully. In all these paths, the negation of 3rdT appears, indicating that students failing the third term also fail the course. This exhaustive analysis would not have been possible using only the positive and negative contexts, hence the importance of the wealth of knowledge that can be extracted from the mixed context. Note that using only $\mathbb{B}(\mathbb{K})$ or $\mathbb{B}(\overline{\mathbb{K}})$, we could never have inferred the possible dependency relationship or learning path between two attributes with opposite signs (such as in this case).
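Statements of the form "students failing the third term also fail the course" are attribute implications, and whether one holds in a context can be checked directly. The Python sketch below is an illustration under assumptions: the three students are invented for the example (they are not rows of the real dataset), negated attributes are written with a `~` prefix, and `holds` is a hypothetical helper name.

```python
def holds(premise, conclusion, context):
    """True if every object that has all attributes in `premise`
    also has all attributes in `conclusion`."""
    return all(conclusion <= row for row in context.values() if premise <= row)

# Hypothetical mixed rows (assumption), restricted to a few attributes.
toy = {
    "s1": {"~P", "~F", "~2ndT", "Cwk", "S", "3rdT", "Final"},
    "s2": {"~P", "~F", "~2ndT", "~Cwk", "S", "~3rdT", "~Final"},
    "s3": {"P", "F", "2ndT", "Cwk", "~S", "~3rdT", "~Final"},
}

fail_third_implies_fail_course = holds({"~3rdT"}, {"~Final"}, toy)
fail_p_f_implies_fail_second = holds({"~P", "~F"}, {"~2ndT"}, toy)
fail_p_f_implies_fail_course = holds({"~P", "~F"}, {"~Final"}, toy)
```

In this toy data the first two implications hold, while the third does not, because student `s1` failed Polynomials and Functions but still passed the course; this mirrors the kind of learning path discussed above.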

Second Analysis: Minimal Generators
In the second analysis, we focus on minimal sets of exams or terms that lead students to fail the whole course. In this case, the mathematical entities of our framework that can help us face this question are the minimal generators. In particular, our analysis aims at providing answers to the following question: "Is it possible for a student that has failed a certain number of units to pass the course?" or, equivalently, "Which are the minimal sets of exams or terms that lead the students to fail the module?".
Notice that, in a well-structured subject, there should not be a unit, nor a small set of units, that implies passing the whole course. For this reason, we do not focus our analysis on passing the whole course, but on failing it. Along these lines, since Mathematics is a hierarchical subject, in the sense that one unit is usually necessary to understand the following one, we aim to answer whether a student that failed some exams would fail all the subsequent exams. A negative answer would support the idea that the contents of one unit can be recovered (without a specific exam) during the rest of the course. This would mean that even with a rough start, hard work pays off.
The computation of all the minimal generators of the mixed context produces a total of 24,975 generators. For the sake of readability, we have filtered this large set to select generators whose support is greater than 10%, that is, generators consisting of combinations of attributes that appear, at least, in 10% of the cases. Since the minimal generators and their closed sets can be unambiguously represented by a set of implications, we will use such a representation. Thus, any implication in the following will have a minimal generator as its left-hand side. This allows us to use the Simplification Logic to further reduce the redundancies in the implications. After this process, we restrict the results to those implications that are of interest for the question posed above and obtain 20 representative implications that are related to failing the course:

Firstly, note that the left-hand side of each implication is a minimal generator of a concept containing the negation of Final. Note also that this attribute does not appear in the right-hand side of some of these implications. This is due to the simplification performed to improve legibility. For instance, for rule number 7, the consequent does not mention it explicitly but, if we compute the logical closure of the antecedent {P, 2ndT, 3rdT}, we obtain {P, 1stT, 2ndT, 3rdT, Final, M} . It suffices to observe that we can use [Comp] with rules 1 and 7 to infer the logical closure.
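The support filter mentioned above can be sketched in a few lines of Python. This is an illustration under assumptions, not the fcaR code of the study: the four rows and the candidate generators are invented, negated attributes are written with a `~` prefix, and the comparison is taken to be strict ("greater than" the threshold).

```python
def support(attrs, context):
    """Fraction of objects that have every attribute in `attrs`."""
    if not context:
        return 0.0
    hits = sum(1 for row in context.values() if attrs <= row)
    return hits / len(context)

# Hypothetical mixed rows (assumption), restricted to two units.
toy = {
    "s1": {"~P", "~F"},
    "s2": {"~P", "F"},
    "s3": {"~P", "~F"},
    "s4": {"P", "F"},
}

candidates = [
    frozenset({"~P"}),
    frozenset({"~P", "~F"}),
    frozenset({"P", "~F"}),
]

THRESHOLD = 0.10  # the 10% threshold quoted in the text
kept = [g for g in candidates if support(g, toy) > THRESHOLD]
```

Here the third candidate is discarded because no student of the toy data passed Polynomials while failing Functions, so its support is zero.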
Secondly, for the sake of understanding, let us explain the information represented by some implications in detail. Let us start with rule 1; this one says that every student who has failed the second and third terms has failed the mean of the marks of the whole course and failed the course. This is an obvious piece of information from the teaching point of view. A student who fails two out of three terms will hardly ever pass the module (in this particular case, this situation does not arise, not even once). Whenever an algorithm gives trivial information as an output, it can be seen as a sign of the approach being coherent and trustworthy. However, not all the information given by this set of implications is trivial. For instance, rule 5 states that if a student has failed the first term and the Functions exam, then the student has also failed the Integers exam, the Decimals exam, the Polynomials exam, the second term, the third term and the final mark. In other words, every student that has failed the first term and Functions, a single unit in the second term, has failed the course. This emphasises that Mathematics has a pyramidal structure, where a lack of basic knowledge, namely failing the first term, makes learning new concepts significantly harder. In fact, the set of implications highlights the importance of the Functions unit for passing the course. It appears not only in the previously described implication 5 but also in the antecedent (i.e., in the minimal generators) of implications 2, 3, 10, 11, 12, 15, 16, 17, 18 and 20. On the other hand, we have the information reported by implication 6, whose antecedent is the only mixed set of attributes; specifically, it consists of the positive attribute D and the negations of the attributes 2ndT and M. Note that this implication would not be obtained if we were using only positive or negative contexts.
This implication states that every student that has passed the Decimals unit but failed the second term and the mean of the module failed the whole subject. Failing the average mark of the module is often a strong enough condition to fail the entire module, but this is not true in every single case. Some students have a failed average mark but passed the course anyway. This is due to some hidden evaluation method carried out by the teacher that depends on information that is not explicitly shown in the dataset; e.g., behaviour in class, punctuality, submitting homework on time, etc. In any case, note that FCA is able to reveal this hidden evaluation method, since the negation of M, taken alone, is not a minimal generator.
Finally, let us return to the posed question: which are the minimal sets of exams or terms that lead the students to fail the module? We focus on implication 16, which says that every student who has failed Integers, Polynomials and Functions has also failed Decimals, the first and third terms, and the subject. Hence, there is a set of units whose failure implies failing the whole module. Note that this set is not unique. For example, implication 17 says that every student who has failed the Decimals, Polynomials and Functions exams has failed the whole module.
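The logical closure of a set of attributes with respect to a set of implications can be computed as a simple fixpoint. The Python sketch below is an illustration under assumptions: it encodes rules 1, 5 and 16 only as they are paraphrased in the text (the simplified rules of the paper may have different consequents), it writes negated attributes with a `~` prefix, and it uses a naive iteration rather than Simplification Logic; `logical_closure` is a hypothetical helper name.

```python
def logical_closure(attrs, implications):
    """Smallest superset of `attrs` closed under the given implications,
    each implication being a (premise, conclusion) pair of sets."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed

# Rules 1, 5 and 16 as paraphrased in the text (assumed encoding).
rules = [
    ({"~2ndT", "~3rdT"}, {"~M", "~Final"}),                       # rule 1
    ({"~1stT", "~F"},
     {"~I", "~D", "~P", "~2ndT", "~3rdT", "~Final"}),             # rule 5
    ({"~I", "~P", "~F"}, {"~Final", "~D", "~1stT", "~3rdT"}),     # rule 16
]

closed = logical_closure({"~I", "~P", "~F"}, rules)
```

Starting from failing Integers, Polynomials and Functions, rule 16 fires first, which enables rule 5 and then rule 1, so the closure also contains the negations of 2ndT and M.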
A final remark: one could argue that a similar analysis can be done using only positive or negative attributes, to detect, for instance, how failing some items leads to failing the whole course. However, we must emphasise that the combination of positive and negative attributes provides a higher level of granularity that cannot be achieved otherwise. To reflect this point, note that, out of the 24,975 minimal generators, 24,023 are mixed, 509 are purely positive and 443 are purely negative. That is, more than 96% of the generators are mixed. Thus, using only one of the simple contexts would not allow us to reach the level of expressiveness and granularity that we can achieve with the mixed context.
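The proportions behind this remark follow directly from the reported counts; the rounded percentages below are computed here from those counts and are not quoted from elsewhere:

$$24{,}023 + 509 + 443 = 24{,}975, \qquad \frac{24{,}023}{24{,}975} \approx 96.2\%, \qquad \frac{509}{24{,}975} \approx 2.0\%, \qquad \frac{443}{24{,}975} \approx 1.8\%.$$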
To complete this remark and demonstrate that the mixed information (not purely positive or negative) is not incidental, a more comprehensive listing of mixed minimal generators related to this problem is presented in Appendix A, showing the richness of the information retrieved by Formal Concept Analysis in the mixed context.
The reader can observe that the use of mixed attributes, along with the proposed algebraic and logic tools, allows us to determine the structure of the knowledge obtained by the students during the course. Note that some pieces of that knowledge are not directly visible to the naked eye from the raw data, i.e., they cannot be derived without the use of formal tools.

Conclusions and Future Work
In this paper, we have recalled the basic notions of FCA, Simplification Logic and the relationship of those theories with Knowledge Space Theory. Furthermore, we have presented some new theoretical results about the minimal generators in FCA using positive and negative attributes. The most remarkable result states that the minimal generators of a mixed context contain not only the minimal generators of the positive and negative contexts but also new minimal generators not purely positive or negative. Consequently, we have shown the advantages of using the mixed context instead of using the context with just positive or negative information.
In addition, we have presented a case study. Specifically, we have analysed the students' marks in a Mathematics course in a High School in Andalusia and we have constructed its corresponding Knowledge Space. We have extended the approach of [15], using the information on whether a student fails or passes an exam. We have computed the minimal generators of the formal context and studied those that lead the students to fail the whole module. We have obtained some expected information but have collected some non-trivial information as well. In fact, we have explicitly shown that the use of mixed attributes gives a deeper insight than the standard FCA approach. This information might be a helpful reference for teachers when preparing the course organisation.
As future work, we plan to extend this study to the fuzzy framework. This will allow us to capture a finer grain in the information and a higher level of detail when modelling the knowledge space. In the long term, an analysis of the subjects of a whole course would be helpful to find dependencies between different subjects, not only in Mathematics.

Appendix A Mixed Minimal Generators
Mixed minimal generators, i.e. with both positive and negative attributes, and which, according to the theoretical results, are not deduced from any of the individual contexts, can be numerous. Of the 24,975 minimal generators that can be calculated for the mixed context of the case study, a total of 24,023 are mixed.
As an example of the richness and expressiveness of mixed attributes, in the following table, we show those minimal generators whose closure contains the negation of the attribute Final and which are mixed (neither purely positive nor purely negative), with a minimum support of 1%. In other words, they are the left-hand sides of implications whose right-hand side contains the negation of Final.