General information spaces: measuring inconsistency, rationality postulates, and complexity

AI systems often need to deal with inconsistent information. For this reason, since the early 2000s, some AI researchers have developed ways to measure the amount of inconsistency in a knowledge base. By now there is a substantial body of research on various aspects of inconsistency measurement. The problem is that most of this work applies only to knowledge bases formulated as sets of formulas in propositional logic, and hence is not directly applicable to the formats in which information is actually stored. The purpose of this paper is to extend inconsistency measurement to real world information. We first define the concept of a general information space, which encompasses various types of databases and scenarios in AI systems. Then, we show how to transform any general information space to an inconsistency-equivalent propositional knowledge base, and finally apply propositional inconsistency measures to find the inconsistency of the general information space. Our method allows for the direct comparison of the inconsistency of different information spaces, even though the data is presented in different ways. We demonstrate the transformation on four general information spaces: a relational database, a graph database, a spatio-temporal database, and a Blocks world scenario, where we apply several inconsistency measures after performing the transformation. Then we review so-called rationality postulates that have been developed for propositional knowledge bases as a way to judge the intuitive properties of these measures. We show that although general information spaces may be nonmonotonic, there is a way to transform the postulates so that they can be applied to general information spaces, and we show which of the measures satisfy which of the postulates. Finally, we discuss the complexity of inconsistency measures for general information spaces.


Introduction
As AI systems may need to deal with inconsistency, some AI researchers started in the early 2000s to develop ways of measuring the inconsistency of a propositional knowledge base, that is, a set of formulas in propositional logic. By now a substantial amount of work has been done along these lines, which includes for instance the approaches developed in [35-38, 40, 42, 45, 47, 55, 56, 62]. A survey on the topic can be found in [26].
But information in many cases is not restricted to propositional logic formulas. In this paper we show that much of the work done for measuring inconsistency in propositional logic knowledge bases can be applied to measure inconsistency in more complex frameworks, where real world information is actually stored. Our approach is as follows.
- We first introduce the concept of a general information space that covers various types of databases and AI frameworks. A general information space consists of a framework, such as a database schema, a set of information units, such as tuples in a relation, and a set of requirements, such as integrity constraints.
- Then, we formulate a transformation from a general information space to a propositional knowledge base in such a way that the inconsistencies are preserved. As expected, the propositional knowledge base resulting from the transformation does not capture the rich content of the general information space, but it does capture all of its inconsistencies in a precise way.
- Finally, after the transformation we apply a propositional inconsistency measure to the just obtained propositional knowledge base to find the inconsistency of the general information space.
Hence, our approach allows for measuring the inconsistency of any information space and the method can be applied in a uniform manner over a variety of information scenarios. To the best of our knowledge, this is the first approach that lifts the idea of inconsistency measure from propositional knowledge bases to a range of different frameworks used for storing real world data. This makes it possible to apply all the results about inconsistency measures for propositional knowledge bases to a wide range of applications. We believe that this is a significant advance for the whole idea of inconsistency measurement that has been developed for propositional knowledge bases, and explored in several other individual settings such as software specifications [46], databases [4,14,44,51], and ontologies [63,64], among others.
The plan of this paper is as follows. We start by giving some basic information and examples of inconsistency measures for propositional knowledge bases in Section 2. Then, we define the concept of a general information space in Section 3. This is followed by the steps of the transformation and the equivalence of the two for inconsistencies in Section 4. Section 5 contains four examples of general information spaces: a relational database [11] (Section 5.1), a graph database [54] (Section 5.2), a spatio-temporal database [53] (Section 5.3), and a Blocks world scenario [32] (Section 5.4). Our method allows for the important task of evaluating and comparing the inconsistency [23], even for such different ways of storing information. Then we introduce rationality postulates for propositional inconsistency measures in Section 6.1. We show how to modify these postulates in order to check them for general information spaces in Section 6.2. We also construct a table describing the various new postulates satisfied by our inconsistency measures and contrast this with the satisfaction of these measures for propositional knowledge bases in Section 6.3. Finally, in Section 7, we deal with complexity and show that in some cases, unlike for propositional knowledge bases, the inconsistency measures for general information spaces can be computed in polynomial time. We draw conclusions and outline future work in Section 8.
This paper is a substantially revised and extended version of [31]. We have made the following major modifications. In Section 2, we added a discussion of two classifications for inconsistency measures and added an eighth inconsistency measure. In Section 3 we clarified several important issues concerning the definition of a general information space. We substantially extended Section 5. We clarified several issues concerning the graph database of Section 5.2 and added the constraints in first-order logic. Section 5.3 is completely new; here we discuss a specific spatio-temporal database as a general information space and measure its inconsistency. We also clarified several issues concerning the Blocks world database of Section 5.4 and added the constraints in first-order logic. Sections 6 and 7, where we explore postulates satisfaction and complexity, respectively, are completely new. Finally, Section 8 was rewritten to account for the changed scope of this paper.

Brief background on inconsistency measures for propositional knowledge bases
The idea of an inconsistency measure is to assign a nonnegative number to a knowledge base that measures its inconsistency. We start with a propositional language of formulas composed from a countable set of atoms, the fundamental propositions, and the connectives ∧ (conjunction), ∨ (disjunction), and ¬ (negation). We write K for the set of all propositional knowledge bases (KBs), i.e. the set of all finite sets of formulas in the language, and K for an individual KB. For any set X, 2^X denotes the set of all subsets (the power set) of X. In general, an inconsistency measure assigns each KB a nonnegative real number or infinity.

Definition 1 A function I : K → R≥0 ∪ {∞} is an inconsistency measure iff the following Consistency condition holds for all K ∈ K:
1. Consistency. I(K) = 0 iff K is consistent.
Consistency is called a (rationality) postulate. It means that all and only consistent knowledge bases get the measure 0, and hence all inconsistent knowledge bases get a positive measure. In Section 6.1 we will introduce additional postulates. These are properties that conform to some intuitive notion of how an inconsistency measure should behave. However, the only property that is accepted by all researchers is Consistency.
A classical interpretation i for K assigns each atom a that appears in a formula of K the truth value T (true) or F (false), that is, i : Atoms(K) → {T, F}, where Atoms(K) denotes the set of atoms of K. However, there is an important propositional inconsistency measure, I_C, that uses 3 truth values: T, F, and B (both), where B indicates inconsistency. This measure uses Priest's 3-valued logic, whose interpretations use an ordering on the truth values where F < B < T; ∧ computes the minimum value while ∨ computes the maximum value, and ¬B = B. So, for example, B ∧ F = F and B ∨ F = B. An interpretation i satisfies a formula iff the truth value of the formula for i is T or B.

Now we are ready to define the propositional inconsistency measures we will consider in this paper. Below the definition we briefly explain the meanings of these measures. We write MI(K) for the set of minimal inconsistent subsets of K, and Problematic(K) = ∪_{M ∈ MI(K)} M for the set of formulas that appear in some minimal inconsistent subset.

Definition 2 (Propositional Inconsistency Measures) For a knowledge base K, the inconsistency measures I_B, I_M, I_#, I_P, I_H, I_nc, I_C, and I_mv are such that

I_B(K) = 1 if K is inconsistent, and 0 otherwise;
I_M(K) = |MI(K)|;
I_#(K) = Σ_{M ∈ MI(K)} 1/|M|;
I_P(K) = |Problematic(K)|;
I_H(K) = min{|X| : X ⊆ K and K \ X is consistent};
I_nc(K) = |K| − max{k : every K' ⊆ K with |K'| = k is consistent} if K is inconsistent, and 0 otherwise;
I_C(K) = min{|{a ∈ Atoms(K) : i(a) = B}| : i is a 3-valued interpretation that satisfies K};
I_mv(K) = |Atoms(Problematic(K))| / |Atoms(K)|.

We explain the measures as follows. I_B is also called the drastic measure [36]: it simply distinguishes between consistent and inconsistent KBs. I_M counts the number of minimal inconsistent subsets [36]. I_# also counts the number of minimal inconsistent subsets, but it gives larger sets a smaller weight; the reason is that when a minimal inconsistent set contains more formulas than another minimal inconsistent set, the former is intuitively less inconsistent than the latter [36]. I_P counts the number of formulas that contribute essentially to one or more inconsistencies [24]. I_H counts the minimal number of formulas whose deletion makes the set consistent [25]. I_nc uses the largest number such that all sets with that many formulas are consistent [17]. I_C counts the minimal number of atoms that must be assigned the truth value B in the three-valued logic by an interpretation that satisfies every formula in the KB [24].
Finally, I_mv is the ratio of the number of atoms in the problematic formulas to the total number of atoms [62]. For the KB K_ex, we have the following results.

1. I_B(K_ex) = 1 as it is inconsistent.
2. I_M(K_ex) = 2 as there are 2 minimal inconsistent subsets.
3. I_#(K_ex) = 1/3 + 1/3 = 2/3 as both minimal inconsistent subsets consist of 3 formulas.
4. I_P(K_ex) = 5 as there are 5 problematic formulas.
5. I_H(K_ex) = 1 as deleting a_2 suffices to make K_ex consistent.
6. I_nc(K_ex) = 7 − 2 = 5 as 2 is the largest number such that all subsets of size 2 are consistent.
7. I_C(K_ex) = 1 as an interpretation assigning B to a single atom satisfies all the formulas.
8. I_mv(K_ex) = 3/5 as 3 of the 5 atoms appear in problematic formulas.

There have been several proposals to classify inconsistency measures. We will deal with two classifications. It should be noted that typically a classification does not encompass all inconsistency measures; rather, it is used to distinguish certain features of the measures.
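To make these definitions concrete, the following sketch computes the syntactic measures I_B, I_M, I_#, I_P, and I_H by brute force on a tiny hypothetical KB (our own three-atom example, not the K_ex discussed above): formulas are evaluated over all truth assignments, and minimal inconsistent subsets are enumerated directly.

```python
from itertools import combinations, product

# Formulas are represented as functions from an assignment (atom -> bool) to bool.
# Hypothetical example KB {a, ~a v b, ~b, c} -- not the paper's K_ex.
ATOMS = ["a", "b", "c"]
KB = {
    "a":    lambda v: v["a"],
    "~a|b": lambda v: (not v["a"]) or v["b"],
    "~b":   lambda v: not v["b"],
    "c":    lambda v: v["c"],
}

def consistent(formulas):
    """A set of formulas is consistent iff some assignment satisfies them all."""
    formulas = list(formulas)
    return any(all(f(dict(zip(ATOMS, bits))) for f in formulas)
               for bits in product([False, True], repeat=len(ATOMS)))

def minimal_inconsistent_subsets(kb):
    mis = []
    names = list(kb)
    for k in range(1, len(names) + 1):          # smallest subsets first
        for sub in combinations(names, k):
            if not any(set(m) <= set(sub) for m in mis) and \
               not consistent(kb[n] for n in sub):
                mis.append(sub)
    return mis

MIS = minimal_inconsistent_subsets(KB)
I_B = 0 if consistent(KB.values()) else 1       # drastic measure
I_M = len(MIS)                                  # number of MIS
I_hash = sum(1 / len(m) for m in MIS)           # I_#: size-weighted MIS count
I_P = len({f for m in MIS for f in m})          # formulas in some MIS
I_H = min(k for k in range(len(KB) + 1)         # fewest deletions to consistency
          for dropped in combinations(KB, k)
          if consistent(KB[n] for n in KB if n not in dropped))
print(I_B, I_M, I_hash, I_P, I_H)
```

Brute-force enumeration over all subsets and assignments is exponential and is intended only as an executable restatement of the definitions; Section 7 returns to the complexity question.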
One classification distinguishes between syntactic and semantic measures. Syntactic measures are based on the syntax in the sense of the interaction of individual formulas with the minimal inconsistent subsets. On the other hand, a semantic measure takes into account the meaning of each formula by the truth values of its component atoms. A recent paper, [12], studies this issue in detail. Among the measures given above, the first six are syntactic measures and I C is a semantic measure. I mv does not fit into either category because it requires breaking the formulas into their atomic components without taking into account the truth values of these atoms.
Another classification distinguishes between absolute and relative measures. The idea is that an absolute measure measures the totality of the inconsistency, while a relative measure measures the proportion of the inconsistency. The bulk of the research on inconsistency measures considered only absolute measures, but a recent paper [6] studies specifically the properties of relative measures. Among the measures given above, the second through the seventh are absolute measures and I mv is a relative measure. I B does not fit into either category because it does not measure a totality (it cannot increase after 1) and although its value is between 0 and 1, as is the case for relative measures, it is not a ratio.

General information spaces
Our goal is to apply the idea of measuring inconsistency in propositional knowledge bases to more complex scenarios that are useful in AI and databases. Intuitively, a general information space provides specific information concerning some subject matter according to a given format. The underlying language is first-order logic where various concepts such as sets, relationships, graphs, etc. can be expressed.

Definition 3
A general information space S = ⟨F, U, C⟩ is a triple where F is the framework for the information, U is a set of information units, and C is a set of requirements that U must satisfy, where the following hold:

A1 (Specification of the frame). The framework F defines all the structures (represented in first-order logic as relations) as well as the domains for all the attributes of the relations.
A2.1 (Consistency and unity of the information units). Each information unit in U is a ground atom.
A2.2 (Negative information). Negative information is handled by the Closed World Assumption.
A3.1 (Consistency of the requirements). C is consistent.
A3.2 (Arity of the requirements). Each requirement has an arity which is a non-negative integer. A positive arity, k, means that there must be k information units involved in an inconsistency with that requirement. The arity 0 indicates an existential constraint where an inconsistency arises because some required information is not in U.
A3.3 (Restrictions on the requirements). We may restrict the requirements to allow only certain kinds of formulas.
A4.1 (Inconsistencies). All inconsistencies of S arise from the interaction of U and one element of C.
A4.2 (Finding inconsistencies). There is an effective procedure to find all inconsistencies of S.
We will also use the symbol S for a set, namely S = U ∪ C. In such a case F is specified either explicitly or implicitly.
A relational database is an example of a general information space. Here the framework is the database schema as well as the language used to describe the database and the domains for the constants. The information units are the tuples in the relations of the schema. The set of requirements is the set of integrity constraints.
A graph database is another example. Here the framework is the type of information stored in the vertices and edges. The information units are the information given by the vertices and edges. The requirements are the graph constraints.
A spatio-temporal database provides a third example. In this case the framework provides the objects, times, and spatial points in the database. Each information unit indicates that a particular object is at some point in space at some time value. The requirements are the constraints that specify what is not allowed, such as, that an object cannot be at two different points at the same time.
A Blocks world is a general information space as well. Here the framework is the type of information stored about blocks such as block color and which block is on top of another block. The elements of the domain are also part of the framework. The information units describe the Blocks world such as giving the colors of the blocks and which block is on top of another block. The requirements are the rules of the Blocks world such as that a green block cannot be on top of a blue block.
Another example is a board game configuration. Here the framework is information about the board and the pieces. The information units describe the position of each piece and some action or actions. The requirements are the rules of the game, such as how the pieces may move and what constitutes a winning position.
Thus a general information space encompasses many ways in which information is stored and presented; we will give four concrete examples in Section 5.
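Definition 3 can also be rendered as a small data structure. The sketch below is our own illustration (all names, such as GeneralInformationSpace and Requirement, are hypothetical): each requirement carries its arity (A3.2) and an effective procedure (A4.2) that returns its minimal violating sets of information units.

```python
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet, List

Unit = Any  # an information unit: a ground atom (tuple, vertex, st-atom, ...)

@dataclass(frozen=True)
class Requirement:
    name: str
    arity: int                      # A3.2: 0 means existential constraint
    # A4.2: effective procedure returning the minimal sets of units that
    # violate this requirement (for arity 0: [frozenset()] if violated).
    violations: Callable[[FrozenSet[Unit]], List[FrozenSet[Unit]]]

@dataclass
class GeneralInformationSpace:
    framework: Any                  # F: schema and domains (A1)
    units: FrozenSet[Unit]          # U: ground atoms (A2.1)
    requirements: List[Requirement] # C: assumed jointly consistent (A3.1)

    def inconsistencies(self):
        """All inconsistencies arise from U and one requirement (A4.1)."""
        return [(c.name, v) for c in self.requirements
                            for v in c.violations(self.units)]

    def is_consistent(self):
        return not self.inconsistencies()

# Hypothetical toy instance: two tuples violating a functional dependency.
units = frozenset({("r1", 1, 2), ("r1", 1, 3)})
fd = Requirement(
    "fd", 2,
    lambda U: [frozenset({s, t}) for s in U for t in U
               if s < t and s[0] == t[0] == "r1" and s[1] == t[1] and s[2] != t[2]])
S = GeneralInformationSpace(framework="toy schema", units=units, requirements=[fd])
print(S.is_consistent())  # the two tuples jointly violate the dependency
```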
It is worth noting that A1, A2, A3, and A4 hold in many real world scenarios, such as those mentioned earlier. For instance, for relational databases, tuples are units of information that are usually assumed to be consistent when considered alone (without interacting with the integrity constraints), integrity constraints are usually satisfiable (there exists a database instance that satisfies them), and procedures for checking inconsistency are well-known for large classes of integrity constraints [1,60].
A requirement for a general information space is really a constraint but we use this terminology to indicate that there need not be a formal language in which the requirement is presented and we will also use English in some examples.
Next we give some details concerning requirements for relational databases, and illustrate the notion of arity for them. Consider a relational database with two relations, a binary relation R1 and a ternary relation R2. As is usual, we call a requirement a constraint. Consider the constraint ¬R1(1, 2). This constraint specifically excludes the tuple (1, 2) from R1. Hence the existence of that tuple together with the constraint is inconsistent. Thus, the arity of this constraint is 1. Consider the more general constraint ∀x, y, z (R1(x, y) ∧ R1(x, z) → y = z). This states the functional dependency of the second attribute of R1 on the first. It would be violated, for instance, by the two tuples (1, 2) and (1, 3) in R1, causing an inconsistency. So the arity of this constraint is 2. Next consider the constraint ∀x, y (R1(x, y) → ∃z1, z2 R2(y, z1, z2)). This states the inclusion dependency that the elements in the second column of R1 are included in the first column of R2. A violation of this constraint is caused by a single tuple in R1 whose second element is not in the first column of R2. This means that the arity is 1. It is worth pointing out an important difference between this case and the previous two cases. Take the first constraint, ¬R1(1, 2); the functional dependency case is similar. The existence in R1 of the tuple (1, 2), that is, R1(1, 2), together with the constraint ¬R1(1, 2) forms an inconsistent set of formulas. But for the inclusion dependency the existence of some tuple in R1, say R1(1, 2), together with the inclusion dependency is not inconsistent: there may be a tuple in R2, say R2(2, 3, 4), that would satisfy the constraint. Hence for this type of constraint the context is needed, such as what tuples are in R2. In any case, all the usual database constraints, such as the various types of dependencies, denial and key constraints, are included in our concept of a requirement.
There is also a somewhat different type of constraint. Consider that the requirement R 1 (1, 2) requires the tuple (1, 2) to be in R 1 . So in this case it is a lack of the tuple that violates the requirement and causes an inconsistency. A more general example is ∃x 1 x 2 R 2 (1, x 1 , x 2 ) which states that there must be a tuple in the R 2 relation whose first element is 1. Note how these constraints are similar to the inclusion dependency given above: in all these cases a required tuple is not in the database. The difference is that for an inclusion dependency we can point to a tuple in R 1 that is the source of the problem, i.e., without that tuple there would not be a constraint violation. So for an inclusion dependency it is possible to delete a tuple from R 1 that is the source of the problem. But for the constraints just given, no deletion from the database resolves the violation. We call such a constraint an existential constraint and assign it arity 0.
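To make the arity discussion concrete, the following sketch (our own ad-hoc encoding of R1 and R2 as sets of tuples) detects violations of the three kinds of constraints just discussed: the denial ¬R1(1, 2) (arity 1), the functional dependency (arity 2), and the inclusion dependency (arity 1, but needing the context of R2).

```python
R1 = {(1, 2), (1, 3), (4, 5)}
R2 = {(2, 3, 4)}  # note: neither 3 nor 5 occurs in the first column of R2

# Arity 1: the denial constraint ~R1(1, 2) is violated by a single tuple.
denial_violations = sorted(t for t in R1 if t == (1, 2))

# Arity 2: the FD "first attribute determines the second" is violated by a
# *pair* of tuples agreeing on the first attribute but not on the second.
fd_violations = sorted((s, t) for s in R1 for t in R1
                       if s < t and s[0] == t[0] and s[1] != t[1])

# Arity 1 with context: the inclusion dependency R1[2] <= R2[1] is violated
# by a single R1-tuple, but deciding that requires looking at all of R2.
ind_violations = sorted(t for t in R1 if t[1] not in {u[0] for u in R2})

print(denial_violations)  # [(1, 2)]
print(fd_violations)      # [((1, 2), (1, 3))]
print(ind_violations)     # [(1, 3), (4, 5)]
```

Note that deleting the offending R1-tuples resolves all three kinds of violations, which is exactly why each of them can be assigned a positive arity; an existential constraint of arity 0, in contrast, cannot be repaired by any deletion.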
In our approach the violations of requirements are ascribed to information units appearing in the general information space. This is similar to what is done for instance in database repairing approaches that aim at restoring consistency by tuple deletions only, even in the presence of existential constraints' violations [10,61]. As a matter of fact, for large classes of requirements, e.g., denial constraints, violations can only be ascribed to the presence of some information units. However, violations of existential constraints are due to missing information units. We refer the reader to Section 8 for further discussion on the treatment of existential requirements.

The transformation
We now show how any general information space can be transformed to a propositional knowledge base in such a way that all the violations of the requirements are inconsistencies in the knowledge base. Note however that the transformation also loses some information: there is no way to go back from the propositional knowledge base to the original general information space. In fact many different information spaces representing different phenomena may be transformed to the same knowledge base. But the transformation is appropriate if we are interested in measuring inconsistency.

Definition 4 (Transformation)
The transformation from a general information space S = ⟨F, U, C⟩ to a propositional KB K_S is as follows.

- Define a bijective function f : U → A_U that assigns a distinct propositional atom to each information unit in U.
- Let B_C = {b_1, . . . , b_|C|} be another set of |C| propositional atoms, disjoint from A_U.
- Define a bijective function h : C → B_C that assigns a distinct propositional atom to each requirement in C.
- Let F_S be the set of propositional formulas using A_U ∪ B_C.
- Define a function g : C → F_S as follows. For each requirement c ∈ C:
  1) If there is no violation of the requirement, then set g(c) = h(c). Otherwise, there is at least one violation of c.
  2) If the arity of c is greater than 0, then a minimal inconsistency is formed by one or more information units together with c. Find all such sets, say M_c = {U_1, . . . , U_k}, and suppose that U_i = {u_{i,1}, . . . , u_{i,n_i}}. For each U_i form the clause ¬f(u_{i,1}) ∨ · · · ∨ ¬f(u_{i,n_i}), which is a propositional logic formula. Then, define g(c) = (¬f(u_{1,1}) ∨ · · · ∨ ¬f(u_{1,n_1})) ∧ · · · ∧ (¬f(u_{k,1}) ∨ · · · ∨ ¬f(u_{k,n_k})) ∧ h(c).
  3) When the arity of c is 0, define g(c) = ¬h(c) ∧ h(c).
- Finally, let K_S = {f(u) : u ∈ U} ∪ {g(c) : c ∈ C}.
Clearly g is one-to-one because each g(c) is identified with its corresponding unique h(c). Next we show the equivalence between the violations of the requirements C for S and the minimal inconsistent subsets of K_S. A requirement violation causes an inconsistency for S, but we need a definition for it. An inconsistency of S is a set consisting of exactly one requirement c ∈ C together with a minimal set of information units that jointly violate c; when a requirement c of arity 0 is violated, the inconsistency is {c} itself. Then we define Inc(S) as the set of inconsistencies of S. We say that S is consistent iff Inc(S) = ∅. The construction of K_S yields a bijection m from Inc(S) to MI(K_S), the set of minimal inconsistent subsets of K_S.

Proof (⇒) Let M ∈ Inc(S). Every inconsistency of S contains exactly one requirement. Suppose the arity of the requirement c is a positive integer k. In this case the inconsistency contains k elements from U, say u_1, . . . , u_k, and the requirement c ∈ C. Then, according to the construction, g(c) is a propositional formula in CNF, one of whose conjuncts is ¬f(u_1) ∨ · · · ∨ ¬f(u_k). Hence {f(u_1), . . . , f(u_k), g(c)} is a minimal inconsistent subset of K_S. If instead the arity of c is 0, then M = {c}, g(c) = ¬h(c) ∧ h(c), and {g(c)} is a minimal inconsistent subset of K_S.

(⇐) Let M ∈ MI(K_S). Based on the structure of the transformation, with the information units transformed to atoms and only the constraints transformed to formulas that may involve negation, there are two cases. In the first case M = {a_1, . . . , a_k, g(c)} where ¬a_1 ∨ · · · ∨ ¬a_k is a conjunct in g(c). As g is one-to-one, g^{−1} exists and g^{−1}(g(c)) = c, so M corresponds to the inconsistency {f^{−1}(a_1), . . . , f^{−1}(a_k), c} of S. In the second case M = {g(c)} where g(c) = ¬h(c) ∧ h(c), and g^{−1}(g(c)) = c, which is a requirement having arity equal to zero. In both cases the construction gives |M| = |m^{−1}(M)|.
Our approach to measuring the inconsistency of a general information space S according to a propositional inconsistency measure I x is to apply the inconsistency measure to the transformed space, that is, we define I x (S) = I x (K S ).
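The transformation can be implemented directly once the minimal violating sets of each requirement are known. The following sketch (our own string-based encoding of atoms and formulas, applied to a hypothetical mini-space) builds K_S: each information unit becomes an atom, each requirement c becomes g(c), and a violated arity-0 requirement becomes the contradiction ¬h(c) ∧ h(c).

```python
def transform(units, constraints):
    """constraints: name -> (arity, list of minimal violating sets of units).
    Returns K_S as strings: one atom per unit plus one formula g(c) per constraint."""
    f = {u: f"a_{i}" for i, u in enumerate(sorted(units), start=1)}  # f: U -> A_U
    kb = [f[u] for u in sorted(units)]    # every information unit becomes an atom
    for name, (arity, viol) in constraints.items():
        h = f"b_{name}"                   # h: C -> B_C, a fresh atom per requirement
        if not viol:                      # case 1: no violation
            g = h
        elif arity == 0:                  # case 3: violated existential constraint
            g = f"(~{h} & {h})"
        else:                             # case 2: one clause per minimal violation
            clauses = ["(" + " | ".join("~" + f[u] for u in sorted(v)) + ")"
                       for v in viol]
            g = " & ".join(clauses + [h])
        kb.append(g)
    return kb

# Hypothetical mini-space: units t1..t3, a key constraint, an existential
# constraint that is violated, and a constraint with no violation.
units = ["t1", "t2", "t3"]
constraints = {
    "key": (2, [{"t1", "t2"}]),   # t1 and t2 jointly violate the key
    "exist": (0, [set()]),        # required information missing from U
    "ok": (1, []),                # not violated
}
for formula in transform(units, constraints):
    print(formula)
```

Running the sketch yields the atoms a_1, a_2, a_3 followed by (~a_1 | ~a_2) & b_key, (~b_exist & b_exist), and b_ok, matching cases 2, 3, and 1 of Definition 4 respectively.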

Examples of general information spaces
We will describe four examples to illustrate our approach. In each case we first describe the general information space, then do the transformation, and finally compute the inconsistency measures according to the propositional inconsistency measures of Definition 2.

A relational database as a general information space
A relation instance is a set of tuples over a given relation scheme, and a database instance D is a set of relation instances over a given database scheme. We use e_i for terms, that is, elements of the domains or variables. An atom over a database scheme DS is an expression of one of the following forms: (a) R(e_1, . . . , e_n) where R is a relation scheme in DS, or (b) e_i • e_j where • ∈ {=, ≠, >, <, ≥, ≤}. An integrity constraint over DS is any (function-free) first-order sentence over the database scheme DS. For a database scheme DS and a set C of integrity constraints over DS, an instance D of DS is said to be consistent w.r.t. C (or, equivalently, C is satisfied by D, C is not violated by D) iff D |= C in the standard model-theoretic sense.
The components of the general information space S = ⟨F, U, C⟩ for a relational database instance D over the database scheme DS with a set C of integrity constraints are as follows. The framework F is the database scheme DS and the (function-free) first-order language using a set of uninterpreted constants and predicate symbols for relation names, as well as the domains of the attributes for the evaluation of constants. The set U of information units is the instance D (the set of the tuples in the relation instances), and the set C of requirements is the set C of integrity constraints.
We now provide an example. Let the framework F be the database scheme consisting of the relation scheme Asset(SN, DateLoaned, Employee, DateReturned), whose instance contains the serial number, the loan date, the employee's identifier, and the return date of assets provided by a company to the employees, about whom information is stored in two relations: Employee(SSN, Name, HiringDate) and Family(SSN, Child, Project). Here U, the database instance, is shown in the usual tabular form in Fig. 1. Altogether there are 13 information units (tuples) that for convenience we name t_i, with 1 ≤ i ≤ 13.
There are 8 requirements in C as given below both as first-order logic formulas and in English.

c1: ∀s, d, e, r (Asset(s, d, e, r) → d ≤ r), stating that, for every asset, the loan date cannot be later than the return date.
c2: ∀s, d, e, r, d', e', r' (Asset(s, d, e, r) ∧ Asset(s, d', e', r') → d = d' ∧ e = e' ∧ r = r'), i.e. the constraint that the serial number is a key for Asset.
c3: ¬∃d, e, s1, s2, s3, r1, r2, r3 (Asset(s1, d, e, r1) ∧ Asset(s2, d, e, r2) ∧ Asset(s3, d, e, r3) ∧ s1 ≠ s2 ∧ s1 ≠ s3 ∧ s2 ≠ s3), stating the numerical dependency [29,30] DateLoaned, Employee →2 SN whose meaning is that for every date and employee there can be at most 2 assets loaned.
c4: ∀i, n, h, n', h' (Employee(i, n, h) ∧ Employee(i, n', h') → n = n' ∧ h = h'), stating that SSN is a key for Employee.
c5: ∀i, i', n, h (Employee(i, n, h) ∧ Employee(i', n, h) → i = i'), that is, the pair of attributes Name and HiringDate also form a key for Employee.
c8: ∃i1, i2, x1, x2, y1, y2 (Family(i1, x1, y1) ∧ Family(i2, x2, y2) ∧ i1 ≠ i2), stating that there must be at least two employees referenced in the Family relation.

Transformation to a propositional knowledge base
We now show how the transformation from a general information space to a propositional knowledge base is applied for this relational database.
6. Now we show the mapping g by going over the constraints one at a time.
c1: The arity of c1 is 1. The 3 tuples t4, t5, and t7 each violate c1. Hence, g(c1) = ¬a4 ∧ ¬a5 ∧ ¬a7 ∧ b1.
c2: The arity of c2 is 2. The 3 tuples t1, t2, and t3 all have the same serial number but are not identical. Hence, g(c2) = (¬a1 ∨ ¬a2) ∧ (¬a1 ∨ ¬a3) ∧ (¬a2 ∨ ¬a3) ∧ b2.
c3: The arity of c3 is 3. The three tuples t5, t6, and t7 together violate this constraint. Hence, g(c3) = (¬a5 ∨ ¬a6 ∨ ¬a7) ∧ b3.
c4: There is no violation of c4. Hence, g(c4) = b4.
c5: The arity of c5 is 2. It is violated by the pair t9 and t10. Hence, g(c5) = (¬a9 ∨ ¬a10) ∧ b5.
c6: The arity of c6 is 1. It is violated separately by t3 and t4. Hence, g(c6) = ¬a3 ∧ ¬a4 ∧ b6.
c7: The arity of c7 is 2. It is violated by the pair t11 and t12. Hence, g(c7) = (¬a11 ∨ ¬a12) ∧ b7.
c8: The arity of c8 is 0 and it is violated by U. Hence, g(c8) = ¬b8 ∧ b8.

The calculation of the inconsistency measures
Below are the results of calculating the inconsistency measures of the relational database example.

- I_B(S) = 1 as K_S is inconsistent.
- I_M(S) = 12 as there are 12 minimal inconsistent subsets for K_S.
- I_C(S) = 8 as there must be at least 8 atoms, for example a2, a3, a4, a5, a7, a9, a11, and b8, that must be given the value B by a 3-valued interpretation in order to satisfy all the formulas.
- I_mv(S) = 18/21 = 6/7 as 18 of the 21 atoms are in problematic formulas.
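These counts can be checked mechanically from the minimal violating sets of the constraints. The following sketch (our own bookkeeping; the violating sets are those identified in the transformation of this example, with c4 unviolated and c8 a violated existential constraint) recomputes I_M and I_mv by exploiting the one-to-one correspondence between minimal inconsistent subsets of K_S and violations of S.

```python
# Minimal violating sets per positive-arity constraint, as identified in the text.
viol = {
    "c1": [{"t4"}, {"t5"}, {"t7"}],
    "c2": [{"t1", "t2"}, {"t1", "t3"}, {"t2", "t3"}],
    "c3": [{"t5", "t6", "t7"}],
    "c5": [{"t9", "t10"}],
    "c6": [{"t3"}, {"t4"}],
    "c7": [{"t11", "t12"}],
}
violated_existential = ["c8"]   # arity 0, violated by U
n_units, n_constraints = 13, 8  # 13 atoms a_i and 8 atoms b_i in K_S

# Each minimal violating set yields one minimal inconsistent subset of K_S,
# and each violated arity-0 constraint yields one more.
I_M = sum(len(v) for v in viol.values()) + len(violated_existential)

# Atoms in problematic formulas: the units involved in some violation, the
# tag atom b_c of each violated positive-arity constraint, and b_c8 itself.
problem_units = set().union(*(u for v in viol.values() for u in v))
problem_atoms = len(problem_units) + len(viol) + len(violated_existential)
I_mv = problem_atoms / (n_units + n_constraints)
print(I_M, I_mv)  # 12 and 18/21
```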

A graph database as a general information space
We consider a general form of graph databases where vertices may be associated with properties and edges may be labeled [52,54]. We assume the existence of three arbitrary, but fixed, disjoint sets V, L, and P of vertex names, edge labels, and vertex properties, respectively. Each property p ∈ P has an associated domain dom(p), which is the set of values that can be assigned to p.
An example of a graph database consisting of 7 vertices is shown in Fig. 2.
The components of the general information space S = ⟨F, U, C⟩ for a graph database are as follows. The framework F consists of basic information about the vertices and the edges of the graph, that is, the sets V, L, and P. Moreover, F describes which property p ∈ P is associated with a vertex v ∈ V through the function ℘, and its domain dom(p). For instance, a property for the vertices of our graph database example is type, whose domain includes person and media. In Fig. 2, the vertices indicated by circles represent people and the vertices indicated by rectangles represent media objects. Moreover, the property age is associated with people and resolution with media objects (their domains are obvious). The edges represent relationships between vertices whose meaning is given by the labels.
The data units consist of the information given by the vertices and the edges. Part of the information is given by the shape of a node, so some nodes are counted twice: once for the shape (one unit of information) and once for the properties, if any (another unit of information). Altogether there are 27 data units, numbered u1 through u27. Finally, we write the requirements (the constraints on the graph) in English.
c1: Every circular vertex (i.e., person vertex) must have an associated age value.
c2: Every rectangular vertex (i.e., media vertex) must have an associated resolution.
c3: There may not be a pair of rectangular vertices whose edges form a cycle.
c4: There cannot be 2 edges with the label "posted" going to the same rectangular vertex.
c5: For every edge between circular vertices that has the label "likes" there must be another edge with the label "knows".
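Requirements of this kind can be checked mechanically on a small graph. The sketch below is our own illustration with hypothetical vertices and edges, not the graph of Fig. 2, and it reads c5 as requiring a "knows" edge between the same pair of vertices; each check returns the violating vertices or edge pairs, matching the arity of the requirement.

```python
from collections import Counter

# Hypothetical labeled graph: vertex -> type, properties, and labeled edges.
vertex_type = {"v1": "person", "v2": "person", "m1": "media", "m2": "media"}
age = {"v1": 25}                  # v2 has no age: violates c1
resolution = {"m1": 1080}         # m2 has no resolution: violates c2
edges = {("v1", "posted", "m1"), ("v2", "posted", "m1"),   # violates c4
         ("m1", "tagged", "m2"), ("m2", "tagged", "m1"),   # violates c3
         ("v1", "likes", "v2")}                            # violates c5

people = [v for v, t in vertex_type.items() if t == "person"]
media = [v for v, t in vertex_type.items() if t == "media"]

c1 = [v for v in people if v not in age]           # persons missing an age
c2 = [v for v in media if v not in resolution]     # media missing a resolution
c3 = [(a, b) for a in media for b in media if a < b          # 2-cycles between
      and any(s == a and t == b for (s, _, t) in edges)      # media vertices
      and any(s == b and t == a for (s, _, t) in edges)]
posted_targets = Counter(t for (s, l, t) in edges if l == "posted" and t in media)
c4 = [t for t, n in posted_targets.items() if n >= 2]        # doubly posted media
c5 = [(s, t) for (s, l, t) in edges if l == "likes"          # "likes" without
      and s in people and t in people                        # a matching "knows"
      and not any(l2 == "knows" and {s2, t2} == {s, t}
                  for (s2, l2, t2) in edges)]
print(c1, c2, c3, c4, c5)
```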
The English version of the requirements, as given above, is easy to understand. In order to formalize them in first-order logic, we will use the binary relations VertexType, PersonAge, PhotoResolution, and the ternary relation Edge. Below we write these requirements as formulas in first-order logic.

Transformation to a propositional knowledge base
We now show how the transformation from a general information space to a propositional knowledge base is applied for this graph database.
5. F_S is the set of propositional formulas using A_U ∪ B_C.
6. Now we show the mapping g by going over the requirements one at a time.
c1: The arity of c1 is 1. The two data units u5 and u6 each violate c1. Hence, g(c1) = ¬a5 ∧ ¬a6 ∧ b1.
c3: The arity of c3 is 2 and it is violated by the pair of edges u26 and u27. Hence, g(c3) = (¬a26 ∨ ¬a27) ∧ b3.
c4: The arity of c4 is 2. It is violated by the pair of edges u22 and u25. Hence, g(c4) = (¬a22 ∨ ¬a25) ∧ b4.
c5: The arity of c5 is 1. The two edges u19 and u24 each violate c5. Hence, g(c5) = ¬a19 ∧ ¬a24 ∧ b5.

The calculation of the inconsistency measures
Below are the results of calculating the inconsistency measures of the graph database example.

A spatio-temporal database as a general information space
In this section, we consider spatio-temporal databases where atomic statements of the form "object id is/was/will be at point p in space at time t" are represented. This is a deterministic version of the SPOT (Spatial PrObabilistic Temporal) database [53], a declarative framework for the representation and processing of probabilistic spatio-temporal data, where we assume that each region in space consists of a single point. However, we allow regions to be sets of points when specifying spatio-temporal integrity constraints [49].
We assume the existence of three types of constant symbols: object symbols, time value symbols, and spatial region symbols. The constants are in I D = {id 1 , . . . , id m }, T = [0, 1, . . . , tmax] (where tmax is an integer), and regions r ⊆ Space = {p 1 , . . . , p n }. For simplicity, we use a square grid for Space within which a region r is a rectangle. In the grid, each location can be written as (α, β), where α and β are integers and 0 ≤ α, β ≤ N for some integer N . Thus, Space contains (N + 1)^2 points. For a singleton region consisting of point location p, we simply use p to denote the region (instead of writing r = {p}). We also use variables for each type: object variables, time variables, and spatial variables.
A spatio-temporal atom (st-atom, for short) is an expression of the form loc(x, y, z), where: (i) x is an object variable or a constant id ∈ I D, (ii) y is a space variable or a constant r ⊆ Space, and (iii) z is a time variable or a constant t ∈ T . We say that st-atom loc(x, y, z) is ground if all of its arguments x, y, z are constants. For instance, loc(id, p, t), where id ∈ I D, {p} ⊆ Space, and t ∈ T is a ground st-atom. The intuitive meaning of loc(id, p, t) is that object id is/was/will be at point location p at time t.
A spatio-temporal database ST is a finite set of ground st-atoms, each of them consisting of a singleton region. Integrity constraints have the form of spatio-temporal denial formulas [49], each of them being a universally quantified negation of conjunctions of st-atoms and built-in predicates. The meaning of a spatio-temporal database ST with a set of constraints is given by the interpretations that satisfy it. An interpretation I is a function I : I D × T → Space. Thus, an interpretation specifies a trajectory for each id ∈ I D. That is, for each id ∈ I D, I says where in Space object id was/is/will be at each time t in T .
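Since an interpretation is just a total assignment of a point to every (id, t) pair, it satisfies ST exactly when every ground st-atom lies on the corresponding trajectory. A minimal sketch (the data layout is an assumption of this sketch):

```python
def satisfies(interpretation, st_atoms):
    """interpretation: dict mapping (id, t) -> point; st_atoms: iterable of
    ground atoms (id, point, t). True iff every atom lies on the trajectory."""
    return all(interpretation.get((obj, t)) == p for (obj, p, t) in st_atoms)
```

Because an interpretation is a function, loc(id, p, t) and loc(id, q, t) with p ≠ q can never both be satisfied, which is the source of the inconsistencies discussed below.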
The components of the general information space S = F, U, C for a spatio-temporal database ST over the set I D of ids, T of time points, and Space, with a set of spatio-temporal denial formulas, are as follows. The framework F consists of the first-order language for defining st-atoms where id, time, and spatial attributes are interpreted using the domains I D, T , and Space, respectively. The set U of information units is the set of atoms in ST , and the set C of requirements is the set of spatio-temporal denial formulas (integrity constraints).
We now provide an example. Consider an airport security system which collects data from sensors. A simplified plan of an airport area is shown in Figure 3. Suppose that the security system uses a spatio-temporal database to represent the information collected. For instance, st-atom loc(id 1 , (7, 4), 9) says that the passenger with id 1 was recognized at point (7, 4) at time 9, while loc(id 1 , (2, 0), 5) says that id 1 was recognized at point (2, 0) at the earlier time 5. Here U , the spatio-temporal database, consists of the set of 12 ground st-atoms in Fig. 3b, which includes the two atoms above, namely data units u 1 , . . . , u 12 . The framework F is defined by using I D = {id 1 , id 2 }, T = [0, 20], and Space consisting of the set of points (α, β) such that 0 ≤ α ≤ 7 and 0 ≤ β ≤ 7.
There are 4 requirements in C as given below both in English and as spatio-temporal denial formulas. The underlying language of general information spaces, and thus of requirements, is first-order logic. However, in specific cases, like spatio-temporal databases, we can have restrictions on the requirements (cf. A3.3 of Definition 3). Spatio-temporal denial formulas are an example of such restrictions.
The first requirement for our spatio-temporal database is that "an object cannot be in two non-overlapping regions at any time between 1 and 20". It can be expressed as a spatio-temporal denial formula using the built-in predicate nov, where nov stands for "does not overlap".
Moreover, in region b security checks are performed on one individual at a time. The constraint "there cannot be two distinct objects in region b at any time between 1 and 20" can be expressed as a spatio-temporal denial formula. Due to the distance and the several obstacles between the entrance and the exit, we also have the constraint "no object can reach region c starting from region a in less than 10 time units", which can likewise be expressed as a denial formula. Finally, as the security check on each individual takes at least 2 time units, we know that "object id can go away from region b only if it stayed there for at least 2 time units", which can also be expressed as a denial formula.
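Checking a denial constraint amounts to scanning tuples of atoms for a forbidden pattern. For example, the region-b constraint above can be checked by the following sketch (representing region b as a set of points and fixing the time window are assumptions matching the English statement):

```python
from itertools import combinations

def two_in_b_violations(st_atoms, region_b, times=range(1, 21)):
    """Pairs of ground atoms placing two distinct objects in region b
    at the same time; each pair witnesses a violation of the constraint."""
    bad = []
    for (id1, p1, t1), (id2, p2, t2) in combinations(st_atoms, 2):
        if (t1 == t2 and t1 in times and id1 != id2
                and p1 in region_b and p2 in region_b):
            bad.append(((id1, p1, t1), (id2, p2, t2)))
    return bad
```

Each returned pair of atoms corresponds to one violating tuple of data units in the transformation of the next subsection.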

Transformation to a propositional knowledge base
We now show how the transformation from a general information space to a propositional knowledge base is applied for this spatio-temporal database.
7. Therefore K S = {a 1 , . . . , a 12 , (¬a 8 ∨ ¬a 9 ) ∧ b 1 , . . .}.

The calculation of the inconsistency measures
Below are the results of calculating the inconsistency measures of the spatio-temporal database example. We use the set MI(K S ) of minimal inconsistent subsets of the transformed knowledge base.

A blocks world configuration as a general information space
Blocks-world planning has been widely investigated in AI, as it captures several aspects of planning systems [32]. In this case, the components of S = F, U, C are as follows. The framework indicates that there is a finite number of colored blocks of the same size in stacks on a table, which is large enough to hold all of them (i.e., the number of stacks can be equal to the number of blocks). A blocks world configuration is shown in Fig. 4. The data units specify the color of each block and the location of each block. Each ground atom is a 4-tuple (block id, color, stack number, position), for which we use the shorthand st ij to indicate that the block is in stack i at location j (from the bottom). The data units in this example are the ground atoms describing the configuration in Fig. 4. In order to write the constraints in first-order logic we use a 4-ary predicate symbol Block and obtain the corresponding formulas.
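The 4-tuple encoding of data units can be sketched as follows. The single constraint shown ("no two blocks occupy the same slot") is an illustrative stand-in, not necessarily one of the requirements c 1 -c 6 used in the example:

```python
# hypothetical encoding: (block id, color, stack number, position from bottom)
blocks = [
    ("b1", "red", 1, 1),
    ("b2", "green", 1, 2),
    ("b3", "blue", 2, 1),
]

def slots_consistent(blocks):
    # illustrative requirement: no two blocks in the same (stack, position) slot
    slots = [(s, p) for (_, _, s, p) in blocks]
    return len(slots) == len(set(slots))
```

A violating pair of tuples under such a requirement would contribute a clause of the form (¬a_x ∨ ¬a_y) to the transformation below.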

Transformation to a propositional knowledge base
We now show how the transformation from a general information space to a propositional knowledge base is applied for this blocks world example.
c 4 : The arity of c 4 is 2. The blocks st 11 and st 12 as well as the blocks st 11 and st 13 violate this constraint. Hence, g(c 4 ) = (¬a 1 ∨ ¬a 2 ) ∧ (¬a 1 ∨ ¬a 3 ) ∧ b 4 .
c 5 : The arity of c 5 is 0. There is no purple block in any stack. Hence, g(c 5 ) = ¬b 5 ∧ b 5 .
c 6 : This constraint is satisfied. Hence, g(c 6 ) = b 6 .
7. Therefore K S = {a 1 , . . . , a 11 , (¬a 2 ∨ ¬a 3 ) ∧ b 1 , (¬a 4 ∨ ¬a 5 ∨ ¬a 7 ) ∧ b 2 , ¬a 4 . . .}.

The calculation of the inconsistency measures
Below are the results of calculating the inconsistency measures of the Blocks world example. We use the set MI(K S ) of minimal inconsistent subsets of the transformed knowledge base.

Rationality postulates for inconsistency measures
In Section 2 we defined several inconsistency measures for propositional logic. There are many others. In order to distinguish among them some researchers proposed various properties that, intuitively, a "good" inconsistency measure should possess. These properties are called rationality postulates; we will just use the term "postulate". In this section we present ten well-known postulates for propositional inconsistency measures. Our goal is to discover what happens when we consider these postulates in the context of general information spaces. It turns out that, in order to be applicable, the postulates need to be modified.

Postulates for propositional inconsistency measures
We give the definitions of ten well-known postulates and explain them. Monotony means that the enlargement of a KB cannot decrease its measure. The independence postulates mean that free formulas do not change the inconsistency measure. Penalty states that deleting a problematic formula decreases the measure. Dominance deals with the case where a KB and two formulas φ and ψ are given, φ is consistent, and φ logically implies ψ. Then the addition of ψ to the KB cannot yield a larger measure than the addition of φ. Super-Additivity and MI-Separability give information about the union of two knowledge bases under certain conditions. Super-Additivity deals with the case where the knowledge bases are disjoint, in which case the measure of the union is at least as great as the sum of the measures. MI-Separability requires that the minimal inconsistent sets of the two knowledge bases partition the minimal inconsistent sets of the union, in which case the measure of the union is the sum of the measures. MI-Normalization, Attenuation, Equal Conflict, and Almost Consistency deal specifically with minimal inconsistent sets. MI-Normalization requires every minimal inconsistent set to have measure 1. Attenuation requires larger minimal inconsistent sets to have smaller measures; Equal Conflict requires minimal inconsistent sets of the same size to have the same measure. Finally, Almost Consistency requires that as minimal inconsistent sets get larger, the measures get closer and closer to 0. Clearly, the satisfaction of MI-Normalization implies the satisfaction of Equal Conflict and the violation of Attenuation and Almost Consistency. Table 1 shows the satisfaction and violation of these postulates for propositional logic from [57].

Postulates for general information spaces
Recall how we measure inconsistency in a general information space. We transform it to a propositional knowledge base and then apply a propositional inconsistency measure to it. We cannot just assume that an inconsistency measure applied to a general information space has the same properties as the measure when applied to a propositional knowledge base. In fact, we will consider this issue in detail in the next subsection. In this subsection we focus on the definition of the postulates in our context. The propositional inconsistency measures use concepts such as minimal inconsistent subsets and free and problematic formulas. Some of the postulates involve the set operations union, intersection, and difference. In order to deal with the postulates we need to have definitions for them for general information spaces. The most critical issue is that the logic of general information spaces is not monotonic. For example, consider the relational database given in Section 5.1. Recall that there is a violation of c 8 because there is only one I D referenced in the Family relation. This leads to the formula ¬b 8 ∧ b 8 in K S , where S is the general information space under consideration. But if a tuple is inserted into Family with a different I D to form S′, then S ⊆ S′ but K S ⊈ K S′ because the formula ¬b 8 ∧ b 8 is not in K S′ (it is replaced by b 8 ), and so Inc(S) ⊈ Inc(S′). But propositional logic is monotonic: adding a formula to a propositional knowledge base cannot make a minimal inconsistent set disappear.
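This nonmonotonic behaviour of existential requirements is easy to reproduce. The toy check below is a stand-in for c 8 (the actual definition of c 8 is in Section 5.1; here we only assume it demands more than one I D referenced in Family):

```python
def inc_family(family_tuples):
    """Toy stand-in for requirement c8: the Family relation must reference
    more than one ID. Returns the set of violated requirements."""
    ids = {t[0] for t in family_tuples}
    return {"c8"} if len(ids) <= 1 else set()
```

Adding a tuple with a new ID makes the inconsistency disappear, which is exactly what cannot happen in propositional logic.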
Dealing with inconsistency measures in nonmonotonic logic has been thoroughly investigated [59]. In that work the inconsistency measures and rationality postulates are defined directly in the nonmonotonic logic. Specifically, [59] adjusts the Monotonicity postulate in order to obtain a meaningful one for nonmonotonic frameworks, where monotonicity cannot be required to hold for each additional piece of information. This is achieved by restricting the postulate to the case where additional information has no influence on the conflicts contained in the original knowledge base. From a technical standpoint, [59] relies on the concept of strong inconsistency to define inconsistency measures for nonmonotonic logics. A set of formulas is said to be strongly inconsistent if no superset is consistent. Thus, [59] defines strong monotonicity which, despite the name that emphasizes the role of strongly inconsistent subsets, actually weakens monotonicity due to its additional precondition.
In our approach, we start with a nonmonotonic logic and transform it to a monotonic logic in order to compute the inconsistency measures. This way we can use directly all the inconsistency measures previously defined for propositional logic. But we still need to give appropriate definitions for the concepts used for the postulates. One thing we can do is to restrict the requirements so that the general information space is monotonic. When that is not the case, we define what we call the monotonic elements and restrict the postulates to apply only to them. In the case mentioned above, where the requirements are restricted for monotonicity, all the elements are monotonic. We use the terminology "monotonic" for information units whose removal cannot cause a new inconsistency. This essentially corresponds to the way that [59] uses strong inconsistency and the preservation of conflicts for nonmonotonic logics in general.
Two important postulates consider what happens when a formula is removed from the knowledge base. In our case, either an information unit or a requirement is removed. The problem is that the nonmonotonicity of general information spaces means that the removal of an information unit may cause an inconsistency while the addition of an information unit may resolve an inconsistency. To handle this, we exclude such information units from consideration.
In the following, we assume that a general information space S = F, U, C is given. Recall that we also use the symbol S for the set S = U ∪ C.

Definition 7
We call u ∈ S monotonic iff Inc(S \ {u}) ⊆ Inc(S).
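Definition 7 can be tested directly given an oracle Inc that maps a set of information units and requirements to its set of inconsistencies (the oracle itself is an assumption of this sketch):

```python
def is_monotonic(u, S, inc):
    """u is monotonic iff Inc(S \\ {u}) is a subset of Inc(S);
    inc is an oracle mapping a set to its set of inconsistencies."""
    return inc(S - {u}) <= inc(S)
```

With an existential-style oracle (an inconsistency appears when "c" is missing), removing "c" is non-monotonic while removing an unrelated element is monotonic.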
Note that requirements are automatically monotonic. Next we define the free and problematic formulas in our context.
- Free(S) is the set of m-free information units and requirements in S.

-Problematic(S) is the set of m-problematic information units and requirements in S.
In defining the postulates for general information space inconsistency measures, we will use the m-versions of various concepts and substitute Inc for MI. Dominance does not change, but all other postulates are changed to their m-versions as given below. Note that Dominance applies only to requirements, as an information unit cannot logically imply another information unit. Now we define the postulates we will consider for inconsistency measures in general information spaces.

Postulate satisfaction for general information spaces
In this subsection we show which of the ten postulates for general information spaces are satisfied by the eight inconsistency measures given in Section 2. We cannot just use the results from the propositional case for two reasons: first, our definitions for the postulates are different, and second, the transformation of a general information space allows only propositional formulas of a certain form. For example, iceberg inconsistencies [13] cannot occur.

Theorem 2
The satisfaction of postulates for general information space inconsistency measures is as given in Table 2. This result shows that the satisfaction results for the m-versions of the postulates are the same as in the propositional case for the first 6 measures, all of which are syntactic measures. However, for I C there were several differences, and there was also a difference for I mv .

Complexity
As shown in [58], the complexity of computing inconsistency measures in propositional logic is generally high. The fact that even I B , the simplest inconsistency measure, cannot be computed in polynomial time intuitively suggests that it is impossible to compute any propositional inconsistency measure efficiently for general knowledge bases. However, some problems related to inconsistency measurement become computationally easier when considering knowledge bases where every formula is of one of the following two types: i) a Horn clause, that is, a disjunction of literals where at most one literal is unnegated, or ii) a Krom clause, i.e., a disjunction of at most two literals [58]. In particular, among the inconsistency measures considered in this paper, it turns out that only the computation of I B becomes polynomial under these restrictions.
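The two syntactic restrictions are easy to test on a clause represented as a list of (atom, negated) literal pairs (this representation is an assumption of the sketch):

```python
def is_horn(clause):
    # Horn clause: at most one unnegated literal
    return sum(1 for (_, negated) in clause if not negated) <= 1

def is_krom(clause):
    # Krom clause: at most two literals
    return len(clause) <= 2
```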
Here we consider the complexity of computing inconsistency measures for general information spaces. There are two steps in this process. The first step is transforming the general information space to a propositional knowledge base. Clearly, this step is the same for all inconsistency measures. The second step is computing the specific inconsistency measure of the transformed propositional knowledge base. We consider these issues separately, starting with the second one.
In the second step, the results from [58] for general propositional knowledge bases provide an upper bound on the complexity of computing the value of the considered inconsistency measures. We cannot use the results for restricted propositional knowledge bases, where every formula is a Horn/Krom clause, because the transformation does not yield a knowledge base of one of those forms. Nevertheless, as the transformation yields a knowledge base in a special format, we can show that all the syntactic measures considered except I H can be computed in polynomial time, and the same holds even for the measure I mv .
Proposition 1 Given a propositional knowledge base K S of the form in Definition 4, for each measure I x with x ∈ {B, M, #, P, mv}, I x (K S ) can be computed in polynomial time with respect to the size of K S .
Moreover, the size of X is minimum, otherwise V would not be a smallest vertex cover. Therefore, I H (K S ) = k.
(⇐) If I H (K S ) = k, then there is X ⊆ K S such that |X| = k, ∀M ∈ MI(K S ), X ∩ M ≠ ∅, and X is a smallest set that has a nonempty intersection with every minimal inconsistent subset. Starting from X we define a minimum vertex cover V of size k for G as follows.
Case A. If X contains only atoms a i , then V = {v i | a i ∈ X}. Indeed, since X has a nonempty intersection with every minimal inconsistent subset M ∈ MI(K S ), and every M ∈ MI(K S ) corresponds one-to-one to an edge in E, it follows that V is a minimum vertex cover of size k for G.
Case B. X contains some formula φ j . We show that there is an equivalent set of formulas X * ⊆ K S such that (i) I H (K S ) = |X * | = |X| and (ii) ∀M ∈ MI(K S ), X * ∩ M ≠ ∅. Let X * be the set of formulas obtained from X by replacing every formula φ j = (¬a x ∨ ¬a y ) ∧ b j ∈ X with a x (or, equivalently, a y ). Since each formula of form φ j = (¬a x ∨ ¬a y ) ∧ b j belongs to exactly one minimal inconsistent subset {a x , a y , φ j } ∈ MI(K S ), by replacing φ j with a x (or a y ) we obtain that {M | M ∈ MI(K S ), X * ∩ M ≠ ∅} coincides with {M | M ∈ MI(K S ), X ∩ M ≠ ∅}. Moreover, since X is a smallest set that has a nonempty intersection with every minimal inconsistent subset, |X * | = |X|; otherwise I H (K S ) < |X|. That is, no formula φ j = (¬a x ∨ ¬a y ) ∧ b j ∈ X is replaced with a formula a x (or a y ) already in X * . Finally, let V = {v i | a i ∈ X * }; reasoning as above, it follows that V is a minimum vertex cover of size k for G.
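The characterization used in this proof, I_H as the size of a smallest set of formulas hitting every minimal inconsistent subset, can be computed by brute force, exponential in general and consistent with the vertex-cover hardness shown here:

```python
from itertools import combinations

def i_h(formulas, mi_sets):
    """Smallest k such that some k-subset of formulas intersects every
    minimal inconsistent subset (brute-force hitting set)."""
    for k in range(len(formulas) + 1):
        for x in combinations(formulas, k):
            if all(set(x) & m for m in mi_sets):
                return k
    return None  # no hitting set exists within the given formulas
```

On a consistent knowledge base (MI empty) the loop succeeds immediately at k = 0, matching I_H = 0.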
The following theorem states that computing the value of I C is FP NP[log n] -complete. Also in this case, the computation of this measure for K S is not easier than that for general propositional knowledge bases, though K S has a specific form. Proof The membership in FP NP[log n] follows from the fact that the problem belongs to this class for general propositional knowledge bases [58]. The hardness for FP NP[log n] can be proved by showing a reduction from MIN-VERTEX COVER [48]. It suffices to use the construction provided in the proof of Theorem 3. In fact, the propositional knowledge base K S defined starting from an instance of MIN-VERTEX COVER is such that I C (K S ) = I H (K S ), and in particular it can be shown that an atom a i is assigned the truth value B iff the corresponding vertex v i belongs to a minimum vertex cover, from which the statement follows.
As for the complexity of computing I nc (K S ), we do not have a better result than the FP Σ p 2 [log n] -membership provided in [58] for the case of general propositional knowledge bases. Finding a tighter result remains open. In fact, this problem is also open for propositional knowledge bases where, to the best of our knowledge, no hardness for the above-mentioned class has been proved. Now we go back to the first step of the process of computing inconsistency measures for a general information space S, that is, the transformation. In fact, this is the bigger issue because it must be done for all inconsistency measures and requires finding Inc(S). Thus, in general, the cost of the transformation is exponential w.r.t. the number of information units and requirements of a general information space. However, it becomes polynomial under some reasonable conditions. For instance, consider first requirements of arity 0. A simple such requirement may be ∃yR(1, y), which can be checked in linear time. However, the requirement ∀x, y(R 1 (x, y) → ∃zR 2 (y, z)) needs a loop and takes quadratic time. Consider next requirements of arity 1. A simple example is ∀x, y(R(x, y) → x < 0 ∧ y > 1), which is a matter of checking all the tuples, leading again to a polynomial case. We get the following result (Proposition 3). Proof This follows from the fact that Comb(n, k) ≤ n^k, where k is assumed to be a constant.
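The counting argument in this proof corresponds to direct enumeration: for a requirement of constant arity k over n information units, all candidate tuples can be scanned in O(n^k) time since Comb(n, k) ≤ n^k. A sketch, again assuming a `violates` oracle:

```python
from itertools import combinations

def violating_tuples(units, arity, violates):
    # enumerate all violating tuples of a fixed-arity requirement;
    # Comb(n, k) <= n**k, so this is polynomial for constant arity k
    return [t for t in combinations(units, arity) if violates(t)]
```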
Therefore, since under the assumptions of Proposition 3 the size of the transformed knowledge base is polynomial w.r.t. the number of information units and requirements of the input general information space, using the result of Proposition 1, we obtain the following corollary.

Conclusions and future work
As inconsistency in real-world information systems cannot be easily avoided, many inconsistency-tolerant approaches have been developed to live with inconsistency [21] and provide appropriate mechanisms to handle inconsistent data [5,8,18]. For instance, approaches for dealing with inconsistency in databases include consistent query answering frameworks [2,9], data repairing [1,39,41,43], interactive data repairing and cleaning systems (e.g. [20,33,34]), as well as interactive data exploration tools [7,22]. An important issue in such situations is measuring the amount of inconsistency to assess its nature and understand the degree of dirtiness of the data.
In this paper, we developed a general approach for measuring inconsistency in general information spaces, which encompass various ways in which information is stored in real-world systems. Before outlining directions for future work, we discuss some advantages and limitations of our approach.
Independence of the data representation and choice of the inconsistency measure. An important advantage of defining inconsistency measures for general information spaces is their wide range of use. Consider an inconsistency measure defined specifically for relational databases, as proposed for instance in [3, 14-16, 44, 51]. Such a measure allows for comparing two relational databases and determining if one is less inconsistent than the other or if they have the same inconsistency. But using the concept of a general information space and the uniformity of the definition of an inconsistency measure allows the comparison of the inconsistency of different types of information systems, such as a relational database and a graph database. For instance, we can compare the inconsistency measures obtained for the 4 examples using the 8 inconsistency measures we have considered. These inconsistency measures capture different aspects of the inconsistencies. Hence we find that for some measures one of the examples is more inconsistent than another, but for a different measure it is the reverse. So for comparing these general information spaces we need to decide which aspect we really want to measure. In particular, measuring the number of inconsistent subsets, possibly by weighting them, that is, using I M or I # , is a good way to get a general sense of the amount of inconsistency. Using I M we find that the answer is largest for the first example, the relational database, where I M (S) = 12. The values for the other 3 examples are closer together: 6 for the graph database, 4 for the spatio-temporal database, and 7 for the Blocks world example. However, since I M counts the number of minimal inconsistent subsets, it may not be appropriate for cases where, for instance, a single information unit causes many inconsistencies.
Consider a case where the address attribute functionally depends on the id attribute in a relation containing data about customers, and there is a single customer address in a tuple t that is misspelled, while it is correct in all the other n tuples for the same customer id. This generates n two-element minimal inconsistent sets, each of them having the wrong information unit t in common. Hence, measure I M would count n minimal inconsistent subsets in the transformed propositional knowledge base, even though they are all due to a single incorrect value in t. Instead, we could use the "repair" measure I H , which counts the minimal number of formulas in the translation whose deletion resolves all inconsistencies; I H counts such an information unit only once. This example shows that the choice of the inconsistency measure is critical for deciding how inconsistencies are counted. For the sake of the presentation we have focused on showing how our approach works with the inconsistency measures listed in Definition 2. But it is important to observe that the transformation creates a propositional knowledge base; hence all propositional inconsistency measures ever proposed are applicable. Therefore, issues that can be ascribed to the choice of the inconsistency measures can be readily addressed in our framework, as it can accommodate any propositional inconsistency measure having the desired properties, e.g., those of the repair-based measure I H in the example above.
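The misspelled-address example can be replayed numerically. With n = 5 clean tuples, the single bad unit t appears in every one of the n two-element minimal inconsistent sets, so I_M grows with n while a single deletion repairs everything:

```python
n = 5
bad_unit = "t"
# each clean tuple u_i clashes with the misspelled tuple t
mi_sets = [frozenset({bad_unit, f"u{i}"}) for i in range(n)]

assert len(mi_sets) == n                    # I_M counts all n conflicts
assert all(bad_unit in m for m in mi_sets)  # deleting t alone hits every MI set
```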

Existential requirements and trade-off between expressivity and tractability.
Our transformation-based approach maps violations of general information spaces' requirements to propositional formulas where the propositions are associated with the information units. An existential requirement asserts the existence of an information unit; the lack of such a unit causes an inconsistency that we transform into an inconsistency in propositional logic. Specifically, we obtain a self-contradiction for existential requirements, for which no information-unit deletion would negate the requirement's violation (recall that we assign arity equal to zero to such requirements). For an existential requirement, an alternative approach would be to make explicit all the possible options for missing information units. But this would mean dealing with a potentially huge set of information units from attributes' domains. However, we might be able to place a limit on some information units only. For instance, if the existence of an object (e.g., a person) is required in a given relation (e.g., the Family relation), as in the case of requirement c 8 in Section 5.1, then we could assume that only the objects explicitly appearing in a relation (e.g., Person) are allowed. Even so, this seems to be an arbitrary decision. Therefore, to close the loop, what remains to be done is to design a formal language to express such limitations on existence for requirements of general information spaces and then provide an expanded transformation dealing with them, a task outside the aim and scope of this work but one that we will pursue as part of our research project. It is worth noting that some advantages of our framework, such as the nice trade-off between expressivity and polynomial time computation, could be lost this way. As a matter of fact, most of the results of the paper would be substantially affected by dealing with existential requirements as outlined above.
Additional directions for future work. Besides the above-mentioned attractive research possibility, we plan to continue our work on measuring inconsistency in general information spaces in several other directions. We wish to define disjunctive general information spaces where the information units may be disjunctions. This has additional complications: for example, it may be that several requirements are needed for an inconsistency. In particular, we are interested in handling null values. In another direction, although we showed the applicability of our approach to spatio-temporal databases [28], how to encode probabilistic spatio-temporal knowledge bases [49,50], and in general probabilistic information, into a general information space needs further investigation, which may lead to the definition of a concept of probabilistic general information space. Finally, we also plan to study what aspects of inconsistency the various inconsistency measures actually measure to determine which ones are the most appropriate to use for general information spaces.