# Offline and online data: on upgrading functional information to knowledge

## Authors


DOI: 10.1007/s11098-012-9860-4

- Cite this article as:
- Primiero, G. Philos Stud (2013) 164: 371. doi:10.1007/s11098-012-9860-4


## Abstract

This paper addresses the problem of upgrading functional information to knowledge. Functional information is defined as syntactically well-formed, meaningful and collectively opaque data. Its use in the formal epistemology of information theories is crucial to resolving the debate on the veridical nature of information, and it represents the companion notion to standard strongly semantic information, defined as well-formed, meaningful and true data. The formal framework on which the definitions are based uses a contextual version of the verificationist principle of truth in order to connect functional to semantic information, avoiding Gettierization and decoupling from true informational contents. The upgrade operation from functional information uses the machinery of epistemic modalities in order to add data localization and accessibility as its main properties. In this way we show the conceptual value of this notion for issues in contemporary epistemological debates, such as explaining the process of knowledge acquisition from information retrieval systems and open data repositories.

### Keywords

Epistemic modalities · Functional information · Upgrade

## 1 Introduction

The recent debate on the epistemology of the notion of information, and the widely discussed disagreement concerning its veridical nature, was ignited by the following definition:^{1}

DEF Semantic information is *well-formed, meaningful* and *veridical* data.

The resulting different approaches towards DEF can be reduced—at least partially—to the following most fundamental question: what is the difference between information and knowledge? ^{2}

Floridi (2005) has replied to this question by maintaining that semantic and alethically neutral data provide necessary but not sufficient conditions for information: a pile of syntactical data which proves to be false or meaningless cannot count as information and, in the best case, reduces to *misinformation* (significant but untrue data). The task is completed by a formal logic that satisfies this notion of information. The logic for *being informed*, introduced in (Floridi 2006),^{3} provides the formal analysis of a cognitive *statal* state *I*_{a}*p* of an agent *a* holding the information that *p*, where *I* is formalized as a veridical but neither introspective nor reflective necessity operator. This makes information logically distinct from knowledge: its logic requires the axioms of normal modal logic B, thus making it distinct from the KT, S4, S5 epistemic logics. At the other extreme of the epistemic spectrum, this differentiates the logic of being informed from the KD, KD4, KD45 doxastic logics. In (Allo 2010), a number of issues with respect to this approach to modelling information have been raised and a revision proposed in view of a pure and an applied semantics for it.

Besides the purely logical solution of adding the required axioms, the distinction between information and knowledge is completed by resolving one more problem: knowledge encapsulates semantic information, but how does one go from the state of being informed to knowing, i.e. how does one upgrade? The answer to this question is given in (Floridi 2010a),^{4} where the semantic theory of information satisfying the veridicality thesis given by DEF is embedded in a *Network Theory of Account* (NTA), which explains the basic epistemic process transforming information into knowledge contents. Briefly, according to the process of *accounting* embedded in the NTA, an instance of information *p* can be upgraded to knowledge by connecting *p* to the conceptual network of interrelations to which it belongs. Such a network is constituted by the information that can give reasons in support of *p*, that is, the information that explains why *p* is the case. Floridi considers such information to be the answer to what he calls ‘how come’ questions (HC-questions): any standalone instance of information poses a set of HC-questions concerning the path of events that generated the scenario described by *p* (genealogical HC-questions), the mechanism or the way in which the scenario described in *p* was determined (functional HC-questions), and the purpose to which the scenario described in *p* is devoted (teleological HC-questions). The information conveyed by the answers to such questions provides the explanation for *p*.

For many of its defenders (e.g. Grice and Dretske), the veridicality thesis seems to endorse a classical (correspondentist) theory of truth. Nonetheless, DEF and NTA leave completely open the possibility of understanding truth in terms of redundancy or coherence and, in particular, Floridi defends a correctness theory of truth.^{5} At any rate, a basic assumption is that when considering these two conceptual pillars of a theory of information, we are talking about a notion of *factual information*, i.e. data working as constraining affordances, typed at a certain level of abstraction, that generate the interface at which information is accessed and processed. This notion of information endorses truthfulness. A large disagreement concerns this latter point, also called the Veridicality Thesis (VT).^{6} This controversy can be solved by agreeing that analyses allowing possibly false informational contents refer to a conceptually distinct notion of information. The role of *factual information* as the main notion to be focused on in the information sciences is flanked by the complementary notion of *instructional* or *functional information*. This distinction has been endorsed in (Floridi 2009).^{7} Functional information is not the descriptive content of a fact or state of affairs. Its role is instructional, in that it seeks, or otherwise instructs, to bring about a state of affairs, and it expresses the conditional state of the factual notion of information: it contains the instructions which, when executed, realize factual information. At this level, the non-veridical nature of information is required: it is the sort of data that, when and if executed, will transform a semantic content from false to true (turning a content of misinformation into a piece of semantic information). The execution of the instruction carried by functional information performs the task of actualizing data. This act is data-typology dependent: executing an instruction related to empirical data concerns an act of *instantiation*; executing an instruction related to logical data requires an act of *verification*.

In the following, we shall support the thesis that a further property is required to identify functional information correctly, namely *localization*: we will defend a view according to which functional information is expressed in terms of data asserting *locally* executable sets of instructions; locality is expressed by linking validity to a set of sources. The factual counterpart is identified as the truthful content of globally executed instructions.^{8} The present notion of functional information as *realizable* instructions requires non-actual data in the sense of (locally) executables, in turn making the factual counterpart be expressed by actual data, in the sense of (globally) executed instructions. From this reading it is evident that we are endorsing an epistemic reading of the notion of information, which characterizes factual information as meaning, whereas realizable instructions bear a strong connection to predicable contents. This distinction between the meaningfulness of data for functional information and the meaning of data for factual information translates into the correlation between knowable and known contents.^{9}

The notion of functional information is pivotal for epistemic (non-realistic) approaches to knowledge, where it is crucial to have at one's disposal a framework in which the notion of verification is conceptually prior to, and foundational for, that of truth.^{10} In the present paper, we address the analysis of upgrading information to knowledge from this latter epistemic perspective, starting from the ground notion of functional information characterized by locality. We shall rely on the epistemic constructive definition of information (ECDI) introduced in (Primiero 2007) to overcome the alethic/non-alethic issue about informational contents, relying on the formal framework which uses the theory of judgements of Martin-Löf’s Type Theory.^{11}

We define functional informational contents as syntactically correct and meaningful propositional contents. Unlike truth-makers, these are not holders of a positive alethic value, as they can in turn be negatively characterized (i.e. they admit refutation); but they cannot be taken independently of their alethic characterization either, as they would then not act as (valuable) factual information, but rather just be discarded as useless, though epistemically admissible, material. *Functional Infons* are *ascriptions of truth to propositional contents*, contents *assumed to be true* functionally to a certain epistemic state being formulated. Only the latter is truthfully characterized and qualifies as being composed of *Semantic Infons*. This briefly expresses the functional or operational view on information in the constructive context in which ECDI is formulated. This theory identifies information with assertion conditions in the context of a constructive theory of truth for which the Verificationist Principle of Truth holds:

VPT Truth is defined by the formulation of a verification.

*A*: “The use of a locker at the library requires a 5p coin”.^{12}

A verification is then what allows me to infer that

*B*: “It is true that using a locker at the library requires a 5p coin (because so has been checked out)”

*C*: “I am informed that the use of a locker at the library requires a 5p coin”.

\(C^\prime: \) “I receive the information that the use of a locker at the library requires a 5p coin.”^{13}

Assumptions can be used in this role:

\(A^{\prime}: \) “The use of a locker at the library requires a 5p coin”.

\(B^\prime: \) “Assume it is true that the use of a locker at the library requires a 5p coin”.

\(C^{\prime}: \) “I receive the information that the use of a locker at the library requires a 5p coin”.

*A*′ becomes what is known in the philosophical literature inspired by Constructivism as a judgement-candidate. \(C^\prime\) formulates the epistemic state which holds under the condition of such unverified but admissible content. Notice that we have modified the use of the informational content from a statal state (“I am informed”) to a dynamic processual state (“I receive the information”). In turn, the inference from *C*′ to *C* relies obviously on the reduction of admissibility to verification:

*D*: “I receive the information that the use of a locker at the library requires a 5p coin. If that is true (provided I verify that to be the case) I am informed that the use of a locker requires a 5p coin at the library and I can use it (e.g. because I have one such coin in my pocket)”.

The epistemic state resulting from this notion of information is not a different notion from the one satisfying DEF, but rather a partial, dynamic representation of it. It misses the property of being veridical, but it works in the setting of verified data. We shall refer to it as the epistemic state of Functional Information, *A*′ representing a functional infon.

It seems essential now to compare Functional Information with the upgrade operation to knowledge. The problem will be tackled from the perspectivist epistemology that is shown to underlie Martin-Löf’s Type Theory in (Schaar 2009). We will show how the additional property of data localization is necessary in the process of explaining upgrading for functional information.

We organize our arguments as follows: in section 2, a brief survey of the network theory of account is provided, focusing on the main critique of the tripartite definition of knowledge and how that affects our epistemic definition of information; in section 3 we recast the framework of ECDI to gain a common ground with the semantic definition of information and show the modal extension of our framework, which proves crucial for the upgrade issue; in section 4 we finally tackle the upgrade issue from the perspective of our extended ECDI, namely by focusing on the distributed online nature of the data that compose an upgraded information state, and draw the connection to new epistemic phenomena.

## 2 NTA and the De-Coupling Test

The Network Theory of Account (NTA) presented in (Floridi 2010a) is developed for a notion of semantic information considering non-reflective, opaque and aleatoric data. Non-reflective data are such that holding the information that *p* does not necessarily mean understanding *p*: an agent holding coded information *p* for which she is missing the decrypting key will not be able to understand *p*. Opaque and aleatoric data are considered in view of their effect on knowledge states: epistemic luck affects knowledge, but not information, and it cannot survive tests, namely questions concerning the information packets that are constituent of our (possibly lucky) knowledge state. The aleatoric state of information is a symptom of the fact that information packets lack connecting links to each other and so are mutually independent. Semantic information lacks the necessary structure of relations that permits each informational packet to account for another. Hence, upgrading consists in filling the erotetic deficit intrinsic to the data building an information state.

A rather crucial point is the de-coupling problem. The basis of the network structure is given by the relation between a source *s* and a sink *t* which binds the accounted content and the network itself: in order for a network to be able to account for a certain fact, it has to provide all the needed answers and its content has to be no smaller than what is allowed by the network’s capacity. This property, formally corresponding to the calculation of the maximum flow in a network, satisfies the apparently trivial observation that an explanation of *p* is correct if and only if it applies correctly to *p*. This can be reformulated as the *De-coupling Test*:

DcT *Explanans* and *Explanandum* cannot be de-coupled without making the explanation incorrect.

NTA survives the de-coupling test trivially, because the theory of accounting is monotonic and if an account *A* of *s* is correct, then *s* correctly accounts for *t* in *A* and they cannot be de-coupled. The corresponding sufficient condition says that the account theory survives a formulation of the Gettier-problem. As is well known, Gettierization arises for the tripartite definition of knowledge as ‘justified true belief’ precisely in view of the fact that it is in principle always possible to de-couple two of the three elements in that definition, namely the truth of *p* and the reasons that justify an agent *a* in holding *p* true.^{14}
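The maximum-flow reading of the source-sink relation mentioned above can be illustrated with a small computation. The following is a minimal sketch, not Floridi's formalization: the node names (`s`, `q1`, `q2`, `t`), the capacities and the function name `max_flow` are all hypothetical, and the algorithm used is the standard Edmonds-Karp method.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity map."""
    # Residual capacities; every edge gets a reverse edge starting at 0.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path is left
        # Find the bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical toy network: a source s, two intermediate nodes, a sink t.
toy_network = {
    "s": {"q1": 2, "q2": 1},
    "q1": {"q2": 1, "t": 1},
    "q2": {"t": 2},
}
print(max_flow(toy_network, "s", "t"))  # prints 3
```

In this toy network the flow reaching the sink equals the total capacity leaving the source, which is one way to picture a network that neither lacks nor wastes capacity with respect to the content it carries.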

Gettierization therefore represents the first obstacle to any theory of information located in the larger context of a theory of knowledge, and in particular for any account of the upgrade operation. In the case of an epistemic account of functional information, this is actually a reversible problem that we formulate as follows: showing that Gettier-problems are ineffective in the context of the Verificationist Principle of Truth (VPT) is sufficient proof that an epistemic account of information passes the de-coupling test and so satisfies necessary conditions for the sought operation of upgrade.

### 2.1 Gettierization vs Perspectivism

The analysis of Gettier-problems from an agent-based perspectivist epistemology has been recently formulated in (Schaar 2009). The well-known phenomenon of Gettierization involves the tripartite definition of knowledge as ‘justified true belief’, as one can easily imagine cases of true and justified belief that are not knowledge. This happens, in particular, when one uses the content of a knowledge state without the ability of accounting for its justification, as it is the case when one uses or recalls someone’s knowledge (or one’s own knowledge in a distinct and not directly reconstructible situation). The constructivist, first-person perspective epistemology has an easy and very valuable solution to such cases: Gettier cases are generated by shifting from the perspective of the subject to that of the attributer; the latter knows the truth of the content from a completely different perspective than that of the subject. From a first-person perspective theory of knowledge, one cannot adopt two perspectives at the same time, hence Gettier cases cannot be formulated.^{15}

[…] *J* is the predication of truth for a propositional content *A*, hence representing a higher-order format than classical propositional logic (it is actually as expressive as Intuitionistic FOL):

[…] *a* for the content *A* which makes it possible to state that *A* is true:

[…] *a* can be categorical or analytical, i.e. when it needs no further conditions than what is expressed in *a* itself. Such a justification would be the one appropriate for a judgement that requires only analytical skills in order to be understood. Otherwise, the justification can be dependent on a context of conditions needed for asserting *a*. These external conditions are what makes it possible to formulate the relevant justification. Formally, for any proposition *A* which is declared true, an appropriate verification *a* will be valid on the basis of a number of propositions \(A_{1}, \ldots, A_{n}\); for each such proposition *A*_{i}, an appropriate verification *x*_{i} is assumed. Each such assumption *x*_{i} will need to be verified (formally, β-reduced to a corresponding proof-term *a*_{i}) in order to prove *a* : *A* and so to assert \(A\; true\):

The epistemic conditions \(\Upgamma=\{[x_{1}/a_{1}]\!:\!A_{1}, \ldots, [x_{n}/a_{n}]\!:\!A_{n}\}\) whose verifications make a certain proposition *A* true formulate the perspective under which an agent knows *A*. To have a ‘justified true belief’ means to be able to formulate the context in which *A* holds true, which in turn requires the formulation of appropriate verifications of its assertion conditions. In the perspectivist epistemology, such a context will contain all the conditions that the agent holding the justified true belief needs to satisfy in order to perform such an epistemic act. To simply hold such a belief without justification (as happens e.g. when someone is reporting someone else’s belief) means to empty the context under which the true belief holds, in turn trivializing the corresponding knowledge state. In other words, under the constructivist/perspectivist epistemology, it is impossible to claim to have justified true belief outside of the scope of the formulation of an appropriate context.
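The two kinds of justification described above, categorical and context-dependent, can be sketched as inference schemas. The layout below is an assumption made for illustration, using only symbols that appear in the surrounding text (\(a : A\), \(A\; true\), and the substitutions \([x_{i}/a_{i}]\) collected in \(\Upgamma\)); it is not offered as the author's own display.

\[
\frac{a : A}{A\; true}
\qquad\qquad
\frac{a : A \;\; (x_{1} : A_{1}, \ldots, x_{n} : A_{n}) \qquad [x_{1}/a_{1}]\!:\!A_{1} \;\; \ldots \;\; [x_{n}/a_{n}]\!:\!A_{n}}{A\; true}
\]

On the left, the verification *a* needs no further conditions. On the right, *a* is given under assumed verifications \(x_{i}\), and the truth of *A* can be asserted only once each \(x_{i}\) has been replaced by an actual proof-term \(a_{i}\).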

On this reading, VPT is stated more precisely as its contextual counterpart:

cVPT Truth is defined by the exhibition of a verification under the explicit satisfaction of its context of conditions.

Shifting from one subject to another (or to the same subject in a different situation) requires switching from one context of conditions to a new one, hence formulating a new perspective under which the validity of *A* is questioned. In particular, for any two equivalent propositions *A* ≡ *B*, equivalent constructions or justifications are required and general formal rules of identity hold both at the level of constructions and propositions, which corresponds formally to defining α-rules on terms. This notion of identity satisfies standard definitional and equality properties.

Contextually justified judgements and their identity conditions are crucial to show that functional information passes the de-coupling test. We recast a contextually derived judgement *J* as a semantic infon; the functional infons on which *J* is based are given by the related set of conditions for *J* formulated in context \(\Upgamma. \) This leads to a variant definition of cVPT in terms of information:

**Definition 1** (*Functional Principle of Semantic Information*)

Semantic Information is defined by the explicit satisfaction of its Functional Infons.

\(\Upgamma\) = “I receive the information that the use of a locker at the library requires a 5p coin”.

\([x_{i}/a_{i}]\!:\!A_{i}\) for every \(A_{i}\in \Upgamma\) = “verify that the use of a locker at the library requires a 5p coin”.

*J* = “(as \(\Upgamma\) is verified) I am informed that the use of a locker at the library requires a 5p coin (and I can use it, e.g. because I have one such coin in my pocket)”

The set of functional infons in \(\Upgamma\) formulates the conditions under which *J* holds, \(\Upgamma \vdash J\). Identical claims need to have identical (or reducible) conditions, \(\Upgamma\vdash J, \Upgamma'\vdash J', J \equiv J' \Rightarrow \Upgamma\equiv \Upgamma'\); non-reducible contexts provide different conditions, thus leading to non-reducible semantic infons. The obtaining of the claim in *J* is conditional on the satisfaction of its conditions, \([x_{i}/a_{i}]\!:A_{i}, \forall A_{i}\in \Upgamma\), which generates a user-based (but not relativistic) perspective. Notice that *J* is closed under logical consequence, whereas \(\Upgamma\) is not.
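The dependence of *J* on the satisfaction of its context can be pictured with a toy model. This is only an illustrative sketch under simplifying assumptions: the names `Status`, `FunctionalInfon` and `informational_state` are hypothetical, verification is reduced to a status flag, and nothing of the type-theoretic machinery (proof terms, reduction) is modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ADMISSIBLE = "admissible"  # assumed true, not yet checked
    VERIFIED = "verified"      # a verification has been exhibited
    REFUTED = "refuted"        # a refutation has been exhibited


@dataclass
class FunctionalInfon:
    """A condition in the context: a content assumed to be true."""
    content: str
    status: Status = Status.ADMISSIBLE


def informational_state(context, claim):
    """Classify a claim relative to its context of functional infons."""
    if any(infon.status is Status.REFUTED for infon in context):
        return ("misinformation", claim)
    if all(infon.status is Status.VERIFIED for infon in context):
        return ("semantic information", claim)
    return ("functional information", claim)


# The locker example: the content is first received, then verified.
locker = "the use of a locker at the library requires a 5p coin"
gamma = [FunctionalInfon(locker)]
print(informational_state(gamma, locker)[0])  # functional information
gamma[0].status = Status.VERIFIED
print(informational_state(gamma, locker)[0])  # semantic information
```

The claim never changes in this sketch; only the status of its conditions does, which is the sense in which the semantic infon cannot be separated from its functional infons.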

The non-relativistic nature of the formal approach endorsed here is, of course, crucial. The verificationist approach, equipped with definitional equality (reflexive, symmetric, transitive), ensures that this (perspectivist) notion of truth is actually based on a normalized notion of justification. This means that two equivalent propositions *A* and *A*′ will have correspondingly equivalent justifications *a* and *a*′, whose formal identity is provable, i.e. either one will be the normal form of the other, or there exists a third justification *a*′′ to which both *a* and *a*′ reduce. It is therefore not admissible for a subject to assert the truth of *A* without relying on a justification *a* which is in turn reducible to the corresponding normal form. The contextual nature of our justifications nonetheless allows such a term to assume different forms and in particular to have an appropriate set of assertion conditions, whose only requirement is the reducibility to corresponding terms.
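The normalization requirement just described can be displayed schematically. In the sketch below, \(\twoheadrightarrow\) stands for reduction in zero or more steps; this arrow is notation assumed here for illustration and is not taken from the text.

\[
A \equiv A' \;\Longrightarrow\; a \twoheadrightarrow a' \;\;\text{or}\;\; a' \twoheadrightarrow a \;\;\text{or}\;\; \exists\, a''\, (a \twoheadrightarrow a'' \;\text{and}\; a' \twoheadrightarrow a''),
\qquad \text{where } a : A \text{ and } a' : A'.
\]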

For the definition of functional information Gettierization is avoided in the same terms as it is for the perspectivist epistemology, namely because functional infons constitute the set of conditions for certain true contents to hold. As explained above, formulating a believed content means to formulate the corresponding justification; in order to formulate and correctly assert the latter, appropriate assertion conditions will be required. Hence, semantic infons are coupled to the relevant set of functional infons. So the De-coupling Test is passed in the following form:

Info-DcT *Functional Infons* and *Semantic Infons* cannot be de-coupled without making the informational state incorrect.

A factual semantic notion of information has a corresponding functional counterpart that endorses a verification procedure. It remains to be seen how our formal framework deals with the distinction between true sentences (semantic information) and their conditions (functional information) without collapsing admissible contents and verified data, in order to preserve the distinction between truthful data and meaningful but untrue data, i.e. misinformation.

### 2.2 Data for Functional Information

The admissibility of refutable truths is a sensitive topic for verificationist theories of truth. For the standard intuitionistic meaning explanation of negation, indirect proofs such as *reductio ad absurdum* are standardly not admitted, whereas the usual intuitionistic absurdity rule interprets the classical *ex falso quodlibet*.^{16} The foundational idea that truth can be explained also as admissible up to a counter-example was already at the basis of the notion of ‘pseudo-truth’ introduced in (Kolmogorov 1925) for double-negated classical formulas reducible to intuitionistic ones. The problem arises with the standard way of defining assumptions in the constructivist vein. Such a definition does not coincide with the notion of assumption as refutable truth needed to define functional information as ‘possibly false content’ (i.e. the core of the notion of information that does not satisfy VT): it is instead considered as a process of forgetting the relevant content contained in a (in principle already obtained) construction.^{17} In the present setting, admissible truths should be literally satisfied by the logical concept of assumption, a computational term which is still possibly refuted.

To obtain such an interpretation we need to introduce the analysis of meaning within the formal framework at hand. Thereby we shall characterize functional information in terms of non-reflective, opaque and aleatoric data. As a result of admitting the desired notion of assumption as refutable data, the formal framework is transformed into a related constructive modal type system requiring polymorphism in order to introduce variable and proof constructors as distinct terms, in particular the former becoming an admissible term in view of a missing refutation.^{18}

^{19} To understand the typing procedure, we need to explain the distinction between typing and meaning. Typing an object involves meaning declaration at two distinct levels. As an example, consider the declaration of the value of a certain variable to range over the set of natural numbers; constructively, this requires the definition of the type itself (the natural numbers) by axiomatic construction of its elements:
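In standard presentations of Martin-Löf’s Type Theory, an axiomatic construction of this kind is given by a formation rule and two introduction rules. The following is a sketch of those standard rules; the names \(0\) and \(s\) for zero and successor are the usual ones and are assumed here.

\[
\frac{}{\mathbb{N}\; type}
\qquad\qquad
\frac{}{0 : \mathbb{N}}
\qquad\qquad
\frac{a : \mathbb{N}}{s(a) : \mathbb{N}}
\]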

Hence, at the first level of typing, one is considering the evaluation of the concrete level of meaning, obtained by performing the operation and getting the value by substitution of bound variables, as in the expression \(b[a/x]:{\mathbb{N}}[a/x]\). The level of meaning expressed by the object *b* coincides with the predication that it belongs to the type \({\mathbb{N}}\). A different level of typing is at stake when considering the role of bound variables for expressions of the form \({b:{\mathbb{N}}(x:\mathbb{N})}\), where *x* is abstracted from \({a:\mathbb{N}}\), which expresses meaningfulness for the dependent judgement. At this level, one *uses* the meaningfulness of \({\mathbb{N}}\) abstracted from the appropriate constructor (that is, without the related value), in order to proceed in the construction of a new object.

^{20} Nonetheless, the notion of assumption involved in the dependent judgement in the second premise of the rule of λ-abstraction does not really express the idea of refutable content we are after. It is bound with respect to the introduced type by the first premise, where a construction for that object is obtained beforehand, and it is not meant to interpret any refutability of contents. We need instead to interpret such a dependent judgement as a derivation from an open assumption, i.e. our aim is to provide an interpretation of meaningfulness independent of evaluation. This can be formulated in a type system that allows for the following set of typing rules:

[…] *true*^{*} predicate. This set of rules is completed by appropriate constructors on connectives within *type* and *type*_{inf}. Technically, the crucial step to make this system going is represented by separate implications: a material one for *type* is obtained by using application of obtained constructions, whereas the one for *type*_{inf} instantiates functional abstraction:

[…] *type* fragment includes quantifiers acting only over finite sets of constructors, […] *type*_{inf} fragment.

For both fragments structural rules are definable.^{21}

This notion of assumption can be used to express data for functional information: it admits meaningfulness but not truth; verification is the operation that turns it into satisfied conditions, in turn actualizing justification for the propositional content at hand; the latter will then be factual, semantic information by adding truthfulness; refutation is the corresponding construction operation for revealing misinformation. Data in this language are syntactically construed, correctness being revealed at this level by appropriate typing rules. Expressions of kind *type*_{inf} also preserve the semantic aspect, but they lack explicit typing at the level of value formation, which can be given in the formal language by the β-conversion rule from above. This makes their content non-reflective.

When we address the notion of functional information, in the form “Let *x* be of type *A* and *y* be of type *B*, then (*x*)*y* is of type \(A\,\rightarrow B\)”, we are giving functional meaningfulness via type declarations; no data value, i.e. output evaluation, is performed at this stage, hence information involving types *A*, *B* does not contain enough data to understand the value associated with \(A\, \rightarrow B\). The *typing value* of the operation \(A\, \rightarrow B\) implements the structural information, i.e. it needs not only to be reflective in terms of evaluation, but also to be transparent in terms of structure. This means that an agent *using* the type \(A\, \rightarrow B\), or just *guessing* its value, does not yet say anything about the result of the corresponding operation (in this case a functional abstraction). To get it right without an appropriate evaluation process is actually only blind (lucky) knowledge. A similar argument can be put forward in the case of admitting truths ‘up to refutation’, that is ascribing truth to certain contents. The set of *typing variables* gives the meaningfulness of the expression (message) in terms of functional information, but only *data evaluation* and *data construction* bring the right typing structure as its meaning. This shows that functional infons – expressed by refutable meaningful data – preserve non-reflective data in terms of the distinction between meaningfulness and meaning.
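The type declaration quoted above, and the evaluation step it leaves out, can be set side by side. This is a sketch in the notation of the surrounding text; the object \(a : A\) to which the abstraction is applied is an assumption introduced only for illustration.

\[
\frac{x : A \qquad y : B}{(x)y : A \rightarrow B}
\qquad\qquad
((x)y)(a) = y[a/x] : B \quad (a : A)
\]

The rule on the left fixes only the structure of the operation, which corresponds to meaningfulness; the equation on the right yields a value by substitution, which is what evaluation adds.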

A notion of semantic information that relies on the equation ‘structured meaningfulness + verification’ seems apt to guarantee a minimal degree of transparency to data. Its functional counterpart (i.e. simply ‘structured meaningfulness’), taken in view of a single epistemic state, is sufficiently transparent to the agent to allow her to admit its content in view of her present knowledge. Not so when functional infons express the set of conditions that make true contents possible. Under this reading, functional infons are *user generated* and *user directed* data, which makes them opaque in view of the dynamics of changing contexts, or *opaqueness to external states*. When verification is performed on functional infons, the agent acquires an informational state which can be transmitted over to other states (be they of other agents or of the same agent). When such communication happens (in the limit case among different epistemic states of the same agent) without verification, data is transferred in terms of the admissible epistemic contents in contexts: these are accepted as opaque data. In this sense, information is *collectively opaque and non-reflective*.

**Definition 2** (*Functional Information*)

Functional Information is syntactically well-formed, meaningful, non-reflective, collectively opaque data.

It is our aim to explain in the next section how such global opaqueness for semantic information turns into global transparency for knowledge contents. We maintain this to be the reason why verification does not account per se for the epistemic operation of upgrade, which instead requires a transparent analysis of the message structure for any possible communication chain. By addressing this latter point, we shall show what the upgrade from functional information to knowledge actually is.

## 3 Going to a higher level of representation: data access

Contexts of our epistemic language express the network in which functional information can be justified. As mentioned earlier, the dynamics of context-switching can be defined over distinct epistemic states of a single agent, or rather over epistemic states of distinct agents. In the limit case, a single context is the composition of multiple contexts each intended as a singleton indexed for a different agent. The notion of verification for epistemic data valid in contexts is based on the definition of conditions for knowledge and their reducibility to equal contexts.

The definition of appropriate equivalence functions on possibly false data has to preserve their origin and validity location: truthfulness becomes dependent on the possible extension of the original context of assertion conditions to other distinct contexts. This requirement can be formulated by determining *limits* on the network: whereas for the upgrade of the notion of semantic information a necessary requirement is that information on the channel is not less than its capability, as to ensure that no necessary data is missing;^{22} when starting from the lower functional level, one also has to know that the network capability is no *greater* than the amount of valid information needed. In other words, one needs to establish how to access data and where such data is still valid by knowing which accessible context falsifies it, turning it into misinformation. Let us call this the *Data Accessibility Problem*. To reformulate the identity issue raised in the previous section within the model of functional information we are proposing, means therefore to solve the Data Accessibility Problem. The language needs now additional expressiveness in order to refer to meta-properties for the data in the network, in particular for their validity-preserving accessibility relations. An epistemic definition of modalities is the obvious enrichment of the language to obtain this further level of expressiveness.

### 3.1 Epistemic Modalities

The notion of epistemic modality has received increasing attention in the literature during the last decade. The original debate on the definition of epistemic modality as an operator defining truth *conditionally* on an epistemic state focused on the explanation of sentences such as “It is possible that *P*” or “It is necessary that *P*” (with *P* a proposition) on the basis of a clarification of the related truth conditions, depending on the epistemic state of the speaker or of others.^{23} A different account of the notion of epistemic modality is given in those approaches where one gets rid of truth conditions entirely and defines appropriate counterparts of the standard notions of necessity and possibility by referring to the validity of alternative (accessible) assertion conditions.^{24} The analysis put forward in (Primiero 2009b) offers precisely the kind of machinery needed.

Where “*A* is true” means that a proof for *A* is known, necessity for \(\square (A\, true)\) says:^{25}

- “*A* is true” is necessary \(\Rightarrow\) “*A* is true” is known
  $$ \square(A\, true) \Rightarrow K(A\, true). $$

*K* can be seen as a knowledge-operator, in the style of epistemic logics, or as an explicit operator for a (group of) knowing agent(s). When *A* presupposes further propositions to be known, these represent the context in which *A* is known to be true, \(\Upgamma=(A_{1}\, true, \ldots, A_{n}\, true). \) The reading of \(\square (A\, true)\) is formulated as knowledge for which no further contextual conditions are needed (\(\Upgamma = \emptyset\)):

- “*A* is true” is necessary \(\Leftrightarrow\) Agent *K* knows that *A*, for any knowledge state agent *K* is in
  $$ \square(A\, true) \Leftrightarrow K((\emptyset)A\, true). $$

Eventually, this amounts to saying that (*A* true) is valid under any possible \(\Upgamma, \) as by definition assumptions in \(\Upgamma\) cannot formulate conditions contradicting an already expressed construction. Model-theoretical necessity as truth in all possible worlds corresponds directly to the proof-theoretical verification under no specific assertion-conditions, hence under all possible ones.

- “*A* is true” is possible \(\Leftrightarrow\) Agent *K* knows that *A*, for some knowledge state \(\Upgamma\) agent *K* is in
  $$ \diamondsuit(A\, true) \Rightarrow K((\Upgamma)A\, true). $$

Only with \(\Upgamma\) empty does this formula reduce to the conditions for \(\square(A\, true). \) Otherwise, it means that truth is preserved under *some* knowledge states in which the agent is able to satisfy appropriate conditions, hence only some extension \(\Updelta\) of \(\Upgamma\) will preserve the validity of *A* true.

Here *J* is the judgement stating “*A* is true” and \(\Upgamma\) a set of assumptions \(x_{n}: A_{n}\):^{26}

By this rule, one accounts for judgements whose conditions have been verified, therefore allowing provability of the conclusion.

Here *J* stands for a formula of the form (*A* true):

This reading extends the previous interpretation of necessity as proof-conditions to the assertion conditions of hypothetical judgements, preserving the formulation of knowledge contents epistemically weaker than strictly proved ones.

*A* is true in a context \(\Upgamma\mid \Updelta\) (\(\Upgamma\) extended by \(\Updelta\)) if the judgement *J* = *A* *true* is justified by \(a_{i}: A\) in context \(\Upgamma, \) i.e. its construction contains all satisfied conditions in \(\Upgamma\) and it remains valid under any extension \(\Upgamma \mid \Updelta; \) then *A* is said to be *globally valid* under \(\Upgamma\mid \Updelta; \) otherwise, the judgement *J* is justified by \(x_{i}: A\) in context \(\Upgamma, \) i.e. its construction depends on open variables in \(\Upgamma\) or in some extension \(\Upgamma \mid \Updelta; \) then *A* is said to be *locally valid* under \(\Upgamma\mid \Updelta. \)

The context extension operation allows one to mimic syntactically the notion of accessibility, so that judgemental modal operators express that a proof holds somewhere or everywhere, with respect to contexts.^{27} A signed (modal) context \(\circ \Upgamma_{i}\) can now be extended in view of a differently signed (modal) context \(\circ \Updelta_{j}\) (where \(\circ \in \{\square, \diamondsuit\}\)). A re-definition of modal judgements is now possible in view of the derivability from multi-modal contexts:

\(\square_{i}(A\, true)\) iff for all \(\Upgamma_{j} \in Context, \emptyset \mid \square_{j}\Upgamma \vdash \square_{i}(A\, true), \) where \(j=\bigcup\{1, \dots, i-1\}\in \mathcal{G}; \)

\(\diamondsuit_{k}(A\, true)\) iff for some \(\Upgamma_{i}, \Updelta_{j} \in Context, \square_{i}\Upgamma \mid \diamondsuit_{j}\Updelta\vdash \diamondsuit_{k}(A\, true), \) where \(j=\bigcup\{1, \dots, k-1\}\in \mathcal{G};\)

On the one hand, following the \(\diamondsuit\)-Rule, the type-theoretical expression \({\diamondsuit_{\mathcal{G}}\Upsigma\vdash \diamondsuit J}\) is obtained by \(\square_{i}\Upgamma\mid \diamondsuit_{j} \Updelta\vdash \diamondsuit J\) and expresses the local validity of *J* from source *i* and *j* in view of the information that source *j* makes available when accessed from source *i* and which can be lost when accessing source *k*: the epistemic state expressed by *J* = *A* *true* is obtained by the knowledge distributed over the network \(\mathcal{G}\) at point *i*, using additional data accessible at point *j* and not everywhere else. Accessibility of the data at *j* is necessary to the formulation of *J* and some \(k>j>i \in \mathcal{G}\) can refute *J*.

On the other hand, following the \(\square\)-Rule, \({\square_{\mathcal{G}}\Upsigma\vdash \square J}\) is obtained by \(\square_{i}\Upgamma\mid \square_{j} \Updelta\vdash \square J\) and it expresses the global validity of *J* from source *i* and *j*, in view of the information that source *j* makes available when accessed from source *i* and which persists when accessing any other source *k*: the epistemic state expressed by *J* = *A* *true* is obtained by the knowledge common to the network \(\mathcal{G}\) at point *i*, using data accessible to any agent at point *j*; accessibility of the data at *j* is crucial to the formulation of *J* because it is necessary and irrefutable at any other \(k\in \mathcal{G}. \)
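The difference between global and local validity under context extension can be made concrete with a small Python sketch. It is a deliberately simplified toy model and not the type-theoretical system itself: conditions are plain strings, and the names `Context`, `extend` and `validity` are our own assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """A context of conditions, named by plain strings (an assumption of this sketch)."""
    verified: frozenset = frozenset()  # conditions with a closed construction a_i : A_i
    assumed: frozenset = frozenset()   # conditions held only by a variable x_i : A_i
    refuted: frozenset = frozenset()   # conditions that this context falsifies


def extend(gamma: Context, delta: Context) -> Context:
    """Model the extension Gamma | Delta.

    Verified conditions persist; an assumed condition is dropped
    as soon as the extension refutes it.
    """
    verified = gamma.verified | delta.verified
    refuted = (gamma.refuted | delta.refuted) - verified
    assumed = (gamma.assumed | delta.assumed) - verified - refuted
    return Context(verified, assumed, refuted)


def validity(required: frozenset, ctx: Context) -> str:
    """Classify a judgement J = (A true) by the conditions its construction requires."""
    if required & ctx.refuted:
        return "refuted"
    if required <= ctx.verified:
        return "global"   # box: survives every extension in this model
    if required <= ctx.verified | ctx.assumed:
        return "local"    # diamond: holds only while open assumptions survive
    return "unjustified"


gamma = Context(verified=frozenset({"A1"}), assumed=frozenset({"A2"}))
delta = Context(refuted=frozenset({"A2"}))
extended = extend(gamma, delta)

closed_judgement = frozenset({"A1"})        # depends on verified conditions only
open_judgement = frozenset({"A1", "A2"})    # depends on an open assumption
```

Here `closed_judgement` is classified as global both in `gamma` and in `extended`, whereas `open_judgement` is local in `gamma` and refuted once the extension by `delta` is taken into account, mirroring the somewhere vs. everywhere reading of the modal operators.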

## 4 Epistemic upgrade: putting online distributed data

1. From locally indexed data one extracts functional information, i.e. data that is strictly linked to a source and that can be falsified by adding new sources to the relevant network;
2. Evaluation of local data turns content based on it into semantic information, i.e. with the additional property of considering distributed data as true with respect to the relevant network;
3. Assessing that evaluation of local data can be preserved under any extension of the relevant network corresponds to making data true for any possible peer outside of the given network, and thus to upgrading semantic information into knowledge.

The crucial aspect of the first of these steps is to consider data not as genetically neutral, rather specified from a given origin, always embedded (or embeddable) in a network. This data is refutable and hence serves only a functional role; it is localized and must be admissible within the network: this makes it structured meaningful but opaque data.

The operation of admitting its truth within the boundaries of a given network makes such content appropriate to justify data and thus to constitute a state of semantical information, which remains collectively opaque, in principle still refutable when a possible extension of the given network is taken into account.

The machinery of modalities adds the possibility of expressing the distinction between validity under network boundaries or under *any* extension of the originating epistemic state: this means to formulate verification at further localities outside of the network. This latter step defines properly the upgrade operation from semantic information to knowledge, as any context extension can simulate the process of accounting for one or more embeddings (equivalent to HC-questions). The everywhere vs. somewhere dichotomy that is crucial to express data validity receives in this way a finer and precise representation.

**Definition 3**(*Offline data*)

Data is in offline status when it is not validly accessed by any admissible network source.

*j*, the judgement (*B* true) is prefixed by \(\diamondsuit_{i,j}, \) meaning these sources are always to be called upon for the validity of *B* (i.e. it holds at their intersection). The multiplicity condition means that equivalent operations need to be performed within \(\Upgamma_{i}, \Updelta_{j}\) where necessary.

On the other hand, we need to account for verification of structured meanings preserved across distinct epistemic states:

**Definition 4**(*Online data*)

Data is in online status when it is validly accessible by any admissible network source.

*A* *true*) at any point in \(\mathcal{G}, \) which establishes modal derivability under verification. The appropriate formal rule to implement this is the following:

The definition of the upgrade operation is embedded into that of online data.

**Definition 5**(*Upgrade*)

A content is upgraded from information to knowledge when it consists of verified online data.

Messaging among points in the network corresponds to validity checking under different contexts, each such context expressing at least one of the necessary conditions for the content at hand. Such a notion of knowledge is by definition irrefutable true content, because it is validly acceptable by any extension of the originating network, and so it excludes any form of Gettierization. Its content as online data expresses accessibility to/from any point of any accessible network and formally corresponds to validity under any extension of the generating epistemic state. Offline data can be equated to data valid according to some privileged point in the network, generated and possibly accepted at more points of the original network, but not yet submitted to the validity of any possible extension of the network.^{28}
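The three statuses involved in the upgrade can be sketched operationally. The following Python fragment is a minimal illustration under explicit assumptions: the function `classify` and its labels are hypothetical, and the open-ended notion of “any possible extension” is approximated by a finite set of candidate sources, which is a simplification of the definitions above.

```python
def classify(verdicts: dict, network: set, extensions: set) -> str:
    """Classify a content from the verdicts of the sources that accessed it.

    verdicts maps a source name to True (validates) or False (refutes);
    a source absent from the mapping has not accessed the content yet.
    network is the originating network, extensions a finite stand-in
    for its possible extensions (an assumption of this sketch).
    """
    if not network:
        raise ValueError("the originating network must contain at least one source")
    everyone = network | extensions
    if any(verdicts.get(source) is False for source in everyone):
        return "misinformation"
    if all(verdicts.get(source) is True for source in everyone):
        return "knowledge (verified online data)"
    if all(verdicts.get(source) is True for source in network):
        return "semantic information (offline)"
    return "functional information (offline)"


network = {"i", "j"}
extensions = {"k"}

at_source_only = classify({"i": True}, network, extensions)
within_network = classify({"i": True, "j": True}, network, extensions)
everywhere = classify({"i": True, "j": True, "k": True}, network, extensions)
refuted_at_k = classify({"i": True, "j": True, "k": False}, network, extensions)
```

A content validated only at its source stays functional, one validated across the originating network counts as semantic information, and only validation at every considered extension yields the upgraded status; a single refuting source turns it into misinformation.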

Our final thesis is that epistemic data in a distributed network is crucially (functional/semantic) information: when a content (either true or still falsifiable) is not accessible by at least one point of a given network, it cannot be qualified as knowledge. Under this reading, distributed knowledge can be accounted as ‘knowledge’ only in view of the external perspective that collects all the different points of the network involved: from the internal viewpoint of the network, this content cannot be accounted as knowledge by each point. Necessitated data is the result of the proper upgrading from information to knowledge (i.e. the switch from distributed to common data): it is content that becomes accessible by any point of the network, with additional properties available such as overload (more than one point in the network is able to provide a given content, literally each point is able to provide any content) and iteration (every point in the network is able to identify itself as bearer of a given content). Notice that in this setting the notion of information is a more fundamental notion, whereas the notion of knowledge is induced.

### 4.1 Misinformation does not survive online

An additional argument to show that knowledge is correctly defined as semantic information turned online by the actual access of any peer in each possible extension of the original network is given by showing that what goes online cannot be misinformation. This needs to be further specified to avoid misunderstandings. It is *not* the case that every item of information flowing in a network is true. Rather, if something is defined as online data, it is put to the test of being accessible (checked) from any peer in any possible extension of the originating network. Accordingly, to show data to be misinformation means to have a network extension \(\Upgamma_{i}\mid \Updelta_{j}\) by which functional information \(x_{i} : A\) is extended by semantic data \(A\, \rightarrow \bot. \) This means that the validity of the semantic content of *A* is established to hold only up to the extension of the network to point *j*. In turn, whereas the inference from \(\diamondsuit_{i}\Upsigma\) to \(\diamondsuit_{i}(A\, true)\) is possible, by extension to \(\Updelta_{j}\) there is no reduction possible to \(\square_{j}(A\, true).\) In other words, if the network of knowledge is considered to include *j*, refutation of *A* can be determined and therefore the functional data *A* can be rejected from the network. Up to granting access to *j*, *A* can still be taken functionally in the context of network \(\Upgamma_{i}, \) eventually accepted as true.

An easy way to see this principle applied is to consider the level of reliability of mass information retrieval systems such as the Internet, in particular in the form of open encyclopedias. A notorious case is the reliability of Wikipedia, which crucially relies on “how quickly false or misleading information is removed”.^{29} The principle that misinformation does not survive online instantiates precisely the virtuous circle that collaborative structures such as Wikipedia implement for open contents. Extension via contextual accessibility of the network simulates the practice of revision of an entry by peers: our indexes over constructors correspond to locations for contributors. A given content (e.g. a Wikipedia entry) is admissible as semantic information (true) only if all false contents in it have been removed, and is addressed as functional information (falsifiable content) if there exists (in principle) at least one new peer who can edit the entry by removing or altering any false claim it contains: when this latter condition can no longer be satisfied, that is, when whoever accesses the entry would not be able to provide a falsification for (part of) its content, the entry becomes globally valid, hence qualifying as knowledge. As long as a peer that can falsify a claim in the content of a Wikipedia entry may exist, its content is locally valid; a content which is (in principle) verified by any peer accessing the entry would then be globally valid.
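The revision practice just described can be pictured as a simple procedure over a pool of peers. The Python sketch below is only an illustration: the functions `revise` and `globally_valid`, the claim labels and the peer names are hypothetical, and a finite pool of peers stands in for every peer who could in principle access the entry.

```python
def revise(entry: set, peers: dict) -> tuple:
    """Let each peer remove the claims it can falsify.

    peers maps a peer name to the set of claims that peer is able to falsify.
    Returns the surviving claims and the names of the peers that edited the entry.
    """
    surviving = set(entry)
    editors = []
    for name, falsifiable in peers.items():
        removed = surviving & falsifiable
        if removed:
            surviving -= removed
            editors.append(name)
    return surviving, editors


def globally_valid(entry: set, peers: dict) -> bool:
    """True when no peer in the given pool can falsify any claim of the entry."""
    return all(not (entry & falsifiable) for falsifiable in peers.values())


entry = {"c1", "c2", "c3"}
peers = {"p1": {"c2"}, "p2": set(), "p3": {"c3", "c9"}}

valid_before = globally_valid(entry, peers)
surviving, editors = revise(entry, peers)
valid_after = globally_valid(surviving, peers)
```

Before revision the entry is only locally valid, since some peer can still falsify part of it; after every peer in the pool has had access, the surviving claims are globally valid relative to that pool.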

In reality, this process is far less precise, as accessibility can be restricted, it should be considered in view of expertise and trust principles,^{30} and controversies can arise on issues that are hard to formalize. Nonetheless, this model aims at a general representation, as it can be easily applied, for example, to the revision of paradigms in science, where the formulation of a new theoretical model that falsifies older data can be accounted as a network extension on a par with other update processes. It also guarantees that the principle of open revision is maintained up to the highest standard, where the aim is to preserve any possible contribution. Under this reading, assessing the truth of data can no longer be based on the reliability that is attributed to the originating source or network; rather, it is required that truthfulness be assessed by completing a reviewing process consisting of opening the content to all the possible peers considered relevant extensions of the originating network. Only a content surviving such open reviewing of relevant network extensions becomes a knowledge content.

## 5 Conclusions

In this paper, our first concern was to highlight the fundamental role of the notion of functional information in the formal epistemology of information theories. Functional information fulfills the weaker role in the epistemic family of concepts including semantic information and knowledge: with respect to the companion semantic notion it lacks the alethic characterization, with respect to knowledge it lacks verifiability at any point of the network it flows in.

Its importance is, in the first place, given by the need for resolving the debate concerning the Veridicality Thesis: functional information is typically the notion of information that grounds semantic information and that still admits the possibility of turning into misinformation. In the second place, functional information shows that epistemic theories of truth are needed in the context of formal theories of information, as they encapsulate the most needed notions of verification and falsification as a medium to distinguish between information and misinformation. We have shown how this happens without our theory of functional information being affected by basic problems such as Decoupling (from semantic information) and Gettierization (of the related notion of knowledge). Data for functional information share with semantic information syntactical correctness and meaningfulness, but obviously not truthfulness. Moreover, their opaqueness is contextual, as they are the medium of communication among distinct epistemic states. This has led us to an analysis of data location and accessibility, for which epistemic modalities have been used. As a result, the distinction between information and knowledge and the bridging operation of upgrade have been identified via the multiple accessibility of data among admissible networks, providing intuitive epistemic definitions for online and offline data, relevant in view of contemporary forms of knowledge acquisition processes.

It will appear clearly from the next few lines in this introduction that this problem has a complementary question: what is the difference between information and belief? Both questions account for the role of information theories and philosophy of information in the traditional debates in formal epistemology, but only the former will be tackled here.

Notice that we entirely endorse here the distinction from the Semantic Theory of Information between Data and Information, where the former are intended as vehicles of the latter, deprived of meaning.

As will appear evident below with the formulation of the *Verificationist Principle of Truth*, we are siding here with an anti-realist reading of the truth-functionality of information.

For a theory that fully integrates a constructivist approach to knowledge with the already mentioned NTA, see (Floridi 2011b).

Notice that the process of *verification and validation* required in the account of truth as correctness by Floridi plays a very similar role. See (pp.193–195, Floridi 2011a).

ECDI is located by definition on the side of *epistemic* theories of meaning, endorsing an understanding of meaning and truth in terms of verification. This constructionist approach is to be traced back to the work of Dummett, see in particular (Dummett 1978) and (Dummett 1991). In the Dummettian tradition, when truth is taken to be weakly central to the meaning-theory, then knowledge of the meaning of a sentence is equated with knowledge of its truth-conditions, but in addition some further explanation is required, consisting of the ability to lay down the assertion conditions for that sentence. In particular, to understand a mathematical formula, it is necessary to be able to distinguish between mathematical constructions which do and which do not constitute proofs of it. Notoriously, this extends easily to a theory of meaning as use in the Wittgensteinian tradition. As the following lines of this introduction are meant to show, ECDI builds on this basis allowing for distinct levels of justification, in order to account for the various *modes* in which assertions present truths. In particular, our approach is founded on a logical theory of forms of judgements which allows one to avoid the collapse into a pragmatic approach to meaning, as a theory of *successful* assertions of truth.

(p. 24, Schaar 2009). The same reasoning obviously applies when subject and attributer are the same individual, only accounted from two distinct perspectives, either in time, space or any other coordinate.

See (Primiero 2012) for the full formal language: it is a variant interpretation of the basic system of constructive type-theory that links hypotheses and refutable contents. It extends to a modal type-theory, varying on a theme first proposed in (Pfenning and Davies 2001) and later expanded in (Nanevski et al. 2008). We shall here only focus on the appropriate introduction rules for justified and assumed contents and expand on the use of modalities in the next section.

Standard references for this debate are e.g. (De Rose 1991), (McFarlan 2005), (Egan 2007) and (Dietz 2008).

Such formal analyses (and applications thereof) are given for example in (Kahle 2006), (Pfenning and Davies 2001), (Murphy 2008), (Primiero 2009b) and (Kahle 2012).

Notice here the properly judgemental nature of our modalities, as we let them range over judgements rather than over propositions in the form \((\square A\, true). \) The latter is the form of judgements used for the modal extension of Martin-Löf’s Type Theory in (Pfenning and Davies 2001) and (Nanevski et al. 2008), whereas we use the judgemental modalities both in (Primiero 2012) and (Primiero 2010) in order to express the contextual nature of our proof-terms.

Our distinction between terms and modal operators is technically the same distinction obtained in (Moody 2003) by distinguishing between variables for different kind of hypotheses and labels to refer to locations of such constructors. The complete formal analysis of the properties of the extension to multi-modalities in connection to distributed computing is presented in (Primiero 2010). See also (Primiero and Taddeo 2011) for the use of this framework to formalize communications characterized by trust.

See (Primiero and Taddeo 2011) for the formal details concerning equivalence with common and distributed knowledge.

http://en.wikipedia.org/wiki/Reliability_of_Wikipedia. See also (Viegas et al. 2004) and (Besiki et al. 2008) for studies on the practice of work organization and cooperation in online encyclopedias.

## Acknowledgments

This paper was first presented at an informal meeting of the IEG held at the Oxford e-Research Centre on June 8, 2010. The author wishes to thank all participants for discussion that helped clarify some conceptual issues. Any remaining omission or imprecision is entirely his own fault.