Reference and Truth

I apply the notions of alethic reference introduced in previous work to the construction of several classical semantic truth theories. Furthermore, I provide proof-theoretic versions of those notions and use them to formulate axiomatic disquotational truth systems over classical logic. Some of these systems are shown to be sound and proof-theoretically strong, and to compare well with the most renowned systems in the literature.


Introduction
Tarski [32,33] has bequeathed to us a pessimistic but valuable lesson about truth. Given a language L with a monadic predicate T and a name ⌜ϕ⌝ for each sentence ϕ of the language, for T to adequately express the notion of truth of sentences of L, each sentence ϕ should be materially equivalent to T⌜ϕ⌝. However, if the underlying logic is classical and the language is 'expressive enough', the equivalence must fail in some cases, on pain of triviality. For instance, if L allows for self-referential expressions and, in particular, contains a liar sentence λ that says of itself that it is not true, then λ and ¬T⌜λ⌝ will turn out to be equivalent. Thus, the equivalence between λ and T⌜λ⌝ is untenable. As a consequence, an adequate truth predicate for the whole language is not possible in L if we work within classical logic.
Tarski proposed to restrict the equivalence between each ϕ and T⌜ϕ⌝ (expressed as a biconditional), the so-called T-schema, to the T-free sentences of the language. In this way, λ is easily excluded. Tarskian typed truth theories have been pervasive for many years and play a prominent role in diverse areas of logic and philosophy.
Nonetheless, the shared view nowadays is that these theories are far too restrictive. After all, many sentences containing the truth predicate (e.g. expressions of the form T⌜T⌜ϕ⌝⌝ where ϕ is T-free) seem to be entirely unproblematic.
Less drastic solutions have been subsequently explored. Theories which weaken the logic but keep some form of equivalence between each sentence and its truth ascription, originating in the work of Kripke [20] and Martin & Woodruff [23], are very popular. Others have investigated the possibility of remaining classical but imposing less strict restrictions on the admissible instances of the T-schema. The goal is to find a unified criterion that allows us to keep as many unproblematic instances as possible, leaving the paradoxical ones out.
A promising strategy in this direction is to restrict the T-schema to sentences that exhibit 'safe' reference patterns. The orthodox view, championed by Russell, Poincaré, and Tarski, has it that the cause of semantic paradox is self-reference. Despite the popularity of this view, the restriction of the T-schema to non-self-referential sentences has not yet been explored. The main reason is that the notions of self-reference and reference have proved to be quite elusive in the past, and are now surrounded by an aura of scepticism. 1 Additionally, the self-reference diagnosis has been (relatively) recently challenged by a new paradox that is prima facie free of self-reference: the Visser-Yablo paradox. 2 However, the bearing of this new antinomy is not yet entirely clear: the lack of a precise notion of self-reference has hindered the evaluation of the Visser-Yablo paradox and thus of the self-reference diagnosis of paradoxicality.
In [27] I provide a systematic and rigorous account of reference in the context of truth (or "alethic reference", as I call it) and define self-reference and other reference patterns in terms of this notion. I hope the intuitive appeal of the definitions I put forward there helps dissipate the scepticism surrounding the corresponding notions, at least with regard to the semantic paradoxes. Furthermore, I show that the expressions involved in the Visser-Yablo paradox are not self-referential according to the new notions, refuting the self-reference diagnosis of the roots of paradox. Although this ruins the prospects of restricting the T-schema to non-self-referential expressions, the more general project of restricting it to sentences that exhibit safe reference patterns need not be abandoned, as all paradoxical expressions might share other reference patterns.
The purpose of this paper is to confirm this hypothesis. I provide both semantic and axiomatic theories of truth in which the T-schema is restricted to well-founded expressions, and prove they are encompassing and sound. Moreover, I show this condition can be relaxed even further, i.e. that the reference patterns that underlie paradoxical expressions can be given a more fine-grained characterization. In short, I deploy the reference notions introduced in [27] to formulate different restricting criteria for admissible instances of the T-schema in terms of their underlying reference patterns and show that this strategy is successful, as it results in classically consistent, sound, and fairly attractive theories of truth.
Section 2 provides a technical introduction to formal theories of truth, followed by an overview of the state of the issue regarding different restrictive criteria for the truth predicate that have been proposed in the literature. I also give a compact presentation of the approach to reference in the context of truth introduced in [27]. Section 3 employs these notions to construct semantic truth theories. Section 4 provides simpler, proof-theoretic versions of the notions of reference (and related concepts) from Section 2 and formulates several axiomatic truth systems based on them, some of which are shown to be sound and proof-theoretically strong. Finally, in Section 5 I conclude by evaluating these systems in light of Leitgeb's [22] criteria for formal theories of truth.

Formal Truth Theories
Let L be the language of first-order Peano arithmetic (PA) and let L_T augment L with a monadic predicate T to express truth. L contains =, ¬, ∧, ∨, →, ∀, and ∃ as logical constants, an individual constant 0, a monadic function symbol S, two dyadic function symbols + and ×, and a finite stock of extra function symbols for primitive recursive (p.r.) functions to be specified. All other logical and non-logical symbols occurring in formulae are to be understood as the usual abbreviations. Let N be the standard model for L, with ω as its domain. Note that, for each n ∈ ω, L has a term n̄ denoting n (the numeral of n) that consists of n occurrences of S followed by 0.
We work with a fixed (effective and monotonic) coding of expressions of L_T by numbers in ω. 3 If σ is a string of symbols of L_T, we write ⌜σ⌝ for the numeral of its code. We often identify expressions of L_T with their codes if there is no room for confusion. Unless otherwise indicated, by "formula" and "sentence" we mean formula of L_T and sentence of L_T, respectively.
Although L speaks primarily about natural numbers, our coding allows it to express many syntactic properties, relations, and functions about the expressions of L_T. Thus, our truth theories can be formulated in L_T, with background syntactic principles formulated in L.
Formal truth theories can be either semantic or axiomatic. Semantic theories consist of a model or family of models ⟨N, Γ⟩ expanding N to a model of L_T, where Γ is the extension of T in the model. In evaluating them, we look at the truth principles that hold in every model of the family. Epistemic considerations are also relevant, however: it is important that there is a way to know which truth principles belong to the theory, to the extent possible. If the theory is too complex, this would hardly be the case. This is not to say that complex semantic constructions are of no value. On the contrary, they can serve as witnesses of the consistency of collections of truth principles, or play a heuristic role in the formulation of more constructive theories, that is, axiomatic ones. 4
By contrast, axiomatic truth theories result from adding truth-theoretic axioms to a syntax theory, which we assume to be PA. 5 We assume PA contains the defining recursion equations for each extra function symbol in L. As is well known, PA is strong enough to represent every recursive relation between numbers and, therefore, between expressions of L_T, and to weakly represent every recursively enumerable relation. Let PAT consist of the axioms of PA formulated in L_T with induction for the whole language. Call an axiomatic truth theory in L_T any recursive extension of PAT. Of course, some theories will be highly incomplete and others simply unsound, but the terminology is convenient. As before, the merits of a theory lie in the truth principles the theory entails. To know what principles hold in an axiomatic theory it is enough to find a proof. 6

Tarski's Theorem (and ways to circumvent it)
Ideally, any truth theory (whether semantic or axiomatic) would satisfy Tarski's condition of material adequacy, according to which all instances of the following schema hold in the theory:
T⌜ϕ⌝ ↔ ϕ (T-schema)
These are known as "T-biconditionals". Alternatively, we could work with a more general variant of the T-biconditionals, the uniform T-biconditionals:
∀t (T⌜ϕ(ṭ)⌝ ↔ ϕ(t°)) (Uniform T-schema)
Both the T-schema and its uniform version are known as "disquotational" principles. Note that the set of closed terms of L_T is p.r., as well as the value relation that holds between each term of the language and the number it denotes (as L contains only finitely many function symbols). Let CTerm(x) and x° = y represent in PA ("represent", from now on) this set and relation, respectively. 7 Moreover, the substitution function that takes a formula ϕ, a term t, and a variable v and returns the formula that results from replacing all free occurrences of v in ϕ with t is also p.r. and thus represented by a term x(y/z) ∈ L. We write ∀t ϕ for ∀v (CTerm(v) → ϕ) and ∃t ϕ for ∃v (CTerm(v) ∧ ϕ), for a suitable variable v. Finally, ⌜ϕ(ṭ)⌝ abbreviates ⌜ϕ⌝(t/⌜v⌝), for some suitable term variable t, provided that v is the only free variable in ϕ. Thus, the instances of the Uniform T-schema quantify over closed terms, entailing all substitution instances of the standard T-schema uniformly.
Unfortunately, neither of these principles can be implemented unrestrictedly, as the language is 'expressive enough' to allow for paradoxical expressions such as liar sentences.
Let v abbreviate the string of variables v_1, ..., v_n different from x and y.
Theorem 1 (Diagonal Lemma) For every formula ϕ(x, v) there is a formula ψ(v) such that the following is provable in PAT:
ψ(v) ↔ ϕ(⌜ψ(v)⌝, v) (1)
5 Robinson Arithmetic, a weaker subsystem of PA, would also do. I choose PA instead to facilitate the comparison between our truth theories and other systems that are found in the literature.
6 However, to know whether a schematic principle holds, the schema of a proof is required.
7 Since L contains enough function symbols for the proof of the Strong Diagonal Lemma (cf. Theorem 2) to go through, it cannot have a function symbol for the value relation (which is also a function), on pain of triviality. Nonetheless, I write x° = y instead of, e.g. Val(x, y), to preserve readability, as is customary.
In equivalences of the form (1), ψ(v) is said to be a fixed point of ϕ(x, v). Let ϕ in Theorem 1 be ¬Tx. Then there is a sentence λ such that the following is a theorem of PAT:
λ ↔ ¬T⌜λ⌝ (2)
λ is normally understood as saying of itself that it is untrue, that is, as a liar sentence. Given (2), no consistent extension of PAT can contain an instance of the T-schema for λ and, a fortiori, full disquotation is untenable. This is what Tarski's undefinability result consists in. Likewise, if we opt for a semantic account instead, no model ⟨N, Γ⟩ of L_T can validate unrestricted versions of disquotation, since all theorems of PAT, including (2), are true in ⟨N, Γ⟩. Thus, we say λ is paradoxical.
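The clash between (2) and the T-schema can be written out step by step. The following is only a sketch of the standard argument in LaTeX notation, assuming the corner quotes of `amssymb` for names of sentences; it adds nothing beyond (2) and the T-biconditional for λ.

```latex
% Requires amssymb for \ulcorner and \urcorner.
\begin{align*}
  &\lambda \leftrightarrow \neg T\ulcorner\lambda\urcorner
    && \text{(2), provable in PAT}\\
  &T\ulcorner\lambda\urcorner \leftrightarrow \lambda
    && \text{T-schema instance for } \lambda\\
  &T\ulcorner\lambda\urcorner \leftrightarrow \neg T\ulcorner\lambda\urcorner
    && \text{chaining the two biconditionals}\\
  &\bot
    && \text{classical propositional logic}
\end{align*}
```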
To avoid paradox, Tarski opted to restrict disquotation to sentences without T, but more permissive restrictions are possible without stepping into triviality. Thus, we wonder with Leitgeb [21, p. 156], "What kinds of sentences with truth predicate may be inserted plausibly and consistently into the T-scheme?" An idea that suggests itself is to restrict our disquotational principles to non-paradoxical expressions, that is, to those that can be consistently inserted in the T-schema. Alas, McGee [24] has shown that maximal consistent sets of T-biconditionals decide each and every sentence of L_T, which means they are far too complex for an axiomatization. Moreover, McGee's result shows there are uncountably many of those sets, so picking one of them amounts to an arbitrary choice. Consider the following 2-liar cycle:
λ_1 ↔ T⌜λ_2⌝, λ_2 ↔ ¬T⌜λ_1⌝ (3)
Theorem 1 guarantees that both biconditionals are provable in PAT if we diagonalize the predicate ¬T⌜Tẋ⌝. Given a formula ϕ with exactly one free variable v, let ⌜ϕ(v̇)⌝ be short for ⌜ϕ⌝(v̇/⌜v⌝), where ẋ is a term of L for the p.r. function that maps each natural number to the code of its numeral. Since v is free in ⌜ϕ(v̇)⌝, we can quantify over it. Clearly, the biconditionals in (3) are inconsistent with the T-biconditionals for λ_1 and λ_2. However, if one of these T-biconditionals is dropped, consistency is restored. Every maximal consistent set will contain one of them but not the other.
A more promising suggestion has been made by Leitgeb [21]. Roughly, his idea is to restrict the T-schema to grounded sentences, that is, sentences whose truth value ultimately depends on expressions not containing the truth predicate. Since the truth value of the liar and of sentences in liar cycles seems to depend, directly or indirectly, on these very same sentences, the latter are considered ungrounded and excluded from our disquotational principles.
The truth value of other expressions such as 0 = 0, T⌜0 = 0⌝, and T⌜λ⌝ ∨ ¬T⌜λ⌝, instead, is ultimately fixed by non-semantic facts, so their corresponding T-biconditionals hold in the theory. Leitgeb's theory consists of a model ⟨N, Γ_lf⟩ of L_T where only grounded sentences belong to Γ_lf and all instances of disquotation for grounded sentences are true. The theory is quite natural, elegant, and seemingly free from adhocness. However, it is also fairly complex. 9 This means that an N-categorical axiomatization is not possible, i.e. the class of purely truth-theoretic principles the theory entails is not recursively axiomatizable. 10 Schindler [30] provides a nice axiomatic system, but it is naturally far from capturing the original semantic construction.
Others have put forward simpler, syntactic restrictions on disquotation. Halbach [9] explores the restriction of uniform disquotation to formulae in which T occurs only positively, i.e. in the scope of an even number of negations and conditional antecedents. The resulting system is known as PUTB, for "Positive Uniform Tarski Biconditionals". Schindler [31], in turn, considers restricting the Uniform T-schema to translations of formulae of the language of second-order arithmetic without second-order parameters. Both criteria seem somewhat unnatural and ad hoc.
I would like to propose an alternative path, namely, to restrict disquotation to expressions that exhibit 'safe' reference patterns. According to the self-reference diagnosis, all paradoxical expressions share a common reference pattern, i.e. self-reference. This certainly seems to be the case for λ and the liar cycle given by λ_1 and λ_2. If this hypothesis were correct, a sensible plan would be to restrict disquotation to non-self-referential expressions. The prior lack of adequate and precise notions of reference and self-reference has prevented us so far from exploring this route. Luckily, this situation has been remedied. In my companion paper [27] I give a systematic and formal account of reference in the context of truth, designed specifically for the study of the reference patterns underlying paradoxical sentences. Moreover, since according to my account reference has a syntactic aspect, restricting disquotation to non-self-referential expressions could turn out to be simple enough for the formulation of axiomatic truth systems.
Unfortunately, the notions I put forward in [27] reveal that the self-reference diagnosis is not correct. As it turns out, there are semantic paradoxes that are free of self-reference. Such is the case of the Visser-Yablo paradox, consisting of an infinite list of sentences, each of which says of the ones coming after that they are untrue. The existence of this list can be proved in PAT by Theorem 1. Diagonalizing the formula ∀z (z > w → ¬Tx(ż/⌜w⌝)), we obtain a predicate Y(w) such that
Y(w) ↔ ∀z (z > w → ¬T⌜Y(ż)⌝) (4)
is provable in PAT. Instantiating w with each numeral results in the following biconditionals, i.e. the list:
Y(0̄) ↔ ∀z (z > 0̄ → ¬T⌜Y(ż)⌝)
Y(1̄) ↔ ∀z (z > 1̄ → ¬T⌜Y(ż)⌝)
Y(2̄) ↔ ∀z (z > 2̄ → ¬T⌜Y(ż)⌝)
. . . (5)
By reductio ad absurdum, the T-biconditionals for Y(n̄) entail ¬T⌜Y(n̄)⌝ for each n ∈ ω, as well as ¬∀z ¬T⌜Y(ż)⌝. Nonetheless, ∀z ¬T⌜Y(ż)⌝ does not follow in PAT plus the T-biconditionals. This means that the theory is consistent, albeit ω-inconsistent. 11 Note, however, that no model ⟨N, Γ⟩ of L_T can make the T-biconditionals for all the Y(n̄) true at the same time: since each ¬T⌜Y(n̄)⌝ would have to be true in the model, so would ∀z ¬T⌜Y(ż)⌝. For these reasons, the Visser-Yablo paradox is not considered to be a paradox in the strict sense, but an ω-paradox. Despite not directly trivializing our axiomatic theories, it is still problematic, as no semantic truth theory can validate all T-biconditionals for the sentences in the list. Furthermore, although (4) is consistent with the T-biconditionals, an outright inconsistency can be obtained by combining (4) with an instance of uniform disquotation for Y(w).
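The reductio can be spelled out as follows. This is only a sketch: it assumes that (4) has the form Y(w) ↔ ∀z (z > w → ¬T⌜Y(ż)⌝) and that the T-biconditional for every sentence in the list is available. The macro `\num` is a hypothetical shorthand for numerals introduced here for readability.

```latex
% \num{n} is a hypothetical macro for the numeral of n; corner quotes need amssymb.
\newcommand{\num}[1]{\overline{#1}}
\begin{enumerate}
  \item Assume $T\ulcorner Y(\num{n})\urcorner$. By the T-biconditional for
        $Y(\num{n})$, $Y(\num{n})$ holds.
  \item By (4), $\forall z\,(z > \num{n} \rightarrow \neg T\ulcorner Y(\dot z)\urcorner)$;
        in particular, $\neg T\ulcorner Y(\num{n+1})\urcorner$.
  \item The same universal claim entails
        $\forall z\,(z > \num{n+1} \rightarrow \neg T\ulcorner Y(\dot z)\urcorner)$,
        so $Y(\num{n+1})$ by (4), and hence $T\ulcorner Y(\num{n+1})\urcorner$
        by the T-biconditional for $Y(\num{n+1})$.
  \item Steps 2 and 3 contradict each other, so
        $\neg T\ulcorner Y(\num{n})\urcorner$ for each $n \in \omega$.
  \item In particular $\neg T\ulcorner Y(\num{0})\urcorner$, hence $\neg Y(\num{0})$,
        i.e. $\exists z\,(z > \num{0} \wedge T\ulcorner Y(\dot z)\urcorner)$ by (4),
        which yields $\neg\forall z\,\neg T\ulcorner Y(\dot z)\urcorner$.
\end{enumerate}
```

Steps 4 and 5 together exhibit the ω-inconsistency: every numerical instance ¬T⌜Y(n̄)⌝ is derivable, and so is the negation of their universal closure.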
According to my account of alethic reference, no sentence in the Visser-Yablo list is self-referential but they are all unfounded, as will be seen in Section 2.3. This shows that restricting disquotation to non-self-referential expressions is not a viable project; banning self-reference is not enough. However, as will be shown in Section 3, there are other reference patterns shared by unparadoxical expressions that prove to be sensible restrictions on disquotation. Moreover, their relative simplicity will allow us in Section 4 to formulate proof-theoretic approximations that we then deploy in the formulation of sound and encompassing axiomatic theories of disquotational truth. But before we get to that, I will briefly introduce my general account of alethic reference and related notions.

Alethic reference
Before giving a formal definition of alethic reference, I will briefly discuss its motivation and some of its general features.
Alethic reference (just "reference", from now on) is introduced in order to study the reference patterns underlying the semantic paradoxes. Accordingly, it is a relation between sentences of L_T. Sentences can refer to one another in two different ways: by mention and by quantification. For a sentence ϕ to refer by mention ("m-refer") to a sentence ψ, the former must contain a closed term denoting (the code of) ψ; and for it to refer to ψ by quantification ("q-refer"), (the code of) ψ must fall under the range of the quantifiers in ϕ, which might be restricted by predicates in a sense to be specified. Since we are only interested in reference in the context of truth, to determine the m- and q-referents of a sentence we will focus only on terms occurring in the scope of T. As a consequence, only sentences containing T can (alethically) refer. Finally, it is worth pointing out that since reference is strongly tied to the presence of terms in sentences, the notion cannot be closed under logical equivalence, on pain of triviality (cf. footnote 13). In other words, reference is not extensional, but hyperintensional.
11 See Hardy [14] and Ketland [18,19]. A theory formulated in an extension of L is said to be ω-inconsistent just in case there is a formula ϕ(x) such that, for each n ∈ ω, ϕ(n̄) is a theorem and, at the same time, the theory entails ¬∀x ϕ(x). An ω-inconsistent theory may be nonetheless consistent, as inferring ∀x ϕ(x) from the set of all its instances ϕ(n̄) would require an infinitary rule, not admissible in finitary systems such as the ones we are working with.
We turn now to the definitions.
Definition 1 (M-reference) Let ϕ and ψ be sentences. ϕ m-refers to ψ iff ϕ contains a subsentence of the form Tt and N ⊨ t = ⌜ψ⌝.
A sentence m-refers only to those sentences denoted by closed terms that occur immediately after T. For instance, let ¬̣ be a function symbol of L representing the p.r. function that maps each formula to its negation (and similarly for other logical connectives) and let Bew_PA(x) weakly represent provability in PA in a natural way. 12 According to the definition of m-reference, while ¬T⌜0 = 0⌝ m-refers to 0 = 0, T¬̣⌜0 = 0⌝ m-refers to the negation of 0 = 0 (but not to 0 = 0 itself), and Bew_PA(⌜0 = 0⌝) doesn't m-refer to any expression. Thus, proper subterms in the scope of T and T-free sentences don't play a role in m-reference.
Note also that while T⌜0 = 0⌝ → T⌜0 = 0⌝ m-refers to 0 = 0, 0 = 0 → 0 = 0 doesn't, so m-reference is not closed under logical (let alone material) equivalence. 13 For this reason, one should not expect the equivalences delivered by Theorem 1 to bring forth self-m-referential sentences. However, since L contains a function symbol for substitution, there is a stronger version of this result that does so, provided that Tx is a subformula of ϕ. 14
Theorem 2 (Strong Diagonal Lemma) For every formula ϕ(x, v) there is a term t such that the following is provable in PA:
t = ⌜ϕ(t, v)⌝
We say that ϕ(t, v) is a strong fixed point of ϕ(x, v). Strongly diagonalizing, for instance, the predicate ¬Tx, we obtain in PA the following identity:
l = ⌜¬Tl⌝
that is, a sentence that m-refers to itself. ¬Tl is a 'strong' liar sentence.
12 See Halbach & Visser [12,13] for a discussion on natural representations.
13 If it were, every sentence ϕ would refer to every other sentence ψ, as ϕ and, e.g. ϕ ∧ (T⌜ψ⌝ → T⌜ψ⌝) are logically equivalent, and the latter obviously m-refers to ψ.
14 See Jeroslow [17] for a proof.
Q-reference is considerably more complicated. The underlying idea is the following. While, e.g. ∀x Tx q-refers unrestrictedly to every expression, sentences such as PA's global reflection principle, ∀x (Bew_PA(x) → Tx), q-refer only to the theorems of PA. In general, universal quantifiers followed by conditionals are to be understood as restricted, i.e. ranging over just the sentences satisfying the antecedent, with one proviso: that the antecedents are T-free. In this way, we are able to determine what the sentence q-refers to. 15 Otherwise, the quantifiers are taken to be unrestricted.
Giving a formal definition of q-reference requires some additional setup. We first need to introduce the procedure of normalizing formulae of L_T; q-reference for sentences is then defined in terms of their normalizations. Since distinct sentences can have the same normalization, the procedure induces an equivalence relation (having the same normalization) under which q-reference is closed.
Triviality is avoided, however, since this equivalence relation is more fine grained than logical equivalence.
The normalization of an expression is the result of a series of logically valid transformations that deliver a formula in alethic disjunctive normal form (ADNF). Call "prime" any atomic or universal formulae or their negation. A formula is in ADNF just in case it contains no conditionals and no existential or dummy quantifiers, and every subformula of the form ∀v ϕ is s.t. (i) ϕ is a disjunction (of length ≥ 0) of conjunctions (of length ≥ 0) of primes and, (ii) if ϕ is of the form ψ ∨ χ, ψ contains all and only the T-free disjuncts of ϕ (if any) and χ all of the T-containing disjuncts (if any). The point of normalization is that one can easily see whether sentences in ADNF take the form of a restricted quantified claim. Consider a sentence ∀v ϕ in ADNF containing T. If ϕ is a disjunction ψ ∨ χ with some T-free disjuncts, ∀v ϕ can be seen to be equivalent to the restricted quantificational claim ∀v (¬ψ → χ), for the disjuncts of ψ must be T-free whereas the disjuncts of χ must not. In this way, the reference-restricting conditional ¬ψ → χ is made explicit, and ¬ψ is guaranteed to encapsulate all truth-free restrictions imposed on ∀v. If, instead, ϕ is not a disjunction or contains no T-free disjuncts, the quantifiers in ∀v ϕ are unrestricted.
Since formulae in ADNF cannot contain conditionals or existential quantifiers, the first step in normalizing an expression is to replace these with negations, conjunctions, disjunctions, and universal quantifiers, making use of the standard definitions. Let τ : L_T → L_T carry out these replacements, that is, τ(ϕ → ψ) := ¬τ(ϕ) ∨ τ(ψ) and τ(∃v ϕ) := ¬∀v ¬τ(ϕ). Then, the normalization process is done in stages. It consists of successive transformations of each subformula of the form ∀v ϕ into ADNF, starting from those of lesser depth. Let ϕ contain no conditionals, existential, or dummy quantifiers.
For each i ∈ ω, the i-normalization of ϕ is the result of successively applying the following transformations to each subformula ∀v ψ of depth i:
1. Replace every subformula of the form ¬(ψ_1 ∨ ψ_2) and ¬(ψ_1 ∧ ψ_2) with (¬ψ_1 ∧ ¬ψ_2) and (¬ψ_1 ∨ ¬ψ_2) resp. until they don't occur any longer, starting with the innermost.
2. Erase all double negations.
3. Replace every subformula of the form ψ_1 ∧ (ψ_2 ∨ ψ_3) and (ψ_1 ∨ ψ_2) ∧ ψ_3 with (ψ_1 ∧ ψ_2) ∨ (ψ_1 ∧ ψ_3) and (ψ_1 ∧ ψ_3) ∨ (ψ_2 ∧ ψ_3) resp. until they don't occur any longer, starting with the innermost.
4. In every subformula of the form ∀v (ψ_1 ∨ ... ∨ ψ_n) (where each ψ_i is not itself a disjunction), rearrange the disjuncts into χ_1 ∨ χ_2 such that the ones not containing T (if any) occur in χ_1, whilst the others (if any) occur in χ_2.
Definition 2 (Normalization) The normalization ϕ * of a formula ϕ is the result of erasing all dummy quantifiers in τ (ϕ) and then, if there are any quantifiers left, performing successive i-normalizations starting with i = 1 and stopping after i = max{dep(∀v ψ) : ∀v ψ is a subformula of ϕ}.
Since every step in the i-normalization of a formula involves only finitely many transformations and formulae contain finitely many quantifiers, the normalization process always terminates. Moreover, it delivers a logically equivalent expression.
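As a rough illustration of the first normalization step, the translation τ can be sketched over a toy syntax tree. The tuple encoding of formulae and the function name `tau` are assumptions made here for illustration only; they are not part of the formal development, and the sketch does not carry out the later i-normalization stages.

```python
# Toy encoding (an assumption for illustration): formulae are nested tuples.
#   ("atom", s)                      atomic formula, kept opaque
#   ("not", f)                       negation
#   ("and", f, g), ("or", f, g)      conjunction, disjunction
#   ("imp", f, g)                    conditional
#   ("all", v, f), ("ex", v, f)      quantifiers binding the variable v


def tau(f):
    """Rewrite conditionals and existential quantifiers away, as tau does."""
    op = f[0]
    if op == "atom":
        return f
    if op == "not":
        return ("not", tau(f[1]))
    if op in ("and", "or"):
        return (op, tau(f[1]), tau(f[2]))
    if op == "imp":
        # tau(phi -> psi) := not tau(phi) or tau(psi)
        return ("or", ("not", tau(f[1])), tau(f[2]))
    if op == "all":
        return ("all", f[1], tau(f[2]))
    if op == "ex":
        # tau(exists v phi) := not forall v not tau(phi)
        return ("not", ("all", f[1], ("not", tau(f[2]))))
    raise ValueError(f"unknown connective: {op!r}")


# A reflection-style sentence: forall x (Bew_PA(x) -> Tx)
reflection = ("all", "x", ("imp", ("atom", "Bew_PA(x)"), ("atom", "T(x)")))
translated = tau(reflection)
```

Here `translated` encodes ∀x (¬Bew_PA(x) ∨ Tx). Its T-free disjunct already precedes the T-containing one, so the sentence can be read as a quantifier restricted to the objects satisfying Bew_PA(x).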
We are finally in a position to provide an adequate and precise definition of q-reference. Let n abbreviate n_1, ..., n_m ∈ ω, and n̄ abbreviate n̄_1, ..., n̄_m.
Definition 3 (Q-reference) Let ϕ, ψ be sentences. ϕ q-refers to ψ iff ϕ* has a subsentence of the form ∀v χ s.t. T occurs in χ, and one of the following holds:
1. χ := (χ_1 ∨ χ_2), T doesn't occur in χ_1, and χ_2[n̄/v] q-refers or newly m-refers to ψ, 16 for some n ∈ ω s.t. N ⊨ ¬χ_1[n̄/v].
2. Either χ := (χ_1 ∨ χ_2) and T occurs in χ_1 or χ is not a disjunction or a universal statement, and χ[n̄/v] q-refers or newly m-refers to ψ, for some n ∈ ω.
In brief, the q-referents of universally quantified claims are the sentences their (possibly restricted) instances m- or q-refer to. More precisely, a sentence whose normalization contains a subsentence of the form ∀v ϕ q-refers to what each instance ϕ[n̄/v] newly m-refers or q-refers to, unless ϕ is a reference-restricting conditional, in which case only the instances with true antecedents are to be considered. 17 For instance,
∀x∀y (T⌜0 = 0⌝ ∧ T(x →̣ y)) (6)
which is already in ADNF, q-refers to the m-referents of each instance T⌜0 = 0⌝ ∧ T(m̄ →̣ n̄), provided that the terms delivering m-reference are a product of the instantiation of the quantifiers ∀x∀y, which must be eliminated together 'at once'. Thus, (6) q-refers just to every conditional sentence. On the other hand,
∀x (x = ⌜0 = 0⌝ → ∀y (y = ¬̣x → Ty)) (7)
which normalizes into ∀x (x ≠ ⌜0 = 0⌝ ∨ ∀y (y ≠ ¬̣x ∨ Ty)), q-refers to the q-referents of each instance ∀y (y ≠ ¬̣n̄ ∨ Ty) provided that N ⊨ n̄ = ⌜0 = 0⌝: i.e. (7) q-refers to the negation of 0 = 0.
Just as the definition of m-reference accounts for the self-referentiality of certain sentences delivered by the Strong Diagonal Lemma, Definition 3 implies that 'weakly' diagonalizing certain predicates (e.g. those in which Tx occurs as a subformula) also delivers self-referential sentences. 18 This implies that λ is self-q-referential. The Visser-Yablo sentences, instead, only q-refer to sentences coming later in the sequence; they are not self-q-referential.
Mention and quantification exhaust the ways in which a sentence can directly refer to other expressions. Thus, we have the following definition of direct reference ("d-reference").
Definition 4 (D-reference) Let ϕ, ψ be sentences. ϕ directly refers to ψ iff it m- or q-refers to ψ.

ϕ and ϕ[s/t] d-refer to the same sentences.
3. ϕ and ¬ϕ d-refer to the same sentences.
If v is not free in ϕ, ϕ, ∀v ϕ, and ∃v ϕ d-refer to the same sentences.
Definition 4 allows us to characterize many 'mixed' reference patterns. Consider, for instance, sentences λ_1 and λ_2 in the 2-liar cycle in (3). While λ_1 m-refers to λ_2, the latter only q-refers to the former. Thus, we can say they directly refer to each other. However, in a very clear sense they also refer to themselves, albeit indirectly. Otherwise, we would get a semantic paradox without self-reference on the cheap. The following notions are intended to deal with this and other similar cases.

Definition 5 (Chain of reference) A chain of reference is a (possibly infinite) sequence of sentences s.t. each sentence in the sequence d-refers to the one coming after, if any.
Definition 6 (Reference) Let ϕ, ψ be sentences. ϕ refers to ψ iff there's a chain of reference starting with ϕ and ending with ψ.
We can employ the notions just introduced to define salient reference patterns, such as the following.
Definition 7 (Self-reference) A sentence is self-referential iff it refers to itself.
Definition 8 (Well-foundedness) A sentence ϕ is well-founded iff all chains of reference starting with ϕ are finite. Otherwise, we say that ϕ is unfounded.
Thus, sentences that don't d-refer to any expressions -e.g. the theorems of PA -are well-founded. And sentences that only refer to well-founded expressions are well-founded too. On the other hand, every self-referential expression is obviously unfounded. But there are also unfounded sentences that don't refer to themselves, such as the Visser-Yablo sentences in (5). Thus, not all paradoxical expressions are self-referential. However, the notions introduced in this section can still be deployed in the formulation of restrictions to our disquotational truth principles. As will be seen in the next section, all paradoxical expressions are unfounded, for theories in which disquotation is restricted to well-founded sentences will be shown to be (ω-) consistent. What is more, we will show that this characterization can be refined even further, prompting more encompassing truth systems.
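Definitions 5 to 8 can be tried out on a finite toy graph of direct reference. The sketch below is only an illustration under strong assumptions: the dictionary `d_refers` is a hypothetical, hand-written stand-in for d-reference among a handful of labelled sentences, and the finiteness of the graph means that an infinite chain exists exactly when a cycle is reachable. Unfounded but non-self-referential sentences such as the Visser-Yablo ones need infinitely many sentences and so cannot be represented this way.

```python
# Hypothetical finite fragment of the d-reference relation: each key is a
# sentence label, each value lists the sentences it directly refers to.
d_refers = {
    "0=0": [],
    "T<0=0>": ["0=0"],
    "T<T<0=0>>": ["T<0=0>"],
    "lambda1": ["lambda2"],   # 2-liar cycle
    "lambda2": ["lambda1"],
    "T<lambda1>": ["lambda1"],
}


def referents(graph, source):
    """Sentences reachable from source by a non-empty chain of reference."""
    seen = set()
    stack = list(graph.get(source, ()))
    while stack:
        sentence = stack.pop()
        if sentence not in seen:
            seen.add(sentence)
            stack.extend(graph.get(sentence, ()))
    return seen


def is_self_referential(graph, sentence):
    """Definition 7: the sentence refers to itself."""
    return sentence in referents(graph, sentence)


def is_well_founded(graph, sentence):
    """Definition 8, specialised to finite graphs: no chain starting with the
    sentence is infinite, i.e. no sentence on a cycle is reachable from it."""
    candidates = referents(graph, sentence) | {sentence}
    return not any(is_self_referential(graph, s) for s in candidates)
```

On this toy graph, `T<lambda1>` is unfounded without being self-referential, since it merely leads into the liar cycle; within a finite graph every unfounded sentence must reach some self-referential one, which is exactly what fails in the infinite Visser-Yablo case.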

Semantic Theories of Truth
In this section we deploy the notions of alethic reference introduced in Section 2.3 to prove the existence of models of L_T expanding N which verify large and well-motivated sets of instances of disquotation. In other words, we put forward semantic truth theories. In turn, these will serve as witnesses to the consistency and arithmetical soundness of the axiomatic systems we introduce in Section 4.

Well-Founded Truth
Our first theory consists of a single model ⟨N, Γ_wf⟩ of L_T in which the extension assigned to T, Γ_wf, contains all and only true well-founded sentences. Thus, all instances of the T-schema for well-founded sentences hold in the model. Moreover, since ⟨N, Γ_wf⟩ expands the standard model of arithmetic, the Uniform T-schema restricted to formulae with only well-founded numerical instances also holds in ⟨N, Γ_wf⟩.
The set Γ_wf is obtained via a usual Kripke-style construction, by considering a transfinite sequence of sets Γ_α ⊆ ω, with α ∈ On (the class of all ordinals), of which Γ_wf is shown to be a fixed point. In order to construct this sequence, we first consider another sequence of sets Φ_α whose fixed point, Φ_wf, is the set of well-founded sentences. Thus, this sequence 'stratifies' this set. If ϕ is a sentence, let D(ϕ) be the set of sentences ϕ d-refers to. For each α ∈ On, Φ_α is defined as follows:
Φ_0 := ∅
Φ_{α+1} := {ϕ : ϕ is a sentence and D(ϕ) ⊆ Φ_α}
Φ_α := ⋃_{β<α} Φ_β, if α is a limit ordinal
Lemma 1 If α < β, then Φ_α ⊆ Φ_β.
Proof By transfinite induction on β. Let α < β and ϕ ∈ Φ_α. If β is 0 or a limit ordinal, the result follows trivially. Let β = ξ + 1. By inductive hypothesis (i.h.), ϕ ∈ Φ_ξ. Thus, ξ ≠ 0. If ξ = ζ + 1, D(ϕ) ⊆ Φ_ζ. Again, by i.h., Φ_ζ ⊆ Φ_ξ, so D(ϕ) ⊆ Φ_ξ. Therefore, ϕ ∈ Φ_β. The case in which ξ is a limit can be proved in a similar way.

Proposition 1
There is an α ∈ On s.t., for every β > α, Φ_α = Φ_β and Φ_α is the set of well-founded sentences.
Proof The first conjunct follows immediately from Lemma 1 and cardinality considerations. Therefore, the sequence reaches a fixed point, Φ_wf. We show that Φ_wf is the set of well-founded sentences.
Let ϕ ∈ wf . Thus, ϕ ∈ α , for some α ∈ On. We show by transfinite induction on α that all sentences in α are well-founded. Assume that, for every β < α, β contains only well-founded sentences. If α = 0, the result follows trivially. If α = ξ +1, then ξ is a set of well-founded sentences, by i.h. Then, all members of α d-refer just to well-founded sentences and are also well-founded. If α is a limit ordinal, the result follows trivially from the i.h. as well.
Now let ϕ be a well-founded sentence. By Definition 8, either ϕ doesn't refer or every chain of reference starting with ϕ is finite. Thus, the following function from the set of well-founded sentences to On is well defined:
Assume the result holds for every α < f (ϕ) and let f (ϕ) = β + 1. The case in which f (ϕ) is a limit is similar. Then, β is the supremum of all f (ψ) s.t. ϕ d-refers to ψ. By i.h. and Lemma 1, ϕ ⊆ β+1 , which means ϕ ∈ f (ϕ)+1 . By Lemma 1 and the first conjunct of this proposition, we can conclude that ϕ ∈ wf .
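The stratification can be illustrated on a finite toy case. The following is only a sketch under simplifying assumptions: the d-reference relation is given outright as a finite graph (the sentence names and the graph are hypothetical), stage 0 is empty, and each successor stage collects the sentences whose direct referents all lie in the previous stage. On a finite graph the sequence stabilizes after finitely many steps, so no limit stages are needed.

```python
from typing import Dict, List, Set

def well_founded_stages(drefs: Dict[str, Set[str]]) -> List[Set[str]]:
    """Return the increasing stages of the stratification of a finite graph.

    drefs maps each sentence name to the set of names it d-refers to.
    Stage 0 is empty; stage n+1 contains every sentence all of whose
    direct referents belong to stage n. Iteration stops at the fixed point.
    """
    stages: List[Set[str]] = [set()]
    while True:
        current = stages[-1]
        nxt = {s for s, targets in drefs.items() if targets <= current}
        if nxt == current:
            return stages
        stages.append(nxt)

# Hypothetical toy d-reference graph (not taken from the paper).
toy = {
    "zero_eq_zero": set(),                     # refers to nothing
    "t1": {"zero_eq_zero"},                    # a truth ascription to the above
    "t2": {"t1"},                              # an iterated truth ascription
    "liar": {"liar"},                          # self-referential
    "a": {"b"},                                # a two-element loop
    "b": {"a"},
    "about_liar": {"liar", "zero_eq_zero"},    # refers to an unfounded sentence
}

stages = well_founded_stages(toy)
well_founded = stages[-1]
print(sorted(well_founded))
```

In this toy graph the self-referential sentence, the loop, and the sentence that refers to the liar never enter any stage, whereas the remaining sentences enter at a stage that reflects the length of their longest chain of reference.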
We have now paved the way for the construction of the sequence of α , with α ∈ On, that will give us the desired extension of the truth predicate. At each ordinal α, α contains only sentences that are well-founded at this stage, i.e. that belong to α .
To show that the sequence reaches a fixed point we first need to prove two lemmata. According to the first, the truth value of a statement in a model is not affected by the mere addition or removal of sentences it doesn't d-refer to from the extension of the truth predicate. This establishes a link between direct reference and Leitgeb's [21] notion of dependence. 19 More specifically, it follows that ϕ depends on ϕ . The converse doesn't hold, that is, ϕ is not a subset of every set ϕ depends on as, e.g. T λ → T λ d-refers to λ but depends on ∅. Just like aboutness, dependence is not as tied to the syntactic structure of sentences as reference is.

Lemma 2
Let be a set of sentences. If ϕ ⊆ , then for every set of sentences , Proof Since every sentence is logically equivalent to its normalization (cf. Definition 2), we can prove the result for the normalization of ϕ, ϕ * , by induction on its complexity (number of logical operators). If ϕ * ∈ L , ϕ * is true in every expansion of N or in none, so the result follows trivially. Thus, we assume ϕ * contains T. If ϕ * is an atomic sentence, then it's of the form Tt, where either t denotes a sentence in N or it doesn't. If it doesn't, then ϕ * is false both in N, and in N, ∩ . If t denotes a sentence ψ, then, by Definition 1, ψ ∈ ϕ ⊆ . Thus, ψ ∈ iff ψ ∈ ∩ , so we have the desired result.
Assume the claim holds for every sentence of lower complexity than ϕ * . Since d-reference is closed under logical connectives (cf. Observation 3), if ϕ is a negation, conjunction, or disjunction, the result follows trivially from the i.h.

Proposition 2 There is an ordinal
Proof By Lemma 3 and cardinality considerations.
The sequence of sets α stabilizes at some ordinal, that is, it has a fixed point. Let wf be this fixed point. The following result establishes that all instances of the T-schema for well-founded sentences hold in N, wf .
As a consequence, for every well-founded sentence, either it belongs to wf or its negation does. Note also that, by Lemma 1 and Proposition 1, wf contains only well-founded sentences.
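To make the construction concrete, here is a minimal sketch for a toy propositional fragment. It is not the construction over N: the syntax (nested tuples), the sentence names, and the restriction to reference via explicit truth ascriptions (no quantifiers, hence nothing like q-reference) are all assumptions made for illustration. A sentence is evaluated only once everything it refers to has been evaluated, so unfounded sentences never enter the extension.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy syntax: ("true",), ("false",), ("T", name),
# ("not", f), ("and", f, g), ("or", f, g).

def referents(f: tuple) -> Set[str]:
    """Names occurring under T in f (a stand-in for direct reference)."""
    if f[0] == "T":
        return {f[1]}
    out: Set[str] = set()
    for sub in f[1:]:
        out |= referents(sub)
    return out

def holds(f: tuple, ext: Set[str]) -> bool:
    """Classical evaluation of f when T is interpreted by ext."""
    tag = f[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "T":
        return f[1] in ext
    if tag == "not":
        return not holds(f[1], ext)
    if tag == "and":
        return holds(f[1], ext) and holds(f[2], ext)
    if tag == "or":
        return holds(f[1], ext) or holds(f[2], ext)
    raise ValueError(f"unknown connective: {tag}")

def build_extension(sentences: Dict[str, tuple]) -> Tuple[Set[str], Set[str]]:
    """Return (well-founded sentences, true well-founded sentences)."""
    grounded: Set[str] = set()
    ext: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for name, f in sentences.items():
            # Evaluate a sentence only after all its referents are settled.
            if name not in grounded and referents(f) <= grounded:
                grounded.add(name)
                if holds(f, ext):
                    ext.add(name)
                changed = True
    return grounded, ext

sentences = {
    "p": ("true",),
    "q": ("false",),
    "tp": ("T", "p"),
    "ntq": ("not", ("T", "q")),
    "ttp": ("T", "tp"),
    "liar": ("not", ("T", "liar")),
    "mix": ("or", ("T", "p"), ("T", "liar")),
}
grounded, ext = build_extension(sentences)

# The T-schema holds for every well-founded sentence of the toy language.
t_schema_ok = all(holds(("T", n), ext) == holds(sentences[n], ext) for n in grounded)
print(sorted(grounded), sorted(ext), t_schema_ok)
```

Note that "mix" is left out even though it would come out true: it refers to the liar and is therefore unfounded, which mirrors the fact that the extension contains only well-founded sentences.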

More Permissive Criteria
Well-foundedness is not the only restriction we can impose on instances of the T-schema in an expansion of N to L T using the notions of alethic reference: more permissive criteria may also be adopted. For instance, note that in the case of the Visser-Yablo paradox, every expression on the list is unfounded, but also there are infinitely many of them and, additionally, each of them d-refers to infinitely many others. It can be shown that, if only finitely many sentences on the list are considered, no ω-inconsistency arises. More generally, let be a finite set of non-self-referential but unfounded sentences. One can easily find a model N, in which all instances of the T-schema for sentences in wf ∪ hold.

Proposition 4
If is a finite set of non-self-referential unfounded sentences, there's a ⊆ wf ∪ s.t. all instances of the T-schema for sentences in wf ∪ are true in N, .
f is well defined, as is finite and contains no self-referential sentences. Thus, every complete chain of reference restricted to members of ends in a sentence that is not in . For each n ∈ ω, let n := {ϕ ∈ : f (ϕ) = n}. Since is finite, only finitely many of these sets are non-empty. Let k ∈ ω be the greatest number s.t. k ≠ ∅. Thus, = i≤k i . I show that all instances of the T-schema for sentences in wf ∪ are true in N, k . By Lemma 2, all instances of the T-schema for well-founded sentences are still true in N, k , for wf ⊆ k and no well-founded sentence may refer to members of , as they are unfounded. Now let ϕ ∈ , which means there's an n ≤ k s.t. ϕ ∈ n . Let −1 := wf . Therefore, we have that N, k ϕ iff N, n−1 ϕ iff N, n T ϕ iff, by Lemma 2, N, k T ϕ , as ϕ cannot d-refer to any member of k that isn't in n . Consequently, the T-schema also holds for all sentences in .
More interestingly, it can be shown that for any infinite set of non-self-referential unfounded sentences, each of which d-refers only to a finite number of expressions, there is a model N, in which all instances of the T-schema for sentences in wf ∪ are true.

Proposition 5 If is an infinite set of non-self-referential unfounded sentences, each of which d-refers only to a finite number of expressions, there is a
Proof Let := {ϕ 0 , ϕ 1 , . . . }, n := {ϕ 0 , . . . , ϕ n }, and is. I show that, for every n ∈ ω, we can find sets 1 ⊆ · · · ⊆ n s.t. i ∈ G i for each i ≤ n. By Proposition 4, for each n ∈ ω there's a model N, n with wf ⊆ n ⊆ wf ∪ i≤n ( ϕ i ∪ {ϕ i }) in which all instances of the T-schema for sentences in wf ∪ n are true. For i ≤ n let Thus, for each i ≤ n, i n ⊆ i+1 n . By Lemma 2, we have that all instances of the T-schema for sentences in i hold in N, i n , so i n ∈ G i . Let G be the smallest tree consisting of sequences of members of {∅} ∪ n∈ω G n s.t.: -∅ ∈ G , -if 0 , . . . , n ∈ G and n ⊆ n+1 ∈ G n+1 , then 0 , . . . , n , n+1 ∈ G .
Notice that G is locally finite, that is, each node has finitely many children, for each G n is finite. Moreover, for every n ∈ ω, G contains a sequence of length n, e.g. 0 n , . . . , n n , so G is an infinite graph. Finally, all sequences start with ∅, so all nodes are connected by ∅. By König's Lemma, 21 G contains an infinite sequence 0 , . . . , n , . . . s.t., for each i ∈ ω, i ∈ G i . Let := n∈ω n . By Lemma 2, all instances of the T-schema for well-founded sentences are still true in N, as, for each n ∈ ω, n ∩ wf = wf . Let ϕ i ∈ and let m > i be the least natural number s.t., for every k ≥ m, k doesn't contain elements of ϕ i that are not already in m -i.e. k ∩ ϕ i = m . Such an m must exist, for ϕ i is finite and sequences in G are monotonic. Since i < m and Given the choice of m, Lemma 2 entails that N, Propositions 4 and 5 equip us with a more precise criterion for paradox. Whilst the former shows that infinitely many sentences are required for non-self-referential (ω-)paradox, by Proposition 5 we also know that at least some of the infinitely many paradoxical sentences must directly refer to infinitely many expressions.
Based on the notions of alethic reference and the results provided in this section, different reference patterns may be employed to obtain a plethora of semantic truth theories, some more permissive, philosophically motivated, or elegant than others. Our restriction to well-founded sentences seems to fare quite well in these respects, although we have seen that more permissive ones may also be adopted.
Unfortunately, though, reference and, a fortiori, well-foundedness and other reference patterns are far too complex to guide the development of axiomatic truth theories in a straightforward manner. For they are not definable in L , as they make essential use of the notion of satisfaction in the standard model (cf. Definition 3). In the next section I provide mirroring notions of reference tailored specifically for that purpose; these are simplified counterparts of those introduced in the previous section.

Axiomatic Truth
The first part of this section is concerned with proof-theoretic versions of the notions presented in Section 2.3, that is, reference concepts that are relative to a particular proof-system. Section 4.2 explores a naïve truth theory formulated purely in terms of these proof-theoretic notions, shows it is unsound, and thus motivates the restriction of disquotation to reference-stable expressions to avoid paradox. Finally, in Section 4.3 I formulate several axiomatic theories of disquotational truth deploying the concepts introduced in Sections 4.1 and 4.2. The resulting systems are shown to be sound and proof-theoretically strong in comparison with other systems that exist in the literature.
21 König's Lemma establishes that every connected, locally finite, infinite graph contains an infinite path.

Proof-Theoretic Reference
Note that the set of sentences of L T is p.r., as is the occurrence relation that holds between a string of symbols of the language and each of its substrings. Let Sent(x) and Occ(x, y) ∈ L express this set and relation, respectively, in a natural way. 22 Thus, m-reference is also a p.r. relation (cf. Definition 1), as it can be expressed in L by the Δ^0_0 formula MRef(x, y) where T . is a function symbol of L for the p.r. function that maps each term t to the formula Tt (and likewise for other predicates that will come up later). Q-reference, instead, is much more complex, as it's not even arithmetical (cf. Definition 3), i.e. it is not expressible in L . As is well known, truth-in-N can be expressed by a Δ^1_1 formula, and this is best possible. Note that q-reference and truth-in-N are interdefinable over a small class of arithmetical notions. On the one hand, truth-in-N is the only notion occurring in the recursive definition of q-reference that involves second-order quantification, as the value and the occurrence relations, the normalization function, and syntactic properties of expressions such as being a disjunction or a universal statement are all p.r. and, thus, arithmetical. Therefore, q-reference is expressible by a Δ^1_1 formula. On the other hand, truth-in-N can be defined for each sentence ϕ ∈ L in terms of q-reference as follows: N ⊨ ϕ iff ∀x (x = ϕ ∧ϕ → Tx) q-refers to ϕ. Thus, q-reference is Δ^1_1 , namely, hyperarithmetical. By the usual complexity considerations, this result extends to d-reference, reference, chains of reference, self-reference, and also well-foundedness (cf. Definitions 4 and 5-8), which means there is no formula in L that could be employed to restrict the instances of the T-schema or the Uniform T-schema in a first-order axiomatic system of disquotational truth. In other words, there is no straightforward way of axiomatizing the semantic theories introduced in Section 3.
Although there might be interesting, rather indirect, ways of axiomatizing this model, I opt instead to consider simpler, proof-theoretic versions of the concepts introduced in Section 2.3, in particular of well-foundedness. The strategy consists in replacing in Definition 3 the notion of truth-in-N with that of provability-in-PA, our base theory. Thus, all Δ^1_1 notions occurring in Definition 3 are replaced with Σ^0_1 ones. All further reference notions defined in terms of q-reference are modified accordingly. The new concepts can be seen as approximations to the original ones. They turn out to be expressible in L by formulae of fairly low quantificational complexity. This allows for the formulation of generous and well-motivated restrictions on disquotation, which result in powerful axiomatic truth theories, as will be seen soon.
Note that replacing in the definition of m-reference truth-in-N with provability-in-PA would not alter the extension of the relation, for all identity statements that are true in N are provable in PA and vice versa. As mentioned before, m-reference is already a p.r. notion. On the other hand, substituting provability-in-PA for truth-in-N in the definition of q-reference makes a significant difference: it results in a much simpler, recursively enumerable notion. As is well known, provability-in-PA is itself recursively enumerable. Thus, if we replace truth-in-N by provability-in-PA in the recursive definition of q-reference, it can be shown by induction on the depth of χ (cf. Section 2.3) that the altered notion is recursively enumerable as well, since both the definiendum and provability-in-PA occur only positively in the definiens, provability-in-PA is only preceded by a string of existential quantifiers, and all other notions occurring in the definition are p.r., as has already been argued. Therefore, the resulting proof-theoretic notion of q-reference relative to PA can be expressed in L by a Σ^0_1 formula, QRef_PA(x, y). We can find arithmetical formulae expressing proof-theoretic versions of direct reference, reference simpliciter, self-reference, and well-foundedness relative to PA in terms of MRef(x, y) and QRef_PA(x, y). For d-reference (cf. Definition 4), let DRef_PA(x, y) := MRef(x, y) ∨ QRef_PA(x, y), which is trivially Σ^0_1 . For reference simpliciter, we first need to take a detour to define chain of reference.
22 That is, by means of Δ^0_0 formulae. Recall that every p.r. property (or relation) can be expressed in L by a Δ^0_0 expression and every recursively enumerable property by a Σ^0_1 expression. As PA decides all Δ^0_0 and proves all true Σ^0_1 sentences (i.e. PA is Σ^0_1-complete), those expressions represent and weakly represent, respectively, the properties they express.
This requires some subtlety, for infinite sequences cannot be coded by natural numbers. However, finite sequences can. Moreover, note that, according to Definition 6, for ϕ to refer to ψ there must be a chain of reference starting with the former and ending with the latter, i.e. a finite chain. Thus, we can define reference in L exclusively in terms of finite chains. Let Seq(x) ∈ Δ^0_0 express in L the p.r. property of being a (finite) sequence, and let len(x) and (x)_i ∈ Δ^0_0 express the p.r. function that maps each sequence to its length and the one that maps each sequence x and number i to the i-th entry of x, if i is smaller than the length of x, respectively. The
Well-foundedness, on the other hand, requires a bit more work, as it's defined in terms of possibly infinite chains of reference (cf. Definition 8). However, this is not a hindrance, for there is an equivalent formulation of the definition in terms of finite chains only. We say a finite chain of reference ϕ_1, . . . , ϕ_n can be extended just in case there is a sentence ϕ_{n+1} such that ϕ_1, . . . , ϕ_n, ϕ_{n+1} is also a chain of reference. Clearly, a sentence ϕ is well-founded just in case every finite chain of reference starting with ϕ can be extended only a finite number of times. Thus, well-foundedness relative to PA is expressed by the Π^0_3 formula
Wf_PA(x) := ∀y (CRef_PA(y) ∧ (y)_0 = x → ∃z ∀w (CRef_PA(w) ∧ len(y) < len(w) ∧ ∀k < len(y) ((y)_k = (w)_k) → len(w) < z))
This cumbersome formula states that for each finite chain of reference y starting with x there is a limit z to the length of every chain that extends y.
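The finite-chain reformulation can be tried out on a finite graph. The sketch below is only an illustration under strong assumptions: direct reference is given as a finite, explicitly listed relation (the names are hypothetical), whereas the relation expressed by DRef_PA is merely recursively enumerable. In this finite setting a bound on the length of chains starting from a sentence exists exactly when no cycle is reachable from it.

```python
from typing import Dict, Optional, Set

def chain_bound(drefs: Dict[str, Set[str]], start: str) -> Optional[int]:
    """Length (in sentences) of the longest chain of reference from start.

    Returns None when some chain from start can be extended indefinitely,
    which on a finite graph means that a cycle is reachable from start.
    """
    on_path: Set[str] = set()      # sentences on the chain being explored
    memo: Dict[str, int] = {}      # finished sentences with a finite bound

    def longest(node: str) -> Optional[int]:
        if node in on_path:
            return None            # the chain loops back: no bound exists
        if node in memo:
            return memo[node]
        on_path.add(node)
        best = 1                   # the chain consisting of node alone
        for nxt in drefs.get(node, ()):
            sub = longest(nxt)
            if sub is None:
                on_path.discard(node)
                return None
            best = max(best, 1 + sub)
        on_path.discard(node)
        memo[node] = best
        return best

    return longest(start)

# Hypothetical toy d-reference graph.
toy = {
    "zero_eq_zero": set(),
    "t1": {"zero_eq_zero"},
    "t2": {"t1"},
    "liar": {"liar"},
    "about_liar": {"liar", "zero_eq_zero"},
}

bounds = {name: chain_bound(toy, name) for name in toy}
print(bounds)
```

Any number greater than the returned value can play the role of the bound z in the formula for the one-element chain consisting of the starting sentence; a result of None corresponds to the failure of the formula.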

Unstable Reference
One would think our job is almost done now. It just remains to relativize disquotation to Wf PA (x) and then extend PAT with the resulting instances (see Section 2.1), that is, for each sentence ϕ ∈ L T , to obtain the desired truth system. Unfortunately, such a theory would be too weak and, what is worse, unsound. Although many sentences turn out to be well-founded relative to PA, only very few are provably so in PA itself. Consider, for instance, the following: Despite being obviously harmless, as it only refers to 0 = 0, PA cannot prove that (9) isn't self-referential, for that would mean that PA proves In other words, PA would prove its own consistency, which is not possible by Gödel's second incompleteness theorem. In general, PA is not able to prove that a given sentence doesn't q-refer to an arbitrary expression, for it cannot prove that a given number or sequence of numbers provably fails to satisfy a given formula. Therefore, PA can only establish positive cases of q-reference plus those negative ones in which q-reference is nowhere restricted by a conditional -e.g. in ∀x T¬ . x -or no (non-dummy) quantifiers occur in the formula. An idea that suggests itself is to inform PA that it doesn't prove false statements, i.e. that it is sound. This can be done by extending PA with all instances of PA's uniform reflection principle, that is, where ϕ ∈ L . Let URfn(PA) be the resulting theory. URfn(PA) is trivially sound, as all instances of URfn PA are true in N. Nonetheless, if we extend URfn(PA) with all instances of (8), we obtain a trivial system.
Note that ∀x (x = l * ∧ γ → ¬Tx) doesn't refer to anything in PA, for PA doesn't prove Bew PA ( γ ). Moreover, URfn(PA) knows this, as it famously implies γ and, therefore, that PA doesn't prove γ . As a consequence, ∀x ¬Bew PA ( ẋ = l * ∧ γ ) is a theorem of URfn(PA), and so is Wf PA (l * ). By an argument similar to that employed in the liar paradox, one can show that the corresponding instance of the T-schema for ∀x (x = l * ∧ γ → ¬Tx) entails a contradiction in URfn(PA).
An immediate consequence of Observation 4 is that the theory that extends PAT with all instances of (8) is unsound. The reason behind this perplexing result is that being well-founded relative to PA does not mean that a sentence is actually well-founded, but rather that PA has no 'evidence' for its unfoundedness, i.e. it doesn't know that l * = l * ∧ γ . The proof of Observation 4 shows that, if more evidence is available, such as γ , a sentence might turn out to be unfounded relative to the extended system, URfn(PA). This is a consequence of the incompleteness of PA.
Relativizing q-reference and all the other reference concepts that depend on it to URfn(PA) or a stronger system instead of PA will obviously not solve the problem, as any recursively axiomatizable theory will also be incomplete. The issue would emerge all the same, only at a higher level. A better way of bypassing the obstacle is to focus on sentences for which new evidence does not make a difference in the expressions they refer to. I call these sentences "reference-stable" or "r-stable", for short. 23 R-unstable expressions bear a certain analogy with blind truth ascriptions and contingent liars: in both cases we don't know what they express and, a fortiori, whether they are paradoxical or not. Only for r-stable sentences can we be sure that their reference patterns are safe.
To properly define this class of sentences, we first need to introduce the idea of a directly-reference-stable sentence, or "dr-stable" for short. These are sentences that only contain reference-restricting conditionals with 'simple' antecedents, that is, antecedents that, if satisfied by a sequence of numbers, PA knows so. Thus, these antecedents must be provably equivalent in PA to Σ^0_1 formulae.
Definition 9 (Dr-stability) Let ϕ be a sentence. ϕ is dr-stable iff in ϕ * all subformulae of the form ∀v (χ 1 ∨ χ 2 ) where T occurs in χ 2 but not in χ 1 are s.t. χ 1 is provably equivalent in PA to a Σ^0_1 formula.
For instance, whereas ∀x (x = l * → ¬Tx) and GRfn PA (i.e. ∀x (Bew PA (x) → Tx)) are both dr-stable, ∀x (¬Bew PA (x) → Tx) and ∀x (x = l * ∧γ → ¬Tx) aren't. Since truth-in-N and provability-in-PA coincide for sentences provably equivalent in PA to a Σ^0_1 statement, q-reference in PA and q-reference simpliciter coincide in the case of dr-stable sentences. Thus, so do d-reference in PA and d-reference simpliciter, so we can drop the qualifier in these cases. The same cannot be said of reference itself. For a dr-stable sentence ϕ might refer to an expression ψ that is not itself dr-stable, so that the latter d-refers to a sentence χ but doesn't d-refer to it in PA. Our original sentence ϕ, then, also refers to χ but not in PA. Thus, we need the following definition.
Definition 10 (R-stability) Let ϕ be a sentence. ϕ is r-stable iff it's dr-stable and only refers to dr-stable sentences.
Since GRfn PA only refers to T-free sentences, it must be r-stable. On the other hand, ∀x (x = l * → ¬Tx) refers to ∀x (x = l * ∧ γ → ¬Tx) and, therefore, isn't r-stable. Since r-stable sentences are dr-stable and only refer to dr-stable sentences, it follows from our previous considerations that Definition 10 can be equivalently restated by replacing "refers" with "refers in PA". Thus, reference in PA and reference simpliciter concur in the case of r-stable sentences. Moreover, note that dr-stability is expressible in L by a 0 1 formula, DRSt(x), as it's defined in terms of p.r. syntactic properties plus the property of being provably equivalent in PA to a 0 1 formula. Therefore, by the usual complexity considerations, r-stability is expressible in L by the following 2 formula:
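Definition 10 amounts to a closure condition on the reference graph, which can be sketched for a finite toy case. Everything concrete here is an assumption made for illustration: the graph, the sentence names, and the set of sentences stipulated to be dr-stable are hypothetical, and in the paper's setting neither the graph nor dr-stability is decidable in general.

```python
from typing import Dict, Set

def referred_to(drefs: Dict[str, Set[str]], start: str) -> Set[str]:
    """All sentences reachable from start by a non-empty chain of reference."""
    seen: Set[str] = set()
    stack = list(drefs.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(drefs.get(node, ()))
    return seen

def is_r_stable(drefs: Dict[str, Set[str]], dr_stable: Set[str], s: str) -> bool:
    """s is r-stable iff it is dr-stable and only refers to dr-stable sentences."""
    return s in dr_stable and referred_to(drefs, s) <= dr_stable

# Hypothetical toy data: "unstable" stands for a sentence that is not
# dr-stable, and "points_to_unstable" for a dr-stable sentence referring to it.
drefs = {
    "t_free": set(),
    "reflection": {"t_free"},
    "unstable": set(),
    "points_to_unstable": {"unstable"},
}
dr_stable = {"t_free", "reflection", "points_to_unstable"}

results = {name: is_r_stable(drefs, dr_stable, name) for name in drefs}
print(results)
```

The pattern matches the two examples in the text: a sentence that only refers to T-free sentences comes out r-stable, while a dr-stable sentence that refers to a sentence lacking dr-stability does not.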

Restricting Disquotation
We now have all the resources needed to formulate sound axiomatic theories of disquotational truth. Let URfn(PAT) be PAT+ (URfn PA ) and let WFUTB be the theory extending URfn(PAT) with the following restricted version of the Uniform T-schema, where t abbreviates t 1 , . . . , t n :
WFUTB (for "Well-founded Uniform Tarski Biconditionals") uniformly entails all instances of the T-schema for sentences that are provably r-stable and well-founded in PA.

Proposition 6 WFUTB is ω-consistent.
Proof Since every r-stable sentence that is well-founded in PA is also well-founded simpliciter, by Proposition 3 the axioms of WFUTB are all true in N, wf .
Despite being a disquotational theory, WFUTB is proof-theoretically strong, as it can relatively interpret the theory of ramified truth up to Γ_0, RT <Γ_0. This consists of iterations of the theory of typed compositional truth over PAT up to the ordinal Γ_0, the Feferman-Schütte ordinal. 24 Each of these iterations requires its own truth predicate, bringing about a hierarchy of corresponding languages.
As is well known, natural numbers can effectively codify ordinals up to 0 . 25 Assuming a fixed (effective) coding, if α < 0 , we write α for the numeral of the code of α. For clarity, I often identify ordinals with their codes if there is no room for confusion. Based on this coding, PA can talk about ordinals and represent some of their properties. For instance, the set of ordinals below 0 can be represented by a formula of L , Ord(x), as well as their ordering, say, by the formula x ≺ y. Let ∀α ϕ and ∃α ϕ be short for ∀v (Ord(v) → ϕ) and ∃v (Ord(v) ∧ ϕ), respectively, where v is a suitable variable. As is also well known, PA can prove all instances of transfinite induction up to 0 (< 0 ), i.e.
for ϕ ∈ L and ξ < 0 . 26 This means PA can prove that all ordinals below 0 are well ordered. Let L <0 be L and, for every α such that 0 < α ≤ 0 , let L <α be the result of extending L with monadic predicates T β , for each β < α. PAT <α consists of the axioms of PA formulated in L <α , with the induction schema extended to the whole language. For each α < 0 , let Sent(x, α) ∈ L -or Sent α (x), for short -represent the set of sentences of L <α . We define a cumulative hierarchy of compositional truth theories formulated in these languages, as follows.
Let RT <0 be PA and, for every α such that 0 < α ≤ 0 , let RT <α be the theory formulated in L <α that extends PAT <α with every instance of the following axiom-schemata, for each ξ < β < α: where ∀ . and ∃ . are function symbols for the p.r. functions mapping a variable v and a formula ϕ to ∀v ϕ and ∃v ϕ, respectively. Note that the axioms never quantify over the subindex β of T β , on pain of triviality. But they do quantify over the predicates themselves up to a given ordinal β in RT β 9. This set is p.r. for each β < 0 and representable also for β ≥ 0 insofar as all arithmetical instances of TI β hold in the theory. RT <1 is just the theory of typed compositional truth CT where all occurrences of T have been replaced with T 0 (also within corner quotes), for RT 0 8 and RT 0 9 are vacuously true. Similarly, for 1 < α < 0 , axioms RT β 1, with β < α, establish all instances of disquotation for identity statements for T β , and RT β 2-RT β 7 lay down the compositional character of these truth predicates. RT β 8, in turn, provides instances of disquotation for each truth ascription that belongs to lower levels of the hierarchy, supplementing RT β 1. Finally, RT β 9 establishes that the hierarchy is cumulative.
24 See Halbach [10, chap. 8].
25 More comprehensive ordinal notation systems are also possible, but this is enough for our purposes.
26 See Pohlers [28, chap. 3].

Proposition 7
The theory of ramified truth up to Γ_0, RT <Γ_0, is relatively interpretable in WFUTB.
Proof I first show that L T contains a formula ϑ β (x) that behaves in WFUTB like T β (x) in RT <Γ_0, for each β < ε_0, following Halbach's [10] demonstration of his Lemma 15.24. I then extend this result to every β < Γ_0.
If ϕ(x) is a formula of L T containing T, let L ϕ ⊆ L T be the language that extends L with ϕ(x) as if it were just a predicate symbol, that is, T occurs in formulae of L ϕ only within ϕ(t), where t is a term. 27 The relation that holds between a sentence ψ and a formula ϕ just in case ψ is a sentence of L ϕ is p.r. Therefore, it can be represented by a Δ^0_0 expression, Sent (x, y). Strongly diagonalizing the formula over the free variable k, we obtain a predicate ϑ(x, y) (written ϑ y (x)) s.t. 28 where Sent <s (t) := Sent 0 (t) ∨ ∃ζ ≺ s Sent (t, ϑζ (x) ) ∈ Δ^0_0 , s and t are terms and ζ is a suitable ordinal variable. 29 Let L <ϑ α be the language that results from merging together every L ϑ β , with β < α. Thus, for every α < 0 , Sent <α (x) represents the set of sentences of L <ϑ α .
27 More precisely, we add the following recursion clause to the definition of well-formed formula: if t is a term, then ϕ(t) is a well-formed formula.
Given that (10) was obtained by Strong Diagonalization, we can identify the predicate ϑ y (x) with the right-hand side of the biconditional, except all occurrences of ϑ y (x) are actually occurrences of a more complex term that denotes ϑ y (x). Since alethic reference is closed under Leibniz's Law, this difference won't actually make a difference.
To show that, for each β < 0 , ϑ β (x) behaves like T β (x), we first need to remove the occurrences of T in (10), that is, we need to show that WFUTB contains an instance of the Uniform T-schema for each ϑ β (x), with β < 0 . In other words, I give a uniform proof that every instance of each of these formulae is r-stable and well-founded in PA.
Note first that the normalization of ϑ y (x) is a disjunction of negated universally quantified statements followed by just one reference-restricting conditional, except for the first disjunct that doesn't contain T. Moreover, the antecedents of these conditionals are all Δ^0_0 . This means, on the one hand, that each ϑ β (t) is dr-stable, so that is, if ϑ β (t) q-refers to a sentence in PA, then PA knows about it. But also, the fact that all the antecedents are Δ^0_0 implies that URfn(PA) knows about all negative cases of q-reference for each ϑ β (t) as well. For instance, if Sent <β (z∧ . w)∧ t = (z∧ . w) isn't true of m, n ∈ ω, then PA proves the negation of Sent <β (m∧ . n) ∧ t = (m∧ . n), which means that URfn(PA) proves that PA doesn't prove Sent <β (m∧ . n) ∧ t = (m∧ . n). Furthermore, since each of the antecedents holds exactly of one natural number or ordered pair of natural numbers, URfn(PA) can also prove general facts about the sentences each ϑ β (t) q-refers to. For example, it follows in URfn(PA) that, if t denotes an identity statement or doesn't denote a sentence of L <ϑ β , then ϑ β (t) doesn't refer to any expression.
28 Note that the fourth disjunct on the right-hand side of the following biconditional requires that the conjunction of z and w is a sentence of L <y , instead of their disjunction. This guarantees that z and w are sentences of the language themselves, and not two disjuncts conforming an expression of the form ϑ k (t), with k < y, which is always a disjunction. This way, only one disjunct in ϑ y (x) can be satisfied at a time.
29 Although strictly speaking this formula is not Δ^0_0 , as it quantifies over the possibly infinitely many ordinals smaller than ζ , it's easy to see that only finitely many cases need to be checked for each t and each s, for every sentence of L T is of finite length. Thus, Sent <s (t) is provably equivalent in PA to a Δ^0_0 expression.
Consider the following translation function η : L < 0 → L T : where η . is a term of L representing η. We know such a function exists and is p.r. by Kleene's Recursion Theorem. This way, η not only translates truth predicates occurring in a formula but also those that occur inside corner quotes, as well as those inside corner quotes inside corner quotes, etc. For instance, the translation of In what follows, I show that WFUTB proves the translation of every axiom of RT < 0 . Actually, I prove general, quantified versions of the translations to facilitate the derivation of RT β 9. I only deal with RT β 1, RT β 6, RT β 8, and RT β 9. The other cases can be proved in a similar fashion. Let α < 0 . We reason informally in WFUTB under the assumption that β ≺ α. By (10) we have that which entails the translation of RT β 1. For the other axioms, note that by TI α . Thus, we have that which implies the translation of RT β 6. For RT β 8's, assume ξ ≺ β: Finally, we prove the translation of RT β 9, that is, 31 Assume ζ ≺ β and Sent ζ (t • ). By (14), we have that Sent <ζ (η . (t • )). By our proof of the translation of RT β 8, in which ξ is a variable, we know that ϑ β ( ϑζ (η . (t . )) ) ↔ ϑ ζ (η . (t • )). Thus, we just need to show that ϑ ζ (η . (t • )) ↔ ϑ β (η . (t • )), for every term t. I prove it by an internal complete induction on the complexity of t • .
Def. η which completes the base case. Assume the result holds for every term denoting a sentence of lower complexity than t • 's. If t • = ¬ . s, we have the following: The cases for the other logical operators follow from the i.h. in a similar way. For the last case, assume t • = T ξ (s) , for some ξ ≺ ζ . We then have that This completes the proof of the relative interpretability of RT <ε_0 in WFUTB. Therefore, WFUTB entails every arithmetical theorem of RT <ε_0 . This includes all instances of transfinite induction for formulae of L up to α, for some α > ε_0 , i.e. WFUTB can prove that larger segments of the ordinal notation system are well-ordered. This implies that our proofs can be extended to show that RT <α is relatively interpreted in WFUTB, as only arithmetical transfinite induction has been employed so far. In turn, RT <α allows us to prove further instances of arithmetical transfinite induction, which means WFUTB can actually relatively interpret more iterations of compositionality, and so on. The progression reaches a fixed point at Γ_0 , completing the proof. 32
Whilst Proposition 6 establishes the soundness of WFUTB, Proposition 7 shows that the theory is well positioned with respect to some of the better-known truth systems in terms of proof-theoretic strength. For instance, the Kripke-Feferman theory KF, that classically axiomatizes the family of Kripke's fixed-point models over the Strong Kleene evaluation scheme, is relatively interpretable already in RT <ε_0 . So is Halbach's PUTB. In turn, FS, for "Friedman-Sheard", is even weaker than the latter. 33 Proposition 7 shows that WFUTB is stronger than these four renowned systems. 34
32 See Feferman [4], Halbach [10, sec. 22.2].
33 See Feferman [4] for KF, Halbach [9] for PUTB, and Friedman & Sheard [7] and Halbach [8] for FS.
34 These results make WFUTB and other (classical disquotational) systems that will be introduced next attractive candidates for minimalist theories of truth.
As it's often quoted, according to Horwich [16, p. 42] "the principles governing our selection of excluded instances are, in order of priority: (a) that the minimal theory not engender 'liar-type' contradictions; (b) that the set of excluded instances be as small as possible; and -perhaps just as important as (b) -(c) that there be a constructive specification of the excluded instances that is as simple as possible." Yet, WFUTB can be soundly strengthened in several ways. For instance, we can close WFUTB's truth predicate under provable equivalence in URfn(PAT), that is, we can replace WFUTB's truth-theoretic axiom schema with the following: where ϕ is a formula with exactly n free variables v 1 , . . . , v n , x(t . ) is short for x(t 1 / v 1 ) . . . (t n / v n ), and Bew URfn(PAT) (x) weakly represents provability in URfn(PAT). Call the strengthened system WFUTB + . In WFUTB + , not only r-stable well-founded sentences have corresponding instances of disquotation, but also those that are provably equivalent in URfn(PAT) to r-stable well-founded sentences. This includes, among other expressions, unfounded logical truths and falsities such as ∀x (Tx → Tx) and ∃x (0 = 0 ∧ Tx), which are obviously harmless but are excluded from WFUTB's truth principles.
The ω-consistency of WFUTB+ follows from Lemma 2, which establishes that every sentence ϕ depends on ϕ , the set of sentences ϕ d-refers to. This implies that well-founded sentences can be shown to be grounded in the sense of Leitgeb. 35 Since dependence is closed under equivalence in every expansion of N to L T and, a fortiori, in URfn(PAT), so is Leitgeb's truth predicate, which means that his semantic theory N, lf is a model of WFUTB+. 36

Alternatively, we could extend WFUTB along the lines suggested by the results established in Section 3.2. By Proposition 4, any theory extending WFUTB with finitely many instances of the T-schema for r-stable non-self-referential non-well-founded sentences is ω-consistent. For example, we could add instances of disquotation for finite subsets of the Visser-Yablo sentences in (5) to WFUTB without running into ω-inconsistency.
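As an informal illustration of why finitely many such instances are harmless, the following Python sketch brute-forces classical truth values for a finite fragment of a Yablo-style list. It assumes the usual informal reading on which the i-th sentence says that every later sentence is untrue. The function name, the encoding, and the single Boolean standing in for the extension of T on the sentences that receive no biconditional are hypothetical simplifications, not part of the formal apparatus of this paper.

```python
from itertools import product

def consistent_valuations(k):
    """Search for classical valuations of a Yablo-style list in which only the
    first k sentences Y_0, ..., Y_{k-1} receive a T-biconditional.

    Assumption (not from the paper): Y_i is read as 'every Y_j with j > i is
    untrue'. The extension of T on the sentences Y_j with j >= k is left
    unconstrained; all that matters about it here is whether T holds of at
    least one of them, which is the Boolean `tail_has_true`.
    """
    found = []
    for tail_has_true in (False, True):
        for values in product((False, True), repeat=k):
            ok = True
            for i in range(k):
                # Given the biconditionals T(Y_j) <-> Y_j for j < k, Y_i holds
                # iff no later Y_j (j < k) is true and T holds of no sentence
                # in the unconstrained tail.
                later_true = any(values[i + 1:]) or tail_has_true
                if values[i] != (not later_true):
                    ok = False
                    break
            if ok:
                found.append((values, tail_has_true))
    return found

for k in range(1, 5):
    print(k, consistent_valuations(k))
```

For each k the search finds a valuation, for instance the one that makes all k sentences false while T holds of some later sentence. This is only a propositional abstraction of the consistency of finitely many biconditionals; it is not a proof of Proposition 4 and says nothing by itself about ω-consistency.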
In turn, Proposition 5 allows us to ω-consistently extend WFUTB with all instances of the Uniform T-schema for arbitrary r-stable non-self-referential non-well-founded sentences, provided that they d-refer to finitely many expressions. Call the resulting theory FUUTB, for "Finitely Unfounded Uniform Tarski Biconditionals". This theory leaves the Visser-Yablo sentences out but allows for instances of disquotation for, e.g., sentences in ω-chains such as the 'truth-teller' sequence, given by an infinite list of sentences, each of which says only of the one coming after it that it is true. 37

Unlike WFUTB's, the truth predicate of the extensions considered in the previous paragraph cannot always be closed under equivalence, on pain of unsoundness. To see this, notice that ∀x (x = l → ¬Tx) is logically equivalent to the strong liar ¬Tl (= l) but is also r-stable, non-self-referential (it only refers to ¬Tl, which just refers to itself), and d-refers to finitely many expressions.
Allow me to consider one last disquotational theory, NSRTB, for "Non-self-referential Tarski Biconditionals". NSRTB extends URfn(PAT) with all instances of the following schema: where ϕ is a sentence. Every r-stable non-self-referential sentence has a corresponding instance of disquotation in NSRTB, including all sentences on the Visser-Yablo list. As a consequence, the theory is ω-inconsistent (cf. Section 2.2) and, therefore, unsound. However, it can be easily seen that NSRTB is not inconsistent, by Proposition 4 and an application of compactness. Thus, we may draw the general conclusion that sets of non-self-referential sentences of L T cannot be paradoxical simpliciter but only ω-paradoxical. But note also that, if we adopt the uniform version of (15) instead, that is, the resulting system is downright inconsistent, for we can show in URfn(PA) that all instances of the Visser-Yablo predicate Y(v) satisfy the antecedent of this principle. 38 Since no new instances of disquotation for sentences are allowed by the uniform version, one might wonder what has gone wrong. According to Priest [29], Beall [1], Cook [3], and others, the inconsistency is a consequence of admitting instances of (uniform) disquotation, not for self-referential sentences, but for self-referential or circular predicates, e.g. Y(v). Although the Visser-Yablo sentences are not self-referential, the argument goes, they are formulated in terms of a circular predicate and, therefore, are circular themselves. To evaluate this claim, adequate notions of reference, self-reference, etc. that apply to formulae, and not just sentences, are required. I leave the task of extending the notions introduced in Section 2.3 to formulae for another time. Let me just say that, provided a natural extension is possible, the claim seems quite plausible.

Conclusions
In this paper I have explored a number of semantic and axiomatic truth theories motivated by the notions of reference I introduced in [27]. I have shown the axiomatic theories to be proof-theoretically strong compared to most well-known systems in the literature, despite being purely disquotational. However, proof-theoretic strength is not necessarily a sign (or the only sign) of theoretical value. To assess the worth of the systems in themselves and compared to others, in this concluding section I test them against the criteria discussed in Leitgeb [22]. The criteria are the following:
(a) Truth should be expressed by a predicate, and a theory of syntax should be available.
(b) If a theory of truth is added to mathematical or empirical theories, it should be possible to prove the latter true.
(c) The truth predicate should not be subject to any type restrictions.
(d) T-biconditionals should be derivable unrestrictedly.
(e) Truth should be compositional.
(f) The theory should allow for standard interpretations.
(g) The outer logic and the inner logic should coincide.
(h) The outer logic should be classical.
As Leitgeb shows, it is not possible to meet all these criteria at once. Every formal account of truth will necessarily have to give up some of these requirements. It will be instructive to see how the systems we have considered here fare.
Clearly, all of our systems (WFUTB, WFUTB+, FUUTB, and NSRTB) meet the first criterion, for T is a predicate and arithmetic plays the role of the syntax theory in the background. The same holds of other popular theories considered in the previous section, i.e. PUTB, KF, FS, and systems of ramified truth.
(b), on the other hand, is met only partially. For every theorem ϕ of our base theory, URfn(PA), each of our systems can be shown to entail T⌜ϕ⌝. This is because ϕ is T-free and, therefore, well-founded. However, none of the systems can prove what Leitgeb demands, i.e. a general statement asserting that every theorem of URfn(PA) is true. This would amount to showing that the systems entail URfn(PA)'s global reflection principle: KF, FS, and all systems of ramified truth are known to entail similar principles for their respective base theories. PUTB, on the other hand, doesn't, as shown by Halbach [9, §6]. We can adapt his proof to show that GRfn URfn(PA) is not provable in any of our systems either. Using Kleene's Recursion Theorem let us define a p.r. function s, represented in PA by the function symbol s . , as follows: if ϕ := ∀v ψ ∃v s(ψ, n) if ϕ := ∃v ψ The function symbol lh(x) represents the function that maps every formula to the number of logical operators occurring in it. Then, we can show the following:

Lemma 4 If WFUTB ⊢ ϕ, then there is an n ∈ ω s.t. WFUTB ⊢ s(ϕ, n), and similarly for WFUTB+, FUUTB, and NSRTB.
I omit the proof, as it's roughly the same as the proof of Halbach's Lemma 6.1. An important lesson to draw from this lemma is that none of the systems entails URfn(PA)'s global reflection principle. If they did, then they would also entail

Moving on to (c), it can be easily seen that all the systems meet this requirement, for in all of them T applies to sentences containing T itself; e.g. T⌜T⌜0 = 0⌝⌝ is provable in the four systems. So do KF, FS, and PUTB. Systems of ramified truth obviously fail this criterion.
By contrast, (d) obviously doesn't hold of any of the systems we are considering, as the liar sentence λ, for example, doesn't have a corresponding instance of disquotation, on pain of triviality. Nonetheless, some of our systems fare better than others. While WFUTB contains 'the least number' of T-biconditionals, NSRTB is the most encompassing one, and the other two systems lie somewhere in between.
Criterion (e) is not met by our systems either. Suppose one of the systems entailed the following compositional principle:
(T∧) ∀x∀y (Sent L (x∧ . y) → (T(x∧ . y) ↔ (Tx ∧ Ty)))
where Sent L (x) is a formula of L representing the property of being a sentence of this language. By Lemma 4, there should be an n ∈ ω s.t. the following is also derivable: Since ∀x (Sent L (x) → x = s . (x, n)) is provable in PA for every n ∈ ω, we also have that
∀x∀y (Sent L (x∧ . y) → (T(x∧ . y) ∧ lh(x∧ . y) ≤ n ↔ Tx ∧ lh(x) ≤ n ∧ Ty ∧ lh(y) ≤ n))
By T∧, this formula entails the following:
∀x∀y (Sent L (x∧ . y) ∧ T(x∧ . y) → (lh(x) ≤ n ∧ lh(y) ≤ n → lh(x∧ . y) ≤ n))
Since there are theorems of PA with more than n logical operators that take the form of a conjunction, each of whose conjuncts has at most n logical operators, and all our truth systems prove that each single theorem of arithmetic is true, we get a contradiction. Similar arguments can be given for other compositional principles. Thus, none of the systems puts forward a compositional notion of truth, not even regarding sentences of L. For the same reasons, PUTB's truth predicate is also non-compositional. By contrast, KF, FS, and ramified truth theories are axiomatized by compositional principles. Note, however, that our systems can be consistently extended with compositional principles for L: since every sentence of this language is well-founded, our systems prove all instances of each compositional principle already, which means that these principles are all true in the models we used as witnesses to the consistency of the systems. This also shows that GRfn URfn(PA) can be consistently added to our systems, for it's entailed by the compositional principles for L. 39 Moreover, ω-consistency is preserved in every case.
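The counting fact behind the last step of the argument against (T∧) can be checked mechanically. The sketch below assumes a hypothetical tuple encoding of formulas and a Python function `lh` that mirrors the function symbol lh(x); neither the encoding nor the helper `conjunction_witness` belongs to the formal development. For a given n it builds a conjunction with more than n logical operators, each of whose conjuncts has at most n.

```python
def lh(formula):
    """Number of logical operators in a formula given as a nested tuple.

    Hypothetical encoding: atoms are strings; compound formulas are
    ('not', A), ('and', A, B), ('or', A, B), ('forall', v, A), ('exists', v, A).
    """
    if isinstance(formula, str):
        return 0
    op = formula[0]
    if op == 'not':
        return 1 + lh(formula[1])
    if op in ('and', 'or'):
        return 1 + lh(formula[1]) + lh(formula[2])
    if op in ('forall', 'exists'):
        return 1 + lh(formula[2])
    raise ValueError(f"unknown operator: {op!r}")

def conjunction_witness(n):
    """Return a conjunction whose two conjuncts each have exactly n operators.

    Each conjunct is an iterated conjunction of copies of '0=0', so on the
    intended reading it is an arithmetical truth; theoremhood itself is not
    checked by this code.
    """
    conjunct = '0=0'
    for _ in range(n):
        conjunct = ('and', conjunct, '0=0')
    return ('and', conjunct, conjunct)

w = conjunction_witness(3)
print(lh(w[1]), lh(w[2]), lh(w))
```

With n = 3 both conjuncts have 3 operators while the whole conjunction has 2·3 + 1 = 7, so the bound lh(x∧ . y) ≤ n fails although lh(x) ≤ n and lh(y) ≤ n hold.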
By contrast, not all of our systems can be consistently extended with compositional principles for expressions containing T. Although unrestricted compositionality might be an unreasonable requirement, one would expect that each truth system is compatible at least with compositional principles for the class of sentences that have a corresponding instance of disquotation in the system. This is certainly the case for WFUTB and WFUTB+. This result follows from the fact that the instances of each principle of compositionality for grounded sentences in the sense of Leitgeb are all true in Leitgeb's semantic theory of grounded truth, N, lf , as dependence is closed under logical connectives and equivalence in every expansion of N to L T . 40 Thus, the compositional principles for truth relativized to the predicate WFUTB+ deploys in the restriction of disquotation are also true in N, lf (cf. Lemma 2). Since the latter is a model of WFUTB+, extending this theory with compositionality relativized to this predicate preserves ω-consistency. A fortiori, the same can be said of WFUTB and its respective restricting predicate.
On the other hand, neither FUUTB nor NSRTB can be extended with appropriately relativized compositional axioms. If we strongly diagonalize the predicate T¬ . x we obtain a term l s.t. the identity statement l = ⌜T¬ . l⌝ is provable in PA. Note that T¬ . l d-refers just to the negation of the sentence denoted by l, i.e. to ¬T¬ . l. T¬ . l is therefore non-self-referential and d-refers to finitely many expressions. Moreover, it is r-stable. Thus, both FUUTB and NSRTB contain an instance of disquotation for this sentence, namely, T⌜T¬ . l⌝ ↔ T¬ . l. By Leibniz's Law, this entails Tl ↔ T¬ . l, which is incompatible with the compositional principle that says that truth commutes with negation.
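The final incompatibility is a purely propositional matter, which the following sketch checks by truth table. It treats Tl and T¬ . l as two Boolean unknowns (the Python names are hypothetical) and asks whether any classical valuation satisfies both Tl ↔ T¬ . l and the relevant instance of the negation principle, T¬ . l ↔ ¬Tl.

```python
from itertools import product

def satisfying_valuations():
    """Classical values for Tl and T(neg l) that satisfy both
    (i)  Tl <-> T(neg l), obtained from disquotation via Leibniz's Law, and
    (ii) T(neg l) <-> not Tl, the instance of 'truth commutes with negation'.
    """
    return [
        (t_l, t_neg_l)
        for t_l, t_neg_l in product((False, True), repeat=2)
        if (t_l == t_neg_l) and (t_neg_l == (not t_l))
    ]

print(satisfying_valuations())  # prints []
```

No valuation survives, since (i) and (ii) together yield Tl ↔ ¬Tl.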
Regarding Leitgeb's criterion (f), all our systems except NSRTB fare well. While the latter is ω-inconsistent, as we saw in the last section, the other three systems all have models extending N, which is what Leitgeb had in mind when he demanded that a formal truth theory based on arithmetic allow for standard interpretations. While KF, PUTB, and ramified truth theories also meet (f), FS famously doesn't.
Finally, we consider criteria (g) and (h) together. According to Leitgeb, the outer logic of a truth theory is given by the logical laws the theory can prove, whereas its inner logic consists of the logical laws the theory proves to be true. Since the systems we are considering are classical, (h) is met by all of them. (g), by contrast, is only satisfied by WFUTB+. To see this, note that all logical truths are provably equivalent to each other in first-order logic and, thus, also in URfn(PAT). Since, say, 0 = 0 is well-founded and r-stable, it has an instance of disquotation in WFUTB+. This implies that all logical truths have instances of disquotation in the system as well. Given that they are all provable, we can conclude that they are also provably true. In contrast, none of the other systems has an instance of disquotation for, e.g., ∀x (Tx → Tx). This is a logical truth, yet it q-refers to every sentence. Thus, it is self-referential, so it's not declared true by WFUTB, FUUTB, or NSRTB. While FS is known for its matching inner and outer logics, KF, PUTB, and systems of ramified truth don't meet this criterion.
It's time to take stock. As we have seen, all the axiomatic systems we have considered in this paper fare equally well regarding criteria (a), (c), and (h). Owing to their purely disquotational character, the new systems, like PUTB, fail requirements (b) and (e). However, unlike PUTB, both WFUTB and WFUTB+ can be soundly extended with compositional principles for the language with the truth predicate. Moreover, unlike the systems of ramified truth, ours are untyped; unlike FS, WFUTB, WFUTB+, and FUUTB allow for standard interpretations; and, unlike KF and PUTB, the inner and the outer logics of WFUTB+ do coincide.
This shows that the axiomatic systems introduced in the previous section, especially WFUTB+, compare favourably to the best-known axiomatic truth theories in the literature against Leitgeb's criteria. Additionally, their proof-theoretic power places them above many popular systems.