Alethic Reference

I put forward precise and appealing notions of reference, self-reference, and wellfoundedness for sentences of the language of first-order Peano arithmetic extended with a truth predicate. These notions are intended to play a central role in the study of the reference patterns that underlie expressions leading to semantic paradox and, thus, in the construction of philosophically well-motivated semantic theories of truth.


Introduction
What is the root of semantic paradox? According to a widely believed hypothesis, famously championed by Russell, Poincaré, and Tarski, the blame can be attributed to circularity. In the case of paradoxes involving truth, such as the paradox of the liar and Curry's paradox, a natural development of this hypothesis diagnoses their paradoxicality as arising from self-reference.
Although this view has been around for more than a century, the self-reference diagnosis has never been properly elaborated. The main reason is that the notions of reference and self-reference have proven to be very elusive, despite their frequent use in heuristic contexts in mathematics and mathematical logic.¹ An additional challenge facing the self-reference diagnosis arises from the Visser-Yablo paradox, involving an infinite list of sentences each of which says that [...]
[...] the standard interpretation of L, with ω as its domain. Then, for each n ∈ ω, L has a term n̄ (the numeral of n) denoting n, consisting of n occurrences of S followed by 0.
We work with a fixed (effective and monotonic) coding of expressions of L_T by numbers in ω. If σ is a string of symbols of L_T, we write ⌜σ⌝ for the numeral of its code.⁴ We often identify expressions of L_T with their codes if there's no room for confusion. Unless otherwise indicated, by "formula" and "sentence" we mean formula of L_T and sentence of L_T, respectively.
Although L_T speaks in the first instance about natural numbers, the arithmetization of syntax allows it to express many syntactic properties, relations, and functions. Thus truth theories can be formulated in L_T, with background syntactic principles formulated in L.
Theories can have either a semantic or an axiomatic presentation. A semantic truth theory consists of a model or family of models ⟨N, S⟩ expanding N to L_T, where S is the extension of T in the model. By contrast, axiomatic truth theories result from adding truth-theoretic axioms to a syntax theory, usually PA. We assume PA contains the defining recursion equations for each extra function symbol in L. As is well known, PA is strong enough to represent every recursive relation between numbers and, therefore, between expressions of L_T, and to weakly represent every recursively enumerable relation. Let PAT consist of the axioms of PA formulated in L_T with induction for the whole language. Call an axiomatic truth theory in L_T any recursive extension of PAT. Of course, some theories will be highly incomplete and others simply unsound, but the terminology is convenient.

Diagonalization and Tarski's Theorem
Ideally, any truth theory (whether semantic or axiomatic) would satisfy Tarski's condition of material adequacy, according to which all instances of the following schema hold in the theory:
    T⌜ϕ⌝ ↔ ϕ    (T-schema)
This is often called a "disquotational" principle, and its instances "T-biconditionals". Unfortunately, it cannot be implemented unrestrictedly, as the language is 'expressive enough' to allow for paradoxical expressions such as liar sentences.
Let v abbreviate a string of variables v₁, …, vₙ different from x and y.⁵

Theorem 1 (Diagonalization) For every formula ϕ(x, v) there is a formula ψ(v) s.t. the (universal closure of the) following equivalence is a theorem of PAT:
    ψ(v) ↔ ϕ(⌜ψ(v)⌝, v)    (1)
⁴ We require that the coding is effective and monotonic to avoid certain issues brought up by Heck [10] and Halbach & Visser [7,8]. An effective coding is such that, given a number n, there is an algorithm to determine which expression it codes (if any) and, vice versa, given an expression σ there is an algorithm that delivers the code of σ. A coding is monotonic if, for every two expressions σ and σ′, if σ occurs in σ′, then the code of σ is smaller than the code of σ′.
⁵ The following result due to Montague [20] is a generalization to formulae with an arbitrary number of free variables of a theorem of Carnap [3], which in turn generalizes Gödel's construction of a self-referential statement for the proof of his first incompleteness theorem.
Proof Let Diag(x, y) represent the p.r. diagonalization function that takes a formula ϕ(x, v) and returns ∀x (x = ⌜ϕ(x, v)⌝ → ϕ(x, v)). Then,
    ∀x (x = ⌜∀y (Diag(x, y) → ϕ(y, v))⌝ → ∀y (Diag(x, y) → ϕ(y, v)))    (2)
is the result of diagonalizing ∀y (Diag(x, y) → ϕ(y, v)). Notice that (2) is the ψ(v) we are looking for. Let n be the code of (2). (2) is logically equivalent to
    ∀y (Diag(⌜∀y (Diag(x, y) → ϕ(y, v))⌝, y) → ϕ(y, v)),
which is equivalent in PAT to ϕ(n̄, v). Thus, the following is a theorem of PAT:
    ψ(v) ↔ ϕ(⌜ψ(v)⌝, v)
This is the 'universal' proof of the diagonal lemma. An analogous 'existential' proof can be given in terms of an alternative diagonalization function represented by Diag∃(x, y) that maps each formula ϕ(x, v) to ∃x (x = ⌜ϕ(x, v)⌝ ∧ ϕ(x, v)). Applying 'existential' diagonalization to the predicate ∃y (Diag∃(x, y) ∧ ϕ(y, v)) we also obtain a suitable ψ(v). This and the proof of Theorem 1 will become relevant later.
In equivalences of the form (1), ψ(v) is said to be a fixed point of ϕ(x, v). If x is the only free variable in ϕ, ψ is a sentence, commonly regarded as saying of itself that it has the property expressed by ϕ(x), whatever exactly that is. Thus, fixed-point sentences are considered to be self-referential, and diagonalization is seen as the paradigmatic mechanism for obtaining such self-referential sentences.
Let ϕ in Theorem 1 be ¬Tx. Then we know there is a sentence λ such that the following is a theorem of PAT:
    λ ↔ ¬T⌜λ⌝    (3)
λ is normally understood as a liar sentence, i.e. a sentence saying of itself that it is untrue. Given (3), no classical consistent extension of PAT can contain an instance of the T-schema for λ and, a fortiori, unrestricted disquotation is untenable. This, in essence, is Tarski's undefinability result. Similar results arise for semantic theories of truth. No model ⟨N, S⟩ of L_T can validate the T-biconditional for λ, as all theorems of PAT are true in ⟨N, S⟩, including (3). Thus, we say λ is paradoxical. Another example of a paradox is given by the following:
    κ ↔ (T⌜κ⌝ → 0 ≠ 0)    (4)
This biconditional obtains in PAT by Theorem 1, diagonalizing Tx → 0 ≠ 0. κ is taken to be a Curry sentence, i.e. a sentence that says of itself that it entails something false. Just like in the case of the liar, the T-biconditional for κ is inconsistent with (4), so we say κ is paradoxical. As a final example of paradoxicality consider the following 2-liar cycle:
    λ₁ ↔ ¬T⌜λ₂⌝,    λ₂ ↔ T⌜λ₁⌝    (5)
Theorem 1 guarantees that both biconditionals are provable in PAT by diagonalizing ¬T⌜Tẋ⌝. Given a formula ϕ with exactly one free variable v, let ⌜ϕ(v̇)⌝ be short for ⌜ϕ⌝(v̇/⌜v⌝), where ẋ is a term of L for the p.r. function that maps each natural number to the code of its numeral, and x(y/z) for the (p.r.) substitution function that takes a formula ϕ, a term t, and a variable v and returns the formula that results from replacing all free occurrences of v in ϕ with t. Since v is free in ⌜ϕ(v̇)⌝, we can quantify over it. Clearly, the equivalences in (5) are inconsistent with the T-biconditionals for λ₁ and λ₂; so we say the latter are paradoxical.

The Visser-Yablo Paradox and (Self-)Reference
According to the self-reference diagnosis, all paradoxical expressions involving truth share a common reference pattern, namely, self-reference. This is intuitively clear in the case of the liar, Curry sentences, and liar cycles, but not so much in the case of the sentences that comprise the Visser-Yablo paradox, each of which says only of the ones that follow that they are untrue. The existence of the Visser-Yablo sentences can be proved in PAT, diagonalizing the formula ∀z (z > w → ¬Tx(ż/⌜w⌝)). By Theorem 1, there is a predicate Y(w) such that
    Y(w) ↔ ∀z (z > w → ¬T⌜Y(ż)⌝)
is provable in PAT. Instantiating w with each numeral, we obtain the following biconditionals, i.e. the list:
    Y(0̄) ↔ ∀z (z > 0̄ → ¬T⌜Y(ż)⌝)
    Y(1̄) ↔ ∀z (z > 1̄ → ¬T⌜Y(ż)⌝)
    . . .
By reductio ad absurdum, from the T-biconditionals for Y(n̄) we can easily derive ¬T⌜Y(n̄)⌝ for each n ∈ ω, as well as ¬∀z ¬T⌜Y(ż)⌝. On the other hand, ∀z ¬T⌜Y(ż)⌝ does not follow in PAT plus the T-biconditionals, which means that the theory is consistent.⁶ However, note that no model ⟨N, S⟩ of L_T can make the T-biconditionals for all the Y(n̄) true at the same time: since each ¬T⌜Y(n̄)⌝ would have to be true in the model, so would ∀z ¬T⌜Y(ż)⌝. For these reasons, the Visser-Yablo paradox is not considered to be a paradox in the strict sense, but an ω-paradox. Despite not directly leading to a contradiction in our axiomatic theories, it is still problematic, as no semantic truth theory can validate all T-biconditionals for the sentences in the list.
The presence or absence of self-reference in the Visser-Yablo list has been extensively discussed in the literature.⁷ It ultimately transpired that the notions of reference and self-reference deployed in the discussion were incomplete, defective, and in some cases even trivial. As Leitgeb points out, there are at least two notions of reference at play in the debate, a 'naïve' and an 'incomplete' one. According to the naïve account, self-referential sentences ψ are fixed points of some predicate ϕ(v), as in (1). This accounts for the liar, liar cycles, and other paradoxical expressions such as Curry sentences. Underlying this notion of self-reference is the idea that a sentence refers to every object that is mentioned in an equivalent expression:
    Naïve reference: a sentence ϕ refers to an object o (e.g. a sentence) if there is a sentence ψ(t) that is (e.g. arithmetically) equivalent to ϕ and t denotes o.
This notion is naïve in the sense of being trivial. For instance, every sentence ϕ is (logically) equivalent to ϕ ∧ (T⌜ϕ⌝ → T⌜ϕ⌝) and, more generally, to ϕ ∧ (T⌜ψ⌝ → T⌜ψ⌝), where ψ can be any sentence. As a consequence, every sentence refers to every sentence, including itself. This is unacceptable; a good notion of reference must impose more restrictive criteria. According to the second notion Leitgeb traces in the literature, sentences can refer to an object in two ways. They can either contain a term denoting the object or state something about it by means of a description:
This notion nicely reflects pre-theoretical intuitions. For instance, the sentence ¬T⌜0 = 0⌝ surely refers to 0 = 0; and the sentence ∀x (Bew_PA(x) → Tx), stating that all theorems of PA are true, surely refers to the theorems of PA.
The problem with incomplete reference, however, is that it gives no information as to whether (and how) quantified sentences of a different logical form may refer: for instance, sentences of the form ∀x ϕ where ϕ is not a conditional, and existential claims. It is by no means clear how to fill in this gap to account for, e.g. the prima facie self-referential status of the liar in (3) or the cycle of liars in (5), given that we cannot hold that reference is closed under logical equivalence, on pain of trivializing the notion. In Milne's words,
    Provable material equivalence in a theory is not normally a criterion of synonymy so we must suppose that it is something particular to gödel biconditionals that is at issue. For a number of reasons the case is hard to make. (Milne [19, p. 212])
The deficiency of both available concepts led many to adopt a rather pessimistic attitude towards the notions of reference and self-reference, in particular in the context of arithmetic. Leitgeb [18, p. 13] writes: "we either suspect that much philosophical work lies ahead of us before the question is finally settled, or that otherwise the question is ill-posed, i.e. that the talk of self-referentiality is to be banished from scientific contexts". It's now time to abandon this pessimistic attitude, at least when it comes to reference in the context of theories of truth.⁹ In the next section I advance a natural way of completing the incomplete notion of reference just in the context of truth, giving the expected verdict for all (normally considered to be) clear cases, including the liar and its variants.
With the right notions of alethic reference and self-reference in place, we will be in a position to properly evaluate the orthodox view on semantic paradoxes. Moreover, the new notions also allow us to formulate sensible restrictive criteria for instances of disquotation resulting in philosophically and technically appealing truth theories, as shown in my companion paper [22].

Four Features of Reference
The main purpose of this section is to give an adequate and precise definition of reference in the context of truth, inspired by and extending the incomplete notion outlined in Section 2.3. But before we start, four remarks are in order.
First, one should be careful not to confuse the notion of reference we are after with either the Fregean notion (the truth value of a sentence) or the notion of aboutness.¹⁰ Although they have a strong family resemblance, reference is more tied to the syntactic structure of sentences than aboutness. While tautologies are sometimes considered to be about nothing in particular because they convey no information, we would intuitively say that expressions such as T⌜ϕ⌝ → T⌜ϕ⌝ still refer to ϕ, as Leitgeb's incomplete notion predicts.¹¹
Second, and related to the previous point, closure under logical equivalence should not be required from a definition of reference, as indicated in Section 2.3. On pain of triviality, reference cannot be extensional. It must be hyperintensional.
Third, throughout the paper we will understand reference exclusively as a binary relation between sentences of L_T, and not between sentences and numbers. The reason is simple. We are concerned with arithmetic not as a theory of numbers but of syntax, for the study of formal truth theories and the semantic paradoxes and related phenomena that might affect them.
Note that, since reference to sentences is achieved via a coding, whether a sentence refers to another will inevitably depend on the coding we choose. This choice, even if restricted to effective and monotonic codings, is always fairly arbitrary. As a consequence, what sentences an expression refers to is also very often an arbitrary matter. This is to be expected. The coding we work with fixes the denotation of the terms of L to sentences of L_T from 'outside'.¹² It is only natural that what a sentence refers to depends on the denotation of the terms that occur in it. Moreover, the arbitrariness of the coding will not cause any inconvenience in the formulation of truth systems, as what counts as an instance of disquotation also depends on the coding and varies accordingly.¹³
Finally, we will only focus on alethic reference, that is, reference in the context of truth. Let Bew_PA(x) weakly represent provability in PA in a natural way.¹⁴ If we diagonalize ¬Bew_PA(x) we obtain a sentence γ such that
    γ ↔ ¬Bew_PA(⌜γ⌝)
is provable in PA. γ is also known as the "Gödel sentence of PA". Since γ has been obtained exactly in the same way as the liar sentence λ, one might conclude it is as self-referential as the latter. Nonetheless, γ is completely unparadoxical, as is every other expression belonging to the pure language of arithmetic. Self-reference and other pathological patterns only become dangerous when combined with the truth predicate (and other semantic or logical notions such as satisfaction, property instantiation, and class membership). Thus, we are only interested in the sentences an expression ϕ refers to insofar as they fall in the scope of T in ϕ.¹⁵ For instance, we would like to say that ⌜ψ⌝ = ⌜ψ⌝ ∧ ¬T⌜ϕ⌝ or Bew_PA(⌜ψ⌝) → T⌜ϕ⌝ alethically refer (or just "refer", for short) to ϕ but not to ψ. Making this idea precise is the aim of the rest of this section.
⁹ I hope to have dissipated some doubts already in Picollo [21], as I have shown how to define reference, self-reference, and other referential patterns in the pure language of arithmetic. Unfortunately, it is not possible to simply extend those notions to L_T, as what we are after in this paper is not reference simpliciter but a special kind of reference that only concerns the truth predicate. I come back to this in footnote 15.
¹⁰ See Putnam [24] and Goodman [6] for a historical overview and Urbaniak [27] and Yablo [31] for modern takes on aboutness.
¹¹ See as well Leitgeb [18], Cook [4], Halbach & Visser [7,8], and footnote 17.

Reference by Mention
The incomplete notion of reference identifies two ways in which sentences may refer: by mention and by description. Let us first spell out the alethic version of reference by mention (m-reference) in precise terms.
Definition 1 (M-reference) Let ϕ and ψ be sentences. ϕ m-refers to ψ iff ϕ contains a subsentence of the form Tt and N ⊨ t = ⌜ψ⌝.
A sentence ϕ m-refers only to those sentences denoted by closed terms that follow T in ϕ. This means, on the one hand, that terms occurring in the scope of other predicates do not play a role, as anticipated. For instance, while ¬Bew_PA(⌜γ⌝) doesn't m-refer to any expression, Bew_PA(⌜0 = 0⌝) ∧ T⌜0 ≠ 0⌝ m-refers to 0 ≠ 0 but not to 0 = 0. On the other hand, it means that proper subterms of terms occurring after T are to be ignored. Let ¬. be a function symbol of L representing the p.r. function that maps each formula to its negation (and similarly for other logical connectives). Then T¬.⌜ϕ⌝ m-refers not to ϕ but to ¬ϕ.
¹² One might wonder, in the light of Putnam's [25] model-theoretic argument, if there is an alternative way of fixing the reference of the terms of a language at all.
¹³ See Heck [10].
¹⁴ See Halbach & Visser [7,8].
¹⁵ This is why extending the notions of reference for the pure language of arithmetic I put forward in Picollo [21] to L_T is not viable, as anticipated in footnote 9. However, they can serve a heuristic role in the formulation of the alethic notions.
This may seem too restrictive. Strictly speaking, only the sentence denoted by t (if any) falls in the scope of the truth predicate in Tt. Nonetheless, one might think that by ignoring t's subterms we could be overlooking dangerous reference patterns. I show this is not the case in my companion paper [22]. Moreover, note that allowing that Tt m-refers to all denotations of subterms of t would result in a grossly overgenerating notion. For instance, T¬.⌜ϕ⌝ would m-refer not only to ϕ but also to every sentence ψ whose code is smaller than ϕ's; and the same goes for T⌜ϕ⌝. For ⌜ϕ⌝ is a complex term of the form S…S0, of which ⌜ψ⌝ is a subterm. As a consequence, many expressions would be wrongfully classified as self-referential or unfounded, prompting unnecessarily restrictive truth theories.
Although the equivalences delivered by Theorem 1 do not suffice to establish that the fixed points obtained by Diagonalization m-refer to themselves even in the presence of T, there is a stronger version of this result that does so to a certain extent.
We say that ϕ(t, v) is a strong fixed point of ϕ(x, v). A proof of this result can be found in Jeroslow [13].¹⁶ All that is required is that the language contains a function symbol for the substitution function, as is the case with L. Thus, strongly diagonalizing, for instance, the predicate ¬Tx, we obtain in PA the following identity:
    l = ⌜¬Tl⌝
that is, a sentence that m-refers to itself.¹⁷ ¬Tl is a 'strong' liar sentence. In general, strong fixed points of formulae ϕ(x) with subformulae of the form Tx in which x occurs free are self-m-referential according to Definition 1.
However, self-m-reference might not obtain if, instead, T is followed by an open term s(x) in ϕ(x). For instance, suppose s(x) represents a function mapping each sentence to 0 = 0. In that case, every strong fixed point of ¬Ts(x) will refer to 0 = 0 and not to itself. As another example, consider the formula T¬.x. Strong Diagonalization delivers a term l such that
    l = ⌜T¬.l⌝
is a theorem of PA. T¬.l is a sentence that says not of itself but of its negation that it is true. However, note that, according to Definition 1, whilst T¬.l does not m-refer to itself, its negation does. Thus, despite not being self-referential, T¬.l will be deemed unfounded (cf. Definition 9), as it m-refers to a self-referential expression.
All that matters for m-reference is the occurrence of subsentences of the form Tt. Any two sentences with the same atomic subsentences of this kind m-refer to the same expressions. This means the notion is trivially compositional: ¬ϕ, ∀v ϕ, and ∃v ϕ m-refer to whatever ϕ m-refers to, and (ϕ ∧ ψ), (ϕ ∨ ψ), and (ϕ → ψ) m-refer to everything either ϕ or ψ m-refers to. Thus, m-reference is closed under negation, but also under renaming of variables and the introduction and elimination of dummy quantifiers, i.e. quantifiers that don't bind any variable. As a consequence, it is also closed under a kind of equivalence (more fine grained than logical equivalence) that allows for all valid transformations that do not add or remove types of atoms. Furthermore, m-reference is closed under Leibniz's Law: if s = t, then ϕ and ϕ[s/t] m-refer to the same expressions.
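The compositional character of m-reference can be illustrated with a small sketch. It is not part of the paper's formal apparatus: the tuple encoding of formulae, the term constructors "quote", "neg", and "var", and the function names denote and m_referents are all assumptions made for illustration, with strings standing in for coded sentences.

```python
# Hypothetical encoding (an assumption, not the paper's):
#   formulae: ("T", term), ("pred", name, term), ("not", f),
#             ("and", f, g), ("or", f, g), ("imp", f, g),
#             ("all", var, f), ("ex", var, f)
#   terms:    ("quote", s) plays the role of the numeral of the code of s,
#             ("neg", t) the dotted negation applied to t,
#             ("var", x) a variable.

def denote(term):
    """Return the sentence denoted by a closed term, or None if it is open."""
    kind = term[0]
    if kind == "quote":
        return term[1]
    if kind == "neg":
        inner = denote(term[1])
        return None if inner is None else "¬" + inner
    return None  # variables denote no sentence on their own


def m_referents(formula):
    """Collect the sentences denoted by closed terms directly following T."""
    op = formula[0]
    if op == "T":
        sentence = denote(formula[1])
        return set() if sentence is None else {sentence}
    if op == "pred":
        return set()  # terms in the scope of other predicates are ignored
    if op == "not":
        return m_referents(formula[1])
    if op in ("and", "or", "imp"):
        return m_referents(formula[1]) | m_referents(formula[2])
    if op in ("all", "ex"):
        return m_referents(formula[2])
    raise ValueError(f"unknown connective: {op!r}")


# A conjunction with the Gödel sentence under Bew_PA and ϕ under T.
example = ("and", ("pred", "Bew_PA", ("quote", "γ")), ("T", ("quote", "ϕ")))
print(m_referents(example))                          # only ϕ is collected
print(m_referents(("T", ("neg", ("quote", "ϕ")))))   # the negation of ϕ, not ϕ
```

The recursion never looks inside a term after T except to compute its denotation, which mirrors the decision to ignore proper subterms; negating or quantifying a formula leaves the result unchanged.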

Reference by Quantification
Things are somewhat more complicated in the case of reference by description and, in general, in cases where reference is achieved via the presence of quantifiers, which I call "reference by quantification" (or "q-reference", for short). Let's start by focusing on a paradigmatic example, PA's global reflection principle:
    ∀x (Bew_PA(x) → Tx)    (GRfn_PA)
According to clause 2 of the incomplete notion of reference, (GRfn_PA) refers just to all theorems of PA, for it says only of the latter that they are true. The conditional following a universal quantifier restricts the referenced sentences to those satisfying the antecedent of the conditional. Thus, in the absence of a conditional, e.g. in ∀x Tx and ∀x (Bew_PA(x) ∧ Tx), it appears that the sentence must refer to everything. However, we should not endorse these conditions unrestrictedly in the case of alethic reference.
For one thing, in order to keep reference by mention and reference by quantification apart, it seems reasonable to require that a free variable occurs in the scope of the truth predicate in a sentence for it to refer to other sentences by quantification. For another, we only say that in a statement of the form ∀v (ϕ(v) → ψ(v)) q-reference is restricted to the ϕs as long as it is possible to determine which sentences satisfy ϕ(v): i.e. if T doesn't occur in ϕ. Only then do we regard the conditional as restricting reference by means of the antecedent. Recall the aim of this paper is to offer an appropriate account of reference for the study of the reference patterns underlying paradoxical expressions. However, the ultimate hope (realized in the companion paper [22]) is to find shared patterns in terms of which we can formulate sensible restricting criteria on the instances of disquotation. In turn, the truth principles we embrace will then determine the extension of the truth predicate. On this approach, there is no 'standard' interpretation of this predicate at our disposal yet. Thus, if T occurs both in ϕ and ψ, e.g. in ∀x (Tx → Tx), we will say the expression refers to every sentence, for the conditional is not reference restricting.
Moreover, complex terms occurring in the scope of T should play a role in the q-reference of expressions just as they do for m-reference. Let ϕ be a sentence. In Section 3.2 I stipulated that, e.g. T¬.⌜ϕ⌝ m-refers to ¬ϕ and not to ϕ, because the former and not the latter occurs in the scope of T. I argued that considering subterms is neither necessary nor desirable. For analogous reasons, I would like to say that
    ∀x (x = ⌜ϕ⌝ → T¬.x)    (9)
q-refers, not to ϕ, but to ¬ϕ. Similarly, it seems reasonable to say that ∀x T¬.x q-refers not to every sentence but just to all negations, and ∀x∀y T(x →. y) to all sentences of conditional form.
A prima facie sensible way of capturing this idea is to say that universal claims without a reference-restricting conditional q-refer to what their instances m-refer to, whereas for those with a reference-restricting conditional only the instances with a true antecedent matter. But in order to keep m- and q-reference apart, we should add that the occurrences of the terms in the scope of T in the instances are 'new', that is, they are not present in the sentence itself but are a product of the instantiation of the quantifiers at issue. For instance, ∀x (x = ⌜ϕ⌝ → Tx ∧ T⌜ψ⌝) would q-refer to ϕ but refer to ψ only by mention.
However, this proposal is not general enough. It could be that, intuitively, the relevant instances of a universal statement don't m-refer but themselves q-refer to other sentences, in the presence of embedded quantifiers. Consider the following sentence:
    ∀x (x = ⌜ϕ⌝ → ∀y (y = ¬.x → Ty))    (10)
By analogy with (9), (10) seems to q-refer to what ∀y (y = ¬.⌜ϕ⌝ → Ty) refers to. But the latter does not m-refer to any expression. Intuitively, it q-refers to ¬ϕ. Likewise, the numerical instances of
    ∀x (Bew_PA(x) ∧ ∀y T(y →. x))    (11)
are of the form Bew_PA(n̄) ∧ ∀y T(y →. n̄). Although they do not m-refer to any expression, it looks like they do q-refer, provided that n codes a sentence ϕ. In each such case, we would like to say that Bew_PA(⌜ϕ⌝) ∧ ∀y T(y →. ⌜ϕ⌝) q-refers to all conditional sentences that have ϕ as their consequent, due to the right conjunct.
Q-reference calls for a recursive definition whereby q-referents are specified by considering the m- and q-referents of the quantifier instances of its subsentences. This can be done roughly along the following lines: a universally quantified expression q-refers to what its instances newly m-refer or q-refer to, unless it has a reference-restricting conditional, in which case only the instances with true antecedents are to be considered. Yet, we should be careful when dealing with strings of quantifiers. Consider, for example, the following expression:
    ∀x∀y (Bew_PA(x ∧. y) → Tx)    (12)
Intuitively, (12) q-refers just to theorems of PA, as only these fall in the scope of T in the instances satisfying the antecedent. But if we applied our recursive clause unrestrictedly, we would be forced to conclude that (12) q-refers to every sentence ϕ of the language, as each ∀y (Bew_PA(⌜ϕ⌝ ∧. y) → T⌜ϕ⌝) m-refers to ϕ. The recursive clause should be modified to take into account consecutive quantifiers 'all at once'.
What about expressions of a different logical form, such as ∃v ϕ(v), ¬∀v ϕ(v), and ∃v (ϕ(v) ∧ ψ(v))? A compositional notion of reference by quantification seems desirable, as in the case of m-reference.
This can be achieved by tying q-reference to the presence of quantified expressions as subformulae. In this way, for instance, sentences of the form ¬∀v ϕ(v) and ∀v ϕ(v) would q-refer to the same expressions. It would also be desirable to close q-reference under renaming of variables, addition and removal of dummy quantifiers, and most importantly, under certain valid propositional transformations that do not add or remove atoms. For example, although ϕ and ϕ ∧ ∀x (Tx → Tx) cannot always be said to q-refer to the same sentences, the following pairs of expressions intuitively do: ∀v (ϕ → ψ) and ∀v (¬ψ → ¬ϕ), ∀v (¬ϕ ∨ ψ) and ∀v (ϕ → ψ), ∃v (ϕ ∧ ψ) and ∃v (ψ ∧ ϕ), ∀v ϕ and ∀v ¬¬ϕ, and ∀v ¬ϕ and ¬∃v ϕ. A definition of q-reference that focuses on universal claims but is closed under these and other propositional transformations could fix the q-reference of expressions of any logical form, including existential statements.
For this reason, to provide a formally precise definition with the outlined characteristics, we first take a detour that consists in 'normalizing' all formulae of L_T. Then, q-reference for sentences is defined in terms of their normalizations. Note that distinct sentences may have the same normalization. The normalization procedure therefore induces an equivalence relation (having the same normalization) under which q-reference is closed. The notion, however, is not trivialized, for this equivalence relation is more fine grained than logical equivalence (cf. Section 2.3).
Call "prime" any atomic or universal formula, or its negation. The normalization of an expression is the result of a series of logically valid transformations that deliver a formula in alethic disjunctive normal form (ADNF). Formulae in ADNF of the form ∀v ϕ are roughly in prenex disjunctive normal form, as normally defined in textbooks,¹⁸ except that quantifiers are not rearranged to appear at the beginning of the formula. The point of writing these sentences in ADNF is that one can easily see whether or not they take the form of a restricted quantified claim. Consider a sentence containing T whose normalized form is ∀v ϕ. Clause 1 ensures that ϕ is written in disjunctive form; clause 2 ensures that all of the T-free disjuncts of ϕ (if any) are pushed to the left. Thus if ϕ is ψ ∨ χ where the disjuncts of ψ are T-free and the disjuncts of χ are not, the original sentence can be seen to be equivalent to the restricted quantificational claim ∀v (¬ψ → χ). In this way, the reference-restricting conditional ¬ψ → χ is made explicit, and ¬ψ is guaranteed to encapsulate all truth-free restrictions imposed on the quantifiers ∀v. If, in contrast, ϕ is not a disjunction or ψ is not T-free, the quantifiers ∀v in ∀v ϕ are unrestricted.

Definition 2 (ADNF) A formula is in ADNF iff it contains no dummy quantifiers
Since formulae in ADNF cannot contain conditionals or existential quantifiers, the first step in the normalization of an expression is to replace these connectives with negations, conjunctions, disjunctions, and universal quantifiers, making use of the standard definitions. Let τ : L_T → L_T carry out these replacements, that is, τ(ϕ → ψ) := ¬τ(ϕ) ∨ τ(ψ) and τ(∃v ϕ) := ¬∀v ¬τ(ϕ). Once conditionals and existential quantifiers have been removed, normalization proceeds in stages. It consists of successive transformations of each subformula of the form ∀v ϕ into ADNF, starting from those with fewer embedded quantifiers, i.e. of lesser depth. Let dep assign numbers to universally quantified formulae without conditionals or existential quantifiers as follows:
For every formula ϕ without conditionals, existential quantifiers, or dummy quantifiers, its i-normalization is the result of successively applying the following transformations to each subformula ∀v ψ of depth i:
1. Replace every subformula of the form ¬(ψ₁ ∨ ψ₂) and ¬(ψ₁ ∧ ψ₂) with (¬ψ₁ ∧ ¬ψ₂) and (¬ψ₁ ∨ ¬ψ₂) resp. until they don't occur any longer, starting with the innermost.
2. Erase all double negations.
3. Replace every subformula of the form resp. until they don't occur any longer, starting with the innermost.
4. In every subformula of the form ∀v (ψ₁ ∨ · · · ∨ ψₘ) (where each ψᵢ, 1 ≤ i ≤ m, is not itself a disjunction), rearrange the disjuncts into χ₁ ∨ χ₂ such that the ones not containing T (if any) occur in χ₁, whilst the others (if any) occur in χ₂.
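Some of these transformations can be sketched in code. The following is a simplified illustration under stated assumptions, not the paper's procedure: formulae are encoded as nested tuples (a hypothetical encoding), the names tau, push_negations, contains_T, and split_disjuncts are invented for the example, negations are pushed through the whole formula in one pass rather than depth by depth, and step 3 and the removal of dummy quantifiers are left out.

```python
# Hypothetical encoding: ("T", term), ("pred", name, term), ("not", f),
# ("and", f, g), ("or", f, g), ("imp", f, g), ("all", var, f), ("ex", var, f).

def tau(f):
    """Eliminate conditionals and existential quantifiers."""
    op = f[0]
    if op in ("T", "pred"):
        return f
    if op == "not":
        return ("not", tau(f[1]))
    if op in ("and", "or"):
        return (op, tau(f[1]), tau(f[2]))
    if op == "imp":   # tau(ϕ → ψ) := ¬tau(ϕ) ∨ tau(ψ)
        return ("or", ("not", tau(f[1])), tau(f[2]))
    if op == "all":
        return ("all", f[1], tau(f[2]))
    if op == "ex":    # tau(∃v ϕ) := ¬∀v ¬tau(ϕ)
        return ("not", ("all", f[1], ("not", tau(f[2]))))
    raise ValueError(f"unknown connective: {op!r}")


def push_negations(f):
    """Steps 1 and 2: De Morgan rewriting and erasure of double negations."""
    op = f[0]
    if op == "not":
        g = f[1]
        if g[0] == "not":
            return push_negations(g[1])
        if g[0] == "and":
            return ("or", push_negations(("not", g[1])), push_negations(("not", g[2])))
        if g[0] == "or":
            return ("and", push_negations(("not", g[1])), push_negations(("not", g[2])))
        if g[0] == "all":  # negated universal formulae are prime: keep the negation
            return ("not", ("all", g[1], push_negations(g[2])))
        return f
    if op in ("and", "or"):
        return (op, push_negations(f[1]), push_negations(f[2]))
    if op == "all":
        return ("all", f[1], push_negations(f[2]))
    return f


def contains_T(f):
    """True iff the truth predicate occurs somewhere in the formula."""
    if f[0] == "T":
        return True
    if f[0] == "pred":
        return False
    return any(contains_T(g) for g in f[1:] if isinstance(g, tuple))


def split_disjuncts(body):
    """Step 4: separate the T-free disjuncts from the ones containing T."""
    def flatten(g):
        return flatten(g[1]) + flatten(g[2]) if g[0] == "or" else [g]
    parts = flatten(body)
    return ([d for d in parts if not contains_T(d)],
            [d for d in parts if contains_T(d)])


x = ("var", "x")
grfn = ("all", "x", ("imp", ("pred", "Bew_PA", x), ("T", x)))
normal = push_negations(tau(grfn))
print(normal)                      # ∀x (¬Bew_PA(x) ∨ Tx)
print(split_disjuncts(normal[2]))  # T-free restriction first, then the T-part
```

On the global reflection principle the sketch yields one T-free disjunct and one disjunct containing T, so the quantifier counts as restricted; on ∀x (Tx → Tx) both disjuncts contain T and the T-free list is empty.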
Since every step in the i-normalization of a formula involves only finitely many transformations, the process always terminates. Therefore, we can introduce the following definition:
Definition 3 (Normalization) The normalization ϕ* of a formula ϕ is the result of erasing all dummy quantifiers in τ(ϕ) and, if there are any quantifiers left, performing successive i-normalizations starting with i = 1 until a fixed point is reached.¹⁹
By way of example, consider the following sentence (we assume Bew_PA(x) is already in ADNF):
To normalize it, we first apply τ to get rid of conditionals and existential quantifiers:
¹⁹ It's guaranteed that we reach a fixed point, for the maximum depth of ϕ's subformulae is always finite.
Then, we erase the dummy occurrence of ∀z at the beginning and 1-normalize the resulting expression, that is, we distribute the negation over the conjunction and erase double negations in ∀w ¬(z = w →. y ∧ ¬Tz), the only subformula of depth 1, obtaining:
Next, we 2-normalize the sentence above, erasing the double negation in ∀z ¬¬∀w (z ≠ w →. y ∨ Tz), the only subformula of depth 2, obtaining:
Finally, we 3-normalize the latter, erasing the double negation and swapping disjuncts so that the T-free ones occur on the left. The result is the normalization of (13):
We can see this is a notational variant of so the first pair of quantifiers is restricted by the formula ¬Bew_PA(x) ∧ y = ¬.x, and the second pair by z = w →. y. As another example, note that the normalization of ∀x (Tx → Tx) is just ∀x (¬Tx ∨ Tx), as none of the steps 1-4 can be applied. The quantifier is, thus, unrestricted.

Proposition 1 For every formula ϕ, ϕ * is in ADNF and is logically equivalent to ϕ.
A similar result can be found in Picollo [21], together with a proof. We are finally in a position to provide an adequate and precise definition of reference by quantification. Let n abbreviate a tuple n 1 , . . . , n m ∈ ω and, where it occurs inside a formula, the corresponding tuple of numerals.
Definition 4 (Q-reference) Let ϕ, ψ be sentences. ϕ q-refers to ψ iff ϕ * has a subsentence of the form ∀v χ s.t. χ is not a universal statement, T occurs in χ , and one of the following holds:
1. χ := (χ 1 ∨ χ 2 ), T doesn't occur in χ 1 , and χ 2 [n/v] q-refers or newly m-refers to ψ, for some n ∈ ω s.t. N ⊨ ¬χ 1 [n/v].
2. Either χ := (χ 1 ∨ χ 2 ) and T occurs in χ 1 or χ is not a disjunction, and χ [n/v] q-refers or newly m-refers to ψ, for some n ∈ ω.
Roughly, the definition of q-reference is intended to capture the idea that the q-referents of universally quantified claims are the sentences their (possibly restricted) instances refer to. Note that it can be easily turned into a (fairly cumbersome) recursive definition that bottoms out in m-reference. 20 In the simple cases, where there are no embedded quantifiers, q-reference is defined purely in terms of the m-referents of the instances, whereas in the presence of embedded quantifiers, the latter may also q-refer. In what follows I offer a variety of examples to illustrate how the definition works and to show that it gives the correct verdicts in paradigmatic cases.
Definition 4 deals adequately with simple cases of restricted quantification. For instance, it entails that GRfn PA q-refers just to all theorems of PA. In its normalization, ∀x (¬Bew PA (x) ∨ Tx), ∀x is followed by a disjunction. Thus, by clause 1, GRfn PA q-refers to every sentence each Tn newly m- or q-refers to, provided that N ⊨ ¬¬Bew PA (n), that is, that n codes a theorem of PA. Thus, Definition 4 closely follows clause 2 of the incomplete notion of reference introduced in Section 2.3. What's more, since every formula of L T can be normalized, this definition provides a complete account of reference by quantification.
Clause 2 deals with unrestrictedly quantified claims, such as ∀x Tx and ∀x (Bew PA (x)∧Tx). In both cases, the expressions q-refer to whatever each instance, without restriction, newly m-refers to, that is, to every sentence.
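The two clauses can be illustrated for the simplest case, a sentence of the form ∀x (χ 1 ∨ Tx) or ∀x Tx with no embedded quantifiers. The sketch below is only a toy model under loud assumptions: ω is truncated to a finite range of codes, the T-free restrictor χ 1 is supplied as a Python predicate standing in for truth in N, and Tn is taken to m-refer to the sentence coded by n. The names `q_referents_simple`, `THEOREMS`, and `not_bew` are hypothetical.

```python
from typing import Callable, Iterable, Optional, Set


def q_referents_simple(
    codes: Iterable[int],
    is_sentence_code: Callable[[int], bool],
    chi1: Optional[Callable[[int], bool]] = None,
) -> Set[int]:
    """Codes of the q-referents of  forall x (chi1(x) or Tx)  (clause 1)
    or of an unrestricted claim such as  forall x Tx  (clause 2, chi1=None).
    """
    referents = set()
    for n in codes:
        if not is_sentence_code(n):
            continue  # Tn m-refers only when n codes a sentence
        # Clause 1 keeps n only if the instance of chi1 is false, i.e. the
        # restriction holds of n; clause 2 keeps every instance.
        if chi1 is None or not chi1(n):
            referents.add(n)
    return referents


# Toy data (purely illustrative): codes 0..7, all coding sentences,
# of which 2 and 5 are stipulated to be theorems.
CODES = range(8)
THEOREMS = {2, 5}


def not_bew(n: int) -> bool:
    """Stands in for the restrictor  not Bew_PA(x)."""
    return n not in THEOREMS


restricted = q_referents_simple(CODES, lambda n: True, chi1=not_bew)
unrestricted = q_referents_simple(CODES, lambda n: True)
```

In this toy setting `restricted` contains exactly the stipulated theorems, mirroring the verdict for GRfn PA, while `unrestricted` contains every code, mirroring the verdict for ∀x Tx.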
Complex terms in the scope of the truth predicate are dealt with as expected. Take, for instance, (9), ∀x (x = ⌜ϕ⌝ → T¬ . x), whose normalization is ∀x (x ≠ ⌜ϕ⌝ ∨ T¬ . x). Clause 1 guarantees that (9) q-refers just to ¬ϕ, for it entails that the sentence q-refers to what T¬ . ⌜ϕ⌝ m-refers to. Likewise, clause 2 entails that ∀x T(x→ . ⌜0 = 0⌝), which is its own normalization, q-refers to whatever each T(n→ . ⌜0 = 0⌝) m-refers to. Thus, the formula q-refers just to all sentences of conditional form with 0 = 0 as consequent.
Finally, strings of quantifiers receive an adequate treatment, as the definition instantiates them all at once. For instance, ∀x∀y T(x→ . y) can be easily seen to refer to all sentences of conditional form, by clause 2, whilst (12) q-refers just to all theorems of PA, by clause 1.
Just as the definition of m-reference accounts for the self-referentiality of certain sentences delivered by the Strong Diagonal Lemma, Definition 4 implies that 'weakly' diagonalizing certain predicates also delivers self-referential sentences. 21 Let's look back at the proof of Theorem 1 in Section 2.2. Diagonalization applied to ¬Tx delivers the following fixed point, dubbed λ:
∀x (x = ⌜∀y (Diag(x, y) → ¬Ty)⌝ → ∀y (Diag(x, y) → ¬Ty))
whose normalization is
∀x (x ≠ ⌜∀y (Diag(x, y) → ¬Ty)⌝ ∨ ∀y ((¬Diag(x, y)) * ∨ ¬Ty))
By clause 1 of Definition 4, the q-referents of λ are those of
∀y ((¬Diag(⌜∀y (Diag(x, y) → ¬Ty)⌝, y)) * ∨ ¬Ty)
that is, the m-referents of ¬Tn, provided that n is the code of the result of applying the diagonalization function to ∀y (Diag(x, y) → ¬Ty), i.e. λ. In other words, λ q-refers to itself. Similarly, we can conclude that the Curry sentence κ in (4) is self-q-referential, for it is obtained by Diagonalization applied to Tx → 0 = 0.
Heck [10] argues to the contrary. They maintain that, unlike m-reference, q-reference is not a genuine kind of reference, for it cannot account for certain pieces of self-referential reasoning. As an example, they claim that 'weak' Diagonalization is not sufficient to imply the existence of a real liar sentence in Kripke's fixed-point theory of truth over the Strong Kleene evaluation scheme; rather, Strong Diagonalization is needed (cf. footnote 8). 22 The salient features of Kripke's models are that each sentence ϕ and its truth ascription T⌜ϕ⌝ receive the same truth value, and that paradoxical sentences are neither true nor false. In addition, the biconditional in the Strong Kleene evaluation scheme behaves in such a way that ϕ ↔ ψ is true only if ϕ and ψ are both true or both false; if either ϕ or ψ is neither true nor false, so is ϕ ↔ ψ. Thus, as Heck points out, the equivalence λ ↔ ¬T⌜λ⌝ in (3) delivered by Theorem 1 cannot be true in Kripke's models, on pain of triviality. If it were so, then λ and ¬T⌜λ⌝ would be either both true or both false, so λ and T⌜λ⌝ would receive different truth values in the model. From this impossibility Heck concludes that 'real' self-reference cannot be established by means of 'weak' Diagonalization, presumably because they believe that without the equivalence between λ and ¬T⌜λ⌝ there is no liar sentence. However, it can be easily shown that the way in which λ, for example, is obtained entails that its truth value must coincide with that of ¬T⌜λ⌝ in every expansion of N, classical or non-classical (and thus λ is a real liar sentence), forcing λ to be neither true nor false in Kripke's models. Q-reference is a legitimate kind of reference after all.
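The point about the Strong Kleene biconditional can be checked mechanically. The sketch below assumes the usual three-valued Strong Kleene tables and defines the biconditional as the conjunction of two material conditionals; the value names and function names are illustrative choices, not notation from the text. It then confirms that, if λ and T⌜λ⌝ share a value v, as in Kripke's models, the value of λ ↔ ¬T⌜λ⌝ is never true.

```python
TRUE, FALSE, NEITHER = "true", "false", "neither"
VALUES = (TRUE, FALSE, NEITHER)


def neg(a):
    """Strong Kleene negation."""
    return {TRUE: FALSE, FALSE: TRUE, NEITHER: NEITHER}[a]


def conj(a, b):
    """Strong Kleene conjunction: false if a conjunct is false."""
    if FALSE in (a, b):
        return FALSE
    if a == TRUE and b == TRUE:
        return TRUE
    return NEITHER


def disj(a, b):
    """Strong Kleene disjunction: true if a disjunct is true."""
    if TRUE in (a, b):
        return TRUE
    if a == FALSE and b == FALSE:
        return FALSE
    return NEITHER


def iff(a, b):
    """Biconditional as (a -> b) and (b -> a), with p -> q := not p or q."""
    return conj(disj(neg(a), b), disj(neg(b), a))


# If lambda and T<lambda> both receive the value v, then
# lambda <-> not T<lambda> receives the value iff(v, neg(v)).
liar_equivalence_values = {v: iff(v, neg(v)) for v in VALUES}
```

Under these tables the equivalence comes out false when v is true or false and neither true nor false when v is neither, which is the impossibility Heck points to; it is compatible with λ and ¬T⌜λ⌝ nonetheless receiving the same value, namely neither.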
Together with Definition 1, Definition 4 also allows us to account for the reference pattern that underlies the 2-liar cycle given by λ 1 and λ 2 in (5), which results from diagonalizing the predicate ¬T⌜Tẋ⌝. Let λ 1 be the following: Applying clause 1 and the same reasoning as before, λ 1 q-refers to T⌜λ 1⌝. Let λ 2 be this sentence. Thus, λ 2 m-refers to λ 1 and the latter q-refers to the former.
Another interesting example is the Visser-Yablo list. Diagonalizing the predicate ∀z (z > w → ¬Tx(ż/⌜w⌝)), we obtain the following fixed point: This predicate is Y(w). Each instance, Y(n), is a sentence that, by clause 1, q-refers to the result of instantiating Y(w) in each number greater than n, as expected.
Finally, although the fixed point of the predicate T¬ . x that Theorem 1 delivers does not q-refer to itself, its negation does, as in the case of the strong fixed point of this predicate in (8).
Unlike m-reference, q-reference is not closed under all valid transformations that preserve atoms or literals (i.e. atomic subformulae and their negations), not even for sentences in ADNF. Consider, for instance, ∀x (x ≠ ⌜λ⌝ ∧ Tx ∧ ¬Tx) and ∀x (x ≠ ⌜λ⌝ ∨ (Tx ∧ ¬Tx)). These sentences are both logically false, in ADNF, and contain the same literals, but while the former q-refers to every sentence, the latter only q-refers to λ.
However, since q-reference depends exclusively on the presence of certain subformulae of the form ∀v ϕ, the notion is compositional with respect to propositional connectives: ¬ϕ q-refers to whatever ϕ q-refers to, and (ϕ ∧ ψ), (ϕ ∨ ψ), and (ϕ → ψ) q-refer to what either ϕ or ψ q-refer to. Note also that q-reference is trivially closed under renaming of variables and dummy quantifiers.
More importantly, the fact that q-reference is defined in terms of normalization guarantees that the notion is closed under the following propositional transformations: the addition and elimination of double negations, the commutativity of disjunction and conjunction, contraposition of the conditional, the distributivity of conjunction over disjunction and vice versa, the De Morgan laws, and the interdefinability of connectives and quantifiers. Furthermore, q-reference is closed under Leibniz's Law, and the addition and elimination of tautological antecedents: if ψ is a formula of L that is true of every n-tuple of natural numbers, ∀v ϕ and ∀v (ψ → ϕ) q-refer to the same sentences.
Closure under these and other transformations allows Definition 4 to give the intuitively correct verdict with respect to existentially quantified statements. For instance, as ∃x (Bew PA (x) ∧ ¬Tx) is normalized into ¬∀x ((¬Bew PA (x)) * ∨ Tx), it q-refers to the same expressions as ∀x (Bew PA (x) → Tx), i.e. to all theorems of PA, as desired. In general, statements of the form ∃v ϕ q-refer to the same sentences as ∀v ¬ϕ. This allows the given definition of q-reference to account for the self-referentiality and other reference patterns of fixed points delivered by the 'existential' proof of Theorem 1, just as in the case of the 'universal' proof (cf. Section 2.2).

Direct Reference, Reference, and Reference Patterns
In the previous section a completely general account of reference by quantification was provided, extending (an alethic version of) clause 2 of the incomplete notion of reference to all sentences of the language. We are finally in a position to define alethic reference simpliciter, as the disjunction of m- and q-reference. However, I call it "direct reference" or "d-reference", for short, as an indirect reference relation is also possible and relevant to our purposes.
Definition 5 allows us to characterize many reference patterns, including the self-referentiality of both weak and strong liars, i.e. λ and ¬Tl, and the unfoundedness of the Visser-Yablo sentences, as will be seen later. Nonetheless, there are other cases of intuitive self-reference, such as the circularity of cycles, and other problematic patterns, such as the unfoundedness of chains, which are not accounted for. Consider, for instance, sentences λ 1 and λ 2 in the 2-liar cycle in (5). They directly refer only to one another. Intuitively, however, they somehow refer to themselves as well, albeit indirectly. Otherwise, we would get a semantic paradox without self-reference on the cheap. Furthermore, we can prove the existence of ω-chains, that is, sequences of sentences, each of which directly refers to the expression coming next.

Proposition 2 (ω-chains) For every formula ϕ(x, v) there is an infinite sequence of distinct terms t 0 , . . . , t n , . . . s.t., for every n ∈ ω, the following is provable in PA:
See Picollo [21] for a proof. Note that, since the terms t 0 , . . . , t n , . . . are different, so are the sentences ϕ(t 0 ), ϕ(t 1 ), . . . , ϕ(t n ), . . . and, therefore, the numbers those terms denote. For example, we can obtain an ω-chain for the predicate Tx, a 'truth-teller' chain of sentences Tt 0 , Tt 1 , . . . , Tt n , . . . , such that the following are provable in PA: Each Tt n d-refers just to Tt n+1 , but intuitively each of them also refers, indirectly, to all the ones coming later on the list. A more general notion of reference, that is, the transitive closure of d-reference, is therefore necessary. To define such a notion, we make use of the following definition.
Definition 6 (Chain of reference) A chain of reference is a (possibly infinite) sequence of sentences s.t. each sentence in the sequence d-refers to the one coming after, if any.
Definition 7 (Reference) Let ϕ, ψ be sentences. ϕ refers to ψ iff there's a chain of reference starting with ϕ and ending with ψ.
It easily follows from the definition of reference that λ 1 and λ 2 refer to themselves, and that each sentence in the truth-teller chain in (14) refers to every expression that comes up later on the list. Moreover, we can employ the notions just introduced to define salient reference patterns, such as the following.
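Since reference is the transitive closure of d-reference, it can be computed by graph search whenever the d-reference relation is available as a finite graph. The sketch below assumes such a graph, given as a dictionary from sentence names to the names of the sentences they d-refer to; the names are hypothetical labels, and the truth-teller chain is truncated after three sentences purely for illustration. It also assumes that a chain witnessing reference involves at least one d-reference step, so that a sentence does not trivially refer to itself.

```python
from collections import deque
from typing import Dict, Set


def referents(d_ref: Dict[str, Set[str]], phi: str) -> Set[str]:
    """Sentences reachable from phi by one or more d-reference steps."""
    seen: Set[str] = set()
    queue = deque(d_ref.get(phi, ()))
    while queue:
        s = queue.popleft()
        if s in seen:
            continue
        seen.add(s)
        queue.extend(d_ref.get(s, ()))
    return seen


def is_self_referential(d_ref: Dict[str, Set[str]], phi: str) -> bool:
    """A sentence is self-referential iff it refers to itself."""
    return phi in referents(d_ref, phi)


# Illustrative d-reference graph (labels are assumptions of this sketch).
D_REF = {
    "lambda1": {"lambda2"},   # lambda1 q-refers to lambda2
    "lambda2": {"lambda1"},   # lambda2 m-refers to lambda1
    "Tt0": {"Tt1"},           # truncated truth-teller chain
    "Tt1": {"Tt2"},
    "Tt2": set(),             # truncation point; the real chain continues
}
```

On this graph both members of the 2-liar cycle come out self-referential, whereas `Tt0` refers to `Tt1` and `Tt2` but not to itself.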
Definition 8 (Self-reference) A sentence is self-referential iff it refers to itself.
Definition 9 (Well-foundedness) A sentence ϕ is well-founded iff all chains of reference starting with ϕ are finite. Otherwise, we say that ϕ is unfounded.
Sentences that don't d-refer to any expression, e.g. every theorem of PA, are well-founded. Moreover, if a sentence d-refers only to well-founded expressions, it is also well-founded, for it extends the chains of reference of the latter only by one sentence. For example, ∀x (Bew PA (x) → Tx) is well-founded. On the other hand, every self-referential expression is obviously unfounded. But there are also unfounded sentences that don't refer to themselves, such as the Visser-Yablo sentences in (6) and the truth-teller chain in (14). In both cases, sentences on the list refer to all the expressions occurring later on, but never to other statements that occur before them, thus never to themselves.
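For a finite d-reference graph, well-foundedness reduces to the absence of reachable cycles: an infinite chain of reference through finitely many sentences must revisit one of them. The sketch below relies on that assumption and on a hypothetical dictionary encoding of d-reference. It therefore cannot represent the Visser-Yablo sentences or the truth-teller chain, whose unfoundedness involves infinitely many distinct sentences and no cycle.

```python
from typing import Dict, Set


def is_well_founded(d_ref: Dict[str, Set[str]], phi: str) -> bool:
    """True iff no cycle is reachable from phi in a finite d-reference graph."""
    IN_PROGRESS, DONE = 1, 2
    state: Dict[str, int] = {}

    def visit(s: str) -> bool:
        state[s] = IN_PROGRESS
        for t in d_ref.get(s, ()):
            if state.get(t) == IN_PROGRESS:
                return False          # back edge: a cycle, hence an infinite chain
            if t not in state and not visit(t):
                return False
        state[s] = DONE               # every chain from s is finite
        return True

    return visit(phi)


# Illustrative graph (labels are assumptions of this sketch).
D_REF = {
    "GRfn": {"thm1", "thm2"},   # d-refers only to T-free theorems
    "thm1": set(),
    "thm2": set(),
    "lambda": {"lambda"},       # a liar sentence d-referring to itself
    "lambda1": {"lambda2"},     # 2-liar cycle
    "lambda2": {"lambda1"},
}
```

Here `GRfn` is classified as well-founded, since it only extends the (empty) chains of the sentences it d-refers to by one step, while `lambda`, `lambda1`, and `lambda2` are classified as unfounded.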
If the given definitions are correct, the Visser-Yablo sequence shows that the self-reference diagnosis of semantic paradoxes is mistaken after all: there are non-self-referential (ω-)paradoxes, at least if self-reference is taken, as we have been assuming, to be a property of sentences. But this doesn't mean there is no use for the new notions of reference in the quest for the root of paradox or in the formulation of interesting truth theories. Although restricting our truth principles to non-self-referential expressions will not always lead to sound truth systems, restricting them to well-founded sentences will, as I show in the companion piece, [22]. This suggests that it is not self-reference but unfoundedness which is behind the semantic paradoxes.

Conclusions
Let's recapitulate. We have seen that the two conceptions of reference and self-reference that are commonly deployed in the literature on semantic paradox and incompleteness are defective or incomplete. Whereas the naïve conception is outright trivial, it was not clear how to turn the incomplete notion, which seems to be somehow on the right track, into a full-fledged account. What's more, a great deal of scepticism surrounded this project. In particular, it has so far been unclear how a notion of self-reference along those lines could account for the self-referential character of 'weak' fixed points, e.g. of λ.
Throughout this paper I have provided a precise account of reference and self-reference via truth that supplements the incomplete notion, extending it to all sentences of L T . It is my hope that this account dispels some of the doubts affecting the notions of reference and self-reference for formal languages, at least in the context of truth. The notion of reference by quantification introduced in Section 3.3 explains the self-referentiality of many expressions obtained by Diagonalization, e.g. λ, in terms of the way the fixed points are constructed. There is consequently no need to appeal to a naïve and trivializing account of self-reference according to which it is the equivalence this result yields, e.g. between λ and ¬T⌜λ⌝, that is responsible for self-reference.
More generally, unlike the naïve conception of reference, the definitions of m- and q-reference given in Sections 3.2 and 3.3 account for the reference patterns underlying many expressions, including the liar and also liar cycles and Visser-Yablo sentences, without invoking any provable or true equivalences, but attending to the syntactic structure of these expressions. The new notions seem to answer Milne's worries satisfactorily and, hopefully, help dissipate to some extent the scepticism over reference and self-reference aired by Leitgeb and Cook.