How to approximate fuzzy sets: mind-changes and the Ershov Hierarchy

Computability theorists have introduced multiple hierarchies to measure the complexity of sets of natural numbers. The Kleene Hierarchy classifies sets according to the first-order complexity of their defining formulas. The Ershov Hierarchy classifies limit computable sets with respect to the number of mistakes that are needed to approximate them. Biacino and Gerla extended the Kleene Hierarchy to the realm of fuzzy sets, whose membership functions range in a complete lattice. In this paper, we combine the Ershov Hierarchy and fuzzy set theory, by introducing and investigating the Fuzzy Ershov Hierarchy.


Introduction
Suppose you wish to know exactly whether a certain object satisfies some property P. Two sorts of difficulties may arise:
1. P may be graded, that is, some objects satisfy P only up to some degree;
2. Membership in P may be sharply defined, but knowing whether an arbitrary object satisfies P may exceed the capabilities of any computer (or, equivalently, of any human armed with pencil, paper, and endless patience).
The first case is studied in fuzzy mathematics; the second one in computability theory. In this work, we'll discuss a natural way of merging these approaches.
Crisp properties on a given domain D (i.e., properties whose membership functions range in the set {0, 1}) can be naturally identified with subsets of D. By adopting this perspective, one may regard classical computability theory as the study of the complexity of crisp properties on the set ω of the natural numbers: e.g., "being even" and "being the code of a Turing machine which halts on a blank tape" are examples of, respectively, a decidable crisp property and an undecidable one.
Computability theorists have introduced multiple hierarchies to measure the complexity of crisp properties. Two such hierarchies will be relevant for the present paper. The Kleene Hierarchy classifies subsets of ω according to the first-order complexity of their defining formulas within arithmetic. The Ershov Hierarchy concentrates on an important initial segment of the Kleene Hierarchy, that of $\Delta^0_2$ sets (which coincide with the sets that are computable in the limit), by classifying such sets with respect to the number of mistakes that are needed to approximate them.
Fuzzy sets, introduced by Zadeh (1965) and later developed into a broad area of research, allow one to study graded properties mathematically, such as those properties with blurry boundaries, and to extend the scope of logic to approximate reasoning.
It is natural to ask how to introduce computability theory within fuzzy mathematics. A first approach is to define fuzzy algorithms (as in, e.g., Bedregal and Figueira (2006), Santos (1970), Wiedermann (2002), and Zadeh (1968)), and then rebuild computability theory by permitting fuzzy computations. A parallel approach is to maintain ordinary Turing machines and just adopt them to calibrate the complexity of fuzzy sets. After all, well-established computability-theoretic hierarchies could be extended to the realm of fuzzy objects. This is the case for the Kleene Hierarchy, which has been extended to fuzzy sets by Biacino and Gerla (1989), see also Gerla (2001), Harkleroad (1984), and Harkleroad (1988).
In this paper, we introduce and investigate the Fuzzy Ershov Hierarchy (FEH). That is, we focus on the complexity of approximating fuzzy objects which belong to the class $\Delta^0_2$. The key idea for evaluating this complexity is that of a mind-change, which is borrowed from Ershov (1968a, 1968b, 1970) and has been intensively studied (see, e.g., Bazhenov et al. (2020), Cooper et al. (1991), Downey and Greenberg (2020), and Stephan et al. (2009)). Intuitively, mind-changes allow one to improve knowledge of a given property P through time. For example, suppose that P is the following property: "being a theorem of Peano Arithmetic". Such a property is certainly crisp but, as is well-known, is also undecidable; that is to say, there can be no algorithm which, for all arithmetic formulas ϕ, can determine, in finitely many steps, whether Peano Arithmetic proves ϕ or not. Yet, an ideal agent A, which is allowed to change its mind, can eventually achieve knowledge of P as follows:
• At first, A believes that only the axioms of Peano Arithmetic satisfy P;
• Next, A lists, one-by-one, all the consequences of such axioms;
• Finally, whenever a formula ϕ appears in the above list, A changes its mind about the theoremhood of ϕ, by declaring that ϕ does satisfy P.
Hence, following this procedure, A can gradually increase knowledge of P in such a way that, in the limit, A will correctly guess the theoremhood of all arithmetic formulas, and for any such formula at most one mind-change will be required. In the classical setting, an approximation to a set A changes its mind on a given input x by switching its guess on whether x belongs to A or not. Moving to fuzzy sets, mind-changes will be formalized by changes in the monotonicity of approximating functions. We will prove that, by allowing more and more mind-changes, we are able to capture larger and larger sub-classes of $\Delta^0_2$ fuzzy sets. In particular, it will follow that there are fuzzy sets which cannot be approximated only from above or only from below: they require approximations which oscillate "up and down" on the membership degree of x, for some inputs x.
The paper is arranged as follows. In Sect. 2, we recall preliminaries concerning fuzzy sets, effective reals, and the Ershov Hierarchy. In Sect. 3, we introduce FEH, and we prove some of the main results of the paper. First, the hierarchy does not collapse (Proposition 5). Second, in analogy with the classical case, sets lying at the so-called finite levels of FEH can be represented as Boolean combinations of fuzzy sets belonging to the first level, i.e., c.e. fuzzy sets (Theorem 6). Third, contrary to the classical case, FEH does not exhaust the class of all $\Delta^0_2$ fuzzy sets (Proposition 8). In Sect. 4, we investigate two natural ways of broadening FEH. First, we refine the proposed hierarchy, by keeping track of all updates needed to approximate a $\Delta^0_2$ fuzzy set, rather than focusing exclusively on those updates which determine a change of monotonicity in the approximating function. We show that such a refined hierarchy is quite wild (Theorem 10). Second, in analogy with the Classical Ershov Hierarchy, we extend FEH to the transfinite. Yet, in sharp contrast with the classical case, we note that even including all transfinite levels, we still do not exhaust all $\Delta^0_2$ fuzzy sets.

Preliminaries
We assume that the reader is familiar with the basic notions of computability theory. For the background, we refer to the monographs (Rogers, 1967; Soare, 2016). For convenience, we include here the definitions of the most basic notions:
• A function f on the natural numbers is computable if there is a Turing machine $M_f$ so that, for all natural numbers x and y, f(x) = y holds if and only if $M_f$, having a suitable coding of x printed on its input tape, halts after finitely many steps with a suitable coding of y printed on its output tape;
• A (crisp) set A ⊆ ω is computable if its characteristic function, defined by
$$\chi_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{otherwise}, \end{cases}$$
is computable;
• A (crisp) set A ⊆ ω is computably enumerable (c.e.) if it is the image of a computable function. Equivalently, c.e. sets can be seen as those sets which are the domain of some computable function.
Note that, as is now customary in computability theory, this paper uses the term computably enumerable (or c.e.) in place of recursively enumerable. For a set X, by |X| we denote the cardinality of X. The preliminaries on fuzzy sets mainly follow Gerla (2001). As usual, one fixes an effective bijection ν : Q → ω. This convention allows one to transfer familiar computability-theoretic notions to the rationals: for example, a crisp set X ⊆ Q is computable iff ν(X) is a computable subset of ω.

Fuzzy subsets
Let L be a complete lattice. A fuzzy subset (or an L-subset) of ω is an arbitrary function A : ω → L. In this paper, for the sake of simplicity, we consider the case when L is equal to the real interval [0, 1].
As mentioned in the introduction, a fundamental tool for classifying the complexity of crisp subsets of ω is provided by the Kleene Arithmetical Hierarchy (Kleene, 1943) (see, e.g., Chapter 4 in Soare (2016) for a detailed discussion). Biacino and Gerla (1989) extended the Kleene Hierarchy to fuzzy subsets. In our paper, we work only with fuzzy subsets belonging to the levels $\Sigma^0_1$, $\Pi^0_1$, and $\Delta^0_2$ of the Kleene Hierarchy. Hence, we give formal definitions only for these levels. For more details, the reader is referred to Biacino and Gerla (1989) and Sect. 11.5 in Gerla (2001). By $[0, 1]_{\mathbb{Q}}$ we denote the set of all rational numbers q such that 0 ≤ q ≤ 1.
Definition 1 (Biacino & Gerla, 1989; see also Sect. 11.2 in Gerla (2001)) A fuzzy set A is computably enumerable (or belongs to the class $\Sigma^0_1$) if there is a computable function f : ω × ω → $[0, 1]_{\mathbb{Q}}$ such that, for all x ∈ ω, we have:
$$f(x, s) \le f(x, s + 1) \ \text{ for all } s \in \omega, \qquad \lim_{s \to \infty} f(x, s) = A(x).$$
We say that such a function f is a $\Sigma^0_1$-approximation of the fuzzy set A.
Note that without loss of generality, one can always assume that in the definition above, f (x, 0) equals 0. Hence, c.e. fuzzy sets may intuitively be regarded as fuzzy sets which can be approximated "from below", in the sense that approximations to c.e. fuzzy sets can only increase over time.
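The normalization f(x, 0) = 0 can be illustrated on a finite prefix of an approximation. The following is a minimal sketch, assuming the values f(x, 0), f(x, 1), ... for one fixed x are given as Python `Fraction`s; the helper name `from_below` is hypothetical. It shifts the trace by one stage, starts it at 0, and takes running maxima (which is a no-op on a trace that is already non-decreasing), so the supremum of the trace is unchanged.

```python
from fractions import Fraction
from itertools import accumulate


def from_below(values):
    """Return a non-decreasing trace that starts at 0 and has the same
    supremum as `values` (a finite prefix of f(x, 0), f(x, 1), ...)."""
    # Prepend the initial guess 0, then replace each value by the running maximum.
    return list(accumulate([Fraction(0), *values], max))


# Hypothetical prefix of an approximation that is not monotone.
trace = [Fraction(1, 4), Fraction(1, 8), Fraction(1, 2), Fraction(3, 8)]
monotone = from_below(trace)
print(monotone)
```

On a trace that is already non-decreasing, the function only adds the initial stage with value 0, which is the sense in which the assumption costs nothing.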
If A and B are fuzzy sets, then one can define set-theoretic operations on them:
$$(A \cup B)(x) = \max\{A(x), B(x)\}, \qquad (A \cap B)(x) = \min\{A(x), B(x)\}, \qquad \overline{A}(x) = 1 - A(x).$$
A fuzzy set A is co-computably enumerable (or belongs to the class $\Pi^0_1$) if its complement $\overline{A}$ is c.e. Equivalently (see Theorem 5.2 in Gerla (2001), Chap. 11), A is co-c.e. if and only if there is a computable function f : ω × ω → $[0, 1]_{\mathbb{Q}}$ such that, for all x ∈ ω, we have:
$$f(x, s) \ge f(x, s + 1) \ \text{ for all } s \in \omega, \qquad \lim_{s \to \infty} f(x, s) = A(x).$$
In the $\Pi^0_1$ case, we may assume that f(x, 0) = 1, for all x. So, co-c.e. fuzzy sets may be regarded as fuzzy sets which can be approximated "from above".
Finally, the main objects of study in this paper are $\Delta^0_2$ fuzzy sets. A fuzzy set A belongs to the class $\Delta^0_2$ if A lies in both classes $\Sigma^0_2$ and $\Pi^0_2$ of the Kleene Hierarchy. In this paper, we adopt the following equivalent definition (see Proposition 5.4 in Gerla (2001), Chap. 11).
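The link between complements and approximations "from above" can be made concrete with a short sketch. It assumes the standard pointwise operations on [0, 1]-valued fuzzy sets (maximum for union, minimum for intersection, 1 − A(x) for complement), which is the form suggested by the expressions max{1 − h(x, s)} used later in the text; all function names are hypothetical.

```python
from fractions import Fraction


def union(a, b):
    """Pointwise maximum of two membership functions."""
    return lambda x: max(a(x), b(x))


def intersection(a, b):
    """Pointwise minimum of two membership functions."""
    return lambda x: min(a(x), b(x))


def complement(a):
    """Pointwise complement 1 - A(x)."""
    return lambda x: 1 - a(x)


def complement_trace(values):
    """Complement a finite approximation trace value by value."""
    return [1 - v for v in values]


# Two toy fuzzy subsets of the natural numbers (assumed for illustration).
A = lambda x: Fraction(x, x + 1)
B = lambda x: Fraction(1, 2)
D = intersection(A, complement(B))

# A non-decreasing trace (approximation from below) ...
below = [Fraction(0), Fraction(1, 4), Fraction(1, 2)]
# ... becomes a non-increasing trace starting at 1 (approximation from above).
above = complement_trace(below)
print(above)
```

Complementing stage by stage turns a non-decreasing trace starting at 0 into a non-increasing trace starting at 1, which matches the normalization f(x, 0) = 1 for co-c.e. fuzzy sets.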

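Definition 2 A fuzzy set A belongs to the class $\Delta^0_2$ if there is a computable function f : ω × ω → $[0, 1]_{\mathbb{Q}}$ such that, for all x ∈ ω,
$$A(x) = \lim_{s \to \infty} f(x, s).$$
We call such a function f(x, s) a $\Delta^0_2$-approximation of the fuzzy set A.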

Effective reals
Here we briefly discuss some simple results which connect fuzzy subsets of ω with effectively approximable reals. Informally speaking, a real α ∈ R is effectively approximable if there exists an algorithm which provides an approximation of α (typically, one can view an approximation as a sequence of rational numbers). One of the first mathematical formalizations of this concept was given by Turing (1936): he defined computable real numbers as those reals that have a binary expansion which can be computed by a Turing machine. Computability theorists also consider weaker notions of effective approximability: for example, left-c.e. reals are precisely those reals α such that the set of all rationals q < α can be enumerated by a Turing machine. We refer to Chapter 5 in Downey and Hirschfeldt (2010) for further details. Formally, a real α is left-c.e. if the set {q ∈ Q : q < α} is c.e.; symmetrically, α is right-c.e. if the set {q ∈ Q : q > α} is c.e. By working with these definitions, it is not hard to prove the following result.

Proposition 1 Let A be a fuzzy set.
(a) A is c.e. if and only if the reals A(k), k ∈ ω, are uniformly left-c.e.
(b) A is co-c.e. if and only if the reals A(k), k ∈ ω, are uniformly right-c.e.
A real α is $\Delta^0_2$ if there is a computable sequence $(q_s)_{s \in \omega}$ of rationals such that $\alpha = \lim_{s \to \infty} q_s$ (see, e.g., Theorem 5.1.3 in Downey and Hirschfeldt (2010)). We also observe the following, which is an immediate consequence of the definitions:
Proposition 2 A fuzzy set A is $\Delta^0_2$ if and only if the reals A(k), k ∈ ω, are uniformly $\Delta^0_2$, i.e., there is a computable sequence $(q_{k,s})_{k,s \in \omega}$ of rationals such that $A(k) = \lim_{s \to \infty} q_{k,s}$, for all k.

The Classical Ershov Hierarchy
We give a few preliminaries on the Ershov Hierarchy (Ershov, 1968a, 1968b, 1970); to distinguish it from the fuzzy analogue introduced below, we refer to this hierarchy as the Classical Ershov Hierarchy. For the sake of simplicity, here we discuss only the finite levels of the hierarchy (these finite levels are also called the Difference Hierarchy in the literature). In this section, all subsets of ω are crisp.
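Definition 3 Let n be a non-zero natural number. A set A ⊆ ω is n-computably enumerable (or n-c.e., or belongs to the class $\Sigma^{-1}_n$) if there is a computable function f : ω × ω → {0, 1} such that, for all x ∈ ω:
(a) $\chi_A(x) = \lim_{s \to \infty} f(x, s)$;
(b) f(x, 0) = 0;
(c) $|\{s \in \omega : f(x, s + 1) \neq f(x, s)\}| \le n$.
A set A is co-n-computably enumerable (or co-n-c.e., or belongs to the class $\Pi^{-1}_n$) if its complement $\overline{A}$ is n-c.e.
Remark 1 It is common to refer to the numbers labeled by the variable s in the last definition as stages. This reflects the intuition that f is a procedure to approximate A through time (that is, by stages). So, in plain terms, a set A is n-c.e. if one is eventually able to achieve complete knowledge of A by some stage-by-stage procedure which, for each number x, has at most n many mind-changes as to whether the property "x belongs to A" holds or not.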
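The stage-by-stage reading in Remark 1 can be sketched on a finite prefix of guesses. The snippet below is only an illustration under the usual convention that the first guess is 0 and that a mind-change is a stage s with f(x, s) ≠ f(x, s + 1); the function names are hypothetical, and a finite prefix can of course only refute a bound, never confirm it for the whole infinite approximation.

```python
def count_mind_changes(guesses):
    """Number of stages s with guesses[s] != guesses[s + 1]."""
    return sum(a != b for a, b in zip(guesses, guesses[1:]))


def respects_n_ce_bound(guesses, n):
    """Check a finite 0/1 trace: first guess is 0, at most n mind-changes."""
    return guesses[0] == 0 and count_mind_changes(guesses) <= n


# Hypothetical guesses f(x, 0), f(x, 1), ... for one fixed input x.
trace = [0, 0, 1, 1, 0, 0, 1, 1]
print(count_mind_changes(trace))
print(respects_n_ce_bound(trace, 3), respects_n_ce_bound(trace, 2))
```

Here the trace switches 0 → 1 → 0 → 1, so it is compatible with a 3-c.e. approximation but not with a 2-c.e. one.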
Historically, the notion of n-c.e. sets was introduced by Putnam (1965) and Gold (1965). Note that $\Sigma^{-1}_1$ is exactly the class of c.e. sets. Ershov (1968a) proved that, for each n ≥ 1, there exists an n-c.e. set $S_n$ such that every n-c.e. set A is many-one reducible to $S_n$ (i.e., there exists a computable function f so that $x \in A \Leftrightarrow f(x) \in S_n$, for all x ∈ ω). In addition, $S_n$ does not belong to $\Pi^{-1}_n$. In particular, this implies that the Classical Ershov Hierarchy does not collapse.
Sets from the class $\Sigma^{-1}_n$ can be represented as Boolean combinations of c.e. sets:
Theorem 3 (Ershov, 1968a)
We refer the reader to the survey (Stephan et al., 2009) for more details on the Classical Ershov Hierarchy.

Fuzzy Ershov Hierarchy
In this section, we extend the classical Difference Hierarchy to the class of fuzzy subsets of ω (see Definition 5 below). We establish some initial properties of this hierarchy: the hierarchy does not collapse (Sect. 3.1); it is connected to the Boolean combinations of c.e. fuzzy sets (Sect. 3.2); the introduced levels of the hierarchy do not exhaust all $\Delta^0_2$ fuzzy sets (Sect. 3.3).
As discussed in Remark 1, the Difference Hierarchy classifies $\Delta^0_2$ sets according to the number of mind-changes that are needed to reliably approximate them. Now, note that mind-changes are naturally associated with changes of monotonicity of the function f. Indeed, suppose that, for some input x and stage s, we have the following mind-change: f(x, s) = 0 but f(x, s + 1) = 1 so that, on this input, f increases its output from 0 to 1. Then, in order to witness some further mind-change, there must exist a least stage t > s at which f, on the same input, decreases from 1 to 0, thus switching its monotonicity. In order to formally define the Fuzzy Ershov Hierarchy (FEH), we will rely on this connection between mind-changes and changes of monotonicity of the approximating function.
We begin by illustrating the intuition behind Definition 5 with the following example. A $\Delta^0_2$ fuzzy set A is called 3-computably enumerable if it possesses a $\Delta^0_2$-approximation f(x, s) which changes its mind at most two times. So, for an element x ∈ ω, the worst-case behavior looks like this:
• First, our approximation (non-strictly) increases, i.e., there is a stage $s_1$ such that f(x, s) ≤ f(x, s + 1), for all s < $s_1$.
• Second, the approximation starts to decrease until some stage $s_2 > s_1$, i.e., f(x, s) ≥ f(x, s + 1), for all $s_1$ ≤ s < $s_2$.
• Then the final change of mind happens: the approximation will forever increase, i.e., f(x, s) ≤ f(x, s + 1), for all s ≥ $s_2$.
In order to make this idea formal, we introduce mind-change functions, which "track down" the described mind-changes.

Definition 4 Let f be a total function from
Notice the following: if a function f is computable, then both mind-change functions associated with f are also computable. Now we are ready to give the main definition.

Definition 5 Let n be a non-zero natural number. A fuzzy set
Remark 2 Notice that the third condition of Definition 5 contains a modification: the upper bound n is changed to n − 1 (cf. the third condition of Definition 3). This modification is of a technical nature. We illustrate the reason behind the modification by an example. Let A be a crisp 3-c.e. set, and let x be a natural number. For simplicity, we assume that for this x, the function f from Definition 3 behaves as follows: So, the upper bound n = 3 is achieved by the number x: Now we calculate the corresponding mind-change function: The example illustrates the following: if f(x, ·) changes its values at most n times, then the corresponding mind-change function $m_f(x, \cdot)$ can change its values at most n − 1 times.
Since we want any crisp n-c.e. set to be also a fuzzy n-c.e. set, we have introduced the discussed modification in Definition 5.
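The count in Remark 2 can be checked mechanically on a finite trace. The sketch below is a hedged illustration, not the formal definition: it assumes that the mind-change tracker starts at +1, switches to −1 on a strict decrease and back to +1 on a strict increase, and keeps its value when the approximation does not move. The names `mind_change_trace` and `count_changes` are hypothetical.

```python
from fractions import Fraction


def mind_change_trace(values, initial=1):
    """Track the current direction of a trace: +1 after a strict increase,
    -1 after a strict decrease, unchanged when the value does not move.
    The starting direction `initial` is an assumption of this sketch."""
    directions = [initial]
    for prev, cur in zip(values, values[1:]):
        if cur > prev:
            directions.append(1)
        elif cur < prev:
            directions.append(-1)
        else:
            directions.append(directions[-1])
    return directions


def count_changes(seq):
    """Number of positions s with seq[s] != seq[s + 1]."""
    return sum(a != b for a, b in zip(seq, seq[1:]))


# A crisp trace with three value changes: 0 -> 1 -> 0 -> 1.
crisp = [0, 1, 1, 0, 0, 1]
print(count_changes(crisp), count_changes(mind_change_trace(crisp)))

# A fuzzy trace that goes up, then down, then up again.
fuzzy = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1, 4),
         Fraction(1, 8), Fraction(3, 8), Fraction(1)]
print(count_changes(fuzzy), count_changes(mind_change_trace(fuzzy)))
```

For the crisp trace, three value changes produce only two direction changes, which is the n versus n − 1 shift discussed in the remark. The fuzzy trace has six value updates but still only two direction changes, so it matches the up-down-up pattern of a 3-c.e. approximation.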
Note that 1-c.e. fuzzy sets are precisely the c.e. fuzzy sets from Definition 1. In addition, the following fact is immediate.

The hierarchy does not collapse
In order to show the non-collapse of the hierarchy, it is sufficient to prove the following:

Proposition 5 Let A be a crisp subset of ω. Then A is n-c.e. in the Classical Ershov Hierarchy if and only if A is n-c.e. in FEH. A similar fact is true for co-n-c.e. sets.
Indeed, since the Classical Difference Hierarchy does not collapse, Proposition 5 implies that our hierarchy is also non-collapsing.

Proof of Proposition 5 (⇒).
Suppose that A is n-c.e. in the classical sense. We fix a computable function f : ω² → {0, 1} satisfying the conditions from Definition 3. It is clear that f(x, s) is a $\Delta^0_2$-approximation of A, treated as a $\Delta^0_2$ fuzzy set. For an element x ∈ ω, consider all stages $s_1 < s_2 < \dots < s_k$ (note that k ≤ n) such that $f(x, s_i) \neq f(x, s_i + 1)$. A straightforward analysis shows the following: This implies that |{s : For an element x ∈ ω, consider all stages $s_1 < s_2 < \dots < s_k$ such that $g(x, s_i) \neq g(x, s_i + 1)$. For i ≤ k, one can show the following: In turn, this implies k − 1 ≤ |{s : and the function g(x, s) witnesses that the set A is n-c.e. in the classical sense.

Boolean combinations of fuzzy c.e. sets
We show that, similarly to the Classical Ershov Hierarchy (Theorem 3), n-c.e. fuzzy sets admit natural presentations via Boolean combinations of c.e. fuzzy sets.
Proof (⇒). Let f(x, s) be a $\Delta^0_2$-approximation which witnesses the fact that C is n-c.e. We define the desired c.e. fuzzy sets $A_i$ and $B_i$ via their $\Sigma^0_1$-approximations $h_{A_i}$ and $h_{B_i}$ (in the sense of Definition 1), respectively.
The intuition behind these c.e. sets is as follows. For an element x ∈ ω, we split The function decreases on the rest of the intervals.
• The approximation h A 1 of the set A 1 looks like this: it copies f (x, ·) on the interval [0, a 0 ), and then stabilizes, Formally speaking, for a non-zero i ≤ k + 1, we define: It is not hard to see that these approximations induce c.e. fuzzy sets. In addition, if Again, D(x) = C(x). We deduce that the fuzzy sets C and D are equal.
(⇐). Let D be a $\Delta^0_2$ fuzzy set defined via the approximation $h_D$ from (1). We prove that this approximation $h_D$ witnesses the fact that D is n-c.e.
First, we note the following easy observation (it follows from computable enumerability of fuzzy sets A i and B i ): for all s ≥ s 0 .
An informal intuition concerning further proof is as follows. Every (approximation of the) real (A i ∩ B i )(x) can be treated as a "hill": first we go up, copying the function s 0 ), we can only go down. Coming back to the whole picture of h D : whenever the mind-change function m h D (x, ·) changes from +1 to −1, it happens because we encountered the top of one of the "hills". At a stage s, consider the following sets: implies that X s ⊆ X s+1 for every s. In addition, X 0 = ∅.
It is not hard to deduce the following equation: Note that for a fixed non-empty set Z, the function $\max\{1 - h_{B_i}(x, s) : i \in Z\}$ is non-increasing, and $\max\{h_{A_i}(x, s) : i \in Z\}$ is non-decreasing. Suppose that $m_{h_D}(x, s) = 1$ and $m_{h_D}(x, s + 1) = -1$. Choose the greatest s′ < s such that either s′ = 0, or s′ > 0 and $m_{h_D}(x, s') = -1$. Towards a contradiction, assume that $X_{s+1} = X_s$.
Then on one hand, we have which contradicts the fact that m h D (x, s + 1) = −1.
On the other hand, every t such that s′ < t ≤ s satisfies We obtain a contradiction. Therefore, $X_{s+1} \neq X_s$. We deduce that for each stage s with $m_{h_D}(x, s) = 1$ and $m_{h_D}(x, s + 1) = -1$, at least one new element is added to the growing set $X = \bigcup_{t \in \omega} X_t$.
Suppose that n = 2k + 2. Then one can show that |X | ≤ k + 1. We notice the following: if |X | is less than k + 1, then the number of monotonicity breaks (of the function m h D (x, ·)) will be strictly less than the corresponding number for the case |X | = k + 1. Hence, one can consider only the case when |X | = k + 1.
If |X| = k + 1, then there is a stage s* such that for all s ≥ s*, we have $h_D(x, s) = \max\{1 - h_{B_i}(x, s) : i \in X\}$, and this function can only decrease. A straightforward combinatorial argument shows that $|\{s \in \omega : m_{h_D}(x, s + 1) \neq m_{h_D}(x, s)\}| \le 2k + 1$.
If n = 2k + 1, then |X | ≤ k. An argument similar to the one above shows that one can consider only the case when |X | equals k.
If |X| = k, then there is a stage s* such that for s ≥ s*, we have One can show that in this case, |{s :
Corollary 7 Every finite Boolean combination of c.e. fuzzy sets is an n-c.e. set, for some n ≥ 1.

The introduced hierarchy is not enough
Here we show that the introduced levels of FEH do not exhaust the class of all $\Delta^0_2$ fuzzy subsets of ω.
Proposition 8 There exists a $\Delta^0_2$ fuzzy set A such that for any $\Delta^0_2$-approximation f(x, s) of A, the sequence $(m_f(0, s))_{s \in \omega}$ diverges when s tends to infinity. In particular, A is not n-c.e., for all n ≥ 1.
Proof Choose an arbitrary $\Delta^0_2$ real α which is neither left-c.e. nor right-c.e. (see, e.g., Theorem 5.1.10 in Downey and Hirschfeldt (2010) for an example of such a real). The desired fuzzy set A is defined as follows: put A(k) = α, for all k ∈ ω. Since α is $\Delta^0_2$, Proposition 2 implies that the set A is $\Delta^0_2$. Towards a contradiction, assume that f(x, s) is a $\Delta^0_2$-approximation of A such that the sequence $(m_f(0, s))_{s \in \omega}$ converges. There are two possible cases.

Broadening the Fuzzy Ershov Hierarchy
In this section, we investigate two natural options for, first, refining the FEH, and, secondly, extending the finite levels of the hierarchy.

Counting updates
We say that a $\Delta^0_2$-approximation f of a fuzzy set A has an update if f(x, s + 1) ≠ f(x, s), for some x, s ∈ ω. Observe that our notion of mind-change, as in Definition 5, keeps track only of those updates which determine a change of monotonicity in the approximating function: e.g., if f(x, s) is a $\Delta^0_2$-approximation of a fuzzy set A, $m_f(x, s) = 1$, and f(x, s + 1) > f(x, s), then $m_f(x, s + 1)$ remains equal to 1. So, one may explore what happens if one keeps track of all updates for a given $\Delta^0_2$-approximation. To motivate such an approach, consider the following example given by Harkleroad (1984).

Example 1 As usual, K denotes the Halting problem. Define
It is easy to see that H is a c.e. fuzzy set. But note that, for any c.e. approximation h of H (recall that one assumes h(x, 0) = 0 for all x), there must be an infinite crisp set Z ⊆ ω such that h requires at least two updates to approximate each x ∈ Z , as otherwise, K = {x : (∃s)(h(x, s) = 1)} would be computable. So, to distinguish H from those c.e. fuzzy sets which can be approximated with at most one update, we propose the following definition.

Definition 6 A fuzzy set
It is immediate to note that, for all n, every $[n]_1$-c.e. fuzzy set is a c.e. fuzzy set. The next result establishes that the hierarchy of $[n]_1$-c.e. sets does not collapse, but it also does not exhaust the class of all c.e. fuzzy sets.

(1) For every n ≥ 1, there is an $[n + 1]_1$-c.e. fuzzy set which is not $[n]_1$-c.e.;
(2) there is a c.e. fuzzy set which is not $[n]_1$-c.e., for all n.
Proof For the sake of exposition, we first re-prove that there is a [2] 1 -c.e. set which is not [1] 1 -c.e. (as is illustrated by Example 1). In doing so, we will suitably modify Harkleroad's example, obtaining a module which is apt to be generalized.
Let U be a crisp 2-c.e. set which is not c.e., that is, $U \in \Sigma^{-1}_2 \setminus \Sigma^{-1}_1$. By Theorem 3, there exist c.e. sets $W_1$, $W_2$ so that $U = W_1 \setminus W_2$. Without loss of generality, one may assume that $W_2 \subseteq W_1$: indeed, if $W_2 \not\subseteq W_1$, then we replace $W_1$ with the c.e. set $W_1^{new} := W_1 \cup W_2$. Then we define the following c.e. fuzzy set: It is easy to see that $H_2$ is $[2]_1$-c.e. Indeed, it suffices to define a $\Delta^0_2$-approximation $g_0$ which, on input x, has an update if, at some stage $s_0$, x is enumerated into $W_1$ but not into $W_2$, and then let $g_0$ have the second (and last) update if, at stage $s_1 > s_0$, x is also enumerated into $W_2$.
On the other hand, suppose that there is a $\Delta^0_2$-approximation $g_1$ which witnesses that $H_2$ is $[1]_1$-c.e. Since $g_1$ can have at most one update for each input, we would have that contradicting the fact that U is not c.e.
As usual, this condition can be achieved in a "dynamic" way: For a non-zero i ≤ n +1, the new set W new i includes all numbers x such that there exists a sequence of stages 0 = s 0 < s 1 < s 2 < · · · < s i with the following properties: • for odd j ≤ i, the element x belongs to the finite set where for s ∈ ω, the finite set W k,s contains all elements of W k enumerated by the stage s; • for even j ≤ i, the number x does not belong to the set V s j .
Next, define $H_{n+1}$ as follows, The fuzzy set $H_{n+1}$ is clearly $[n + 1]_1$-c.e. By reasoning as above, it is not hard to deduce that it cannot be $[n]_1$-c.e. (or, a fortiori, $[k]_1$-c.e. for any k < n). Indeed, suppose that there is a $\Delta^0_2$-approximation $g_2$ witnessing that $H_{n+1}$ is $[n]_1$-c.e. Then, we would be able to approximate whether any given x belongs to V with at most n mind-changes, contradicting the choice of such V.
(2) To construct a c.e. fuzzy set which is not $[n]_1$-c.e., for all n, it suffices to join all the sets $H_n$ defined in item (1). For all x and n, let
$$H(\langle x, n \rangle) = H_n(x),$$
where ⟨·, ·⟩ denotes a Cantor pairing function, i.e., an effective bijection from ω² onto ω. Towards a contradiction, suppose that there is a $\Delta^0_2$-approximation $h_0$ witnessing that H is $[n]_1$-c.e., for some n. Let m > n and define
$$h_1(x, s) = h_0(\langle x, m \rangle, s),$$
for all x and s. Then, $h_1$ would witness that $H_m$ is $[n]_1$-c.e., a contradiction.
Now, we shall similarly stratify each level of FEH, by counting the number of updates that are needed to approximate n-c.e. fuzzy sets. Intuitively, we say that a fuzzy set A is $[n_1, \dots, n_m]_m$-c.e. if there is a $\Delta^0_2$-approximation f of A that, for each input x, can go up at most $n_1$ times, and then go down at most $n_2$ times, etc., for m-many ups and downs.
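The join used in item (2) relies only on having an effective pairing of ω² with ω. As a minimal sketch, assuming the standard Cantor pairing polynomial and the convention that the joined set sends ⟨x, n⟩ to the value of the n-th set at x, one could write the following; the names `pair`, `unpair`, and `join` are hypothetical.

```python
from math import isqrt


def pair(x, n):
    """Cantor pairing: a bijection from pairs of naturals onto the naturals."""
    return (x + n) * (x + n + 1) // 2 + n


def unpair(z):
    """Inverse of `pair`, computed with exact integer arithmetic."""
    w = (isqrt(8 * z + 1) - 1) // 2   # index of the diagonal containing z
    n = z - w * (w + 1) // 2          # position of z on that diagonal
    return w - n, n


def join(family):
    """Join a family of fuzzy sets: family(n) is the n-th membership function,
    and the joined set maps the code of (x, n) to family(n)(x)."""
    def joined(z):
        x, n = unpair(z)
        return family(n)(x)
    return joined


# Toy family (an assumption for illustration): the n-th set is constantly n.
H = join(lambda n: (lambda x: n))
print(pair(2, 3), unpair(pair(2, 3)), H(pair(2, 3)))
```

Because `pair` and `unpair` are computable and inverse to each other, restricting an approximation of the joined set to codes of the form ⟨x, m⟩ yields an approximation of the m-th set with no additional updates, which is the step used in the contradiction above.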
If Bounding never outputs no, for every input x, then we say that f is bounded by α.
Definition 7 Let $(n_0, n_1, \dots, n_k)$ be a tuple of natural numbers containing no zeros. A fuzzy set A is $[n_0, n_1, \dots, n_k]_k$-c.e. if there is a $\Delta^0_2$-approximation f of A which is bounded by $n_0 n_1 \cdots n_k 0^\infty$, where $0^\infty$ denotes the sequence consisting of infinitely many zeros.
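To make the shape of such a bound tangible, here is a hedged sketch that checks a finite trace of values against a tuple of update budgets. It is not the Bounding algorithm itself: it assumes that blocks alternate up, down, up, ... starting with "up", that every change of direction moves to the next block, and that no updates are allowed once all blocks are used. The function name `is_bounded_trace` and these conventions are assumptions made for illustration.

```python
from fractions import Fraction


def is_bounded_trace(values, rho):
    """Check whether a finite trace respects the update budgets in `rho`:
    at most rho[0] increases, then at most rho[1] decreases, and so on."""
    block, used = 0, 0
    for prev, cur in zip(values, values[1:]):
        if cur == prev:
            continue                       # no update at this stage
        direction = 1 if cur > prev else -1
        expected = 1 if block % 2 == 0 else -1
        if direction != expected:
            block += 1                     # change of monotonicity: next block
            used = 0
        if block >= len(rho) or used >= rho[block]:
            return False                   # budget of the current block exceeded
        used += 1
    return True


# Hypothetical trace: two increases, one decrease, one increase.
trace = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1, 4), Fraction(1, 2)]
print(is_bounded_trace(trace, (2, 1, 1)), is_bounded_trace(trace, (1, 10, 10)))
```

Under these conventions the trace fits the budgets (2, 1, 1) but not (1, 10, 10), since its second consecutive increase already exceeds the first budget of the latter tuple. This only illustrates a single trace; it says nothing by itself about whether some other approximation of the same fuzzy set fits the second tuple.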

Remark 3
Let $\rho \in \omega^k$ be a k-tuple of natural numbers. For notational simplicity, if a $\Delta^0_2$-approximation f is bounded by $\rho\, 0^\infty$, we may just say that f is ρ-bounded. Similarly, we may simply write that A is ρ-c.e. (rather than $[\rho]_k$-c.e.).
By the last definition, we have refined FEH with a plethora of new sub-levels. It is natural to ask how these sub-levels compare with each other. For example, can there be a fuzzy set which is $[2, 1, 1]_3$-c.e. but not $[1, 10, 10]_3$-c.e.? At first sight, one may guess that the answer is no, as it may seem plausible that the behavior of a $\Delta^0_2$-approximation f which is constrained to have at most 4 updates, for each input, could always be emulated by a function bounded by (1, 10, 10). But this is not the case.
Theorem 10 Let $\rho = (n_0, n_1, \dots, n_k)$ and $\sigma = (m_0, m_1, \dots, m_k)$ be tuples of non-zero natural numbers. Then every ρ-c.e. fuzzy set is σ-c.e. if and only if ρ is bitwise below σ, i.e., $n_i \le m_i$ for all i ≤ k.
Proof Let ρ, σ be k-tuples with ρ bitwise below σ. It follows immediately from Definition 7 that, if f is ρ-bounded, then f is also σ-bounded. Hence, any fuzzy set which is ρ-c.e. must also be σ-c.e.
On the other hand, suppose that $\rho = (n_0, \dots, n_k)$ is not bitwise below $\sigma = (m_0, \dots, m_k)$. In particular, let i be the least number so that $n_i > m_i$. For the sake of exposition, assume that i is even (the other case being symmetric): this means that $n_i$ corresponds to a sequence of increasing updates. Now, we shall construct a fuzzy set A which is ρ-c.e. but not σ-c.e. To do so, we will construct by stages a $\Delta^0_2$-approximation f of A which meets the following infinite list of requirements:
• P: f is ρ-bounded;
• $R_e$: If the function $\varphi_e$ is σ-bounded, then $\lim_{s \to \infty} f(e, s) \neq \lim_{s \to \infty} \varphi_e(e, s)$.
Note that the combination of all R-requirements ensures that A cannot be approximated by any function which is σ -bounded.
Throughout this construction, it will be very convenient to view f dynamically. That is, rather than defining f(x, s), for all s, we just let f(x) change its value during the construction, and we will interpret $\lim_{s \to \infty} f(x, s)$ as the limit value (if any) of such a sequence of changes. Similarly, by referring to the current value of $\varphi_e(e)$, we simply mean $\varphi_e(e, s)$, for the current stage s of the construction.
Strategy to meet the requirements
For a computable function $\varphi_e$, the property of being σ-bounded is $\Pi^0_1$. But to address $R_e$, it suffices to restrict the focus to the behavior of the algorithm Bounding on e, up to k many changes of monotonicity.
The basic idea of the strategy is straightforward. We shall act whenever we witness that the current value of f(e) coincides with the current value of $\varphi_e(e)$. Yet, our actions are implemented in different ways, corresponding to three distinct phases:
• In Phase I, to ensure that f(e) differs from $\varphi_e(e)$ we simply alternate, whenever necessary, between 0 and 1. That is, if we witness that $f(e) = \varphi_e(e) = j$, for j ∈ {0, 1}, we respond by updating f(e) to 1 − j. We perform this action at most i times. Then we move to Phase II.
• When we enter Phase II, both f and $\varphi_e$ have changed monotonicity (on input e) i − 1 times. Moreover, since we are under the assumption that i is even, we have that the value of f(e), when Phase II starts, is 0. Now, ρ bounds f to a sequence of $n_i$ many increasing updates, while σ bounds $\varphi_e$ to only $m_i$ many increasing updates. We take advantage of the fact that $n_i > m_i$ by performing two sorts of actions:
(i) If we witness that $f(e) = \varphi_e(e)$, we respond by increasing the current value of f(e) to the midpoint between f(e) and 1, that is, we update f(e) to $\frac{f(e) + 1}{2}$. If we also see that $\varphi_e$ exhausted all its $m_i$ many updates, we move to the Transition phase.
(ii) If we witness that $f(e) < \varphi_e(e)$, we immediately move to the Transition phase.
• We may start the Transition phase with either $f(e) > \varphi_e(e)$ or $f(e) < \varphi_e(e)$:
(a) In the first case, we wait to see if, at some further stage, $\varphi_e(e)$ equals f(e). If this happens, we let f(e) = 0 and we go back to Phase I.
(b) In the second case, we again wait to see if $\varphi_e(e)$ equals f(e). If this happens, we now let f(e) = 1 and we go back to Phase I.
There are no interactions between different strategies.
The construction
At any stage, each requirement, independently from all others, can either be in Phase I, or in Phase II, or in the Transition phase. We use a collection of dynamic counters $(c_e)_{e \in \omega}$ to record how many times we have changed the monotonicity of f. Recall that ρ is of the form $(n_0, n_1, \dots, n_k)$, hence in defining f we are constrained to at most k many changes of monotonicity. We also use a collection of parameters $(d_e)_{e \in \omega}$, which range over the set {−1, 1}, to distinguish the actions of the Transition phase. Finally, at any given stage s, $\varphi_e$ is σ-compatible if the algorithm Bounding on input e does not return no within s many stages.
• Stage 0. For all e, let c e = d e = 0. Say that every requirement R e is in Phase I.
• Stage s + 1 = e, t . We deal with R e . If φ e is not σ -compatible or c e > k, then we do nothing. Otherwise, we act accordingly to the current phase of R e .
-R e is in Phase I and the current value of f e (e) is j, for j ∈ {0, 1}: If we witness that φ e (e) currently equals j, we update the value of f e (e) to 1 − j. We then increase the counter c e by 1. If, after this action, c e equals i, we enter Phase II; otherwise, we remain in Phase I. -R e is in Phase II and the current value of f e (e) is u, for u ∈ [0, 1): We distinguish three sub-cases, corresponding to the current value of φ e (e): 1. If φ e (e) < u, we do nothing and we remain in Phase II; . If it is also the case that φ e (e) already had m i many consecutive increasing updates, we set d e to 1 and we move to the Transition phase; otherwise, we remain in Phase II; 3. If φ e (e) > u, we set d e to −1 and then we move to the Transition phase.
- R_e is in the Transition phase and the current value of f_e(e) is v, for v ∈ [0, 1]: If we witness that φ_e(e) currently equals v, we distinguish two sub-cases:
1. If d_e = 1, we update the value of f(e) to 0. Then, we increase the counter c_e by 1, and we go back to Phase I;
2. If d_e = −1, we update the value of f(e) to 1 and we go back to Phase I, without updating c_e.
This concludes the construction.

The verification
To conclude that A is ρ-c.e. but not σ-c.e., it is sufficient to show that all requirements are eventually satisfied. First, note that the construction ensures that each R-requirement moves through its phases according to the following diagram (possibly completing only an initial segment of it): Phase I → Phase II → Transition phase → Phase I.
To see that the global P-requirement is satisfied, it is sufficient to note that, for all e, the following hold:
1. In Phase I, each action forces f to change monotonicity on e. The same is true for the Transition phase with d_e = 1. So, for all i* ≠ i with i* ≤ k, f consumes only 1 update, which is certainly bounded by n_{i*};
2. The updates consumed by f in Phase II must be bounded by n_i. This is satisfied, as the construction ensures that, in this phase, f has:
• at most m_i + 1 many updates (with m_i < n_i), if d_e = 1; or
• at most m_i many updates, plus one additional update in the subsequent Transition phase, if d_e = −1.
3. Finally, if f changes monotonicity k many times, then no further action is allowed, and thus f has no more updates on e.
Thus, f is ρ-bounded. In fact, it is even bounded by (1, ..., 1, n_i, 1, ..., 1).
It remains to prove that all R-requirements are satisfied. Suppose otherwise. Let e be such that φ_e is σ-bounded and, for all x, the limit value of φ_e(x) coincides with the limit value of f_e(x). Let s be the first stage such that, after s, neither f(e) nor φ_e(e) is updated. Towards a contradiction, we now discuss in which phase R_e is at the least stage ⟨e, t⟩ > s. Suppose that, at this stage, R_e is in Phase I with c_e < i. If so, the construction allows us to update f(e) to either 0 or 1, ensuring that f(e) differs from φ_e(e), a contradiction. Next, suppose that R_e is in Phase II. If so, since n_i > m_i, the construction again allows us to make f(e) different from φ_e(e), a contradiction. A similar reasoning excludes that R_e is in the Transition phase.
The only remaining case is that R_e is in Phase I with c_e ≥ i. Now, the key observation is that, when R_e visits Phase I for the second time, it must be the case that f(e) has consumed strictly fewer changes of monotonicity than φ_e(e). To see this, we reconstruct the behavior of f and φ_e on e, when moving from Phase II through the Transition phase to Phase I. When Phase II ends, both f and φ_e have had i many changes of monotonicity on input e. Then we separate two cases.
If d_e = −1, then φ_e(e) needs to change monotonicity to reach f(e), but we respond by increasing f(e) to 1 without changing the monotonicity of f (see item 2 of the Transition phase above). Hence, if R_e is in Phase I at stage ⟨e, t⟩, then, from the fact that φ_e(e) is again equal to f(e), we deduce that, on input e, φ_e and f changed monotonicity i + 2 and i + 1 many times, respectively.
The case d_e = 1 is similar. Note in particular that, in this case, Phase II ends with f(e) > φ_e(e) and the next sequence of updates that φ_e can use is decreasing. Hence, φ_e(e) requires two changes of monotonicity to reach f(e) (this remains true even if φ_e consumes no decreasing update from the n_{i+1} sequence, as φ_e would still need to consume the n_{i+2} sequence to copy f). It follows that, if d_e = 1, c_e ≥ i, R_e is in Phase I, and φ_e(e) = f(e), then, on input e, φ_e and f changed monotonicity i + 3 and i + 1 many times, respectively.
To sum up, if φ_e copies the behavior of f enough times, then the construction reaches a stage in which R_e is in Phase I and f(e) has consumed strictly fewer changes of monotonicity than φ_e(e). Such a difference in the number of changes of monotonicity of f and φ_e will then be preserved by any further action of Phase I. So, we will be able to respond to any threat of φ_e, guaranteeing that eventually f(e) will differ from φ_e(e).
This proves that all R-requirements are satisfied. Thus, A, which is ρ-bounded, cannot be approximated by a function which is σ-bounded. This concludes the proof of Theorem 10.
We just proved that the refined hierarchy discussed in this subsection is quite wild. It is now time to discuss how to extend FEH.

Going transfinite
As shown in Proposition 8, there are Δ^0_2 fuzzy sets which are not n-c.e., for all n ≥ 1. Let us consider another example of such a set.
Fix an effective enumeration of all c.e. fuzzy sets {V_e}_{e∈ω}; one can recover such an enumeration by applying Proposition 1 and re-arranging the enumeration of some left-c.e. reals (as in Herbert et al. (2019)). With the help of Theorem 6, we could also construct an effective uniform enumeration {V^n_e}_{e∈ω} of all n-c.e. fuzzy sets, for all n ≥ 1. Since each V^n_e is an n-c.e. set, there is a computable function f_n(e, x, s) such that:
• lim_{s→∞} f_n(e, x, s) = V^n_e(x);
• f_n(e, x, 0) = 0.
Next, we define a computable function g by diagonalization. That is, for all natural numbers n, e, x, let
• g(x, 0) = 0;
• g(⟨n, e, x⟩, s + 1) = 1 − f_n(e, ⟨n, e, x⟩, s).
The fuzzy set B(x), defined as lim_{s→∞} g(x, s), is Δ^0_2, but it is not n-c.e. for any n ≥ 1. Indeed, assume that B is n-c.e. Then, there must be e such that B = V^n_e. But this contradicts the fact that the definition of g ensures that B(⟨n, e, x⟩) = 1 − V^n_e(⟨n, e, x⟩), for every x ∈ ω.
Note that the fuzzy set B just defined is significantly different from the set A constructed in Proposition 8. For the set A, the mind-change sequence (m_f(x, s))_{s∈ω} diverges for each x. For the set B, the corresponding sequence converges on each input x; yet, there is no single upper bound n on the number of mind-changes required for approximating all inputs. Hence, it is reasonable to relax the definition of FEH so as to include B. In the crisp world, sets like B are called ω-computably enumerable (or ω-c.e.), and they represent the first transfinite level of the Classical Ershov Hierarchy. Similarly, we define the ω-level of FEH as follows:
Definition 8 A fuzzy set A is ω-c.e. if there exist computable functions g : ω → ω and f : ω × ω → [0, 1]_Q such that, for all x ∈ ω, we have:
So, intuitively, a fuzzy set A is ω-c.e. if, for each x, we can fix a computable bound (given by g) on the number of mind-changes required to approximate x. Or, in other words, A(x) acts as an n-c.e. fuzzy set, where n equals g(x) + 1.
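The intuition behind the ω-level can be illustrated on finite prefixes of an approximation. The following Python sketch is only an illustration under stated assumptions: it reads a "mind-change" as a change of monotonicity of the stage-by-stage values, it assumes that the approximation starts at 0, and the names monotonicity_changes, respects_bound, toy_f and toy_g are hypothetical (they do not come from the paper). A finite check of this kind can refute a proposed bound g on the inspected stages, but it cannot prove that a fuzzy set is ω-c.e.

```python
from fractions import Fraction
from typing import Callable, Sequence


def monotonicity_changes(values: Sequence[Fraction]) -> int:
    """Count the switches between increasing and decreasing in a finite sequence."""
    changes = 0
    direction = 0  # +1 while increasing, -1 while decreasing, 0 before the first move
    for prev, cur in zip(values, values[1:]):
        if cur == prev:
            continue  # a repeated value is not an update
        step = 1 if cur > prev else -1
        if direction != 0 and step != direction:
            changes += 1
        direction = step
    return changes


def respects_bound(
    f: Callable[[int, int], Fraction],
    g: Callable[[int], int],
    x: int,
    stages: int,
) -> bool:
    """Check, on stages 0..stages only, that f(x, .) starts at 0 and
    changes monotonicity at most g(x) times (assumed reading of the bound)."""
    values = [f(x, s) for s in range(stages + 1)]
    return values[0] == 0 and monotonicity_changes(values) <= g(x)


def toy_f(x: int, s: int) -> Fraction:
    """Hypothetical approximation: zigzags between 0 and 1/2, then freezes at stage x + 1."""
    return Fraction(1, 2) if min(s, x + 1) % 2 == 1 else Fraction(0)


def toy_g(x: int) -> int:
    """Hypothetical bound: input x is allowed x changes of monotonicity."""
    return x
```

In this toy example, input x needs exactly x changes of monotonicity, so no single n bounds all inputs, while the computable function toy_g bounds each input separately. This mirrors the behavior described for the set B.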
Next, similarly to Definition 5, one could introduce the notion of a co-ω-c.e. fuzzy set. Here, we witness the first difference between the finite and the transfinite levels of FEH. By Proposition 5, we know that, for all n, there is a fuzzy set which is co-n-c.e. but not n-c.e. Moving to the transfinite levels, such a property fails:
Proposition 11 Every co-ω-c.e. fuzzy set is ω-c.e.
Proof Let A be a fuzzy set which is co-ω-c.e. Then, there must be computable functions g : ω → ω and f : ω × ω → [0, 1] Q such that for all x ∈ ω: .
We now define two new computable functions g′(x) and f′(x, s) as follows:
It is easy to see that f′ is a Δ^0_2-approximation of A. In addition, the functions g′ and f′ witness that A is an ω-c.e. set.
To define all transfinite levels of FEH, we shall use the standard technology of Kleene's notations for computable ordinals ⟨O, <_O⟩. An interested reader is referred to Rogers (1967) for a classical exposition of Kleene's O. Intuitively, such a system of notations allows one to build up ordinals in an effective manner, by providing suitable codes for each ordinal that can be described in a computable way. For our current purposes, it is sufficient to summarize some key features of ⟨O, <_O⟩:
1. O ⊂ ω, and ≤_O is a partial order on the set O;
2. If a ∈ O, then |a|_O denotes the countable ordinal having notation a;
3. If a <_O b, then |a|_O < |b|_O (with respect to the standard order on ordinals);
4. For each a ∈ O, the (crisp) set {b ∈ O : b <_O a} is c.e.;
5. There are no infinite decreasing sequences in the poset ⟨O, <_O⟩;
6. Every finite ordinal n has a unique Kleene notation.
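Feature 6 can be made concrete. The text above does not fix a coding, so the sketch below assumes the convention commonly used in presentations of Kleene's O (for instance in Rogers (1967)): the number 1 is the notation of the ordinal 0, and 2^a is the notation of the successor of the ordinal denoted by a. Limit notations are not modeled, and the function names are hypothetical.

```python
from typing import Optional


def finite_notation(n: int) -> int:
    """Notation of the finite ordinal n under the assumed convention:
    1 denotes 0, and 2**a denotes the successor of the ordinal denoted by a."""
    if n < 0:
        raise ValueError("n must be a natural number")
    a = 1
    for _ in range(n):
        a = 2 ** a
    return a


def finite_ordinal(a: int) -> Optional[int]:
    """Return n if a is the notation of the finite ordinal n, and None otherwise."""
    n = 0
    while a != 1:
        if a < 2 or a & (a - 1) != 0:  # a is not a power of two
            return None
        a = a.bit_length() - 1  # replace a = 2**b by its exponent b
        n += 1
    return n


# Notations of the ordinals 0, 1, 2, 3, 4 under this convention.
first_notations = [finite_notation(n) for n in range(5)]
```

Under this convention the notations grow as an iterated exponential (1, 2, 4, 16, 65536), and each finite ordinal is reached along exactly one chain of successors, which is why its notation is unique.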
For the sake of exposition, we start by recalling the definition of the transfinite levels of the Classical Ershov Hierarchy. Note that different authors use slight variations of the following definition, and thus one has to be careful about which type of terminology is used when certain results are stated.

Definition 9
Let a ∈ O be a notation of a non-zero ordinal. A crisp set A ⊆ ω belongs to the class Σ^{-1}_a (correspondingly, Π^{-1}_a) if there are computable functions f : ω × ω → {0, 1} and h : ω × ω → {b ∈ O : b <_O a} such that for all x and s:
Ershov (1970) proved that the transfinite levels of the Classical Ershov Hierarchy do not collapse. It is now time to define the classes Σ^{-1}_a and Π^{-1}_a of FEH.
Definition 10 Let a ∈ O be a notation of a non-zero ordinal. A Δ^0_2 fuzzy set A belongs to the class Σ^{-1}_a (correspondingly, Π^{-1}_a) if there exist a Δ^0_2-approximation f(x, s) and a computable "counting" function h : ω^2 → {b ∈ O : b <_O a} such that for all x and s:
Note that the above definition encompasses all finite levels of FEH: indeed, if a is the notation for a finite ordinal n ≥ 1, then the class of Σ^{-1}_a sets coincides with the n-c.e. fuzzy sets. Moreover, by reasoning as in Proposition 11, it is not hard to see that the classes Σ^{-1}_a and Π^{-1}_a coincide when a is a notation for a limit ordinal.
To gently guide the reader through the proposed transfinite hierarchy, let us discuss a fairly concrete example. Say that a is a Kleene notation for the ordinal ω + ω + 1. In fact, for the sake of simplicity, let us freely identify notations with the ordinals that they denote. Now, let A be a Σ^{-1}_a fuzzy set, having a Δ^0_2-approximation f(x, s) and a counting function h(x, s). Then, for a given x ∈ ω, the behavior of f(x, s) and h(x, s) may look like this:
• At first, the approximating function f increases and the counting function h just outputs ω + ω (note that h is required to change its mind only if f changes monotonicity). Hence, we cannot predict, or even bound, the number of future mind-changes;
• Next, when f changes monotonicity and starts to decrease, the counting function h must update its output to some ordinal strictly less than ω + ω, that is, to ω + n for some natural number n;
• For the next n many mind-changes, our approximation may work as in the (n + 1)-c.e. case, that is, changing its monotonicity at most n many times;
• When the counting function becomes equal to ω, we cannot predict the remaining number of mind-changes;
• The final block of mind-changes works as follows: The approximation changes monotonicity, forcing the value of h to decrease to some finite ordinal k; next, the approximation works as for a standard (k + 1)-c.e. or co-(k + 1)-c.e. set (depending on whether the approximation was increasing or decreasing in the previous step).
More generally, for a fixed transfinite level of FEH, the approximation of any given input x can always be represented as a finite sequence of blocks. This is because the counting function is non-increasing and there are no infinite decreasing sequences of notations. Each block works as an n-c.e. or a co-n-c.e. fuzzy set. A priori, we cannot predict either the number of blocks or the size of each of them.
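The interplay between an approximation and its counting function can be checked mechanically on a finite trace. The Python sketch below rests on assumptions that are not spelled out in the text as it stands: ordinals below ω^2 are identified with pairs (m, n) standing for ω·m + n and compared lexicographically, the counting function is required to be non-increasing, and it is required to drop strictly whenever the approximation changes monotonicity. The names counting_is_valid, limit_drops, values and counts are hypothetical.

```python
from fractions import Fraction
from typing import Sequence, Tuple

Ordinal = Tuple[int, int]  # (m, n) stands for the ordinal ω·m + n


def counting_is_valid(values: Sequence[Fraction], counts: Sequence[Ordinal]) -> bool:
    """Assumed conditions: counts never increase, and they strictly decrease
    at every stage where the values change monotonicity."""
    if len(values) != len(counts):
        return False
    direction = 0  # +1 increasing, -1 decreasing, 0 before the first move
    for s in range(1, len(values)):
        if counts[s] > counts[s - 1]:  # tuple comparison is lexicographic
            return False
        if values[s] == values[s - 1]:
            continue
        step = 1 if values[s] > values[s - 1] else -1
        if direction != 0 and step != direction and not counts[s] < counts[s - 1]:
            return False
        direction = step
    return True


def limit_drops(counts: Sequence[Ordinal]) -> int:
    """Number of stages at which the coefficient of ω decreases."""
    return sum(1 for p, c in zip(counts, counts[1:]) if c[0] < p[0])


# A hypothetical trace in the spirit of the ω + ω + 1 example.
values = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1, 3),
          Fraction(2, 3), Fraction(1, 2), Fraction(3, 4), Fraction(1, 2)]
counts = [(2, 0), (2, 0), (2, 0), (1, 2), (1, 1), (1, 0), (0, 1), (0, 0)]
```

In this trace the counting function stays at ω + ω while the values increase, drops to ω + 2 at the first change of monotonicity, counts down to ω, and then drops to a finite ordinal. Each decrease of the ω-coefficient marks the start of a block whose size was not known in advance.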
Interestingly, the ordinal ω^2 + 1 is able to encode any finite sequence of blocks (of arbitrary but finite size), and thus it can bound the behavior of any approximation lying at a given transfinite level of FEH. Indeed, let a ∈ O be an ordinal notation for ω^2 + 1. The counting function for a Σ^{-1}_a fuzzy set starts with output ω^2. Then, it decreases its value to an ordinal strictly below ω^2, that is, to an ordinal of the form ω · m + n, which corresponds to m many blocks of unknown size. This idea leads us to the following result:
Theorem 12 For a Δ^0_2 fuzzy set A, the following are equivalent:
1. for some a ∈ O, the set A belongs to the class Σ^{-1}_a;
2. there exists b ∈ O such that |b|_O = ω^2 + 1 and A belongs to the class Σ^{-1}_b.
The proof can be obtained by straightforward computability-theoretic methods, from a similar crisp theorem of Ershov (1970). For reasons of space, we omit the formal construction. On the other hand, by reasoning as in Proposition 5, one obtains that the transfinite hierarchy does not collapse:
Proposition 13 Let a ∈ O be a notation of a non-zero ordinal, and A be a crisp subset of ω. Then A is in Σ^{-1}_a in the Classical Ershov Hierarchy if and only if A belongs to the class Σ^{-1}_a in FEH. The same is true for Π^{-1}_a sets.
The proof is a direct generalization of the proof of Proposition 5: it is enough to replace n (which bounds the number of mind-changes) with the counting functions of Definitions 9 and 10.
As a concluding remark, we note that the classes Σ^{-1}_a, a ∈ O, still do not exhaust all Δ^0_2 fuzzy sets. In fact, the same example given in the proof of Proposition 8 applies to the transfinite case: the diagonalization is successful because, for any fuzzy Σ^{-1}_a set A and any x ∈ ω, the corresponding mind-change sequence (m_f(x, s))_{s∈ω} must converge, since there are no infinite decreasing sequences in ordinal notations.

Conclusions
A major goal of this paper has been to highlight the fascinating interplay between two formal approaches to the broad concept of "approximation". That is, we have combined fuzzy set theory, which provides a mathematical foundation to the idea of approximate reasoning, and computability theory, which provides powerful tools to deal with information approximated by stages. Building on previous work (Biacino & Gerla, 1989; Gerla, 2001; Harkleroad, 1984, 1988), we introduced and explored the Fuzzy Ershov Hierarchy (FEH), which allows one to measure precisely how hard it is to approximate certain fuzzy sets. We exhibited both analogies and disanalogies between the Classical Ershov Hierarchy and its fuzzification. Let us conclude by suggesting two natural ways of expanding our work, which may motivate further research:
1. First, in Sect. 4, we have kept separate the analysis of the refined hierarchy and the transfinite one. Yet, nothing prevents merging these hierarchies and, e.g., stratifying a given Σ^{-1}_a class of fuzzy sets by keeping track of all updates made by a Δ^0_2-approximation. It is reasonable to expect that by pursuing such a line of research one will encounter several combinatorial intricacies;
2. Second, it is known that every Δ^0_2 crisp set belongs to some transfinite level of the Classical Ershov Hierarchy (see Ershov (1968b), Theorem 6). As mentioned above, no analogous result holds for fuzzy sets. Hence, it is natural to ask whether there is a natural way of extending FEH, beyond the boundaries of Definition 10, so as to encompass all Δ^0_2 fuzzy sets.