# A Generalized Characterization of Algorithmic Probability


## Abstract

An a priori semimeasure (also known as “algorithmic probability” or “the Solomonoff prior” in the context of inductive inference) is defined as the transformation, by a given universal monotone Turing machine, of the uniform measure on the infinite strings. It is shown in this paper that the class of a priori semimeasures can equivalently be defined as the class of transformations, by all compatible universal monotone Turing machines, of any continuous computable measure in place of the uniform measure. Some consideration is given to possible implications for the association of algorithmic probability with certain foundational principles of statistics.

## Keywords

Algorithmic probability, A priori semimeasure, Semicomputable semimeasures, Monotone Turing machines, Principle of indifference, Occam’s razor

## 1 Introduction

Levin [23] first considered the transformation of the uniform measure *λ* on the infinite bit strings by a universal monotone machine *U*. This transformation *λ* _{ U } is the function that for each finite bit string returns the probability that the string is generated by machine *U*, when *U* is supplied a stream of uniformly random input (produced by tossing a fair coin, say). Levin attached to *λ* _{ U } the interpretation of an “a priori probability” distribution, because *λ* _{ U } dominates every other semicomputable semimeasure and so the initial assumption that a sequence is randomly generated from *λ* _{ U } is in an exact sense the weakest of randomness assumptions.

Earlier on, Solomonoff [20] described in a somewhat less precise way a very similar definition. His motivation was an “a priori probability” distribution to serve as an objective starting point in inductive inference. In this context the definition is known under various headers, including “the Solomonoff prior” and “algorithmic probability”; and it has been associated with certain foundational principles from statistics, to explain or support its merits as an idealized inductive method.

As commonly presented, however, the association with two main such principles (firstly, the principle of *indifference*, and secondly, the principle of *Occam’s razor*) seems to essentially rest on the definition of *λ* _{ U } as a universal transformation of the *uniform measure* *λ*.

This raises the question whether the *a priori semimeasures* (as we will call the functions *λ* _{ U } here) must be defined, as they always are, as the universal transformations of the uniform measure, or whether they can equivalently be defined as universal transformations of other computable measures.

The main result of this paper is that any a priori semimeasure can indeed be obtained as a universal transformation of *any* continuous computable measure. That is, for any continuous computable measure, an a priori semimeasure can equivalently be defined as giving the probabilities for finite strings being generated by a universal machine that is presented with a stream of bits sampled from *this* measure. More precisely, for any continuous computable measure *μ*, it is shown that the class of functions *λ* _{ U } for all universal monotone machines *U* coincides with the class of functions *μ* _{ U } (i.e., the transformation by *U* of *μ*) for all (*μ*-compatible) universal machines *U*.

This work will be done in Section 2. First, in the current section, we cover basic notions and notation (Section 1.1), discuss the characterization of the semicomputable semimeasures as the transformations via monotone machines of a continuous computable measure (Section 1.2), and the analogous characterization for semicomputable discrete semimeasures and prefix-free machines (Section 1.3).

### 1.1 Basic Notions and Notation

#### 1.1.1 Bit Strings

Let \(\mathbb {B}:=\{ 0,1\}\) denote the set of bits; \(\mathbb {B}^{*}\) the set of all finite bit strings; \(\mathbb {B}^{n}\) the set of bit strings *σ* of length |*σ*| = *n*; \(\mathbb {B}^{\leq n}\) the set of bit strings *σ* of length |*σ*|≤ *n*; \(\mathbb {B}^{\omega }\) the class of all infinite bit strings. The empty string is *𝜖*. The concatenation of bit strings *σ* and *τ* is written *σ* *τ*; we write \(\sigma \preccurlyeq \tau \) if *σ* is an *initial segment* of *τ* (so there is a *ρ* such that *σ* *ρ* = *τ*; we write \(\sigma \prec \tau \) if *ρ*≠*𝜖*). The initial segment of *σ* of length *n* ≤|*σ*| is denoted \(\sigma \upharpoonright _{n}\); the initial segment \(\sigma \upharpoonright _{|\sigma |-1}\) is denoted *σ* ^{−}. Strings *σ* and *τ* are *comparable*, *σ* ∼ *τ*, if \(\sigma \preccurlyeq \tau \) or \(\tau \prec \sigma \); if *σ* and *τ* are not comparable we write *σ*∣*τ*.

For given finite string *σ*, the class \([{\kern -2.3pt}[ \sigma ]{\kern -2.3pt}] := \{ \sigma X : X \in \mathbb {B}^{\omega }\} \subseteq \mathbb {B}^{\omega }\) is the class of infinite extensions of *σ*. Likewise, for \(A \subseteq \mathbb {B}^{*}\), let \([{\kern -2.3pt}[ A ]{\kern -2.3pt}] := \{ \sigma X: \sigma \in A, X \in \mathbb {B}^{\omega }\}\).
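The string relations above can be made concrete with a minimal Python sketch, assuming bit strings are represented as Python `str` values over the characters `0` and `1`. The function names (`is_prefix`, `comparable`, `in_cone`) are hypothetical and chosen only for illustration; membership of an infinite string in a cone is decided here from a sufficiently long finite prefix.

```python
def is_prefix(sigma: str, tau: str) -> bool:
    """True if sigma is an initial segment of tau (sigma ≼ tau)."""
    return tau.startswith(sigma)


def comparable(sigma: str, tau: str) -> bool:
    """True if one of the two strings is an initial segment of the other."""
    return is_prefix(sigma, tau) or is_prefix(tau, sigma)


def in_cone(sigma: str, x_prefix: str) -> bool:
    """Decide whether X lies in the cone [[sigma]], given a finite prefix
    of X that is at least as long as sigma."""
    if len(x_prefix) < len(sigma):
        raise ValueError("need at least len(sigma) bits of X")
    return x_prefix.startswith(sigma)
```

For example, `comparable("01", "0")` holds, whereas `"01"` and `"00"` are incomparable.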

#### 1.1.2 Computable Measures

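A probability measure over the infinite strings is generated from a *premeasure*, a function \(m: \mathbb {B}^{*} \rightarrow [0,1]\) that satisfies

1. *m*(*𝜖*) = 1;
2. *m*(*σ*0) + *m*(*σ*1) = *m*(*σ*) for all \(\sigma \in \mathbb {B}^{*}\).

A premeasure *m* gives rise to an *outer measure* \(\mu ^{*}_{m}:\mathcal {P}(\mathbb {B}^{\omega }) \rightarrow [0,1]\) by

\[\mu ^{*}_{m}(\mathcal {A}) := \inf \left \{ {\sum }_{\sigma \in A} m(\sigma ) : A \subseteq \mathbb {B}^{*},\ \mathcal {A} \subseteq [{\kern -2.3pt}[ A ]{\kern -2.3pt}] \right \}.\]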

By restricting \(\mu ^{*}_{m}\) to the measurable sets, i.e., the sets \(\mathcal {A}\subseteq \mathbb {B}^{\omega }\) such that \(\mu ^{*}_{m}(\mathcal {B})=\mu ^{*}_{m}(\mathcal {B} \cap \mathcal {A} )+\mu ^{*}_{m}(\mathcal {B} \setminus \mathcal {A})\) for all \(\mathcal {B} \subseteq \mathbb {B}^{\omega }\), we finally obtain the corresponding *(probability) measure* *μ* _{ m }, that satisfies \(\mu _{m}([{\kern -2.3pt}[ \sigma ]{\kern -2.3pt}])=m(\sigma )\) for all \(\sigma \in \mathbb {B}^{*}\).

The *uniform (Lebesgue) measure* *λ* is given by the premeasure *m* with *m*(*σ*) = 2^{−|σ|} for all \(\sigma \in \mathbb {B}^{*}\). A measure *μ* is *nonatomic* or *continuous* if there is no \(X \in \mathbb {B}^{\omega }\) with *μ*({*X*}) > 0.
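As an illustration of the two premeasure conditions, the following Python sketch checks them exactly (with rational arithmetic) on all strings up to a given depth. The names `uniform`, `bernoulli` and `is_premeasure` are assumptions made for this example; the Bernoulli premeasure with rational bias strictly between 0 and 1 serves as a second example of a continuous computable measure. A finite-depth check is of course only a sanity test, not a proof.

```python
from fractions import Fraction
from itertools import product


def uniform(sigma: str) -> Fraction:
    """Premeasure of the uniform measure: m(sigma) = 2^(-|sigma|)."""
    return Fraction(1, 2 ** len(sigma))


def bernoulli(p: Fraction):
    """Premeasure of the i.i.d. measure that emits a 1 with probability p."""
    def m(sigma: str) -> Fraction:
        ones = sigma.count("1")
        return p ** ones * (1 - p) ** (len(sigma) - ones)
    return m


def is_premeasure(m, depth: int) -> bool:
    """Check m(empty) = 1 and m(s0) + m(s1) = m(s) for all |s| < depth."""
    if m("") != 1:
        return False
    for n in range(depth):
        for bits in product("01", repeat=n):
            sigma = "".join(bits)
            if m(sigma + "0") + m(sigma + "1") != m(sigma):
                return False
    return True
```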

We call a total real-valued function \(f: \mathbb {B}^{*} \rightarrow \mathbb {R}\) *computable* if its values are uniformly computable reals: there is a computable \(g: \mathbb {B}^{*} \times \mathbb {N} \rightarrow \mathbb {Q}\) such that |*g*(*σ*,*k*) − *f*(*σ*)| < 2^{−k } for all *σ*,*k*. This allows us to talk about computable premeasures. A measure *μ* we then call computable if \(\mu =\mu ^{*}_{m}\) for a computable premeasure *m*.

#### 1.1.3 Semicomputable Semimeasures

We call a total real-valued function \(f: \mathbb {B}^{*} \rightarrow \mathbb {R}\) (*lower*) *semicomputable* if there are uniformly computable functions \(f_{t}: \mathbb {B}^{*} \rightarrow \mathbb {Q}\) such that for all \(\sigma \in \mathbb {B}^{*}\), we have *f* _{ t+1}(*σ*) ≥ *f* _{ t }(*σ*) for all \(t \in \mathbb {N}\) and \(\lim _{t \rightarrow \infty } f_{t}(\sigma )=f(\sigma )\).

A *semimeasure* over the infinite strings is generated from a premeasure *m* that only needs to satisfy

1. *m*(*𝜖*) ≤ 1;
2. *m*(*σ*0) + *m*(*σ*1) ≤ *m*(*σ*) for all \(\sigma \in \mathbb {B}^{*}\).

Following [5], we will simply treat a semimeasure as a function over the cones \(\{ [{\kern -2.3pt}[ \sigma ]{\kern -2.3pt}]: \sigma \in \mathbb {B}^{*} \}\):

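### **Definition 1**

A *semicomputable semimeasure* is a semicomputable function \(\nu : \{ [{\kern -2.3pt}[ \sigma ]{\kern -2.3pt}]: \sigma \in \mathbb {B}^{*} \} \rightarrow [0,1]\) such that

1. \(\nu ([{\kern -2.3pt}[ \epsilon ]{\kern -2.3pt}])\leq 1\);
2. \(\nu ([{\kern -2.3pt}[ \sigma 0 ]{\kern -2.3pt}])+\nu ([{\kern -2.3pt}[ \sigma 1 ]{\kern -2.3pt}])\leq \nu ([{\kern -2.3pt}[ \sigma ]{\kern -2.3pt}])\) for all \(\sigma \in \mathbb {B}^{*}\).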

Moreover, we follow the custom of writing *ν*(*σ*) for \(\nu ([{\kern -2.3pt}[ \sigma ]{\kern -2.3pt}])\). Let \(\mathcal {M}\) denote the class of all semicomputable semimeasures.^{1}
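The two semimeasure inequalities can be checked in the same finite-depth manner as the premeasure conditions. The sketch below is illustrative only: `is_semimeasure` and the example function `leaky` (which loses half of its mass at every step) are hypothetical names, and semicomputability is not modelled, since the example values are exact rationals.

```python
from fractions import Fraction
from itertools import product


def is_semimeasure(nu, depth: int) -> bool:
    """Check nu(empty) <= 1 and nu(s0) + nu(s1) <= nu(s) for all |s| < depth."""
    if not 0 <= nu("") <= 1:
        return False
    for n in range(depth):
        for bits in product("01", repeat=n):
            sigma = "".join(bits)
            if nu(sigma + "0") + nu(sigma + "1") > nu(sigma):
                return False
    return True


def leaky(sigma: str) -> Fraction:
    """A strict semimeasure: nu(sigma) = 4^(-|sigma|)."""
    return Fraction(1, 4 ** len(sigma))
```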

### 1.2 Monotone Machines and Semicomputable Semimeasures

#### 1.2.1 Machines

The following definition is due to Levin [10]. (Similar machine models were already described in [23], and by Solomonoff [20] and Schnorr [19]; see [3].)

### **Definition 2**

A monotone machine is a c.e. set \(M \subseteq \mathbb {B}^{*}\times \mathbb {B}^{*}\) of pairs of strings such that if (*ρ* _{1},*σ* _{1}),(*ρ* _{2},*σ* _{2}) ∈ *M* and \(\rho _{1} \preccurlyeq \rho _{2}\) then *σ* _{1} ∼ *σ* _{2}.

We will not go into the concrete machine model that corresponds to the above abstract definition (see, for instance, [5, p. 145]); we only note that a machine *M* as defined above induces a function \(N_{M}: \mathbb {B}^{*} \cup \mathbb {B}^{\omega } \rightarrow \mathbb {B}^{*} \cup \mathbb {B}^{\omega }\) by \(N_{M}(X) = \sup _{\preccurlyeq }\{ \sigma \in \mathbb {B}^{*}: \exists \rho \preccurlyeq X \left ((\rho ,\sigma ) \in M \right ) \}\) (cf. [7]).
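For a finite set of pairs, the monotonicity condition of Definition 2 and the induced map on finite inputs can be sketched directly. This is a toy model under stated assumptions: a real monotone machine is a c.e. (generally infinite) set, whereas here `machine` is a finite Python set of `(rho, sigma)` string pairs, and the names `is_monotone` and `output_on` are hypothetical.

```python
def comparable(a: str, b: str) -> bool:
    """True if one string is an initial segment of the other."""
    return a.startswith(b) or b.startswith(a)


def is_monotone(machine) -> bool:
    """Check: whenever rho1 ≼ rho2, the outputs sigma1 and sigma2 are comparable."""
    pairs = list(machine)
    for rho1, sigma1 in pairs:
        for rho2, sigma2 in pairs:
            if rho2.startswith(rho1) and not comparable(sigma1, sigma2):
                return False
    return True


def output_on(machine, x: str) -> str:
    """Longest sigma with (rho, sigma) in machine for some rho ≼ x.

    For a monotone machine these outputs form a chain under ≼,
    so the longest one is their supremum."""
    outputs = [sigma for rho, sigma in machine if x.startswith(rho)]
    return max(outputs, key=len, default="")
```

With `M = {("0", "1"), ("00", "10"), ("1", "")}`, the input `"001"` yields the output `"10"`.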

#### 1.2.2 Transformations

Imagine that we feed a monotone machine *M* a stream of input that is generated from a computable measure *μ*. As a result, machine *M* produces a (finite or infinite) stream of output. The probabilities for the possible initial segments of the output stream are themselves given by a semicomputable semimeasure (as can easily be verified). We will call this semimeasure the *transformation* of *μ* by *M*.
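A minimal sketch of this idea for a finite machine: the probability that the output extends *σ* is the measure of the inputs that have some description of an extension of *σ* as an initial segment. The code assumes a finite set of `(rho, sigma)` pairs and a premeasure given as a Python function; the names `transform` and `minimal_strings` are hypothetical. The measure of a union of cones is computed by first discarding every description that extends another one, so that the remaining cones are disjoint.

```python
from fractions import Fraction


def uniform(sigma: str) -> Fraction:
    """Premeasure of the uniform measure."""
    return Fraction(1, 2 ** len(sigma))


def minimal_strings(strings):
    """Keep only strings with no proper initial segment in the set.

    The cones of the result are pairwise disjoint and have the same union."""
    s = set(strings)
    return {r for r in s if not any(r != q and r.startswith(q) for q in s)}


def transform(m, machine, sigma: str) -> Fraction:
    """Measure of the inputs on which the machine outputs an extension of sigma."""
    descriptions = {rho for rho, out in machine if out.startswith(sigma)}
    return sum((m(rho) for rho in minimal_strings(descriptions)), Fraction(0))
```

For `M = {("0", "1"), ("00", "10"), ("1", "")}` under the uniform measure this gives the values 1, 1/2 and 1/4 on the empty string, `"1"` and `"10"` respectively, which satisfy the semimeasure inequalities.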

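### **Definition 3**

The *transformation* *μ* _{ M } *of computable measure* *μ* *by monotone machine* *M* is defined by

\[\mu _{M}(\sigma ) := \mu ([{\kern -2.3pt}[ \{ \rho \in \mathbb {B}^{*} : \exists \sigma ^{\prime } \succcurlyeq \sigma \left ((\rho ,\sigma ^{\prime }) \in M \right ) \} ]{\kern -2.3pt}]).\]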

#### 1.2.3 Characterizations of \(\mathcal {M}\)

For every semicomputable semimeasure *ν*, one can obtain a machine *M* that transforms the uniform measure *λ* to *ν*. Together with the straightforward converse that every function *λ* _{ M } defines a semicomputable semimeasure, this gives a characterization of the class \(\mathcal {M}\) of semicomputable semimeasures as

\[\mathcal {M} = \{ \lambda _{M} \}_{M}, \tag{1}\]

where {*λ* _{ M }}_{ M } is the class of functions *λ* _{ M } for all monotone machines *M*.

A proof of this fact by the construction of an *M* that transforms *λ* to given *ν* was first outlined by Levin in [23, Theorem 3.2]. (Also see [13, Theorem 4.5.2].) Moreover, it can be deduced from [23, Theorem 3.1(b), 3.2] that \(\mathcal {M}\) can be characterized as the class of transformations of computable measures other than *λ*. Namely, we have that \( \mathcal {M}\) coincides with {*μ* _{ M }}_{ M } for any computable *μ* that is continuous.

A detailed construction to prove the characterization (1) was published by Day [4, Theorem 4(ii)]. (Also see [5, Theorem 3.16.2(ii)].) The following proof of the case for any continuous computable measure is an adaptation of this construction.

### **Theorem 1** (Levin)

*For every continuous computable measure* *μ* *,* *and for every semicomputable semimeasure* *ν* *,* *there is a monotone machine* M *such that* *ν* = *μ* _{ M } *.*

### *Proof*

Let *ν* be any semicomputable semimeasure, with uniformly computable approximation functions *f* _{ t }. We construct in stages *s* = 〈*σ*,*t*〉 a monotone machine *M* that transforms *μ* into *ν*. Let \(D_{s}(\sigma ):=\{ \rho \in \mathbb {B}^{*} : (\rho ,\sigma ) \in M_{s} \}\).

*Construction* Let *M* _{0} := *∅*.

At stage *s* = 〈*σ*,*t*〉, if \(\mu ([{\kern -2.3pt}[ D_{s-1}(\sigma )]{\kern -2.3pt}])=f_{t}(\sigma )\) then let *M* _{ s } := *M* _{ s−1}.

Otherwise, first consider the case *σ*≠*𝜖*. By Lemma 1 in [4] there is a set \(R \subseteq \mathbb {B}^{s}\) of *available* strings of length *s* such that \([{\kern -2.3pt}[ R]{\kern -2.3pt}]=[{\kern -2.3pt}[ D_{s-1}(\sigma ^{-})]{\kern -2.3pt}] \setminus ([{\kern -2.3pt}[ D_{s-1}(\sigma ^{-} 0)]{\kern -2.3pt}] \cup [{\kern -2.3pt}[ D_{s-1}(\sigma ^{-} 1)]{\kern -2.3pt}])\). Denote \(x:=\mu ([{\kern -2.3pt}[ R ]{\kern -2.3pt}])\), the amount of measure available for descriptions for *σ*, which equals \(\mu ([{\kern -2.3pt}[ D_{s-1}(\sigma ^{-})]{\kern -2.3pt}]) - \mu ([{\kern -2.3pt}[ D_{s-1}(\sigma ^{-} 0)]{\kern -2.3pt}])- \mu ([{\kern -2.3pt}[ D_{s-1}(\sigma ^{-} 1)]{\kern -2.3pt}])\) because we ensure by construction that \([{\kern -2.3pt}[ D_{s-1}(\sigma ^{-})]{\kern -2.3pt}] \supseteq [{\kern -2.3pt}[ D_{s-1}(\sigma ^{-} 0)]{\kern -2.3pt}] \cup [{\kern -2.3pt}[ D_{s-1}(\sigma ^{-} 1)]{\kern -2.3pt}]\) and \([{\kern -2.3pt}[ D_{s-1}(\sigma ^{-} 0)]{\kern -2.3pt}] \cap [{\kern -2.3pt}[ D_{s-1}(\sigma ^{-} 1)]{\kern -2.3pt}] = \emptyset \). Denote \(y:= f_{t}(\sigma )-\mu ([{\kern -2.3pt}[ D_{s-1}(\sigma )]{\kern -2.3pt}])\), the amount of measure the current descriptions fall short of the latest approximation of *ν*(*σ*). We collect in the auxiliary set *A* _{ s } a number of available strings from *R* such that \(\mu ([{\kern -2.3pt}[ A_{s}]{\kern -2.3pt}])\) is maximal while still bounded by \(\min \{x,y\}\).

If *σ* = *𝜖*, then denote \(y:= f_{t}(\epsilon )-\mu ([{\kern -2.3pt}[ D_{s-1}(\epsilon )]{\kern -2.3pt}])\). Collect in *A* _{ s } a number of available strings from \(R \subseteq \mathbb {B}^{s}\) with \([{\kern -2.3pt}[ R]{\kern -2.3pt}]=\mathbb {B}^{\omega } \setminus [{\kern -2.3pt}[ D_{s-1}(\epsilon )]{\kern -2.3pt}]\) such that \(\mu ([{\kern -2.3pt}[ A_{s}]{\kern -2.3pt}])\) is maximal but bounded by *y*.

Put \(M_{s}:= M_{s-1} \cup \{ (\rho , \sigma ) : \rho \in A_{s} \}\).

*Verification* The verification of the fact that *M* is a monotone machine is identical to that in [4].

It remains to prove that *μ* _{ M }(*σ*) = *ν*(*σ*) for all \(\sigma \in \mathbb {B}^{*}\). Since by construction \([{\kern -2.3pt}[ D_{s}(\sigma ^{\prime }) ]{\kern -2.3pt}] \subseteq [{\kern -2.3pt}[ D_{s}(\sigma ) ]{\kern -2.3pt}]\) for any \(\sigma ^{\prime } \succcurlyeq \sigma \), we have that \(\mu _{M_{s}}(\sigma ) = \mu (\cup _{\sigma ^{\prime } \succcurlyeq \sigma } [{\kern -2.3pt}[ D_{s}(\sigma ^{\prime }) ]{\kern -2.3pt}]) = \mu ([{\kern -2.3pt}[ D_{s}(\sigma ) ]{\kern -2.3pt}])\). Hence \(\mu _{M}(\sigma ) = \lim _{s \rightarrow \infty } \mu ([{\kern -2.3pt}[ D_{s}(\sigma ) ]{\kern -2.3pt}])\), and our objective is to show that \(\lim _{s \rightarrow \infty } \mu ([{\kern -2.3pt}[ D_{s}(\sigma )]{\kern -2.3pt}])=\nu (\sigma )\). To that end it suffices to demonstrate that for every *δ* > 0 there is some stage *s* _{0} where \(\mu ([{\kern -2.3pt}[ D_{s_{0}}(\sigma )]{\kern -2.3pt}])>\nu (\sigma ) - \delta \). We prove this by induction.

For the base step, let *σ* = *𝜖*. Choose positive \(\delta ^{\prime } < \delta \). There will be a stage *s* _{0} = 〈*𝜖*,*t* _{0}〉 where \(f_{t_{0}}(\epsilon )>\nu (\epsilon )-\delta ^{\prime }\), and (since *μ* is continuous) \(\mu ([{\kern -2.3pt}[ \rho ]{\kern -2.3pt}])\leq \delta - \delta ^{\prime }\) for all \(\rho \in \mathbb {B}^{s_{0}}\). Then, if not already \(\mu ([{\kern -2.3pt}[ D_{s_{0}-1}(\epsilon )]{\kern -2.3pt}])> \nu (\epsilon )-\delta \), the latter guarantees that the construction will select a number of available strings in \(A_{s_{0}}\) such that \(\nu (\epsilon )-\delta < \mu ([{\kern -2.3pt}[ D_{s_{0}-1}(\epsilon )]{\kern -2.3pt}])+ \mu ([{\kern -2.3pt}[ A_{s_{0}} ]{\kern -2.3pt}]) \leq f_{t_{0}}(\epsilon )\). It follows that \(\mu ([{\kern -2.3pt}[ D_{s_{0}}(\epsilon )]{\kern -2.3pt}])= \mu ([{\kern -2.3pt}[ D_{s_{0}-1}(\epsilon )]{\kern -2.3pt}])+\mu ([{\kern -2.3pt}[ A_{s_{0}} ]{\kern -2.3pt}])> \nu (\epsilon )-\delta \) as required.

For the inductive step, let *σ* ≠ *𝜖*, and denote by \(\sigma ^{\prime }\) the one-bit extension of *σ* ^{−} with \(\sigma ^{\prime } \mid \sigma \). Choose positive \(\delta ^{\prime } < \delta \). By induction hypothesis, there exists a stage \(s_{0}^{\prime }\) such that \(\mu ([{\kern -2.3pt}[ D_{s_{0}^{\prime }}(\sigma ^{-})]{\kern -2.3pt}])> \nu (\sigma ^{-})-\delta ^{\prime }\). At this stage \(s_{0}^{\prime }\), we have

### **Corollary 1**

*For every continuous computable measure* *μ* *,* \(\{ \mu _{M} \}_{M} = \mathcal {M}\) *.*

### 1.3 Prefix-Free Machines and Discrete Semimeasures

The notions of a semicomputable *discrete* semimeasure on the finite strings and a *prefix-free* machine can be traced back to Levin [11] and Gács [6], and independently Chaitin [1].

### **Definition 4**

A semicomputable discrete semimeasure is a semicomputable function \(P: \mathbb {B}^{*} \rightarrow \mathbb {R}^{\geq 0}\) such that \({\sum }_{\sigma \in \mathbb {B}^{*}} P(\sigma ) \leq 1\).

### **Definition 5**

A prefix-free machine is a partial computable function \(T:\mathbb {B}^{*} \rightarrow \mathbb {B}^{*}\) with prefix-free domain.

### **Definition 6**

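The *transformation of computable measure* *μ* *by prefix-free machine* *T* is the semicomputable discrete semimeasure \(Q^{\mu }_{T}: \mathbb {B}^{*} \rightarrow [0,1]\) defined by

\[Q^{\mu }_{T}(\sigma ) := \mu ([{\kern -2.3pt}[ \{ \rho \in \mathbb {B}^{*} : (\rho ,\sigma ) \in T \} ]{\kern -2.3pt}]).\]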

Let \(\mathcal {P}\) denote the class of all semicomputable discrete semimeasures. Analogous to class \(\mathcal {M}\) and the monotone machines, class \(\mathcal {P}\) is characterized as the class of all prefix-free machine transformations of *μ*, for any continuous computable *μ*. The fact that every *P* can be obtained as a transformation of *λ* is usually inferred from the effective version of Kraft’s inequality (e.g., [5, p. 130], [14, Exercise 2.2.23]). However, we can easily prove the general case in a direct manner by a much simplified version of the construction for Theorem 1.

### **Proposition 1**

*For every continuous computable measure* *μ* *,* *and for every semicomputable discrete semimeasure* P*, there is a prefix-free machine* T *such that* \(P=Q^{\mu }_{T}\) *.*

### *Proof*

Let *P* be any semicomputable discrete semimeasure, with uniformly computable approximation functions *f* _{ t }. We construct a prefix-free machine *T* in stages *s* = 〈*σ*,*t*〉. Let \(D_{s}(\sigma )=\{ \rho \in \mathbb {B}^{*} : (\rho ,\sigma ) \in T_{s} \}\).

*Construction* Let *T* _{0} = *∅*.

At stage *s* = 〈*σ*,*t*〉, if \(\mu ([{\kern -2.3pt}[ D_{s-1}(\sigma )]{\kern -2.3pt}])=f_{t}(\sigma )\) then let *T* _{ s } := *T* _{ s−1}.

Otherwise, let the set \(R \subseteq \mathbb {B}^{s}\) of *available* strings be such that \([{\kern -2.3pt}[ R]{\kern -2.3pt}]= \mathbb {B}^{\omega } \setminus [{\kern -2.3pt}[ \cup _{\tau \in \mathbb {B}^{*}} D_{s-1}(\tau ) ]{\kern -2.3pt}]\). Collect in the auxiliary set *A* _{ s } a number of available strings *ρ* from *R* with \({\sum }_{\rho \in A_{s}} \mu ([{\kern -2.3pt}[\rho ]{\kern -2.3pt}])\) maximal but bounded by \(f_{t}(\sigma )-\mu ([{\kern -2.3pt}[ D_{s-1}(\sigma )]{\kern -2.3pt}])\), the amount of measure the current descriptions fall short of the latest approximation of *P*(*σ*). Put \(T_{s}:= T_{s-1} \cup \{ (\rho , \sigma ) : \rho \in A_{s} \}\).

*Verification* It is immediate from the construction that \(\cup _{\sigma \in \mathbb {B}^{*}} D_{s}(\sigma )\) is prefix-free at all stages *s*, so \(T = \lim _{s \rightarrow \infty } T_{s}\) is a prefix-free machine. To show that \(Q^{\mu }_{T}(\sigma )=\lim _{s \rightarrow \infty } \mu ([{\kern -2.3pt}[ D_{s}(\sigma )]{\kern -2.3pt}])\) equals *P*(*σ*) for all \(\sigma \in \mathbb {B}^{*}\), it suffices to demonstrate that for every *δ* > 0 there is some stage *s* _{0} where \(\mu ([{\kern -2.3pt}[ D_{s_{0}}(\sigma )]{\kern -2.3pt}])>P(\sigma ) - \delta \).

Choose positive \(\delta ^{\prime } < \delta \). There will be a stage *s* _{0} = 〈*σ*,*t* _{0}〉 with \(\mu ([{\kern -2.3pt}[ \rho ]{\kern -2.3pt}])\leq \delta - \delta ^{\prime }\) for all \(\rho \in \mathbb {B}^{s_{0}}\) and \(f_{t_{0}}(\sigma )>P(\sigma )-\delta ^{\prime }\). Clearly, the available *μ*-measure

Consequently, if not already \(\mu ([{\kern -2.3pt}[ D_{s_{0}-1}(\sigma )]{\kern -2.3pt}])> P(\sigma )-\delta \), then the construction collects in \(A_{s_{0}}\) a number of descriptions of length *s* _{0} from *R* such that \(\mu ([{\kern -2.3pt}[ D_{s_{0}}(\sigma )]{\kern -2.3pt}])= \mu ([{\kern -2.3pt}[ D_{s_{0}-1}(\sigma )]{\kern -2.3pt}])+{\sum }_{\rho \in A_{s_{0}}}\mu ([{\kern -2.3pt}[ \rho ]{\kern -2.3pt}]) > P(\sigma )-\delta \) as required.
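The stage-wise allocation in this proof can be sketched for a simplified special case. The code below assumes the uniform measure (so all available strings of length *s* have equal weight and the maximal collection is simply a count), a finitely supported *P* with exact rational values (so the approximations *f* _{ t } are taken to be constant), and a round-robin schedule in place of the pairing 〈*σ*,*t*〉. The names `build_prefix_free_machine` and `discrete_transform` are hypothetical; this is an illustration of the idea, not the general construction.

```python
from fractions import Fraction
from itertools import product


def discrete_transform(machine, sigma: str) -> Fraction:
    """Uniform measure of the descriptions of sigma (domain assumed prefix-free)."""
    return sum((Fraction(1, 2 ** len(rho)) for rho, out in machine if out == sigma),
               Fraction(0))


def build_prefix_free_machine(target, stages: int):
    """Greedy allocation of descriptions of length s at stage s.

    target maps output strings to exact Fractions summing to at most 1."""
    sigmas = sorted(target)
    machine = set()  # pairs (rho, sigma)
    for s in range(1, stages + 1):
        sigma = sigmas[(s - 1) % len(sigmas)]  # round-robin over the targets
        shortfall = target[sigma] - discrete_transform(machine, sigma)
        if shortfall <= 0:
            continue
        used = [rho for rho, _ in machine]
        # Available strings: length s, not extending any existing description.
        available = [r for r in ("".join(b) for b in product("01", repeat=s))
                     if not any(r.startswith(u) for u in used)]
        count = int(shortfall * 2 ** s)  # as many as fit under the shortfall
        for rho in available[:count]:
            machine.add((rho, sigma))
    return machine
```

After a stage of length *s* that treats *σ*, the remaining shortfall for *σ* is below 2^{−s}, which mirrors the role of the bound on cone measures in the verification.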

### **Corollary 2**

*For every continuous computable measure* *μ* *,* \(\{ Q^{\mu }_{T} \}_{T} = \mathcal {P}\) *.*

## 2 The A Priori Semimeasures

In this section we show that the class of a priori semimeasures can be characterized as the class of universal transformations of any continuous computable measure. Section 2.1 introduces the class of a priori semimeasures. Section 2.2 is an interlude devoted to the representation of the a priori semimeasures as *universal mixtures*. Section 2.3 presents the generalized characterization, and concludes with a brief discussion of how this reflects on the association with foundational principles.

### 2.1 A Priori Semimeasures

#### 2.1.1 Universal Machines

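Fix a computable enumeration \(\{ M_{e} \}_{e \in \mathbb {N}}\) of all monotone machines, together with a computable prefix-free encoding \(\{ \rho _{e} \}_{e \in \mathbb {N}}\) of the indices. A monotone machine *U* is *universal (by adjunction)* if for some such encoding \(\{ \rho _{e} \}_{e \in \mathbb {N}}\), we have for all \(\rho ,\sigma \in \mathbb {B}^{*}\) that

\[(\rho _{e} \rho ,\sigma ) \in U \Leftrightarrow (\rho ,\sigma ) \in M_{e}.\]

Universality by adjunction implies *weak universality*, which is the more general property that for all *M* there is a \(c_{M} \in \mathbb {N}\) such that

\[(\rho ,\sigma ) \in M \Rightarrow \exists \rho ^{\prime } \left ( (\rho ^{\prime },\sigma ) \in U \ \& \ |\rho ^{\prime }| \leq |\rho | + c_{M} \right ) \quad \text {for all } \rho ,\sigma \in \mathbb {B}^{*}.\]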

#### 2.1.2 A Priori Semimeasures

We call a transformation by a universal machine a *universal transformation*. The a priori semimeasures are the universal transformations of the uniform measure.

### **Definition 7**

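An *a priori semimeasure* is a transformation *λ* _{ U } of the uniform measure *λ* by a universal monotone machine *U*.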

Let \(\mathcal {A}\) denote the class {*λ* _{ U }}_{ U } of a priori semimeasures. The next result implies that every element of \(\mathcal {A}\) can also be obtained as the transformation of *λ* by a machine that is *not* universal.

### **Proposition 2**

*For every continuous computable measure* *μ* *,* *there is for every semicomputable semimeasure* *ν* *a* non-universal *monotone machine* M *such that* *ν* = *μ* _{ M } *.*

### *Proof*

Let *U* be an arbitrary universal machine. We will adapt the construction of Theorem 1 of a machine *M* with *μ* _{ M } = *ν* in such a way that for every constant \(c\in \mathbb {N}\) there is a *σ* such that for some \(\rho ^{\prime }\) with \((\rho ^{\prime },\sigma ) \in U\), we have that \(| \rho | > |\rho ^{\prime }| + c\) for all *ρ* with (*ρ*,*σ*) ∈ *M*. This ensures that *M* is not even weakly universal.

*Construction* The only change to the earlier construction is that at stage *s* we try to collect available strings of length *l* _{ s }, where *l* _{ s } is defined as follows. Let *l* _{0} = 0. For *s* = 〈*σ*,*t*〉 with *t* > 0, let *l* _{ s } = *l* _{ s−1} + 1. In case *s* = 〈*σ*,0〉, enumerate pairs in *U* until a pair \((\rho ^{\prime }, \sigma )\) for some \(\rho ^{\prime }\) is found. Let \(l_{s} := \max \{l_{s-1}+1,|\rho ^{\prime }|+s \}\).

*Verification* The verification that *μ* _{ M } = *ν* proceeds as before. In addition, the construction guarantees that for every \(c \in \mathbb {N}\), we have for *σ* with *c* = 〈*σ*,0〉 that \(| \rho | > |\rho ^{\prime }| + c\) for the first enumerated \(\rho ^{\prime }\) with \((\rho ^{\prime },\sigma ) \in U\) and all *ρ* with (*ρ*,*σ*) ∈ *M*.

We define a *discrete* a priori semimeasure in like manner.

### **Definition 8**

A *discrete a priori semimeasure* is the transformation of *λ* by a *universal* prefix-free machine *U*, meaning that *U* is defined by

### 2.2 Universal Mixtures

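Every a priori semimeasure can be written as a *universal mixture*

\[\xi _{W}(\cdot ) := {\sum }_{i \in \mathbb {N}} W(i) \nu _{i}(\cdot )\]

over an enumeration \(\{ \nu _{i} \}_{i \in \mathbb {N}}\) of all semicomputable semimeasures, with a semicomputable *weight function* \(W: \mathbb {N} \rightarrow [0,1]\) that satisfies \({\sum }_{i \in \mathbb {N}} W(i) \leq 1\) and *W*(*i*) > 0 for all *i*. Conversely, one can show that every universal mixture equals *λ* _{ U } for some universal machine *U* [22].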

It is easy to see from the mixture form of the a priori semimeasures that every element of \(\mathcal {A}\) is *universal* in the sense that it *dominates* every other semicomputable semimeasure. That is, for every \(\lambda _{U} \in \mathcal {A}\) there is for every \(\nu \in \mathcal {M}\) a constant \(c_{\nu } \in \mathbb {N}\), depending only on *λ* _{ U } and *ν*, such that \(\lambda _{U}(\sigma ) \geq c_{\nu }^{-1} \nu (\sigma )\) for all \(\sigma \in \mathbb {B}^{*}\). The converse does not hold: not all universal elements of \(\mathcal {M}\) are of the form *λ* _{ U } or equivalently *ξ* _{ W }. For instance, the sum of *ξ* _{ W }(*σ*) for all strings *σ* of the same length *n* will always fall short of 1 (because it does so for some semimeasures), but we can readily define a universal \(\kappa \in \mathcal {M}\) with (say) *κ*(*σ*) = *λ*(*σ*) for all *σ* up to a finite length *n*.
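The domination property is easy to see numerically on a finite toy mixture. The sketch below assumes three Bernoulli premeasures as components and weights that sum to less than 1; a genuine universal mixture ranges over all of \(\mathcal {M}\), which is not modelled here, and the names `bernoulli` and `mixture` are hypothetical. Each component *ν* _{ i } is dominated with constant 1/*W*(*i*), simply because the mixture contains the term *W*(*i*) *ν* _{ i }.

```python
from fractions import Fraction
from itertools import product


def bernoulli(p: Fraction):
    """Premeasure of the i.i.d. measure that emits a 1 with probability p."""
    def m(sigma: str) -> Fraction:
        ones = sigma.count("1")
        return p ** ones * (1 - p) ** (len(sigma) - ones)
    return m


def mixture(weights, components):
    """Return xi(sigma) = sum_i W(i) * nu_i(sigma) for finitely many components."""
    def xi(sigma: str) -> Fraction:
        return sum((w * nu(sigma) for w, nu in zip(weights, components)),
                   Fraction(0))
    return xi


weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]
components = [bernoulli(Fraction(1, 4)), bernoulli(Fraction(1, 2)),
              bernoulli(Fraction(3, 4))]
xi = mixture(weights, components)

# Domination: xi(sigma) >= W(i) * nu_i(sigma) for every component i.
strings = ["".join(b) for n in range(6) for b in product("01", repeat=n)]
dominates = all(xi(s) >= w * nu(s)
                for s in strings for w, nu in zip(weights, components))
```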

The aim of the current subsection is to strengthen the above statement of the equivalence of the a priori semimeasures and the universal mixtures, as follows.

First, let us call an enumeration \(\{ \nu _{i} \}_{i \in \mathbb {N}}\) of all semicomputable semimeasures *acceptable* if it is generated from an enumeration {*M* _{ i }}_{ i } of all monotone Turing machines by the procedure of Theorem 1, i.e., \(\nu _{i} = \lambda _{M_{i}}\). This terminology matches that of the definition of *acceptable numberings* of the partial computable functions [18, p. 41]. Every effective listing of all Turing machines yields an acceptable numbering. Importantly, any two acceptable numberings differ only by a computable permutation [17]; in our case, for any two acceptable enumerations {*ν* _{ i }}_{ i } and \(\{ \bar {\nu }_{i} \}_{i}\) there is a computable permutation \(f: \mathbb {N}\rightarrow \mathbb {N}\) of indices such that \(\bar {\nu }_{i} = \nu _{f(i)}\).

Furthermore, let us call a semicomputable weight function *W* *proper* if \({\sum }_{i} W(i) = 1\); this implies that *W* is computable.

Then we can show that for any acceptable enumeration of all semicomputable semimeasures, all elements in \(\mathcal {A}\) are expressible as some mixture with a *proper* weight function over *this* enumeration.

### **Proposition 3**

*For every acceptable enumeration* {*ν* _{ i }}_{ i } *of* \(\mathcal {M}\) *,* *every element in* \(\mathcal {A}\) *is equal to* \(\xi _{W}(\cdot ) = {\sum }_{i} W(i) \nu _{i}(\cdot )\) *for some proper* W*.*

### *Proof*

Let \(\lambda _{U} \in \mathcal {A}\) be given, with {*M* _{ i }}_{ i } the enumeration of all monotone machines corresponding to *U*. We know that *λ* _{ U } is equal to \(\bar {\xi }_{\bar {W}}(\cdot ) = {\sum }_{i} \bar {W}(i) \bar {\nu }_{i}(\cdot )\) for some acceptable enumeration \(\{ \bar {\nu }_{i} \}_{i}=\{ \lambda _{M_{i}}\}_{i}\) of \(\mathcal {M}\) and semicomputable weight function \(\bar {W}\). First we show that \(\bar {\xi }_{\bar {W}}\) is equal to \(\xi _{W^{\prime }}(\cdot ) = {\sum }_{i} W^{\prime }(i) \nu _{i}(\cdot )\) for the given acceptable enumeration {*ν* _{ i }}_{ i } and a semicomputable \(W^{\prime }\); then we show that it is also equal to \(\xi _{W}(\cdot ) = {\sum }_{i} W(i) \nu _{i}(\cdot )\) for a proper *W*.

Since {*ν* _{ i }}_{ i } and \(\{ \bar {\nu }_{e} \}_{e}\) are both acceptable, there is a 1-1 computable *f* such that \(\bar {\nu }_{i} = \nu _{f(i)}\). Then

We proceed with the description of a proper W . The idea is to have W assign to each i a positive computable weight that does not exceed \(W^{\prime }(i)\), additional computable weight to the index of a single suitably defined semimeasure in order to regain the original mixture, and all of the remaining weight to an “empty” semimeasure.

*δ*> 0 we have a \(j \in \mathbb {N}\) with \({\sum }_{i>j}2^{-i-c} < \delta \), hence \({\sum }_{i \leq j} g(i) < {\sum }_{i} g(i) < {\sum }_{i \leq j} g(i) + \delta \).

Next, define \(\pi (\cdot ) = q^{-1} {\sum }_{i} \left (W^{\prime }(i)-g(i) \right ) \nu _{i}(\cdot )\). This is a semimeasure because \(\pi (\epsilon ) \leq q^{-1} \xi _{W^{\prime }}(\epsilon ) < q^{-1}q = 1\). Let *k* be such that *ν* _{ k } = *π*, and let *l* be such that *ν* _{ l } is the “empty” semimeasure with *ν* _{ l }(*σ*) = 0 for all \(\sigma \in \mathbb {B}^{*}\) (both indices exist even if we cannot effectively find them).

Finally, define *W* by

*W* is computable and indeed proper, and

□

As a kind of converse, we can derive that any universal mixture is also equal to a universal mixture with a *universal* weight function, i.e., a weight function *W* such that for all other \(W^{\prime }\) there is a \(c_{W^{\prime }}\) with \(W(i) \geq c_{W^{\prime }}^{-1} W^{\prime }(i)\) for all *i*.

### **Proposition 4**

*For every acceptable enumeration* {*ν* _{ i }}_{ i } *of* \(\mathcal {M}\) *,* *every element in* \(\mathcal {A}\) *is equal to* \(\xi _{W}(\cdot ) = {\sum }_{i} W(i) \nu _{i}(\cdot )\) *for some universal* W*.*

### *Proof*

Take the given acceptable enumeration {*ν* _{ i }}_{ i }. Let *k* be such that \(\nu _{k}(\cdot ) = {\sum }_{i} 2^{-K(i)} \nu _{i}(\cdot )\), with *K*(*i*) the prefix-free Kolmogorov complexity (via some universal prefix-free machine *U*) of the *i*-th lexicographically ordered string; 2^{−K(⋅)} is a universal weight function. Define

*W* is universal because 2^{−K(⋅)} is, and

□

Hutter [8, p. 102–03] argues that a universal mixture with weight function 2^{−K(i)} is *optimal* among all universal mixtures, essentially because this weight function is universal. The above result shows that this optimality is meaningless: *every* universal mixture can be represented so as to have a universal weight function.

### 2.3 The Generalized Characterization

We are now ready to show that the universal transformations of any continuous computable measure *μ* yield the same class \(\mathcal {A}\) of a priori semimeasures. A minor caveat is that we will need to restrict the universal machines *U* to those machines with associated encodings {*ρ* _{ e }}_{ e } that do not receive measure 0 from *μ*: so \(\mu ([{\kern -2.3pt}[ \rho _{e} ]{\kern -2.3pt}])>0\) for all \(e \in \mathbb {N}\). Call (the associated encodings of) those machines *compatible* with measure *μ*. This is clearly no restriction for measures that give positive probability to every finite string (such as the uniform measure): all machines are compatible with such measures.

We will prove:

### **Theorem 2**

*Let* \(\mu , \bar {\mu }\) *be continuous computable measures. For any* *universal machine* U *that is compatible with* *μ* *,* *there is a universal machine* V *such that* \(\mu _{U} = \bar {\mu }_{V}\) *.*

It follows that \(\{\mu _{U} \}_{U} = \{ \bar {\mu }_{V} \}_{V}\) for any two continuous computable *μ* and \(\bar {\mu }\), with *U* ranging over those universal machines compatible with *μ* and *V* over those universal machines compatible with \(\bar {\mu }\). In particular, since *λ* is itself a continuous computable measure, we have that \(\{\mu _{U} \}_{U} = \mathcal {A}\).

Our proof strategy is to expand the approach taken in [22] to show the coincidence of the a priori semimeasures and the universal mixtures. Let us first derive the fact that a universal transformation of *μ* is an a priori semimeasure.

### **Proposition 5**

*Let* *μ* *be a continuous computable measure and let* U *be a universal machine compatible with* *μ* *.* *Then* \(\mu _{U} \in \mathcal {A}\) *.*

The proof rests on a fixed-point lemma that is a refined version of Corollary 1. For given encoding {*ρ* _{ e }}_{ e }, define \(\mu ^{\rho _{e}}(\cdot ):= \mu (\cdot \mid [{\kern -2.3pt}[ \rho _{e} ]{\kern -2.3pt}])\) for any \(e \in \mathbb {N}\). Here the conditional measure \(\mu ([{\kern -2.3pt}[ \tau ]{\kern -2.3pt}] \mid [{\kern -2.3pt}[ \sigma ]{\kern -2.3pt}] ) := \frac {\mu ([{\kern -2.3pt}[ \sigma \tau ]{\kern -2.3pt}])}{\mu ([{\kern -2.3pt}[ \sigma ]{\kern -2.3pt}])}\) for any \(\sigma , \tau \in \mathbb {B}^{*}\). Let \(\mu ^{\rho }_{M}\) denote the transformation of *μ* ^{ ρ } by *M*.
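The conditional measure and the compatibility requirement can be illustrated at the level of premeasures. In the sketch below, `conditional` and `is_compatible` are hypothetical names, and `half_line` is an example measure, invented for this illustration, that puts all its mass on the cone of `"0"` and is therefore incompatible with any encoding containing a string that starts with `1`.

```python
from fractions import Fraction


def bernoulli(p: Fraction):
    """Premeasure of the i.i.d. measure that emits a 1 with probability p."""
    def m(sigma: str) -> Fraction:
        ones = sigma.count("1")
        return p ** ones * (1 - p) ** (len(sigma) - ones)
    return m


def conditional(m, rho: str):
    """Premeasure of mu(. | [[rho]]): tau -> m(rho tau) / m(rho)."""
    base = m(rho)
    if base == 0:
        raise ValueError("conditioning on a cone of measure 0")
    def m_rho(tau: str) -> Fraction:
        return m(rho + tau) / base
    return m_rho


def is_compatible(m, codes) -> bool:
    """True if every code word receives positive measure."""
    return all(m(rho) > 0 for rho in codes)


def half_line(sigma: str) -> Fraction:
    """Uniform measure concentrated on the infinite strings that start with 0."""
    if sigma == "":
        return Fraction(1)
    if sigma[0] == "1":
        return Fraction(0)
    return Fraction(1, 2 ** (len(sigma) - 1))
```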

### **Lemma 1**

*Given encoding* \(\{ \rho _{e} \}_{e \in \mathbb {N}}\) *of the monotone machines as above. For every continuous computable measure* *μ* *,* \(\{ \mu ^{\rho _{e}}_{M_{e}} \}_{e \in \mathbb {N}} = \mathcal {M}\) *.*

### *Proof*

Let *ν* be any semicomputable semimeasure. Since \(\mu ^{\rho _{e}}\) is obviously a continuous computable measure for every \(e \in \mathbb {N}\), by the construction of Theorem 1 we obtain for every *e* a monotone machine *M* with \(\nu =\mu ^{\rho _{e}}_{M}\). Indeed, there is a total computable function \(g: \mathbb {N} \rightarrow \mathbb {N}\) that for given *e* retrieves an index *g*(*e*) in the given enumeration \(\{ M_{e} \}_{e \in \mathbb {N}}\) such that \(\nu =\mu ^{\rho _{e}}_{M_{g(e)}}\). But by Kleene’s Recursion Theorem [18], there must be a fixed point \(\hat {e}\) such that \(M_{g(\hat {e})}=M_{\hat {e}}\), hence \(\mu ^{ \rho _{\hat {e}}}_{M_{\hat {e}}}=\mu ^{\rho _{\hat {e}}}_{M_{g(\hat {e})}}=\nu \).

This shows that for every *ν* there is an index e such that \(\nu =\mu ^{\rho _{e}}_{M_{e}}\). Conversely, the function \(\mu ^{\rho _{e}}_{M_{e}}\) is a semicomputable semimeasure for every e. □

### *Proof of Proposition 5*

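Given continuous computable *μ* and universal U compatible with *μ*. We write out

\[\mu _{U}(\cdot ) = \sum _{e} \mu ([{\kern -2.3pt}[ \rho _{e} ]{\kern -2.3pt}]) \, \mu ^{\rho _{e}}_{M_{e}}(\cdot ).\]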

Lemma 1 tells us that the \(\mu ^{\rho _{e}}_{M_{e}}\) range over all elements in \(\mathcal {M}\). Moreover, \(W(e):= \mu ([{\kern -2.3pt}[ \rho _{e} ]{\kern -2.3pt}])\) is a weight function because {*ρ* _{ e }}_{ e } is prefix-free and U is compatible with *μ*, so *μ* _{ U } is a universal mixture. □
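The role of prefix-freeness in making \(W(e) = \mu ([{\kern -2.3pt}[ \rho _{e} ]{\kern -2.3pt}])\) a weight function can be checked on a toy case. The sketch below assumes that a weight function must take strictly positive values summing to at most 1. The code strings \(\rho _{e} = 1^{e}0\), the Bernoulli measure, and all function names are hypothetical choices made for illustration only.

```python
from fractions import Fraction

def bernoulli(p_one: Fraction):
    """Cylinder measure of the i.i.d. measure giving bit 1 probability p_one."""
    def mu(sigma: str) -> Fraction:
        ones = sigma.count("1")
        return p_one ** ones * (1 - p_one) ** (len(sigma) - ones)
    return mu

def is_prefix_free(codes) -> bool:
    """True if no code string is a proper prefix of another."""
    return not any(a != b and b.startswith(a) for a in codes for b in codes)

mu = bernoulli(Fraction(1, 3))
codes = ["1" * e + "0" for e in range(20)]  # hypothetical rho_e = 1^e 0
weights = [mu(rho) for rho in codes]        # W(e) = mu([[rho_e]])
total = sum(weights)

print(is_prefix_free(codes), all(w > 0 for w in weights), total <= 1)

# Without prefix-freeness the cylinders overlap and the sum can exceed 1.
overlapping = ["0", "00", "01", "1"]
print(is_prefix_free(overlapping), sum(mu(s) for s in overlapping))
```

The cylinders of a prefix-free set are pairwise disjoint, which is why their measures cannot add up to more than 1 under any measure. Positivity of each weight is a separate requirement on the pair of encoding and measure.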

We now proceed to prove that every universal transformation of *μ* indeed equals some universal transformation of \(\bar {\mu }\).

### *Proof of Theorem 2*

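Given continuous computable *μ* and \(\bar {\mu }\), and universal U compatible with *μ*. Write out as before

\[\mu _{U}(\cdot ) = \sum _{e} \mu ([{\kern -2.3pt}[ \rho _{e} ]{\kern -2.3pt}]) \, \mu ^{\rho _{e}}_{M_{e}}(\cdot ).\]

Let \(P(\rho _{e}) := \mu ([{\kern -2.3pt}[ \rho _{e} ]{\kern -2.3pt}])\) for each *e*, and take a prefix-free machine *T* that transforms \(\bar {\mu }\) into *P*: so \(Q^{\bar {\mu }}_{T} = P\). Denote by \(n_{e} := \# \{ \tau : (\tau , \rho _{e}) \in T \}\) the number of *T*-descriptions of \(\rho _{e}\), and let \(\langle \cdot ,\cdot \rangle : \mathbb {N} \times \mathbb {N} \rightarrow \mathbb {N}\) be a partial computable pairing function that maps the pairs (*e*,*i*) with \(i < n_{e}\) onto \(\mathbb {N}\). Let \(\tau _{\langle e,i \rangle }\) be the *i*-th enumerated *T*-description of \(\rho _{e}\). We then have

\[\mu _{U}(\cdot ) = \sum _{\langle e,i \rangle } \bar {\mu }([{\kern -2.3pt}[ \tau _{\langle e,i \rangle } ]{\kern -2.3pt}]) \, \mu ^{\rho _{e}}_{M_{e}}(\cdot ).\]

For every 〈*e*,*i*〉 for which \(\tau _{\langle e,i \rangle }\) becomes defined we can run the construction of Theorem 1 on \(\bar {\mu }^{\tau _{\langle e,i \rangle }}\) and \(\mu ^{\rho _{e}}_{M_{e}}\). In this way we obtain an enumeration of machines \(\{ N_{d} \}_{d}\) such that \(\bar {\mu }^{\tau _{\langle e,i \rangle }}_{N_{\langle e,i \rangle }}=\mu ^{\rho _{e}}_{M_{e}}\) (with \(i < n_{e}\)) for all *e*. Then

\[\mu _{U}(\cdot ) = \sum _{d} \bar {\mu }([{\kern -2.3pt}[ \tau _{d} ]{\kern -2.3pt}]) \, \bar {\mu }^{\tau _{d}}_{N_{d}}(\cdot ) = \bar {\mu }_{V}(\cdot ),\]

where we define *V* by \((\tau _{d} \rho ,\sigma ) \in V :\Leftrightarrow (\rho ,\sigma ) \in N_{d}\).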

It remains to verify that *V* is in fact universal. We cannot take for granted that \(\{ N_{d} \}_{d}\) is an enumeration of *all* machines, so it is not yet clear that *V* is universal.^{2} It would be enough if \(\{ N_{d} \}_{d \in \mathbb {N}}\) contained a single universal machine \(V^{\prime }\), but even that is not obvious (by Proposition 2 we know that for every continuous computable *μ* and every universal *U* there are *non*-universal *N* such that \(\mu _{N} = \mu _{U}\)).

However, there is a simple patch to the enumeration that guarantees this fact. Namely, given an arbitrary universal machine \(V^{\prime }\), we may simply put \(N_{d} := V^{\prime }\) at some *d* = 〈*e*,*i*〉 where it so happens that \(\bar {\mu }^{\tau _{\langle e,i \rangle }}_{V^{\prime }} = \mu ^{\rho _{e}}_{M_{e}}\). Our final objective is thus to show that \(\bar {\mu }^{\tau _{\langle e,i \rangle }}_{V^{\prime }} = \mu ^{\rho _{e}}_{M_{e}}\) for some *e*,*i*. To that end, define a computable function \(g: \mathbb {N} \rightarrow \mathbb {N}\) by \(\mu ^{\rho _{e}}_{M_{g(e)}} = \bar {\mu }^{\tau _{\langle e,0 \rangle }}_{V^{\prime }}\). Since \(Q^{\bar {\mu }}_{T}(\rho _{e}) > 0\) for each *e*, the string \(\tau _{\langle e,0 \rangle }\) is defined for each *e*. Hence \(\bar {\mu }^{\tau _{\langle e,0 \rangle }}_{V^{\prime }}\) is defined, and *g*, which retrieves the index *g*(*e*) of a machine that transforms \(\mu ^{\rho _{e}}\) to this semimeasure, is total. Then by the Recursion Theorem there is an index \(\hat {e}\) such that \(M_{\hat {e}}=M_{g(\hat {e})}\), so \(\mu ^{\rho _{\hat {e}}}_{M_{\hat {e}}} = \mu ^{\rho _{\hat {e}}}_{M_{g(\hat {e})}}=\bar {\mu }^{\tau _{\langle \hat {e},0 \rangle }}_{V^{\prime }}\). □
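The weight bookkeeping in the proof of Theorem 2, where the weight of each code string \(\rho _{e}\) is split over its *T*-descriptions and the pairs (*e*,*i*) are then flattened into a single index *d*, can be checked on a finite toy case. Everything below is a hand-built assumption for illustration: the two Bernoulli measures, the three code strings, the table `T`, and the stand-in components. It does not implement the construction of Theorem 1, and the components are not re-expressed as transformations by machines \(N_{d}\). It only verifies that the two ways of writing the mixture agree.

```python
from fractions import Fraction

def bernoulli(p_one: Fraction):
    """Cylinder measure of the i.i.d. measure giving bit 1 probability p_one."""
    def mu(sigma: str) -> Fraction:
        ones = sigma.count("1")
        return p_one ** ones * (1 - p_one) ** (len(sigma) - ones)
    return mu

mu = bernoulli(Fraction(1, 4))      # plays the role of mu
mu_bar = bernoulli(Fraction(1, 2))  # plays the role of mu-bar (here uniform)

codes = ["0", "10", "11"]           # hypothetical code strings rho_e
# Hypothetical table T: the descriptions tau_<e,i> of each rho_e, chosen so
# that their mu-bar-measures add up to mu([[rho_e]]).
T = {"0": ["0", "10"], "10": ["110", "1110"], "11": ["1111"]}

# Stand-ins for the components mu^{rho_e}_{M_e}; any semimeasures would do.
components = [bernoulli(Fraction(k, 5)) for k in (1, 2, 3)]

def mixture_by_codes(sigma: str) -> Fraction:
    """Sum over e of mu([[rho_e]]) times the e-th component."""
    return sum(mu(rho) * nu(sigma) for rho, nu in zip(codes, components))

# Flatten the pairs (e, i) into one list, as the pairing function does.
flat = [(tau, components[e]) for e, rho in enumerate(codes) for tau in T[rho]]

def mixture_by_descriptions(sigma: str) -> Fraction:
    """Sum over d of mu-bar([[tau_d]]) times the component attached to d."""
    return sum(mu_bar(tau) * nu(sigma) for tau, nu in flat)

print(mixture_by_codes("0110") == mixture_by_descriptions("0110"))
```

In this toy case each weight is a finite sum of cylinder measures. In general the number \(n_{e}\) of descriptions may be infinite, which is why the proof works with an enumeration rather than a finite table.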

### **Corollary 3**

*For any continuous computable* *μ* *, and* U *ranging over those universal machines that are compatible with* *μ* *,*

\[\mathcal {A} = \{ \mu _{U} \}_{U}.\]

Discrete versions of the above results are derived in an identical manner. Ultimately, we have the following discrete analogue to Corollary 3, where we let \(\mathcal {Q}\) denote the class of all discrete a priori semimeasures.

### **Proposition 6**

*For any continuous computable* *μ* *, and* U *ranging over those prefix-free machines that are compatible with* *μ* *,*

\[\mathcal {Q} = \{ Q^{\mu }_{U} \}_{U}.\]

#### 2.3.1 Discussion

We now return to the association of the function *λ* _{ U } (as well as its discrete counterpart \(Q^{\lambda }_{U}\)) with foundational principles.

First, there is the association with the principle of *insufficient reason* or *indifference*. This is the principle that in the absence of discriminating evidence, probability should be equally distributed over all possibilities. Solomonoff writes, “If we consider the input sequence to be the ‘cause’ of the observed output sequence, and we consider all input sequences of a given length to be equiprobable (since we have no a priori reason to prefer one rather than the other) then we obtain the present model of induction.” [20, p. 19]. Also see [12, 16].

Second, there is the association with Occam’s razor. Solomonoff writes, “That [this model] might be valid is suggested by ‘Occam’s razor,’ one interpretation of which is that the more ‘simple’ or ‘economical’ of several hypotheses is the more likely \(\dots \) —the most ‘simple’ hypothesis being that with the shortest ‘description.’” [20, p. 3]. Also see [2, 9, 13, 15, 21].

Note that so stated, these associations very much rely on the fact that the uniform measure *λ* always assigns larger probability to shorter strings, and equal probability to equal-length strings. This is a unique feature of *λ*. The results of this paper, however, show that the choice of the uniform measure in defining algorithmic probability is only circumstantial: we could pick *any* continuous computable measure, and still obtain, as the universal transformations of *this* measure instead of *λ*, the very same class of a priori semimeasures. This suggests that properties derived from the presence of *λ* in the definition are artifacts of a particular choice of characterization rather than an indicative property of algorithmic probability, and hence undermines both associations insofar as they indeed hinge on the uniform measure.
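The two properties of *λ* invoked above can be checked numerically on short strings. The sketch below compares the uniform measure with one other computable measure, a Bernoulli measure with bias 1/3. That comparison measure, the length bound, and the function names are assumptions made for illustration. The check covers a single alternative on strings up to length 4 and proves nothing in general.

```python
from fractions import Fraction
from itertools import product

def bernoulli(p_one: Fraction):
    """Cylinder measure of the i.i.d. measure giving bit 1 probability p_one."""
    def mu(sigma: str) -> Fraction:
        ones = sigma.count("1")
        return p_one ** ones * (1 - p_one) ** (len(sigma) - ones)
    return mu

def strings(n: int):
    """All bit strings of length n."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def equal_on_equal_length(mu, max_len: int = 4) -> bool:
    """True if strings of the same length (up to max_len) get equal measure."""
    return all(len({mu(s) for s in strings(n)}) == 1
               for n in range(1, max_len + 1))

def shorter_always_likelier(mu, max_len: int = 4) -> bool:
    """True if every string gets more measure than every string one bit longer."""
    return all(min(mu(s) for s in strings(n)) > max(mu(t) for t in strings(n + 1))
               for n in range(1, max_len))

uniform = bernoulli(Fraction(1, 2))
biased = bernoulli(Fraction(1, 3))
print(equal_on_equal_length(uniform), shorter_always_likelier(uniform))
print(equal_on_equal_length(biased), shorter_always_likelier(biased))
```

Under the biased measure the string 00 already receives more probability (4/9) than the shorter string 1 (1/3), so neither property holds for it, even though by the results above its universal transformations yield the same class of a priori semimeasures.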

## Footnotes

- 1. Semimeasures as defined here are often referred to as *continuous* semimeasures, in contradistinction to the *discrete* semimeasures defined in Section 1.3 below (cf. [5, 13]). Due to the possibility of confusion with the earlier meaning of “continuous” as synonymous to “nonatomic,” we will avoid this usage here.
- 2. This is also an (overlooked) issue in the original proof [22, Lemma 4]. It is easily resolved by the same approach we will be taking here, where it is immediate that for given universal *V* there is an *e* with \(\lambda _{V} = \nu _{e}\).

## Notes

### Acknowledgements

This research was supported by NWO Vici project 639.073.904. I am grateful to the anonymous reviewers for their thoughtful comments, to Alexander Shen for valuable remarks on an earlier version of this paper, to Peter Grünwald, Jan Leike, and Daniël Noom for helpful discussions, and to Jeanne Peijnenburg for the question that initiated this work.

## References

- 1. Chaitin, G. J.: A theory of program size formally identical to information theory. J. Assoc. Comput. Mach. **22**(3), 329–340 (1975)
- 2. Cover, T. M., Thomas, J. A.: Elements of Information Theory, 2nd edn. Wiley, Hoboken (2006)
- 3. Day, A. R.: On the computational power of random strings. Annals of Pure and Applied Logic **160**, 214–228 (2009)
- 4. Day, A. R.: Increasing the gap between descriptional complexity and algorithmic probability. Trans. Am. Math. Soc. **363**(10), 5577–5604 (2011)
- 5. Downey, R. G., Hirschfeldt, D. R.: Algorithmic randomness and complexity. Springer, New York (2010)
- 6. Gács, P.: On the symmetry of algorithmic information. Soviet Mathematics Doklady **15**(5), 1477–1480 (1974)
- 7. Gács, P.: Expanded and improved proof of the relation between description complexity and algorithmic probability. Unpublished manuscript (2016)
- 8. Hutter, M.: Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer, Berlin (2005)
- 9. Hutter, M.: On universal prediction and Bayesian confirmation. Theor. Comput. Sci. **384**(1), 33–48 (2007)
- 10. Levin, L. A.: On the notion of a random sequence. Soviet Mathematics Doklady **14**(5), 1413–1416 (1973)
- 11. Levin, L. A.: Laws of information conservation (nongrowth) and aspects of the foundation of probability theory. Probl. Inf. Transm. **10**(3), 206–210 (1974)
- 12. Li, M., Vitányi, P. M. B.: Philosophical issues in Kolmogorov complexity. In: Kuich, W. (ed.) Proceedings of the 19th International Colloquium on Automata, Languages and Programming, pp. 1–16. Springer (1992)
- 13. Li, M., Vitányi, P. M. B.: An Introduction to Kolmogorov Complexity and Its Applications, 3rd edn. Springer, New York (2008)
- 14. Nies, A.: Computability and randomness. Oxford University Press (2009)
- 15. Ortner, R., Leitgeb, H.: Mechanizing induction. In: Gabbay, D. M., Hartmann, S., Woods, J. (eds.) Inductive Logic, volume 10 of Handbook of the History of Logic, pp. 719–772. Elsevier (2011)
- 16. Rathmanner, S., Hutter, M.: A philosophical treatise of universal induction. Entropy **13**(6), 1076–1136 (2011)
- 17. Rogers, H., Jr.: Gödel numberings of partial recursive functions. J. Symb. Log. **23**(3), 331–341 (1958)
- 18. Rogers, H., Jr.: Theory of Recursive Functions and Effective Computability. McGraw-Hill, New York (1967)
- 19. Schnorr, C.-P.: Process complexity and effective random tests. J. Comput. Syst. Sci. **7**, 376–388 (1973)
- 20. Solomonoff, R. J.: A formal theory of inductive inference. Parts I and II. Inf. Control **7**, 1–22, 224–254 (1964)
- 21. Solomonoff, R. J.: The discovery of algorithmic probability. J. Comput. Syst. Sci. **55**(1), 73–88 (1997)
- 22. Wood, I., Sunehag, P., Hutter, M.: (Non-)equivalence of universal priors. In: Dowe, D. L. (ed.) Papers from the Solomonoff Memorial Conference, Lecture Notes in Artificial Intelligence 7070, pp. 417–425. Springer (2013)
- 23. Zvonkin, A. K., Levin, L. A.: The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms. Russ. Math. Surv. **26**(6), 83–124 (1970)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.