
On Algorithmic Statistics for Space-bounded Algorithms

Published in Theory of Computing Systems.

Abstract

Algorithmic statistics looks for models of observed data that are good in the following sense: a model is simple (i.e., has small Kolmogorov complexity) and captures all the algorithmically discoverable regularities in the data. However, this idea cannot be used in practice as is, because Kolmogorov complexity is not computable. In this paper we develop an algorithmic version of algorithmic statistics that uses space-bounded Kolmogorov complexity. We prove a space-bounded version of a basic result from “classical” algorithmic statistics, the connection between optimality and randomness deficiencies. The main tool is the Nisan–Wigderson pseudo-random generator. An extended abstract of this paper was presented at the 12th International Computer Science Symposium in Russia (Milovanov [10]).


Notes

  1. The definition and basic properties of Kolmogorov complexity can be found in the textbooks [7, 16]; for a short survey see [14].

  2. Kolmogorov complexity of a finite set A is defined as follows. We fix some computable bijection (encoding) A↦[A] from the family of all finite sets to the set of all binary strings. Then we define C(A) as the complexity C([A]) of the code [A] of A.

  3. The randomness deficiency of a string x with respect to a distribution P is defined as d(x|P) := − log P(x) −C(x|P). The optimality deficiency is defined as δ(x,P) := C(P) − log P(x) −C(x). See [18] for details.

  4. We agree that only work tape cells (but not the cells on input or output tapes) are taken into account.

  5. This is only one possible way to define the notion of bounded-space complexity for a distribution. Instead of a randomized program that has P as its output distribution, we may consider a program that computes P(x) for a given input x. The relations between these two definitions are not well understood.

  6. We use a stronger variant than the theorem in [5], but the proof is the same: we added requirement (c), which is easily seen to hold because the constructed program for \(\hat {f}\) is a simple transformation of f, and it suffices to add some fixed number of instructions to f. Also, the theorem in [5] does not assume that Pr[f(x)] belongs to \([\frac {1}{3}; \frac {2}{3}]\); however, this assumption is not used in the proof of this theorem.
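The deficiency definitions from note 3 can be illustrated numerically. The sketch below is a toy illustration only: true Kolmogorov complexity is uncomputable, so the complexity terms C(x|P), C(P) and C(x) are supplied by hand as assumed values; here P is taken uniform over 16-bit strings, so −log P(x) = 16 for every x.

```python
import math

def randomness_deficiency(neg_log_p: float, cond_complexity: float) -> float:
    """d(x|P) = -log P(x) - C(x|P); the complexity term is supplied by the
    caller, since true Kolmogorov complexity is uncomputable."""
    return neg_log_p - cond_complexity

def optimality_deficiency(c_p: float, neg_log_p: float, c_x: float) -> float:
    """delta(x,P) = C(P) - log P(x) - C(x)."""
    return c_p + neg_log_p - c_x

# P uniform over all 16-bit strings: -log P(x) = 16 for every x.
neg_log_p = -math.log2(1 / 2**16)            # = 16.0 exactly

# A "typical" (incompressible) x has C(x|P) close to 16, so d(x|P) is small;
# an atypical x (say C(x|P) = 2) has a large deficiency.
print(randomness_deficiency(neg_log_p, 15.0))  # 1.0
print(randomness_deficiency(neg_log_p, 2.0))   # 14.0
```

For the uniform model with C(P) ≈ 0 and C(x) ≈ 16, the optimality deficiency of a typical x is likewise close to zero.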

References

  1. Ajtai, M.: Approximate counting with uniform constant-depth circuits. In: Advances in Computational Complexity Theory, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, American Mathematical Society, pp 1–20 (1993)

  2. Buhrman, H., Fortnow, L., Laplante, S.: Resource-bounded Kolmogorov complexity revisited. SIAM J. Comput. 31(3), 887–905 (2002)


  3. Demer, R.: Stack Exchange discussion (Ricky Demer’s answer to the author’s question). http://cstheory.stackexchange.com/questions/34896/can-every-distribution-producible-by-a-probabilistic-pspace-machine-be-produced (2016)

  4. Furst, M., Saxe, J.B., Sipser, M.: Parity, circuits, and the polynomial-time hierarchy. Mathematical Systems Theory 17(1), 13–27 (1984)


  5. Jung, H.: Relationships between probabilistic and deterministic tape complexity. In: Mathematical foundations of computer science 1981 (MFCS 1981), Lecture Notes in Computer Science, vol. 118, pp 339–346 (1981)

  6. Kolmogorov, A.N.: Three approaches to the quantitative definition of information. Probl. Inf. Transm. 1(1), 4–11 (1965). English translation published in: International Journal of Computer Mathematics, 2, pp 157–168 (1968)


  7. Li, M., Vitányi, P.M.B.: An introduction to Kolmogorov complexity and its applications, 3rd edn. Springer, Berlin (2008)

  8. Longpré, L.: Resource bounded Kolmogorov complexity, a link between computational complexity and information theory, Ph.D. thesis, TR86-776. Cornell University, Ithaca (1986)


  9. MacWilliams, F.J., Sloane, N.J.A.: The theory of error-correcting codes. I and II. Bull. Amer. Math. Soc. 84(6), 1356–1359 (1978)


  10. Milovanov, A.: On algorithmic statistics for space-bounded algorithms. In: Proceedings of 12th international computer science symposium in Russia (CSR 2017) LNCS, vol. 10304, pp 232–234 (2017)

  11. Musatov, D.: Improving the space-bounded version of Muchnik’s conditional complexity theorem via “naive” derandomization. Theory of Computing Systems 55(2), 299–312 (2014)


  12. Nisan, N.: Pseudorandom bits for constant depth circuits. Combinatorica 11, 63–70 (1991)


  13. Nisan, N., Wigderson, A.: Hardness vs randomness. J. Comput. Syst. Sci. 49(2), 149–167 (1994)


  14. Shen, A.: Around Kolmogorov complexity: basic notions and results. In: Vovk, V., Papadoupoulos, H., Gammerman, A. (eds.) Measures of Complexity. Festschrift for Alexey Chervonenkis. ISBN: 978-3-319-21851-9. Springer, Berlin (2015)

  15. Shen, A.: The concept of (α,β)-stochasticity in the Kolmogorov sense, and its properties. Soviet Mathematics Doklady 271(1), 295–299 (1983)


  16. Shen, A., Uspensky, V., Vereshchagin, N.: Kolmogorov complexity and algorithmic randomness. MCCME (2013) (Russian). English translation: http://www.lirmm.fr/~ashen/kolmbook-eng.pdf

  17. Sipser, M.: A complexity theoretic approach to randomness. In: Proceedings of the 15th ACM symposium on the theory of computing, pp 330–335 (1983)

  18. Vereshchagin, N., Shen, A.: Algorithmic statistics: forty Years. In: Computability and complexity. Essays Dedicated to Rodney G. Downey on the Occasion of His 60Th Birthday. LNCS, vol. 10010, pp 669–737. Springer, Heidelberg (2017)

  19. Vereshchagin, N., Vitányi, P.M.B.: Kolmogorov’s structure functions with an application to the foundations of model selection. IEEE Trans. Inf. Theory 50(12), 3265–3290 (2004). Preliminary version: Proceedings of 47th IEEE Symposium on the Foundations of Computer Science, pp. 751–760 (2002)


  20. Vereshchagin, N., Vitányi, P.M.B.: Rate distortion and denoising of individual data using Kolmogorov complexity. IEEE Trans. Inf. Theory 56(7), 3438–3454 (2010)



Acknowledgments

I would like to thank Bruno Bauwens, Ricky Demer, Nikolay Vereshchagin and Alexander Shen for useful discussions, advice and remarks.

This work is supported in part by the RFBR grant 16-01-00362, by the Young Russian Mathematics award, by the grant MK-5379.2018.1 and by the RaCAF ANR-15-CE40-0016-01 grant. The study has also been funded by the Russian Academic Excellence Project ‘5-100’.


Corresponding author

Correspondence to Alexey Milovanov.

Additional information

This article is part of the Topical Collection on Computer Science Symposium in Russia

Appendix

Proposition 13

Let x = yz be a concatenation of strings y and z of length n, where \(y= \overbrace {000 {\ldots } 00}^{n \text { zeros}}\) and C(z) > n − ε for some positive ε. Assume that x belongs to some Hamming ball B. Then

$$2{\mathrm{C}}(B) + \log |B| - {\mathrm{C}}(x) > \frac{2}{5} n - \varepsilon - O(\log n). $$

If ε is small, then no Hamming ball B can satisfy both C(B) ≈ 0 and log |B| ≈ C(x).

Proof

Denote by r and b the radius and the center of B, and by b0 and b1 the first and second halves of b (so b = b0b1 and |b0| = |b1| = n). Then z belongs to the Hamming ball B1 with center b1 and radius r. Hence,

$${\mathrm{C}}(B_{1}) + \log |B_{1}| \ge {\mathrm{C}}(z) - O(\log n) = {\mathrm{C}}(x) - O(\log n). $$

Note that C(B1) ≤ C(B) + O(log n). The log-size of B1 can be estimated as \(nH(\frac {r}{n}) + O(\log n)\), where H(t) := −t log t − (1 − t) log(1 − t) is the binary Shannon entropy (see, for example, [9]). So, the inequality above can be rewritten as

$$ {\mathrm{C}}(B) + nH\left( \frac{r}{n}\right) \ge {\mathrm{C}}(x) - O(\log n). $$
(1)

We need to show that \(2{\mathrm {C}}(B) + \log |B| - {\mathrm {C}}(x) > \frac {2}{5} n - \varepsilon - O(\log n)\), where \(\log |B| = 2nH(\frac {r}{2n}) + O(\log n)\). This inequality follows from (1) and the inequality below. Indeed, (1) gives \({\mathrm {C}}(B) \ge {\mathrm {C}}(x) - nH(\frac {r}{n}) - O(\log n)\), so

$$2{\mathrm{C}}(B) + \log |B| - {\mathrm{C}}(x) \ge {\mathrm{C}}(x) + 2n\left( H\left( \frac{r}{2n}\right) - H\left( \frac{r}{n}\right)\right) - O(\log n) \ge {\mathrm{C}}(x) - \frac{3}{5} n - O(\log n), $$

and the claim follows since \({\mathrm {C}}(x) \ge {\mathrm {C}}(z) - O(\log n) > n - \varepsilon - O(\log n)\).

$$H\left( \frac{r}{n}\right) - H\left( \frac{r}{2n}\right) \le \frac{3}{10}. $$

To verify the last inequality one can show that the maximum of the function\(H(t) - H(\frac {t}{2})\)is equal to\(\frac {\ln (\frac {3}{4} +\frac {1}{\sqrt {2}})}{\ln 4} < \frac {3}{10}\).□
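The claimed maximum of \(H(t) - H(\frac {t}{2})\) can be checked numerically. A quick grid search (an illustration only, not a proof; the maximum is attained near t ≈ 0.29) confirms the bound:

```python
import math

def H(t: float) -> float:
    """Binary Shannon entropy H(t) = -t*log2(t) - (1-t)*log2(1-t)."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

# Grid search for the maximum of f(t) = H(t) - H(t/2) on (0, 1).
best = max(H(t) - H(t / 2) for t in (i / 10**5 for i in range(1, 10**5)))

# Value claimed in the proof: ln(3/4 + 1/sqrt(2)) / ln(4) < 3/10.
claimed = math.log(3 / 4 + 1 / math.sqrt(2)) / math.log(4)

print(best, claimed)  # both are about 0.2716, safely below 3/10
```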

1.1 Symmetry of Information

Define C^m(A,B) as the minimal length of a program that, given an input pair of strings (a,b), uses at most space m and outputs the pair of bits indicating whether a ∈ A and whether b ∈ B.

Lemma 14 (Symmetry of information)

Assume A,B ⊆ {0, 1}^n. Then

$$\textup{(a) } \forall m \text{ } {\mathrm{C}}^{p}(A, B) \le {\mathrm{C}}^{m}(A) + {\mathrm{C}}^{m}(B | A) + O(\log ({\mathrm{C}}^{m}(A,B)+m + n)) $$

for p = m + poly(n + C^m(A,B)).

$$\textup{(b) } \forall m \text{ } {\mathrm{C}}^{p}(A) + {\mathrm{C}}^{p}(B | A) \le {\mathrm{C}}^{m}(A, B) + O(\log ({\mathrm{C}}^{m}(A,B)+m + n) ) $$

for p = 2m + poly(n + C^m(A,B)).

Proof of Lemma 14 (a)

The proof is similar to the proof of Theorem 6 (a).□

Proof of Lemma 14 (b)

Let k := C^m(A,B). Denote by \(\mathcal {D}\) the family of pairs of sets (U,V) such that C^m(U,V) ≤ k and U,V ⊆ {0, 1}^n. It is clear that \(|\mathcal {D}| < 2^{k + 1}\). Denote by \(\mathcal {D}_{A}\) the pairs in \(\mathcal {D}\) whose first element is equal to A. Let t satisfy the inequalities \(2^{t} \le |\mathcal {D}_{A}| < 2^{t + 1}\).

We prove that

  • C^p(B|A) does not exceed t significantly;

  • C^p(A) does not exceed k − t significantly.

Here p = 2m + O(n).

We start with the first statement. There exists a program that enumerates all sets from \(\mathcal {D}_{A}\) using A as an oracle and works in space 2m + O(n). Indeed, such an enumeration can be done in the following way: enumerate all programs of length k and verify the following conditions for every pair of n-bit strings. First, the program uses at most space m on this input and does not loop; to verify this, we check that the program does not run for more than 2^{O(m)} steps (a machine using space m has at most 2^{O(m)} configurations, so a longer run means a loop). Second, the program outputs 1 if the second n-bit string belongs to A, and 0 otherwise. Append to this program the ordinal number of a program that distinguishes (A,B); since \(|\mathcal {D}_{A}| < 2^{t + 1}\), this number takes at most t + 1 bits. Therefore we have C^p(B|A) ≤ t + O(log(C^m(A,B) + m + n)).
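The loop test above rests on a pigeonhole argument: a machine whose configuration space has N elements either halts within N steps or repeats a configuration and loops forever. A minimal sketch of such a budgeted simulation, with a hypothetical `step` function standing in for one step of the simulated machine (`None` denoting the halting state):

```python
def halts_within_budget(step, state, num_configs):
    """Run a deterministic machine whose configuration space has at most
    `num_configs` elements. If it has not halted after `num_configs` steps,
    some configuration repeated, so the machine loops forever."""
    for _ in range(num_configs):
        if state is None:          # convention: None means "halted"
            return True
        state = step(state)
    return state is None

# Toy machines over the state space {0, ..., 7}:
countdown = lambda s: None if s == 0 else s - 1  # counts down, then halts
cycle = lambda s: (s + 1) % 8                    # cycles forever

print(halts_within_budget(countdown, 7, 8))  # True
print(halts_within_budget(cycle, 0, 8))      # False
```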

Now we prove the second statement. Note that there exist at most 2^{k−t+1} sets U such that \(|\mathcal {D}_{U}| \ge 2^{t}\) (including A); indeed, \(|\mathcal {D}| < 2^{k + 1}\) and the families \(\mathcal {D}_{U}\) are disjoint. Hence, if we construct a program that enumerates all sets with this property (and does not use much space), then we are done, because the set A can be described by its ordinal number in this enumeration, which takes at most k − t + 1 bits. Let us construct such a program. It works as follows:

Enumerate all sets U that occur as the first element of some pair in \(\mathcal {D}\); i.e., enumerate (say, lexicographically) the programs that distinguish the corresponding pairs. A set U is output if the following properties hold: first, \(|\mathcal {D}_{U}| \ge 2^{t}\); second, U was not considered earlier (i.e., no program with a smaller lexicographical number distinguishes a pair from \(\mathcal {D}\) whose first element is U).

This program uses 2m + poly(n + C^m(A,B)) space and has length O(log(C^m(A) + n + m)), and hence satisfies all requirements. □

Proof of Lemma 11

It suffices to show that \(\mathcal {B}\) satisfies property (1) with probability at most 2^{−n}, because \(\mathcal {B}\) satisfies property (2) with probability at most \(\frac {1}{4}\).

For this, let us show that every part is ‘bad’ (i.e., has at least (n + k)^2 + 1 sets from \(\mathcal {B}\)) with probability at most 2^{−2n}. The probability of such an event is equal to the probability that a binomial random variable with parameters (2^k, 2^{−k}(n + 2) ln 2) exceeds (n + k)^2. To bound this, we use an easy but lengthy sequence of estimates. For w := 2^k, p := 2^{−k}(n + 2) ln 2 and v := (n + k)^2 we have

$$\sum\limits_{i=v}^{w} {{w}\choose{i}} p^{i}(1-p)^{w-i} < w \cdot {{w}\choose{v}} p^{v}(1-p)^{w-v} < w \cdot {{w}\choose{v}} p^{v} < w \frac{(wp)^{v}}{v!}. $$

The leftmost inequality follows from wp = (n + 2) ln 2 ≤ (n + k)^2 = v. Because wp = (n + 2) ln 2 < 10n, we obtain

$$w \frac{(wp)^{v}}{v!} < \frac{2^{k} (10n)^{(n+k)^{2}}}{((n+k)^{2})!} \ll 2^{-2n}. $$
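The tail estimate can be sanity-checked by an exact term-by-term computation for small parameters (an illustration only; the values of n and k below are arbitrary choices, not from the paper):

```python
import math

def binomial_tail(w: int, p: float, v: int) -> float:
    """Pr[Bin(w, p) >= v], summed term by term with exact binomial
    coefficients (math.comb) and floating-point powers of p."""
    return sum(math.comb(w, i) * p**i * (1 - p)**(w - i)
               for i in range(v, w + 1))

n, k = 4, 8
w = 2**k                            # number of trials
p = 2**-k * (n + 2) * math.log(2)   # success probability, wp = (n+2) ln 2
v = (n + k)**2                      # threshold from the lemma

tail = binomial_tail(w, p, v)
print(tail < 2**(-2 * n))  # True: the tail is far below 2^{-2n}
```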


Cite this article

Milovanov, A. On Algorithmic Statistics for Space-bounded Algorithms. Theory Comput Syst 63, 833–848 (2019). https://doi.org/10.1007/s00224-018-9845-6
