Convergence of the reach for a sequence of Gaussian-embedded manifolds

Abstract

Motivated by questions of manifold learning, we study a sequence of random manifolds, generated by embedding a fixed, compact manifold M into Euclidean spheres of increasing dimension via a sequence of Gaussian mappings. One of the fundamental smoothness parameters of manifold learning theorems is the reach, or critical radius, of M. Roughly speaking, the reach is a measure of a manifold’s departure from convexity, which incorporates both local curvature and global topology. This paper develops limit theory for the reach of a family of random, Gaussian-embedded manifolds, establishing both almost sure convergence for the global reach and a fluctuation theory for both it and its local version. The global reach converges to a constant well known both in the reproducing kernel Hilbert space theory of Gaussian processes and in their extremal theory.

References

  1. Adler, R.J., Taylor, J.E.: Random Fields and Geometry. Springer Monographs in Mathematics. Springer, New York (2007)

  2. Amelunxen, D., Bürgisser, P.: Probabilistic analysis of the Grassmann condition number. arXiv:1112.2603

  3. Amenta, N., Bern, M.: Surface reconstruction by Voronoi filtering. Discrete Comput. Geom. 22(4), 481–504 (1999)

  4. Anderson, T.W.: An Introduction to Multivariate Statistical Analysis. Wiley Series in Probability and Statistics, 3rd edn. Wiley, Hoboken (2003)

  5. Billingsley, P.: Convergence of Probability Measures. Wiley Series in Probability and Statistics: Probability and Statistics, 2nd edn. Wiley, New York (1999)

  6. Boissonnat, J.-D., Chazal, F., Yvinec, M.: Computational geometry and topology for data analysis. Book in preparation. http://geometrica.saclay.inria.fr/team/Fred.Chazal/papers/CGLcourseNotes/main.pdf (2016)

  7. Cartwright, D.I., Field, M.J.: A refinement of the arithmetic mean-geometric mean inequality. Proc. Am. Math. Soc. 71(1), 36–38 (1978)

  8. Chazal, F., Lieutier, A.: Smooth manifold reconstruction from noisy and non-uniform approximation with guarantees. Comput. Geom. 40(2), 156–170 (2008)

  9. Federer, H.: Curvature measures. Trans. Am. Math. Soc. 93, 418–491 (1959)

  10. Gray, A.: Tubes. Advanced Book Program. Addison-Wesley Publishing Company, Redwood City (1990)

  11. Hotelling, H.: Tubes and spheres in \(n\)-spaces and a class of statistical problems. Am. J. Math. 61, 440–460 (1939)

  12. Johansen, S., Johnstone, I.M.: Hotelling’s theorem on the volume of tubes: some illustrations in simultaneous inference and data analysis. Ann. Statist. 18(2), 652–684 (1990)

  13. Kendall, M., Stuart, A.: The Advanced Theory of Statistics, Design and Analysis, and Time-Series, vol. 3, 3rd edn. Hafner Press [Macmillan Publishing Co., Inc.], New York (1976)

  14. Kendall, M.G., Stuart, A.: The Advanced Theory of Statistics. Distribution Theory, vol. 1, 3rd edn. Hafner Publishing Co., New York (1969)

  15. Krishnan, S.R., Taylor, J.E., Adler, R.J.: The intrinsic geometry of some random manifolds. arXiv:1512.05622 (2015)

  16. Ledoux, M., Talagrand, M.: Probability in Banach Spaces: Isoperimetry and Processes. Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)]. Springer, Berlin (1991)

  17. Lee, J.M.: Riemannian Manifolds: An Introduction to Curvature. Graduate Texts in Mathematics, vol. 176. Springer, New York (1997)

  18. Niyogi, P., Smale, S., Weinberger, S.: A topological view of unsupervised learning from noisy data. SIAM J. Comput. 40(3), 646–663 (2011)

  19. Niyogi, P., Smale, S., Weinberger, S.: Finding the homology of submanifolds with high confidence from random samples. Discrete Comput. Geom. 39(1–3), 419–441 (2008)

  20. Olkin, I., Pratt, J.W.: Unbiased estimation of certain correlation coefficients. Ann. Math. Statist. 29, 201–211 (1958)

  21. Smale, S.: Complexity theory and numerical analysis. In: Iserles, A. (ed.) Acta Numerica, vol. 6, pp. 523–551. Cambridge University Press, Cambridge (1997)

  22. Takemura, A., Kuriki, S.: On the equivalence of the tube and Euler characteristic methods for the distribution of the maximum of Gaussian fields over piecewise smooth domains. Ann. Appl. Probab. 12(2), 768–796 (2002)

  23. Takemura, A., Kuriki, S.: Tail probability via the tube formula when the critical radius is zero. Bernoulli 9(3), 535–558 (2003)

  24. Taylor, J., Takemura, A., Adler, R.J.: Validity of the expected Euler characteristic heuristic. Ann. Probab. 33(4), 1362–1396 (2005)

  25. Thäle, C.: 50 years sets with positive reach—a survey. Surv. Math. Appl. 3, 123–165 (2008)

  26. van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Series in Statistics. Springer, New York (1996)

  27. Weyl, H.: On the volume of tubes. Am. J. Math. 61(2), 461–472 (1939)

Acknowledgements

We would like to thank Takashi Owada for useful discussions, and two referees for helpful comments.

Author information

Corresponding author

Correspondence to Robert J. Adler.

Additional information

R. J. Adler: Research supported in part by URSAT, ERC Advanced Grant 320422 and SATA II, AFOSR, FA9550-15-1-0032.

S. R. Krishnan: Research supported in part by URSAT, ERC Advanced Grant 320422.

J. E. Taylor: Research supported in part by SATA, AFOSR, FA9550-11-1-0216.

S. Weinberger: Research supported in part by SATA, AFOSR, FA9550-11-1-0216 and DMS-1510178.

Appendices

Appendix 1

We now give a proof of Lemma 5.1. As mentioned earlier, Lemma 5.1 is identical to Lemma 2.1 of [22] and, as pointed out there, the proof is essentially the same as the proof given in [12] for the one-dimensional case. Thus, we make no claim of originality, and include the proof for completeness only.

Proof

Take a unit vector \(\eta _x\in T^{\perp }_xM\cap S(T_x S^{k-1})\), and consider the geodesic

$$\begin{aligned} \gamma _{(x,\eta _x)}(r)=\cos r \,x+\sin r \, \eta _x,\qquad r\ge 0. \end{aligned}$$

To determine the local reach in the direction \(\eta _x\), we need to know how far we can extend \(\gamma \) while the metric projection of the endpoint onto M remains x. Since geodesic distance on the sphere is a decreasing function of the inner product, this is until we find a point \(y\ne x\) of M such that

$$\begin{aligned} \langle y,\gamma _{(x,\eta _x)}(r)\rangle = \langle x,\gamma _{(x,\eta _x)}(r)\rangle . \end{aligned}$$

Consider first the case of \(r<\frac{\pi }{2}\), so that \(\cos r>0\). Then the two displays above imply that the geodesic can be extended at least up to r provided that

$$\begin{aligned}&\sup _{y\ne x}\,(\cos r \,(\langle y,x\rangle -1)+\sin r \langle y,\eta _x\rangle )\le 0 \\&\quad \iff \sup _{y\ne x} \,(-\cos r \,(1-\langle y,x\rangle )+\sin r \, \langle y,\eta _x\rangle )\le 0 \\&\quad \iff \sup _{y\ne x}\left( -\cot r +\frac{\langle y,\eta _x\rangle }{1-\langle x,y\rangle }\right) \le 0\\&\quad \iff \sup _{y\ne x}\frac{\langle y,\eta _x \rangle }{1-\langle x,y\rangle }\le \cot r. \end{aligned}$$
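
For reference, the first line of this chain is the condition \(\langle y,\gamma _{(x,\eta _x)}(r)\rangle \le \langle x,\gamma _{(x,\eta _x)}(r)\rangle \) for all \(y\ne x\), written out using \(\langle x,x\rangle =1\) and \(\langle x,\eta _x\rangle =0\):

$$\begin{aligned} \langle y,\gamma _{(x,\eta _x)}(r)\rangle -\langle x,\gamma _{(x,\eta _x)}(r)\rangle = \cos r\,(\langle y,x\rangle -1)+\sin r\,\langle y,\eta _x\rangle . \end{aligned}$$

The passage to \(\cot r\) divides by \(\sin r\,(1-\langle x,y\rangle )\), which is strictly positive when \(0<r<\frac{\pi }{2}\) because distinct unit vectors satisfy \(\langle x,y\rangle <1\), so the direction of the inequality is preserved.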

When \(\frac{\pi }{2}\le r<\pi \), we have \(\cos r\le 0\) and \(\sin r>0\), so the same argument gives that as soon as there is a y such that \(\langle y,\eta _x\rangle >0\),

$$\begin{aligned} \cos r\,(\langle y,x\rangle -1)+\sin r\,\langle y,\eta _x\rangle \ >\ 0, \end{aligned}$$

and therefore, for such r,

$$\begin{aligned} \sup _{y\ne x}\,(\cos r \,(\langle y,x\rangle -1)+\sin r \langle y,\eta _x\rangle )\ > \ 0, \end{aligned}$$

implying that such a y is closer to \(\gamma _{(x,\eta _x)}(r)\) than x is. Hence, the geodesic can only be extended up to a length less than or equal to \(\frac{\pi }{2}\).

Thus, by our earlier argument for \(r\le \frac{\pi }{2}\), we note that on the set

$$\begin{aligned} Z=\left\{ (x,\eta _x):\,\sup _{y\ne x}\langle y,\eta _x\rangle >0\right\} ,\end{aligned}$$

the reach satisfies

$$\begin{aligned} \cot r(x,\eta _x)= \sup _{y\ne x}\frac{\langle y,\eta _x \rangle }{1-\langle x,y\rangle }= \sup _{y\ne x}\frac{\langle y,\eta _x \rangle ^+}{1-\langle x,y\rangle }\ \ge \ 0,\end{aligned}$$

where \(a^{+}=\max (a,0)\) denotes the positive part of a real number a. Meanwhile, on the set \(Z^{\text {c}}\), we have

$$\begin{aligned} \cot r(x,\eta _x)\le 0.\end{aligned}$$

Therefore, we have the inequality,

$$\begin{aligned} \cot r(x,\eta _x)\ \le \ \sup _{y\ne x}\frac{\langle y,\eta _x \rangle ^+}{1-\langle x,y\rangle },\end{aligned}$$

which becomes an equality when

$$\begin{aligned} \sup _{y\ne x}\frac{\langle y,\eta _x \rangle ^+}{1-\langle x,y\rangle }\ >\ 0.\end{aligned}$$

In other words, we have

$$\begin{aligned} \cot \left( \min \left( r(x,\eta _x),\frac{\pi }{2}\right) \right) = \sup _{y\ne x}\frac{\langle y,\eta _x \rangle ^+}{1-\langle x,y\rangle }. \end{aligned}$$

Finally, since M is a closed manifold embedded into a sphere, the local reach at x, which is an infimum over all \(\eta _x\) above, cannot be greater than \(\frac{\pi }{2}\). Thus we can truncate at this angle, and so, by (2.2), (2.3), and the above, obtain that

$$\begin{aligned} \cot ^2(\theta (x))= \cot ^2\left( \inf _{\eta _x}\theta _\ell (x,\eta _x)\right)= & {} \sup _{\eta _x:\Vert \eta _x\Vert =1}\sup _{y\ne x}\left( \frac{\langle y,\eta _x \rangle ^+}{1-\langle x,y\rangle }\right) ^2 \\= & {} \sup _{y\ne x}\frac{\Vert P_{T^{\perp }_x M}y\Vert ^2}{(1-\langle x,y\rangle )^2}, \end{aligned}$$

as required. \(\square \)
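
As a numerical sanity check of the final formula, the following Python sketch evaluates \(\sup _{y\ne x}\Vert P_{T^{\perp }_x M}y\Vert ^2/(1-\langle x,y\rangle )^2\) on a discretised circle of latitude in \(S^2\). The last equality in the display above uses the fact that, for fixed y, the supremum of \((\langle y,\eta _x\rangle ^+)^2\) over unit normal vectors is attained at \(\eta _x=P_{T^{\perp }_x M}y/\Vert P_{T^{\perp }_x M}y\Vert \). The example manifold, the function names, and the discretisation are illustrative assumptions and do not come from the paper. For a circle at colatitude \(\varphi \le \frac{\pi }{2}\), elementary geometry suggests that the reach angle equals \(\varphi \), the distance to the nearer pole.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def reach_angle_latitude_circle(phi, n=2000):
    """Approximate theta(x) for the circle of colatitude phi in S^2.

    Evaluates sup_{y != x} |P y|^2 / (1 - <x, y>)^2 over n - 1 sample
    points y, where P projects onto the normal space of the circle at x
    inside T_x S^2 (the orthogonal complement of span{x, tangent}).
    """
    x = (math.sin(phi), 0.0, math.cos(phi))
    tangent = (0.0, 1.0, 0.0)  # unit tangent to the circle at x
    cot_squared = 0.0
    for j in range(1, n):  # j = 0 would give y = x, which is excluded
        t = 2.0 * math.pi * j / n
        y = (math.sin(phi) * math.cos(t),
             math.sin(phi) * math.sin(t),
             math.cos(phi))
        yx, yt = dot(y, x), dot(y, tangent)
        # Remove the components of y along x and along the tangent.
        normal = [yi - yx * xi - yt * ti
                  for yi, xi, ti in zip(y, x, tangent)]
        cot_squared = max(cot_squared, dot(normal, normal) / (1.0 - yx) ** 2)
    # arccot of a non-negative number, returned in (0, pi/2].
    return math.atan2(1.0, math.sqrt(cot_squared))
```

For this particular curve the ratio is constant in y and equals \(\cot ^2\varphi \), so `reach_angle_latitude_circle(math.pi / 3)` should agree with \(\pi /3\) up to floating-point error, and a great circle (\(\varphi =\frac{\pi }{2}\)) gives the maximal value \(\frac{\pi }{2}\).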

Appendix 2

We need to show that the remainder terms, \(O(1/k^2)\), in (9.1) and (9.2) are of the right order and, just as importantly, are uniform over \(M\times M\).

As mentioned earlier, if the correlation estimates \(\widehat{\mathbb {C}} (x,y)\) of \(\mathbb {C}(x,y)\) were centered at sample means rather than at zero (which we shall refer to as the ‘standard’ case), we could simply quote known results from the statistics literature to establish everything we need. These results are not hard to prove, but they involve pages of tedious algebra, which we do not reproduce here. Instead, we shall content ourselves with describing the standard proofs and indicating where changes need to be made to cover our situation.

The standard case is treated in [14]. Following the derivation in Section 16.24 of Chapter 16 there, we start by writing out the joint probability density of k sample values \(\{(f_j(x),f_j(y))\}_{j=1}^k\) drawn from a bivariate normal distribution with zero means and unit variances in terms of the statistics we are interested in, namely,

$$\begin{aligned} s_1^2\ \mathop {=}\limits ^{\Delta }\ \frac{1}{k}\sum f_j^2(x), \qquad s_2^2\mathop {=}\limits ^{\Delta }\frac{1}{k} \sum f_j^2(y), \end{aligned}$$

along with the \(\widehat{\mathbb {C}}_k(x,y)\) of (5.1). These replace the standard sample mean centered version of these statistics in [14].
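
To make the difference from the standard case concrete, here is a minimal Python sketch of the zero-centered statistics. It assumes that the estimate in (5.1), which is not reproduced in this appendix, has the form \(\sum _j f_j(x)f_j(y)/(k\,s_1s_2)\); that form, the function names, and the simulation set-up are assumptions made for illustration only.

```python
import math
import random


def zero_centered_stats(pairs):
    """Return (s1, s2, c_hat) for a list of (f_j(x), f_j(y)) pairs.

    No sample mean is subtracted: s1^2 and s2^2 are raw second moments,
    and c_hat is the normalised raw cross moment (assumed form of (5.1)).
    """
    k = len(pairs)
    s1 = math.sqrt(sum(u * u for u, _ in pairs) / k)
    s2 = math.sqrt(sum(v * v for _, v in pairs) / k)
    c_hat = sum(u * v for u, v in pairs) / (k * s1 * s2)
    return s1, s2, c_hat


def sample_pairs(rho, k, rng):
    """Draw k pairs from a zero-mean, unit-variance bivariate normal
    distribution with correlation rho."""
    pairs = []
    for _ in range(k):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2))
    return pairs


def mean_c_hat(rho, k, reps=5000, seed=0):
    """Monte Carlo average of c_hat over reps independent samples."""
    rng = random.Random(seed)
    total = sum(zero_centered_stats(sample_pairs(rho, k, rng))[2]
                for _ in range(reps))
    return total / reps
```

Calling `mean_c_hat(0.6, 20)` gives a simulated mean that can be compared with the exact expectation derived later in this appendix. The pair list `[(1.0, -1.0), (1.0, 1.0)]` yields a zero-centered estimate of 0, whereas the mean-centered sample correlation is undefined there because the first coordinate has no spread about its mean.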

Using the result of Example 11.6 in Chapter 11 of [14] which deals with finding the distribution of a sum of squares of i.i.d. standard normal variates, and following the discussion in Section 16.24 there, we find that the exact joint probability density of \(s_1,s_2,\) and \(\widehat{\mathbb {C}}_k(x,y)\), on \({\mathbb R}_+\times {\mathbb R}_+\times [-1,1]\), is given by

$$\begin{aligned}&\frac{k^k s_1^{k-1}s_2^{k-1}(1-\widehat{\mathbb {C}}_k^2(x,y))^{\frac{k-3}{2}}}{\pi \Gamma (k-1)(1-\mathbb {C}^2(x,y))^{k/2}}\times e^{-\frac{k}{2(1-\mathbb {C}^2(x,y))}\left( s_1^2-2\mathbb {C}(x,y)\widehat{\mathbb {C}}_k(x,y) s_1s_2+s_2^2\right) }. \end{aligned}$$

As in Section 16.32 of [14], we now integrate out \(s_1\) and \(s_2\), and use the remaining density of \(\widehat{\mathbb {C}}\) to compute that

$$\begin{aligned} \mathbb {E}\{\widehat{\mathbb {C}}_k(x,y)\} = \frac{\mathbb {C}(x,y)\,\Gamma ^2((k+1)/2)}{\Gamma (k/2)\,\Gamma ((k+2)/2)}\,F\left( \frac{1}{2},\frac{1}{2},\frac{k+2}{2},\mathbb {C}^2(x,y)\right) , \end{aligned}$$

where F is the hypergeometric function. Note that \(0\le \mathbb {C}^2(x,y)\le 1\) for all \((x,y)\in M\times M\). The fact that

$$\begin{aligned} F(\alpha ,\beta ,\gamma ,x)=1+xO(1/\gamma )\,\,\text {as}\,\,\gamma \rightarrow \infty \end{aligned}$$

uniformly for x in any bounded set (cf. [20]), together with Stirling’s approximation, which shows that the ratio of Gamma functions in (12.89) converges to 1 as \(k\rightarrow \infty \), gives the uniformity of the error bound in (9.1). A similar calculation establishes (9.2) and the uniformity of the error bound there.
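
The two limits used in this argument can be checked numerically. The sketch below, in Python with only the standard library, evaluates the Gamma ratio through `math.lgamma` and the hypergeometric function through its Gauss series; the function names and the truncation at 200 terms are illustrative choices. The first-order value \(-(1-\mathbb {C}^2)/(2k)\) for the relative bias mentioned afterwards comes from expanding the two factors by hand for this illustration and is not quoted from (9.1).

```python
import math


def gamma_ratio(k):
    """Gamma((k+1)/2)^2 / (Gamma(k/2) * Gamma((k+2)/2)), computed on the
    log scale to avoid overflow for large k."""
    return math.exp(2.0 * math.lgamma((k + 1) / 2.0)
                    - math.lgamma(k / 2.0)
                    - math.lgamma((k + 2) / 2.0))


def hyp2f1(a, b, c, z, terms=200):
    """Truncated Gauss series sum_n (a)_n (b)_n / ((c)_n n!) z^n.

    The series converges for |z| < 1, and at z = 1 when c > a + b.
    """
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total


def expected_c_hat(rho, k):
    """Exact mean of the zero-centered correlation estimate, following
    the expectation formula displayed above."""
    return rho * gamma_ratio(k) * hyp2f1(0.5, 0.5, (k + 2) / 2.0, rho * rho)
```

For example, `gamma_ratio(1000)` is close to \(1-\frac{1}{2000}\), and `1000 * (expected_c_hat(0.5, 1000) / 0.5 - 1)` is close to \(-0.375=-(1-0.5^2)/2\), consistent with a relative bias of order 1/k.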

About this article

Cite this article

Adler, R.J., Krishnan, S.R., Taylor, J.E. et al. Convergence of the reach for a sequence of Gaussian-embedded manifolds. Probab. Theory Relat. Fields 171, 1045–1091 (2018). https://doi.org/10.1007/s00440-017-0801-1

  • DOI: https://doi.org/10.1007/s00440-017-0801-1

Keywords

  • Gaussian process
  • Manifold
  • Random embedding
  • Critical radius
  • Reach
  • Curvature
  • Asymptotics
  • Fluctuation theory

Mathematics Subject Classification

  • Primary 60G15, 57N35
  • Secondary 60D05, 60G60