Abstract
Motivated by questions of manifold learning, we study a sequence of random manifolds, generated by embedding a fixed, compact manifold M into Euclidean spheres of increasing dimension via a sequence of Gaussian mappings. One of the fundamental smoothness parameters of manifold learning theorems is the reach, or critical radius, of M. Roughly speaking, the reach is a measure of a manifold’s departure from convexity, which incorporates both local curvature and global topology. This paper develops limit theory for the reach of a family of random, Gaussian-embedded manifolds, establishing both almost sure convergence for the global reach and a fluctuation theory for both it and its local version. The global reach converges to a constant well known both in the reproducing kernel Hilbert space theory of Gaussian processes and in their extremal theory.
References
Adler, R.J., Taylor, J.E.: Random Fields and Geometry. Springer Monographs in Mathematics. Springer, New York (2007)
Amelunxen, D., Bürgisser, P.: Probabilistic analysis of the Grassmann condition number. arXiv:1112.2603
Amenta, N., Bern, M.: Surface reconstruction by Voronoi filtering. Discrete Comput. Geom. 22(4), 481–504 (1999)
Anderson, T.W.: An Introduction to Multivariate Statistical Analysis. Wiley Series in Probability and Statistics, 3rd edn. Wiley, Hoboken (2003)
Billingsley, P.: Convergence of Probability Measures. Wiley Series in Probability and Statistics: Probability and Statistics, 2nd edn. Wiley, New York (1999)
Boissonnat, J.-D., Chazal, F., Yvinec, M.: Computational geometry and topology for data analysis. Book in preparation. http://geometrica.saclay.inria.fr/team/Fred.Chazal/papers/CGLcourseNotes/main.pdf (2016)
Cartwright, D.I., Field, M.J.: A refinement of the arithmetic mean-geometric mean inequality. Proc. Am. Math. Soc. 71(1), 36–38 (1978)
Chazal, F., Lieutier, A.: Smooth manifold reconstruction from noisy and non-uniform approximation with guarantees. Comput. Geom. 40(2), 156–170 (2008)
Federer, H.: Curvature measures. Trans. Am. Math. Soc. 93, 418–491 (1959)
Gray, A.: Tubes. Advanced Book Program. Addison-Wesley Publishing Company, Redwood City (1990)
Hotelling, H.: Tubes and spheres in \(n\)-spaces and a class of statistical problems. Am. J. Math. 61, 440–460 (1939)
Johansen, S., Johnstone, I.M.: Hotelling’s theorem on the volume of tubes: some illustrations in simultaneous inference and data analysis. Ann. Statist. 18(2), 652–684 (1990)
Kendall, M., Stuart, A.: The Advanced Theory of Statistics, Design and Analysis, and Time-Series, vol. 3, 3rd edn. Hafner Press [Macmillan Publishing Co., Inc.], New York (1976)
Kendall, M.G., Stuart, A.: The Advanced Theory of Statistics. Distribution Theory, vol. 1, 3rd edn. Hafner Publishing Co., New York (1969)
Krishnan, S.R., Taylor, J.E., Adler, R.J.: The intrinsic geometry of some random manifolds. arXiv:1512.05622 (2015)
Ledoux, M., Talagrand, M.: Probability in Banach spaces, Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)]. Isoperimetry and Processes. Springer, Berlin (1991)
Lee, J.M.: Riemannian Manifolds: An Introduction to Curvature. Graduate Texts in Mathematics, vol. 176. Springer, New York (1997)
Niyogi, P., Smale, S., Weinberger, S.: A topological view of unsupervised learning from noisy data. SIAM J. Comput. 40(3), 646–663 (2011)
Niyogi, P., Smale, S., Weinberger, S.: Finding the homology of submanifolds with high confidence from random samples. Discrete Comput. Geom. 39(1–3), 419–441 (2008)
Olkin, I., Pratt, J.W.: Unbiased estimation of certain correlation coefficients. Ann. Math. Statist. 29, 201–211 (1958)
Smale, S.: Complexity theory and numerical analysis. In: Iserles, A. (ed.) Acta Numerica, vol. 6, pp. 523–551. Cambridge University Press, Cambridge (1997)
Takemura, A., Kuriki, S.: On the equivalence of the tube and Euler characteristic methods for the distribution of the maximum of Gaussian fields over piecewise smooth domains. Ann. Appl. Probab. 12(2), 768–796 (2002)
Takemura, A., Kuriki, S.: Tail probability via the tube formula when the critical radius is zero. Bernoulli 9(3), 535–558 (2003)
Taylor, J., Takemura, A., Adler, R.J.: Validity of the expected Euler characteristic heuristic. Ann. Probab. 33(4), 1362–1396 (2005)
Thäle, C.: 50 years sets with positive reach—a survey. Surv. Math. Appl. 3, 123–165 (2008)
van der Vaart, A.W., Wellner, J.A.: Weak Convergence and Empirical Processes. Springer Series in Statistics. With Applications to Statistics. Springer, New York (1996)
Weyl, H.: On the volume of tubes. Am. J. Math. 61(2), 461–472 (1939)
Acknowledgements
We would like to thank Takashi Owada for useful discussions, and two referees for helpful comments.
Additional information
R. J. Adler: Research supported in part by URSAT, ERC Advanced Grant 320422 and SATA II, AFOSR, FA9550-15-1-0032.
S. R. Krishnan: Research supported in part by URSAT, ERC Advanced Grant 320422.
J. E. Taylor: Research supported in part by SATA, AFOSR, FA9550-11-1-0216.
S. Weinberger: Research supported in part by SATA, AFOSR, FA9550-11-1-0216 and DMS-1510178.
Appendices
Appendix 1
We now give a proof of Lemma 5.1. As mentioned earlier, Lemma 5.1 is identical to Lemma 2.1 of [22] and, as pointed out there, the proof is essentially the same as the proof given in [12] for the one-dimensional case. Thus, we make no claim of originality, and include the proof for completeness only.
Proof
Take a unit-length vector \(\eta _x\in T^{\perp }_xM\cap S(T_x S^{k-1})\), and consider the geodesic \(\gamma _{x,\eta _x}(r)=\cos r\, x+\sin r\,\eta _x\), \(r\ge 0\).
To determine the local reach in the direction \(\eta _x\), we need to know how far we can extend \(\gamma \) so that the metric projection of the endpoint is x. Clearly, this is until we find a \(y\ne x\) such that
Consider first the case of \(r<\frac{\pi }{2}\) so that \(\cos r>0\). Then the above two formulae imply that we can extend the geodesic at least until such an r as long as
When \(r\ge \frac{\pi }{2}\), the same argument gives that as soon as there is a y such that \(\langle y,\eta _x\rangle >0\),
and therefore, for such r,
implying that such a y is closer to \(\gamma _{x,\eta _x}(r)\) than x is. Hence, the geodesic can only be extended up to a length less than or equal to \(\frac{\pi }{2}\).
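The comparison of distances used in this argument can be made explicit. The following is a sketch, assuming that M lies on the unit sphere and that closeness is measured through Euclidean inner products; it is not necessarily the exact form of the displays in the original proof.

```latex
\[
\begin{aligned}
\gamma_{x,\eta_x}(r) &= \cos r\, x + \sin r\, \eta_x, \\
\|\gamma_{x,\eta_x}(r) - y\|^2 &= 2 - 2\langle \gamma_{x,\eta_x}(r), y\rangle
  \quad \text{for } \|y\| = 1, \\
\|\gamma_{x,\eta_x}(r) - y\| < \|\gamma_{x,\eta_x}(r) - x\|
  &\iff \cos r\,\langle x, y\rangle + \sin r\,\langle y, \eta_x\rangle > \cos r \\
  &\iff \cos r\,\bigl(1 - \langle x, y\rangle\bigr) < \sin r\,\langle y, \eta_x\rangle .
\end{aligned}
\]
```

The third line uses \(\langle \gamma _{x,\eta _x}(r),x\rangle =\cos r\), which holds because \(\langle x,\eta _x\rangle =0\). For \(0<r<\frac{\pi }{2}\) the last inequality reads \(\cot r<\langle y,\eta _x\rangle /(1-\langle x,y\rangle )\), while for \(r\ge \frac{\pi }{2}\) its left side is nonpositive, so every y with \(\langle y,\eta _x\rangle >0\) satisfies it.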
Thus, by our earlier argument for \(r\le \frac{\pi }{2}\), we note that on the set
the reach satisfies
where \(x^{+}\) denotes the positive part of x. Meanwhile, on the set \(Z^{\text {c}}\), we have
Therefore, we have the inequality,
which becomes an equality when
In other words, we have
Finally, since M is a closed manifold embedded into a sphere, the local reach at x, which is an infimum over all \(\eta _x\) above, cannot be greater than \(\frac{\pi }{2}\). Thus we can truncate at this angle, and so, by (2.2), (2.3), and the above, obtain that
as required. \(\square \)
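As a numerical sanity check on the argument above, the following minimal sketch evaluates a directional reach for a circle of latitude on the unit sphere \(S^2\). It assumes that the directional reach is \(\cot ^{-1}\) of the positive part of \(\sup _{y\ne x}\langle y,\eta _x\rangle /(1-\langle x,y\rangle )\), which is what the proof suggests but is not quoted from Lemma 5.1. The function names and the example manifold are hypothetical.

```python
import math

def directional_reach(x, eta, points):
    """Approximate the reach at x in direction eta over a finite sample of M.

    Assumed form (not quoted from the lemma): arccot of the positive part of
    sup_y <y, eta> / (1 - <x, y>), which is automatically at most pi/2.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    sup_ratio = max(dot(y, eta) / (1.0 - dot(x, y)) for y in points)
    positive_part = max(sup_ratio, 0.0)
    # arccot(s) for s >= 0, written via atan2 so that s = 0 gives pi/2.
    return math.atan2(1.0, positive_part)

def latitude_circle(phi, n=2000):
    """Sample the circle at colatitude phi, skipping the base point t = 0."""
    ts = [2.0 * math.pi * j / n for j in range(1, n)]
    return [(math.sin(phi) * math.cos(t), math.sin(phi) * math.sin(t), math.cos(phi))
            for t in ts]

phi = 0.6
x = (math.sin(phi), 0.0, math.cos(phi))
toward_pole = (-math.cos(phi), 0.0, math.sin(phi))    # unit normal pointing at the pole
away_from_pole = (math.cos(phi), 0.0, -math.sin(phi))  # opposite unit normal
circle = latitude_circle(phi)

reach_in = directional_reach(x, toward_pole, circle)
reach_out = directional_reach(x, away_from_pole, circle)
```

In this example the geodesic toward the pole reaches the pole at distance \(\phi =0.6\), and the pole is equidistant from every point of the circle, so `reach_in` should be close to 0.6. In the opposite direction no sampled point has a positive inner product with the normal, so `reach_out` is truncated at \(\frac{\pi }{2}\).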
Appendix 2
We need to show that the remainder terms, \(O(1/k^2)\), in (9.1) and (9.2) are of the right order and, just as importantly, are uniform over \(M\times M\).
As mentioned earlier, if the correlation estimates \(\widehat{\mathbb {C}} (x,y)\) of \(\mathbb {C}(x,y)\) were centered at sample means rather than zero (which we shall refer to as the ‘standard’ case), we could simply quote known results from the statistics literature to establish everything we need. These results are not hard to prove, but they involve pages of tedious algebra, which we do not reproduce here. Instead, we describe the standard proofs and indicate where changes are needed to cover our situation.
The standard case is treated in [14]. Following the derivation in Chapter 16, Section 16.24 there, we start by writing out the joint probability of k sample values \(\{(f_j(x),f_j(y))\}_{j=1}^k\) drawn from a bivariate normal density with zero means and unit variances in terms of the statistics we are interested in, namely,
along with the \(\widehat{\mathbb {C}}_k(x,y)\) of (5.1). These replace the standard sample mean centered version of these statistics in [14].
Using the result of Example 11.6 in Chapter 11 of [14], which gives the distribution of a sum of squares of i.i.d. standard normal variates, and following the discussion in Section 16.24 there, we find that the exact joint probability density of \(s_1\), \(s_2\), and \(\widehat{\mathbb {C}}_k(x,y)\), on \({\mathbb R}_+\times {\mathbb R}_+\times [-1,1]\), is given by
As in Section 16.32 of [14], we now integrate out \(s_1\) and \(s_2\), and use the remaining density of \(\widehat{\mathbb {C}}\) to compute that
where F is the hypergeometric function. Note that \(0\le \mathbb {C}^2(x,y)\le 1\) for all \((x,y)\in M\times M\). The fact that
uniformly for x in any bounded set (cf. [20]), together with Stirling’s approximation, which shows that the ratio of Gamma functions in (12.89) converges to 1 as \(k\rightarrow \infty \), gives the uniformity of the error bound in (9.1). A similar calculation establishes (9.2) and the uniformity of the error bound there.
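The effect of centering at zero can also be checked by simulation. The sketch below is an illustration under assumptions, not the derivation used here: it takes the zero-centered estimate to be \(\sum _j f_j(x)f_j(y)/(\sum _j f_j(x)^2\sum _j f_j(y)^2)^{1/2}\), and compares its Monte Carlo mean with the first-order value \(\rho (1-(1-\rho ^2)/(2k))\), the classical expansion for a sample correlation with k degrees of freedom. The function name, seed, and parameter values are hypothetical.

```python
import numpy as np

def zero_centered_correlation(fx, fy):
    """Correlation estimate centered at zero rather than at the sample means.

    fx, fy: arrays of shape (replications, k) holding the k sample values
    at the two points (assumed form of the estimate).
    """
    num = np.sum(fx * fy, axis=1)
    den = np.sqrt(np.sum(fx**2, axis=1) * np.sum(fy**2, axis=1))
    return num / den

rng = np.random.default_rng(0)
rho, k, replications = 0.5, 20, 100_000

# Bivariate normal pairs with zero means, unit variances, correlation rho.
fx = rng.standard_normal((replications, k))
noise = rng.standard_normal((replications, k))
fy = rho * fx + np.sqrt(1.0 - rho**2) * noise

estimates = zero_centered_correlation(fx, fy)
mc_mean = estimates.mean()
first_order = rho * (1.0 - (1.0 - rho**2) / (2.0 * k))
print(f"Monte Carlo mean: {mc_mean:.4f}, first-order value: {first_order:.4f}")
```

The two printed values should agree up to Monte Carlo error and terms of order \(1/k^2\), and the Monte Carlo mean should sit slightly below \(\rho \), reflecting the downward bias of order \(1/k\).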
Cite this article
Adler, R.J., Krishnan, S.R., Taylor, J.E. et al. Convergence of the reach for a sequence of Gaussian-embedded manifolds. Probab. Theory Relat. Fields 171, 1045–1091 (2018). https://doi.org/10.1007/s00440-017-0801-1
DOI: https://doi.org/10.1007/s00440-017-0801-1
Keywords
- Gaussian process
- Manifold
- Random embedding
- Critical radius
- Reach
- Curvature
- Asymptotics
- Fluctuation theory
Mathematics Subject Classification
- Primary 60G15
- 57N35
- Secondary 60D05
- 60G60