
Alternating Projections on Nontangential Manifolds


Abstract

We consider sequences \((B_{k})_{k=0}^{\infty}\) of points obtained by projecting a given point \(B=B_{0}\) back and forth between two manifolds \(\mathcal{M}_{1}\) and \(\mathcal{M}_{2}\), and give conditions guaranteeing that the sequence converges to a limit \(B_{\infty}\in\mathcal{M}_{1}\cap\mathcal{M}_{2}\). Our motivation is the study of algorithms based on finding the limit of such sequences, which have proved useful in a number of areas. The intersection is typically a set with desirable properties, but one for which there is no efficient method of finding the closest point \(B_{\mathrm{opt}}\) in \(\mathcal{M}_{1}\cap\mathcal{M}_{2}\). Under appropriate conditions, we prove not only that the sequence of alternating projections converges, but also that the limit point is fairly close to \(B_{\mathrm{opt}}\), in a manner controlled by the distance \(\|B_{0}-B_{\mathrm{opt}}\|\), thereby significantly improving earlier results in the field.
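
To make the iteration concrete, here is a minimal numerical sketch of such an alternating projection scheme, in the spirit of Cadzow's composite mapping algorithm for structured low-rank approximation [13, 36]. The choice of manifolds (rank-\(k\) matrices, projected via the truncated SVD [20], and Hankel matrices, projected by averaging along anti-diagonals) and all numerical parameters are illustrative assumptions, not the general setting of the paper.

```python
import numpy as np

def project_rank(B, k):
    """Closest rank-k matrix in Frobenius norm (Eckart-Young), via truncated SVD."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def project_hankel(B):
    """Closest Hankel matrix in Frobenius norm: average each anti-diagonal."""
    m, n = B.shape
    H = np.zeros_like(B)
    for d in range(m + n - 1):
        idx = [(i, d - i) for i in range(max(0, d - n + 1), min(m, d + 1))]
        avg = np.mean([B[i, j] for i, j in idx])
        for i, j in idx:
            H[i, j] = avg
    return H

def alternating_projections(B0, k, iters=200):
    """The sequence B_0, B_1, ... obtained by projecting back and forth."""
    B = B0
    for _ in range(iters):
        B = project_hankel(project_rank(B, k))
    return B

# Illustration: a noisy Hankel matrix built from one damped exponential (rank one plus noise).
t = np.arange(8)
noisy = 0.9 ** t + 0.05 * np.random.default_rng(0).standard_normal(8)
B0 = np.array([[noisy[i + j] for j in range(4)] for i in range(5)], dtype=float)
B_inf = alternating_projections(B0, k=1)
print(np.linalg.svd(B_inf, compute_uv=False))  # trailing singular values shrink toward 0
```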


Notes

  1. If the manifold \(\mathcal{M}\) is not convex, then there exist points with multiple closest points on the manifold. To define the projection onto \(\mathcal{M}\), one thus needs to involve point-to-set maps. We will not use this formalism, but rather write π(B) to denote an arbitrarily chosen closest point to B on \(\mathcal{M}\). This is done to simplify the presentation and because our results are stated in a local environment where the π’s are shown to be well-defined functions. See Proposition 2.3.

  2. I.e., a p times continuously differentiable function such that \(d\phi(x)\) is injective for all \(x\in\mathcal {B}_{\mathbb{R} ^{m}}(0,r)\).

References

  1. Absil, P.A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008)

  2. Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka–Lojasiewicz inequality. Math. Oper. Res. 35, 438–457 (2010)

  3. Badea, C., Grivaux, S., Müller, V.: A generalization of the Friedrichs angle and the method of alternating projections. C. R. Math. 348, 53–56 (2010)

  4. Basu, S., Pollack, R., Roy, M.-F.: Algorithms in Real Algebraic Geometry. Algorithms and Computation in Mathematics, vol. 10. Springer, Berlin (2006)

  5. Bauschke, H.H., Borwein, J.M.: On the convergence of von Neumann’s alternating projection algorithm for two sets. Set-Valued Anal. 1, 185–212 (1993)

  6. Bauschke, H.H., Borwein, J.M.: Dykstra’s alternating projection algorithm for two sets. J. Approx. Theory 79, 418–443 (1994)

  7. Bauschke, H.H., Borwein, J.M.: On projection algorithms for solving convex feasibility problems. SIAM Rev. 38, 367–426 (1996)

  8. Bauschke, H.H., Noll, D., Celler, A., Borwein, J.M.: An EM algorithm for dynamic SPECT. IEEE Trans. Med. Imaging 18, 252–262 (1999)

  9. Berger, M., Gostiaux, B.: Differential Geometry: Manifolds, Curves and Surfaces. Springer, Berlin (1988)

  10. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers. Now Publishers, Hanover (2011)

  11. Boyd, S., Xiao, L.: Least squares covariance matrix adjustment. SIAM J. Matrix Anal. Appl. 27, 532–546 (2005)

  12. Brègman, L.M.: Finding the common point of convex sets by the method of successive projection. Dokl. Akad. Nauk SSSR 162, 487–490 (1965)

  13. Cadzow, J.A.: Signal enhancement—a composite property mapping algorithm. IEEE Trans. Acoust. Speech Signal Process. 36, 49–62 (1988)

  14. Combettes, P.L., Trussell, H.J.: Method of successive projections for finding a common point of sets in metric spaces. J. Optim. Theory Appl. 67, 487–507 (1990)

  15. Cox, D.A., Little, J.B., O’Shea, D.: Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, vol. 10. Springer, Berlin (2007)

  16. Deutsch, F.: Rate of convergence of the method of alternating projections. In: Parametric Optimization and Approximation, pp. 96–107 (1985)

  17. Deutsch, F.: Best Approximation in Inner Product Spaces. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, vol. 7. Springer, New York (2001)

  18. Deutsch, F., Hundal, H.: The rate of convergence for the method of alternating projections, II. J. Math. Anal. Appl. 205, 381–405 (1997)

  19. Dykstra, R.L.: An algorithm for restricted least squares regression. J. Am. Stat. Assoc. 78, 837–842 (1983)

  20. Eckart, C., Young, G.: The approximation of one matrix by another of lower rank. Psychometrika 1, 211–218 (1936)

  21. Franklin, J.N.: Matrix Theory. Dover Publications, Mineola (2000)

  22. Friedrichs, K.: On certain inequalities and characteristic value problems for analytic functions and for functions of two variables. Trans. Am. Math. Soc. 41, 321–364 (1937)

  23. Gilbarg, D., Trudinger, N.: Elliptic Partial Differential Equations of Second Order, vol. 224. Springer, Berlin (2001)

  24. Grigoriadis, K.M., Frazho, A.E., Skelton, R.E.: Application of alternating convex projection methods for computation of positive Toeplitz matrices. IEEE Trans. Signal Process. 42, 1873–1875 (1994)

  25. Grubišić, I., Pietersz, R.: Efficient rank reduction of correlation matrices. Linear Algebra Appl. 422, 629–653 (2007)

  26. Gubin, L.G., Polyak, B.T., Raik, E.V.: The method of projections for finding the common point of convex sets. USSR Comput. Math. Math. Phys. 7, 1–24 (1967)

  27. Higham, N.J.: Computing the nearest correlation matrix—a problem from finance. IMA J. Numer. Anal. 22, 329–343 (2002)

  28. Hörmander, L.: The Analysis of Linear Partial Differential Operators. III. Springer, Berlin (2007)

  29. Kayalar, S., Weinert, H.L.: Error bounds for the method of alternating projections. Math. Control Signals Syst. (MCSS) 1, 43–59 (1988)

  30. Kendig, K.: Elementary Algebraic Geometry. Springer, Berlin (1977)

  31. Krantz, S.G., Parks, H.R.: Distance to C k hypersurfaces. J. Differ. Equ. 40, 116–120 (1981)

  32. Lax, P.D.: Linear Algebra and Its Applications, 2nd edn. Pure and Applied Mathematics (Hoboken). Wiley-Interscience/Wiley, Hoboken (2007)

  33. Levi, A., Stark, H.: Signal restoration from phase by projections onto convex sets. J. Opt. Soc. Am. 73, 810–822 (1983)

  34. Lewis, A.S., Luke, D., Malick, J.: Local linear convergence for alternating and averaged nonconvex projections. Found. Comput. Math. 9, 485–513 (2009)

  35. Lewis, A.S., Malick, J.: Alternating projections on manifolds. Math. Oper. Res. 33, 216–234 (2008)

  36. Li, Y., Liu, K., Razavilar, J.: A parameter estimation scheme for damped sinusoidal signals based on low-rank Hankel approximation. IEEE Trans. Signal Process. 45, 481–486 (1997)

  37. Lu, B., Wei, D., Evans, B.L., Bovik, A.C.: Improved matrix pencil methods. In: Conference Record of the Thirty-Second Asilomar Conference on Signals, Systems Computers, vol. 2, pp. 1433–1437 (1998)

  38. Markovsky, I.: Structured low-rank approximation and its applications. Automatica 44, 891–909 (2008)

  39. Marks, R.J. II: Alternating projections onto convex sets. In: Jansson, P.A. (ed.) Deconvolution of Images and Spectra, 2nd edn., pp. 476–501. Academic Press, San Diego (1997)

  40. Montiel, S., Ros, A., Babbitt, D.G.: Curves and Surfaces, vol. 69. Am. Math. Soc., Providence (2009)

  41. Munkres, J.R.: Topology, 2nd edn. Prentice Hall, Upper Saddle River (2000)

  42. Nikolski, N.K.: Operators, Functions, and Systems: An Easy Reading. Vol. 1. Mathematical Surveys and Monographs, vol. 92. Am. Math. Soc., Providence (2002). Hardy, Hankel, and Toeplitz, translated from the French by Andreas Hartmann

  43. Prabhu, V.U., Jalihal, D.: An improved ESPRIT-based time-of-arrival estimation algorithm for vehicular OFDM systems. In: IEEE 69th Vehicular Technology Conference (VTC Spring 2009), pp. 1–4 (2009)

  44. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Grundlehren der Mathematischen Wissenschaften, vol. 317. Springer, Berlin (1998)

  45. Sard, A.: The measure of the critical values of differentiable maps. Bull. Am. Math. Soc. 48, 883–890 (1942)

  46. Schmidt, E.: Zur Theorie der linearen und nichtlinearen Integralgleichungen. III. Teil. Math. Ann. 65, 370–399 (1908)

  47. Shafarevich, I.R.: Basic Algebraic Geometry. Springer, New York (1974). Translated from the Russian by K.A. Hirsch, Die Grundlehren der mathematischen Wissenschaften, Band 213

  48. Toh, K.C., Todd, M.J., Tütüncü, R.H.: SDPT3—a MATLAB software package for semidefinite programming, version 1.3. Optim. Methods Softw. 11, 545–581 (1999)

  49. Toh, K.C., Todd, M.J., Tütüncü, R.H.: On the implementation and usage of SDPT3—a Matlab software package for semidefinite-quadratic-linear programming, version 4.0. In: Handbook on Semidefinite, Conic and Polynomial Optimization, pp. 715–754 (2012)

  50. Trotman, D.: Singularity Theory. Scientific Publishing, Oxford (2007). Chap. Lectures on real stratification theory

  51. Tütüncü, R.H., Toh, K.C., Todd, M.J.: Solving semidefinite-quadratic-linear programs using SDPT3. Math. Program. 95, 189–217 (2003)

  52. von Neumann, J.: Functional Operators, Volume II: The Geometry of Orthogonal Spaces. Princeton University Press, Princeton (1950)

  53. Whitney, H.: Elementary structure of real algebraic varieties. Ann. Math. 66, 545–556 (1957)

  54. Xu, J., Zikatanov, L.: The method of alternating projections and the method of subspace corrections in Hilbert space. J. Am. Math. Soc. 15, 573–598 (2002)

  55. Zangwill, W.I.: Nonlinear Programming. Prentice Hall, Englewood Cliffs (1969)


Acknowledgements

This work was supported by the Swedish Research Council and the Swedish Foundation for International Cooperation in Research and Higher Education, as well as the Dirección de Investigación Científica y Tecnológica de la Universidad de Santiago de Chile, Chile. Part of the work was conducted while Marcus Carlsson was employed at Universidad de Santiago de Chile. We thank Arne Meurman for fruitful discussions. We would also like to thank one of the reviewers for the constructive criticism and useful suggestions that have helped to improve the quality of the paper.

Author information

Corresponding author

Correspondence to Fredrik Andersson.

Additional information

Communicated by Stephane Jaffard.

Appendices

Appendix A

Recall the map ϕ introduced in Sect. 2, and let A be a point in \(\mathcal{K}\). Throughout, we will assume without loss of generality that ϕ(0)=A and that the domain of definition of ϕ is \(\mathcal {B}_{\mathbb{R}^{m}}(0,r)\) for some r>0. We also let s be such that (2.1) holds.

Proposition 2.1

Let \(\alpha:\mathcal {K}\rightarrow\mathcal{M} \) be any C p-map, where \(\mathcal{M}\) is a C p-manifold and p≥1. Then the map ϕ −1α is also C p (on its natural domain of definition).

Proof

Given \(x_{0}\in\mathcal{B}_{\mathbb{R}^{m}}(0,r)\), pick \(f_{1},\ldots ,f_{n-m}\in\mathcal{K}\) with the property that

$$\bigl(T_\mathcal{M}\bigl(\phi(x_0)\bigr) \bigr)^\perp=\operatorname {Span}\{ f_1,\ldots,f_{n-m}\}, $$

and define \(\omega_{x_{0}}:\mathcal{B}_{\mathbb{R}^{m}}(0,r)\times\mathbb {R}^{n-m}\rightarrow\mathcal{K}\) via

$$\omega_{x_0}(x,y)=\phi(x)+\sum_{i=1}^{n-m}y_if_i. $$

By the inverse function theorem ([9] 0.2.22), \(\omega _{x_{0}}\) has a C p-inverse in a neighborhood of (x 0,0). The proposition now follows by noting that for values of α near ϕ(x 0), we have \(\phi^{-1}\circ\alpha=\omega_{x_{0}}^{-1}\circ\alpha\). □

Proposition 2.2

Let \(\mathcal{M}\) be a C 1-manifold. Then \(P_{T_{\mathcal{M}}(A)}\) is a continuous function of A.

Proof

Given \(A\in \operatorname {Im}\phi\), set \(M=d\phi(\phi^{-1}(A))\). It is easy to see that

$$P_{T_{\mathcal{M}}(A)}=M\bigl(M^*M\bigr)^{-1}M^*. $$

The conclusion now follows since \(d\phi\) and \(\phi^{-1}\) are continuous. □
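
As a quick sanity check of this formula (an assumed example, not taken from the paper), take the chart \(\phi(x)=(x,x^{2})\) of a parabola in \(\mathbb{R}^{2}\): the single column of \(M=d\phi(x)\) spans the tangent line, and \(M(M^{*}M)^{-1}M^{*}\) is the orthogonal projection onto it.

```python
import numpy as np

# Assumed example chart phi(x) = (x, x^2) for a parabola M in R^2.
def dphi(x):
    return np.array([[1.0], [2.0 * x]])     # Jacobian; its column spans T_M(phi(x))

x = 0.7
M = dphi(x)                                  # M = dphi(phi^{-1}(A)) at A = phi(x)
P = M @ np.linalg.inv(M.T @ M) @ M.T         # P_{T_M(A)} = M (M^* M)^{-1} M^*

print(np.allclose(P @ P, P), np.allclose(P, P.T))   # idempotent and self-adjoint
print(np.allclose(P @ M, M))                        # leaves the tangent direction fixed
```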

Propositions 2.3 and 2.4 are a bit harder. We begin with a lemma.

Lemma 8.1

If \(B\in\mathcal{K}\) is given and A is the closest point in \(\mathcal{M}\), then \(B-A\perp T_{\mathcal{M}}(A)\). Moreover, \(\|\phi(x)-\phi(y)\|/\|x-y\|\) is uniformly bounded above and below for x, y in any \(\mathcal {B}(0,r')\), r′<r.

Proof

Note that

$$ \phi(x)=A+d\phi(0)x+\textsl{o}(x), $$

where o stands for a function with the property that o(x)/∥x∥ extends by continuity to 0 and takes the value 0 there. Thus we have

$$\begin{aligned} \bigl\|\phi(x)-B\bigr\|^2&=\bigl\|A+d\phi(0) (x)+\textsl{o}(x)-B\bigr\|^2 \\ &=\|A-B\|^2+2\bigl\langle A-B, d\phi(0)x \bigr\rangle+\textsl{o}\bigl(\|x \|\bigr), \end{aligned}$$

and hence, since the left-hand side is minimized at x=0 (A being a closest point to B), the scalar product needs to be zero for all x’s. For the second claim, set w=(ϕ(y)−ϕ(x))/∥ϕ(y)−ϕ(x)∥, and apply the mean value theorem to γ(t)=〈ϕ(x+(y−x)t)−ϕ(x),w〉 to conclude that

$$\begin{aligned} & \bigl\|\phi(y)-\phi(x)\bigr\|=\gamma(1)-\gamma(0)=\bigl\langle d\phi (z) (y-x),w\bigr \rangle \end{aligned}$$

for some z on the line between x and y. Letting \(\sigma_{1}(d\phi(z)),\ldots,\sigma_{m}(d\phi(z))\) denote the singular values of \(d\phi(z)\), we thus have

$$\inf_{z\in\mathcal{B}_{\mathbb{R}^m}(0,r')} \bigl\{\sigma_m\bigl(d\phi(z)\bigr) \bigr\} \|y-x\|\leq \bigl\|\phi(y)-\phi(x)\bigr\|\leq\sup_{z\in\mathcal{B}_{\mathbb{R}^m}(0,r')} \bigl\{ \sigma_1\bigl(d\phi(z)\bigr)\bigr\}\|y-x\|. $$

Now, \(d\phi(z)\) depends continuously on z, and its singular values depend continuously on the matrix entries [21, p. 191]. Since ϕ is an immersion, we have \(\sigma_{m}(d\phi(x))\neq0\) for all \(x\in\mathcal {B}(0,r)\), so by compactness of \(\mathit{cl}(\mathcal{B}(0,r'))\), we get that both the inf and sup amount to finite positive numbers, as desired. □

Since s has already been specified above, we work with s′ in the formulation below.

Proposition 2.3

Let \(\mathcal{M}\) be a C 2-manifold. Given any fixed \(A\in \mathcal{M}\), there exists s′>0 and a C 2-map

$$\pi:\mathcal{B}_{\mathcal{K}}\bigl(A,s'\bigr)\rightarrow \mathcal{M} $$

such that for all \(B\in \mathcal{B}_{\mathcal{K}}(A,s')\), there exists a unique closest point in \(\mathcal{M}\) which is given by π(B). Moreover, \(C\in\mathcal{M}\cap\mathcal {B}_{\mathcal{K}}(A,s')\) equals π(B) if and only if \(B-C\perp T_{\mathcal{M}}(C)\).

Proof

We repeat the standard construction of a tubular neighborhood of \(\mathcal{M}\) (see, e.g., [9] or [40]). By standard differential geometry, there exists an r 0<r and C 1-functions \(f_{1},\ldots,f_{n-m}:\mathcal{B}_{\mathbb{R}^{m}}(0,r_{0})\rightarrow \mathcal{K}\) with the property that

$$\bigl(T_\mathcal{M}\bigl(\phi(x)\bigr) \bigr)^\perp=\operatorname {Span}\bigl\{ f_1(x),\ldots,f_{n-m}(x)\bigr\} $$

for all \(x\in\mathcal{B}_{\mathbb{R}^{m}}(0,r_{0})\). Moreover, applying the Gram-Schmidt process, we may assume that {f 1(x),…,f nm (x)} is an orthonormal set for all \(x\in \mathcal{B} _{\mathbb{R} ^{m}}(0,r_{0})\). Define \(\tau:\mathcal{B}_{\mathbb{R}^{m}}(0,r_{0})\times\mathbb {R}^{n-m}\rightarrow\mathcal{K}\) via

$$\tau(x,y)=\phi(x)+\sum_{i=1}^{n-m}y_if_i(x). $$

τ is C 1 by construction, so the inverse function theorem implies that there exists r 1 such that τ is a diffeomorphism from \(\mathcal{B}_{\mathbb{R}^{n}}(0,r_{1})\) onto a neighborhood of A. Choose s′ such that s′<r 1/4, 2s′<s and

$$ \mathcal{B}_{\mathcal{K}}\bigl(A,2s'\bigr)\subset\tau \bigl(\mathcal{B}(0,r_1/2)\bigr). $$
(8.1)

Given any \(B\in\mathcal{B}_{\mathcal{K}}(A,s')\), there thus exists a unique (x B ,y B ) such that B=τ((x B ,y B )) and ∥(x B ,y B )∥≤r 1/2. We define

$$\pi(B)=\phi(x_B). $$

To see that π is a C 1-map, let \(\theta:\mathbb{R}^{m}\times\mathbb {R}^{n-m}\rightarrow\mathbb{R}^{m}\) be given by θ((x,y))=x, and note that on \(\mathcal{B}(A,s')\), we have

$$\pi=\phi\circ\theta\circ(\tau|_{\mathcal{B}_{\mathbb {R}^n}(0,r_1)})^{-1}. $$

For the fact that π is actually C 2 (which is not needed in this paper), we refer to [31] or Sect. 14.6 in [23]. We now show that π(B) has the desired properties. Suppose \(C\in \mathcal{M}\) is a closest point to B. Since ∥A−B∥<s′, we clearly must have ∥C−A∥<2s′, so by (2.1) and (8.1), there exists an \(x_{C}\in\mathcal{B}_{\mathbb{R}^{m}}(0,r_{1}/2)\) with ϕ(x C )=C. Moreover, by Lemma 8.1, \(B-C\perp T_{\mathcal{M}}(C)\), so (by the orthonormality of the f’s at x C ) there exists a \(y\in\mathbb{R}^{n-m}\) with τ(x C ,y)=B and ∥y∥=∥B−C∥<s′<r 1/4. Since ∥(x C ,y)∥≤r 1/2+r 1/4 and τ is a bijection on \(\mathcal{B} (0,r_{1})\), we conclude that (x C ,y)=(x B ,y B ). Hence π(B)=ϕ(x B )=ϕ(x C )=C. This establishes the first part of the proposition.

By the construction, \(\pi(B)-B\in \operatorname {Span}\{ f_{1}(x_{B}),\ldots,f_{n-m}(x_{B})\}\), and hence it is orthogonal to \(T_{\mathcal{M}}(\phi(x_{B}))=T_{\mathcal{M}}(\pi(B))\). Conversely, let C be as in the second part of the proposition. As above, we have C=ϕ(x C ) with ∥x C ∥<r 1/2, and there exists a y with B=τ((x C ,y)) and ∥y∥=∥B−C∥<2s′<r 1/2. As earlier, this implies C=π(B), as desired. □
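
For a concrete illustration of the construction (an assumed example), take \(\mathcal{M}\) to be the unit circle in \(\mathbb{R}^{2}\) with chart \(\phi(t)=(\cos t,\sin t)\) and unit normal \(f(t)=(\cos t,\sin t)\), so that \(\tau(t,y)=(1+y)\phi(t)\). Near \(A=(1,0)\) the closest-point map is \(\pi(B)=B/\|B\|\), and the orthogonality characterization of Proposition 2.3 can be checked directly.

```python
import numpy as np

# Assumed example: M is the unit circle, chart phi(t) = (cos t, sin t), A = (1, 0).
def pi(B):
    return B / np.linalg.norm(B)             # closest point on the circle (for B != 0)

B = np.array([1.1, 0.2])                      # a point in a small ball around A
C = pi(B)
tangent = np.array([-C[1], C[0]])             # T_M(C) is spanned by (-sin t, cos t)

# B - pi(B) is orthogonal to T_M(pi(B)), as in Proposition 2.3.
print(abs(np.dot(B - C, tangent)) < 1e-12)    # True

# The tubular-neighborhood map tau(t, y) = phi(t) + y f(t) = (1 + y) phi(t)
# recovers B with x_B the angle of C and y_B = ||B|| - 1, so pi(B) = phi(x_B).
t_B, y_B = np.arctan2(C[1], C[0]), np.linalg.norm(B) - 1.0
print(np.allclose((1 + y_B) * np.array([np.cos(t_B), np.sin(t_B)]), B))  # True
```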

Proposition 2.4

Let \(\mathcal{M}\) be a locally C 2-manifold at A. For each ϵ>0, there exists s ϵ >0 such that for all \(C\in\mathcal{B}_{\mathcal{K}}(A,s_{\epsilon})\cap\mathcal{M}\), we have:

  1. (i)

    \(\operatorname {dist}(D,\tilde{T}_{\mathcal{M}}(C))< \epsilon\|D-C\|\), \(D\in \mathcal{B}(A,s_{\epsilon})\cap\mathcal{M}\),

  2. (ii)

    \(\operatorname {dist}(D,\mathcal{M})< \epsilon\|D-C\|\), \(D\in \mathcal{B}(A,s_{\epsilon})\cap\tilde{T}_{\mathcal{M}}(C)\).

Proof

We first prove (i). We may clearly assume that s ϵ <s. Set \(w=D-(C+d\phi(x_{C})(x_{D}-x_{C}))\), and note that \(\operatorname {dist}(D,\tilde {T}_{\mathcal{M}}(C))\leq \|w\|\). Apply the mean value theorem to the function

$$\gamma(t)= \biggl\langle\phi \bigl((x_D-x_C)t+x_C \bigr)-\phi (x_C)-d\phi (x_C) (x_D-x_C)t, \frac{w}{\|w\|} \biggr\rangle $$

to conclude that there exists a y on the line [x C ,x D ] between x C and x D such that

$$\begin{aligned} \|w\|=\gamma(1)-\gamma(0)= \biggl\langle \bigl(d\phi(y+x_C)-d \phi(x_C) \bigr) (x_D-x_C), \frac{w}{\| w\| } \biggr\rangle. \end{aligned}$$

By the Cauchy–Schwarz inequality, we get

$$\begin{aligned} \operatorname {dist}\bigl(D,\tilde{T}_\mathcal{M}(C)\bigr)\leq \bigl\|d\phi(y+x_C)-d \phi(x_C)\bigr\|\|x_D-x_C\| \end{aligned}$$

for some y∈[x C ,x D ]. By Lemma 8.1 and the uniform continuity of continuous functions on compact sets, we can choose an r ϵ <r such that \(\operatorname {dist}(D,\tilde{T}_{\mathcal{M}}(C))< \epsilon\|D-C\|\) for all \(x_{D},x_{C}\in\mathcal{B} (0,r_{\epsilon})\). As in the proof of Proposition 2.3, we can now define τ and use it to pick an s ϵ such that \(x_{C},x_{D}\in\mathcal{B}(0,r_{\epsilon})\) whenever \(C,D\in\mathcal {B}(A,s_{\epsilon})\). Hence (i) holds.

The proof of (ii) is similar. We assume without restriction that \(C\in \operatorname {Im}\phi\) and let y be such that \(D=C+d\phi(x_{C})y\). We may choose s ϵ small enough that ∥x C +y∥<r, and then clearly \(\operatorname {dist}(D,\mathcal{M})\leq\|\phi (x_{C}+y)-D\|\). Setting w=ϕ(x C +y)−D and applying the mean value theorem to

$$\gamma(t)= \biggl\langle\phi(x_C+yt)-C-d\phi(x_C) yt, \frac{w}{\|w\| } \biggr\rangle, $$

we easily obtain

$$\begin{aligned} &\operatorname {dist}(D,\mathcal{M})\leq\|w\| \leq\bigl\|d\phi (x_C+t_0y)-d \phi (x_C)\bigr\|\| y\| \end{aligned}$$

for some t 0∈[0,1]. We omit the remaining details. □
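
A small numerical illustration of (i) under an assumed chart (the parabola \(\phi(x)=(x,x^{2})\) in \(\mathbb{R}^{2}\), with \(A\) the origin): the ratio \(\operatorname{dist}(D,\tilde{T}_{\mathcal{M}}(C))/\|D-C\|\) becomes uniformly small as the ball around \(A\) shrinks, which is the content of the proposition.

```python
import numpy as np

# Assumed example: M = {(x, x^2)} in R^2, A = (0, 0).
def phi(x):  return np.array([x, x * x])
def dphi(x): return np.array([1.0, 2.0 * x])

def ratio(c, d):
    """dist(D, affine tangent space at C) / ||D - C|| for C = phi(c), D = phi(d)."""
    C, D, v = phi(c), phi(d), dphi(c)
    w = (D - C) - np.dot(D - C, v) / np.dot(v, v) * v   # component normal to T_M(C)
    return np.linalg.norm(w) / np.linalg.norm(D - C)

rng = np.random.default_rng(1)
for s in [1.0, 0.1, 0.01]:                    # shrinking neighborhoods of A
    worst = max(ratio(*(s * rng.uniform(-1, 1, 2))) for _ in range(1000))
    print(s, worst)                           # the worst observed ratio shrinks with s
```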

Appendix B

When dealing with concepts such as dimension or the ideal generated by a variety, we will use \(\mathbb{R}\) and \(\mathbb{C}\) as subscripts when there can be confusion about which field is involved. To begin, a short argument shows that

$$ \mathbb{I}_{\mathbb{C}}(\mathcal {V}_{Zar})=\mathbb {I}_{\mathbb{R}}(\mathcal{V} )+ \mathrm{i} \mathbb{I}_{\mathbb{R}}( \mathcal{V}), $$
(9.1)

from which (6.2) easily follows. When we work over \(\mathbb {C}\), we define ∇ as \(\nabla=(\partial_{z_{1}},\ldots,\partial_{z_{n}})\), where \(\partial_{z_{j}}\) refers to the formal partial derivatives or, equivalently, the standard derivatives of analytic functions (see, e.g., Definition 3.3, Chap. II of [30]). We need the following classical result from algebraic geometry; see, e.g., Theorem 3 and the comments following it, Chap. II, Sect. 1, of [47].

Theorem 9.1

Let \(\mathcal{V}\) be an irreducible complex algebraic variety with algebraic dimension d. Then \(\operatorname {dim}_{\mathbb{C}}\operatorname {Span}_{\mathbb{C}}\{ \nabla p(A): p\in\mathbb{I}_{\mathbb{C}}(\mathcal{V})\}< n-d\) for all singular points \(A\in \mathcal{V}\) and \(\operatorname {dim}_{\mathbb{C}}\operatorname {Span}_{\mathbb{C}}\{\nabla p(A): p\in\mathbb {I}_{\mathbb{C}}(\mathcal{V})\} = n-d\) for all nonsingular points \(A\in\mathcal{V}\). Moreover, the set of singular points forms a complex variety of lower dimension.
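
As an illustration (an assumed example, not from the paper), consider the irreducible plane curve \(\mathcal{V}=\mathbb{V}(y^{2}-x^{3})\) in \(\mathbb{C}^{2}\), so n=2 and d=1. Since the ideal is principal and prime, the gradient span at a point of \(\mathcal{V}\) is spanned by the gradient of the generator alone, and it collapses exactly at the cusp.

```python
import sympy as sp

x, y = sp.symbols('x y')
p = y**2 - x**3                       # generator of the (prime) ideal of the cuspidal cubic
grad = [sp.diff(p, v) for v in (x, y)]

def span_dim(point):
    # For q = h*p one has grad q(A) = h(A) grad p(A) on the curve, so the span
    # of all gradients from the ideal is spanned by grad p(A) alone.
    g = [gi.subs({x: point[0], y: point[1]}) for gi in grad]
    return 0 if all(gi == 0 for gi in g) else 1

print(span_dim((0, 0)))   # 0 < n - d = 1: the origin is a singular point
print(span_dim((1, 1)))   # 1 = n - d:     the point (1, 1) is nonsingular
```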

If P is a set of polynomials, we will write \(\mathbb{V}(P)\) for the variety of common zeroes (in \(\mathbb{R}^{n}\) or \(\mathbb{C}^{n}\) depending on the context). Given two sets S 1 and S 2 and a common point A, we will write that \(S_{1}\overset{\mathrm{loc}}{=}S_{2}\) near A, if there exists an open set U containing A such that S 1U=S 2U. Theorem 9.1 shows that for an irreducible variety, the nonsingular points coincide with those of maximal rank in the work of H. Whitney [53]. This observation combined with Sects. 6 and 7 of that paper yields the following key result.

Theorem 9.2

Let \(\mathcal{V}\) be an irreducible complex algebraic variety of algebraic dimension d, and let A be a nonsingular point. Given any \(p_{1},\ldots,p_{n-d}\in{\mathcal {I}}(\mathcal{V})\) such that \(\nabla p_{1}(A),\ldots,\nabla p_{n-d}(A)\) are linearly independent, we have \(\mathcal{V}\overset{\mathrm{loc}}{=}\mathbb{V}(\{p_{1},\ldots,p_{n-d}\} )\) near A.

The remark after Proposition 6.4 now easily follows from the above results and the complex version of the implicit function theorem (see, e.g., Theorem 3.5, Chap. II, [30]). In order to transfer these results to the real setting, we first need a lemma.

Lemma 9.3

A real algebraic variety \(\mathcal{V}\) is irreducible in \(\mathbb {R}^{n}\) if and only if \(\mathcal{V}_{Zar}\) is irreducible in  \(\mathbb{C}^{n}\).

Proof

If \(\mathcal{V}\) has a nontrivial decomposition in \(\mathbb{R}^{n}\) into smaller varieties as \(\mathcal{U}_{1}\cup\mathcal{U}_{2}\), then clearly \((\mathcal {U}_{1})_{Zar}\cup(\mathcal{U}_{2})_{Zar}\) is a nontrivial decomposition of \(\mathcal{V}_{Zar}\). Conversely, let \(\mathcal{V}_{Zar}=\mathcal {U}_{1}\cup\mathcal{U}_{2}\) be a nontrivial decomposition of \(\mathcal{V}_{Zar}\) in \(\mathbb {C}^{n}\). Let \(\mathbb {I}_{\mathbb{C}}(\mathcal{U}_{i})\) denote the respective complex ideals, i=1,2, and set \(\mathcal{I}_{i}=(\operatorname {Re}\mathbb{I}_{\mathbb{C}}(\mathcal {U}_{i}))\cup (\operatorname {Im}\mathbb{I}_{\mathbb{C}}(\mathcal{U}_{i}))\). The real variety corresponding to each \(\mathcal{I}_{i}\) clearly coincides with \(\mathcal{U}_{i}\cap \mathbb{R}^{n}\), which thus is a variety in \(\mathbb{R}^{n}\). Since \(\mathcal{V}_{Zar}\) is the smallest complex variety including \(\mathcal{V}\), \(\mathcal{V}\) is not a subset of either \(\mathcal {U}_{1}\) or \(\mathcal{U}_{2}\), and hence \(\mathcal{V}=(\mathcal{U}_{1}\cap \mathbb{R} ^{n})\cup (\mathcal{U}_{2}\cap\mathbb{R}^{n})\) is a nontrivial real decomposition of \(\mathcal{V}\). □

Lemma 9.4

Let \(\mathcal{V}\) be an irreducible real algebraic variety such that \(\mathcal{V}_{Zar}\) has algebraic dimension d. Then \(\operatorname {dim}_{\mathbb{R} }N_{\mathcal{V}}(A)< n-d\) for all singular points \(A\in\mathcal{V}\) and \(\operatorname {dim}_{\mathbb{R} }N_{\mathcal{V} }(A)= n-d\) for all nonsingular points \(A\in\mathcal{V}\).

Proof

First note that \(\mathcal{V}_{Zar}\) is irreducible, by Lemma 9.3, and hence Theorem 9.1 applies to \(\mathcal{V}_{Zar}\). Given any real vector space V, it follows by linear algebra that \(\dim_{\mathbb{R}}V=\dim _{\mathbb{C} }(V+ i V)\). By (9.1), we thus get

$$\operatorname {dim}_{\mathbb{R}}\operatorname {Span}_{\mathbb{R}}\bigl\{\nabla p(A): p\in\mathbb {I}_{\mathbb{R}}(\mathcal{V})\bigr\}=\operatorname {dim}_{\mathbb{C} }\operatorname {Span}\bigl\{\nabla p(A): p\in\mathbb{I}_{\mathbb{C}}(\mathcal{V})\bigr\} $$

for all \(A\in\mathbb{R}^{n}\), and hence the lemma follows by Theorem 9.1. □

Lemma 9.5

Let \(\mathcal{V}\) be an irreducible real algebraic variety such that \(\mathcal{V}_{Zar}\) has algebraic dimension d. Then \(\mathcal{V}^{\mathrm{ns}}\) is a \(C^{\infty}\)-manifold of dimension d, and \(\mathcal{V}\setminus\mathcal{V}^{\mathrm{ns}}\) is a real algebraic variety such that \((\mathcal{V}\setminus\mathcal{V}^{\mathrm{ns}})_{Zar}\) has algebraic dimension strictly less than d.

Proof

By Lemma 9.3, \(\mathcal{V}_{Zar}\) is irreducible. By Theorem 9.1, we have that the singular points of \(\mathcal{V}_{Zar}\) form a proper subvariety of dimension strictly less than d. Hence \(\mathcal{V}\setminus\mathcal {V}^{\mathrm{ns}}\) is included in a complex subvariety of lower dimension than d. Since the algebraic dimension decreases when taking subvarieties, we conclude that \(\dim(\mathcal{V}\setminus\mathcal{V}^{\mathrm{ns}})_{Zar}<d\). If \(\mathcal{V}^{\mathrm{ns}}\) were empty, \((\mathcal{V}\setminus\mathcal{V}^{\mathrm{ns}})_{Zar}\) would include all of \(\mathcal{V}\), contradicting the definition of \(\mathcal{V}_{Zar}\) as the smallest complex variety including \(\mathcal{V}\). Finally, given \(A\in\mathcal{V}^{\mathrm{ns}}\), Theorem 9.2 and Lemma 9.4 imply that we can find \(p_{1},\ldots,p_{n-d}\in\mathcal {I}_{\mathbb{R}}(\mathcal{V})\) with linearly independent derivatives at A such that

$$ \mathcal{V}^{\mathrm{ns}}\overset{\mathrm{loc}}{=}\mathbb {V}\bigl(\{ p_1,\ldots,p_{n-d}\}\bigr) $$
(9.2)

near A. The fact that \(\mathcal{V}^{\mathrm{ns}}\) is a d-dimensional C -manifold now follows directly from Theorem 2.1.2(ii) in [9]. □

We are now ready to prove the results in Sect. 6.

Proof of Propositions 6.2 and 6.4

We begin with Proposition 6.2. Let \(\mathcal{V}_{Zar}\) have algebraic dimension d. First assume that \(\mathcal{V}\) is irreducible. By Lemma 9.5, \(\mathcal{V}^{\mathrm{ns}}\) is a nonempty manifold of dimension d, and \((\mathcal{V}\setminus\mathcal{V}^{\mathrm{ns}})_{Zar}\) has algebraic dimension strictly less than d. Thus \((\mathcal{V}\setminus\mathcal{V}^{\mathrm{ns}})_{Zar}\) can be decomposed into finitely many irreducible components of dimension strictly less than d (see Sects. 4.6 and 9.4 in [15]). Lemma 9.5 can then be applied to the real part of each such component. Continuing like this, the dimension drops at each step, and hence the process must terminate. This process will give us a decomposition \(\mathcal{V}=\bigcup_{j=0}^{d} \mathcal{M}_{j}\), where each \(\mathcal{M}_{j}\) has dimension j. Now, such decompositions are not unique, but basic differential geometry implies that the number d is an invariant of \(\mathcal{V}\). Indeed, let \(\mathcal{V}=\bigcup_{j=0}^{m} \tilde {\mathcal{M}}_{j}\) be another such decomposition with \(\tilde{\mathcal{M}}_{m}\neq\emptyset\), and suppose that m>d. Let ϕ be a chart covering a patch of \(\tilde{\mathcal{M}}_{m}\), as in (2.1). ϕ is then defined on an open subset U of \(\mathbb{R}^{m}\), and the subsets \(\phi^{-1}(\mathcal{M}_{j})\) are manifolds of dimension strictly less than m. Hence each one has Lebesgue measure zero, which is incompatible with their union being all of U. Reversing the roles of m and d, Proposition 6.2 follows in the irreducible case. Incidentally, we have also shown the first part of Proposition 6.4.

If \(\mathcal{V}\) is not irreducible, we can apply the above argument to each of its irreducible components. Since the dimension of \(\mathcal {V}_{Zar}\) is the maximum of the dimension of its components (Proposition 8, Sect. 9.6, [15]), Proposition 6.2 follows as above.

Finally, with \(A\in\mathcal{V}^{\mathrm{ns}}\), the identity (6.3) follows by (9.2) and the implicit function theorem. This establishes the second part of Proposition 6.4, and we are done. □

Proof of Proposition 6.3

This is now immediate by Lemma 9.4. □

Proof of Proposition 6.5

\(\mathcal{V}_{Zar}\) is a proper subvariety of both \((\mathcal {V}_{1})_{Zar}\) and \((\mathcal{V}_{2})_{Zar}\), and all three are irreducible by assumption and Lemma 9.3. Hence \(\mathcal{V}_{Zar}\) has strictly lower dimension than the other two, by basic algebraic geometry; see, e.g., Theorem 1, Chap. I, Sect. 6, in [47]. By Lemma 9.5, the manifold \(\mathcal{V}^{\mathrm{ns}}\) has strictly lower dimension than both \(\mathcal {V}^{\mathrm{ns}}_{1}\) and \(\mathcal{V}^{\mathrm{ns}}_{2}\). This means that these have a proper intersection at any intersection point \(A\in\mathcal{V}^{\mathrm{ns}}\cap\mathcal {V}^{\mathrm{ns}}_{1}\cap\mathcal{V}^{\mathrm{ns}}_{2}\), which by Proposition 3.2 means that the angle at A is defined, as desired. □

Proof of Theorem 6.6

By Proposition 6.5, all points \(A\in\mathcal{V}^{\mathrm{ns}}\cap \mathcal{V} ^{\mathrm{ns}}_{1}\cap \mathcal{V}^{\mathrm{ns}}_{2}\) are nontrivial (i.e., the angle between \(\mathcal {V}^{\mathrm{ns}}_{1}\)and \(\mathcal{V}^{\mathrm{ns}}_{2}\) exists). We first prove the latter statement. By Proposition 6.4, (6.4) implies that

$$ \operatorname {dim}\bigl(N_{\mathcal{V}_1}(A)+ N_{\mathcal {V}_2}(A)\bigr)\geq n-m. $$
(9.3)

Since obviously

$$ N_{\mathcal{V}}(A)\supset N_{\mathcal {V}_1}(A) + N_{\mathcal{V}_2}(A), $$
(9.4)

we have \(\dim(N_{\mathcal{V}}(A))\geq n-m\), which combined with Lemma 9.4 and Proposition 6.2 shows that in fact we must have \(A\in \mathcal{V} ^{\mathrm{ns}}\) and \(\dim(N_{\mathcal{V}}(A))= n-m\). Moreover, combined with (9.3) and (9.4), this implies that \(N_{\mathcal{V}}(A)=N_{\mathcal{V}_{1}}(A) + N_{\mathcal{V}_{2}}(A)\) which, upon taking the complement and recalling (6.3), yields

$$T_{\mathcal{V}_1^{\mathrm{ns}}}(A)\cap T_{\mathcal{V}_2^{\mathrm{ns}}}(A)=T_{\mathcal {V}^{\mathrm{ns}}}(A). $$

By Proposition 3.5, we conclude that A is nontangential, so \(A\in \mathcal{V}^{\mathrm{ns}, \mathrm{nt}}\). For the first statement, we note that \(\mathcal {V}\setminus(\mathcal{V} _{1}^{\mathrm{ns}}\cap\mathcal{V}_{2}^{\mathrm{ns}}\cap\mathcal{V}^{\mathrm{ns}})=(\mathcal {V}\setminus\mathcal{V}_{1}^{\mathrm{ns}})\cup(\mathcal{V} \setminus\mathcal{V}_{2}^{\mathrm{ns}})\cup(\mathcal{V}\setminus\mathcal {V}^{\mathrm{ns}})\) and \(\mathcal{V}\setminus\mathcal{V} _{i}^{\mathrm{ns}}=\mathcal{V}\cap(\mathcal{V}_{i}\setminus\mathcal{V}_{i}^{\mathrm{ns}})\) for i=1,2. Hence \(\mathcal{V} \setminus(\mathcal{V}_{1}^{\mathrm{ns}}\cap\mathcal{V}_{2}^{\mathrm{ns}}\cap\mathcal {V}^{\mathrm{ns}})\) is a real algebraic variety by Lemma 9.5 and the trivial fact that unions and intersections of varieties yield new varieties. Now, suppose \(A\in \mathcal{V} _{1}^{\mathrm{ns}}\cap\mathcal{V}_{2}^{\mathrm{ns}}\cap\mathcal{V}^{\mathrm{ns}}\) is not in \(\mathcal{V}^{\mathrm{ns}, \mathrm{nt}}\). Then it is tangential, which by the earlier arguments happens if and only if

$$ \dim\bigl(N_{\mathcal{V}_1}(A) + N_{\mathcal {V}_2}(A)\bigr)< n-m. $$
(9.5)

By Hilbert’s basis theorem, we can pick finite sets {p j,1} and {p j,2} such that \(\mathbb{I}_{\mathbb{R}}(\mathcal{V}_{l})\) is generated by these sets for l=1,2. Let M be the matrix with the gradients of each {p j,l } j,l as columns. The condition (9.5) can then be reformulated as the vanishing of the determinants of all (nm)×(nm) submatrices of M. Since each such determinant is a polynomial, we conclude that \(\mathcal{V}\setminus\mathcal{V}^{\mathrm{ns}, \mathrm{nt}}\) is defined by the vanishing of a finite number of polynomials, so it is a real algebraic variety. Finally, if \(\mathcal{V}^{\mathrm{ns}, \mathrm{nt}}\) is not void, then \((\mathcal {V}\setminus (\mathcal{V}^{\mathrm{ns}, \mathrm{nt}}))_{Zar}\) is a proper subvariety of \(\mathcal {V}_{Zar}\). The latter is irreducible by Lemma 9.3, and hence \((\mathcal {V}\setminus (\mathcal{V}^{\mathrm{ns}, \mathrm{nt}}))_{Zar}\) has lower dimension than \(\mathcal {V}_{Zar}\) by standard algebraic geometry, (see, e.g., Theorem 1, Chap. I, Sect. 6, [47]). □

Proof of Proposition 6.8

It is well known that \(\mathcal{V}\) is irreducible if and only if \(\mathbb {I}_{\mathbb{R} }(\mathcal{V})\) is prime, i.e., if and only if \(fg\in\mathbb {I}_{\mathbb{R}}(\mathcal{V})\) implies that either \(f\in\mathbb{I}_{\mathbb{R}}(\mathcal{V})\) or \(g\in\mathbb {I}_{\mathbb{R} }(\mathcal{V})\) (see, e.g., Proposition 3, Chap. 4, Sect. 5 of [15]). Suppose now that we have such a product fg and that \(\mathcal{V}\) is covered by analytic patches. Given any fixed \(A\in\mathcal{V}\), let i 0I be as in Definition 6.7. Then \((f\circ\phi_{i_{0}}) (g\circ\phi _{i_{0}})\equiv 0\), which by analyticity implies that one of the functions, say \(f\circ \phi_{i_{0}}\), vanishes identically. Note that \(\mathcal{V}\) is path connected. To see this, use the analytic patches to show that any path-connected component is both open and closed, which gives the desired conclusion since \(\mathcal{V}\) is connected (see, e.g., Sect. 23–25 in [41]). Now, let \(B\in\mathcal{V}\) be any other point, and let γ be a continuous path connecting A with B. The image \(\operatorname {Im}\gamma\) is compact and \(\{ \mathcal{B} _{\mathbb{R} ^{n}}(C,r_{C})\}_{C\in \operatorname {Im}\gamma}\) an open covering, where r C is as in Definition 6.7. We pick a finite subcovering at the points \(\{ C_{l}\} _{l=0}^{L}\) with C 0=A and C L =B. Clearly, these can be ordered such that \(\mathcal{V}\cap\mathcal{B}(C_{l},r_{C_{l}})\cap\mathcal {B}(C_{l+1},r_{C_{l+1}})\neq \emptyset\). Also let i l be the index of the covering \(\phi_{i_{l}}\) of \(\mathcal{B} (C_{l},r_{C_{l}})\) in accordance with Definition 6.7, where i 0 already has been chosen. Let \(D\in\mathcal{V}\cap\mathcal {B}(C_{0},r_{C_{0}})\cap\mathcal{B} (C_{1},r_{C_{1}})\) be given and let R be a radius such that \(\mathcal{B} (D,R)\subset\mathcal{B}(C_{0},r_{C_{0}})\cap\mathcal {B}(C_{1},r_{C_{1}})\). By assumption, f vanishes on all points of \(\mathcal{V}\cap\mathcal{B}(C_{0},r_{C_{0}})\), and hence \(f\circ \phi_{i_{1}}\) vanishes on the open set \(\phi_{i_{1}}^{-1}(\mathcal {B}(D,R))\), which by analyticity means that it vanishes identically on \(\varOmega_{i_{1}}\) (since it is assumed to be connected). By induction it follows that f(B)=0, and the first part is proved. The second is a simple consequence of continuity. We omit the details. □

Proof of Proposition 6.9

By Proposition 6.8, \(\mathcal{V}\) is irreducible. We first show that \(\mathcal{V} ^{\mathrm{ns}}\) is dense in \(\mathcal{V}\). Consider the case when \(\mathcal {V}\) is covered by analytic patches (the proof in the second case is easier and will be omitted). Since \(\mathcal{V}\setminus\mathcal{V}^{\mathrm{ns}}\) is a nontrivial subvariety by Lemma 9.5, there exists a polynomial f which vanishes on \(\mathcal{V} \setminus\mathcal{V}^{\mathrm{ns}}\) but not on \(\mathcal{V}\). However, if \(\mathcal{V}\setminus\mathcal{V}^{\mathrm{ns}}\) contains an open set, then the argument in Proposition 6.8 shows that f≡0 on \(\mathcal{V}\), a contradiction.

Now, let \(\theta:U\rightarrow V\) be the bijection in question, where \(U\subset\mathbb{R}^{d}\) and \(V\subset\mathcal{V}\) are open. Let m be the dimension of \(\mathcal{V}\), and pick any \(A\in\mathcal{V}^{\mathrm{ns}}\cap V\). By Proposition 6.4 and (2.1), there exist open sets \(\tilde{U}\subset\mathbb{R}^{m}\) and \(\tilde{V}\subset V\) with \(A\in\tilde{V}\), and a \(C^{\infty}\)-bijection \(\phi:\tilde{U}\rightarrow\tilde{V}\). Moreover, by Proposition 2.1, \(\phi^{-1}\circ\theta\) is bijective and differentiable between the open sets \(\theta^{-1}(\tilde{V})\subset\mathbb{R}^{d}\) and \(\tilde{U}\subset \mathbb{R}^{m}\). That m=d is now a well-known consequence of the implicit function theorem. □

Appendix C

If we were to write out all the details, this section would get rather long. Considering that it is just an illustration, we will be a bit brief.

Proof of Proposition 7.1

Given a matrix \(A\in\mathcal{K}\), we can consider all elements above and on the diagonal as variables, and the remaining ones as determined by \(A^{T}=A\). It follows that \(\mathcal{K}\) is a linear space of dimension n(n+1)/2. The statements concerning \(\mathcal{V}_{1}\) are now immediate. To see that \(\mathcal{V}_{2}\) is a real algebraic variety, note that B has rank greater than k if and only if one can find an invertible (k+1)×(k+1) minor (that is, a submatrix obtained by deleting n−(k+1) rows and columns). The determinant of each such minor is a polynomial (more precisely, the determinant composed with the map that identifies \(\mathcal{K}\) with \(\mathbb{R} ^{(n^{2}+n)/2}\)), and \(\mathcal{V}_{2}\) is clearly the variety obtained from the collection of such polynomials. Thus \(\mathcal{V}_{2}\) is a real algebraic variety. The same is true for \(\mathcal{V}=\mathcal{V}_{1}\cap\mathcal {V}_{2}\) since it is obtained by adding the algebraic equations \(\{B_{j,j}=1\}_{j=1}^{n}\) to those defining \(\mathcal{V}_{2}\). We now study \(\mathcal{V}_{2}\). We denote by \({\mathbb{M}}_{i,j}\) the set of i×j matrices with real entries. By the spectral theorem, each \(B\in\mathcal{K}\) with \(\operatorname {Rank}B\leq k\) can be written as

$$ B=U\varSigma U^T, $$
(10.1)

where \(U\in{\mathbb{M}}_{n, k}\) and \(\varSigma\in{\mathbb{M}}_{k,k}\) is a diagonal matrix, and conversely any matrix given by such a product has rank less than or equal to k. We see that \(\mathcal{V}_{2}\) can be covered with one real polynomial map, which by Proposition 6.8 shows that \(\mathcal{V}_{2}\) is irreducible, and Proposition 6.9 applies. In order to determine the dimension, consider the open subset of matrices with positive eigenvalues. This has the easier parametrization \(UU^{T}\), where again \(U\in{\mathbb{M}}_{n, k}\) is arbitrary. However, this parametrization is not bijective either. In fact, if \(B=UU^{T}\), then any other such parametrization of B is given by \(B=(UW)(UW)^{T}\), where \(W\in{\mathbb{M}}_{k,k}\) is unitary. Pick any \(B_{0}=U_{0}U_{0}^{T}\) such that the upper k×k submatrix of \(U_{0}\) is invertible, and denote this submatrix by \(V_{0}\). By the theory of QR-factorizations, there is a unique unitary matrix \(W_{0}\) for which \(V_{0}W_{0}\) is lower triangular and has positive values on the diagonal [32, Theorem 1, p. 262]. A lower triangular n×k matrix contains \(nk-\frac{(k-1)k}{2}\) independent variables, so we can identify the set of such matrices with \(\mathbb{R}^{nk-\frac {(k-1)k}{2}}\). Denote the inverse of this identification by \(\iota :\mathbb{R} ^{nk-\frac{(k-1)k}{2}}\rightarrow{\mathbb{M}}_{n, k}\), and let \(\varOmega \subset\mathbb{R} ^{nk-\frac{(k-1)k}{2}}\) be the open set corresponding to the matrices with strictly positive diagonal elements. Define \(\phi:\varOmega \rightarrow\mathcal{V}_{2}\) by

$$ \phi(y)=\iota(y) \bigl(\iota(y)\bigr)^T. $$
(10.2)

It is easy to see that ϕ maps Ω bijectively onto an open subset of \(\mathcal{V}_{2}\) containing B 0, and moreover ϕ is a polynomial map. Thus Proposition 6.9 implies that \(\mathcal{V}_{2}\) has dimension \(\frac {2nk-k^{2}+k}{2}\), as desired.

We turn our attention to \(\mathcal{V}\) and first prove that it can be covered with analytic patches. Let σ:{1,…,n}→{1,…,k} and τ:{1,…,n}→{−1,1} be given, and consider all \(U\in{\mathbb{M}}_{n, k}\), where \(U_{j,\sigma_{j}}= x_{j}\) is an undetermined variable whereas all other values are fixed. Denote the jth row of U by U j , and let \(\varSigma\in{\mathbb{M}}_{k, k}\) be a fixed diagonal matrix. Then \(U_{j}\varSigma U_{j}^{T}=1\) is a quadratic equation with x j as unknown, which may have 0, 1, 2, or infinitely many real solutions. Suppose the remaining values of U are such that this has two solutions for all \(1\leq j\leq n\), and fix x j to be the solution whose sign coincides with τ(j). Denote the corresponding matrix by \(\tilde{U}\), and note that it is a real analytic function if we now consider the remaining values of \(\tilde{U}\) as variables. These variables and the values on the diagonal of Σ are n(k−1)+k in number, and so can be identified with points y in an open subset of \(\mathbb{R}^{nk-n+k}\). Let Ω be a particular connected component of this open set. Consider \(\tilde{U}\) and Σ as functions of y on Ω, in the obvious way, and set

$$ \psi_{\sigma,\tau,\varOmega}(y)=\tilde {U}(y)\varSigma(y) \bigl(\tilde{U}(y) \bigr)^T, \quad y\in\varOmega. $$
(10.3)

Let I be the set of all possible triples σ,τ,Ω. By the spectral theorem, one easily sees that each \(B\in\mathcal{V}\) is in the image of at least one of these maps \(\psi_{i}\), \(i\in I\). It is now not hard to see that \(\{\psi_{i}\}_{i\in I}\) is a covering with analytic patches of \(\mathcal{V}\). We wish to use Proposition 6.8 to conclude that \(\mathcal{V}\) is irreducible, but first we need to show that \(\mathcal{V}\) is connected. The proof gets very lengthy and technical, so we will only outline the details. The idea is that any \(B\in\mathcal{V}\) can be joined by a path to the matrix 1 with all elements equal to 1. To see this, first note that the subset of \(\mathcal{V}\) that can be represented as (10.1) with all elements of U nonzero is dense in \(\mathcal{V}\). Hence it suffices to find a path from such an element to 1. Let B be fixed. Now, not all values in Σ can be negative, for then the diagonal values of B would be negative as well. We can assume that the diagonal elements of Σ are ordered decreasingly and that Σ 1,1=1. Pick σ such that σ(j)=k for all j, and choose τ and Ω such that the representation (10.1) can be written in the form (10.3). Now, if the second diagonal value in Σ is negative, we may continuously change it until it is not, without leaving Ω. Then the values of y corresponding to the first and second columns of \(\tilde{U}\) can be continuously moved until all elements of the first column are positive. At this point, we can reduce all values of \(\tilde{U}\) except the first column to zero, increasing the first value of each row whenever necessary to stay in Ω. Then we can move y so that the values in the first column become the same. Finally, we can let these values increase simultaneously until they reach 1. We have now obtained the matrix 1, and conclude that \(\mathcal{V}\) is connected and hence also irreducible, as desired.

Finally, we shall determine the dimension of \(\mathcal{V}\). Consider again the map ι introduced earlier, with the difference that this time the last lower diagonal element in each row is not a variable, but instead determined by the other variables in that row and the constraint that it be strictly positive and that the norm of the row be 1. The number of free variables is thus \(\frac{2nk-k^{2}+k}{2}-n\), and the above construction then naturally defines a real analytic map on an open subset Ξ of \(\mathbb{R}^{\frac{2nk-k^{2}+k}{2}-n}\). Denote this map by θ, and define \(\psi:\varXi\rightarrow\mathcal{V}\) by

$$ \psi(y)=\theta(y) \bigl(\theta(y)\bigr)^T. $$

It is not hard to see that ψ is a bijection onto an open subset of \(\mathcal{V}\), so by Proposition 6.9, the dimension of \(\mathcal {V}\) is \(\frac {2nk-k^{2}+k}{2}-n\), as desired. □
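
The two dimension counts can also be checked numerically at a generic point (an illustrative sketch; it uses the spanning directions \(EU^{T}+UE^{T}\) of the differential of \(U\mapsto UU^{T}\), which is how the parametrizations above enter). For n=5 and k=2 the differential has rank \((2nk-k^{2}+k)/2=9\), and the n unit-diagonal constraints generically remove n further dimensions, leaving 4.

```python
import numpy as np

n, k = 5, 2
rng = np.random.default_rng(0)
U = rng.standard_normal((n, k))
U /= np.linalg.norm(U, axis=1, keepdims=True)   # unit rows, so U @ U.T has unit diagonal

def svec(S):
    """Coordinates of a symmetric matrix: the n diagonal entries, then the upper triangle."""
    return np.concatenate([np.diag(S), S[np.triu_indices(n, 1)]])

# Differential of U -> U U^T at U, one column per entry of U.
cols = []
for i in range(n):
    for j in range(k):
        E = np.zeros((n, k)); E[i, j] = 1.0
        cols.append(svec(E @ U.T + U @ E.T))
J = np.array(cols).T

dim_V2 = np.linalg.matrix_rank(J, tol=1e-10)
print(dim_V2, (2 * n * k - k * k + k) // 2)        # 9 and 9

# Directions tangent to V_2 that also respect the unit-diagonal constraint (zero diagonal).
basis = np.linalg.svd(J)[0][:, :dim_V2]            # orthonormal basis of the column space of J
dim_V = dim_V2 - np.linalg.matrix_rank(basis[:n, :], tol=1e-10)
print(dim_V, (2 * n * k - k * k + k) // 2 - n)     # 4 and 4
```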

Proof of Proposition 7.2

By Propositions 6.3 and 7.1, we need to show that

$$\dim N_{\mathcal{V}_2}(A)=\bigl(n^2+n\bigr)/2-\bigl(2nk-k^2+k \bigr)/2=(n-k+1) (n-k)/2 $$

if and only if \(\operatorname {Rank}(A)=k\). By the same propositions, we already have that \(\dim N_{\mathcal{V}_{2}}(A)\leq(n-k+1)(n-k)/2\), so it suffices to show that this inequality is strict when \(\operatorname {Rank}(A)<k\) and that the reverse inequality holds when \(\operatorname {Rank}(A)=k\). Based on the representation (10.1), it is easily seen that each \(A\in\mathcal{V}_{2}\) can be written

$$ A=U \varSigma U^{T}, $$
(10.4)

where now both Σ and U lie in \({\mathbb{M}}_{n,n}\), U is unitary, and Σ is diagonal with only 0’s after the kth element. In this proof, the particular identification of \(\mathcal{K}\) with \(\mathbb {R}^{(n^{2}+n)/2}\) is important. We define \(\omega:\mathbb{R}^{(n^{2}+n)/2}\rightarrow \mathcal{K}\) by letting the first n entries be placed on the diagonal and the remaining ones be distributed over the upper triangular part, but multiplied by the factor \(1/\sqrt{2}\). Finally, the lower triangular part is defined by \(A^{T}=A\). This identification will be implicit. For example, if p is a polynomial on \(\mathcal{K}\), then we will write ∇p instead of the correct \(\omega(\nabla(p\circ\omega))\). Note however that ∇p depends on ω. Now, given a polynomial \(p\in\mathbb{I}(\mathcal {V}_{2})\) and \(C\in{\mathbb{M}} _{n,n}\), \(q_{C}(\cdot)=p(C^{T}\cdot C)\) is clearly also in \(\mathbb {I}(\mathcal{V} _{2})\). Due to the particular choice of ω, we have that \(\nabla q_{C}(B)=C^{T}\nabla p(C^{T}BC)C\), as can be verified by direct computation. Letting A be fixed of rank \(j\leq k\), it is easy to use (10.4) to produce an invertible matrix C such that \(C^{T}AC=I_{j}\), where \(I_{j}\in\mathcal{K}\) is the diagonal matrix whose first j diagonal values are 1 and 0 elsewhere. In particular,

$$\nabla q_C(A)=C^T\nabla p(I_j) C, $$

which implies that \(\dim N_{\mathcal{V}_{2}}(A)=\dim N_{\mathcal {V}_{2}}(I_{j})\). Now, all \({\mathbb{M}} _{k+1,k+1}\) subdeterminants of \(\mathcal{K}\) form polynomials in \(\mathbb {I}(\mathcal{V} _{2})\), and their derivatives at I k are easily computed by hand. In this way, one easily gets

$$\dim N_{\mathcal{V}_2}(I_k)\geq(n-k+1) (n-k)/2, $$

which proves that any rank k element of \(\mathcal{V}_{2}\) is nonsingular. Conversely, if j<k, consider a fixed \(u\in\mathbb{R}^{n}\) as a row-vector, and define the map \(\theta_{u}:\mathbb{R}\rightarrow\mathcal{V}_{2}\) via θ u (x)=I j +xu T u. Letting \(\{e_{l}\}_{l=1}^{n}\) be the standard basis of \(\mathbb{R} ^{n}\) and considering \(\theta_{e_{l}}\) as well as all differences \(\theta _{e_{l}+e_{l'}}-\theta_{e_{l}}-\theta_{e_{l'}}\), one easily sees that

$$\operatorname {Span}\biggl\{\frac{d}{dx}\theta_u(0): u\in \mathbb{R}^n \biggr\} =\mathcal{K}, $$

and hence \(\dim N_{\mathcal{V}_{2}}(I_{j})=0\). This shows that I j is singular, by the remarks in the beginning of the proof.

Finally, we shall establish that \(\mathcal{V}^{\mathrm{ns}, \mathrm{nt}}\) is not void. By Proposition 6.5, Theorem 6.6, Proposition 7.1, and the first part of this proposition, it suffices to show that

$$ \dim\bigl(T_{\mathcal{V}_1}(A)\cap T_{\mathcal {V}_2}(A)\bigr)\leq \frac{2nk-k^2+k}{2}-n $$
(10.5)

for some point A which has rank k. We choose the point \(A=UU^{T}\), where \(U\in{\mathbb{M}}_{n, k}\) has a 1 on the last lower diagonal element (i.e., with index (j,j) for j<k and (j,k) for \(j\geq k\)) of each row and zeroes elsewhere. Recall the map ϕ in (10.2), and let \(y\in\mathbb{R}^{\frac{2nk-k^{2}+k}{2}}\) be such that \(A=\phi(y)\). Given (i,j), let \(E_{(i,j)}\) be the matrix with a 1 on positions (i,j) and (j,i) and zeroes elsewhere. The partial derivatives of ϕ at y contain multiples of all \(E_{(i,j)}\) with \(i\leq j\), \(i<k\), as well as \(n-k+1\) derivatives related to the last column of U, which we denote by \(F_{1},\ldots,F_{n-k+1}\). These are not hard to compute, but we only need the fact that each \(F_{l}\) has precisely one nonzero diagonal value on one of the \(n-k+1\) last elements on the diagonal (with distinct positions for distinct l’s). Since \(T_{\mathcal{V}_{1}}(A)=\operatorname {Span}\{E_{i,j}: i\neq j\} \), it is easily seen that

$$\operatorname {Span}\bigl(T_{\mathcal{V}_1}(A)\cap T_{\mathcal{V}_2}(A)\bigr)=\operatorname {Span}\{ E_{i,j}: i<j,\ i<k\}. $$

These are \(\frac{n(n-1)}{2}-\frac{(n-k+1)(n-k)}{2}\) in number, and (10.5) follows. □
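
The inequality (10.5) can also be confirmed numerically at the chosen point (an illustrative check, again using the spanning directions \(EU^{T}+UE^{T}\) for the tangent directions of \(\mathcal{V}_{2}\) at \(A=UU^{T}\)): for n=5 and k=2 the intersection has dimension 4, which equals \((2nk-k^{2}+k)/2-n\), so (10.5) holds with equality in this example.

```python
import numpy as np

n, k = 5, 2
U = np.zeros((n, k))
for j in range(n):                      # a 1 on the "last lower diagonal element" of each row
    U[j, min(j, k - 1)] = 1.0
A = U @ U.T                             # the point A = U U^T chosen in the proof

def svec(S):
    return np.concatenate([np.diag(S), S[np.triu_indices(n, 1)]])

# Spanning vectors of the tangent directions of V_2 at A: E U^T + U E^T.
T2 = []
for i in range(n):
    for j in range(k):
        E = np.zeros((n, k)); E[i, j] = 1.0
        T2.append(svec(E @ U.T + U @ E.T))

# Basis of T_{V_1}(A): the matrices E_{i,j} with i < j (symmetric, zero diagonal).
T1 = []
for i in range(n):
    for j in range(i + 1, n):
        E = np.zeros((n, n)); E[i, j] = E[j, i] = 1.0
        T1.append(svec(E))

def dim_span(vectors):
    return np.linalg.matrix_rank(np.array(vectors).T, tol=1e-10)

dim_int = dim_span(T1) + dim_span(T2) - dim_span(T1 + T2)   # dim(T_1 ∩ T_2)
print(dim_int, (2 * n * k - k * k + k) // 2 - n)             # 4 and 4
```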

Cite this article

Andersson, F., Carlsson, M. Alternating Projections on Nontangential Manifolds. Constr Approx 38, 489–525 (2013). https://doi.org/10.1007/s00365-013-9213-3
