1 Introduction

The paper by Arthur Pewsey and Eduardo García-Portugués provides a comprehensive review of directional statistics, focusing specifically on the developments of the last 20 years. Very commendably, the authors have accomplished the almost impossible feat of balancing extreme richness of content and an up-to-date account of the area, on the one hand, against compliance with length restrictions, on the other. In consequence, as they write in their introduction, statistics on closely related spaces, for instance on rotational groups and shape spaces, have been touched upon only tangentially.

Yet, taking into account sophisticated methods of statistical analysis on some of these closely related spaces may actually shed more light on statistics on the circle, the torus, the sphere and polyspheres. Indeed, these can be thought of as a first step in the transition from Euclidean geometry to a non-Euclidean one, in our case, entering the world of compact manifolds. From an abstract point of view, one may want to ask which statistical concepts

  1. can be easily carried over, possibly becoming more benevolent?

  2. can be carried over only with difficulty, possibly becoming way more difficult?

  3. cannot be carried over at all but require a complete redesign?

As the authors elaborate, parametric families of distributions for directional data can be treated nearly as easily as their Euclidean kin. Further, as a relief, outliers and heavy tails are not troublesome on compact spaces, so that the authors devote only a minute portion of their review to this topic. Fréchet \(L^p\)-means for arbitrary powers \(p \in (0,\infty ]\) exist, as \(\mathbb {E}[d(q,X)^p]\le \pi ^p\) (\(p<\infty \)) for all random variables X on the unit sphere \(\mathbb {S}^m\) with spherical distance \(d(\cdot ,\cdot )\) and test points \(q\in \mathbb {S}^m\). The new difficulty is that such means may no longer be unique; this will be touched upon further below.
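Both the bound and the loss of uniqueness can be checked numerically. The following is a minimal sketch, assuming an illustrative two-point distribution on \(\mathbb {S}^2\) with equal mass on the north and south pole; the helper names `sphere_dist` and `frechet_function` are hypothetical and not taken from the review. Under this assumption every point of the equator minimizes \(F_2\), with common value \(\pi ^2/4\).

```python
import numpy as np

def sphere_dist(q, x):
    """Spherical (great-circle) distances between unit vectors.

    q has shape (a, 3), x has shape (b, 3); the result has shape (a, b).
    """
    return np.arccos(np.clip(q @ x.T, -1.0, 1.0))

def frechet_function(q, sample, p=2.0):
    """Empirical Frechet function F_p(q) = mean_i d(q, X_i)^p for each row of q."""
    return (sphere_dist(q, sample) ** p).mean(axis=1)

# Illustrative assumption: equal mass on the north and the south pole of S^2.
sample = np.array([[0.0, 0.0, 1.0],
                   [0.0, 0.0, -1.0]])

# Random test points q on the sphere.
rng = np.random.default_rng(1)
q = rng.normal(size=(2000, 3))
q /= np.linalg.norm(q, axis=1, keepdims=True)
F2 = frechet_function(q, sample, p=2.0)

# Test points on the equator.
phi = np.linspace(0.0, 2 * np.pi, 12, endpoint=False)
equator = np.stack([np.cos(phi), np.sin(phi), np.zeros_like(phi)], axis=1)
F2_equator = frechet_function(equator, sample, p=2.0)

print("largest F_2 over random test points:", F2.max(), "<= pi^2 =", np.pi ** 2)
print("smallest F_2 over random test points:", F2.min())
print("F_2 on the equator:", np.unique(np.round(F2_equator, 12)))
```

For a test point with colatitude \(\theta \), the two distances are \(\theta \) and \(\pi -\theta \), so \(F_2 = (\theta ^2 + (\pi -\theta )^2)/2\), which is smallest exactly at \(\theta =\pi /2\); the whole equator is thus a set of Fréchet means.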

2 Benevolent and vicious spaces

Notably, principal components and many of their generalizations can be viewed as generalized Fréchet means, e.g. Huckemann and Hotz (2013); Huckemann and Eltzner (2018): For random data X on a sample space M, find the best approximating subspace within a suitable family \(\mathcal {N}\) of submanifolds that are canonical in some sense, with respect to a link function \(\rho \) standing for some kind of distance,

$$\begin{aligned} {\mathrm{argmin}}_{N \in \mathcal {N}}\, \mathbb {E}\left[ \rho (X,N)\right] \,. \end{aligned}$$

In classical principal component analysis (PCA), \(M=\mathbb {R}^m\), \(\rho \) is minimal squared Euclidean distance and \(\mathcal {N}\) is the family of all k-dimensional affine subspaces, where \(\dim (\mathcal {N}) = (m-k)(k+1)\). If \(M=\mathbb {S}^m\), then the forward approach of geodesic PCA, e.g. Hotz et al. (2010), is different from the backward approach of principal nested spheres by Jung et al. (2012), and allowing \(\mathcal {N}\) to be the family of all k-dimensional constant curvature subspheres, cf. Jung et al. (2012), we have \(\dim (\mathcal {N}) = (m-k)(k+2)> (m-k)(k+1)\). This allows one to reduce complex structures on \(\mathbb {S}^m\) to lower dimensions than on \(\mathbb {R}^m\), making the sphere more benign than Euclidean space, cf. Schulz et al. (2015); Huckemann and Eltzner (2020).
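The Euclidean case of this minimization can be illustrated in a few lines. The sketch below assumes simulated Gaussian data in \(\mathbb {R}^5\) and hypothetical helper names; it compares the empirical objective of the classical PCA subspace with that of randomly drawn k-dimensional affine subspaces through the sample mean, and evaluates the two dimension formulas quoted above.

```python
import numpy as np

def mean_sq_dist_to_affine(X, point, basis):
    """Mean squared Euclidean distance from the rows of X to the affine
    subspace point + span(basis); basis must have orthonormal columns."""
    centered = X - point
    residual = centered - centered @ basis @ basis.T
    return (residual ** 2).sum(axis=1).mean()

def dim_affine_family(m, k):
    """Dimension of the family of k-dimensional affine subspaces of R^m."""
    return (m - k) * (k + 1)

def dim_subsphere_family(m, k):
    """Dimension of the family of k-dimensional subspheres of S^m."""
    return (m - k) * (k + 2)

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
m, k, n = 5, 2, 500
X = rng.normal(size=(n, m)) * np.array([3.0, 2.0, 0.5, 0.3, 0.1])

# Classical PCA: sample mean plus the leading k right singular vectors.
center = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - center, full_matrices=False)
pca_basis = Vt[:k].T
pca_objective = mean_sq_dist_to_affine(X, center, pca_basis)

# Competitors: random k-dimensional subspaces through the sample mean.
random_objectives = []
for _ in range(200):
    Q, _ = np.linalg.qr(rng.normal(size=(m, k)))
    random_objectives.append(mean_sq_dist_to_affine(X, center, Q))

print("PCA objective:", pca_objective)
print("best random competitor:", min(random_objectives))
print("family dimensions (affine, subsphere):",
      dim_affine_family(m, k), dim_subsphere_family(m, k))
```

The random competitors only probe the family \(\mathcal {N}\); they do not prove optimality, which in the Euclidean case follows from the usual eigen-decomposition argument.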

On the other hand, on the torus \(M=\mathbb {S}^1 \times \mathbb {S}^1\) of dimension \(m=2\), taking \(\mathcal {N}\) as the family of geodesics, \(\min _{N \in \mathcal {N}}\mathbb {E}[\rho (X,N)] =0\), since a geodesic with irrational slope winds densely around the torus and thus comes arbitrarily close to every data point. This brings analysis of variance to a dead end, making tori and polyspheres appear rather vicious towards PCA approaches.
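This degeneracy is easy to observe numerically. The following is a minimal sketch, assuming uniformly simulated points on the flat torus and the geodesic \(t \mapsto (t, \sqrt{2}\,t) \bmod 2\pi \); the function names and the discretisation are illustrative choices. The mean squared distance to ever longer arcs of this single geodesic shrinks towards zero.

```python
import numpy as np

def torus_sq_dist(x, y):
    """Squared flat-torus distance between angle pairs (last axis has length 2)."""
    diff = np.abs(x - y) % (2 * np.pi)
    diff = np.minimum(diff, 2 * np.pi - diff)
    return (diff ** 2).sum(axis=-1)

def mean_sq_dist_to_geodesic(points, slope, windings, steps_per_winding=200):
    """Mean squared distance from points to a discretised geodesic arc
    t -> (t, slope * t) mod 2*pi, for t in [0, 2*pi*windings]."""
    t = np.linspace(0.0, 2 * np.pi * windings, windings * steps_per_winding)
    curve = np.stack([t, slope * t], axis=1) % (2 * np.pi)
    d2 = torus_sq_dist(points[:, None, :], curve[None, :, :])
    return d2.min(axis=1).mean()

# Simulated data on the torus (an assumption for illustration only).
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 2 * np.pi, size=(100, 2))

# Longer and longer arcs of the same irrational-slope geodesic.
results = {w: mean_sq_dist_to_geodesic(points, np.sqrt(2.0), w) for w in (1, 5, 25)}
for w, value in results.items():
    print(f"windings = {w:3d}: mean squared distance = {value:.5f}")
```

Since the discretised arc is a subset of the full geodesic, the printed values are upper bounds for the distances to the geodesic itself.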

It seems that, despite the comprehensive exposition of this topic by the authors, developing a fully satisfactory PCA analogue on tori and polyspheres is still an open issue, requiring complete rethinking. I wonder where this challenge will take us?

3 Harnessing linearization

As we have seen above, topology and curvature cannot be ignored globally. Locally, however, they can be. In particular, this fact can be exploited for nonparametric curve estimation based on the famous rolling without slipping, originally developed for Kendall’s shape spaces, Le (2003); Kume et al. (2007). This has spurred non-linear regression methods that are truly intrinsic or that suitably combine extrinsic and intrinsic approaches, e.g. Hinkle et al. (2012); Lin et al. (2017), all of which are easily applied to directional spaces.

Local tangent space linearization is specifically rewarding on Lie groups, since all tangent spaces can be viewed as translates of the Lie algebra, and on a compact Lie group, left and right translates are equivalent. This fact has been exploited by Telschow et al. (2019, 2020) to define canonical Gaussian perturbation models

$$\begin{aligned} \gamma (t) \mathrm{Exp}\left( A_t\right) = \mathrm{Exp}\left( B_t\right) \,\gamma (t)\,. \end{aligned}$$

Here \(\gamma (t)\) is a curve on a compact Lie group G, \(A_t\) is a zero-mean Gaussian process on the corresponding Lie algebra \(\mathfrak {g}\) and \(\mathrm{Exp}\) is the Lie exponential. The equality sign above indicates that given \(A_t\), there is a suitable zero-mean Gaussian process \(B_t\) on \(\mathfrak {g}\), thus making these models canonical. These models are valid on \(\mathbb {S}^3\) because it can be viewed as the Lie group of unit quaternions. Curiously, every \(\mathbb {S}^m\) can be viewed as the quotient \(SO(m+1)/SO(m)\) of compact rotational Lie groups, e.g. Lee (2013, Chapter 7). I wonder how suitable (e.g. horizontal with respect to the quotient) canonical perturbation models on rotational groups can be exploited to define canonical Gaussian perturbation models on spheres of arbitrary dimension?
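The equality of left and right perturbations can be verified at a single time point on \(G=SO(3)\). The sketch below is not the implementation of Telschow et al.; it assumes an isotropic Gaussian draw for \(A_t\), a rotation about the z-axis for \(\gamma (t)\), and hypothetical helper names `hat` and `exp_so3`. It uses the standard fact that conjugation, \(B_t = \gamma (t) A_t \gamma (t)^{-1}\), maps the Lie algebra linearly onto itself, so that \(B_t\) is again zero-mean Gaussian.

```python
import numpy as np

def hat(v):
    """Map a vector in R^3 to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def exp_so3(A):
    """Lie exponential so(3) -> SO(3) via the Rodrigues formula."""
    theta = np.sqrt(0.5 * np.sum(A * A))  # rotation angle = norm of the axis vector
    if theta < 1e-12:
        return np.eye(3) + A              # first-order approximation near zero
    return (np.eye(3)
            + np.sin(theta) / theta * A
            + (1.0 - np.cos(theta)) / theta ** 2 * (A @ A))

rng = np.random.default_rng(0)

# gamma(t) at one fixed t: rotation by 0.7 rad about the z-axis (illustrative).
gamma = exp_so3(hat(np.array([0.0, 0.0, 0.7])))

# A_t at the same t: zero-mean isotropic Gaussian coordinates (an assumption).
v = rng.normal(scale=0.3, size=3)
A = hat(v)

# Conjugation by gamma gives the matching left perturbation B_t.
B = gamma @ A @ gamma.T

lhs = gamma @ exp_so3(A)   # right perturbation: gamma(t) Exp(A_t)
rhs = exp_so3(B) @ gamma   # left perturbation:  Exp(B_t) gamma(t)
print("max abs difference:", np.abs(lhs - rhs).max())
```

On \(SO(3)\) the coordinates of \(B_t\) are simply \(\gamma (t) v\), a rotation of the Gaussian vector \(v\); for an isotropic covariance, as assumed here, the law of the coordinates does not change at all.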

4 Nonparametric estimation

As noted before, estimation within a parametric family on non-Euclidean spaces is often just as easy, or as difficult, as it is on Euclidean spaces. In nonparametrics, one may start with estimating certain shape parameters. In addition to measures of mean and spread, on a compact space M with distance \(d(\cdot ,\cdot )\), there are always antimeans

$$\begin{aligned} {\mathrm{argmax}}_{q\in M} F_2(q)\,, \end{aligned}$$

as introduced by Wang et al. (2020), or even more generally, on a differential manifold, there are critical loci of the Fréchet function

$$\begin{aligned} F_2(q) =\mathbb {E}\left[ d(X,q)^2\right] \,. \end{aligned}$$

Under rather mild and realistic conditions, the Fréchet function is a Morse function, cf. Milnor (2016). In case of smeary means, cf. Hotz and Huckemann (2015); Eltzner and Huckemann (2018); Eltzner (2019), however, it is not. Recently, a central limit theorem for antimeans on Kendall’s planar shape spaces has been derived and used for inference by Wang et al. (2020). Further, Patrangenaru and Deng (2020) have performed regression on projective shape manifolds based on antimeans. Closely related is the search for multiple means, and the test for uniqueness of means, i.e. that the cardinality of

$$\begin{aligned} {\mathrm{argmin}}_{q\in M} F_2(q) \end{aligned}$$

is one, proposed by Eltzner (2020).

Thus, modes and antimodes or, more generally, critical points of Fréchet functions

$$\begin{aligned} F_p(q) = \mathbb {E}[d(X,q)^p]\,, \quad p \in (0,\infty ]\,, \end{aligned}$$

can be defined as generalized shape parameters of a probability distribution.
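On the circle, such critical points can be located by a plain grid search over the empirical Fréchet function. The following is a minimal sketch, assuming a simulated two-component von Mises mixture and a one-degree grid; the helper names are hypothetical, and a grid search only approximates critical points up to the grid resolution.

```python
import numpy as np

def circle_dist(a, b):
    """Arc-length distance between angles a and b on the unit circle."""
    diff = np.abs(a - b) % (2 * np.pi)
    return np.minimum(diff, 2 * np.pi - diff)

def frechet_function(grid, sample, p):
    """Empirical Frechet function F_p evaluated at every grid angle."""
    return (circle_dist(grid[:, None], sample[None, :]) ** p).mean(axis=1)

def local_extrema(values):
    """Indices of strict local minima and maxima of a periodic sequence."""
    prev, nxt = np.roll(values, 1), np.roll(values, -1)
    minima = np.flatnonzero((values < prev) & (values < nxt))
    maxima = np.flatnonzero((values > prev) & (values > nxt))
    return minima, maxima

# Simulated bimodal sample (an assumption for illustration only); the odd
# total sample size avoids flat stretches of F_1.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.vonmises(0.0, 8.0, size=301),
                         rng.vonmises(2.5, 8.0, size=200)])
grid = np.linspace(-np.pi, np.pi, 360, endpoint=False)

extrema = {}
for p in (1.0, 2.0):
    F = frechet_function(grid, sample, p)
    minima, maxima = local_extrema(F)
    extrema[p] = (F, minima, maxima)
    print(f"p = {p}: grid mean at {grid[np.argmin(F)]:.3f}, "
          f"grid antimean at {grid[np.argmax(F)]:.3f}, "
          f"{len(minima)} local minima, {len(maxima)} local maxima")
```

The global minimizer and maximizer on the grid approximate the Fréchet \(L^p\)-mean and the antimean, respectively, while the remaining local extrema are further candidate descriptors of the sample.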

As an anecdotal aside, in case of \(p=2\), Kendall (1990) coined the term Karcher means for antimodes, a naming which, however, the involuntary patron strongly repudiates, cf. Karcher (2014).

Bearing in mind that Fréchet functions can be viewed as continuous descriptors of spread, describing probability distributions by critical points of Fréchet functions and, based on these, deriving inferential tools, may prove to be a rewarding task to pursue. In particular, suitable algorithms seem feasible, and not only on circles, spheres, tori and polyspheres; adding value, these critical points allow for straightforward geometric interpretation.