# Intrinsic Polynomials for Regression on Riemannian Manifolds

## Abstract

We develop a framework for polynomial regression on Riemannian manifolds. Unlike recently developed spline models on Riemannian manifolds, Riemannian polynomials offer the ability to model parametric polynomials of all integer orders, odd and even. An intrinsic adjoint method is employed to compute variations of the matching functional, and polynomial regression is accomplished using a gradient-based optimization scheme. We apply our polynomial regression framework in the context of shape analysis in Kendall shape space as well as in diffeomorphic landmark space. Our algorithm is shown to be particularly convenient in Riemannian manifolds with additional symmetry, such as Lie groups and homogeneous spaces with right or left invariant metrics. As a particularly important example, we also apply polynomial regression to time-series imaging data using a right invariant Sobolev metric on the diffeomorphism group. The results show that Riemannian polynomials provide a practical model for parametric curve regression, while offering increased flexibility over geodesics.

## Keywords

Polynomial, Riemannian geometry, Regression, Rolling maps, Lie groups, Shape space

## 1 Introduction

Comparative studies are essential to biomedical statistical analysis. In the context of shape, such analyses are used to discriminate between healthy and disease states based on observations of anatomical shapes within individuals in the two populations [35]. Commonly, in these methods the shape data are modelled on a Riemannian manifold and intrinsic coordinate-free manifold-based methods are used [8]. This prevents bias due to arbitrary choice of coordinates and avoids the influence of unwanted effects. For instance, by modelling shapes with a representation incapable of representing scale and rotation of an object and using intrinsic manifold-based methods, scale and rotation are guaranteed not to affect the analysis [19].

Many conditions such as developmental disorders and neurodegeneration are characterized not only by shape characteristics, but by abnormal *trends* in anatomical shapes over time. Thus it is often the temporal dependence of shape that is most useful for comparative shape analysis. The field of regression analysis involves studying the connection between independent variables and observed responses [34]. In particular, this includes the study of temporal trends in observed data.

In this work, we extend the recently developed geodesic regression model [12] to higher order polynomials using intrinsic Riemannian manifold-based methods. We show that this Riemannian polynomial model is able to provide increased flexibility over geodesics, while remaining in the parametric regression setting. The increase in flexibility is particularly important, as it enables a more accurate description of shape trends and, ultimately, more useful comparative regression analysis.

While our primary motivation is shape analysis, the Riemannian polynomial model applies in a variety of other settings. For instance, directional data is commonly modelled as points on the sphere \(\mathbb{S}^{2}\), and video sequences representing human activity are modelled in Grassmannian manifolds [36].

In computational anatomy applications, the primary objects of interest are elements of a group of symmetries acting on the space of observable data. For instance, rigid motion is studied using the groups SO(3) and SE(3), acting on a space of landmark points or scalar images. Non-rigid motion and growth is modelled using infinite-dimensional diffeomorphism groups, such as in the currents framework [37] for unlabelled landmarks or the large deformation diffeomorphic metric mapping (LDDMM) framework of deforming images [30]. We show that in the presence of a group action, optimization of our polynomial regression model using an adjoint method is particularly convenient.

This work is an extension of the Riemannian polynomial regression framework first presented by Hinkle et al. [15]. In Sects. 5–7, we give a new derivation of polynomial regression for Lie groups and Lie group actions with Riemannian metrics. By performing the adjoint optimization directly in the Lie algebra, the computations in these spaces are greatly simplified over the general formulation. We show how this Lie group formulation can be used to perform polynomial regression on the space of images acted on by groups of diffeomorphisms.

### 1.1 Regression Analysis and Curve-Fitting

The study of the relationship between measured data and descriptive variables is known as the field of regression analysis. As with most statistical techniques, regression analyses can be broadly divided into two classes: parametric and non-parametric. The most widely used parametric regression methods are linear and polynomial regression in Euclidean space, wherein a linear or polynomial function is fit in a least-squares fashion to observed data. Such methods are the staple of modern data analysis. The most common non-parametric regression approaches are kernel-based methods and spline smoothing approaches, which provide great flexibility in the class of regression functions. However, their non-parametric nature presents a challenge to inference problems, for example when one wishes to perform a hypothesis test to determine whether the trend for one group of data is significantly different from that of another group.

In previous work, non-parametric kernel-based and spline-based methods have been extended to observations that lie on a Riemannian manifold with some success [8, 18, 22, 26], but intrinsic parametric regression on Riemannian manifolds has received limited attention. Recently, Fletcher [12] and Niethammer et al. [31] have each independently developed a form of parametric regression, geodesic regression, which generalizes the notion of linear regression to Riemannian manifolds. Geodesic models are useful, but are limited by their lack of flexibility when modelling complex trends.

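Geodesic regression models a manifold-valued random variable \(Y\) as

\[Y=\operatorname{Exp}\bigl(\operatorname{Exp}(p,Xv),\epsilon\bigr), \tag{1}\]

where \(p\in M\) is an initial point and \(v\in T_{p}M\) an initial velocity. The geodesic curve \(\operatorname{Exp}(p,Xv)\) then relates the independent variable \(X\in\mathbb{R}\) to the dependent random variable \(Y\), via this equation and the Gaussian random vector \(\epsilon\in T_{\operatorname{Exp}(p,Xv)}M\). In this paper, we extend this model to a polynomial regression model

\[Y=\operatorname{Exp}\bigl(\gamma(X),\epsilon\bigr), \tag{2}\]

where the curve \(\gamma(X)\) is a Riemannian polynomial of integer order \(k\). In the case that \(M\) is Euclidean space, this model is simply

\[Y=p+\sum_{i=1}^{k}\frac{v_{i}}{i!}X^{i}+\epsilon, \tag{3}\]

where the point \(p\) and vectors \(v_{i}\) constitute the parameters of our model.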

In this work we use the common term *regression* to describe methods of fitting polynomial curves using a sum of squared error penalty function. In Euclidean spaces, this is equivalent to solving a maximum likelihood estimation problem using a Gaussian noise model for the observed data. In Riemannian manifolds, the situation is more nuanced, as there is no consensus on how to define Gaussian distributions on general Riemannian manifolds, and in general the least-squares penalty may not correspond to a log likelihood. Many of the examples we will present are symmetric spaces: Kendall shape space in two dimensions, the rotation group, and the sphere, for instance. As Fletcher [12, Sect. 4] explains, least-squares regression in symmetric spaces does, in fact, correspond to maximum likelihood estimation of model parameters, using a natural definition of Gaussian distribution.
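The Euclidean special case can be sketched with ordinary least squares. The snippet below is a minimal illustration, not the authors' implementation: it assumes the polynomial is written as \(p+\sum_{i}v_{i}x^{i}/i!\) (a Taylor-style scaling), and the function name `fit_euclidean_polynomial` and the synthetic data are hypothetical.

```python
import math
import numpy as np

def fit_euclidean_polynomial(x, y, k):
    """Least-squares fit of y ~ p + sum_i v_i x^i / i!, i = 1..k.

    x: (N,) independent variable; y: (N, d) observations.
    Returns p with shape (d,) and v with shape (k, d).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns x^i / i!, i = 0..k (column 0 multiplies p).
    A = np.stack([x**i / math.factorial(i) for i in range(k + 1)], axis=1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]

# Noise-free quadratic in R^2: the fit should recover the parameters.
x = np.linspace(0.0, 1.0, 9)
p_true = np.array([1.0, -2.0])
v_true = np.array([[0.5, 0.0], [3.0, 4.0]])
y = p_true + np.outer(x, v_true[0]) + np.outer(x**2 / 2.0, v_true[1])
p_hat, v_hat = fit_euclidean_polynomial(x, y, k=2)
```

With Gaussian noise added to `y`, the same least-squares solution is the maximum likelihood estimate, which is the Euclidean equivalence the paragraph above refers to.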

### 1.2 Previous Work: Cubic Splines

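Riemannian cubic splines are specified by fixing the endpoints \(y_{0},y_{1}\in M\) of a curve, as well as the derivative of the curve at those points \(y_{0}'\in T_{y_{0}}M, y_{1}'\in T_{y_{1}}M\). A Riemannian cubic spline is then defined as any differentiable curve \(\gamma:[0,1]\to M\) taking on those endpoints and derivatives and minimizing

\[\Phi(\gamma)=\int_{0}^{1}\Bigl\langle\nabla_{\frac{d}{dt}\gamma}\frac{d}{dt}\gamma,\nabla_{\frac{d}{dt}\gamma}\frac{d}{dt}\gamma\Bigr\rangle\,dt. \tag{4}\]

Between the endpoints, minimizers of this functional satisfy the Euler-Lagrange equation

\[\Bigl(\nabla_{\frac{d}{dt}\gamma}\Bigr)^{3}\frac{d}{dt}\gamma=R\Bigl(\frac{d}{dt}\gamma,\nabla_{\frac{d}{dt}\gamma}\frac{d}{dt}\gamma\Bigr)\frac{d}{dt}\gamma, \tag{5}\]

where \(R\) is the Riemannian curvature tensor.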

Cubic splines are useful for interpolation problems on Riemannian manifolds. However, they provide an insufficient model for parametric curve regression. Although increasing the order of the derivatives in Eq. (4) generalizes cubic splines to higher order curves, only odd order splines may be defined in this way, and there is no clear way to define even order splines.

Riemannian splines are parametrized by the endpoint conditions, meaning that the space of curves is naturally explored by varying control points. This is convenient if control points such as observed data are given at the outset. However, for parametric curve regression, curve models are preferred that don’t depend on the data, such as the initial conditions of a geodesic [12]. Although Eq. (5) provides an ODE which could be used as such a parametric model in a “spline shooting” algorithm, estimating initial position and derivatives as parameters, the curvature term complicates integration and optimization.

### 1.3 Contributions in This Work

The goal of the current work is to extend the geodesic regression model in order to accommodate more flexibility while remaining in the parametric setting. The increased flexibility introduced by the methods in this manuscript allows a better description of the variability in the data. The work presented in this paper allows one to fit polynomial regression curves on a general Riemannian manifold, using intrinsic methods and avoiding the need for unwrapping and rolling. Since our model includes time-reparametrized geodesics as a special case, information about time dependence is also obtained from the regression without explicit modeling by examining the collinearity of the estimated parameters.

We derive practical algorithms for fitting polynomial curves to observations in Riemannian manifolds. The class of polynomial curves we use, described by Leite & Krakowski [24], is more suited to parametric curve regression than are spline models. These polynomial curves are defined for any integer order and are naturally parametrized via initial conditions instead of control points. We derive explicit formulas for computing derivatives with respect to the initial conditions of these polynomials in a least-squares curve-fitting setting.

In the following sections, we describe our method of fitting polynomial curves to data lying in various spaces. We develop the theory for general Riemannian manifolds, Lie groups with right invariant metrics, and finally for spaces acted on by such Lie groups. In order to keep each application somewhat self-contained, results will be shown in each case in the section in which the associated space is treated, instead of in a separate results section following all the methods.

## 2 Riemannian Geometry Preliminaries

Let \((M,g)\) be a Riemannian manifold. At each point \(p\in M\), the metric \(g\) defines an inner product on the tangent space \(T_{p}M\). The metric also provides a method to differentiate vector fields with respect to one another, referred to as the covariant derivative. For smooth vector fields \(v,w\in\mathfrak{X}(M)\) and a smooth curve \(\gamma:[0,1]\to M\) the covariant derivative satisfies the following product rule:

\[\frac{d}{dt}\bigl\langle v(\gamma(t)),w(\gamma(t))\bigr\rangle=\Bigl\langle\nabla_{\frac{d}{dt}\gamma}v,w\Bigr\rangle+\Bigl\langle v,\nabla_{\frac{d}{dt}\gamma}w\Bigr\rangle.\]

A geodesic \(\gamma:[0,1]\to M\) is characterized (for instance) by the conservation of kinetic energy along the curve:

\[\frac{d}{dt}\Bigl\langle\frac{d}{dt}\gamma,\frac{d}{dt}\gamma\Bigr\rangle=2\Bigl\langle\nabla_{\frac{d}{dt}\gamma}\frac{d}{dt}\gamma,\frac{d}{dt}\gamma\Bigr\rangle=0,\]

which is satisfied by solutions of the geodesic equation

\[\nabla_{\frac{d}{dt}\gamma}\frac{d}{dt}\gamma=0.\]

The mapping from the tangent space at \(p\) into the manifold \(M\), defined by integration of the geodesic equation, is called the exponential map and is written \(\operatorname{Exp}_{p}:T_{p}M\to M\). The exponential map is injective on a zero-centered ball \(B\) in \(T_{p}M\) of some non-zero radius. Thus, for a point \(q\) within a neighborhood of \(p\), there exists a unique vector \(v\in T_{p}M\) corresponding to a minimal length path under the exponential map from \(p\) to \(q\). The mapping of such points \(q\) to their associated tangent vectors \(v\) at \(p\) is called the log map of \(q\) at \(p\), denoted \(v = \operatorname{Log}_{p} q\).

Given a curve \(\gamma:[0,1]\to M\), the covariant derivative \(\nabla_{\frac{d}{dt}\gamma}\) provides a way to relate tangent vectors at different points along \(\gamma\). A vector field \(w\) is said to be parallel transported along \(\gamma\) if it satisfies the parallel transport equation,

\[\nabla_{\frac{d}{dt}\gamma}w=0.\]

## 3 Riemannian Polynomials

We now introduce Riemannian polynomials as a generalization of geodesics [15]. Geodesics are generalizations to the Riemannian manifold setting of curves in \(\mathbb{R}^{d}\) with constant first derivative. In the previous section we briefly reviewed how the covariant derivative provides a way to define vector fields which are analogous to constant vector fields along *γ*, via parallel transport.

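We refer to the vector field \(\nabla_{\frac{d}{dt}\gamma}\frac{d}{dt}\gamma\) as the acceleration of the curve \(\gamma\). Curves with parallel acceleration are generalizations of curves in \(\mathbb{R}^{d}\) whose coordinates are second order polynomials, and satisfy the second order polynomial equation,

\[\Bigl(\nabla_{\frac{d}{dt}\gamma}\Bigr)^{2}\frac{d}{dt}\gamma(t)=0.\]

Extending this concept further, a \(k\)th order polynomial in \(M\) is defined as a curve \(\gamma:[0,1]\to M\) satisfying

\[\Bigl(\nabla_{\frac{d}{dt}\gamma}\Bigr)^{k}\frac{d}{dt}\gamma(t)=0 \tag{11}\]

for all times \(t\in[0,1]\). As with polynomials in Euclidean space, polynomials are fully determined by initial conditions at \(t=0\):

\[\gamma(0)\in M,\qquad \frac{d}{dt}\gamma(0)\in T_{\gamma(0)}M,\qquad \Bigl(\nabla_{\frac{d}{dt}\gamma}\Bigr)^{i}\frac{d}{dt}\gamma(0)\in T_{\gamma(0)}M,\quad i=1,\ldots,k-1.\]

Introducing vector fields \(v_{1}(t),\ldots,v_{k}(t)\in T_{\gamma(t)}M\), we write the following system of covariant differential equations, which is equivalent to Eq. (11):

\[\frac{d}{dt}\gamma(t)=v_{1}(t),\qquad \nabla_{\frac{d}{dt}\gamma}v_{i}(t)=v_{i+1}(t),\quad i=1,\ldots,k-1,\qquad \nabla_{\frac{d}{dt}\gamma}v_{k}(t)=0.\]

In this notation, the initial conditions that determine the polynomial are \(\gamma(0)\), \(v_{i}(0)\), \(i=1,\ldots,k\).

To integrate this system numerically, a time step \(\Delta t\) is chosen and, at each step of the integrator, \(\gamma(t+\Delta t)\) is computed using the exponential map:

\[\gamma(t+\Delta t)=\operatorname{Exp}_{\gamma(t)}\bigl(\Delta t\,v_{1}(t)\bigr).\]

Each vector \(v_{i}\) is incremented within the tangent space at \(\gamma(t)\) and the results are parallel transported infinitesimally along a geodesic from \(\gamma(t)\) to \(\gamma(t+\Delta t)\). For a proof that this algorithm approximates the polynomial equations, see Appendix A. The only ingredients necessary to integrate a polynomial are the exponential map and parallel transport on the manifold.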
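The time-stepping scheme can be sketched on the unit sphere \(\mathbb{S}^{2}\), where the exponential map and parallel transport along geodesics have closed forms. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes each \(v_{i}\) is incremented by \(\Delta t\,v_{i+1}\) (with \(v_{k}\) left unchanged) before transport, and the names `sphere_exp`, `sphere_transport`, and `integrate_polynomial` are hypothetical.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere at p applied to tangent v."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return p
    return np.cos(theta) * p + np.sin(theta) * v / theta

def sphere_transport(p, v, w):
    """Parallel transport w along the geodesic t -> Exp_p(t v), t in [0, 1]."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return w
    u = v / theta
    a = np.dot(u, w)
    # Only the component of w along the direction of travel rotates.
    return w + a * ((np.cos(theta) - 1.0) * u - np.sin(theta) * p)

def integrate_polynomial(p0, v0, n_steps=1000):
    """Covariant Euler integration of a k-th order polynomial on [0, 1].

    p0: initial point gamma(0); v0: list of k tangent vectors v_i(0).
    Returns the points gamma(0), gamma(dt), ..., gamma(1) as an array.
    """
    dt = 1.0 / n_steps
    p = np.asarray(p0, dtype=float)
    v = [np.asarray(vi, dtype=float) for vi in v0]
    path = [p]
    for _ in range(n_steps):
        step = dt * v[0]
        # Increment each v_i in the tangent space at the current point.
        w = [v[i] + dt * v[i + 1] for i in range(len(v) - 1)] + [v[-1]]
        # Move along the geodesic and carry the vectors with it.
        q = sphere_exp(p, step)
        v = [sphere_transport(p, step, wi) for wi in w]
        p = q / np.linalg.norm(q)  # guard against round-off drift
        path.append(p)
    return np.array(path)

north = np.array([0.0, 0.0, 1.0])
geodesic = integrate_polynomial(north, [np.array([np.pi / 2, 0.0, 0.0])])
quadratic = integrate_polynomial(
    north, [np.array([1.0, 0.0, 0.0]), np.array([0.0, 2.0, 0.0])])
```

For \(k=1\) the scheme reduces to repeated exact geodesic steps, so `geodesic` ends a quarter turn from the north pole. Swapping in another manifold only requires replacing the two helper functions.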

### 3.1 Polynomial Time Reparametrization

Geodesic curves propagate at a constant speed as a result of their extremal action property. Polynomials provide flexibility not only in the class of paths that are possible, but in the time dependence of the curves traversing those paths. If the parameters of a polynomial *γ* consist of collinear vectors *v* _{ i }(0)∈*T* _{ γ(0)} *M*, then the path of *γ* (the image of the mapping *γ*) matches that of a geodesic, but the time dependence has been reparametrized by some polynomial transformation *t*↦*c* _{0}+*c* _{1} *t*+*c* _{2} *t* ^{2}+*c* _{3} *t* ^{3}. This generalizes the existence of polynomials in Euclidean space which are merely polynomial transformations of a straight line path. Regression models could even be implemented in which the operator wishes to estimate geodesic paths, but is unsure of parametrization, and so enforces the estimated parameters to be collinear.
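As a worked check in Euclidean space \(\mathbb{R}^{d}\), assume (for illustration only) that the polynomial is generated by the first-order system \(\dot\gamma=v_{1}\), \(\dot v_{i}=v_{i+1}\), \(\dot v_{k}=0\), and that the initial vectors are collinear, \(v_{i}(0)=c_{i}u\), where the direction \(u\) and the scalars \(c_{i}\) are symbols introduced here for the example. For \(k=3\), integrating the system gives

\[\gamma(t)=\gamma(0)+\Bigl(c_{1}t+\tfrac{1}{2}c_{2}t^{2}+\tfrac{1}{6}c_{3}t^{3}\Bigr)u,\]

so the curve stays on the straight line through \(\gamma(0)\) in the direction \(u\), and only its position along that line is a cubic function of \(t\).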

## 4 Polynomial Regression via Adjoint Optimization

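In order to regress polynomials against observed data \(J_{j}\in M\), \(j=1,\ldots,N\) at known times \(t_{j}\in\mathbb{R},j=1,\dots,N\), we define the following objective function

\[E_{0}\bigl(\gamma(0),v_{1}(0),\ldots,v_{k}(0)\bigr)=\frac{1}{N}\sum_{j=1}^{N}d\bigl(\gamma(t_{j}),J_{j}\bigr)^{2},\]

subject to the constraint that \(\gamma\) satisfies the polynomial equations. Here \(d\) represents the geodesic distance: the minimum length of a path from the curve point \(\gamma(t_{j})\) to the data point \(J_{j}\). The function \(E_{0}\) is minimized in order to find the optimal initial conditions \(\gamma(0)\), \(v_{i}(0)\), \(i=1,\ldots,k\), which we will refer to as the parameters of our model.

In order to determine the optimal parameters, we introduce Lagrange multiplier vector fields \(\lambda_{i}\) for \(i=0,\ldots,k\), often called the adjoint variables, and define the augmented Lagrangian function

\[E=\frac{1}{N}\sum_{j=1}^{N}d\bigl(\gamma(t_{j}),J_{j}\bigr)^{2}+\int_{0}^{1}\Bigl\langle\lambda_{0},\frac{d}{dt}\gamma-v_{1}\Bigr\rangle dt+\sum_{i=1}^{k-1}\int_{0}^{1}\Bigl\langle\lambda_{i},\nabla_{\frac{d}{dt}\gamma}v_{i}-v_{i+1}\Bigr\rangle dt+\int_{0}^{1}\Bigl\langle\lambda_{k},\nabla_{\frac{d}{dt}\gamma}v_{k}\Bigr\rangle dt.\]

The optimality conditions are obtained by taking variations with respect to all arguments of \(E\), integrating by parts when necessary. The resulting variations with respect to the adjoint variables yield the original dynamic constraints: the polynomial equations. Variations with respect to the primal variables give rise to the following system of equations, termed the adjoint equations (see Appendix B for derivation):

\[\nabla_{\frac{d}{dt}\gamma}\lambda_{i}(t)=-\lambda_{i-1}(t),\quad i=1,\ldots,k,\qquad \nabla_{\frac{d}{dt}\gamma}\lambda_{0}(t)=-\sum_{i=1}^{k}R\bigl(v_{i}(t),\lambda_{i}(t)\bigr)v_{1}(t), \tag{22}\]

where \(R\) is the Riemannian curvature tensor and the adjoint variable \(\lambda_{0}\) takes jump discontinuities at time points where data is present:

\[\lambda_{0}(t_{j}^{-})-\lambda_{0}(t_{j}^{+})=\frac{2}{N}\operatorname{Log}_{\gamma(t_{j})}J_{j}.\]

This jump discontinuity corresponds to the variation of \(E\) with respect to \(\gamma(t_{j})\). The Riemannian curvature tensor is defined by the formula [9]

\[R(u,v)w=\nabla_{u}\nabla_{v}w-\nabla_{v}\nabla_{u}w-\nabla_{[u,v]}w.\]

Variations of \(E\) with respect to initial and final conditions give rise to the terminal endpoint conditions for the adjoint variables,

\[\lambda_{i}(1)=0,\quad i=0,\ldots,k,\]

as well as expressions for the gradients with respect to the parameters \(\gamma(0)\), \(v_{i}(0)\):

\[\delta_{\gamma(0)}E=-\lambda_{0}(0),\qquad \delta_{v_{i}(0)}E=-\lambda_{i}(0),\quad i=1,\ldots,k.\]

In order to determine the values of the adjoint vector fields at \(t=0\), and thus the gradients of the functional \(E_{0}\), the adjoint variables are initialized to zero at time 1, then Eq. (22) is integrated backward in time to \(t=0\).

Given these gradients, the parameters are optimized by steepest descent. At each iteration, \(\gamma(0)\) is updated using the exponential map and the vectors \(v_{i}(0)\) are updated via parallel translation. This algorithm is depicted in Algorithm 2.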

Note that in the special case of a zero-order polynomial (*k*=0), the only gradient *λ* _{0} is simply the mean of the log map vectors at the current estimate of the Fréchet mean. So this method generalizes the common method of Fréchet averaging on manifolds via gradient descent [13]. In the case of geodesic polynomials, *k*=1, the curvature term in Eq. (22) indicates that *λ* _{1} is a sum of Jacobi fields. So this approach subsumes geodesic regression as presented by Fletcher [12]. For higher order polynomials, the adjoint equations represent a generalization of Jacobi fields.
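The zero-order case can be sketched as Fréchet averaging on the unit sphere, where each descent step moves the estimate along the mean of the log map vectors. This is a minimal sketch with a unit step size; the helper names and the four sample points are hypothetical, and the log map below assumes no data point is antipodal to the current estimate.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return p
    return np.cos(theta) * p + np.sin(theta) * v / theta

def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing toward q."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(p)
    u = q - c * p  # component of q orthogonal to p
    return theta * u / np.linalg.norm(u)

def frechet_mean(points, n_iter=100, step=1.0):
    """Gradient descent on the sum of squared geodesic distances."""
    mu = points[0] / np.linalg.norm(points[0])
    for _ in range(n_iter):
        # Mean of the log map vectors at the current estimate.
        g = np.mean([sphere_log(mu, q) for q in points], axis=0)
        mu = sphere_exp(mu, step * g)
        mu = mu / np.linalg.norm(mu)
    return mu

# Four points placed symmetrically around the north pole.
s, c = np.sin(0.3), np.cos(0.3)
points = np.array([[s, 0, c], [-s, 0, c], [0, s, c], [0, -s, c]])
mu = frechet_mean(points)
```

By symmetry the log map vectors at the north pole cancel, so the iteration settles there.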

As we will see later, in some cases these adjoint equations take a simpler form not involving curvature. In the case that the manifold *M* is a Lie group, the adjoint equations can be computed by taking variations in the Lie algebra, avoiding explicit curvature computation.

### 4.1 Coefficient of Determination (*R* ^{2}) in Metric Spaces

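To quantify how well the model fits a set of observed data, we use the coefficient of determination of the regression polynomial \(\gamma(t)\), denoted \(R^{2}\) [12]. As with the usual definition of \(R^{2}\), we first compute the variance of the data. Naturally, as the data lie on a non-Euclidean metric space, instead of the standard sample variance, we substitute the Fréchet variance, defined as

\[\operatorname{var}\{J_{1},\ldots,J_{N}\}=\frac{1}{N}\min_{y\in M}\sum_{j=1}^{N}d(y,J_{j})^{2}.\]

The sum of squared error for a curve \(\gamma\) is the value \(E_{0}(\gamma)\):

\[\operatorname{SSE}=\frac{1}{N}\sum_{j=1}^{N}d\bigl(\gamma(t_{j}),J_{j}\bigr)^{2}.\]

We then define \(R^{2}\) as the amount of variance that has been reduced using the curve \(\gamma\):

\[R^{2}=1-\frac{\operatorname{SSE}}{\operatorname{var}\{J_{1},\ldots,J_{N}\}}.\]

Clearly a perfect fit will remove all error, resulting in an \(R^{2}\) value of one. The worst case (\(R^{2}=0\)) occurs when no polynomial can improve over a stationary point at the Fréchet mean, which can be considered a zero-order polynomial regression against the data.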
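The statistic can be sketched for any metric space given a distance function and a precomputed Fréchet mean. The function `r_squared` and its arguments are hypothetical names; the demonstration uses Euclidean data only because the Fréchet mean is then the arithmetic mean, which keeps the example self-contained.

```python
import numpy as np

def r_squared(dist, curve_points, data, frechet_mean_point):
    """R^2 = 1 - SSE / Var, with both terms built from squared distances.

    dist: callable returning the distance between two points.
    curve_points: fitted curve evaluated at the observation times.
    data: observed points, in the same order as curve_points.
    frechet_mean_point: minimizer of the mean squared distance to the data.
    """
    sse = np.mean([dist(c, y) ** 2 for c, y in zip(curve_points, data)])
    var = np.mean([dist(frechet_mean_point, y) ** 2 for y in data])
    return 1.0 - sse / var

def euclid(a, b):
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

t = np.linspace(0.0, 1.0, 5)
data = np.stack([t, 2.0 * t], axis=1)
mean = data.mean(axis=0)  # Fréchet mean in Euclidean space

perfect = r_squared(euclid, data, data, mean)                  # curve hits every point
constant = r_squared(euclid, [mean] * len(data), data, mean)   # zero-order fit
```

The two calls reproduce the extreme cases described above: a curve through every observation gives one, and the stationary curve at the Fréchet mean gives zero.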

### 4.2 Example: Kendall Shape Space

A common challenge in medical imaging is the comparison of shape features which are independent of easily explained differences such as differences in pose (relative position and rotation). Additionally, scale is often uninteresting as it is easily characterized by volume calculation and explained mostly by intersubject variability or differences in age. It was with this perspective that Kendall [19] originally developed his theory of shape space. Here we briefly describe Kendall’s shape space of *m*-landmark point sets in \(\mathbb{R}^{d}\), denoted \(\varSigma_{d}^{m}\). For a complete treatment of Kendall’s shape space, the reader is encouraged to consult Kendall and Le [20, 23].

Given a point set \(x=(x_{i})_{i=1,\ldots,m},x_{i}\in\mathbb{R}^{d}\), translation and scaling effects are removed by centering and uniform scaling. This is achieved by translating the point set so that the centroid is at zero, then scaling so that \(\sum_{i=1}^{m} \|x_{i}\|^{2}=1\). After this standardization, *x* constitutes a point in the sphere \(\mathbb{S}^{(m-1)d-1}\). This representation of shape is not yet complete as it is affected by global rotation, which we wish to ignore. Thus points on \(\mathbb{S}^{(m-1)d-1}\) are referred to as *preshapes* and the sphere \(\mathbb{S}^{(m-1)d-1}\) is referred to as *preshape space*. Kendall shape space \(\varSigma_{d}^{m}\) is obtained by taking the quotient of the preshape space by the action of the rotation group SO(*d*). In practice, points in the quotient (referred to as *shapes*) are represented by members of their equivalence class in preshape space. We describe now how to compute exponential maps, log maps, and parallel transport in shape space, using representatives in \(\mathbb{S}^{(m-1)d-1}\). The work of O’Neill [33] concerning Riemannian submersions characterizes the link between the shape and preshape spaces.
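The standardization step can be sketched as follows. The function name `to_preshape` is hypothetical, and the sketch stores the centered configuration with all \(md\) coordinates rather than in a reduced \((m-1)d\)-dimensional basis.

```python
import numpy as np

def to_preshape(x):
    """Center an (m, d) landmark array and scale it to unit norm."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)        # move the centroid to the origin
    scale = np.linalg.norm(centered)     # sqrt of the sum of squared norms
    if scale == 0.0:
        raise ValueError("all landmarks coincide; the shape is undefined")
    return centered / scale

triangle = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
z = to_preshape(triangle)
# Translating and rescaling the input leaves the preshape unchanged.
z_moved = to_preshape(2.5 * triangle + np.array([7.0, -1.0]))
```

Rotation is deliberately not removed at this stage; that is the role of the quotient by SO(*d*).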

The case *d*>2 is complicated in that these spaces contain degeneracies: points at which the mapping from preshape space to \(\varSigma_{d}^{m}\) fails to be a submersion [1, 11, 17]. Despite these pathologies, outside of a singular set, the shape spaces are described by the theory of Riemannian submersions. We assume the data lie within a single “manifold part” away from any singularities, and show experiments in two dimensions, so that these technical issues can be safely ignored.

Each point *p* in preshape space projects to a point *π*(*p*) in shape space. The shape *π*(*p*) is the orbit of *p* under the action of SO(*d*). Viewed as a subset of \(\mathbb{S}^{(m-1)d-1}\), this orbit is a submanifold whose tangent space is a subspace of that of the sphere. This subspace is called the vertical subspace of \(T_{p}\mathbb {S}^{(m-1)d-1}\) and its orthogonal complement is the horizontal subspace. Projections onto the two subspaces of a vector \(v\in T_{p}\mathbb {S}^{(m-1)d-1}\) are denoted by \(\mathcal{V}(v)\) and \(\mathcal{H}(v)\), respectively. Curves moving along vertical tangent vectors result in rotations of a preshape, and so do not indicate any change in actual shape.

A vertical vector in preshape space arises as the derivative of a rotation of a preshape. The derivative of such a rotation is a skew-symmetric matrix *W*, and its action on a preshape *x* has the form \((Wx_{1},\ldots,Wx_{m})\in T\mathbb{S}^{(m-1)d-1}\). The vertical subspace is then spanned by such tangent vectors arising from any linearly independent set of skew-symmetric matrices. The projection \(\mathcal{H}\) is performed by taking such a spanning set, performing Gram-Schmidt orthonormalization, and removing each component.
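The horizontal projection can be sketched for preshapes stored as \((m,d)\) arrays with \(d\ge 2\). One substitution is made here and should be read as an assumption: the orthonormal basis of the vertical subspace is obtained from a singular value decomposition instead of Gram-Schmidt, which spans the same subspace but tolerates linearly dependent vertical vectors. The function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def vertical_basis(x, tol=1e-12):
    """Orthonormal basis of the vertical subspace at preshape x (m, d)."""
    m, d = x.shape
    vecs = []
    for i, j in combinations(range(d), 2):
        W = np.zeros((d, d))
        W[i, j], W[j, i] = -1.0, 1.0      # basis of skew-symmetric matrices
        vecs.append((x @ W.T).ravel())    # rows of x @ W.T are W x_i
    A = np.stack(vecs, axis=1)
    u, s, _ = np.linalg.svd(A, full_matrices=False)
    return u[:, s > tol]                  # drop directions lost to degeneracy

def horizontal_projection(x, v):
    """Remove the vertical (pure rotation) component of tangent vector v."""
    q = vertical_basis(x)
    flat = v.ravel()
    return (flat - q @ (q.T @ flat)).reshape(v.shape)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
x -= x.mean(axis=0)
x /= np.linalg.norm(x)
v = rng.normal(size=(5, 3))
h = horizontal_projection(x, v)
```

The result `h` is orthogonal to every infinitesimal rotation of `x`, and projecting a purely vertical vector returns zero.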

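The horizontal projection relates the covariant derivative on the sphere to that on shape space. O’Neill [33] shows that if \(X\), \(Y\) are horizontal vector fields at some point \(p\) in preshape space, then

\[\mathcal{H}(\nabla_{X}Y)=\nabla^{*}_{X^{*}}Y^{*},\]

where \(\nabla\) denotes the covariant derivative on preshape space and \(\nabla^{*}\), \(X^{*}\), and \(Y^{*}\) are their counterparts in shape space.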

For the manifold part of a general shape space \(\varSigma_{d}^{m}\), the exponential map and parallel translation are performed using representative preshapes in \(\mathbb {S}^{(m-1)d-1}\). For *d*>2, this must be done in a time-stepping algorithm, in which at each time step an infinitesimal spherical parallel transport is performed, followed by the horizontal projection. The resulting algorithm can be used to compute the exponential map as well. Computation of the log map is less trivial, as it requires an iterative optimization routine. A special case arises when *d*=2, in which the entire space \(\varSigma_{d}^{m}\) is a manifold. In this case the exponential map, parallel transport and log map are computed in closed form [12]. With the exponential map, log map, and parallel transport, one performs polynomial regression on Kendall shape space via the adjoint method described previously.

#### 4.2.1 Rat Calvaria Growth

The dataset consists of \(m=8\) landmarks on a midsagittal section of rat calvaria (skulls excluding the lower jaw). Landmark positions are available for 18 rats at eight ages apiece. Figure 2 shows Riemannian polynomial fits of orders \(k=0,1,2,3\). Curves of the same color indicate the synchronized motion of landmarks within a preshape, and the collection of curves for all eight landmarks represents a curve in shape space. While the geodesic curve in Kendall shape space shows little curvature, the quadratic and cubic curves are less linear, which demonstrates the added flexibility provided by higher order polynomials. The \(R^{2}\) values agree with this qualitative difference: the geodesic regression has \(R^{2}=0.79\), while the quadratic and cubic regressions have \(R^{2}\) values of 0.85 and 0.87, respectively. While this shows that there is a clear improvement in the fit due to increasing \(k\) from one to two, it also shows that little is gained by increasing the order of the polynomial beyond \(k=2\). Qualitatively, Fig. 2 shows that the slight increase in \(R^{2}\) obtained by moving from a quadratic to cubic model corresponds to a marked difference in the curves, indicating that the cubic curve is likely overfitting the data. As seen in Table 1, increasing the order of polynomial to four or five has very little effect on \(R^{2}\) as well.

Table 1: *R* ^{2} for regression of rat dataset

Polynomial order *k* | *R* ^{2} |
---|---|
1 | 0.79 |
2 | 0.85 |
3 | 0.87 |
4 | 0.87 |
5 | 0.87 |

These results indicate that moving from a geodesic to quadratic model provides an important improvement in fit quality. This is consistent with the results of Kenobi et al. [21], who also found that quadratic and possibly cubic curves are necessary to fit this dataset. However, whereas Kenobi et al. use polynomials defined in the tangent space at the Fréchet mean of the data points, the polynomials we use are defined intrinsically, independent of base point.

#### 4.2.2 Corpus Callosum Aging

The corpus callosum, the major white matter bundle connecting the two hemispheres of the brain, is known to shrink during aging [10]. Fletcher showed [12] that more nuanced modes of shape change are observed using geodesic regression. In particular, the volume change observed in earlier studies corresponds to a thinning of the corpus callosum and increased curling of the anterior and posterior regions. In order to investigate even higher modes of shape change of the corpus callosum during normal aging, polynomial regression was performed on data from the OASIS brain database [27]. Magnetic resonance imaging (MRI) scans from 32 normal subjects with ages between 19 and 90 years were obtained from the database and a midsagittal slice was extracted from each volumetric image. The corpus callosum was then segmented on the 2D slices using the ITK-SNAP program [39]. Sets of 64 landmarks for each patient were obtained using the ShapeWorks program [6], which generates samplings of each shape boundary with optimal correspondences among the population.

The geodesic and quadratic regressions yield similar \(R^{2}\) values (0.13 and 0.12, respectively). However, moving from a quadratic to cubic polynomial model delivers a substantial increase in \(R^{2}\) (from 0.13 to 0.21). This suggests that there are interesting third-order phenomena at work. However, as seen in Table 2, increasing the order beyond three results in very little increase in \(R^{2}\), indicating that those orders overfit the data, as was the case in the rat calvaria study as well.

Table 2: *R* ^{2} for regression of corpus callosum dataset

Polynomial order *k* | *R* ^{2} |
---|---|
1 | 0.11 |
2 | 0.14 |
3 | 0.20 |
4 | 0.21 |
5 | 0.22 |

Note that the *R* ^{2} values are quite low in this study. Similar values were observed using geodesic regression in [12]. As is noted, this is likely due to high inter-subject variability, and that age is only able to explain an effect which is small compared to differences between subjects. Fletcher [12] also notes that although the effect may be small, geodesic regression gives a result which is significant (*p*=0.009) using a non-parametric permutation test.

Model selection, which in the case of polynomial regression amounts to the choice of polynomial order, is an important issue. *R* ^{2} always increases with increasing *k*, as we have seen in these two studies. As a result, other measures are sought which balance goodness of fit with complexity of the curve model. Tools often used for model selection in Euclidean polynomial regression, such as the Akaike information criterion and the Bayesian information criterion [5], make assumptions about the distribution of data that are difficult to generalize to the manifold setting. Extension of permutation testing for geodesic regression to higher orders would be useful for this task, but such extension is not trivial on a Riemannian manifold. We expect that such an extension of permutation testing is possible in certain cases where it is possible to define “exchangeability” under the null hypothesis that the data follow a given order *k* trend. Currently, we select models based on qualitative analysis of the fit curves, as in the rat calvaria study, and *R* ^{2} values.

### 4.3 LDDMM Landmark Space

Analysis of landmarks is commonly done in an alternative fashion when scale and rotation invariance is not desired. In this section, we present polynomial regression using the large deformation diffeomorphic metric mapping (LDDMM) framework. This framework consists of a Lie group of diffeomorphisms endowed with a right invariant Sobolev metric acting on a space of landmark configurations. For a more detailed description of the group action approach, the reader is encouraged to consult Bruveris et al. [4]. We will instead focus on the Riemannian structure of landmarks and use the formulas for general Riemannian manifolds.

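Given \(m\) landmarks in \(d\) dimensions, let \(M\cong\mathbb{R}^{md}\) be the space of all possible configurations. We denote by \(x_{i}\in\mathbb{R}^{d}\) the location of the \(i\)th landmark point. Tangent vectors are also represented as tuples of vectors, \(v=(v_{i})_{i=1,\ldots,m}\in\mathbb{R}^{md}\), as are cotangent vectors \(\alpha=(\alpha_{i})_{i=1,\ldots,m}\in\mathbb{R}^{md}\). In contrast to ordinary differential geometric methods in which vectors and metrics are the objects of interest, it is more convenient to work with landmark covectors (which we refer to as momenta). In such case the inverse metric (also called the cometric) is generally written using a shift-invariant scalar kernel \(K:\mathbb{R}\to\mathbb{R}\). The inner product of two covectors is given by

\[\langle\alpha,\beta\rangle_{T_{x}^{*}M}=\sum_{i=1}^{m}\sum_{j=1}^{m}K\bigl(|x_{i}-x_{j}|^{2}\bigr)\,\alpha_{i}\cdot\beta_{j}.\]

In terms of positions and momenta, the geodesic equations are

\[\frac{d}{dt}x_{i}=\sum_{j=1}^{m}K\bigl(|x_{i}-x_{j}|^{2}\bigr)\alpha_{j},\qquad \frac{d}{dt}\alpha_{i}=-2\sum_{j=1}^{m}(x_{i}-x_{j})\,K'\bigl(|x_{i}-x_{j}|^{2}\bigr)\,\alpha_{i}^{T}\alpha_{j},\]

where \(K'\) denotes the derivative of the kernel.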
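Geodesic shooting in landmark space can be sketched by integrating Hamilton's equations for the kinetic energy defined by the cometric. The sketch rests on several assumptions that are not taken from the text: a Gaussian kernel evaluated at squared distances, the width `SIGMA`, the classical Runge-Kutta integrator, and all function names.

```python
import numpy as np

SIGMA = 1.0  # kernel width (illustrative choice)

def kernel(r2):
    """Gaussian kernel as a function of squared distance."""
    return np.exp(-r2 / (2.0 * SIGMA**2))

def kernel_prime(r2):
    """Derivative of the kernel with respect to its argument."""
    return -kernel(r2) / (2.0 * SIGMA**2)

def hamiltonian(x, a):
    """Half the cometric inner product of the momentum with itself."""
    diff = x[:, None, :] - x[None, :, :]
    r2 = np.sum(diff**2, axis=-1)
    return 0.5 * np.sum(kernel(r2) * (a @ a.T))

def rhs(x, a):
    """Time derivatives of positions x (m, d) and momenta a (m, d)."""
    diff = x[:, None, :] - x[None, :, :]      # diff[i, j] = x_i - x_j
    r2 = np.sum(diff**2, axis=-1)
    dx = kernel(r2) @ a
    coeff = kernel_prime(r2) * (a @ a.T)      # K'(r2_ij) * <a_i, a_j>
    da = -2.0 * np.sum(coeff[:, :, None] * diff, axis=1)
    return dx, da

def shoot(x0, a0, n_steps=200):
    """Integrate the geodesic from t=0 to t=1 with classical RK4."""
    x, a = x0.copy(), a0.copy()
    h = 1.0 / n_steps
    for _ in range(n_steps):
        k1 = rhs(x, a)
        k2 = rhs(x + 0.5 * h * k1[0], a + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], a + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], a + h * k3[1])
        x = x + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        a = a + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, a

x0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
a0 = np.array([[1.0, 0.0], [0.0, 1.0], [-0.5, 0.5]])
x1, a1 = shoot(x0, a0)
```

Conservation of `hamiltonian` along the trajectory is a convenient sanity check on both the sign conventions and the step size.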

Given tangent vectors \(v=K\alpha\) and \(w=K\beta\), parallel transport in LDDMM landmark space is computed in coordinates using the formula derived by Younes et al. [38, Eq. (25)].

Using these approaches to computing parallel transport and curvature, we implemented the general polynomial adjoint optimization method. We applied this approach to the rat calvaria data, treating the data as absolute landmark positions (after Procrustes alignment) instead of as scale and rotation invariant Kendall shapes.

Moving from a geodesic to a quadratic model increases \(R^{2}\) from 0.92 with the geodesic to 0.94 with the quadratic curve.

## 5 Riemannian Polynomials in Lie Groups

In this section, we consider the case when the configuration manifold is a Lie group *G*. A tangent vector *v*∈*T* _{ g } *G* at a point *g*∈*G* can be identified with a tangent vector at the identity element *e*∈*G* via either right or left translation by *g* ^{−1}. The resulting element of *T* _{ e } *G* is referred to as the right (respectively, left) trivialization of *v*. We call a vector field \(X\in\mathfrak{X}(G)\) right (respectively, left) invariant if the right trivialization of *X*(*g*) is constant for all *g*. Both left and right translation, considered as mappings *T* _{ g } *G*→*T* _{ e } *G* are linear isomorphisms, and we will use the common notation \(\mathfrak {g}\) to refer to *T* _{ e } *G*. The vector space \(\mathfrak{g}\), endowed with the vector product given by the right trivialization of the negative Jacobi-Lie bracket of right invariant vector fields is called the Lie algebra of *G*.

Each group element \(g\) determines a linear action \(\operatorname{Ad}_{g}\) on \(\mathfrak{g}\) called the adjoint action and its dual action \(\operatorname{Ad}_{g}^{*}\) on \(\mathfrak{g}^{*}\) which is called the coadjoint action of \(g\). In a Riemannian Lie group, the inner product on \(\mathfrak{g}\) can be used to compute the adjoint of the adjoint action, which we term the adjoint-transpose action \(\operatorname{Ad}_{g}^{\dagger}\), defined by

\[\bigl\langle\operatorname{Ad}_{g}^{\dagger}X,Y\bigr\rangle=\bigl\langle X,\operatorname{Ad}_{g}Y\bigr\rangle\]

for all \(X,Y\in\mathfrak{g}\).
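The adjoint-transpose action can be sketched numerically for the rotation group SO(3), taken here as an illustrative matrix group. The inner product (half the Frobenius inner product on skew-symmetric matrices) and all function names are assumptions of the sketch. The operator is assembled directly from its defining property by expanding in an orthonormal basis of the Lie algebra.

```python
import numpy as np

def hat(w):
    """Map a vector in R^3 to a skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def inner(X, Y):
    """Inner product on the Lie algebra (assumed: half Frobenius)."""
    return 0.5 * np.trace(X.T @ Y)

def Ad(g, X):
    """Adjoint action of a matrix group element on its Lie algebra."""
    return g @ X @ np.linalg.inv(g)

def Ad_dagger(g, X, basis):
    """Adjoint-transpose: <Ad_dagger X, Y> = <X, Ad_g Y> for all Y.

    basis must be orthonormal with respect to `inner`.
    """
    return sum(inner(X, Ad(g, E)) * E for E in basis)

basis = [hat(e) for e in np.eye(3)]
c, s = np.cos(0.7), np.sin(0.7)
g = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
X = hat(np.array([1.0, 2.0, 3.0]))
Y = hat(np.array([-1.0, 0.5, 2.0]))
lhs = inner(Ad_dagger(g, X, basis), Y)
rhs = inner(X, Ad(g, Y))
```

For this particular inner product, which is invariant under conjugation, the adjoint-transpose of \(g\) coincides with the adjoint action of \(g^{-1}\); for a general inner product the two differ.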

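Let \(\operatorname{ad}_{X}Y=[X,Y]\) denote the Lie bracket on \(\mathfrak{g}\), and let \(\operatorname{ad}_{X}^{\dagger}\) denote the adjoint of \(\operatorname{ad}_{X}\) with respect to the inner product. Extending Lie algebra elements \(X\) and \(Y\) to right invariant vector fields \(\widetilde{X},\widetilde{Y}\), the covariant derivative \(\nabla_{\widetilde{X}}\widetilde{Y}\) is also right invariant (c.f. [7, Proposition 3.18]) and satisfies

\[\bigl(\nabla_{\widetilde{X}}\widetilde{Y}\bigr)(e)=\frac{1}{2}\bigl(\operatorname{ad}_{X}^{\dagger}Y+\operatorname{ad}_{Y}^{\dagger}X-\operatorname{ad}_{X}Y\bigr).\]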

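We use \(\xi_{1}\) to denote the right trivialized velocity of the curve \(\gamma(t)\in G\). Using our formula for the covariant derivative, one sees that the geodesic equation in a Lie group with right invariant metric is the right “Euler-Poincaré” equation:

\[\frac{d}{dt}\xi_{1}=-\operatorname{ad}_{\xi_{1}}^{\dagger}\xi_{1},\]

where \(\operatorname{ad}_{\xi_{1}}^{\dagger}\) is the adjoint, with respect to the inner product on \(\mathfrak{g}\), of the Lie bracket operator \(\operatorname{ad}_{\xi_{1}}=[\xi_{1},\cdot\,]\).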

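Similarly, we use \(\xi_{i}\), \(i=1,\ldots,k\) to represent the right trivialized higher-order velocity vectors \(v_{i}\), so that the polynomial equations become ordinary differential equations in the Lie algebra \(\mathfrak{g}\).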

## 6 Polynomial Regression in Lie Groups

We have seen that the geodesic equation is simplified in a Lie group with right invariant metric, using the Euler-Poincaré equation. In this section, we derive the adjoint equations used to perform geodesic and polynomial regression in a Lie group. Using right-trivialized adjoint variables, we will see that the symmetries provided by the Lie group structure result in adjoint equations more amenable to computation than those in Sect. 4.

### 6.1 Geodesic Regression

*N* data points *J* _{ j }∈*G* are observed at times *t* _{ j }∈[0,1]. Using the geodesic distance \(d:G\times G\to\mathbb{R}\), the least squares geodesic regression problem is to find the minimum of

where *γ*:[0,1]→*G* is a geodesic.
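The least squares objective can be sketched for SO(3) as follows. This is an illustration under stated assumptions, not the authors' implementation: the metric is taken to be bi-invariant and normalized so that the geodesic distance equals the rotation angle, the objective carries an assumed factor of one half, and geodesics are written as \(\gamma(t)=\exp(t\hat{\xi})\gamma(0)\) with constant right trivialized velocity. The function names are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Matrix exponential of hat(w), computed with Rodrigues' formula."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-12:
        return np.eye(3) + W
    return (np.eye(3) + (np.sin(theta) / theta) * W
            + ((1.0 - np.cos(theta)) / theta**2) * (W @ W))

def geodesic_distance(R1, R2):
    """Rotation angle of R1^T R2 (assumed metric normalization)."""
    c = 0.5 * (np.trace(R1.T @ R2) - 1.0)
    return np.arccos(np.clip(c, -1.0, 1.0))

def geodesic(g0, xi, t):
    """Geodesic through g0 with constant right trivialized velocity xi."""
    return exp_so3(t * np.asarray(xi)) @ g0

def energy(g0, xi, times, data):
    """Half the sum of squared geodesic distances to the data."""
    return 0.5 * sum(geodesic_distance(geodesic(g0, xi, t), J) ** 2
                     for t, J in zip(times, data))

times = np.linspace(0.0, 1.0, 5)
g_true = exp_so3(np.array([0.1, 0.2, -0.1]))
xi_true = np.array([0.4, -0.3, 0.6])
data = [geodesic(g_true, xi_true, t) for t in times]

E_exact = energy(g_true, xi_true, times, data)        # noiseless data, true parameters
E_wrong = energy(g_true, 1.5 * xi_true, times, data)  # velocity scaled by 1.5
```

With noiseless data the energy vanishes at the generating parameters and grows as the initial velocity is perturbed.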

Consider a variation of the geodesic *γ*(*t*), which is a vector field along *γ* that we denote *δγ*(*t*)∈*T* _{ γ(t)} *G*. We denote by *Z*(*t*) the right trivialization of *δγ*(*t*). The variation of *γ* induces the following variation in the trivialized velocity *ξ* _{1} [16]:

Constraining *δγ* to be a Jacobi field, we use the following variation of the Euler-Poincaré equation to obtain *Z*:

*Z*(*t*) is a right trivialized Jacobi field. In order to compute the variations of *E* with respect to the initial position *γ*(0) and velocity *ξ* _{1}(0) of the geodesic *γ*(*t*), the variations of *E* with respect to *γ*(1) and *ξ* _{1}(1) are transported backward to *t*=0 by the adjoint ODE. Introducing adjoint variables \(\lambda_{0}(t),\lambda_{1}(t)\in\mathfrak {g}\), the left trivialized variation of *E* with respect to *γ*(*t*) and the variation with respect to *ξ* _{1}(*t*) are given by

The adjoint variables are computed by initializing *λ* _{0}(1)=*λ* _{1}(1)=0 and integrating the adjoint ODE backward to *t*=0. The adjoint ODE is obtained by computing the adjoint of the ODE governing geodesic perturbations, Eq. (48), with respect to the \(L^{2}([0,1]\to\mathfrak{g})\) inner product. The resulting adjoint ODE is

The adjoint variable *λ* _{0} takes jump discontinuities when passing over data points:

*γ*(*t* _{ j }) to the data *J* _{ j }. Notice that the adjoint variable *λ* _{0} satisfies an equation resembling the Euler-Poincaré equation and can likewise be solved in closed form:

Optimization of *E* is performed using the variations \(\delta_{\gamma(0)}E,\delta_{\xi_{1}(0)}E\), for example with the following gradient descent steps:

with step size *α*, where *k* denotes the step of the iterative optimization process. Note that the Riemannian exponential map \(\operatorname{Exp}\) in the above expression is commonly replaced by a numerically efficient approximation such as the Cayley map [3].
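To make the remark about the Cayley map concrete, the sketch below compares it with the matrix exponential on SO(3). The Cayley map \(\operatorname{cay}(W)=(I-W/2)^{-1}(I+W/2)\) sends skew-symmetric matrices to rotations and agrees with the exponential up to third-order terms. The update form `retraction(-alpha * grad) @ g` is an assumption made for illustration; the function names are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Matrix exponential of hat(w), computed with Rodrigues' formula."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-12:
        return np.eye(3) + W
    return (np.eye(3) + (np.sin(theta) / theta) * W
            + ((1.0 - np.cos(theta)) / theta**2) * (W @ W))

def cayley(w):
    """Cayley map: (I - W/2)^{-1} (I + W/2) with W = hat(w)."""
    W = hat(w)
    I = np.eye(3)
    return np.linalg.solve(I - 0.5 * W, I + 0.5 * W)

def descent_step(g, grad, alpha, retraction=cayley):
    """One descent update of the base point (assumed right-trivialized form)."""
    return retraction(-alpha * np.asarray(grad)) @ g

w = np.array([0.01, -0.02, 0.015])
gap = np.linalg.norm(cayley(w) - exp_so3(w))   # small for small w
g_next = descent_step(np.eye(3), np.array([0.3, 0.1, -0.2]), alpha=0.1)
```

Because the Cayley map needs only a linear solve and always returns an orthogonal matrix, the iterate stays on the group without any reprojection.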

### 6.2 Example: Rotation Group SO(3)

*A*. For vectors \(x,y\in\mathbb{R}^{3}\), the inner product is

*A* is the identity matrix. In that case, left invariance also implies right invariance and skew-symmetry of \(\operatorname{ad}^{\dagger}\), so that for any \(X,Y\in\mathfrak{so}(3)\):

*g*∈SO(3) on a 3-vector *x* is given by
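The SO(3) operators discussed in this subsection can be checked numerically. The sketch below assumes the usual identification of \(\mathfrak{so}(3)\) with \(\mathbb{R}^{3}\) through the cross product, so that \(\operatorname{ad}_{x}y=x\times y\) and \(\operatorname{Ad}_{g}x=gx\), and an inner product \(x^{T}Ay\) for a symmetric positive definite matrix `A`. The resulting expressions for the adjoint-transpose operators are derived from those assumptions, and all function names are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Matrix exponential of hat(w), computed with Rodrigues' formula."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-12:
        return np.eye(3) + W
    return (np.eye(3) + (np.sin(theta) / theta) * W
            + ((1.0 - np.cos(theta)) / theta**2) * (W @ W))

def Ad(g, x):
    """Adjoint action on 3-vectors: hat(Ad_g x) = g hat(x) g^T."""
    return g @ x

def Ad_dagger(A, g, x):
    """Adjoint of Ad_g with respect to <u, v> = u^T A v."""
    return np.linalg.solve(A, g.T @ A @ x)

def ad(x, y):
    """Infinitesimal adjoint action, the cross product."""
    return np.cross(x, y)

def ad_dagger(A, x, y):
    """Adjoint of ad_x with respect to <u, v> = u^T A v."""
    return np.linalg.solve(A, np.cross(A @ y, x))

A = np.diag([1.0, 2.0, 3.0])
g = exp_so3(np.array([0.2, -0.4, 0.1]))
x = np.array([0.5, 1.0, -0.7])
y = np.array([-0.3, 0.8, 0.2])

lhs = Ad_dagger(A, g, x) @ A @ y          # <Ad^dagger_g x, y>
rhs = x @ A @ Ad(g, y)                    # <x, Ad_g y>
skew_gap = ad_dagger(np.eye(3), x, y) + ad(x, y)   # zero when A is the identity
```

When `A` is the identity, `ad_dagger` reduces to the negative of `ad`, which is the skew-symmetry property mentioned in the text.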

### 6.3 Polynomial Regression

*Z* of *γ*, which can be considered a kind of higher-order Jacobi field. Introducing adjoint variables \(\lambda_{0},\ldots,\lambda_{k}\in \mathfrak{g}\), the adjoint system is (see Appendix C for derivation)

For *i*=2,…,*k*, these equations resemble the original polynomial equations. However, the evolution of *λ* _{1} is influenced by all adjoint variables and higher-order velocities in a non-trivial way. The first adjoint equation again resembles the Euler-Poincaré equation, and its solution is given by Eq. (54).

#### 6.3.1 Polynomial Regression in SO(3)

*ξ* _{ i }, the equations for higher order polynomials in SO(3) are

## 7 Lie Group Actions

So far, we have seen that polynomial regression is particularly convenient in Lie groups with right invariant metrics, reducing the adjoint system from second to first order using the closed form integral of *λ* _{0}. We now consider the case when a Lie group *G* acts on another manifold *M* which is itself equipped with a Riemannian metric. For our purposes, the group action need not be transitive; when it is transitive, the target space is called a “homogeneous space” for *G*.

Although the two approaches sometimes coincide, generally one must choose between using polynomials defined by the metric in *M*, ignoring the action of *G*, or using curves defined by the action of polynomials in *G* on points in *M*. In cases when a Riemannian Lie group is known to act on the space *M*, the primary object of interest is usually not the path in the object space *M*, but the path of symmetries described by the group elements. Therefore it is most natural to use the Lie group structure to define paths in object space. We employ this approach, in which polynomial regression under a Riemannian Lie group action is studied primarily using the Lie group elements.

*M* as a curve *p*(*t*) defined using the group action:

*γ* is a polynomial of order *k* in *G* with parameters

*p* _{0}∈*M* is a base point in the object space. Invariance of the metric on *G* allows us to assume, without loss of flexibility in the model, that the base deformation is the identity: *γ*(0)=*e*∈*G*. Optimization is done by fixing *γ*(0)=*e*∈*G* and minimizing a least squares objective function defined using the metric on *M*, with respect to the base point *p* _{0}∈*M* and the parameters of the Lie group polynomial, \(\xi_{1},\ldots,\xi_{k}\in\mathfrak{g}\). This is accomplished using a similar adjoint method to that presented in the previous sections, but where the jump discontinuities in *λ* _{0} are modified due to this change in objective function. In the following sections, we discuss this in more detail and also derive the gradients with respect to the base point *p* _{0}.

### 7.1 Action on a General Manifold

*T* _{ p } *M* at any point *p*∈*M*. Given a curve *g*(*t*):(−*ϵ*,*ϵ*)→*G* such that *g*(0)=*e* and \(\frac{d}{dt}|_{t=0}g(t)=\xi\in\mathfrak{g}\), define the following mapping (cf. [16]):

*ρ* _{ p } is a linear mapping from \(\mathfrak{g}\) to *T* _{ p } *M*, and as such it has a dual \(\rho_{p}^{*}:T_{p}^{*}M\to\mathfrak{g}^{*}\) that maps cotangent vectors in *M* to the Lie coalgebra \(\mathfrak{g}^{*}\). We refer to this dual mapping as the cotangent lift momentum map and denote it by \(\mathbf{J}:T^{*}M\to\mathfrak{g}^{*}\).
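A small numerical sketch may help fix ideas. Assume SO(3) acts on \(\mathbb{R}^{3}\) by matrix multiplication, identify \(\mathfrak{so}(3)\) and its dual with \(\mathbb{R}^{3}\) via the cross and dot products, and identify cotangent vectors with ordinary vectors through the Euclidean inner product. Under these assumptions the infinitesimal action is \(\rho_{p}(\xi)=\xi\times p\) and its dual is \(\mathbf{J}(p,\mu)=p\times\mu\). The names `rho` and `momentum_map` are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Matrix exponential of hat(w), computed with Rodrigues' formula."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-12:
        return np.eye(3) + W
    return (np.eye(3) + (np.sin(theta) / theta) * W
            + ((1.0 - np.cos(theta)) / theta**2) * (W @ W))

def rho(p, xi):
    """Infinitesimal action of xi at p: d/dt exp(t hat(xi)) p at t = 0."""
    return np.cross(xi, p)

def momentum_map(p, mu):
    """Dual of rho at p: <J(p, mu), xi> = <mu, rho(p, xi)> for all xi."""
    return np.cross(p, mu)

p = np.array([0.0, 0.0, 1.0])
mu = np.array([0.3, -0.2, 0.0])     # a (co)tangent vector at p
xi = np.array([0.5, 1.0, -0.7])
g = exp_so3(np.array([0.2, -0.4, 0.1]))

pairing_left = momentum_map(p, mu) @ xi
pairing_right = mu @ rho(p, xi)
moved = momentum_map(g @ p, g @ mu)   # momentum after acting with g
```

In this example, acting with *g* on both the point and the covector rotates the momentum by *g*, which is the form the equivariance of **J** takes under these identifications.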

A property of **J** is that it is preserved under the coadjoint action:

The action of *g* on the cotangent bundle, which appears on the right-hand side above, maps a cotangent vector *μ* at point *p* to the vector \(g.\mu \in T_{g.p}^{*}M\). Replacing squared norm with squared geodesic distance on the Riemannian manifold *M*, the first adjoint variable is then given by

Suppose the metric on *G* and the metric on the manifold *M* coincide, in the sense that for any vectors \(\xi,\mu\in \mathfrak{g}\) and points *p*∈*M*:

For any *p* _{0}∈*M*, this means the mapping *g*→*g*.*p* _{0} is a Riemannian submersion. If, additionally, the metric on *G* is biinvariant, this implies that the covariant derivative satisfies [33]

so that polynomials in *M* are generated by polynomials in *G* along with the action on the base point *p* _{0}.

#### 7.1.1 Example: Rotations of the Sphere

*A* matrix in Sect. 6.2. Representing points on the sphere as unit vectors in \(\mathbb{R}^{3}\), the group action is simply left multiplication by a matrix in SO(3):

*p*. The standard metric on the sphere corresponds to the standard biinvariant metric on SO(3) so that, as discussed previously, polynomials on \(\mathbb{S}^{2}\) correspond to polynomials in SO(3) acting on points on the sphere.

*γ*(*t*) on the base point \(p_{0}\in\mathbb{S}^{2}\). The derivative of *γ*(*t*) is replaced by the equation

The evolution of *ξ* _{ i } is the same as that for SO(3). Figure 6 shows example polynomial curves in the rotation group and their action on a point on the sphere. Notice that the example polynomials on the sphere are precisely those shown in Fig. 1, although they were generated here using polynomials on SO(3) instead of integrating directly on the sphere.
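The simplest instance of a curve on the sphere generated by a curve in SO(3) is the first-order (geodesic) case. The sketch below assumes the bi-invariant metric, so that the trivialized velocity is constant and \(\gamma(t)=\exp(t\hat{\xi})\), and moves a base point by left multiplication. Higher-order polynomials would require integrating the polynomial equations, which this sketch does not attempt. The function names are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Matrix exponential of hat(w), computed with Rodrigues' formula."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-12:
        return np.eye(3) + W
    return (np.eye(3) + (np.sin(theta) / theta) * W
            + ((1.0 - np.cos(theta)) / theta**2) * (W @ W))

def sphere_curve(xi, p0, t):
    """p(t) = gamma(t) . p0 with gamma(t) = exp(t hat(xi)) and gamma(0) = e."""
    return exp_so3(t * np.asarray(xi)) @ p0

p0 = np.array([0.0, 0.0, 1.0])
xi = np.array([0.8, -0.6, 0.0])      # orthogonal to p0, so p(t) is a great circle
ts = np.linspace(0.0, 1.0, 11)
curve = np.array([sphere_curve(xi, p0, t) for t in ts])

# Central finite difference of p(t) at t = 0.5, to compare with xi cross p(t)
h = 1e-6
velocity_fd = (sphere_curve(xi, p0, 0.5 + h) - sphere_curve(xi, p0, 0.5 - h)) / (2.0 * h)
velocity_action = np.cross(xi, sphere_curve(xi, p0, 0.5))
```

The curve stays on the unit sphere by construction, and its velocity is the infinitesimal action of `xi` on the current point.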

Using **J**, we have the jump discontinuities for the first adjoint variable *λ* _{0}:

### 7.2 Lie Group Actions on Vector Spaces

Suppose that *M* is a vector space *V* and that *G* acts linearly on the left on *V*. Given a smooth linear group action, a vector *ξ* in the Lie algebra \(\mathfrak{g}\) acts linearly on a vector *v*∈*V* in the following way

where *g*(*ϵ*) is a curve in *G* satisfying *g*(0)=*e* and \(\frac{d}{d\epsilon}|_{\epsilon=0}g(\epsilon)=\xi\). Again we use the notation \(\rho_{v}:\mathfrak{g}\to V\) to denote right-multiplication under this action:

The cotangent lift momentum map (the dual of *ρ* _{ v }) is written using the diamond notation introduced in [16]:

Suppose the data are vectors *J* _{ i } in the vector space *V*. In that case, the inner product on *V* is used to write the regression problem as a minimization of

where *γ* is a polynomial in *G* and *v* _{0}∈*V* is an evolving template vector. Without loss of generality, *γ*(0) can also be constrained to be the identity so that *v* _{0} is the template vector at time zero. Optimization of Eq. (104) with respect to *v* _{0} requires the variation

*V*, an operation mapping *V* to *V* ^{∗}. If the group *G* acts by isometries on *V*, then the group action commutes with flatting and the optimal base vector *v* _{0} can be computed in closed form

Even when *G* does not act by isometries, the optimal base vector can often be solved for in closed form.
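For the isometric case, the closed form can be illustrated with rotations acting on \(\mathbb{R}^{3}\). Since \(\|g v-J\|=\|v-g^{-1}J\|\) for an isometry *g*, minimizing a sum of squared residuals over the base vector gives the average of the data pulled back to time zero. The sketch below is derived from that observation rather than taken from the text; the group elements are fixed rotations along an assumed geodesic and the function names are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Matrix exponential of hat(w), computed with Rodrigues' formula."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-12:
        return np.eye(3) + W
    return (np.eye(3) + (np.sin(theta) / theta) * W
            + ((1.0 - np.cos(theta)) / theta**2) * (W @ W))

def optimal_base_vector(rotations, data):
    """Average of the data pulled back by each (isometric) group element."""
    return np.mean([R.T @ J for R, J in zip(rotations, data)], axis=0)

rng = np.random.default_rng(0)
times = np.linspace(0.0, 1.0, 6)
xi = np.array([0.4, -0.3, 0.6])
rotations = [exp_so3(t * xi) for t in times]
v_true = np.array([1.0, -2.0, 0.5])
data = [R @ v_true + 0.05 * rng.standard_normal(3) for R in rotations]

v0_closed = optimal_base_vector(rotations, data)

# Reference: solve the same least squares problem by stacking the rotations
M = np.vstack(rotations)
b = np.concatenate(data)
v0_lstsq = np.linalg.lstsq(M, b, rcond=None)[0]
```

The two estimates agree because the normal equations reduce to an average when every group element is orthogonal.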

The variation with respect to *γ*(*t* _{ j }) is more interesting:

#### 7.2.1 Example: Diffeomorphically Deforming Images

Considering an image *I* as a square integrable function on a domain \(\varOmega\subset\mathbb{R}^{d}\), the left action of a diffeomorphism *γ*∈Diff(*Ω*) is

The infinitesimal action of *ξ* on an image is

*Lξ* _{1}(0)=*I* _{0}⋄*α*(0). As a result, changes in base image *I* _{0} influence the behavior of the deformation itself.
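The action of diffeomorphisms on images can be illustrated in one dimension. The sketch below assumes the standard left action \(\gamma.I=I\circ\gamma^{-1}\), for which the infinitesimal action of a velocity field *ξ* is \(-\nabla I\cdot\xi\). It uses the scaling flow \(\gamma_{\epsilon}(x)=e^{\epsilon}x\), whose generator is \(\xi(x)=x\), and an analytic test image; every name here is hypothetical.

```python
import numpy as np

def image(x):
    """A smooth 1D test image."""
    return np.exp(-x**2)

def image_grad(x):
    """Derivative of the test image."""
    return -2.0 * x * np.exp(-x**2)

def act(gamma_inv, I):
    """Left action (gamma . I)(x) = I(gamma^{-1}(x)), given the inverse map."""
    return lambda x: I(gamma_inv(x))

def scaling_inverse(eps):
    """Inverse of the scaling flow gamma_eps(x) = exp(eps) * x."""
    return lambda x: np.exp(-eps) * x

def translation_inverse(b):
    """Inverse of the translation x -> x + b."""
    return lambda x: x - b

x = np.linspace(-2.0, 2.0, 9)

# Finite-difference derivative of gamma_eps . I at eps = 0
eps = 1e-5
fd = (act(scaling_inverse(eps), image)(x)
      - act(scaling_inverse(-eps), image)(x)) / (2.0 * eps)
infinitesimal = -image_grad(x) * x        # -grad(I) * xi with xi(x) = x

# Left action property: gamma1 . (gamma2 . I) = (gamma1 o gamma2) . I
a, b = 0.3, 0.7
nested = act(scaling_inverse(a), act(translation_inverse(b), image))(x)
composed = image(np.exp(-a) * x - b)      # I((gamma1 o gamma2)^{-1}(x))
```

The agreement between `nested` and `composed` is what makes composition with the inverse a left action rather than a right action.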

*ξ* _{ i } are not constrained to be horizontal. Implementation of polynomial regression involves the expression above for the diamond map, along with the \(\operatorname{ad}\) and \(\operatorname {ad}^{*}\) operators [28]

*m* _{ i }=*Lξ* _{ i } are introduced and this EPDiff equation is generalized to

*I* _{0} is simplified, as Eq. (105) is solved in closed form using

## 8 Discussion

The Riemannian polynomial framework we have presented provides a general approach to regression for manifold-valued data. The greatest limitation to performing polynomial regression on a general Riemannian manifold is that it requires computation of the Riemannian curvature tensor, which is often tedious [29]. In a Lie group or homogeneous space, we have shown that the symmetries provided by the group allow for not only simple integration using parallel transport in the Lie algebra, but also simplified adjoint equations that do not require explicit curvature computation.

The theory of rolling maps on the sphere, introduced by Jupp & Kent [18], offers another perspective on Riemannian polynomials. On the sphere, this interpretation is related to the group action described above. Given a curve \(\gamma:[0,1]\to\mathbb{S}^{2}\), consider embedding both the sphere and a plane in \(\mathbb{R}^{3}\) such that the plane is tangent to the sphere at the point *γ*(0). Now roll the sphere along the plane so that it remains tangent at *γ*(*t*) at every time, and such that no slipping or twisting occurs. The resulting path, \(\gamma_{u}:[0,1]\to\mathbb{R}^{2}\), traced out on the plane is called the unwrapped curve. Remarkably, *γ* is a polynomial of order *k* on \(\mathbb{S}^{2}\) if and only if the unwrapped curve *γ* _{ u } is a polynomial of order *k* in the conventional sense. For more information regarding this connection to Jupp & Kent’s rolling maps, as well as a comparison to Noakes’ cubic splines [32], the reader is referred to the work of Silva Leite & Krakowski [24].

## References

- 1. Bandulasiri, A., Gunathilaka, A., Patrangenaru, V., Ruymgaart, F., Thompson, H.: Nonparametric shape analysis methods in glaucoma detection. I. J. Stat. Sci. **9**, 135–149 (2009)
- 2. Bookstein, F.L.: Morphometric Tools for Landmark Data: Geometry and Biology. Cambridge Univ. Press, Cambridge (1991)
- 3. Bou-Rabee, N.: Hamilton-Pontryagin integrators on Lie groups. Ph.D. thesis, California Institute of Technology (2007)
- 4. Bruveris, M., Gay-Balmaz, F., Holm, D., Ratiu, T.: The momentum map representation of images. J. Nonlinear Sci. **21**(1), 115–150 (2011)
- 5. Burnham, K., Anderson, D.: Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer, New York (2002)
- 6. Cates, J., Fletcher, P.T., Styner, M., Shenton, M., Whitaker, R.: Shape modeling and analysis with entropy-based particle systems. In: Proceedings of Information Processing in Medical Imaging (IPMI) (2007)
- 7. Cheeger, J., Ebin, D.G.: Comparison Theorems in Riemannian Geometry, vol. 365. AMS Bookstore, Providence (1975)
- 8. Davis, B.C., Fletcher, P.T., Bullitt, E., Joshi, S.C.: Population shape regression from random design data. Int. J. Comput. Vis. **90**(2), 255–266 (2010)
- 9. do Carmo, M.P.: Riemannian Geometry, 1st edn. Birkhäuser, Boston (1992)
- 10. Driesen, N., Raz, N.: The influence of sex, age, and handedness on corpus callosum morphology: A meta-analysis. Psychobiology (1995). doi: 10.3758/BF03332028
- 11. Dryden, I.L., Kume, A., Le, H., Wood, A.T.: A multi-dimensional scaling approach to shape analysis. Biometrika **95**(4), 779–798 (2008)
- 12. Fletcher, P.T.: Geodesic regression and the theory of least squares on Riemannian manifolds. Int. J. Comput. Vis. (2012). doi: 10.1007/s11263-012-0591-y
- 13. Fletcher, P.T., Liu, C., Pizer, S.M., Joshi, S.C.: Principal geodesic analysis for the study of nonlinear statistics of shape. IEEE Trans. Med. Imaging **23**(8), 995–1005 (2004)
- 14. Giambò, R., Giannoni, F., Piccione, P.: An analytical theory for Riemannian cubic polynomials. IMA J. Math. Control Inf. **19**(4), 445–460 (2002)
- 15. Hinkle, J., Muralidharan, P., Fletcher, P.T., Joshi, S.C.: Polynomial regression on Riemannian manifolds. In: ECCV, Florence, Italy, vol. 3, pp. 1–14 (2012)
- 16. Holm, D.D., Marsden, J.E., Ratiu, T.S.: The Euler-Poincaré equations and semidirect products with applications to continuum theories. Adv. Math. **137**, 1–81 (1998)
- 17. Huckemann, S., Hotz, T., Munk, A.: Intrinsic shape analysis: geodesic principal component analysis for Riemannian manifolds modulo Lie group actions. Discussion paper with rejoinder. Stat. Sin. **20**, 1–100 (2010)
- 18. Jupp, P.E., Kent, J.T.: Fitting smooth paths to spherical data. Appl. Stat. **36**(1), 34–46 (1987)
- 19. Kendall, D.G.: Shape manifolds, procrustean metrics, and complex projective spaces. Bull. Lond. Math. Soc. **16**(2), 81–121 (1984)
- 20. Kendall, D.G.: A survey of the statistical theory of shape. Stat. Sci. **4**(2), 87–99 (1989)
- 21. Kenobi, K., Dryden, I.L., Le, H.: Shape curves and geodesic modelling. Biometrika **97**(3), 567–584 (2010)
- 22. Kume, A., Dryden, I.L., Le, H.: Shape-space smoothing splines for planar landmark data. Biometrika **94**(3), 513–528 (2007). doi: 10.1093/biomet/asm047
- 23. Le, H., Kendall, D.G.: The Riemannian structure of Euclidean shape spaces: a novel environment for statistics. Ann. Stat. **21**(3), 1225–1271 (1993)
- 24. Silva Leite, F., Krakowski, K.: Covariant differentiation under rolling maps. Departamento de Matemática, Universidade of Coimbra, Portugal (2008), No. 08-22, 1–8
- 25. Lewis, A., Murray, R.: Configuration controllability of simple mechanical control systems. SIAM J. Control Optim. **35**(3), 766–790 (1997)
- 26. Machado, L., Leite, F.S., Krakowski, K.: Higher-order smoothing splines versus least squares problems on Riemannian manifolds. J. Dyn. Control Syst. **16**(1), 121–148 (2010)
- 27. Marcus, D., Wang, T., Parker, J., Csernansky, J., Morris, J., Buckner, R.: Open access series of imaging studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J. Cogn. Neurosci. **19**(9), 1498–1507 (2007)
- 28. Marsden, J., Ratiu, T.: Introduction to Mechanics and Symmetry: a Basic Exposition of Classical Mechanical Systems, vol. 17. Springer, Berlin (1999)
- 29. Micheli, M., Michor, P., Mumford, D.: Sectional curvature in terms of the cometric, with applications to the Riemannian manifolds of landmarks. SIAM J. Imaging Sci. **5**(1), 394–433 (2012)
- 30. Miller, M.I., Trouvé, A., Younes, L.: Geodesic shooting for computational anatomy. J. Math. Imaging Vis. **24**(2), 209–228 (2006). doi: 10.1007/s10851-005-3624-0
- 31. Niethammer, M., Huang, Y., Vialard, F.X.: Geodesic regression for image time-series. In: Proceedings of Medical Image Computing and Computer Assisted Intervention (MICCAI) (2011)
- 32. Noakes, L., Heinzinger, G., Paden, B.: Cubic splines on curved surfaces. IMA J. Math. Control Inf. **6**, 465–473 (1989)
- 33. O’Neill, B.: The fundamental equations of a submersion. Mich. Math. J. **13**(4), 459–469 (1966)
- 34. Shi, X., Styner, M., Lieberman, J., Ibrahim, J.G., Lin, W., Zhu, H.: Intrinsic regression models for manifold-valued data. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2009, pp. 192–199. Springer, Berlin (2009)
- 35. Singh, N., Wang, A., Sankaranarayanan, P., Fletcher, P., Joshi, S.: Genetic, structural and functional imaging biomarkers for early detection of conversion from MCI to AD. In: Proceedings of Medical Image Computing and Computer Assisted Intervention (MICCAI), pp. 132–140. Springer, Berlin (2012)
- 36. Turaga, P., Veeraraghavan, A., Srivastava, A., Chellappa, R.: Statistical computations on Grassmann and Stiefel manifolds for image and video-based recognition. IEEE Trans. Pattern Anal. Mach. Intell. **33**(11), 2273–2286 (2011)
- 37. Vaillant, M., Glaunes, J.: Surface matching via currents. In: Information Processing in Medical Imaging, pp. 1–5. Springer, Berlin (2005)
- 38. Younes, L., Qiu, A., Winslow, R., Miller, M.: Transport of relational structures in groups of diffeomorphisms. J. Math. Imaging Vis. **32**(1), 41–56 (2008)
- 39. Yushkevich, P.A., Piven, J., Cody Hazlett, H., Gimpel Smith, R., Ho, S., Gee, J.C., Gerig, G.: User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. NeuroImage **31**(3), 1116–1128 (2006)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.