Geodesic analysis in Kendall's shape space with epidemiological applications

We analytically determine Jacobi fields and parallel transports and compute geodesic regression in Kendall's shape space. Using the derived expressions, we can fully leverage the geometry via Riemannian optimization and thereby reduce the computational expense by several orders of magnitude. The methodology is demonstrated by performing a longitudinal statistical analysis of epidemiological shape data. As an example application we have chosen 3D shapes of knee bones, reconstructed from image data of the Osteoarthritis Initiative (OAI). Comparing subject groups with incident and developing osteoarthritis versus normal controls, we find clear differences in the temporal development of femur shapes. This paves the way for early prediction of incident knee osteoarthritis, using geometry data alone.


Introduction
In recent years, there has been an increased interest in statistical analysis of geometric shapes. Such analyses are especially often performed in the field of morphometry, but mostly for static forms. A frequently-encountered situation, however, is that instead of a set of discrete shapes, series of shapes are given, often together with co-varying parameters. For example, longitudinal imaging studies track biological shape changes over time within and across individuals to gain insight into dynamical processes such as ageing or disease progression. Statistical modeling and analysis of shapes is of critical importance for a better understanding of such temporal shape data.
The main challenge is that shape variability is inherently nonlinear and high-dimensional, so that classical statistical approaches are not always appropriate. One way to address this is linearization. The quality of the resulting statistical model, however, then depends strongly on the validity of the linearity assumption, i.e. that the observed data points lie to a good approximation in a flat Euclidean subspace. Since the natural variability in populations often leads to a large spread in shape space and the observed data may lie in highly-curved regions (see (Huckemann and Hotz, 2014)), linearity often cannot be assumed in practical applications.
In the context of longitudinal studies, an important task is to estimate continuous trajectories from sparse and potentially noisy samples. For smooth individual biological changes, subject-specific spatiotemporal regression models are adequate. They also provide a way to describe the data at unobserved times (i.e. shape changes between observation times and -within certain limits -also at future times) and to compare trends across subjects in the presence of unbalanced data (e.g. due to drop-outs). One approach in use is to approximate the observed temporal shape data by geodesics in shape space and, based on these, to estimate overall trends within groups. Geodesic models are attractive as they feature a compact representation (similar to the slope and intercept term in linear regression) and therefore allow for computationally efficient inference.
The intrinsic theory of least squares and geodesic regression has been introduced by Fletcher in (Fletcher, 2013). Derivation of the corresponding Euler-Lagrange equations for some important manifolds can be found in the work of Machado and Silva Leite (Machado and Silva Leite, 2007). For an overview of statistical analysis on Riemannian manifolds see the work of Huckemann and Hotz (Huckemann and Hotz, 2014) and Pennec (Pennec, 2006).
An additional challenge in the analysis of shape trajectories is to distinguish between morphological differences due to (i) temporal shape evolutions of a single individual and (ii) the geometric variability in a population of an object class under study. To obtain a statistically significant localization of structural changes at the population level (group-wise statistics), the subject-specific trajectories need to be transferred in a standard reference frame. Among the different techniques proposed for normalizing longitudinal deformations (Rao et al., 2004;Bossa et al., 2010), constructions based on parallel transport provide the most natural approach and have shown superior sensitivity and stability in the context of diffeomorphic registration (Lorenzi et al., 2011). Note also that, for general trajectories, the simple transport of each shape is not suitable because the distances between the shapes are not preserved. However, if the shapes belong to the same geodesic, this problem does not arise, which is another advantage of geodesic regression.
As parallel transport in curved shape spaces is rarely given in closed form, in general it has to be approximated numerically, e.g. employing Schild's ladder (Lorenzi et al., 2011) for fanning (Louis et al., 2018). For shapes in 2D, Kendall's shape space is isomorphic to the projective space, which is a symmetric space, so that the essential geometric quantities are well known (cf. (Huckemann et al., 2010) and (Fletcher, 2013)). However, for three and more dimensions, because of less restrictive structure, many questions remain open. Utilizing closed form expressions of the pre-shape sphere, we reduce parallel transport to the solution of a homogeneous first-order differential equation that allows for highly efficient computations. Moreover, we reduce the important case of parallel transport along a geodesic path to the solution of a low-dimensional equation that only depends on the dimension of the ambient space and not on the spatial resolution of the discrete representation.
The paper is organized as follows. In Section 2, after a short overview of Kendall's shape space, we provide a computationally efficient approach (via the so-called Sylvester equation) for the canonical decomposition of tangent vectors into horizontal and vertical components, which is essential for the geometry and analysis of shapes and trajectories. Moreover, we determine parallel transport and Jacobi fields, which will be employed for geodesic regression. Parallel transport is essential for statistical normalization, alignment of trajectories and also computation of Jacobi fields. The latter describes the variability of trajectories that will be modeled as best-fitting geodesics in Section 3, where we also present our algorithm for the computation of geodesic regression. In Section 4 we apply this algorithm to yield longitudinal statistical analysis of femur data from an epidemiological study dealing with osteoarthritis and discuss the numerical results.

Geodesic Analysis in Shape Space
A pre-shape is a k-ad of landmarks (i.e. particular points) in R m after removing translations and similarity transformations. A shape is a pre-shape with rotations removed. For a comprehensive introduction to Kendall's shape space and details on the subjects of this section, we refer to (Kendall et al., 1999). For the relevant tools from Riemannian geometry, we refer to (Gallot et al., 2005).

Shape Space
In the following we present a brief overview of Kendall's shape space, provide a computationally efficient method to determine horizontal and vertical components of tangent vectors of the pre-shape space, and also prove the corresponding equivariance under rotations. k i=1 x i = 0}, identified with M (m, k − 1), will be endowed with its canonical scalar product given by x, y = trace(xy t ). Denoting the Frobenius norm by · , we call the sphere S k m := {x ∈ R k m : x = 1} pre-shape space and endow it with the spherical Procrustes metric d(x, y) := arccos( x, y ). Now, the left action of SO m on S k m given by (R, x) → Rx defines an equivalence relation given by x ∼ y if and only if y = Rx for some R ∈ SO m . Kendall's shape space is defined as Σ k m = S k m /∼. Provided that k ≥ m + 1, the dimension of Σ k m is m(k − 1) − 1 2 m(m − 1) − 1. Now, denoting the canonical projection of ∼ by π, the induced distance between any two shapes π(x) and π(y) is given by where λ 1 ≥ · · · ≥ λ m−1 ≥|λ m | denote the pseudo-singular values of yx t . Denoting D j := {x ∈ S k m : rank(x) ≤ j}, it turns out that Σ k m,m := Σ k m \ π(D m−2 ) inherits a differential structure that is compatible with its quotient topology. Following (Kendall et al., 1999), we refer to π(D m−2 ) as the singular part of Σ k m . In particular, Σ k m is a strata of manifolds with varying dimensions and Σ k m,m is open and dense in Σ k m . Away from the singular part, the quotient map π is a Riemannian submersion with respect to the metric induced by the ambient Euclidean space. Moreover, for k ≥ 3, the shape space Σ k 1 (resp. Σ k 2 ) is isometric to the sphere (resp. projective space). We call x, y ∈ S k m well positioned, and write x ω ∼ y, if and only if yx t is symmetric and d(x, y) = d Σ (x, y). For each x, y ∈ S k m , there exists an optimal rotation R ∈ SO m such that x ω ∼ Ry. Note that R does not need to be unique. Let U denote a neighborhood in S k m with radius smaller then π/4 (the diameter of Σ k m is π/2) such that For x, y ∈ U the optimal rotation R is unique and the function We recall that, for a Riemannian submersion f : M → N and y ∈ N , f −1 (y) is a submanifold of M . For any x ∈ M , denoting the kernel of d x f by Ver x , the tangent space T x M to M at x admits an orthogonal decomposition T x M = Hor x ⊕ Ver x where Hor x and the Ver x are the so-called horizontal and vertical subspaces. Due to (Kendall et al., 1999) the vertical space at x ∈ S k m is given by and the horizontal space is given by We denote the vector space of m × m skew-symmetric real matrices by Skew m . Thus Ver x = Skew m · x. Furthermore, a smooth curve is called horizontal if and only if its tangent field is horizontal. Geodesics in the shape space are equivalence classes of horizontal geodesics. Now, let exp and log denote the exponential and logarithm map of the pre-shape space. For x ω ∼ y the geodesic from x to y given by with ϕ = arccos( x, y ), 0 ≤ t ≤ 1, is horizontal. Hence Φ realizes the minimizing geodesic from π(x) to π(y). The following result concerns determination and SO m -equivariance for horizontal and vertical projection.
Let ver x resp. hor x denote the restriction of vertical resp. horizontal projection to T x S k m .
(a) ver x (w) = Ax if and only if A solves the Sylvester equation Moreover, the above equation has a unique skew-symmetric solution if rank(x) ≥ m − 1.
Proof. For (a), let ver x (w) = Ax, i.e., w = u + Ax with ux t symmetric and A ∈ Skew m . A straightforward computation eliminating ux t implies that (2) holds. To prove the converse, let j := rank(x). Suppose without loss of generality that j > 1 and write are uniquely solvable, since x 1 x t 1 is invertible. Furthermore, the solution of the first equation is skewsymmetric, since its right-hand side is skew-symmetric. It follows that with A 0 ∈ Skew m−j arbitrary, is skew-symmetric and solves the Sylvester equation (2) Henceforth the superscript v (resp. h) denotes the vertical (resp. horizontal) component, i.e., for any i.e., horizontal and vertical projections are SO m -equivariant. Note that this property holds even if π(x) belongs to the singular part of the shape space. As appropriate for our applications and for brevity, unless otherwise specified, we restrict our data to the open and dense set S := {x ∈ S k m : rank(x) ≥ m − 1} on which π is a Riemannian submersion, thus the geometry of the shape space is mainly described by its horizontal lift in the pre-shape space. In particular, for x ∈ S the Sylvester equation (2) has a unique solution determining horizontal and vertical projections and the restriction of d x π to Hor x is an isometry of Euclidean vector spaces Hor x and T π(x) Σ k m,m . Denoting the covariant derivatives in the pre-shape and shape space by ∇ resp.∇, for horizontal vector fields X and Y we have In the following [ · , · ] denotes the Lie bracket in R k m , i.e., [U, V ] = DV (U ) − DU (V ) (D Euclidean). For the Euclidean derivative of a vector field W along a curve γ in R k m we use D dt and also for simplicity of notation a dot, i.e., ∇γW =Ẇ − Ẇ , γ γ if γ = 1, and D 2 W dt 2 =Ẅ , etc. We set 1 For the computation of the Fréchet mean (cf. (Huckemann et al., 2010) and (Pennec, 2006)) π(q) of the shapes π(q 1 ), · · · , π(q N ) with q i ∈ U, i.e., we apply Newton's method to Karcher's equation N i=1 Log x q i = 0 as follows. We search for the unique zeroq of the function f defined by and set A suitable initial value is the normalized Euclidean mean The total variance of q = (q 1 , · · · , q N ) reads Logqq i 2 .

Parallel Transport
Next, we derive formulas for parallel transport in the shape space and its relation to parallel transport in the pre-shape space. 2 We call a vector field W along a horizontal curve γ horizontally parallel (for brevity h-parallel ) if and only if W is horizontal and dπW is parallel along π • γ. In the following, we derive the differential equation for the h-parallelism of W and a corresponding constructive approach using a Sylvester equation in certain cases.
Proposition 2.2. Let γ : [0, τ ] → S be a smooth horizontal curve with initial velocity v, u a horizontal vector at x := γ(0) and W a vector field along γ with W (0) = u.
(b) Suppose that γ is a unit-speed geodesic. Then equation (4) reduces tȯ (c) Let Cv denote the orthogonal projection of u on Skew m · v, i.e. Cvv t + vv t C = uv t − vu t . Suppose that Cγ is horizontal. If γ is a unit-speed geodesic, then the h-parallel transport of u is given by where U denotes the Euclidean parallel extension of u along γ, i.e., U (t) = u f.a. t. If y = γ(ϕ) with ϕ = d(x, y), then the h-parallel transport W y of u along γ to y reads Proof. (a) W is h-parallel if and only if dπ(∇γW ) = 0, i.e., infinitesimal variation of W must be vertical.
Moreover, SO m -equivariance of vertical projection implies the well-definedness, i.e., if dπW is parallel, then dπ(Rw) is parallel for all R ∈ SO m . Note that existence and uniqueness of the solution for (4) with W (0) = u is immediate from the existence and uniqueness of parallel transport and vertical projection. Now, W is horizontal if and only ifḟ = 0 where f := W γ t −γW t 2 + W, γ 2 , since f (0) = 0. If the equations (4) hold, theṅ The last equation follows from the fact that A is skew-symmetric and γγ t is symmetric, hence their product is trace-free. Now, we arrive at f = 0, i.e., W remains horizontal. To prove the converse, note that if W is horizontal, then f and thereforeḟ vanishes. HenceẆ γ t − γẆ t =γW t − Wγ t and W,γ + Ẇ , γ = 0 and the Sylvester equation for the vertical component ofẆ reads Aγγ t + γγ t A = γW t − Wγ t . Thus (4) follows.
(b) Note that W γ t andγγ t are symmetric andγ + γ = 0. Thus Wγ t = −W γ t is also symmetric. Now, (4) impliesȦ (c) Obviously W given by (6) satisfies the initial condition W (0) = u. Moreover, it satisfiesẆ = −Cγ − W,γ γ, i.e., (5) holds with A(t) = −C. To prove (7) (6). Note that, due to skew-symmetry of γγ t (∇γW )γ t , the differential equation for the h-parallel transport can also be written as Hence, a vector field along a curve in π(S) is parallel if and only if it has a horizontal lift satisfying the above equation. We mention two cases such that (6) and (7) apply. First, W coincides with the spherical parallel transport of u if and only if uv t = vu t or, equivalently, C = 0. Secondly, for planar shapes. To see this, let χ i and η i denote the rows of a shape χ and η a horizontal vector at χ. Fix µ ∈ R and is symmetric since η, χ = 0. Hence, Cγ is horizontal for arbitrary C ∈ Skew 2 .

Jacobi Fields
Next, we derive the differential equation for Jacobi fields and provide a constructive approach to its solution utilizing parallel transport.
We recall that a smooth horizontal curve γ in S k m is a geodesic if and only if π • γ is a geodesic in Σ k m . Hence, for a horizontal geodesic γ, any geodesic variation of π • γ in the latter space reads π • Γ with Γ a variation of γ through horizontal geodesic. Thus the variation field d ds (π•Γ(s, · ))| s=0 = dπ( d ds Γ(s, · )| s=0 ) is a Jacobi field of the shape space. Recall that a vector field J along γ is called normal if and only if J,γ = 0 and the tangential component of any Jacobi field is just given by (a + bt)γ(t) with a, b ∈ R, which is obviously horizontal. Thus the challenge is to find those normal vector fields that project to a Jacobi field in the shape space.
Theorem 2.3. Let J be a normal vector field along γ and denote (a) dπ(J) is a Jacobi field if and only if (b) A normal Jacobi field J S of the pre-shape sphere projects to a Jacobi field if and only if DK dt h = 0.
Proof. (a) Obviously, solutions of the equations are, due to SO m -equivariance of horizontal and vertical projection, invariant under SO m action. Let Y and Z be vector fields on S k m . Following (O'Neill, 1966), the A and T -tensor fields are defined as

Due to (O'Neill, 1967, Theorem 2), dπ(J) is a Jacobi field if and only if
where X =γ, D dt stands for ∇ X and the vector field K is given by Note that as J is normal, its covariant and Euclidean derivative coincide. In our setting, the fibers are totally geodesic, hence T ≡ 0. Therefore . Moreover, we may suppose X = 1.
In the following, we give a geometric construction for normal Jacobi fields which will be employed for geodesic regression, viz. those corresponding to horizontal variations of footpoint resp. tangent direction. In the sequel x = γ(0), ξ 1 , ξ 2 ∈ Hor x and C ∈ Skew m .
Theorem 2.4. Suppose that γ has unit speed. Let Y 1 and Y 2 be h-parallel with Y 1 (0) = ξ 1 and Y 2 (0) = ξ 2 . Then, the solution of the differential equations (9) where w i denotes the orthogonal projection of ξ i tangent to Skew m · v and u i its orthogonal complement, W i and U i are parallel extensions of w i resp. u i .
Proof. We may write γ(t) = cos(t)x + sin(t)v with v = 1, x, v = 0 and vx t symmetric. Fix C ∈ Skew m . The variation Γ(s, t) = exp(sC)γ(t) is a variation of γ through horizontal geodesics sinceΓΓ t is symmetric, and defines a vertical Jacobi field J v = dΓ ds (0) = Cγ. As any Jacobi field is a linear combination of parallel resp. h-parallel vector fields, and those vector fields conserve orthogogonality and length, we may assume that ξ 1 = ξ 2 = 1. Then, Γ 1 (s, t) = cos(t)(cos(s)x + sin(s)ξ 1 ) + sin(t)v, and Γ 2 (s, t) = cos(t)x + sin(t)(cos(s)v + sin(s)ξ 2 ). are also variations by horizontal geodesics. Now, let Y 1 and Y 2 be unit h-parallel vector fields with initial values ξ 1 and ξ 2 . The variation fields of Γ 1 and Γ 2 read cos(t)Y 1 and sin(t)Y 2 . Hence J h = cos(t)Y 1 + sin(t)Y 2 is the horizontal Jacobi field with J h (0) = ξ 1 and DJ dt (0) = ξ 2 . To prove the second expression for J h , note that W i (t) = λ i Cv with some λ i ∈ R. The vector field W i is normal and parallel, since Cv, v = Cv, x = 0. Its orthogonal complement U i is also normal and parallel. Furthermore u i v t and u i x t are symmetric. Hence u i γ t is symmetric. Implying the given representation of J h , a straightforward computation shows thatΓ i Γ t i is symmetric.
We recall that for m = 2, the shape space is isometric to the complex projective space endowed with its standard (Fubini-Study) metric. The second formula for J h in this case is well-known (cf. (Fletcher, 2013) and (Jost, 2017)).

Geodesic Regression
In the following, we employ the results of the previous section to derive an efficient and robust approach for finding the relation between an independent scalar variable, i.e. time, and a dependent shape-valued random variable.
Regression analysis is a fundamental tool for the spatiotemporal modeling of longitudinal observations. Given scalars t 1 < t 2 < · · · < t N and distinct pre-shapes q 1 , · · · , q N , the goal of geodesic regression is to find a geodesic curve in shape space that best fits the data in a least-squares sense. In particular for a horizontal geodesic γ from x to y with v =γ(0), we define the misfit between the data and the geodesic as a sum of squared distances with respect to d Σ , i.e.
We can assume that t 1 = 0 and t N = 1. While the authors of (Fletcher, 2013) and (Machado and Silva Leite, 2007) identify geodesics by their initial point and velocity -and hence they consider F (x, v) -we use for the identification their endpoints, i.e., we consider The reason is that geodesic computations in terms of the function Φ defined in equation (1), the socalled slerp (spherical linear interpolation), are more efficient. Model estimation is then formulated as the least-squares problem (x * , y * ) = arg min (x,y) F (x, y), x ω ∼ y.
In the absence of an analytic solution, the regression problem has to be solved numerically. To this end, we employ a Riemannian trust-regions solver (Boumal et al., 2014) with a Hessian approximation based on finite differences and use (q 1 , ω(q 1 , q N )) as initial guess. Having in mind that (cf. (Pennec, 2006) and (Jost, 2017)) where ρ y (x) := d 2 Σ (x, y), the gradient of the cost function F can be computed using Jacobi fields, since they express the derivatives of the exponential map and therefore those of Φ. Now, for fixed q and t, let ∇ x f denote the gradient of f with respect to x where f (x, y) := ρ q • Φ(t, x, y). Then for any u ∈ Hor x , d x Exp x tv · u = J(t) where J denotes the Jacobi field along γ with J(0) = u and J(1) = 0. Denoting the h-parallel extension of u along γ by U and its normal resp. tangential component by U ⊥ resp. U , Theorem 2.4 implies that (in terms of the parametrization given by (1)) Let P x denote the h-parallel transport to x along γ. In view of (12), − 1 2 P γ(t) ∇ x f (x, y) is the adjoint of the mapping u → J(t). As the latter is self-adjoint it follows that where W = Log γ(t) q. Using similar arguments for the gradient of f with respect to y yields where W i = Log γ(t) q i . Now, our algorithm for geodesic regression can be summarized as follows.

Application to Epidemiological Data
In this section, we analyze the morphological variability in longitudinal data of human distal femora in order to quantify shape changes that are associated with femoral osteoarthritis.

Data Description
We apply the derived scheme to the analysis of group differences in longitudinal femur shapes of subjects with incident and developing osteoarthritis (OA) versus normal controls. An overview of OA-related dysmorphisms is shown in Figure 1. The dataset is derived from the Osteoarthritis Initiative (OAI), which is a longitudinal study of knee osteoarthritis maintaining (among others) clinical evaluation data and radiological images from 4,796 men and women of age 45-79. The data are available for public access at http://www.oai.ucsf.edu/. From the OAI database, we determined three groups of shapes trajectories: HH (healthy, i.e. no OA), HD (healthy to diseased, i.e. onset and progression to severe OA), and DD (diseased, i.e. OA at baseline) according to the Kellgren-Lawrence score (Kellgren and Lawrence, 1957) of grade 0 for all visits, an increase of at least 3 grades over the course of the study, and grade 3 or 4 for all visits, respectively. We extracted surfaces of the distal femora from the respective 3D weDESS MR images (0.37×0.37 mm matrix, 0.7 mm slice thickness) using a state-of-the-art automatic segmentation approach (Ambellan et al., 2018). For each group, we collected 22 trajectories (all available data for group DD minus a record that exhibited inconsistencies, and the same number for groups HD and HH, randomly selected), each of which comprises shapes of all acquired MR images, i.e. at baseline, the 12-, 24-, 36-, 48-and 72-month visits. In a supervised post-process, the quality of segmentations as well as the correspondence of the resulting meshes (8,988 vertices) were ensured.

Geodesic Modeling of Femoral Trajectories
We apply the geodesic regression approach detailed in Section 3 to the femoral shape trajectories described above and represented in Kendall's shape space. Due to the expressions derived for the parallel  transport and Jacobi fields, we can fully leverage the geometry using Riemannian optimization procedures (cf. (Absil et al., 2007)). In particular, for the intrinsic treatment of the optimization problem underlying the geodesic regression we use the open-source Matlab toolbox manopt (Boumal et al., 2014).
In our experiments, we observed a superlinear convergence of the intrinsic trust-region solver for most of the shape trajectories. Solving the high-dimensional (54k degrees of freedom) regression problem on a laptop computer with Intel Core i7-7500U (2 × 2.70GHz) CPU took about 0.3s on average. In contrast, the generic Matlab routine for nonlinear regression (viz. fitnlm) required about 25s to determine a solution, thus being two orders of magnitude slower. The resulting estimated geodesics along with the original trajectories are visualized in Figure 2. The geodesic representation provides a less cluttered visualization of the trajectory population making it easier to identify trends within as well as across groups. For 2-dimensional visualization we perform dimension reduction for the trajectories X 1 , · · · , X k with X j = (x j 1 , · · · , x j n ), i.e. we apply tangent PCA to (x j i ) j=1,··· ,k i=1,··· ,n at the mean of all baseline shapes in HH. In the remainder, the latter is referred to as reference shape Ref.
Next we would like to answer the question of how well the observed data is replicated by the estimated geodesic trends. A common approach to test this is to compute the coefficient of determination, denoted as R 2 , that is the proportion of the total variance in the data explained by the model. Following (Fletcher, 2013), a generalization to manifolds is defined as with F (γ) and G(x) as defined in equations (11) and (3), respectively. As the unexplained variance cannot exceed the total variance (since the Fréchet mean lies in the search space of the regression problem) and both variances are nonnegative, R 2 must lie in the interval [0, 1] (with larger values indicating a higher proportion of the variance being explained by the model).
The coefficients of determination were computed for all estimated trends amounting to group-wise medians (95% confidence intervals) of 0.40 (0.33-0.46), 0.55 (0.48-0.63), and 0.51 (0.40-0.72) for group HH, DD, and HD, respectively. While for all groups the geodesic model is able to describe a relatively large portion of the shape variability, there is a clear difference between the control group HH and the groups DD and HD associated to osteoarthritis. In particular, pairwise Mann-Whitney U tests confirm that the differences are highly unlikely due to random chance (with p-values of 0.000, and 0.005 for HH v. DD, and HH v. HD, respectively). These findings indicate that the OA-related shape variability is better caputured by a single variable (time) than the physiological trends in HH.
Based on the coefficient of determination we also test for the significance of the estimated trends employing permutation tests as suggested in (Fletcher, 2013). For each of the trajectories we performed 1,000 permutations and considered the results as statistically significant for p-values less than 0.01. In almost all cases (63 out of 66) the trends were significant, such that we can expect them to be highly unlikely due to random chance.

Group-wise Analysis of Longitudinal Trends
Algorithm 2 Transport of shape trajectory Input: Pre-shapes x 1 , · · · , x n , Ref Output: Transported pre-shapes y 1 , · · · , y n y 1 ← Ref for k = 1, · · · , n − 1 do v k ← Log x k x k+1 y k+1 ← Exp(y k , P arT rans(y k , v k )) end for In order to perform group-wise analysis of longitudinal shape changes we compare the estimated geodesic trends of the femoral trajectories. This requires the consistent integration of intra-and intersubject variability in order obtain statistically significant localization of changes at the population level. In fact, the comparison of longitudinal shape changes is usually performed after normalizing (i.e. transporting) them into a common system of coordinates (see (Lorenzi and Pennec, 2014) and the references therein). Such a normalization can be realized by adapting parallel transport as presented in Algorithm 2. In particular, for geodesic trends this scheme reduces to parallel transport of the subject-specific velocity along the baseline-to-reference shape geodesic. The group-wise longitudinal progression was modeled as the mean of the normalized velocities. The areas of significant differences between longitudinal changes were investigated by two-sample Hotelling's T 2 tests on the vertex-wise displacements corresponding to the transported velocity-fields. While the displacements differ significantly (p < 0.01) between normal controls and the OA groups (for 60% and 35% of the vertices for HH v. HD and HH v. DD, respectively), there are almost no differences (1%) in the longitudinal changes in-between both OA groups. Figure 3 shows a qualitative visualization of the group tests in terms of the magnitude of the difference between the group-wise means. Visible are changes along the ridge of the cartilage plate (characteristic regions for osteophytic growth, cf. Figure 1) in comparison of both HH v. HD as well as HH v. DD, albeit the latter are less pronounced suggesting a saturation of morphological developments. Additionally, the changes are more developed on the medial compartment, which is in line with previous findings (Vincent et al., 2012).
While velocities are constant for subject-specific geodesics, their parallel transport depends on the path (an effect called holonomy). To investigate this path dependency, we repeated the above experiment using different paths for the HD group. In particular, we chose the shape at the onset time (transition time to severe OA, vis. Kellgren-Lawrence score ≥ 3) as the initial point for the transport path. In line with previous work (Lorenzi and Pennec, 2013), we found that the results are not sensitive to the path. More precisely, the results of the group tests agreed for 99.71% and 99.97% of the vertices for HH v. HD and HD v. DD, respectively.

Concluding Remarks
This work presented characterizations of and computationally efficient methods for the determination of parallel transport, Jacobi fields and geodesic regression of data represented as shapes in Kendall's space. Furthermore, an application to longitudinal statistical analysis of epidemiological data (femur data for analysis of knee osteoarthritis) has been shown. An advantage of modeling trajectories by geodesics is the following: A main task in longitudinal analysis is to translate trajectories to start at a reference shape. The intermediate distances between the shapes of a geodesic are preserved by parallel transport, which is not the case for general transports. Moreover, data inconsistencies are minimized by considering the best-fitting geodesics, and Jacobi fields can be employed to analyze the variability of the geodesics, hence providing a canonical descriptor of trends and differences for the trajectories. There are many potential avenues for future work. First, we would like to use the presented methodology within the mixed-effect framework (see e.g. (Bône et al., 2018)), which provides a joined analysis of longitudinal and cross-sectional variability. In particular, group-wise means of the geodesics can be computed with respect to a natural metric in the tangent bundle (e.g. the Sasaki metric) to determine the group parameters as described in (Muralidharan and Fletcher, 2012). Second, an extension of the method to higher-dimensional longitudinal parameters instead of just time can be examined, to achieve even more differentiated results. Third, spline regression poses a natural generalization providing more degrees of freedom.
On the application side, based on the results found, it can be said in summary that the shape trajectories of the healthy subjects expose significantly different temporal changes than those found in groups with incident and developing OA. Our analysis delivered detailed insights into the complex morphological changes that fit medical knowledge. It seems possible to make a correct assignment to one of the three groups based on just two measurements. The aim of further investigations must be to substantiate this statement, by determining with what reliability a prediction can be made about the onset of knee osteoarthritis depending on the baseline shape and trend as well as the sensitivity of the latter with respect to the number of observations made and the time intervals between them.