1 Introduction

Many classification, segmentation, and tracking tasks in computer vision and digital image processing require some form of “symmetry.” Think, for example, of image classification. If one rotates, reflects, or translates an image, the classification stays the same. We say that an ideal image classification is invariant under these symmetries. A slightly different situation is image segmentation. In this case, if the input image is changed in some way, the output should change accordingly. Therefore, an ideal image segmentation is equivariant with respect to these symmetries.

Many computer vision and image processing problems are currently being tackled with neural networks (NNs). It is desirable to design neural networks in such a way that they respect the symmetries of the problem, i.e., make them invariant or equivariant. Think, for example, of a neural network that detects cancer cells. It would be disastrous if, for example, slightly translating an image caused the neural network to give a totally different diagnosis, even though the input is essentially the same.

Fig. 1

The difference between a traditional CNN layer and a PDE-G-CNN layer. In contrast to traditional CNNs, the layers in a PDE-G-CNN do not depend on ad hoc nonlinearities like ReLUs, and are instead implemented as solvers of (non)linear PDEs. What the PDE evolution block consists of can be seen in Fig. 2

One way to make the networks equivariant or invariant is to simply train them on more data. One could take the training dataset and augment it with translated, rotated, and reflected versions of the original images. This approach, however, is undesirable: invariance or equivariance is still not guaranteed, and the training takes longer. It would be better if the networks were inherently invariant or equivariant by design. This avoids a waste of network capacity, guarantees invariance or equivariance, and increases performance, see for example [1].

More specifically, many computer vision and image processing problems are tackled with convolutional neural networks (CNNs) [2,3,4]. Convolutional neural networks have the property that they inherently respect, to some degree, translation symmetries. CNNs do not, however, take into account rotational or reflection symmetries. Cohen and Welling introduced group equivariant convolutional neural networks (G-CNNs) in [5] and designed a classification network that is inherently invariant under 90 degree rotations, integer translations, and vertical/horizontal reflections. Much work is being done on invariant/equivariant networks that exploit inherent symmetries; a non-exhaustive list is [1, 6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]. The idea of including geometric priors, such as symmetries, into the design of neural networks is called ‘Geometric Deep Learning’ in [27].

In [28], partial differential equation (PDE)-based G-CNNs are presented, aptly called PDE-G-CNNs. In fact, G-CNNs are shown to be a special case of PDE-G-CNNs (if one restricts the PDE-G-CNNs only to convection, using many transport vectors [28, Sec. 6]). With PDE-G-CNNs, the usual nonlinearities that are present in current networks, such as the ReLU activation function and max-pooling, are replaced by solvers for specifically chosen nonlinear evolution PDEs. Figure 1 illustrates the difference between a traditional CNN layer and a PDE-G-CNN layer.

The PDEs that are used in PDE-G-CNNs are not chosen arbitrarily: they come directly from the world of geometric image analysis, and thus their effects are geometrically interpretable. This makes PDE-G-CNNs more geometrically meaningful and interpretable than traditional CNNs. Specifically, the PDEs considered are diffusion, convection, dilation, and erosion. These 4 PDEs correspond to the common notions of smoothing, shifting, max pooling, and min pooling. They are solved by linear convolutions, resamplings, and so-called morphological convolutions. Figure 2 illustrates the basic building block of a PDE-G-CNN.

Fig. 2

Overview of a PDE evolution block. Convection is solved by resampling, diffusion is solved by a linear group convolution with a certain kernel [28, Sec. 5.2], and dilation and erosion are solved by morphological group convolutions (3) with a morphological kernel (1)

One shared property of G-CNNs and PDE-G-CNNs is that the input data usually needs to be lifted to a higher dimensional space. Take, for example, the case of image segmentation with a convolutional neural network where we model/idealize the images as real-valued functions on \(\mathbb {R}^2\). If we keep the data as functions on \(\mathbb {R}^2\) and want the convolutions within the network to be equivariant, then the only convolutions that are allowed are those with isotropic kernels [29, p. 258]. This type of shortcoming generalizes to other symmetry groups as well [12, Thm. 1]. One can imagine that this constraint is too restrictive to work with, and that is why we lift the image data.

Within the PDE-G-CNN framework, the input images are considered real-valued functions on \(\mathbb {R}^d\), the desired symmetries are represented by the Lie group of roto-translations SE(d), and the data is lifted to the homogeneous space of d dimensional positions and orientations \(\mathbb {M}_d\). It is on this higher dimensional space that the evolution PDEs are defined, and the effects of diffusion, dilation, and erosion are completely determined by the Riemannian metric tensor field \(\mathcal {G}\) that is chosen on \(\mathbb {M}_d\). If this Riemannian metric tensor field \(\mathcal {G}\) is left-invariant, the overall processing is equivariant; this follows by combining techniques in [30, Thm. 21, Chpt. 4], [31, Lem. 3, Thm. 4].

The Riemannian metric tensor field \(\mathcal {G}\) we will use in this article is left-invariant and determined by three nonnegative parameters: \(w_1\), \(w_2\), and \(w_3\). The definition can be found in the preliminaries, Sect. 2, Equation (4). It is exactly these three parameters that are optimized during the training of a PDE-G-CNN. Intuitively, the parameters correspondingly regulate the cost of main spatial, lateral spatial, and angular motion. An important quantity in the analysis of this paper is the spatial anisotropy \(\zeta := \frac{w_2}{w_1}\), as will become clear later.

In this article, we only consider the two-dimensional case, i.e., \(d=2\). In this case, the elements of both \(\mathbb {M}_2\) and SE(2) can be represented by three real numbers: \((x,y,\theta ) \in \mathbb {R}^2 \times [0,2\pi )\). In the case of \(\mathbb {M}_2\), the x and y represent a position and \(\theta \) represents an orientation. Throughout the article, we take \(\textbf{p}_0:= (0,0,0) \in \mathbb {M}_2\) as our reference point in \(\mathbb {M}_2\). In the case of SE(2), we have that x and y represent a translation and \(\theta \) a rotation.

As already stated, within the PDE-G-CNN framework images are lifted to the higher dimensional space of positions and orientations \(\mathbb {M}_d\). There are a multitude of ways of achieving this, but there is one very natural way to do it: the orientation score transform [30, 32,33,34]. In this transform, we pick a point \((x,y) \in \mathbb {R}^2\) in an image and determine how well a certain orientation \(\theta \in [0, 2\pi )\) fits the chosen point. In Fig. 3 an example of an orientation score is given. We refer to [34, Sec. 2.1] for a summary of how an orientation score transform works.

Fig. 3

An example of an image together with its orientation score. We can see that the image, a real-valued function on \(\mathbb {R}^2\), is lifted to an orientation score, a real-valued function on \(\mathbb {M}_2\). Notice that the lines that are crossing in the left image are disentangled in the orientation score

Inspiration for using orientation scores comes from biology. The Nobel laureates Hubel and Wiesel found that many cells in the visual cortex of cats have a preferred orientation [35, 36]. Moreover, a neuron that fires for a specific orientation excites neighboring neurons that have an “aligned” orientation. Petitot and Citti-Sarti proposed a model [37, 38] for the distribution of the orientation preference and this excitation of neighbors based on sub-Riemannian geometry on \(\mathbb {M}_2\). They relate the phenomenon of preference of aligned orientations to the concept of association fields [39], which model how a specific local orientation places expectations on surrounding orientations in human vision. Figure 4 provides an impression of such an association field.

Fig. 4

Association field lines from neurogeometry [37, Fig. 43], [39, Fig. 16]. Such association field lines can be well approximated by spatially projected sub-Riemannian geodesics in \(\mathbb {M}_2\) [37, 38, 40, 41, 42, Fig. 17]

As shown in [42, Fig. 17], association fields are closely approximated by (projected) sub-Riemannian geodesics in \(\mathbb {M}_2\) for which optimal synthesis has been obtained by Sachkov and Moiseev [43, 44]. Furthermore, in [45] it is shown that the Riemannian geodesics in \(\mathbb {M}_2\) converge to the sub-Riemannian geodesics by increasing the spatial anisotropy \(\zeta \) of the metric. This shows that in practice one can approximate the sub-Riemannian model by Riemannian models. Figure 5 shows the relation between association fields and sub-Riemannian geometry in \(\mathbb {M}_2\).

Fig. 5

A visualization of the exact Riemannian distance d, and its relation with association fields. In Fig. 5a, we see isocontours of \(d(\textbf{p}_0, \cdot )\) in \(\mathbb {M}_2\), and on the bottom we see the min-projection over \(\theta \) of these contours (thus we selected the minimal ending angle in contrast to Fig. 4). The domain of the plot is \([-3,3]^2\times [-\pi ,\pi ) \subset \mathbb {M}_2\). The chosen contours are \(d = 0.5, 1, 1.5, 2\), and 2.5. The metric parameters are \((w_1,w_2,w_3)=(1,64,1)\). Due to the very high spatial anisotropy, we approach the sub-Riemannian setting. In Fig. 5b, we see the same min-projection together with some corresponding spatially projected geodesics

Fig. 6

One sample of the Lines dataset. In Fig. 6a, we see the input, in Fig. 6b the perceived curve that we consider as ground-truth (as the input is constructed by interrupting the ground-truth line and adding random local orientations)

Fig. 7

The overall architecture for a PDE-G-CNN performing line completion on the Lines data set. Note how the input image is lifted to an orientation score that lives in the higher dimensional space \(\mathbb {M}_2\), run through PDE-G-CNN layers (Figs. 1 and 2), and afterwards projected back down to \(\mathbb {R}^2\). Usually this projection is done by taking the maximum value of a feature map over the orientations \(\theta \), for every position \((x,y) \in \mathbb {R}^2\)

Fig. 8

Visualization of how a PDE-G-CNN and CNN incrementally complete a line throughout their layers. The first two rows are of a PDE-G-CNN, the second two rows of a CNN. The first column is the input, the last column the output. The intermediate columns are a representative selection of feature maps from the output of the respective CNN or PDE layer (Fig. 1). The feature maps of the PDE-G-CNN live in \(\mathbb {M}_2\), but for clarity we only show the max-projection over \(\theta \). Within the feature maps of the PDE-G-CNN association fields from neurogeometry [37, 39, 46] become visible as network depth increases. Such merging of association fields is not visible in the feature maps of the CNN. This observation is consistent throughout different inputs

The relation between association fields and Riemannian geometry on \(\mathbb {M}_2\) directly extends to a relation between dilation/erosion and association fields. Namely, performing dilation on an orientation score in \(\mathbb {M}_2\) is similar to extending a line segment along its association field lines. Similarly, performing erosion is similar to sharpening a line segment perpendicular to its association field lines. This makes dilation/erosion the perfect candidate for a task such as line completion.

In the line completion problem, the input is an image containing multiple line segments, and the desired output is an image of the line that is “hidden” in the input image. Figure 6 shows such an input and desired output. This is also what David Field et al. studied in [39]. We anticipate that PDE-G-CNNs outperform classical CNNs in the line completion problem due to PDE-G-CNNs being able to dilate and erode. To investigate this, we made a synthetic dataset called “Lines” consisting of grayscale \(64\times 64\) pixel images, together with their ground-truth line completion. In Fig. 7, a complete abstract overview of the architecture of a PDE-G-CNN performing line completion is visualized. Figure 8 illustrates how a PDE-G-CNN and CNN incrementally complete a line throughout their layers.

In Proposition 1, we show that solving the dilation and erosion PDEs can be done by performing a morphological convolution with a morphological kernel \(k_t^{\alpha }: \mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\), which is easily expressed in the Riemannian distance \(d=d_{\mathcal {G}}\) on the manifold:

$$\begin{aligned} k_t^{\alpha }(\textbf{p})=\frac{t}{\beta } \left( \frac{d_{\mathcal {G}}(\textbf{p}_0,\textbf{p})}{t}\right) ^{\beta }. \end{aligned}$$
(1)

Here \(\textbf{p}_0 = (0,0,0)\) is our reference point in \(\mathbb {M}_2\), and time \(t>0\) controls the amount of erosion and dilation. Furthermore, \(\alpha >1\) controls the “softness” of the max and min-pooling, with \(\frac{1}{\alpha }+\frac{1}{\beta }=1\). Erosion is done through a direct morphological convolution (3) with this specific kernel. Dilation is solved in a slightly different way but again with the same kernel (Proposition 1 in Sect. 3 will explain the details).

And this is where a problem arises: calculating the exact distance d on \(\mathbb {M}_2\) required in (1) is computationally expensive [47]. To alleviate this issue, we resort to estimating the true distance d with computationally efficient approximative distances, denoted throughout the article by \(\rho \). We then use such a distance approximation within (1) to create a corresponding approximative morphological kernel, and in turn use this to efficiently calculate the effect of dilation and erosion.

In [28], one such distance approximation is used: the logarithmic distance estimate \(\rho _c\) which uses the logarithmic coordinates \(c^i\) (8). In short, \(\rho _c(\textbf{p})\) is equal to the Riemannian length of the exponential curve that connects \(\textbf{p}_0\) to \(\textbf{p}\). The formal definition will follow in Sect. 4. In Fig. 9 an impression of \(\rho _c\) is given.

Clearly, an error is made when the effect of erosion and dilation is calculated with an approximative morphological kernel. As a morphological kernel is completely determined by its corresponding (approximative) distance, it follows that one can analyze the error by analyzing the difference between the exact distance d and approximative distance \(\rho \) that is used.

Although it is shown in [28] that \(d \le \rho _c\), no concrete bounds are given, apart from the asymptotic \( \rho _c^2 \le d^2 + \mathcal {O}(d^4) \). This motivates us to do a more in-depth analysis of the quality of the distance approximations.

We introduce a variation on the logarithmic estimate \(\rho _c\) called the half-angle distance estimate \(\rho _b\), and analyze it. The half-angle approximation uses not the logarithmic coordinates but the half-angle coordinates \(b^i\). The definition of these is also given later (28). In practice, \(\rho _c\) and \(\rho _b\) do not differ much, but analyzing \(\rho _b\) is much easier!

The main theorem of the paper, Theorem 1, collects new theoretical results that describe the quality of using the half-angle distance approximation \(\rho _b\) for solving dilation and erosion in practice. It relates the approximative morphological kernel \(k_b\) corresponding to \(\rho _b\) to the exact kernel k (1).

Both the logarithmic estimate \(\rho _c\) and the half-angle estimate \(\rho _b\) approximate the true Riemannian distance d quite well in certain cases. One of these cases is when the Riemannian metric has a low spatial anisotropy \(\zeta \). We can show this visually by comparing the isocontours of the exact and approximative distances. However, interpreting and comparing these surfaces can be difficult. This is why we have decided to additionally plot multiple \(\theta \)-isocontours of these surfaces. In Fig. 10, one such plot can be seen; it illustrates how such plots must be interpreted.

Fig. 9

A visualization of \(\rho _c\), similar to Fig. 5. In Fig. 9a, we see multiple contours of \(\rho _c\), and on the bottom we see the min-projection over \(\theta \). The metric parameters are \((w_1,w_2,w_3)=(1,4,1)\). In Fig. 9b, we see the same min-projection together with some corresponding spatially projected exponential curves. Note the similarity to Fig. 4

Fig. 10

In grey, the isocontour \(d=2.5\) is plotted. The metric parameters are \((w_1,w_2,w_3)=(1,8,1)\). For \(\theta = k\pi /10\) with \( k = -10,\dots ,10 \), the isocontours are drawn and projected onto the bottom of the figure. The same kind of visualization is used in Tables 1 and 2

Table 1 The balls of the exact distance d and approximative distance \(\rho _b\) in the isotropic and low anisotropic case. The radius of the balls is set to \(r = 2.5\). The domain of the plots is \([-3,3]\times [-3,3]\times [-\pi ,\pi )\). We fix \(w_1=w_3=1\) throughout the plots and vary \(w_2\). For \(\theta = k\pi /10\) with \( k = -10,\dots ,10 \) the isocontours are drawn, similar to Fig. 10
Table 2 The same as Table 1 but in the high spatially anisotropic case. Alongside the approximation \(\rho _b\) the sub-Riemannian distance approximation \(\rho _{b,sr}\) is plotted with \(\nu = 1.6\). We see that the isocontours of \(\rho _b\) are too “thin” compared to the isocontours of d. The isocontours of \(\rho _{b,sr}\) are better in this respect

In Table 1, a spatially isotropic case \(\zeta = 1\) and a low-anisotropy case \(\zeta = 2\) are visualized. Note that \(\rho _b\) approximates d well in these cases. In fact, \(\rho _b\) is exactly equal to the true distance d in the spatially isotropic case, which is not true for \(\rho _c\).

Both the logarithmic and the half-angle approximation fail specifically in the high spatial anisotropy regime, for example when \(\zeta = 8\). The first two columns of Table 2 show that, indeed, \(\rho _b\) is no longer a good approximation of the exact distance d. For this reason, we introduce a novel sub-Riemannian distance approximation \(\rho _{b, sr}\), which is visualized in the third column of Table 2.

Finally, we propose an approximative distance \(\rho _{com}\) that carefully combines the Riemannian and sub-Riemannian approximations into one. This combined approximation automatically switches to the estimate that is more appropriate depending on the spatial anisotropy, and hence covers both the low and high anisotropy regimes. Using the corresponding morphological kernel of \(\rho _{com}\) to solve erosion and dilation, we obtain more accurate (and still tangible) solutions of the nonlinear parts in the PDE-G-CNNs.

For every distance approximation (listed in Sect. 4), we perform an empirical analysis in Sect. 6 by seeing how the estimate changes the performance of the PDE-G-CNNs when applied to two datasets: the Lines dataset and the publicly available DCA1 dataset.

1.1 Contributions

In Proposition 1, we summarize how the nonlinear units in PDE-G-CNNs (described by morphological PDEs) are solved using morphological kernels and convolutions, which provides sufficient and essential background for the discussions and results in this paper.

The key contributions of this article are:

  • Theorem 1 summarizes our mathematical analysis of the quality of the half-angle distance approximation \(\rho _b\) and its corresponding morphological kernel \(k_b\) in PDE-G-CNNs. We do this by comparing \(k_b\) to the exact morphological kernel k. Globally, one can show that they both carry the same symmetries, and that for low spatial anisotropies \(\zeta \) they are almost indistinguishable. Furthermore, we show that locally both kernels are similar through an upper bound on the relative error. This improves upon results in [28, Lem. 20].

  • Table 2 demonstrates qualitatively that \(\rho _b\) becomes a poor approximation when the spatial anisotropy is high \(\zeta \gg 1\). In Corollary 4, we underpin this theoretically and in Sect. 6.1 we validate this observation numerically. This motivates the use of a sub-Riemannian approximation when \(\zeta \) is large.

  • In Sect. 4, we introduce and derive a novel sub-Riemannian distance approximation \(\rho _{sr}\) that overcomes difficulties in previously existing sub-Riemannian kernel approximations [48]. Subsequently, we propose our approximation \(\rho _{com}\) that combines the Riemannian and sub-Riemannian approximations into one that automatically switches to the approximation that is more appropriate depending on the metric parameters.

  • Figures 16 and 19 show that PDE-G-CNNs perform just as well as, and sometimes better than, G-CNNs and CNNs on the DCA1 and Lines datasets, while having the fewest parameters. Figures 20 and 17 depict an evaluation of the performance of PDE-G-CNNs when using the different distance approximations, again on the DCA1 and Lines datasets. We observe that the new approximation \(\rho _{b,com}\) provides the best results.

Our theoretical contributions are also relevant outside the context of geometric deep learning. Namely, they also apply to general geometric image processing [48], neurogeometry [37, 38], and robotics [49, Sec. 6.8.4].

In addition, Figs. 4, 5, 9 and 8 show a connection between the PDE-G-CNN framework and the theory of association fields from neurogeometry [37, 39]. Thereby, PDE-G-CNNs reveal improved geometrical interpretability in comparison with existing convolutional neural networks. In Appendix 1, we further clarify the geometrical interpretability.

1.2 Outline

In Sect. 2, a short overview of the necessary mathematical preliminaries is given. Section 3 collects some known results on the exact solution of erosion and dilation on the homogeneous space of two-dimensional positions and orientations \(\mathbb {M}_2\), and motivates the use of morphological kernels. In Sect. 4, all approximative distances are listed. The approximative distances give rise to corresponding approximative morphological kernels. The main theorem of this paper can be found in Sect. 5 and consists of three parts, the proofs of which can be found in the relevant subsections. The main theorem mostly concerns itself with the analysis of the approximative morphological kernel \(k_b\). Experiments with the various approximative kernels are done and the results can be found in Sect. 6. Finally, we end the paper with a conclusion in Sect. 7.

2 Preliminaries

Coordinates on SE(2) and \(\mathbb {M}_2\). Let \(G = SE(2) = \mathbb {R}^2 \rtimes SO(2)\) be the two-dimensional rigid body motion group. We identify elements \(g \in G\) with \(g \equiv (x,y,\theta ) \in \mathbb {R}^2 \times \mathbb {R}/(2\pi \mathbb {Z})\), via the isomorphism \(SO(2) \cong \mathbb {R}/(2\pi \mathbb {Z})\). Furthermore, we always use the small-angle identification \( \mathbb {R}/(2\pi \mathbb {Z}) = [-\pi , \pi )\).

For \(g_1=(x_1, y_1, \theta _1)\), \(g_2 = (x_2, y_2, \theta _2) \in SE(2)\) we have the group product

$$\begin{aligned} \begin{aligned} g_1 g_2:= (&x_1 + x_2 \cos \theta _1 - y_2 \sin \theta _1, \\&y_1 + x_2 \sin \theta _1 + y_2 \cos \theta _1, \\&\theta _1 + \theta _2 {{\,\textrm{mod}\,}}2\pi ), \end{aligned} \end{aligned}$$

and the identity is \(e = (0,0,0)\). The rigid body motion group acts on the homogeneous space of two-dimensional positions and orientations \(\mathbb {M}_{2} = \mathbb {R}^2 \times S^1 \subseteq \mathbb {R}^2 \times \mathbb {R}^2\) by the left-action \(\odot \):

$$\begin{aligned} (\textbf{x},\textbf{R}) \odot (\textbf{y},\textbf{n})= (\textbf{x}+ \textbf{R}\textbf{y},\textbf{R}\textbf{n}), \end{aligned}$$

with \((\textbf{x},\textbf{R}) \in SE(2)\) and \((\textbf{y},\textbf{n}) \in \mathbb {M}_2\). If context allows it, we may omit writing \(\odot \) for conciseness. By choosing the reference element \(\textbf{p}_0 = (0,0,(1,0)) \in \mathbb {M}_2\), we have:

$$\begin{aligned} (x,y,\theta ) \odot \textbf{p}_0 = (x,y,(\cos \theta , \sin \theta )). \end{aligned}$$
(2)

This mapping is a diffeomorphism and allows us to identify SE(2) and \(\mathbb {M}_2\). Thereby we will also freely use the \((x,y,\theta )\) coordinates on \(\mathbb {M}_2\).
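For concreteness, a minimal NumPy sketch of the group product, inverse, and left action above (the function names and checks are our own, not part of any released PDE-G-CNN code):

```python
import numpy as np

def se2_product(g1, g2):
    """Group product on SE(2) in (x, y, theta) coordinates, cf. the formula above."""
    x1, y1, t1 = g1
    x2, y2, t2 = g2
    return np.array([x1 + x2 * np.cos(t1) - y2 * np.sin(t1),
                     y1 + x2 * np.sin(t1) + y2 * np.cos(t1),
                     np.mod(t1 + t2 + np.pi, 2 * np.pi) - np.pi])  # wrap to [-pi, pi)

def se2_inverse(g):
    """Inverse g^{-1} of g = (x, y, theta)."""
    x, y, t = g
    return np.array([-x * np.cos(t) - y * np.sin(t),
                      x * np.sin(t) - y * np.cos(t),
                     -t])

def se2_action(g, p):
    """Left action g ⊙ p of g ∈ SE(2) on p = (position, orientation) ∈ M_2."""
    x, y, t = g
    pos, ori = p
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    return (np.array([x, y]) + R @ pos, R @ ori)

# Check the identification (2): (x, y, θ) ⊙ p_0 = (x, y, (cos θ, sin θ)).
p0 = (np.zeros(2), np.array([1.0, 0.0]))
g = np.array([0.3, -1.2, 0.7])
pos, ori = se2_action(g, p0)
assert np.allclose(pos, g[:2]) and np.allclose(ori, [np.cos(g[2]), np.sin(g[2])])
assert np.allclose(se2_product(g, se2_inverse(g)), [0.0, 0.0, 0.0])
```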

Morphological group convolution. Given functions \(f_1,f_2:\mathbb {M}_2 \rightarrow \mathbb {R}\), we define their morphological convolution (or ‘infimal convolution’) [50, 51] by

$$\begin{aligned} (f_1 \mathbin {\square } f_2)(\textbf{p})= \inf \limits _{g \in G} \left\{ f_1(g^{-1} \textbf{p}) + f_2(g \, \textbf{p}_0)\right\} \end{aligned}$$
(3)

Left-invariant (co-)vector fields on \(\mathbb {M}_2\). Throughout this paper, we shall rely on the following basis of left-invariant vector fields:

$$\begin{aligned} \begin{aligned} \mathcal {A}_{1}&= \cos \theta \partial _x + \sin \theta \partial _y, \\ \mathcal {A}_{2}&= -\sin \theta \partial _x + \cos \theta \partial _y, \text { and }\\ \mathcal {A}_{3}&= \partial _{\theta }. \end{aligned} \end{aligned}$$

The dual frame \(\omega ^i\) is given by \(\langle \omega ^i, \mathcal {A}_{j}\rangle =\delta ^{i}_j\), i.e.,

$$\begin{aligned} \begin{aligned} \omega ^1&= \cos \theta \textrm{d}x + \sin \theta \textrm{d}y, \\ \omega ^2&= -\sin \theta \textrm{d}x +\cos \theta \textrm{d}y, \text { and } \\ \omega ^3&= \textrm{d}\theta . \end{aligned} \end{aligned}$$
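As a small numerical sanity check (with our own helper names), the duality \(\langle \omega ^i, \mathcal {A}_{j}\rangle =\delta ^{i}_j\) can be verified by expressing the frame and dual frame in the coordinate basis \((\partial _x, \partial _y, \partial _\theta )\):

```python
import numpy as np

def frame(p):
    """Left-invariant vector fields A_1, A_2, A_3 at p = (x, y, θ),
    as columns expressed in the coordinate basis (∂_x, ∂_y, ∂_θ)."""
    th = p[2]
    return np.array([[np.cos(th), -np.sin(th), 0.0],
                     [np.sin(th),  np.cos(th), 0.0],
                     [0.0,         0.0,        1.0]])

def coframe_components(p, pdot):
    """Components (⟨ω^1, ṗ⟩, ⟨ω^2, ṗ⟩, ⟨ω^3, ṗ⟩) of a tangent vector
    pdot = (dx, dy, dθ) at p."""
    th = p[2]
    dx, dy, dth = pdot
    return np.array([ np.cos(th) * dx + np.sin(th) * dy,
                     -np.sin(th) * dx + np.cos(th) * dy,
                      dth])

# Duality check ⟨ω^i, A_j⟩ = δ^i_j at an arbitrary point.
p = np.array([0.3, -0.7, 0.9])
A = frame(p)
M = np.array([coframe_components(p, A[:, j]) for j in range(3)])
assert np.allclose(M, np.eye(3))
```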

Metric tensor fields on \(\mathbb {M}_2\). We consider the following left-invariant metric tensor fields:

$$\begin{aligned} \begin{aligned} \mathcal {G}= \sum _{i=1}^{3} w_i^2 \ \omega ^{i} \otimes \omega ^i \end{aligned} \end{aligned}$$
(4)

and write \(\Vert {\dot{\textbf{p}}}\Vert =\sqrt{\mathcal {G}_{\textbf{p}}({\dot{\textbf{p}}},{\dot{\textbf{p}}})}\). Here, \(w_i > 0\) are the metric parameters. We also use the dual norm \(\Vert {\hat{\textbf{p}}}\Vert _* = \sup \limits _{{{\dot{\textbf{p}}}} \in T_\textbf{p}\mathbb {M}_2} \frac{\left\langle {{\dot{\textbf{p}}}}, {\hat{\textbf{p}}} \right\rangle }{\Vert {{\dot{\textbf{p}}}}\Vert }\). We will assume, without loss of generality, that \(w_2 \ge w_1\) and introduce the ratio

$$\begin{aligned} \zeta := \frac{w_2}{w_1} \ge 1 \end{aligned}$$
(5)

that is called the spatial anisotropy of the metric.

Distances on \(\mathbb {M}_2\). The left-invariant metric tensor field \(\mathcal {G}\) on \(\mathbb {M}_2\) induces a left-invariant distance (‘Riemannian metric’) \(d:\mathbb {M}_{2} \times \mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\) by

$$\begin{aligned} d_{\mathcal {G}}(\textbf{p},\textbf{q})= \inf _{\gamma \in \Gamma _t(\textbf{p},\textbf{q})}\left( L_{\mathcal {G}}(\gamma ):= \int _0^t \Vert {\dot{\gamma }}(s)\Vert _{\mathcal {G}}\, \textrm{d}s \right) , \end{aligned}$$
(6)

where \(\Gamma _t(\textbf{p}, \textbf{q})\) is the set of piecewise \(C^1\)-curves \(\gamma \) in \(\mathbb {M}_2\) with \(\gamma (0)=\textbf{p}\) and \(\gamma (t)=\textbf{q}\). The right-hand side does not depend on \(t>0\), and we may set \(t=1\).

If no confusion can arise, we omit the subscript \(\mathcal {G}\) and write \(d, L, \Vert \cdot \Vert \) for short. The distance being left-invariant means that for all \(g\in SE(2)\) and \(\textbf{p},\textbf{q} \in \mathbb {M}_2\) one has \(d(\textbf{p},\textbf{q})=d(g \textbf{p},g \textbf{q})\). We will often use the shorthand notation \(d(\textbf{p}):=d(\textbf{p}, \textbf{p}_0)\).

We often consider the sub-Riemannian case arising when \(w_2 \rightarrow \infty \). Then we have “infinite cost” for sideways motion and the only “permissible” curves \(\gamma \) are the ones for which \({{\dot{\gamma }}}(t) \in H\) where \(H:= \text {span}\{\mathcal {A}_1, \mathcal {A}_3\} \subset T\mathbb {M}_{2}\). This gives rise to a new notion of distance, namely the sub-Riemannian distance \(d_{sr}\):

$$\begin{aligned} d_{sr}(\textbf{p},\textbf{q})= \inf _{\begin{array}{c} \gamma \in \Gamma _t(\textbf{p},\textbf{q}), \\ {{\dot{\gamma }}} \in H \end{array}} L_{\mathcal {G}}(\gamma ). \end{aligned}$$
(7)

One can show rigorously that when \(w_2 \rightarrow \infty \) the Riemannian distance d tends to the sub-Riemannian distance \(d_{sr}\), see for example [45, Thm. 2].

Exponential and Logarithm on SE(2). The exponential map \(\exp (c^1 \partial _x \vert _e + c^2 \partial _y \vert _e + c^3 \partial _\theta \vert _e) = (x,y,\theta ) \in SE(2)\) is given by:

$$\begin{aligned} \begin{aligned} x&= \left( c^1 \cos \tfrac{c^3}{2} - c^2 \sin \tfrac{c^3}{2} \right) {{\,\textrm{sinc}\,}}\tfrac{c^3}{2}, \\ y&= \left( c^1 \sin \tfrac{c^3}{2} + c^2 \cos \tfrac{c^3}{2} \right) {{\,\textrm{sinc}\,}}\tfrac{c^3}{2}, \\ \theta&= c^3 {{\,\textrm{mod}\,}}2\pi . \end{aligned} \end{aligned}$$

And the logarithm: \(\log (x,y,\theta ) = c^1 \partial _x\vert _e + c^2 \partial _y\vert _e + c^3 \partial _\theta \vert _e \in T_eSE(2)\):

$$\begin{aligned} \begin{aligned} c^1&= \frac{x\cos \tfrac{\theta }{2} + y \sin \tfrac{\theta }{2}}{{{\,\textrm{sinc}\,}}\tfrac{\theta }{2}}, \\ c^2&= \frac{-x \sin \tfrac{\theta }{2} + y \cos \tfrac{\theta }{2}}{{{\,\textrm{sinc}\,}}\tfrac{\theta }{2}}, \\ c^3&= \theta . \end{aligned} \end{aligned}$$
(8)

By virtue of equation (2), we will freely use the logarithmic coordinates on \(\mathbb {M}_2\).
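A minimal sketch of the exponential and logarithm in these coordinates (our own helper names; note that NumPy's sinc is the normalized variant, so we wrap it):

```python
import numpy as np

def sinc(x):
    """Unnormalized sinc: sin(x)/x with sinc(0) = 1 (np.sinc is the normalized variant)."""
    return np.sinc(x / np.pi)

def se2_exp(c):
    """exp(c^1 ∂_x|_e + c^2 ∂_y|_e + c^3 ∂_θ|_e) -> (x, y, θ), cf. the formulas above."""
    c1, c2, c3 = c
    s = sinc(c3 / 2)
    return np.array([(c1 * np.cos(c3 / 2) - c2 * np.sin(c3 / 2)) * s,
                     (c1 * np.sin(c3 / 2) + c2 * np.cos(c3 / 2)) * s,
                     np.mod(c3 + np.pi, 2 * np.pi) - np.pi])

def se2_log(g):
    """log(x, y, θ) -> (c^1, c^2, c^3), cf. (8)."""
    x, y, th = g
    s = sinc(th / 2)
    return np.array([( x * np.cos(th / 2) + y * np.sin(th / 2)) / s,
                     (-x * np.sin(th / 2) + y * np.cos(th / 2)) / s,
                     th])

# Round-trip check on a generic element with θ ∈ (-π, π).
g = np.array([0.4, -0.9, 1.3])
assert np.allclose(se2_exp(se2_log(g)), g)
```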

3 Erosion and Dilation

We will be considering the following Hamilton–Jacobi equation on \(\mathbb {M}_2\):

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial W_\alpha }{\partial t} &{}= \pm \frac{1}{\alpha } \left\| \nabla W_{\alpha } \right\| ^\alpha = \pm \mathcal {H}_{\alpha }(dW_\alpha ) \\ \left. W_\alpha \right| _{t=0} &{}= U, \end{array}\right. } \end{aligned}$$
(9)

with the Hamiltonian \(\mathcal {H}_\alpha : T^*\mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\):

$$\begin{aligned} \mathcal {H}_{\alpha }(\hat{\textbf{p}}) = \mathcal {H}_{\alpha }^{1D}(\Vert \hat{\textbf{p}}\Vert _*) = \frac{1}{\alpha }\Vert \hat{\textbf{p}}\Vert _*^{\alpha }, \end{aligned}$$

and where \(W_\alpha \) denotes the viscosity solution [52] obtained from the initial condition \(U \in C( \mathbb {M}_{2},\mathbb {R})\). Here the \(+\) sign yields a dilation scale space and the \(-\) sign an erosion scale space [50, 51]. If confusion cannot arise, we omit the superscript 1D. Erosion and dilation correspond to min- and max-pooling, respectively. The Lagrangian \(\mathcal {L}_\alpha : T\mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\) corresponding with this Hamiltonian is obtained by taking the Fenchel transform of the Hamiltonian:

$$\begin{aligned} \mathcal {L}_{\alpha }({\dot{\textbf{p}}}) = \mathcal {L}^{1D}_{\alpha }(\Vert {\dot{\textbf{p}}}\Vert ) =\frac{1}{\beta } \Vert {\dot{\textbf{p}}}\Vert ^\beta \end{aligned}$$

with \(\beta \) such that \(\frac{1}{\alpha } + \frac{1}{\beta } = 1\). Again, if confusion cannot arise, we omit the subscript \(\alpha \) and/or superscript 1D. We deviate from our previous work by including the factor \(\frac{1}{\alpha }\) and working with a power of \(\alpha \) instead of \(2\alpha \). We do this because it simplifies the relation between the Hamiltonian and Lagrangian.
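For completeness, a short sketch of this Fenchel transform computation (a standard step that the text leaves implicit): reducing to the scalar variable \(r = \Vert \hat{\textbf{p}}\Vert _*\) gives

$$\begin{aligned} \mathcal {L}_{\alpha }({\dot{\textbf{p}}}) = \sup _{\hat{\textbf{p}}} \left( \langle \hat{\textbf{p}}, {\dot{\textbf{p}}} \rangle - \tfrac{1}{\alpha } \Vert \hat{\textbf{p}}\Vert _*^{\alpha } \right) = \sup _{r \ge 0} \left( r \Vert {\dot{\textbf{p}}}\Vert - \tfrac{1}{\alpha } r^{\alpha } \right) = \tfrac{1}{\beta } \Vert {\dot{\textbf{p}}}\Vert ^{\beta }, \end{aligned}$$

where the supremum is attained at \(r = \Vert {\dot{\textbf{p}}}\Vert ^{1/(\alpha -1)}\) and indeed \(\beta = \tfrac{\alpha }{\alpha -1}\).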

The following proposition collects standard results in terms of the solutions of Hamilton–Jacobi equations on manifolds [53,54,55], thereby generalizing results on \(\mathbb {R}^2\) to \(\mathbb {M}_2\).

Proposition 1

(Solution erosion & dilation) Let \(\alpha > 1\). The viscosity solution \(W_\alpha \) of the erosion PDE (9) is given by

$$\begin{aligned} W_\alpha (\textbf{p},t)&= \inf _{\begin{array}{c} \textbf{q}\in \mathbb {M}_2, \\ \gamma \in \Gamma _t(\textbf{p}, \textbf{q}) \end{array}} U(\textbf{q}) + \int \limits _0^{t} \mathcal {L}_{\alpha }({\dot{\gamma }}(s))\, \textrm{d}s \end{aligned}$$
(10)
$$\begin{aligned}&= \inf _{\textbf{q}\in \mathbb {M}_2} U(\textbf{q}) + t \mathcal {L}^{1D}_\alpha (d(\textbf{p}, \textbf{q})/t) \end{aligned}$$
(11)
$$\begin{aligned}&=(k_t^{\alpha } \mathbin {\square } U)(\textbf{p}) \end{aligned}$$
(12)

where the morphological kernel \(k_t^{\alpha }: \mathbb {M}_{2} \rightarrow \mathbb {R}_{\ge 0}\) is defined as:

$$\begin{aligned} k_{t}^{\alpha }= t \mathcal {L}^{1D}_\alpha (d/t) = \frac{t}{\beta } \left( \frac{d(\textbf{p}_0, \cdot )}{t} \right) ^\beta . \end{aligned}$$
(13)

Furthermore, the Riemannian distance \(d:=d(\textbf{p}_0,\cdot )\) is the viscosity solution of the eikonal PDE

$$\begin{aligned} \left\| \nabla d \right\| ^2 = \sum _{i=1}^3 (\mathcal {A}_{i} d / w_i)^2=1 \end{aligned}$$
(14)

with boundary condition \(d(\textbf{p}_0)=0\). Likewise the viscosity solution of the dilation PDE is

$$\begin{aligned} W_{\alpha }(\textbf{p},t)=-(k_t^{\alpha } \mathbin {\square } -U)(\textbf{p}) \end{aligned}$$
(15)

Proof

It is shown by Fathi in [54, Prop. 5.3] that (10) is a viscosity solution of the Hamilton–Jacobi equation (9) on a complete connected Riemannian manifold without boundary, under some (weak) conditions on the Hamiltonian and with the initial condition U being Lipschitz. In [53, Thm. 2], a similar statement is given but only for compact connected Riemannian manifolds, again under some weak conditions on the Hamiltonian but without any on the initial condition. Next, we employ these existing results and provide a self-contained proof of (11) and (12).

Because we are looking at a specific class of Lagrangians, the solutions can be equivalently written as (11). In [53, Prop. 2], this form can also be found. Namely, the Lagrangian \(\mathcal {L}_\alpha ^{1D}\) is convex for \(\alpha > 1\), so for any curve \(\gamma \in \Gamma _t:= \Gamma _t(\textbf{p}, \textbf{q})\) we have by direct application of Jensen’s inequality (omitting the superscript 1D):

$$\begin{aligned} \mathcal {L}_\alpha \left( \frac{1}{t} \int _0^t \Vert {{\dot{\gamma }}}(s)\Vert \textrm{d}s \right) \le \frac{1}{t} \int _0^t \mathcal {L}_\alpha (\Vert {{\dot{\gamma }}}(s)\Vert )\ \textrm{d}s, \end{aligned}$$

with equality if \(\Vert {{\dot{\gamma }}}\Vert \) is constant. This means that:

$$\begin{aligned} \inf _{\gamma \in \Gamma _t} t \mathcal {L}_\alpha \left( \frac{L(\gamma )}{t} \right) \le \inf _{\gamma \in \Gamma _t} \int _0^t \mathcal {L}_\alpha (\Vert {{\dot{\gamma }}}(s)\Vert )\ \textrm{d}s, \end{aligned}$$
(16)

where \(L(\gamma ):=L_{\mathcal {G}}(\gamma )\), recall (6), is the length of the curve \(\gamma \). Consider the subset of curves with constant speed \({\tilde{\Gamma }}_t = \{ \gamma \in \Gamma _t \mid \Vert {{\dot{\gamma }}}\Vert = L(\gamma )/t\} \subset \Gamma _t\). Optimizing over a subset can never decrease the infimum so we have:

$$\begin{aligned} \inf _{\gamma \in \Gamma _t} \int _0^t \mathcal {L}_\alpha (\Vert {{\dot{\gamma }}}(s)\Vert ) \textrm{d}s \le \inf _{\gamma \in {\tilde{\Gamma }}_t} \int _0^t \mathcal {L}_\alpha \left( \frac{L(\gamma )}{t} \right) \textrm{d}s \end{aligned}$$

The r.h.s. of this equation is equal to the l.h.s. of equation (16), as the length of a curve is independent of its parameterization. Thereby we have equality in (16). By monotonicity of \(\mathcal {L}_\alpha \) on \(\mathbb {R}_{>0}\), we may then conclude that:

$$\begin{aligned} \begin{aligned} \inf _{\gamma \in \Gamma _t} t \mathcal {L}_{\alpha } \left( L(\gamma )/t \right)&= t \mathcal {L}_{\alpha } \left( \inf _{\gamma \in \Gamma _t} L(\gamma )/t \right) \\&= t \mathcal {L}_{\alpha } (d(\textbf{p}, \textbf{q})/t). \end{aligned} \end{aligned}$$

That we can write the solution as (12) is a consequence of the left-invariant metric on the manifold. A similar derivation can be found in [28, Thm. 30]:

$$\begin{aligned} \begin{aligned} W_\alpha (\textbf{p},t)&= \inf _{\textbf{q}\in \mathbb {M}_2} U(\textbf{q}) + t \mathcal {L}_\alpha (d(\textbf{p}, \textbf{q})/t) \\&= \inf _{g \in G} U(g \textbf{p}_0) + t \mathcal {L}_\alpha (d(\textbf{p}, g \textbf{p}_0)/t) \\&= \inf _{g \in G} U(g \textbf{p}_0) + t \mathcal {L}_\alpha (d(g^{-1} \textbf{p}, \textbf{p}_0)/t) \\&= \inf _{g \in G} U(g \textbf{p}_0) + k_t^\alpha (g^{-1} \textbf{p}) \\&= (k_t^\alpha \mathbin {\square } U)(\textbf{p}) \end{aligned} \end{aligned}$$

It is shown in [55, Thm. 6.24] for complete connected Riemannian manifolds that the distance map \( d(\textbf{p}) \) is a viscosity solution of the Eikonal equation (14).

Finally, solutions of erosion and dilation PDEs correspond to each other. If \(W_\alpha \) is the viscosity solution of the erosion PDE with initial condition U, then \(-W_\alpha \) is the viscosity solution of the dilation PDE, with initial condition \(-U\). This means that the viscosity solution of the dilation PDE is given by (15). \(\square \)
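To make the structure of (11)-(12) concrete, the following brute-force sketch evaluates the erosion solution on a small grid for an arbitrary distance function passed in as an argument. It uses left-invariance, \(d(\textbf{p},\textbf{q}) = d(\textbf{p}_0, \textbf{q}^{-1}\textbf{p})\). The crude chart distance used below is only a runnable stand-in; in practice one would use the exact distance or one of the approximations of Sect. 4. All helper names are ours.

```python
import numpy as np
import itertools

def group_inv_prod(q, p):
    """q^{-1} p in SE(2), with q and p given in (x, y, θ) coordinates."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    return np.array([ np.cos(q[2]) * dx + np.sin(q[2]) * dy,
                     -np.sin(q[2]) * dx + np.cos(q[2]) * dy,
                      np.mod(p[2] - q[2] + np.pi, 2 * np.pi) - np.pi])

def erosion(U, grid, dist, t=0.3, alpha=1.3):
    """W(p,t) = min_q U(q) + (t/β)(d(p,q)/t)^β, cf. (11)-(13), with a supplied `dist`."""
    beta = alpha / (alpha - 1.0)
    W = np.empty(len(grid))
    for i, p in enumerate(grid):
        W[i] = min(U[j] + (t / beta) * (dist(group_inv_prod(q, p)) / t) ** beta
                   for j, q in enumerate(grid))
    return W

# Crude stand-in distance on the (x, y, θ) chart; only here to make the sketch run.
toy_dist = lambda p: np.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)

xs = np.linspace(-1.0, 1.0, 5)
ths = np.linspace(-np.pi, np.pi, 8, endpoint=False)
grid = np.array(list(itertools.product(xs, xs, ths)))
U = np.random.rand(len(grid))
W = erosion(U, grid, toy_dist)
assert np.all(W <= U + 1e-12)   # erosion never increases the input (take q = p)
```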

4 Distance Approximations

To calculate the morphological kernel \(k_t^\alpha \) (13), we need the exact Riemannian distance d (6), but calculating this is computationally demanding. To alleviate this problem, we approximate the exact distance \(d(\textbf{p}_0, \cdot )\) with approximative distances, denoted by \(\rho : \mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\), which are computationally cheap. To this end, we define the logarithmic distance approximation \(\rho _c\), as explained in [28, Def. 19] and [56, Def. 6.1.2], by

$$\begin{aligned} \rho _c:= \sqrt{ (w_1 c^1)^2 + (w_2 c^2 )^2 + (w_3 c^3 )^2}. \end{aligned}$$
(17)

Note that all approximative distances \(\rho \) can be extended to something that looks like a metric on \(\mathbb {M}_2\). For example, we can define:

$$\begin{aligned} \rho (g_1 \textbf{p}_0,\ g_2 \textbf{p}_0):= \rho (g_1^{-1} g_2 \textbf{p}_0). \end{aligned}$$

But this is almost always not a true metric in the sense that it does not satisfy the triangle inequality. So in this sense an approximative distance is not necessarily a true distance. However, we will keep referring to them as approximative distances as we only require them to look like the exact Riemannian distance \(d(\textbf{p}_0, \cdot )\).

As already stated in the introduction, Riemannian distance approximations such as \(\rho _c\) begin to fail in the high spatial anisotropy cases \(\zeta \gg 1\). For these situations, we need sub-Riemannian distance approximations. In previous literature, two such sub-Riemannian approximations are suggested. The first one is standard [57, Sec. 6], the second one is a modified smooth version [29, p. 284], also seen in [48, eq. 14]:

$$\begin{aligned}&\sqrt{ \sqrt{\nu w_1^2w_3^2}\left| c^2 \right| + (w_1 c^1)^2 + (w_3 c^3)^2 } \end{aligned}$$
(18)
$$\begin{aligned}&\root 4 \of {\nu w_1^2w_3^2 \left| c^2 \right| ^2 + ((w_1 c^1)^2 + (w_3 c^3)^2)^2} \end{aligned}$$
(19)

In [48], \(\nu \approx 44\) is empirically suggested. Note that the sub-Riemannian approximations rely on the assumption that \(w_2 \ge w_1\).

However, they both suffer from a major shortcoming in the interaction between \(w_3\) and \(c^2\). When we let \(w_3 \rightarrow 0\), both approximations suggest that it becomes arbitrarily cheap to move in the \(c^2\) direction, which is undesirable as this deviates from the exact distance d: moving spatially will always have a cost associated with it, determined by at least \(w_1\).

To make a proper sub-Riemannian distance estimate, we will use the Zassenhaus formula, which is related to the Baker–Campbell–Hausdorff formula:

$$\begin{aligned} e^{t(X + Y)} = e^{tX} e^{tY} e^{-\frac{t^2}{2} \left[ X,Y \right] } e^{\mathcal {O}(t^3)} \dots , \end{aligned}$$
(20)

where we have used the shorthand \(e^x:= \exp (x)\). Filling in \(X = A_1\) and \(Y = A_3\) and neglecting the higher-order terms gives:

$$\begin{aligned} e^{t(A_1 + A_3)} \approx e^{tA_1} e^{tA_3} e^{\frac{t^2}{2} A_2}, \end{aligned}$$
(21)

or equivalently:

$$\begin{aligned} e^{\frac{t^2}{2} A_2} \approx e^{-tA_3} e^{-tA_1} e^{t(A_1 + A_3)}. \end{aligned}$$
(22)

This formula says that one can successively follow exponential curves in the “legal” directions \(\mathcal {A}_1\) and \(\mathcal {A}_3\) to effectively move in the “illegal” direction of \(\mathcal {A}_2\). Taking the lengths of these curves and adding them up gives an approximative upper bound on the sub-Riemannian distance:

$$\begin{aligned} \begin{aligned} d_{sr}(e^{\frac{t^2}{2} A_2})&\lessapprox \left( w_1 + w_3 + \sqrt{w_1^2 + w_3^2} \right) \left| t \right| \\&\le 2\left( w_1 + w_3 \right) \left| t \right| . \end{aligned} \end{aligned}$$
(23)

Substituting \(t \rightarrow \sqrt{2\left| t \right| }\) gives:

$$\begin{aligned} d_{sr}(e^{tA_2}) \lessapprox 2\sqrt{2}\left( w_1 + w_3 \right) \sqrt{\left| t \right| }. \end{aligned}$$
(24)

This inequality, together with the smoothing trick to go from (18) to (19), inspires then the following sub-Riemannian distance approximation:

$$\begin{aligned} \rho _{c, sr}:= \root 4 \of { \left( \nu (w_1 + w_3) \right) ^4 \left| c^2 \right| ^2 + ((w_1 c^1)^2 + (w_3 c^3)^2)^2}, \end{aligned}$$
(25)

for some \(0<\nu <2\sqrt{2}\) s.t. the approximation is tight. We empirically suggest \(\nu \approx 1.6\), based on a numerical analysis that is tangential to [48, Fig. 3]. Notice that this approximation does not break down when we let \(w_3 \rightarrow 0\).

Furthermore, in view of contraction of SE(2) to the Heisenberg group \(H_3\) [29, Sec. 5.2], and the exact fundamental solution [32, eq. 27] of the Laplacian on \(H_3\) (where the norm \(\rho _{c,sr}\) appears squared in the numerator with \(1=w_1=w_3=\nu \)) we expect \(\nu \ge 1\).
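The Zassenhaus-based step (22) can be checked numerically in the standard \(3\times 3\) homogeneous matrix representation of the Lie algebra (our own choice of representation; this is only a sanity check of the approximation order):

```python
import numpy as np
from scipy.linalg import expm

# Generators of the Lie algebra se(2) in the homogeneous 3x3 matrix representation.
A1 = np.array([[0., 0., 1.], [0., 0., 0.], [0., 0., 0.]])   # spatial generator A_1 at e
A2 = np.array([[0., 0., 0.], [0., 0., 1.], [0., 0., 0.]])   # lateral generator A_2 at e
A3 = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]])  # angular generator A_3 at e

t = 0.1
lhs = expm(t**2 / 2 * A2)                                    # e^{(t²/2) A_2}
rhs = expm(-t * A3) @ expm(-t * A1) @ expm(t * (A1 + A3))    # right-hand side of (22)
print(np.max(np.abs(lhs - rhs)))                             # discrepancy is of order t³
```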

Table 3 shows that both the old sub-Riemannian approximation (19) and new approximation (25) are appropriate in cases such as \(w_3=1\). Table 4 shows that the old approximation breaks down when we take \(w_3 = 0.5\), and that the new approximation behaves more appropriate.

Table 3 Same situation and metric parameters as Table 2, i.e., \(w_1 = w_3 = 1\) and \(w_2 = 8\). We see the exact distance d alongside the old sub-Riemannian approximation \(\rho _{b,sr,old}\) (19) and new approximation \(\rho _{b,sr}\) (25). For the old approximation, we chose \(\nu =44\), as suggested in [48], and for the new one \(\nu = 1.6\). We see that in this case both approximations are appropriate
Table 4 Same as Table 3 but then with \(w_1 = 1, w_2 = 8, w_3 = 0.5\). We see that in this case that the old sub-Riemannian approximation \(\rho _{b,sr,old}\) (19) underestimates the true distance and becomes less appropriate. The new approximation (25) is also not perfect but qualitatively better. Decreasing \(w_3\) would exaggerate this effect even further

The Riemannian and sub-Riemannian approximations can be combined into the following newly proposed practical approximation:

$$\begin{aligned} \rho _{c,com}:= \max (l,\ \min (\rho _{c, sr},\ \rho _{c})), \end{aligned}$$
(26)

where \(l: \mathbb {M}_2 \rightarrow \mathbb {R}\) is given by:

$$\begin{aligned} l:= \sqrt{ (w_1 x)^2 + (w_1 y)^2 + (w_3 \theta )^2 }, \end{aligned}$$
(27)

for which we will show, in Lemma 4, that it is a lower bound of the exact distance d.

The most important property of the combined approximation is that it automatically switches between the Riemannian and sub-Riemannian approximations depending on the metric parameters. Namely, the Riemannian approximation is appropriate very close to the reference point \(\textbf{p}_0\), but tends to overestimate the true distance at a moderate distance from it. The sub-Riemannian approximation is appropriate at moderate distances from \(\textbf{p}_0\), but tends to overestimate very close to it, and underestimate far away. The combined approximation is such that we get rid of the weaknesses that the approximations have on their own.

On top of these approximative distances, we also define \(\rho _b\), \(\rho _{b,sr}\), and \(\rho _{b,com}\) by replacing the logarithmic coordinates \(c^i\) by their corresponding half-angle coordinates \(b^i\) defined by:

$$\begin{aligned} b^1{} & {} = x \cos \tfrac{\theta }{2} + y \sin \tfrac{\theta }{2}, \nonumber \\ b^2{} & {} = -x \sin \tfrac{\theta }{2} + y \cos \tfrac{\theta }{2}, \nonumber \\ b^3{} & {} = \theta . \end{aligned}$$
(28)

So, for example, we define \(\rho _b\) as:

$$\begin{aligned} \rho _b:= \sqrt{(w_1 b^1)^2 + (w_2 b^2)^2 + (w_3 b^3)^2}. \end{aligned}$$
(29)

Why we use these coordinates will be explained in Sect. 5.1.

We can define approximative morphological kernels by replacing the exact distance in (13) by any of the approximative distances in this section. To this end we, for example, define \(k_b\) by replacing the exact distance in the morphological kernel k by \(\rho _b\):

$$\begin{aligned} k_{b,t}^\alpha := \frac{t}{\beta } \left( \frac{\rho _b}{t} \right) ^\beta , \end{aligned}$$
(30)

where we recall that \(\frac{1}{\alpha } + \frac{1}{\beta } = 1\) and \(\alpha >1\).
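All approximative distances of this section, and the approximative kernel (30), are cheap closed-form expressions. The following sketch (our own notation; the \(b^i\)-variants of (25)-(27) are used) can, for instance, be plugged into the brute-force erosion sketch given after Proposition 1:

```python
import numpy as np

def half_angle_coords(p):
    """Half-angle coordinates (b^1, b^2, b^3) of p = (x, y, θ), cf. (28)."""
    x, y, th = p
    return np.array([ x * np.cos(th / 2) + y * np.sin(th / 2),
                     -x * np.sin(th / 2) + y * np.cos(th / 2),
                      th])

def rho_b(p, w):
    """Riemannian approximation (29)."""
    b = half_angle_coords(p)
    return np.sqrt((w[0] * b[0]) ** 2 + (w[1] * b[1]) ** 2 + (w[2] * b[2]) ** 2)

def rho_b_sr(p, w, nu=1.6):
    """Sub-Riemannian approximation, cf. (25) with b^i in place of c^i."""
    b = half_angle_coords(p)
    return ((nu * (w[0] + w[2])) ** 4 * b[1] ** 2
            + ((w[0] * b[0]) ** 2 + (w[2] * b[2]) ** 2) ** 2) ** 0.25

def lower_bound_l(p, w):
    """Lower bound l of the exact distance, cf. (27)."""
    x, y, th = p
    return np.sqrt((w[0] * x) ** 2 + (w[0] * y) ** 2 + (w[2] * th) ** 2)

def rho_b_com(p, w, nu=1.6):
    """Combined approximation, cf. (26)."""
    return max(lower_bound_l(p, w), min(rho_b_sr(p, w, nu), rho_b(p, w)))

def kernel_b(p, w, t=1.0, alpha=1.3, rho=rho_b_com):
    """Approximative morphological kernel (30) for a chosen approximative distance ρ."""
    beta = alpha / (alpha - 1.0)
    return (t / beta) * (rho(p, w) / t) ** beta

p, w = np.array([0.5, 0.2, np.pi / 3]), (1.0, 8.0, 1.0)   # spatial anisotropy ζ = 8
print(rho_b(p, w), rho_b_sr(p, w), rho_b_com(p, w), kernel_b(p, w))
```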

5 Main Theorem and Analysis

When the effect of erosion and dilation is calculated with an approximative morphological kernel, an error is made. We are therefore interested in analyzing the behavior of this error. We do this by comparing the approximative morphological kernels with the exact kernel \(k_t^\alpha \) (13). The result of our analysis is summarized in the following theorem. Because none of the inequalities in our main result depend on the time t, we use the short notation \(k^\alpha := k_t^\alpha \), \(k_b^\alpha := k_{b,t}^\alpha \).

Theorem 1

(Quality of approximative morphological kernels) Let \(\zeta := \frac{w_2}{w_1}\) denote the spatial anisotropy, and let \(\beta \) be such that \(\frac{1}{\alpha } + \frac{1}{\beta } = 1\), for some \(\alpha >1\) fixed. We assess the quality of our approximative kernels in three ways:

  • The exact and all approximative kernels have the same symmetries, see Table 5.

  • Globally it holds that:

    $$\begin{aligned} \zeta ^{-\beta } k^\alpha \le k_b^\alpha \le \zeta ^{\beta } k^\alpha , \end{aligned}$$
    (31)

    from which we see that in the case \(\zeta = 1\) we have that \(k^\alpha _b\) is exactly equal to \(k^\alpha \).

  • Locally around \(\textbf{p}_0\) we have:

    $$\begin{aligned} k_b^\alpha \le (1 + \varepsilon )^{\beta /2} k^\alpha . \end{aligned}$$
    (32)

    where

    $$\begin{aligned} \varepsilon := \frac{\zeta ^2 - 1}{2 w_3^2} \zeta ^4 \rho _b^2 + \mathcal {O}(\left| \theta \right| ^3). \end{aligned}$$
    (33)
Table 5 Overview of the fundamental symmetries \(\varepsilon _i\) in half-angle coordinates \(b^i\) and logarithmic coordinates \(c^i\). For example \(\varepsilon _3(c^1, c^2, c^3) = (-c^1, -c^2, c^3)\)

Proof

The proof of the parts of the theorem will be discussed throughout the upcoming subsections.

  • The symmetries are shown in Corollary 1.

  • The global bound (31) is shown in Corollary 3.

  • The local bound (32) is shown in Corollary 5.

\(\square \)

Clearly, as all approximative kernels are solely functions of the corresponding approximative distances, the analysis of the quality of an approximative kernel reduces to analyzing the quality of the approximative distance that is used, and this is exactly what we will do.

In previous work on PDE-G-CNNs, the bound \(d=d(\textbf{p}_0,\cdot ) \le \rho _c\) is proven [28, Lem. 20]. Furthermore, it is shown that around \(\textbf{p}_0\) one has:

$$\begin{aligned} \rho _c^2 \le d^2 + \mathcal {O}(d^4), \end{aligned}$$
(34)

which has the corollary that there exists a constant \(C \ge 1\) such that

$$\begin{aligned} \rho _c \le C d \end{aligned}$$
(35)

on any compact neighborhood of \(\textbf{p}_0\). We improve on these results by:

  • Showing that the approximative distances have the same symmetries as the exact Riemannian distance; Lemma 3.

  • Finding simple global bounds on the exact distance d which can then be used to find global estimates of \(\rho _b\) by d; Lemma 4. This improves upon (35) by finding an expression for the constant C.

  • Estimating the leading term of the asymptotic expansion, and observing that our upper bound of the relative error between \(\rho _b\) and d explodes in the cases \(\zeta \rightarrow \infty \) and \(w_3 \rightarrow 0\); Lemma 7. This improves upon equation (34).

Note, however, that we are not analyzing \(\rho _c\): we will be analyzing \(\rho _b\). This is mainly because the half-angle coordinates are easier to work with: they do not have the \({{\,\textrm{sinc}\,}}\tfrac{\theta }{2}\) factor the logarithmic coordinates have. Using that

$$\begin{aligned} b^1 = c^1 {{\,\textrm{sinc}\,}}\tfrac{\theta }{2},\ b^2 = c^2 {{\,\textrm{sinc}\,}}\tfrac{\theta }{2},\ b^3 = c^3, \end{aligned}$$
(36)

recall (28) and (8), we see that

$$\begin{aligned} {{\,\textrm{sinc}\,}}\tfrac{\theta }{2}\ \rho _c \le \rho _b \le \rho _c, \end{aligned}$$

and thus locally \(\rho _c\) and \(\rho _b\) do not differ much, and results on \(\rho _b\) can be easily transferred to (slightly weaker) results on \(\rho _c\).

5.1 Symmetry Preservation

Symmetries play a major role in the analysis of (sub-)Riemannian geodesics/distances in SE(2). They help to analyze symmetries in Hamiltonian flows [44] and corresponding symmetries in association field models [42, Fig. 11]. There are 8 of them in total, and their relation with the logarithmic coordinates \(c^i\) (Lemma 1) shows that they correspond to inversions of the Lie-algebra basis \(A_i \mapsto -A_i\). The symmetries for the sub-Riemannian setting are explicitly listed in [44, Prop. 4.3]. They can be algebraically generated by the following three symmetries (using the same labeling as [44]):

$$\begin{aligned} \begin{array}{l} \varepsilon ^{2}(x,y,\theta ) = (-x \cos \theta - y \sin \theta , -x \sin \theta + y \cos \theta , \theta ),\\ \varepsilon ^{1}(x,y,\theta ) = (x \cos \theta + y \sin \theta , x \sin \theta - y \cos \theta , \theta ), \text { and } \\ \varepsilon ^{6}(x,y,\theta ) = (x \cos \theta + y \sin \theta , -x \sin \theta + y \cos \theta , -\theta ). \end{array} \end{aligned}$$
(37)

They generate the other four symmetries as follows:

$$\begin{aligned} \begin{array}{l} \varepsilon ^{3}=\varepsilon ^{2} \circ \varepsilon ^1,\ \varepsilon ^{4}=\varepsilon ^{2} \circ \varepsilon ^6,\ \varepsilon ^{7}=\varepsilon ^{1} \circ \varepsilon ^6, \\ \text { and } \varepsilon ^{5}= \varepsilon ^2 \circ \varepsilon ^{1} \circ \varepsilon ^6. \end{array} \end{aligned}$$
(38)

and with \(\varepsilon ^0 = \text {id}\). All symmetries are involutions: \(\varepsilon ^i \circ \varepsilon ^i = \text {id}\). Henceforth, all eight symmetries will be called ‘fundamental symmetries.’ How all fundamental symmetries relate to each other becomes clearer if we write them down in either logarithmic or half-angle coordinates.

Lemma 1

(8 fundamental symmetries) The 8 fundamental symmetries \(\varepsilon _i\), in either half-angle coordinates \(b^i\) or logarithmic coordinates \(c^i\), correspond to sign flips as laid out in Table 5.

Fig. 11

a \(\varepsilon ^2\), b \(\varepsilon ^1\), c \(\varepsilon ^6\). The fixed points of \(\varepsilon ^2\), \(\varepsilon ^1\), and \(\varepsilon ^6\). For \(\varepsilon ^2\) and \(\varepsilon ^1\), only the points within the region \(x^2 + y^2 \le 2^2\) are plotted. For \(\varepsilon ^6\), only the points in the region \(\max (\left| x \right| ,\left| y \right| ) \le 2\) are plotted. The fixed points of \(\varepsilon ^2\), \(\varepsilon ^1\), and \(\varepsilon ^6\) correspond, respectively, to the points in \(\mathbb {M}_2\) that are coradial, cocircular, and parallel to the reference point \(\textbf{p}_0\)

Fig. 12

a Coradial, b Cocircular, c Parallel. An example of points in \(\mathbb {M}_2\) that are coradial, cocircular, and parallel

Proof

We will only show that \(\varepsilon ^2\) flips \(b^1\). All other calculations are done analogously. Pick a point \(\textbf{p}= (x,y,\theta )\) and let \(\textbf{q}= \varepsilon ^2(\textbf{p})\). We now calculate \(b^1(\textbf{q})\):

$$\begin{aligned} \begin{aligned} b^1(\textbf{q}) ={}&x(\textbf{q}) \cos \tfrac{\theta (\textbf{q})}{2} + y(\textbf{q}) \sin \tfrac{\theta (\textbf{q})}{2}\\ =&- (x \cos \theta + y \sin \theta ) \cos \tfrac{\theta }{2} \\&+ (-x \sin \theta + y \cos \theta ) \sin \tfrac{\theta }{2}\\ =&-x (\cos \theta \cos \tfrac{\theta }{2} + \sin \theta \sin \tfrac{\theta }{2} ) \\&- y(\sin \theta \cos \tfrac{\theta }{2} - \cos \theta \sin \tfrac{\theta }{2})\\ =&- x \cos \tfrac{\theta }{2} - y \sin \tfrac{\theta }{2}\\ =&-b^1(\textbf{p}), \end{aligned} \end{aligned}$$

where we have used the trigonometric difference identities of cosine and sine in the second-to-last equality. From the relation between logarithmic and half-angle coordinates (36), we have that the logarithmic coordinates \(c^i\) flip in the same manner under the symmetries. \(\square \)

The fixed points of the symmetries \(\varepsilon ^2\), \(\varepsilon ^1\), and \(\varepsilon ^6\) have an interesting geometric interpretation. The logarithmic and half-angle coordinates, being so closely related to the fundamental symmetries, also carry the same interpretation. Definition 1 introduces this geometric idea and Lemma 2 makes its relation to the fixed points of the symmetries precise. In Fig. 11, the fixed points are visualized, and in Fig. 12 a visualization of these geometric ideas can be seen.

Definition 1

Two points \(\textbf{p}_1=(\textbf{x}_1,\textbf{n}_1)\), \(\textbf{p}_2=(\textbf{x}_{2},\textbf{n}_2)\) of \(\mathbb {M}_{2}\) are called cocircular if there exists a circle, of possibly infinite radius, passing through \(\textbf{x}_1\) and \(\textbf{x}_2\) such that the orientations \(\textbf{n}_1 \in S^1\) and \(\textbf{n}_{2} \in S^1\) are tangent to the circle, at, respectively, \(\textbf{x}_1\) and \(\textbf{x}_2\), in either both the clockwise or both the anti-clockwise direction. Similarly, the points are called coradial if the orientations are normal to the circle in either both the outward or both the inward direction. Finally, two points are called parallel if their orientations coincide.

Co-circularity has a well-known characterization that is often used for line enhancement in image processing, such as tensor voting [58].

Remark 1

Point \(\textbf{p}=(r \cos \phi , r \sin \phi , \theta ) \in \mathbb {M}_2\) is cocircular to the reference point \(\textbf{p}_0=(0,0,0)\) if and only if the double angle equality \(\theta \equiv 2 \phi \mod 2\pi \) holds.

In fact all fixed points of the fundamental symmetries can be intuitively characterized:

Lemma 2

(Fixed Points of Symmetries) Fix reference point \(\textbf{p}_0=(0,0,0) \in \mathbb {M}_2\).

The point \(g \textbf{p}_0\in \mathbb {M}_2\) with \(g \in SE(2)\) is, respectively,

  • coradial to \(\textbf{p}_0\) when

    $$\begin{aligned} c^1(g) = 0 \Leftrightarrow \varepsilon _2(g) = g \Leftrightarrow g \in \exp (\left\langle A_2, A_3 \right\rangle ), \end{aligned}$$
    (39)
  • cocircular to \(\textbf{p}_0\) when

    $$\begin{aligned} c^2(g) = 0 \Leftrightarrow \varepsilon _1(g) = g \Leftrightarrow g \in \exp (\left\langle A_1, A_3 \right\rangle ), \end{aligned}$$
    (40)
  • parallel to \(\textbf{p}_0\) when

    $$\begin{aligned} c^3(g) = 0 \Leftrightarrow \varepsilon _6(g) = g \Leftrightarrow g \in \exp (\left\langle A_1, A_2 \right\rangle ). \end{aligned}$$
    (41)

Proof

We will only show (40); the others are done analogously. We start by writing \(g=(r \cos \phi , r \sin \phi , \theta )\) and calculating that \(g \odot \textbf{p}_0 = (r \cos \phi , r \sin \phi , (\cos \theta , \sin \theta ))\). Then by Remark 1 we know that \(g \textbf{p}_0\) is cocircular to \(\textbf{p}_0\) if and only if \(2\phi = \theta {{\,\textrm{mod}\,}}2\pi \). We can show this is equivalent to \(c^2(g)=0\):

$$\begin{aligned} c^2(g) = 0&\Leftrightarrow b^2(g) = 0 \\&\Leftrightarrow -x \sin \tfrac{\theta }{2} +y \cos \tfrac{\theta }{2}=0\\&\Leftrightarrow -\cos \phi \sin \tfrac{\theta }{2}+\sin \phi \cos \tfrac{\theta }{2}=0\\&\Leftrightarrow \sin (\phi -\tfrac{\theta }{2})=0 \Leftrightarrow 2\phi = \theta {{\,\textrm{mod}\,}}2\pi . \end{aligned}$$

In logarithmic coordinates, \(\varepsilon _1\) is equivalent to:

$$\begin{aligned} \varepsilon _1(c^1, c^2, c^3) = (c^1, -c^2, c^3) \end{aligned}$$

from which we may deduce that \(\varepsilon _1(g) = g\) is equivalent to \(c^2(g) = 0\). If \(c^2(g) = 0\) then \(\log g \in \left\langle A_1, A_3 \right\rangle \) and thus \(g \in \exp (\left\langle A_1, A_3 \right\rangle )\). As for the other way around, it holds by simple computation that:

$$\begin{aligned} c^2(\exp (c^1A_1 + c^3A_3)) = 0 \end{aligned}$$

which shows that \(g \in \exp (\left\langle A_1, A_3 \right\rangle ) \Rightarrow c^2(g) = 0\). \(\square \)

In the important work [44] on sub-Riemannian geometry on SE(2) by Sachkov and Moiseev, it is shown that the exact sub-Riemannian distance \(d_{sr}\) is invariant under the fundamental symmetries \(\varepsilon ^i\). These same symmetries also hold for the Riemannian distance d. Moreover, because the approximative distances use the logarithmic coordinates \(c^i\) and half-angle coordinates \(b^i\), they carry the same symmetries as well. The following lemma makes this precise.

Lemma 3

(Symmetries of the exact distance and all proposed approximations) All exact and approximative (sub)-Riemannian distances (w.r.t. the reference point \(\textbf{p}_0\)) are invariant under all the fundamental symmetries \(\varepsilon _i\).

Proof

By Table 5, one sees that \(\varepsilon ^3, \varepsilon ^4\), and \(\varepsilon ^5\) also generate all symmetries. Therefore, it suffices to show that all distances are invariant under these three symmetries. We first show that the exact distance, in either the Riemannian or sub-Riemannian case, is invariant under them, i.e., \(d(\textbf{p}) = d(\varepsilon ^i(\textbf{p}))\) for \(i \in \{3,4,5\}\). By (38) and (37), one has \(\varepsilon ^3(x,y,\theta )=(-x,-y,\theta )\) and \(\varepsilon ^4(x,y,\theta ) = (-x,y,-\theta )\). Now consider the push forward \(\varepsilon ^3_*\). By direct computation (in \((x,y,\theta )\) coordinates), we have \(\varepsilon ^3_* \left. \mathcal {A}_i \right| _\textbf{p}= \pm \left. \mathcal {A}_i \right| _{\varepsilon ^3(\textbf{p})}\). Because the metric tensor field \(\mathcal {G}\) (4) is diagonal w.r.t. the \(\mathcal {A}_i\) basis, this means that \(\varepsilon ^3\) is an isometry. Similarly, \(\varepsilon ^4\) is an isometry. Being isometries of the metric \(\mathcal {G}\), \(\varepsilon ^3\) and \(\varepsilon ^4\) preserve distance. The \(\varepsilon ^5\) symmetry flips all the signs of the \(c^i\) coordinates, which amounts to Lie algebra inversion: \( -\log g = \log (\varepsilon ^5(g)) \). Taking the exponential on both sides shows that \(g^{-1} = \varepsilon ^5(g)\). By left-invariance of the metric, we have \(d(g \textbf{p}_0, \textbf{p}_0) = d(\textbf{p}_0, g^{-1} \textbf{p}_0)\), which holds in both the Riemannian and sub-Riemannian case, and thus \( d(g\textbf{p}_0) = d(\varepsilon ^5(g\textbf{p}_0)) \). That all approximative distances (both in the Riemannian and sub-Riemannian case) are also invariant under all the symmetries is not hard to see: every \(b^i\) and \(c^i\) term is either squared or taken in absolute value. Flipping signs of these coordinates, recall Lemma 1, therefore has no effect on the approximative distance. \(\square \)

Corollary 1

(All kernels preserve symmetries) The exact kernel and all approximative kernels have the same fundamental symmetries.

Proof

The kernels are direct functions of the exact and approximative distances, recall for example (13), so from Lemma 3 we can immediately conclude that they also carry the 8 fundamental symmetries. \(\square \)

Figure 10 illustrates the previous lemma. The two fundamental symmetries \(\varepsilon ^2\) and \(\varepsilon ^1\) correspond, respectively, to reflecting the isocontours (depicted in colors) along their short and long axes. The \(\varepsilon ^6\) symmetry corresponds to mapping the positive \(\theta \) isocontours to their negative \(\theta \) counterparts. In Fig. 13, one can see an isocontour of \(\rho _b\) together with the symmetry “planes” of \(\varepsilon _2\), \(\varepsilon _1\) and \(\varepsilon _6\).

Fig. 13
figure 13

In grey the isocontour \(\rho _b=2.5\), together with the symmetry “planes” of \(\varepsilon _2\), \(\varepsilon _1\) and \(\varepsilon _6\), as also plotted in Fig. 11. The metric parameters are \((w_1,w_2,w_3)=(1,2,1)\)

5.2 Simple Global Bounds

Next, we provide some basic global lower and upper bounds for the exact Riemannian distance d (6). Recall that the lower bound l plays an important role in the combined approximation \(\rho _{b,com}\) (26) when far from the reference point \(\textbf{p}_0\).

Lemma 4

(Global bounds on distance) The exact Riemannian distance \(d=d(\textbf{p}_0,\cdot )\) is greater than or equal to the following lower bound \(l: \mathbb {M}_2 \rightarrow \mathbb {R}\):

$$\begin{aligned} l:= \sqrt{ (w_1 x)^2 + (w_1 y)^2 + (w_3 \theta )^2 } \le d \end{aligned}$$

and less than or equal to the following upper bounds \(u_1, u_2: \mathbb {M}_2 \rightarrow \mathbb {R}\):

$$\begin{aligned} d \le u_1&:= \sqrt{ (w_2 x)^2 + (w_2 y)^2 + (w_3 \theta )^2 }\\ d \le u_2&:= \sqrt{ (w_1 x)^2 + (w_1 y)^2 } + w_3 \pi \end{aligned}$$

Proof

We will first show \(l \le d\). Consider the following spatially isotropic metric:

$$\begin{aligned} {\tilde{\mathcal {G}}} = w_1^2\ \omega ^1 \otimes \omega ^1 + w_1^2\ \omega ^2 \otimes \omega ^2 + w_3^2 \ \omega ^3 \otimes \omega ^3. \end{aligned}$$

We assumed w.l.o.g. that \(w_1 \le w_2\), so for any vector \(v \in T\mathbb {M}_2\) we have \( \Vert v\Vert _{{\tilde{\mathcal {G}}}} \le \Vert v\Vert _{\mathcal {G}} \). From this, we directly deduce that for any curve \(\gamma \) on \(\mathbb {M}_2\) we have \(L_{{\tilde{\mathcal {G}}}}(\gamma ) \le L_{\mathcal {G}}(\gamma )\). Now consider a length-minimizing curve \(\gamma \) w.r.t. \(\mathcal {G}\) between the reference point \(\textbf{p}_0\) and some end point \(\textbf{p}\). We then have the chain of (in)equalities:

$$\begin{aligned} d_{{\tilde{\mathcal {G}}}}(\textbf{p}) \le L_{{\tilde{\mathcal {G}}}}(\gamma ) \le L_{\mathcal {G}}(\gamma ) = d_{\mathcal {G}}(\textbf{p}) \end{aligned}$$

Furthermore, because the metric \({\tilde{\mathcal {G}}}\) is spatially isotropic it can equivalently be written as:

$$\begin{aligned} {\tilde{\mathcal {G}}} = w_1^2\ dx \otimes dx + w_1^2\ dy \otimes dy + w_3^2 \ d\theta \otimes d\theta , \end{aligned}$$

which is a constant metric on the coordinate covector fields, and thus:

$$\begin{aligned} d_{{\tilde{\mathcal {G}}}}(\textbf{p}) = \sqrt{ (w_1 x)^2 + (w_1 y)^2 + (w_3 \theta )^2 } = l. \end{aligned}$$

Putting everything together gives the desired result \(l \le d\). The bound \(d \le u_1\) follows analogously.

As for \(d \le u_2\), we construct a curve \(\gamma \) whose length \(L(\gamma )\) w.r.t. \(\mathcal {G}\) can be bounded from above by \(u_2\); this shows \(d \le u_2\) by definition of the distance. Pick a destination position and orientation \(\textbf{p}= (\textbf{x}, \textbf{n})\). The curve \(\gamma \) is constructed as follows. We start by rotating the initial orientation \(\textbf{n}_0 = (1,0) \in S^1\) toward the destination position \(\textbf{x}\), i.e., toward \({\hat{\textbf{x}}}:= \frac{\textbf{x}}{r}\) with \(r = \Vert \textbf{x}\Vert = \sqrt{x^2 + y^2}\). This rotation costs \(w_3 a\) for some \(a \ge 0\). Once aligned with \({\hat{\textbf{x}}}\), we move straight toward \(\textbf{x}\); because we are aligned, this costs \(w_1 r\). Having arrived at \(\textbf{x}\), we rotate to the destination orientation \(\textbf{n}\), which costs \(w_3 b\) for some \(b \ge 0\). Altogether, \(L(\gamma ) = w_1 r + w_3 (a+b)\). As constructed, the curve does not necessarily satisfy \(a+b\le \pi \). To fix this, note that we did not have to align with \({\hat{\textbf{x}}}\): we could instead align with \(-{\hat{\textbf{x}}}\) and move backwards toward \(\textbf{x}\), which also costs \(w_1 r\). One of these two options (moving forwards or backwards toward \(\textbf{x}\)) does satisfy \(a+b\le \pi \), and thus \(d \le u_2\). \(\square \)

These bounds are simple but effective: they help us prove a multitude of insightful corollaries.
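The following minimal numerical sketch evaluates these bounds; it assumes, as in the earlier sections, that the half-angle approximation \(\rho _b\) is the weighted norm of the half-angle coordinates \(b^i\) (the function and variable names are ours):

```python
import numpy as np

def bounds_and_rho_b(x, y, theta, w1, w2, w3):
    """Lower bound l and upper bounds u1, u2 of Lemma 4, together with the
    half-angle approximation rho_b (assumed weighted-norm form)."""
    l  = np.sqrt((w1 * x) ** 2 + (w1 * y) ** 2 + (w3 * theta) ** 2)
    u1 = np.sqrt((w2 * x) ** 2 + (w2 * y) ** 2 + (w3 * theta) ** 2)
    u2 = np.sqrt((w1 * x) ** 2 + (w1 * y) ** 2) + w3 * np.pi
    # half-angle coordinates (b^2 as in the proof of Lemma 2, b^3 = theta)
    b1 = x * np.cos(theta / 2) + y * np.sin(theta / 2)
    b2 = -x * np.sin(theta / 2) + y * np.cos(theta / 2)
    b3 = theta
    rho_b = np.sqrt((w1 * b1) ** 2 + (w2 * b2) ** 2 + (w3 * b3) ** 2)
    return l, u1, u2, rho_b

# sanity check of l <= rho_b <= u1 (Corollary 2 below) on random points
rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.uniform(-3, 3, size=2)
    theta = rng.uniform(-np.pi, np.pi)
    l, u1, u2, rho_b = bounds_and_rho_b(x, y, theta, w1=1.0, w2=2.0, w3=1.0)
    assert l - 1e-12 <= rho_b <= u1 + 1e-12
```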

Corollary 2

(Global error distance) Simple manipulations, together with the fact that \(x^2 + y^2 = (b^1)^2 + (b^2)^2\), give the following inequalities between \(l, u_1\) and \(\rho _b\):

$$\begin{aligned} l \le \rho _b \le u_1,\ \frac{1}{\zeta } u_1 \le \rho _b \le \zeta l. \end{aligned}$$

Combining the second chain with the bounds \(l \le d \le u_1\) from Lemma 4 extends it to inequalities between \(\rho _b\) and d:

$$\begin{aligned} \frac{1}{\zeta } d \le \rho _b \le \zeta d \end{aligned}$$
(42)

Remark 2

If \(w_1 = w_2 \Leftrightarrow \zeta = 1\), i.e., in the spatially isotropic case, the lower bound l and the upper bound \(u_1\) coincide and are therefore exact. Because \(\rho _b\) lies between them, it is then exact as well.

Corollary 3

(Global error kernel) Globally the error is independent of time \(t>0\) and is estimated by the spatial anisotropy \(\zeta \ge 1\) (5) as follows:

$$\begin{aligned} \zeta ^{-\beta } k^\alpha \le k_b^\alpha \le \zeta ^{\beta } k^\alpha . \end{aligned}$$

For \(\zeta =1\), there is no error.

Proof

We only prove the second inequality; the first follows analogously.

$$\begin{aligned} \begin{aligned} k_b^\alpha&:= \frac{1}{\beta } (\rho _b/t)^\beta \le \frac{1}{\beta } \left( \zeta d/t \right) ^\beta \\&= \zeta ^{\beta } \left( \frac{1}{\beta } \left( d/t \right) ^\beta \right) = \zeta ^{\beta } k^\alpha \end{aligned} \end{aligned}$$

\(\square \)
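As a small numerical illustration of this sandwich (the kernel expression is taken from the proof above; the chosen values are ours and only for illustration):

```python
def kernel(rho, t, beta):
    """Morphological kernel value (1/beta) * (rho/t)**beta, cf. the proof of Corollary 3."""
    return (rho / t) ** beta / beta

# if (1/zeta) * d <= rho_b <= zeta * d, as in (42), then
#   zeta**(-beta) * k <= k_b <= zeta**beta * k
d, zeta, t, beta = 1.7, 2.0, 0.5, 1.5
rho_b = 1.2 * d                                   # any value in [d/zeta, zeta*d]
k, k_b = kernel(d, t, beta), kernel(rho_b, t, beta)
assert zeta ** (-beta) * k <= k_b <= zeta ** beta * k
```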

The previous result indicates that problems can arise if \(\zeta \rightarrow \infty \), which indeed turns out to be the case:

Corollary 4

(Observing the problem) If we restrict ourselves to \(x=\theta =0\), we have \(u_1 = \rho _b = \rho _c = w_2\left| y \right| \). From this, we deduce that both \(\rho _b\) and \(\rho _c\) necessarily become poor approximations far away from \(\textbf{p}_0\): when \(\zeta> 1 \Leftrightarrow w_2 > w_1\), both approximations exceed the upper bound \(u_2\) sufficiently far from \(\textbf{p}_0\). How quickly this happens is determined by all metric parameters: the approximations \(\rho _b\) and \(\rho _c\) intersect \(u_2\) at \(\left| y \right| = \frac{w_3\pi }{w_2 - w_1}\), or equivalently at \(\rho = \frac{w_3\pi }{1 - \zeta ^{-1}}\). This intersection is visible in Fig. 14 in the higher-anisotropy cases. From this expression, we see that for \(w_3 \rightarrow 0\) or \(\zeta \rightarrow \infty \) the Riemannian distance approximations \(\rho _b\) and \(\rho _c\) deteriorate quickly. We will see exactly the same behavior in Lemma 7 and Remark 3.
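For instance, with the metric parameters of Fig. 14d (\(w_1=w_3=1\), \(w_2=4\)) the crossing point can be computed directly (a short numerical sketch):

```python
import numpy as np

w1, w2, w3 = 1.0, 4.0, 1.0              # metric parameters of Fig. 14d
y_cross   = w3 * np.pi / (w2 - w1)      # |y| at which rho_b = rho_c crosses u2
rho_cross = w3 * np.pi / (1 - w1 / w2)  # the corresponding distance value
print(y_cross, rho_cross)               # ~1.047 and ~4.189; beyond |y| ~ 1.05 the
                                        # approximations exceed the upper bound u2
```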

Lemma 4 is visualized in Figs. 14 and 15. In Fig. 14, we consider the behavior of the exact distance and bounds along the y-axis, that is at \(x=\theta =0\). We have chosen to inspect the y-axis because it consists of points that are hard to reach from the reference point \(\textbf{p}_0\) when the spatial anisotropy is large, which makes it interesting. In contrast, along the x-axis \(l,d,\rho _b,\rho _c, u_1\) and \(w_1\left| x \right| \) all coincide, which makes that axis uninteresting. To provide more insight, we also depict the bounds along the line \(y=x\), see Fig. 15. Observe that in both figures, the exact distance d is indeed always above the lower bound l and below the upper bounds \(u_1\) and \(u_2\).

5.3 Asymptotic Error Expansion

In this section, we provide an asymptotic expansion of the error between the exact distance d and the half-angle distance approximation \(\rho _b\) (Lemma 7). This estimate is then transferred to an error bound between the exact morphological kernel k and the half-angle kernel \(k_b\) (Corollary 5). We also give a formula that determines a region on which the half-angle approximation \(\rho _b\) is appropriate given an a priori tolerance bound (Remark 3).

Lemma 5

Let \(\gamma :[0,1] \rightarrow \mathbb {M}_2\) be a minimizing geodesic from \(\textbf{p}_0\) to \(\textbf{p}\). We have that:

$$\begin{aligned} \rho _b(\textbf{p}) \le d(\textbf{p}) \max _{t \in [0,1]} \Vert d\rho _b\vert _{\gamma (t)} \Vert . \end{aligned}$$
Fig. 14
figure 14

a \(w_2 = 1\), b \(w_2 = 2\), c \(w_2 = 3\), d \(w_2 = 4\). Exact distance and its lower and upper bounds (given in Lemma 4) along the y-axis, i.e., at \(x=\theta =0\), for increasing spatial anisotropy. We keep \(w_1=w_3=1\) and vary \(w_2\). The horizontal axis is y and the vertical axis the value of the distance/bound. Note how the exact distance d starts off linearly with a slope of \(w_2\), and ends linearly with a slope of \(w_1\)

Fig. 15
figure 15

a \(w_2 = 1\), b \(w_2 = 2\), c \(w_2 = 3\), d \(w_2 = 4\). Same setting as Fig. 14 but at \(x=y, \theta =0\). The horizontal axis moves along the line \(x=y\)

Proof

The fundamental theorem of calculus tells us that:

$$\begin{aligned} \int _0^1 (\rho _b \circ \gamma )'(t)\ dt = \rho _b(\gamma (1)) - \rho _b(\gamma (0)) = \rho _b(\textbf{p}) - \rho _b(\textbf{p}_0) = \rho _b(\textbf{p}), \end{aligned}$$

but one can also bound this expression as follows:

$$\begin{aligned} \int _0^1 (\rho _b \circ \gamma )'(t)\ dt&= \int _0^1 \left\langle d\rho _b\vert _{\gamma (t)}, {{\dot{\gamma }}}(t) \right\rangle \ dt \\&\le \int _0^1 \left\| d\rho _b\vert _{\gamma (t)} \right\| \left\| {{\dot{\gamma }}}(t) \right\| \ dt\\&\le \left( \max _{t \in [0,1]} \Vert d\rho _b\vert _{\gamma (t)} \Vert \right) \int _0^1 \left\| {{\dot{\gamma }}}(t) \right\| \ dt \\&= d(\textbf{p}) \max _{t \in [0,1]} \Vert d\rho _b\vert _{\gamma (t)} \Vert . \end{aligned}$$

Putting the two together gives the desired result. \(\square \)

Lemma 6

One can bound \(\Vert d\rho _b\Vert \) around \(\textbf{p}_0\) by:

$$\begin{aligned} \Vert d \rho _b\Vert ^2 \le 1 + \frac{\zeta ^2 - 1}{2w_3^2} \rho _b^2 + \mathcal {O}(\theta ^3). \end{aligned}$$

Proof

The proof is deferred to Appendix 1. \(\square \)

Combining Lemmas 5 and 6 yields an asymptotic bound on the error between the exact distance d and the half-angle approximation \(\rho _b\).

Lemma 7

On any compact neighborhood U of \(\textbf{p}_0\), we have that

$$\begin{aligned} \rho _b^2 \le ( 1 + \varepsilon ) d^2, \text { where } \varepsilon := \frac{\zeta ^2 - 1}{2w_3^2} \zeta ^4 \rho _b^2 + C \left| \theta \right| ^3, \end{aligned}$$
(43)

for some \(C \ge 0\).

Proof

Let \(\textbf{p}\in U\) be given, and let \(\gamma : [0,1] \rightarrow \mathbb {M}_2\) be a minimizing geodesic from \(\textbf{p}_0\) to \(\textbf{p}\). For the distance along \(\gamma \), we know that

$$\begin{aligned} d(\gamma (s)) \le d(\gamma (t)), \text { for } s \le t. \end{aligned}$$

Making use of (42), we know that \(\frac{1}{\zeta } \rho _b \le d \le \zeta \rho _b\) so we can combine this with the previous equation to find:

$$\begin{aligned} \rho _b(\gamma (s)) \le \zeta ^2 \rho _b(\gamma (t)), \text { for } s \le t. \end{aligned}$$

from which we get that

$$\begin{aligned} \max _{t \in [0,1]} \rho _b(\gamma (t)) \le \zeta ^2 \rho _b(\textbf{p}). \end{aligned}$$

Combining this fact with the above two lemmas allows us to conclude (43). \(\square \)
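For completeness, the chain of estimates underlying (43) reads (with the cubic remainder absorbed, uniformly on the compact neighborhood, into the constant \(C \ge 0\)):

$$\begin{aligned} \rho _b(\textbf{p})^2&\le d(\textbf{p})^2 \max _{t \in [0,1]} \Vert d\rho _b\vert _{\gamma (t)} \Vert ^2 \le d(\textbf{p})^2 \left( 1 + \frac{\zeta ^2 - 1}{2w_3^2} \max _{t \in [0,1]} \rho _b(\gamma (t))^2 + \mathcal {O}(\theta ^3) \right) \\&\le d(\textbf{p})^2 \left( 1 + \frac{\zeta ^2 - 1}{2w_3^2}\, \zeta ^4 \rho _b(\textbf{p})^2 + C \left| \theta \right| ^3 \right) = (1 + \varepsilon )\, d(\textbf{p})^2. \end{aligned}$$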

Remark 3

(Region for approximation \(\rho _b \approx d\)) Putting an a priori tolerance bound \(\varepsilon _{tol}\) on the error \(\varepsilon \) (and neglecting the \(\mathcal {O}(\theta ^3)\) term) gives rise to a region \(\Omega _0\) on which the local approximation \(\rho _b\) is appropriate:

$$\begin{aligned} \Omega _0=\{ \textbf{p}\in \mathbb {M}_2 \mid \rho _b(\textbf{p})^2 < \frac{2 w_3^2}{(\zeta ^2-1)\zeta ^4} \varepsilon _{tol}\}. \end{aligned}$$

Thereby, we cannot guarantee a large region of acceptable relative error when \(w_3 \rightarrow 0\) or \(\zeta \rightarrow \infty \). We solve this problem by using \(\rho _{b, com}\) given in (26) instead of \(\rho _b\).
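A short sketch of the resulting guaranteed \(\rho _b\)-radius of \(\Omega _0\) for a given tolerance, neglecting the cubic term (the function name and parameter values are ours):

```python
import numpy as np

def rho_b_radius(eps_tol, w1, w2, w3):
    """Radius, in terms of rho_b, of the region Omega_0 of Remark 3 (cubic term neglected)."""
    zeta = w2 / w1                      # spatial anisotropy, assumed > 1 here
    return (w3 / zeta ** 2) * np.sqrt(2 * eps_tol / (zeta ** 2 - 1))

print(rho_b_radius(eps_tol=0.1, w1=1.0, w2=2.0, w3=1.0))   # ~0.065: only a small region
# can be guaranteed at this anisotropy, which is why rho_{b,com} is used instead
```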

Corollary 5

(Local error morphological kernel) Locally around \(\textbf{p}_0\), we have:

$$\begin{aligned} k^\alpha _b \le (1 + \varepsilon )^{\beta /2} k^\alpha . \end{aligned}$$

Proof

By Lemma 7, one has

$$\begin{aligned} k^\alpha _b:= \frac{1}{\beta } (\rho _b/t)^\beta \le \frac{1}{\beta } ((1 + \varepsilon )d^2/t^2)^{\beta /2} = (1 + \varepsilon )^{\beta /2} k^\alpha . \end{aligned}$$

\(\square \)

6 Experiments

6.1 Error of Half-Angle Approximation

We can quantitatively analyze the error between any distance approximation \(\rho \) and the exact Riemannian distance d as follows. We first choose a region \(\Omega \subseteq \mathbb {M}_2\) on which to analyze the approximation. Just as in Tables 1 and 2, we inspect \(\Omega := [-3,3]\times [-3,3]\times [-\pi ,\pi ) \subseteq \mathbb {M}_2\). As our measure of error \(\varepsilon \), we use the mean relative error defined as:

$$\begin{aligned} \varepsilon := \frac{1}{\mu (\Omega )} \int _{\Omega } \frac{\left| \rho (\textbf{p}) - d(\textbf{p}) \right| }{d(\textbf{p})}\, d\mu (\textbf{p}) \end{aligned}$$
(44)

where \(\mu \) is the induced Riemannian measure determined by the Riemannian metric \(\mathcal {G}\). We then discretize our domain \(\Omega \) into a grid of \(101 \times 101 \times 101\) equally spaced points \(\textbf{p}_i \in \Omega \), indexed by an index set \(i \in I\), and numerically solve for the exact distance d on this grid. This numerical scheme is of course not exact, and we refer to the resulting values as \({\tilde{d}}_i \approx d(\textbf{p}_i)\). We also calculate the value of the distance approximation \(\rho \) on the grid points, \(\rho _i:= \rho (\textbf{p}_i)\). Once we have these values, we approximate the true mean relative error \(\varepsilon \) by the numerical error \({\tilde{\varepsilon }}\) defined by:

$$\begin{aligned} \varepsilon \approx {\tilde{\varepsilon }}:= \frac{1}{\left| I \right| } \sum _{i \in I} \frac{\left| \rho _i - {\tilde{d}}_i \right| }{{\tilde{d}}_i} \end{aligned}$$
(45)
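A sketch of how (45) may be evaluated once the grid values are available (the array names are ours; the numerical solver producing \({\tilde{d}}_i\) is assumed to be given, with \({\tilde{d}}_i > 0\) on the grid):

```python
import numpy as np

def mean_relative_error(rho_vals, d_vals):
    """Numerical mean relative error (45) between approximation values rho_i and
    numerically computed exact distances d_i on the same grid (all d_i > 0)."""
    return float(np.mean(np.abs(rho_vals - d_vals) / d_vals))

# rho_vals, d_vals: arrays of shape (101, 101, 101) sampled on the grid covering
# Omega = [-3,3] x [-3,3] x [-pi,pi); d_vals stems from the numerical distance solver.
```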

In Table 6, the numerical mean relative error \({\tilde{\varepsilon }}\) between the half-angle approximation \(\rho _b\) and the numerical Riemannian distance \({\tilde{d}}\) is listed for different spatial anisotropies \(\zeta \). We keep \(w_1=w_3=1\) constant and vary \(w_2\). We see, as shown visually in Tables 1 and 2, that \(\rho _b\) deteriorates as the spatial anisotropy \(\zeta \) increases.

There is a discrepancy in the table worth mentioning. We know from Remark 2 that when \(\zeta = 1\) then \(\rho _b = d\) and thus \(\varepsilon = 0\). Surprisingly, however, \({\tilde{\varepsilon }} \ne 0\) in the \(\zeta = 1\) case in Table 6. This is simply explained by the fact that the numerical solution \({\tilde{d}}\) is not exactly equal to the true distance d. We expect \({\tilde{\varepsilon }}\) to go to 0 in the \(\zeta = 1\) case as the discretization of \(\Omega \) is refined.

We can compare these numerical results to our theoretical results. Namely, we can deduce from Equation (42) that:

$$\begin{aligned} \frac{\left| \rho _b - d \right| }{d} \le \zeta - 1, \end{aligned}$$
(46)

which means

$$\begin{aligned} \varepsilon \le \zeta - 1. \end{aligned}$$
(47)

We therefore expect this bound to hold, approximately, for the numerical mean relative error \({\tilde{\varepsilon }}\) as well. Indeed, in Table 6 we see that \( {\tilde{\varepsilon }} \lessapprox \zeta - 1\).

Interestingly, we see that \({\tilde{\varepsilon }}\) is relatively small compared to our theoretical bound (47), even in the high-anisotropy cases. However, this is only a consequence of the relative smallness of \(\Omega \): if we enlarge \(\Omega \) further and further, \(\varepsilon \) converges to \(\zeta - 1\). This follows from an argument similar to the reasoning in Corollary 4.

Table 6 Numerical mean relative error \({\tilde{\varepsilon }}\) between \(\rho _b\) and d for multiple spatial anisotropies \(\zeta \)

6.2 DCA1

The DCA1 dataset is a publicly available database “consisting of 130 X-ray coronary angiograms, and their corresponding ground-truth image outlined by an expert cardiologist” [59]. One such angiogram and ground-truth can be seen in Fig. 18a and d.

We have split the DCA1 dataset [59] into a training and test set consisting of 125 and 10 images, respectively.

To establish a baseline, we ran a 3, 6, and 12 layer CNN, G-CNN, and PDE-G-CNN on DCA1. The exact architectures are identical or analogous to the ones used in [28, Fig. 15]. For the baseline, the logarithmic distance approximation \(\rho _c\) was used within the PDE-G-CNNs; this is the same approximation that was used in [28]. Every network was trained 10 times for 80 epochs. After every epoch, the average Dice coefficient on the test set was stored, and after every full training the maximum of these average Dice coefficients over all 80 epochs was calculated. This yields 10 maximum average Dice coefficients for every architecture. The result of this baseline can be seen in Fig. 16. The number of parameters of the networks can be found in Table 7. We see that PDE-G-CNNs consistently perform as well as, and sometimes outperform, G-CNNs and CNNs, while having the fewest parameters of all architectures.
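The evaluation protocol can be summarized as follows (a sketch; `make_network`, `train_one_epoch`, and `dice_on_test_set` are hypothetical placeholders, not functions from [28]):

```python
def max_average_dice_per_run(make_network, train_one_epoch, dice_on_test_set,
                             n_runs=10, n_epochs=80):
    """For each run: train for n_epochs, record the average test Dice after every
    epoch, and keep the maximum over all epochs. Returns one score per run."""
    scores = []
    for _ in range(n_runs):
        network = make_network()
        per_epoch_dice = []
        for _ in range(n_epochs):
            train_one_epoch(network)
            per_epoch_dice.append(dice_on_test_set(network))
        scores.append(max(per_epoch_dice))
    return scores   # e.g. 10 maximum average Dice coefficients per architecture
```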

Fig. 16
figure 16

A scatterplot showing how a 3, 6, and 12 layer CNN, G-CNN, and PDE-G-CNN compare on the DCA1 dataset. The crosses indicate the mean. We see the PDE-G-CNNs provide equal or better results with, respectively, 2, 10, and 35 times fewer parameters, see Table 7

Table 7 The total number of parameters in the networks that are used in Fig. 16

To compare the effect of using different approximative distances, we trained the 6 layer PDE-G-CNN (with 2560 parameters) 10 times for 80 epochs using each distance approximation. The results can be found in Figs. 17 and 18. We see that on DCA1 all distance approximations have a comparable performance. We notice a small drop in performance when using \(\rho _{b,sr}\), and a small improvement when using \(\rho _{b,com}\).

Fig. 17
figure 17

A scatterplot showing how the use of different distance approximations affects the performance of the 6 layer PDE-G-CNN on the DCA1 dataset. The crosses indicate the mean

Fig. 18
figure 18

a Input, b \(\rho _c\), c \(\rho _b\), d Truth, e \(\rho _{b,sr}\), f \(\rho _{b,com}\). In Fig. 18a and d, we see one sample from the DCA1 dataset: a coronary angiogram together with the ground-truth segmentation. The other four pictures show the output of the 6 layer PDE-G-CNN, one for each distance approximation. The networks that were used in this figure have an accuracy approximately equal to the mean accuracy in Fig. 17

6.3 Lines

For the line completion problem, we created a dataset of 512 training images and 128 test images (see Footnote 2). Fig. 21a and d shows one sample of the Lines dataset.

To establish a baseline, we ran a 6 layer CNN, G-CNN, and PDE-G-CNN. For this baseline we again used \(\rho _{c}\) within the PDE-G-CNN, but changed the number of channels to 30 and the kernel sizes to [9, 9, 9], bringing the total number of parameters to 6018. By increasing the kernel size, we anticipate that the difference in effectiveness of the different distance approximations, if there is any, becomes more pronounced. Every network was trained 15 times for 60 epochs. The result of this baseline can be seen in Fig. 19. The number of parameters of the networks can be found in Table 8. We again see that the PDE-G-CNN outperforms the G-CNN, which in turn outperforms the CNN, while having the fewest parameters.

Fig. 19
figure 19

A scatterplot showing how a 6 layer CNN, G-CNN (both with \(\approx 25k\) parameters), and a PDE-G-CNN (with only 6k parameters) compare on the Lines dataset. The crosses indicate the mean. For the precise number of parameters, see Table 8

We again test the effect of using different approximative distances by training the 6 layer PDE-G-CNN 15 times for 60 epochs for every approximation. The results can be found in Fig. 20. We see that on the Lines dataset, all distance approximations again have a comparable performance. We again notice an increase in effectiveness when using \(\rho _{b,com}\), just as on the DCA1 dataset. Interestingly, using \(\rho _{b,sr}\) does not seem to hurt the performance on the Lines dataset, which is in contrast with DCA1. This is in line with what one would expect in view of the existing sub-Riemannian line-perception models in neurogeometry. Furthermore, in Fig. 21b,c,e and f some feature maps of a trained PDE-G-CNN are visualized.

7 Conclusion

In this article, we have carefully analyzed how well the nonlinear erosion and dilation parts of PDE-G-CNNs are actually solved on the homogeneous space of 2D positions and orientations \(\mathbb {M}_2\). According to Proposition 1, the Hamilton–Jacobi equations are solved by morphological kernels that are functions of only the exact (sub)-Riemannian distance function. As a result, every approximation of the exact distance yields a corresponding approximative morphological kernel.

Table 8 The total number of parameters in the networks that are used in Fig. 19
Fig. 20
figure 20

A scatterplot showing how the use of different distance approximations affects the performance of the 6 layer PDE-G-CNN on the Lines dataset. The crosses indicate the mean

Fig. 21
figure 21

a Input, d Truth; b, c, e, f feature maps. In Fig. 21a and d, we see one sample from the Lines dataset. The other four pictures are visualizations of feature maps of the 6 layer PDE-G-CNN. In Fig. 21b and e, we see a feature map of the lifting layer together with its max-projection over \(\theta \). In Fig. 21c and f, we see a feature map of the last PDE layer, just before the final projection layer

In Theorem 1, we use this to improve upon the local and global approximations of the relative errors of the erosion and dilation kernels used in the papers [28, 60] where PDE-G-CNNs were first proposed (and shown to outperform G-CNNs). Our new sharper estimates for the distance on \(\mathbb {M}_2\) have bounds that explicitly depend on the metric tensor field coefficients. This allowed us to theoretically underpin the earlier concern expressed in [28, Fig. 10] that when the spatial anisotropy becomes high, the previous morphological kernel approximations [28] become increasingly inaccurate.

Indeed, as we show qualitatively in Table 2 and quantitatively in Sect. 6.1, if the spatial anisotropy \(\zeta \) is high one must resort to sub-Riemannian approximations. Furthermore, we propose a single distance approximation \(\rho _{b,com}\) that works both for low and high spatial anisotropy.

Apart from how well the kernels approximate the PDEs, there is the issue of how well each of the distance approximations performs in applications within the PDE-G-CNNs. In practice, the analytic approximative kernels using \(\rho _b\), \(\rho _c\), \(\rho _{b,com}\) perform similarly. This is not surprising, as our theoretical results Lemma 3 and Corollary 1 reveal that all morphological kernel approximations carry the correct 8 fundamental symmetries of the PDE. Nevertheless, Figs. 17 and 20 do reveal advantages of using the new kernel approximations (in particular \(\rho _{b,com}\)) over the previous kernel \(\rho _c\) in [28].

The experiments also show that the strictly sub-Riemannian distance approximation \(\rho _{b,sr}\) only performs well on applications where sub-Riemannian geometry really applies. For instance, as can be seen in Figs. 17 and 20, on the DCA1 dataset \(\rho _{b,sr}\) performs relatively poorly, whereas on the Lines dataset \(\rho _{b,sr}\) performs well. This is what one would expect in view of sub-Riemannian models and findings in cortical line-perception [37, 38, 40, 41, 46, 61] in neurogeometry.

Besides better accuracy and better performance of the approximative kernels, there is the issue of geometric interpretability. In G-CNNs and CNNs, geometric interpretability is absent, as they include ad-hoc nonlinearities like ReLUs. PDE-G-CNNs instead employ morphological convolutions with kernels that reflect association fields, as visualized in Fig. 5b. In Fig. 8, we see that as network depth increases association fields visually merge in the feature maps of PDE-G-CNNs toward adaptive line detectors, whereas such merging/grouping of association fields is not visible in normal CNNs.

In all cases, the PDE-G-CNNs still outperform G-CNNs and CNNs on the DCA1 and Lines datasets: they have a higher (or equal) performance, while having a huge reduction in network complexity, even when using only 3 layers. Regardless of the choice of kernel \(\rho _c\), \(\rho _b\), \(\rho _{b,sr}\), \(\rho _{b,com}\), the advantage of PDE-G-CNNs over G-CNNs and CNNs is significant, as can be clearly observed in Figs. 16 and 19 and Tables 7 and 8. This is in line with previous observations on other datasets [28].

Altogether, PDE-G-CNNs offer a better reduction of network complexity, better performance, and better geometric interpretability than basic classical feed-forward (G-)CNNs on various segmentation problems.

Extensive investigations on training data reduction, memory reduction (via U-Net versions of PDE-G-CNNs), and a topological description of the merging of association fields are beyond the scope of this article, and are left for future work.