Abstract
Group equivariant convolutional neural networks (GCNNs) have been successfully applied in geometric deep learning. Typically, GCNNs have the advantage over CNNs that they do not waste network capacity on training symmetries that should have been hard-coded in the network. The recently introduced framework of PDE-based GCNNs (PDEGCNNs) generalizes GCNNs. PDEGCNNs have the core advantages that they simultaneously (1) reduce network complexity, (2) increase classification performance, and (3) provide geometric interpretability. Their implementations primarily consist of linear and morphological convolutions with kernels. In this paper, we show that the previously suggested approximative morphological kernels do not always approximate the exact kernels accurately. More specifically, depending on the spatial anisotropy of the Riemannian metric, we argue that one must resort to sub-Riemannian approximations. We solve this problem by providing a new approximative kernel that works regardless of the anisotropy. We provide new theorems with better error estimates of the approximative kernels, and prove that they all carry the same reflectional symmetries as the exact ones. We test the effectiveness of multiple approximative kernels within the PDEGCNN framework on two datasets, and observe an improvement with the new approximative kernels. We report that the PDEGCNNs again allow for a considerable reduction of network complexity while having comparable or better performance than GCNNs and CNNs on the two datasets. Moreover, PDEGCNNs have the advantage of better geometric interpretability over GCNNs, as the morphological kernels are related to association fields from neurogeometry.
1 Introduction
Many classification, segmentation, and tracking tasks in computer vision and digital image processing require some form of “symmetry.” Think, for example, of image classification. If one rotates, reflects, or translates an image, the classification stays the same. We say that an ideal image classification is invariant under these symmetries. A slightly different situation is image segmentation. In this case, if the input image is in some way changed the output should change accordingly. Therefore, an ideal image segmentation is equivariant with respect to these symmetries.
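The distinction between invariance and equivariance can be made concrete with a toy example. The following is a minimal sketch, not part of any network discussed in this paper: the functions `rot90`, `classify`, and `segment` are hypothetical stand-ins for a symmetry, an image classifier, and an image segmentation, respectively.

```python
def rot90(img):
    """Rotate a square image (a list of rows) by 90 degrees."""
    n = len(img)
    return [[img[c][n - 1 - r] for c in range(n)] for r in range(n)]


def classify(img, threshold=2.0):
    """Toy classifier: does the total intensity exceed a threshold?"""
    return sum(sum(row) for row in img) > threshold


def segment(img, level=0.5):
    """Toy segmentation: pixelwise thresholding."""
    return [[1 if v > level else 0 for v in row] for row in img]


image = [[0.0, 0.9, 0.0],
         [0.0, 0.8, 0.0],
         [0.0, 0.7, 0.1]]

# Invariance: the output does not change when the input is rotated.
invariant = classify(rot90(image)) == classify(image)

# Equivariance: rotating the input rotates the output accordingly.
equivariant = segment(rot90(image)) == rot90(segment(image))
```

Both checks hold here because the toy operators only use information that commutes with the rotation; a generic CNN offers no such guarantee for rotations.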
Many computer vision and image processing problems are currently being tackled with neural networks (NNs). It is desirable to design neural networks in such a way that they respect the symmetries of the problem, i.e., make them invariant or equivariant. Think, for example, of a neural network that detects cancer cells. It would be disastrous if slightly translating an image caused the neural network to give a totally different diagnosis, even though the input is essentially the same.
One way to make the networks equivariant or invariant is to simply train them on more data. One could take the training dataset and augment it with translated, rotated, and reflected versions of the original images. This approach, however, is undesirable: invariance or equivariance is still not guaranteed and the training takes longer. It would be better if the networks were inherently invariant or equivariant by design. This avoids a waste of network capacity, guarantees invariance or equivariance, and increases performance; see, for example, [1].
More specifically, many computer vision and image processing problems are tackled with convolutional neural networks (CNNs) [2,3,4]. Convolutional neural networks have the property that they inherently respect, to some degree, translation symmetries. CNNs do not, however, take into account rotational or reflection symmetries. Cohen and Welling introduced group equivariant convolutional neural networks (GCNNs) in [5] and designed a classification network that is inherently invariant under 90 degree rotations, integer translations, and vertical/horizontal reflections. Much work is being done on invariant/equivariant networks that exploit inherent symmetries; a non-exhaustive list is [1, 6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]. The idea of including geometric priors, such as symmetries, into the design of neural networks is called ‘Geometric Deep Learning’ in [27].
In [28], partial differential equation (PDE)-based GCNNs are presented, aptly called PDEGCNNs. In fact, GCNNs are shown to be a special case of PDEGCNNs (if one restricts the PDEGCNNs only to convection, using many transport vectors [28, Sec. 6]). With PDEGCNNs, the usual nonlinearities that are present in current networks, such as the ReLU activation function and max-pooling, are replaced by solvers for specifically chosen nonlinear evolution PDEs. Figure 1 illustrates the difference between a traditional CNN layer and a PDEGCNN layer.
The PDEs that are used in PDEGCNNs are not chosen arbitrarily: they come directly from the world of geometric image analysis, and thus their effects are geometrically interpretable. This makes PDEGCNNs more geometrically meaningful and interpretable than traditional CNNs. Specifically, the PDEs considered are diffusion, convection, dilation, and erosion. These four PDEs correspond to the common notions of smoothing, shifting, max pooling, and min pooling. They are solved by linear convolutions, resamplings, and so-called morphological convolutions. Figure 2 illustrates the basic building block of a PDEGCNN.
One shared property of GCNNs and PDEGCNNs is that the input data usually needs to be lifted to a higher dimensional space. Take, for example, the case of image segmentation with a convolutional neural network where we model/idealize the images as real-valued functions on \(\mathbb {R}^2\). If we keep the data as functions on \(\mathbb {R}^2\) and want the convolutions within the network to be equivariant, then the only convolutions that are allowed are those with isotropic kernels [29, p. 258]. This type of shortcoming generalizes to other symmetry groups as well [12, Thm. 1]. One can imagine that this constraint is too restrictive to work with, and that is why we lift the image data.
Within the PDEGCNN framework, the input images are considered real-valued functions on \(\mathbb {R}^d\), the desired symmetries are represented by the Lie group of roto-translations SE(d), and the data is lifted to the homogeneous space of d-dimensional positions and orientations \(\mathbb {M}_d\). It is on this higher dimensional space that the evolution PDEs are defined, and the effects of diffusion, dilation, and erosion are completely determined by the Riemannian metric tensor field \(\mathcal {G}\) that is chosen on \(\mathbb {M}_d\). If this Riemannian metric tensor field \(\mathcal {G}\) is left-invariant, the overall processing is equivariant; this follows by combining techniques in [30, Thm. 21, Chpt. 4], [31, Lem. 3, Thm. 4].
The Riemannian metric tensor field \(\mathcal {G}\) we will use in this article is left-invariant and determined by three nonnegative parameters: \(w_1\), \(w_2\), and \(w_3\). The definition can be found in the preliminaries, Sect. 2, Equation (4). It is exactly these three parameters that are optimized during the training of a PDEGCNN. Intuitively, the parameters respectively regulate the cost of main spatial, lateral spatial, and angular motion. An important quantity in the analysis of this paper is the spatial anisotropy \(\zeta := \frac{w_2}{w_1}\), as will become clear later.
In this article, we only consider the twodimensional case, i.e., \(d=2\). In this case, the elements of both \(\mathbb {M}_2\) and SE(2) can be represented by three real numbers: \((x,y,\theta ) \in \mathbb {R}^2 \times [0,2\pi )\). In the case of \(\mathbb {M}_2\), the x and y represent a position and \(\theta \) represents an orientation. Throughout the article, we take \(\textbf{p}_0:= (0,0,0) \in \mathbb {M}_2\) as our reference point in \(\mathbb {M}_2\). In the case of SE(2), we have that x and y represent a translation and \(\theta \) a rotation.
As already stated, within the PDEGCNN framework images are lifted to the higher dimensional space of positions and orientations \(\mathbb {M}_d\). There are a multitude of ways of achieving this, but there is one very natural way to do it: the orientation score transform [30, 32,33,34]. In this transform, we pick a point \((x,y) \in \mathbb {R}^2\) in an image and determine how well a certain orientation \(\theta \in [0, 2\pi )\) fits the chosen point. In Fig. 3 an example of an orientation score is given. We refer to [34, Sec. 2.1] for a summary of how an orientation score transform works.
Inspiration for using orientation scores comes from biology. The Nobel laureates Hubel and Wiesel found that many cells in the visual cortex of cats have a preferred orientation [35, 36]. Moreover, a neuron that fires for a specific orientation excites neighboring neurons that have an “aligned” orientation. Petitot and Citti and Sarti proposed a model [37, 38] for the distribution of the orientation preference and this excitation of neighbors based on sub-Riemannian geometry on \(\mathbb {M}_2\). They relate the phenomenon of preference of aligned orientations to the concept of association fields [39], which model how a specific local orientation places expectations on surrounding orientations in human vision. Figure 4 provides an impression of such an association field.
As shown in [42, Fig. 17], association fields are closely approximated by (projected) subRiemannian geodesics in \(\mathbb {M}_2\) for which optimal synthesis has been obtained by Sachkov and Moiseev [43, 44]. Furthermore, in [45] it is shown that the Riemannian geodesics in \(\mathbb {M}_2\) converge to the subRiemannian geodesics by increasing the spatial anisotropy \(\zeta \) of the metric. This shows that in practice one can approximate the subRiemannian model by Riemannian models. Figure 5 shows the relation between association fields and subRiemannian geometry in \(\mathbb {M}_2\).
The relation between association fields and Riemannian geometry on \(\mathbb {M}_2\) directly extends to a relation between dilation/erosion and association fields. Namely, performing dilation on an orientation score in \(\mathbb {M}_2\) is similar to extending a line segment along its association field lines. Similarly, performing erosion is similar to sharpening a line segment perpendicular to its association field lines. This makes dilation/erosion the perfect candidate for a task such as line completion.
In the line completion problem, the input is an image containing multiple line segments, and the desired output is an image of the line that is “hidden” in the input image. Figure 6 shows such an input and desired output. This is also what David Field et al. studied in [39]. We anticipate that PDEGCNNs outperform classical CNNs in the line completion problem due to PDEGCNNs being able to dilate and erode. To investigate this, we made a synthetic dataset called “Lines” consisting of grayscale \(64\times 64\) pixel images, together with their groundtruth line completion. In Fig. 7, a complete abstract overview of the architecture of a PDEGCNN performing line completion is visualized. Figure 8 illustrates how a PDEGCNN and CNN incrementally complete a line throughout their layers.
In Proposition 1, we show that solving the dilation and erosion PDEs can be done by performing a morphological convolution with a morphological kernel \(k_t^{\alpha }: \mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\), which is easily expressed in the Riemannian distance \(d=d_{\mathcal {G}}\) on the manifold:
\[ k_t^{\alpha }(\textbf{p}) = \frac{t}{\beta } \left( \frac{d_{\mathcal {G}}(\textbf{p}_0,\textbf{p})}{t} \right) ^{\beta }. \tag{1} \]
Here \(\textbf{p}_0 = (0,0,0)\) is our reference point in \(\mathbb {M}_2\), and time \(t>0\) controls the amount of erosion and dilation. Furthermore, \(\alpha >1\) controls the “softness” of the max- and min-pooling, with \(\frac{1}{\alpha }+\frac{1}{\beta }=1\). Erosion is done through a direct morphological convolution (3) with this specific kernel. Dilation is solved in a slightly different way but again with the same kernel (Proposition 1 in Sect. 3 will explain the details).
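To make the role of \(t\) and \(\alpha \) tangible, the kernel can be evaluated as a function of a given distance value. The sketch below assumes the form \(k_t^{\alpha } = \frac{t}{\beta }(d/t)^{\beta }\), which is the form consistent with the Lagrangian used in Sect. 3; the function name `morphological_kernel` is ours and purely illustrative.

```python
def morphological_kernel(distance, t, alpha):
    """Evaluate (t / beta) * (distance / t) ** beta with 1/alpha + 1/beta = 1.

    Assumed kernel form; `distance` is any (exact or approximative)
    distance to the reference point p0.
    """
    if alpha <= 1.0 or t <= 0.0:
        raise ValueError("requires alpha > 1 and t > 0")
    beta = alpha / (alpha - 1.0)  # conjugate exponent of alpha
    return (t / beta) * (distance / t) ** beta


# For alpha = 2 the kernel is the familiar quadratic d^2 / (2 t).
quadratic_value = morphological_kernel(3.0, 1.5, 2.0)

# For large alpha, beta tends to 1 and the kernel approaches the distance itself.
near_conic_value = morphological_kernel(3.0, 1.5, 1000.0)
```

A larger \(\alpha \) therefore flattens the kernel toward a cone, which corresponds to a "harder" pooling, while \(\alpha = 2\) gives the quadratic kernel.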
And this is where a problem arises: calculating the exact distance d on \(\mathbb {M}_2\) required in (1) is computationally expensive [47]. To alleviate this issue, we resort to estimating the true distance d with computationally efficient approximative distances, denoted throughout the article by \(\rho \). We then use such a distance approximation within (1) to create a corresponding approximative morphological kernel, and in turn use this to efficiently calculate the effect of dilation and erosion.
In [28], one such distance approximation is used: the logarithmic distance estimate \(\rho _c\) which uses the logarithmic coordinates \(c^i\) (8). In short, \(\rho _c(\textbf{p})\) is equal to the Riemannian length of the exponential curve that connects \(\textbf{p}_0\) to \(\textbf{p}\). The formal definition will follow in Sect. 4. In Fig. 9 an impression of \(\rho _c\) is given.
Clearly, an error is made when the effect of erosion and dilation is calculated with an approximative morphological kernel. As a morphological kernel is completely determined by its corresponding (approximative) distance, it follows that one can analyze the error by analyzing the difference between the exact distance d and approximative distance \(\rho \) that is used.
Although it is shown in [28] that \(d \le \rho _c\), no concrete bounds are given, apart from the asymptotic \( \rho _c^2 \le d^2 + \mathcal {O}(d^4) \). This motivates us to do a more in-depth analysis of the quality of the distance approximations.
We introduce a variation on the logarithmic estimate \(\rho _c\) called the half-angle distance estimate \(\rho _b\), and analyze it. The half-angle approximation uses not the logarithmic coordinates but half-angle coordinates \(b^i\), whose definition is given later in (28). In practice, \(\rho _c\) and \(\rho _b\) do not differ much, but analyzing \(\rho _b\) is much easier.
The main theorem of the paper, Theorem 1, collects new theoretical results that describe the quality of using the half-angle distance approximation \(\rho _b\) for solving dilation and erosion in practice. It relates the approximative morphological kernel \(k_b\) corresponding with \(\rho _b\), to the exact kernel k (1).
Both the logarithmic estimate \(\rho _c\) and half-angle estimate \(\rho _b\) approximate the true Riemannian distance d quite well in certain cases. One of these cases is when the Riemannian metric has a low spatial anisotropy \(\zeta \). We can show this visually by comparing the isocontours of the exact and approximative distances. However, interpreting and comparing these surfaces can be difficult. This is why we have decided to additionally plot multiple \(\theta \)-isocontours of these surfaces. Figure 10 shows one such plot and illustrates how it must be interpreted.
In Table 1, a spatially isotropic case \(\zeta = 1\) and a low-anisotropy case \(\zeta = 2\) are visualized. Note that \(\rho _b\) approximates d well in these cases. In fact, \(\rho _b\) is exactly equal to the true distance d in the spatially isotropic case, which is not true for \(\rho _c\).
Both the logarithmic and half-angle approximations fail specifically in the high spatial anisotropy regime, for example when \(\zeta = 8\). The first two columns of Table 2 show that, indeed, \(\rho _b\) is no longer a good approximation of the exact distance d. For this reason, we introduce a novel sub-Riemannian distance approximation \(\rho _{b, sr}\), which is visualized in the third column of Table 2.
Finally, we propose an approximative distance \(\rho _{com}\) that carefully combines the Riemannian and subRiemannian approximations into one. This combined approximation automatically switches to the estimate that is more appropriate depending on the spatial anisotropy, and hence covers both the low and high anisotropy regimes. Using the corresponding morphological kernel of \(\rho _{com}\) to solve erosion and dilation, we obtain more accurate (and still tangible) solutions of the nonlinear parts in the PDEGCNNs.
For every distance approximation (listed in Sect. 4), we perform an empirical analysis in Sect. 6 by seeing how the estimate changes the performance of the PDEGCNNs when applied to two datasets: the Lines dataset and publicly available DCA1 dataset.
1.1 Contributions
In Proposition 1, we summarize how the nonlinear units in PDEGCNNs (described by morphological PDEs) are solved using morphological kernels and convolutions, which provides sufficient and essential background for the discussions and results in this paper.
The key contributions of this article are:

Theorem 1 summarizes our mathematical analysis of the quality of the half-angle distance approximation \(\rho _b\) and its corresponding morphological kernel \(k_b\) in PDEGCNNs. We do this by comparing \(k_b\) to the exact morphological kernel k. Globally, one can show that they both carry the same symmetries, and that for low spatial anisotropies \(\zeta \) they are almost indistinguishable. Furthermore, we show that locally both kernels are similar through an upper bound on the relative error. This improves upon results in [28, Lem. 20].

Table 2 demonstrates qualitatively that \(\rho _b\) becomes a poor approximation when the spatial anisotropy is high \(\zeta \gg 1\). In Corollary 4, we underpin this theoretically and in Sect. 6.1 we validate this observation numerically. This motivates the use of a subRiemannian approximation when \(\zeta \) is large.

In Sect. 4, we introduce and derive a novel sub-Riemannian distance approximation \(\rho _{sr}\) that overcomes difficulties in existing sub-Riemannian kernel approximations [48]. Subsequently, we propose our approximation \(\rho _{com}\), which combines the Riemannian and sub-Riemannian approximations into one that automatically switches to the approximation that is more appropriate depending on the metric parameters.

Figures 16 and 19 show that PDEGCNNs perform just as well as, and sometimes better than, GCNNs and CNNs on the DCA1 and Lines datasets, while having the fewest parameters. Figures 20 and 17 depict an evaluation of the performance of PDEGCNNs when using the different distance approximations, again on the DCA1 and Lines datasets. We observe that the kernel corresponding to the new approximation \(\rho _{b,com}\) provides the best results.
Our theoretical contributions are also relevant outside the context of geometric deep learning. Namely, they also apply to general geometric image processing [48], neurogeometry [37, 38], and robotics [49, Sec. 6.8.4].
In addition, Figs. 4, 5, 9 and 8 show a connection between the PDEGCNN framework and the theory of association fields from neurogeometry [37, 39]. Thereby, PDEGCNNs offer improved geometrical interpretability in comparison with existing convolutional neural networks. In Appendix 1, we further clarify the geometrical interpretability.
1.2 Outline
In Sect. 2, a short overview of the necessary mathematical preliminaries is given. Section 3 collects some known results on the exact solution of erosion and dilation on the homogeneous space of two-dimensional positions and orientations \(\mathbb {M}_2\), and motivates the use of morphological kernels. In Sect. 4, all approximative distances are listed. The approximative distances give rise to corresponding approximative morphological kernels. The main theorem of this paper can be found in Sect. 5 and consists of three parts, of which the proofs can be found in the relevant subsections. The main theorem mostly concerns itself with the analysis of the approximative morphological kernel \(k_b\). Experiments with various approximative kernels are reported in Sect. 6. Finally, we end the paper with a conclusion in Sect. 7.
2 Preliminaries
Coordinates on SE(2) and \(\mathbb {M}_2\). Let \(G = SE(2) = \mathbb {R}^2 \rtimes SO(2)\) be the two-dimensional rigid body motion group. We identify elements \(g \in G\) with \(g \equiv (x,y,\theta ) \in \mathbb {R}^2 \times \mathbb {R}/(2\pi \mathbb {Z})\), via the isomorphism \(SO(2) \cong \mathbb {R}/(2\pi \mathbb {Z})\). Furthermore, we always use the small-angle identification \( \mathbb {R}/(2\pi \mathbb {Z}) = [-\pi , \pi )\).
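For \(g_1=(x_1, y_1, \theta _1)\), \(g_2 = (x_2, y_2, \theta _2) \in SE(2)\) we have the group product
\[ g_1 g_2 = (x_1 + x_2 \cos \theta _1 - y_2 \sin \theta _1,\; y_1 + x_2 \sin \theta _1 + y_2 \cos \theta _1,\; \theta _1 + \theta _2 \bmod 2\pi ), \]
and the identity is \(e = (0,0,0)\). The rigid body motion group acts on the homogeneous space of two-dimensional positions and orientations \(\mathbb {M}_{2} = \mathbb {R}^2 \times S^1 \subseteq \mathbb {R}^2 \times \mathbb {R}^2\) by the left-action \(\odot \):
\[ (\textbf{x},\textbf{R}) \odot (\textbf{y},\textbf{n}) = (\textbf{x} + \textbf{R}\textbf{y},\, \textbf{R}\textbf{n}), \]
with \((\textbf{x},\textbf{R}) \in SE(2)\) and \((\textbf{y},\textbf{n}) \in \mathbb {M}_2\). If context allows it, we may omit writing \(\odot \) for conciseness. By choosing the reference element \(\textbf{p}_0 = (0,0,(1,0)) \in \mathbb {M}_2\), we have:
\[ (x,y,\theta ) \odot \textbf{p}_0 = (x,y,(\cos \theta , \sin \theta )). \tag{2} \]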
This mapping is a diffeomorphism and allows us to identify SE(2) and \(\mathbb {M}_2\). Thereby we will also freely use the \((x,y,\theta )\) coordinates on \(\mathbb {M}_2\).
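The group structure in \((x,y,\theta )\) coordinates can be sketched in a few lines. The sketch assumes the standard roto-translation product (rotate the second translation by \(\theta _1\), then add) and the identification of \(\mathbb {M}_2\) with SE(2) described above; the helper names `wrap_angle`, `se2_product`, and `se2_inverse` are ours.

```python
import math


def wrap_angle(theta):
    """Map an angle to the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def se2_product(g1, g2):
    """Group product g1 g2 on SE(2) in (x, y, theta) coordinates."""
    x1, y1, th1 = g1
    x2, y2, th2 = g2
    c, s = math.cos(th1), math.sin(th1)
    return (x1 + c * x2 - s * y2, y1 + s * x2 + c * y2, wrap_angle(th1 + th2))


def se2_inverse(g):
    """Inverse element: rotate the negated translation back by -theta."""
    x, y, th = g
    c, s = math.cos(th), math.sin(th)
    return (-(c * x + s * y), -(-s * x + c * y), wrap_angle(-th))


P0 = (0.0, 0.0, 0.0)  # reference point, identified with the identity e

g = (1.0, 2.0, 0.7)
# Under the identification, acting with g on p0 returns the point "g" itself.
p = se2_product(g, P0)
```

With the identification \(g \leftrightarrow g \odot \textbf{p}_0\), the same `se2_product` also serves as the left action of SE(2) on \(\mathbb {M}_2\).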
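Morphological group convolution. Given functions \(f_1,f_2:\mathbb {M}_2 \rightarrow \mathbb {R}\), we define their morphological convolution (or ‘infimal convolution’) [50, 51] by
\[ (f_1 \,\square \, f_2)(\textbf{p}) = \inf _{g \in G} \left\{ f_1(g^{-1}\textbf{p}) + f_2(g \textbf{p}_0) \right\} . \tag{3} \]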
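Left-invariant (co)vector fields on \(\mathbb {M}_2\). Throughout this paper, we shall rely on the following basis of left-invariant vector fields:
\[ \mathcal {A}_1 = \cos \theta \, \partial _x + \sin \theta \, \partial _y, \quad \mathcal {A}_2 = -\sin \theta \, \partial _x + \cos \theta \, \partial _y, \quad \mathcal {A}_3 = \partial _\theta . \]
The dual frame \(\omega ^i\) is given by \(\langle \omega ^i, \mathcal {A}_{j}\rangle =\delta ^{i}_j\), i.e.,
\[ \omega ^1 = \cos \theta \, \textrm{d}x + \sin \theta \, \textrm{d}y, \quad \omega ^2 = -\sin \theta \, \textrm{d}x + \cos \theta \, \textrm{d}y, \quad \omega ^3 = \textrm{d}\theta . \]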
Metric tensor fields on \(\mathbb {M}_2\). We consider the following left-invariant metric tensor fields:
\[ \mathcal {G} = \sum _{i=1}^{3} w_i^2 \, \omega ^i \otimes \omega ^i, \tag{4} \]
and write \(\Vert {\dot{\textbf{p}}}\Vert =\sqrt{\mathcal {G}_{\textbf{p}}({\dot{\textbf{p}}},{\dot{\textbf{p}}})}\). Here, \(w_i > 0\) are the metric parameters. We also use the dual norm \(\Vert {\hat{\textbf{p}}}\Vert _* = \sup \limits _{{{\dot{\textbf{p}}}} \in T_\textbf{p}\mathbb {M}_2} \frac{\left\langle {{\dot{\textbf{p}}}}, {\hat{\textbf{p}}} \right\rangle }{\Vert {{\dot{\textbf{p}}}}\Vert }\). We will assume, without loss of generality, that \(w_2 \ge w_1\) and introduce the ratio
\[ \zeta := \frac{w_2}{w_1} \ge 1, \tag{5} \]
that is called the spatial anisotropy of the metric.
Distances on \(\mathbb {M}_2\). The left-invariant metric tensor field \(\mathcal {G}\) on \(\mathbb {M}_2\) induces a left-invariant distance (‘Riemannian metric’) \(d:\mathbb {M}_{2} \times \mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\) by
\[ d_{\mathcal {G}}(\textbf{p},\textbf{q}) := \inf _{\gamma \in \Gamma _t(\textbf{p},\textbf{q})} L_{\mathcal {G}}(\gamma ), \qquad L_{\mathcal {G}}(\gamma ) := \int _0^t \Vert {\dot{\gamma }}(s)\Vert _{\mathcal {G}} \, \textrm{d}s, \tag{6} \]
where \(\Gamma _t(\textbf{p}, \textbf{q})\) is the set of piecewise \(C^1\)-curves \(\gamma \) in \(\mathbb {M}_2\) with \(\gamma (0)=\textbf{p}\) and \(\gamma (t)=\textbf{q}\). The right-hand side does not depend on \(t>0\), and we may set \(t=1\).
If no confusion can arise, we omit the subscript \(\mathcal {G}\) and write \(d, L, \Vert \cdot \Vert \) for short. The distance being left-invariant means that for all \(g\in SE(2)\), \(\textbf{p},\textbf{q} \in \mathbb {M}_2\) one has \(d(\textbf{p},\textbf{q})=d(g \textbf{p},g \textbf{q})\). We will often use the shorthand notation \(d(\textbf{p}):=d(\textbf{p}, \textbf{p}_0)\).
We often consider the sub-Riemannian case arising when \(w_2 \rightarrow \infty \). Then we have “infinite cost” for sideways motion and the only “permissible” curves \(\gamma \) are the ones for which \({{\dot{\gamma }}}(t) \in H\) where \(H:= \text {span}\{\mathcal {A}_1, \mathcal {A}_3\} \subset T\mathbb {M}_{2}\). This gives rise to a new notion of distance, namely the sub-Riemannian distance \(d_{sr}\):
\[ d_{sr}(\textbf{p},\textbf{q}) := \inf _{\substack{\gamma \in \Gamma _t(\textbf{p},\textbf{q}) \\ {\dot{\gamma }} \in H}} L_{\mathcal {G}}(\gamma ). \tag{7} \]
One can show rigorously that when \(w_2 \rightarrow \infty \) the Riemannian distance d tends to the subRiemannian distance \(d_{sr}\), see for example [45, Thm. 2].
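Exponential and Logarithm on SE(2). The exponential map \(\exp (c^1 \partial _x \vert _e + c^2 \partial _y \vert _e + c^3 \partial _\theta \vert _e) = (x,y,\theta ) \in SE(2)\) is given by:
\[ (x,y,\theta ) = \begin{cases} \left( \dfrac{c^1 \sin c^3 - c^2 (1-\cos c^3)}{c^3},\; \dfrac{c^1 (1-\cos c^3) + c^2 \sin c^3}{c^3},\; c^3 \bmod 2\pi \right) &{} \text {if } c^3 \ne 0, \\ (c^1, c^2, 0) &{} \text {if } c^3 = 0. \end{cases} \]
And the logarithm: \(\log (x,y,\theta ) = c^1 \partial _x\vert _e + c^2 \partial _y\vert _e + c^3 \partial _\theta \vert _e \in T_eSE(2)\):
\[ (c^1, c^2, c^3) = \begin{cases} \left( \dfrac{\theta }{2}\left( y + x \cot \dfrac{\theta }{2}\right) ,\; \dfrac{\theta }{2}\left( -x + y \cot \dfrac{\theta }{2}\right) ,\; \theta \right) &{} \text {if } \theta \ne 0, \\ (x, y, 0) &{} \text {if } \theta = 0. \end{cases} \tag{8} \]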
By virtue of equation (2), we will freely use the logarithm coordinates on \(\mathbb {M}_2\).
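The exponential and logarithm can be checked against each other numerically. The following sketch assumes the standard closed-form expressions for SE(2) in the basis \(\partial _x\vert _e, \partial _y\vert _e, \partial _\theta \vert _e\), with \(\theta \in [-\pi ,\pi )\); the function names `se2_exp` and `se2_log` are ours.

```python
import math


def se2_exp(c1, c2, c3):
    """Exponential map: logarithmic coordinates -> (x, y, theta)."""
    if abs(c3) < 1e-12:
        return (c1, c2, 0.0)  # pure translation
    s, c = math.sin(c3), math.cos(c3)
    x = (c1 * s - c2 * (1.0 - c)) / c3
    y = (c1 * (1.0 - c) + c2 * s) / c3
    return (x, y, c3)


def se2_log(x, y, theta):
    """Logarithm: (x, y, theta) with theta in [-pi, pi) -> (c1, c2, c3)."""
    if abs(theta) < 1e-12:
        return (x, y, 0.0)
    half = 0.5 * theta
    cot = math.cos(half) / math.sin(half)
    return (half * (y + x * cot), half * (-x + y * cot), theta)


# Round trip: log(exp(c)) should return c for |c3| < pi.
c = (0.8, -0.3, 1.2)
c_back = se2_log(*se2_exp(*c))
```

The exponential curves \(s \mapsto \exp (s\,c)\) are circular arcs (or straight lines when \(c^3 = 0\)), which is why the logarithmic coordinates are cheap to evaluate.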
3 Erosion and Dilation
We will be considering the following Hamilton–Jacobi equation on \(\mathbb {M}_2\):
\[ \begin{cases} \dfrac{\partial W_\alpha }{\partial t} = \pm \mathcal {H}_\alpha (\textrm{d}W_\alpha ), \\ W_\alpha \vert _{t=0} = U, \end{cases} \tag{9} \]
with the Hamiltonian \(\mathcal {H}_\alpha : T^*\mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\):
\[ \mathcal {H}_\alpha ({\hat{\textbf{p}}}) = \mathcal {H}_\alpha ^{1D}(\Vert {\hat{\textbf{p}}}\Vert _*) = \frac{1}{\alpha } \Vert {\hat{\textbf{p}}}\Vert _*^{\alpha }, \]
and where \(W_\alpha \) is the viscosity solution [52] obtained from the initial condition \(U \in C( \mathbb {M}_{2},\mathbb {R})\). Here the \(+\)-sign gives a dilation scale space and the \(-\)-sign gives an erosion scale space [50, 51]. If confusion cannot arise, we omit the superscript 1D. Erosion and dilation correspond to min- and max-pooling, respectively. The Lagrangian \(\mathcal {L}_\alpha : T\mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\) corresponding with this Hamiltonian is obtained by taking the Fenchel transform of the Hamiltonian:
\[ \mathcal {L}_\alpha ({\dot{\textbf{p}}}) = \mathcal {L}_\alpha ^{1D}(\Vert {\dot{\textbf{p}}}\Vert ) = \frac{1}{\beta } \Vert {\dot{\textbf{p}}}\Vert ^{\beta }, \]
with \(\beta \) such that \(\frac{1}{\alpha } + \frac{1}{\beta } = 1\). Again, if confusion cannot arise, we omit the subscript \(\alpha \) and/or superscript 1D. We deviate from our previous work by including the factor \(\frac{1}{\alpha }\) and working with a power of \(\alpha \) instead of \(2\alpha \). We do this because it simplifies the relation between the Hamiltonian and Lagrangian.
The following proposition collects standard results in terms of the solutions of Hamilton–Jacobi equations on manifolds [53,54,55], thereby generalizing results on \(\mathbb {R}^2\) to \(\mathbb {M}_2\).
Proposition 1
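(Solution erosion & dilation) Let \(\alpha > 1\). The viscosity solution \(W_\alpha \) of the erosion PDE (9) is given by
\begin{align} W_\alpha (\textbf{p},t) &= \inf _{\textbf{q} \in \mathbb {M}_2,\, \gamma \in \Gamma _t(\textbf{p},\textbf{q})} U(\textbf{q}) + \int _0^t \mathcal {L}_\alpha ({\dot{\gamma }}(s)) \, \textrm{d}s \tag{10} \\ &= \inf _{\textbf{q} \in \mathbb {M}_2} U(\textbf{q}) + t \, \mathcal {L}_\alpha ^{1D}\left( d(\textbf{p},\textbf{q})/t \right) \tag{11} \\ &= (k_t^{\alpha } \,\square \, U)(\textbf{p}), \tag{12} \end{align}
where the morphological kernel \(k_t^{\alpha }: \mathbb {M}_{2} \rightarrow \mathbb {R}_{\ge 0}\) is defined as:
\[ k_t^{\alpha } := t \, \mathcal {L}_\alpha ^{1D}(d/t) = \frac{t}{\beta } \left( \frac{d}{t} \right) ^{\beta }. \tag{13} \]
Furthermore, the Riemannian distance \(d:=d(\textbf{p}_0,\cdot )\) is the viscosity solution of the eikonal PDE
\[ \Vert \nabla _{\mathcal {G}}\, d \Vert ^2 = \sum _{i=1}^{3} \left( \frac{\mathcal {A}_i d}{w_i} \right) ^2 = 1, \tag{14} \]
with boundary condition \(d(\textbf{p}_0)=0\). Likewise the viscosity solution of the dilation PDE is
\[ W_\alpha (\textbf{p},t) = -(k_t^{\alpha } \,\square \, -U)(\textbf{p}). \tag{15} \]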
Proof
It is shown by Fathi in [54, Prop. 5.3] that (10) is a viscosity solution of the Hamilton–Jacobi equation (9) on a complete connected Riemannian manifold without boundary, under some (weak) conditions on the Hamiltonian and with the initial condition U being Lipschitz. In [53, Thm. 2], a similar statement is given but only for compact connected Riemannian manifolds, again under some weak conditions on the Hamiltonian but without any on the initial condition. Next, we employ these existing results and provide a selfcontained proof of (11) and (12).
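Because we are looking at a specific class of Lagrangians, the solutions can be equivalently written as (11). In [53, Prop. 2], this form can also be found. Namely, the Lagrangian \(\mathcal {L}_\alpha ^{1D}\) is convex for \(\alpha > 1\), so for any curve \(\gamma \in \Gamma _t:= \Gamma _t(\textbf{p}, \textbf{q})\) we have by direct application of Jensen’s inequality (omitting the superscript 1D):
\[ \mathcal {L}_\alpha \left( \frac{1}{t} \int _0^t \Vert {\dot{\gamma }}(s)\Vert \, \textrm{d}s \right) \le \frac{1}{t} \int _0^t \mathcal {L}_\alpha (\Vert {\dot{\gamma }}(s)\Vert ) \, \textrm{d}s, \]
with equality if \(\Vert {{\dot{\gamma }}}\Vert \) is constant. This means that:
\[ \inf _{\gamma \in \Gamma _t} t \, \mathcal {L}_\alpha (L(\gamma )/t) \le \inf _{\gamma \in \Gamma _t} \int _0^t \mathcal {L}_\alpha (\Vert {\dot{\gamma }}(s)\Vert ) \, \textrm{d}s, \tag{16} \]
where \(L(\gamma ):=L_{\mathcal {G}}(\gamma )\), recall (6), is the length of the curve \(\gamma \). Consider the subset of curves with constant speed \({\tilde{\Gamma }}_t = \{ \gamma \in \Gamma _t \mid \Vert {{\dot{\gamma }}}\Vert = L(\gamma )/t\} \subset \Gamma _t\). Optimizing over a subset can never decrease the infimum so we have:
\[ \inf _{\gamma \in \Gamma _t} \int _0^t \mathcal {L}_\alpha (\Vert {\dot{\gamma }}(s)\Vert ) \, \textrm{d}s \le \inf _{\gamma \in {\tilde{\Gamma }}_t} \int _0^t \mathcal {L}_\alpha (\Vert {\dot{\gamma }}(s)\Vert ) \, \textrm{d}s = \inf _{\gamma \in {\tilde{\Gamma }}_t} t \, \mathcal {L}_\alpha (L(\gamma )/t). \]
The r.h.s. of this equation is equal to the l.h.s. of equation (16), as the length of a curve is independent of its parameterization. Thereby we have equality in (16). By monotonicity of \(\mathcal {L}_\alpha \) on \(\mathbb {R}_{>0}\), we may then conclude that:
\[ \inf _{\gamma \in \Gamma _t} \int _0^t \mathcal {L}_\alpha (\Vert {\dot{\gamma }}(s)\Vert ) \, \textrm{d}s = \inf _{\gamma \in \Gamma _t} t \, \mathcal {L}_\alpha (L(\gamma )/t) = t \, \mathcal {L}_\alpha \left( d(\textbf{p},\textbf{q})/t \right) . \]
That we can write the solution as (12) is a consequence of the left-invariant metric on the manifold. A similar derivation can be found in [28, Thm. 30]:
\[ \inf _{\textbf{q} \in \mathbb {M}_2} U(\textbf{q}) + t \, \mathcal {L}_\alpha \left( d(\textbf{p},\textbf{q})/t \right) = \inf _{g \in G} U(g \textbf{p}_0) + t \, \mathcal {L}_\alpha \left( d(g^{-1}\textbf{p},\textbf{p}_0)/t \right) = \inf _{g \in G} U(g \textbf{p}_0) + k_t^{\alpha }(g^{-1}\textbf{p}) = (k_t^{\alpha } \,\square \, U)(\textbf{p}). \]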
It is shown in [55, Thm. 6.24] for complete connected Riemannian manifolds that the distance map \( d(\textbf{p}) \) is a viscosity solution of the Eikonal equation (14).
Finally, solutions of erosion and dilation PDEs correspond to each other. If \(W_\alpha \) is the viscosity solution of the erosion PDE with initial condition U, then \(-W_\alpha \) is the viscosity solution of the dilation PDE, with initial condition \(-U\). This means that the viscosity solution of the dilation PDE is given by (15). \(\square \)
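The structure of Proposition 1 can be illustrated in a much simpler setting. The sketch below is a one-dimensional Euclidean analogue, not the \(\mathbb {M}_2\) implementation: the signal lives on a regular grid on \(\mathbb {R}\), the distance is \(|x-y|\), and the kernel is assumed to have the form \(\frac{t}{\beta }(d/t)^{\beta }\). Erosion is a direct morphological convolution, and dilation is obtained by the sign flip \(U \mapsto -U\) described in the proof. All function names are ours.

```python
def kernel(d, t, alpha):
    """Assumed kernel (t / beta) * (|d| / t) ** beta, 1/alpha + 1/beta = 1."""
    beta = alpha / (alpha - 1.0)
    return (t / beta) * (abs(d) / t) ** beta


def erode(u, t, alpha, h=1.0):
    """Morphological (infimal) convolution of samples u with the kernel.

    h is the grid spacing; the infimum runs over all grid points.
    """
    n = len(u)
    return [min(u[j] + kernel((i - j) * h, t, alpha) for j in range(n))
            for i in range(n)]


def dilate(u, t, alpha, h=1.0):
    """Dilation via erosion of the negated signal: -(k [] -u)."""
    return [-v for v in erode([-v for v in u], t, alpha, h)]


signal = [0.0, 0.0, 1.0, 0.0, 0.0]
eroded = erode(signal, t=1.0, alpha=2.0)    # soft min-pooling
dilated = dilate(signal, t=1.0, alpha=2.0)  # soft max-pooling
```

In this toy case the peak is lowered by erosion and spread out by dilation, and the eroded and dilated signals bracket the input pointwise.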
4 Distance Approximations
To calculate the morphological kernel \(k_t^\alpha \) (13), we need the exact Riemannian distance d (6), but calculating this is computationally demanding. To alleviate this problem, we approximate the exact distance \(d(\textbf{p}_0, \cdot )\) with approximative distances, denoted with \(\rho : \mathbb {M}_2 \rightarrow \mathbb {R}_{\ge 0}\), which are computationally cheap. To this end, we define the logarithmic distance approximation \(\rho _c\), as explained in [28, Def.19] and [56, Def.6.1.2], by
\[ \rho _c := \Vert \log (\cdot )\Vert _{\mathcal {G}} = \sqrt{(w_1 c^1)^2 + (w_2 c^2)^2 + (w_3 c^3)^2}. \tag{17} \]
Note that all approximative distances \(\rho \) can be extended to something that looks like a metric on \(\mathbb {M}_2\). For example, we can define:
But this is almost always not a true metric in the sense that it does not satisfy the triangle inequality. So in this sense an approximative distance is not necessarily a true distance. However, we will keep referring to them as approximative distances as we only require them to look like the exact Riemannian distance \(d(\textbf{p}_0, \cdot )\).
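As an illustration of how cheap the logarithmic estimate is, the following sketch evaluates \(\rho _c\) from the logarithmic coordinates. It assumes that \(\rho _c\) is the weighted Euclidean norm \(\sqrt{(w_1 c^1)^2 + (w_2 c^2)^2 + (w_3 c^3)^2}\) of the logarithm and that \(\theta \in [-\pi ,\pi )\); the names `log_coordinates` and `rho_c` are ours.

```python
import math


def log_coordinates(x, y, theta):
    """Logarithmic coordinates (c1, c2, c3) of (x, y, theta)."""
    if abs(theta) < 1e-12:
        return (x, y, 0.0)
    half = 0.5 * theta
    cot = math.cos(half) / math.sin(half)
    return (half * (y + x * cot), half * (-x + y * cot), theta)


def rho_c(p, w):
    """Logarithmic distance estimate from p0 to p for weights w = (w1, w2, w3)."""
    c1, c2, c3 = log_coordinates(*p)
    w1, w2, w3 = w
    return math.sqrt((w1 * c1) ** 2 + (w2 * c2) ** 2 + (w3 * c3) ** 2)


weights = (1.0, 3.0, 0.5)
forward = rho_c((2.0, 0.0, 0.0), weights)   # pure forward motion costs w1 * |x|
sideways = rho_c((0.0, 2.0, 0.0), weights)  # pure sideways motion costs w2 * |y|
rotation = rho_c((0.0, 0.0, 1.0), weights)  # pure rotation costs w3 * |theta|
```

Along the three coordinate axes the estimate reduces to the weighted coordinate, as one would expect from the interpretation of \(w_1\), \(w_2\), and \(w_3\).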
As already stated in the introduction, Riemannian distance approximations such as \(\rho _c\) begin to fail in the high spatial anisotropy cases \(\zeta \gg 1\). For these situations, we need subRiemannian distance approximations. In previous literature, two such subRiemannian approximations are suggested. The first one is standard [57, Sec. 6], the second one is a modified smooth version [29, p. 284], also seen in [48, eq. 14]:
In [48], \(\nu \approx 44\) is empirically suggested. Note that the subRiemannian approximations rely on the assumption that \(w_2 \ge w_1\).
However, they both suffer from a major shortcoming in the interaction between \(w_3\) and \(c^2\). When we let \(w_3 \rightarrow 0\) both approximations suggest that it becomes arbitrarily cheap to move in the \(c^2\) direction which is undesirable as this deviates from the exact distance d: moving spatially will always have a cost associated with it determined by at least \(w_1\).
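To make a proper sub-Riemannian distance estimate, we will use the Zassenhaus formula, which is related to the Baker–Campbell–Hausdorff formula:
\[ e^{t(X+Y)} = e^{tX} \, e^{tY} \, e^{-\frac{t^2}{2}[X,Y]} \, e^{\mathcal {O}(t^3)}, \tag{20} \]
where we have used the shorthand \(e^x:= \exp (x)\). Filling in \(X = \mathcal {A}_1\) and \(Y = \mathcal {A}_3\), using \([\mathcal {A}_1, \mathcal {A}_3] = -\mathcal {A}_2\), and neglecting the higher-order terms gives:
\[ e^{t(\mathcal {A}_1+\mathcal {A}_3)} \approx e^{t\mathcal {A}_1} \, e^{t\mathcal {A}_3} \, e^{\frac{t^2}{2}\mathcal {A}_2}, \tag{21} \]
or equivalently:
\[ e^{\frac{t^2}{2}\mathcal {A}_2} \approx e^{-t\mathcal {A}_3} \, e^{-t\mathcal {A}_1} \, e^{t(\mathcal {A}_1+\mathcal {A}_3)}. \tag{22} \]
This formula says that one can successively follow exponential curves in the “legal” directions \(\mathcal {A}_1\) and \(\mathcal {A}_3\) to effectively move in the “illegal” direction of \(\mathcal {A}_2\). Taking the lengths of these curves and adding them up gives an approximative upper bound on the sub-Riemannian distance:
\[ d_{sr}\left( e^{\frac{t^2}{2}\mathcal {A}_2} \right) \lessapprox \left( w_1 + w_3 + \sqrt{w_1^2 + w_3^2} \right) \left| t \right| \le 2 (w_1 + w_3) \left| t \right| . \tag{23} \]
Substituting \(t \rightarrow \sqrt{2\left| t \right| }\) gives:
\[ d_{sr}\left( e^{t\mathcal {A}_2} \right) \lessapprox 2\sqrt{2} \, (w_1 + w_3) \sqrt{\left| t \right| }. \tag{24} \]
This inequality, together with the smoothing trick to go from (18) to (19), then inspires the following sub-Riemannian distance approximation:
\[ \rho _{c,sr} := \root 4 \of {\left( \nu (w_1 + w_3) \right) ^4 \left| c^2 \right| ^2 + \left( (w_1 c^1)^2 + (w_3 c^3)^2 \right) ^2}, \tag{25} \]
for some \(0<\nu <2\sqrt{2}\) s.t. the approximation is tight. We empirically suggest \(\nu \approx 1.6\), based on a numerical analysis that is tangential to [48, Fig. 3]. Notice that this approximation does not break down when we let \(w_3 \rightarrow 0\).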
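The Zassenhaus argument can be verified numerically. The sketch below composes the three “legal” exponential curves \(e^{-t\mathcal {A}_3}\), \(e^{-t\mathcal {A}_1}\), and \(e^{t(\mathcal {A}_1+\mathcal {A}_3)}\) in \((x,y,\theta )\) coordinates and compares the endpoint with the sideways shift \((0, t^2/2, 0)\). It also evaluates a sub-Riemannian estimate, assumed here to have the fourth-root form \(\root 4 \of {(\nu (w_1+w_3))^4 |c^2|^2 + ((w_1 c^1)^2 + (w_3 c^3)^2)^2}\). The function names are ours.

```python
import math


def product(g1, g2):
    """SE(2) group product in (x, y, theta) coordinates (no angle wrapping)."""
    x1, y1, th1 = g1
    x2, y2, th2 = g2
    c, s = math.cos(th1), math.sin(th1)
    return (x1 + c * x2 - s * y2, y1 + s * x2 + c * y2, th1 + th2)


def legal_path_endpoint(t):
    """Endpoint of exp(-t A3) exp(-t A1) exp(t (A1 + A3))."""
    rotate_back = (0.0, 0.0, -t)                    # exp(-t A3)
    step_back = (-t, 0.0, 0.0)                      # exp(-t A1)
    diagonal = (math.sin(t), 1.0 - math.cos(t), t)  # exp(t (A1 + A3)), closed form
    return product(product(rotate_back, step_back), diagonal)


def rho_sr(c, w1, w3, nu=1.6):
    """Assumed sub-Riemannian estimate in logarithmic coordinates c = (c1, c2, c3)."""
    c1, c2, c3 = c
    return ((nu * (w1 + w3)) ** 4 * c2 ** 2
            + ((w1 * c1) ** 2 + (w3 * c3) ** 2) ** 2) ** 0.25


t = 0.1
x, y, theta = legal_path_endpoint(t)  # close to (0, t^2 / 2, 0) up to O(t^3)

# Purely sideways point: the estimate scales like the square root of |c2|.
sideways_cost = rho_sr((0.0, 0.25, 0.0), w1=1.0, w3=0.5)
```

The endpoint deviates from \((0, t^2/2, 0)\) only at third order in \(t\), and the sideways cost stays strictly positive for any \(w_1 > 0\), also when \(w_3 \rightarrow 0\).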
Furthermore, in view of contraction of SE(2) to the Heisenberg group \(H_3\) [29, Sec. 5.2], and the exact fundamental solution [32, eq. 27] of the Laplacian on \(H_3\) (where the norm \(\rho _{c,sr}\) appears squared in the numerator with \(1=w_1=w_3=\nu \)) we expect \(\nu \ge 1\).
Table 3 shows that both the old sub-Riemannian approximation (19) and new approximation (25) are appropriate in cases such as \(w_3=1\). Table 4 shows that the old approximation breaks down when we take \(w_3 = 0.5\), and that the new approximation behaves more appropriately.
The Riemannian and subRiemannian approximations can be combined into the following newly proposed practical approximation:
where \(l: \mathbb {M}_2 \rightarrow \mathbb {R}\) is given by:
which we will show to be a lower bound of the exact distance d in Lemma 4.
The most important property of the combined approximation is that it automatically switches between the Riemannian and subRiemannian approximations depending on the metric parameters. Namely, the Riemannian approximation is appropriate very close to the reference point \(\textbf{p}_0\), but tends to overestimate the true distance at a moderate distance from it. The subRiemannian approximation is appropriate at moderate distances from \(\textbf{p}_0\), but tends to overestimate very close to it, and underestimate far away. The combined approximation is such that we get rid of the weaknesses that the approximations have on their own.
On top of these approximative distances, we also define \(\rho _b\), \(\rho _{b,sr}\), and \(\rho _{b,com}\) by replacing the logarithmic coordinates \(c^i\) by their corresponding halfangle coordinates \(b^i\) defined by:
So, for example, we define \(\rho _b\) as:
Why we use these coordinates will be explained in Sect. 5.1.
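A minimal numerical sketch of these coordinates and of the two Riemannian approximations is given here. It assumes that the halfangle coordinates are the spatial coordinates rotated by \(-\theta /2\) with \(b^3 = \theta \), that the logarithmic coordinates differ from them by a factor \({{\,\textrm{sinc}\,}}\tfrac{\theta }{2}\), and that \(\rho _b\) and \(\rho _c\) are weighted Euclidean norms of these coordinates. All function names are illustrative and are not taken from the LieTorch package.

```python
import numpy as np


def halfangle_coords(x, y, theta):
    # Assumption: rotate the spatial part by -theta/2 and keep the angle.
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return x * c + y * s, -x * s + y * c, theta


def log_coords(x, y, theta):
    # Assumption: logarithmic coordinates are the halfangle ones divided by sinc(theta/2).
    b1, b2, b3 = halfangle_coords(x, y, theta)
    # np.sinc(z) = sin(pi z) / (pi z), so this evaluates sin(theta/2) / (theta/2).
    sinc_half = np.sinc(theta / (2.0 * np.pi))
    return b1 / sinc_half, b2 / sinc_half, b3


def weighted_norm(coords, w1, w2, w3):
    # Weighted Euclidean norm with the metric parameters as weights.
    q1, q2, q3 = coords
    return np.sqrt((w1 * q1) ** 2 + (w2 * q2) ** 2 + (w3 * q3) ** 2)


def rho_b(x, y, theta, w1, w2, w3):
    return weighted_norm(halfangle_coords(x, y, theta), w1, w2, w3)


def rho_c(x, y, theta, w1, w2, w3):
    return weighted_norm(log_coords(x, y, theta), w1, w2, w3)
```

Under these assumptions both approximations reduce to \(w_2 \left| y \right| \) at \(x = \theta = 0\), and \(\rho _c \ge \rho _b\) everywhere because the sinc factor is at most 1.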
We can define approximative morphological kernels by replacing the exact distance in (13) by any of the approximative distances in this section. To this end we, for example, define \(k_b\) by replacing the exact distance in the morphological kernel k by \(\rho _b\):
where we recall that \(\frac{1}{\alpha } + \frac{1}{\beta } = 1\) and \(\alpha >1\).
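The construction of an approximative kernel from an approximative distance can be sketched as follows. The sketch assumes that the kernel (13) has the form \(\frac{t}{\beta }\left( d/t\right) ^\beta \), which is consistent with the \(\beta \)-th power scaling in Theorem 1 but is not restated in this section. The function name is hypothetical.

```python
import numpy as np


def morphological_kernel(rho, t, alpha):
    """Kernel value for a distance (or distance approximation) rho at time t > 0.

    Assumption: the kernel is (t / beta) * (rho / t) ** beta
    with 1/alpha + 1/beta = 1 and alpha > 1.
    """
    if alpha <= 1.0:
        raise ValueError("alpha must be larger than 1")
    if t <= 0.0:
        raise ValueError("t must be positive")
    beta = alpha / (alpha - 1.0)  # conjugate exponent of alpha
    return (t / beta) * (np.asarray(rho, dtype=float) / t) ** beta
```

Passing \(\rho _b\), \(\rho _{b,sr}\), or \(\rho _{b,com}\) as `rho` gives the corresponding approximative kernel. Scaling the distance by a factor \(\zeta \) scales the kernel by \(\zeta ^\beta \), which is the mechanism behind the global bound (31).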
5 Main Theorem and Analysis
When the effect of erosion and dilation is calculated with an approximative morphological kernel, an error is made. We are therefore interested in analyzing the behavior of this error. We do this by comparing the approximative morphological kernels with the exact kernel \(k_t^\alpha \) (13). The result of our analysis is summarized in the following theorem. Because none of the inequalities in our main result depend on time t, we use the short notation \(k^\alpha := k_t^\alpha \), \(k_b^\alpha := k_{b,t}^\alpha \).
Theorem 1
(Quality of approximative morphological kernels) Let \(\zeta := \frac{w_2}{w_1}\) denote the spatial anisotropy, and let \(\beta \) be such that \(\frac{1}{\alpha } + \frac{1}{\beta } = 1\), for some \(\alpha >1\) fixed. We assess the quality of our approximative kernels in three ways:

The exact and all approximative kernels have the same symmetries, see Table 5.

Globally it holds that:
$$\begin{aligned} \zeta ^{-\beta } k^\alpha \le k_b^\alpha \le \zeta ^{\beta } k^\alpha , \end{aligned}$$(31) from which we see that in the case \(\zeta = 1\) we have that \(k^\alpha _b\) is exactly equal to \(k^\alpha \).

Locally around^{Footnote 1}\(\textbf{p}_0\) we have:
$$\begin{aligned} k_b^\alpha \le (1 + \varepsilon )^{\beta /2} k^\alpha , \end{aligned}$$(32) where
$$\begin{aligned} \varepsilon := \frac{\zeta ^2 - 1}{2 w_3^2} \zeta ^4 \rho _b^2 + \mathcal {O}(\left| \theta \right| ^3). \end{aligned}$$(33)
Proof
The proof of the parts of the theorem will be discussed throughout the upcoming subsections.

The symmetries are shown in Corollary 1.
\(\square \)
Clearly, as all approximative kernels are solely functions of the corresponding approximative distances, the analysis of the quality of an approximative kernel reduces to analyzing the quality of the approximative distance that is used, and this is exactly what we will do.
In previous work on PDEGCNNs the bound \(d=d(\textbf{p}_0,\cdot ) \le \rho _c\) was proven [28, Lem. 20]. Furthermore, it was shown that around \(\textbf{p}_0\) one has:
which has the corollary that there exists a constant \(C \ge 1\) such that
for any compact neighborhood around \(\textbf{p}_0\). We improve on these results by:

Showing that the approximative distances have the same symmetries as the exact Riemannian distance; Lemma 3.

Finding simple global bounds on the exact distance d which can then be used to find global estimates of \(\rho _b\) by d; Lemma 4. This improves upon (35) by finding an expression for the constant C.

Estimating the leading term of the asymptotic expansion, and observing that our upper bound of the relative error between \(\rho _b\) and d explodes in the cases \(\zeta \rightarrow \infty \) and \(w_3 \rightarrow 0\); Lemma 7. This improves upon equation (34).
Note, however, that we are not analyzing \(\rho _c\): we will be analyzing \(\rho _b\). This is mainly because the halfangle coordinates are easier to work with: they do not have the \({{\,\textrm{sinc}\,}}\tfrac{\theta }{2}\) factor the logarithmic coordinates have. Using that
recall (28) and (8), we see that
and thus locally \(\rho _c\) and \(\rho _b\) do not differ much, and results on \(\rho _b\) can be easily transferred to (slightly weaker) results on \(\rho _c\).
5.1 Symmetry Preservation
Symmetries play a major role in the analysis of (sub)Riemannian geodesics/distance in SE(2). They help to analyze symmetries in Hamiltonian flows [44] and corresponding symmetries in association field models [42, Fig. 11]. There are 8 of them in total, and their relation with the logarithmic coordinates \(c^i\) (Lemma 1) shows that they correspond to inversion of the Lie algebra basis \(A_i \mapsto -A_i\). The symmetries for the subRiemannian setting are explicitly listed in [44, Prop. 4.3]. They can be generated algebraically by the following three symmetries (using the same labeling as [44]):
They generate the other four symmetries as follows:
and with \(\varepsilon ^0 = \text {id}\). All symmetries are involutions: \(\varepsilon ^i \circ \varepsilon ^i = \text {id}\). Henceforth all eight symmetries will be called ‘fundamental symmetries.’ How all fundamental symmetries relate to each other becomes clearer if we write them down in either logarithmic or halfangle coordinates.
Lemma 1
(8 fundamental symmetries) The 8 fundamental symmetries \(\varepsilon _i\), in either halfangle coordinates \(b^i\) or logarithmic coordinates \(c^i\), correspond to sign flips as laid out in Table 5.
Proof
We will only show that \(\varepsilon ^2\) flips \(b^1\). All other calculations are done analogously. Pick a point \(\textbf{p}= (x,y,\theta )\) and let \(\textbf{q}= \varepsilon ^2(\textbf{p})\). We now calculate \(b^1(\textbf{q})\):
where we have used the trigonometric difference identities of cosine and sine in the second-to-last equality. From the relation between logarithmic and halfangle coordinates (36), we have that the logarithmic coordinates \(c^i\) flip in the same manner under the symmetries. \(\square \)
The fixed points of the symmetries \(\varepsilon ^2\), \(\varepsilon ^1\), and \(\varepsilon ^6\) have an interesting geometric interpretation. The logarithmic and halfangle coordinates, being so closely related to the fundamental symmetries, also carry the same interpretation. Definition 1 introduces this geometric idea and Lemma 2 makes its relation to the fixed points of the symmetries precise. In Fig. 11, the fixed points are visualized, and in Fig. 12 a visualization of these geometric ideas can be seen.
Definition 1
Two points \(\textbf{p}_1=(\textbf{x}_1,\textbf{n}_1)\), \(\textbf{p}_2=(\textbf{x}_{2},\textbf{n}_2)\) of \(\mathbb {M}_{2}\) are called cocircular if there exists a circle, of possibly infinite radius, passing through \(\textbf{x}_1\) and \(\textbf{x}_2\) such that the orientations \(\textbf{n}_1 \in S^1\) and \(\textbf{n}_{2} \in S^1\) are tangents to the circle, at, respectively, \(\textbf{x}_1\) and \(\textbf{x}_2\), in either both the clockwise or anticlockwise direction. Similarly, the points are called coradial if the orientations are normal to the circle in either both the outward or inward direction. Finally, two points are called parallel if their orientations coincide.
Cocircularity has a well-known characterization that is often used for line enhancement in image processing, such as tensor voting [58].
Remark 1
Point \(\textbf{p}=(r \cos \phi , r \sin \phi , \theta ) \in \mathbb {M}_2\) is cocircular to the reference point \(\textbf{p}_0=(0,0,0)\) if and only if the double angle equality \(\theta \equiv 2 \phi \mod 2\pi \) holds.
In fact, all fixed points of the fundamental symmetries can be intuitively characterized:
Lemma 2
(Fixed Points of Symmetries) Fix reference point \(\textbf{p}_0=(0,0,0) \in \mathbb {M}_2\).
The point \(g \textbf{p}_0\in \mathbb {M}_2\) with \(g \in SE(2)\) is, respectively,

coradial to \(\textbf{p}_0\) when
$$\begin{aligned} c^1(g) = 0 \Leftrightarrow \varepsilon _2(g) = g \Leftrightarrow g \in \exp (\left\langle A_2, A_3 \right\rangle ), \end{aligned}$$(39) 
cocircular to \(\textbf{p}_0\) when
$$\begin{aligned} c^2(g) = 0 \Leftrightarrow \varepsilon _1(g) = g \Leftrightarrow g \in \exp (\left\langle A_1, A_3 \right\rangle ), \end{aligned}$$(40) 
parallel to \(\textbf{p}_0\) when
$$\begin{aligned} c^3(g) = 0 \Leftrightarrow \varepsilon _6(g) = g \Leftrightarrow g \in \exp (\left\langle A_1, A_2 \right\rangle ). \end{aligned}$$(41)
Proof
We will only show (40), the others are done analogously. We start by writing \(g=(r \cos \phi , r \sin \phi , \theta )\) and calculating that \(g \odot \textbf{p}_0 = (r \cos \phi , r \sin \phi , (\cos \theta , \sin \theta ))\). Then by Remark 1 we know that \(g \textbf{p}_0\) is cocircular to \(\textbf{p}_0\) if and only if \(2\phi = \theta {{\,\textrm{mod}\,}}2\pi \). We can show this is equivalent to \(c^2(g)=0\):
In logarithmic coordinates, \(\varepsilon _1\) is equivalent to:
from which we may deduce that \(\varepsilon _1(g) = g\) is equivalent to \(c^2(g) = 0\). If \(c^2(g) = 0\) then \(\log g \in \left\langle A_1, A_3 \right\rangle \) and thus \(g \in \exp (\left\langle A_1, A_3 \right\rangle )\). As for the other way around, it holds by simple computation that:
which shows that \(g \in \exp (\left\langle A_1, A_3 \right\rangle ) \Rightarrow c^2(g) = 0\). \(\square \)
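The equivalence between cocircularity and \(c^2(g) = 0\) used in this proof can be sketched as a short worked computation. It assumes that the second halfangle coordinate is \(b^2 = -x \sin \tfrac{\theta }{2} + y \cos \tfrac{\theta }{2}\) and that \(c^2\) is \(b^2\) divided by \({{\,\textrm{sinc}\,}}\tfrac{\theta }{2}\), which does not vanish for \(\theta \in [-\pi ,\pi )\). With \(g=(r \cos \phi , r \sin \phi , \theta )\):

```latex
\begin{aligned}
b^2(g) &= -r\cos\phi\,\sin\tfrac{\theta}{2} + r\sin\phi\,\cos\tfrac{\theta}{2}
        = r \sin\!\left(\phi - \tfrac{\theta}{2}\right), \\
c^2(g) = 0 &\iff b^2(g) = 0
        \iff \phi - \tfrac{\theta}{2} \in \pi\mathbb{Z}
        \iff \theta \equiv 2\phi \mod 2\pi \qquad (r > 0).
\end{aligned}
```

The case \(r = 0\) is degenerate: the two positions coincide and \(b^2 = 0\) holds for every \(\theta \).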
In the important work [44] on subRiemannian geometry on SE(2) by Sachkov and Moiseev, it is shown that the exact subRiemannian distance \(d_{sr}\) is invariant under the fundamental symmetries \(\varepsilon ^i\). However, these same symmetries hold true for the Riemannian distance d. Moreover, because the approximative distances use the logarithmic coordinates \(c^i\) and halfangle coordinates \(b^i\) they also carry the same symmetries. The following lemma makes this precise.
Lemma 3
(Symmetries of the exact distance and all proposed approximations) All exact and approximative (sub)Riemannian distances (w.r.t. the reference point \(\textbf{p}_0\)) are invariant under all the fundamental symmetries \(\varepsilon _i\).
Proof
By Table 5, one sees that \(\varepsilon ^3, \varepsilon ^4\), and \(\varepsilon ^5\) also generate all symmetries. Therefore, if we show that all distances are invariant under these three symmetries, we have also shown that they are invariant under all symmetries. We will first show that the exact distance, in either the Riemannian or subRiemannian case, is invariant w.r.t. these three symmetries, i.e., \(d(\textbf{p}) = d(\varepsilon ^i(\textbf{p}))\) for \(i \in \{3,4,5\}\). By (38) and (37), one has \(\varepsilon ^3(x,y,\theta )=(-x,-y,\theta )\) and \(\varepsilon ^4(x,y,\theta ) = (-x,y,-\theta )\). Now consider the push forward \(\varepsilon ^3_*\). By direct computation (in \((x,y,\theta )\) coordinates), we have \(\varepsilon ^3_* \left. \mathcal {A}_i \right| _{\textbf{p}}= \pm \left. \mathcal {A}_i \right| _{\varepsilon ^3(\textbf{p})}\). Because the metric tensor field \(\mathcal {G}\) (4) is diagonal w.r.t. the \(\mathcal {A}_i\) basis, this means that \(\varepsilon ^3\) is an isometry. Similarly, \(\varepsilon ^4\) is an isometry. Being isometries of the metric \(\mathcal {G}\), \(\varepsilon ^3\) and \(\varepsilon ^4\) preserve distance. The \(\varepsilon ^5\) symmetry flips all the signs of the \(c^i\) coordinates, which amounts to Lie algebra inversion: \( -\log g = \log (\varepsilon ^5(g)) \). Taking the exponential on both sides shows that \(g^{-1} = \varepsilon ^5(g)\). By left-invariance of the metric, we have \(d(g \textbf{p}_0, \textbf{p}_0) = d(\textbf{p}_0, g^{-1} \textbf{p}_0)\), which holds in both the Riemannian and subRiemannian case, and thus \( d(g\textbf{p}_0) = d(\varepsilon ^5(g\textbf{p}_0)) \). That all approximative distances (both in the Riemannian and subRiemannian case) are also invariant under all the symmetries is not hard to see: every \(b^i\) and \(c^i\) term is either squared or the absolute value is taken. Flipping signs of these coordinates, recall Lemma 1, has no effect on the approximative distance. \(\square \)
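The sign-flip argument can be checked numerically. The sketch below assumes the halfangle coordinates \(b^1 = x\cos \tfrac{\theta }{2} + y \sin \tfrac{\theta }{2}\), \(b^2 = -x\sin \tfrac{\theta }{2} + y \cos \tfrac{\theta }{2}\), \(b^3 = \theta \), and takes \(\rho _b\) to be the weighted Euclidean norm of these coordinates. The three maps are two coordinate reflections and the group inverse on SE(2); their names, and the choice of these three maps as generators, are illustrative assumptions.

```python
import numpy as np


def halfangle_coords(p):
    # Assumed halfangle coordinates of p = (x, y, theta).
    x, y, theta = p
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([x * c + y * s, -x * s + y * c, theta])


def rho_b(p, w=(1.0, 2.0, 0.5)):
    # Assumed form: weighted Euclidean norm of the halfangle coordinates.
    return float(np.sqrt(np.sum((np.array(w) * halfangle_coords(p)) ** 2)))


def reflect_position(p):
    # (x, y, theta) -> (-x, -y, theta): flips b1 and b2.
    x, y, theta = p
    return (-x, -y, theta)


def reflect_x_and_angle(p):
    # (x, y, theta) -> (-x, y, -theta): flips b1 and b3.
    x, y, theta = p
    return (-x, y, -theta)


def invert(p):
    # Group inverse in SE(2): flips b1, b2 and b3.
    x, y, theta = p
    c, s = np.cos(theta), np.sin(theta)
    return (-x * c - y * s, x * s - y * c, -theta)
```

Each map only changes the signs of some of the \(b^i\), so `rho_b` returns the same value before and after applying it.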
Corollary 1
(All kernels preserve symmetries) The exact kernel and all approximative kernels have the same fundamental symmetries.
Proof
The kernels are direct functions of the exact and approximative distances, recall for example (13), so from Lemma 3 we can immediately conclude that they also carry the 8 fundamental symmetries. \(\square \)
The previous lemma is illustrated in Fig. 10. The two fundamental symmetries \(\varepsilon ^2\) and \(\varepsilon ^1\) correspond, respectively, to reflecting the isocontours (depicted in colors) along their short edge and long axis. The \(\varepsilon ^6\) symmetry corresponds to mapping the positive \(\theta \) isocontours to their negative \(\theta \) counterparts. In Fig. 13, one can see an isocontour of \(\rho _b\) together with the symmetry “planes” of \(\varepsilon _2\), \(\varepsilon _1\) and \(\varepsilon _6\).
5.2 Simple Global Bounds
Next we provide some basic global lower and upper bounds for the exact Riemannian distance d (6). Recall that the lower bound l plays an important role in the combined approximation \(\rho _{c,com}\) (26) when far from the reference point \(\textbf{p}_0\).
Lemma 4
(Global bounds on distance) The exact Riemannian distance \(d=d(\textbf{p}_0,\cdot )\) is greater than or equal to the following lower bound \(l: \mathbb {M}_2 \rightarrow \mathbb {R}\):
and less than or equal to the following upper bounds \(u_1, u_2: \mathbb {M}_2 \rightarrow \mathbb {R}\):
Proof
We will first show \(l \le d\). Consider the following spatially isotropic metric:
We assumed w.l.o.g. that \(w_1 \le w_2\) so we have for any vector \(v \in T\mathbb {M}_2\) that \( \Vert v\Vert _{{\tilde{\mathcal {G}}}} \le \Vert v\Vert _{\mathcal {G}} \). From this, we can directly deduce that for any curve \(\gamma \) on \(\mathbb {M}_2\) we have that \(L_{{\tilde{\mathcal {G}}}}(\gamma ) \le L_{\mathcal {G}}(\gamma )\). Now consider a length-minimizing curve \(\gamma \) w.r.t. \(\mathcal {G}\) between the reference point \(\textbf{p}_0\) and some end point \(\textbf{p}\). We then have the chain of (in)equalities:
Furthermore, because the metric \({\tilde{\mathcal {G}}}\) is spatially isotropic it can equivalently be written as:
which is a constant metric on the coordinate covector fields, and thus:
Putting everything together gives the desired result of \(l \le d\). Showing that \(d \le u_1\) can be done analogously.
As for showing \(d \le u_2\), we will construct a curve \(\gamma \) whose length \(L(\gamma )\) w.r.t. \(\mathcal {G}\) can be bounded from above by \(u_2\). This in turn shows that \(d \le u_2\) by definition of the distance. Pick a destination position and orientation \(\textbf{p}= (\textbf{x}, \textbf{n})\). The constructed curve \(\gamma \) will be as follows. We start by aligning our starting orientation \(\textbf{n}_0 = (1,0) \in S^1\) toward the destination position \(\textbf{x}\). This desired orientation toward \(\textbf{x}\) is \({\hat{\textbf{x}}}:= \frac{\textbf{x}}{r}\) where \(r = \Vert \textbf{x}\Vert = \sqrt{x^2 + y^2}\). This action will cost \(w_3 a\) for some \(a \ge 0\). Once we are aligned with \({\hat{\textbf{x}}}\), we move toward \(\textbf{x}\). Because we are aligned, this action will cost \(w_1 r\). Now that we are at \(\textbf{x}\), we align our orientation with the destination orientation \(\textbf{n}\), which will cost \(w_3b\) for some \(b \ge 0\). Altogether we have \(L(\gamma ) = w_1 r + w_3 (a+b)\). In its current form, the constructed curve does not necessarily satisfy \(a+b\le \pi \) as desired. To fix this, we realize that we did not necessarily have to align with \({\hat{\textbf{x}}}\). We could have aligned with \(-{\hat{\textbf{x}}}\) and moved backwards toward \(\textbf{x}\), which would also cost \(w_1 r\). One can show that one of the two methods (either moving forwards or backwards toward \(\textbf{x}\)) indeed has \(a+b\le \pi \), and thus \(d \le u_2\). \(\square \)
These bounds are simple but effective: they help us prove a multitude of insightful corollaries.
Corollary 2
(Global error distance) Simple manipulations, together with the fact that \(x^2 + y^2 = (b^1)^2 + (b^2)^2\), give the following inequalities between \(l, u_1\) and \(\rho _b\):
The second equation can be extended to inequalities between \(\rho _b\) and d:
Remark 2
If \(w_1 = w_2 \Leftrightarrow \zeta = 1\), i.e., the spatially isotropic case, then the lower and upper bound coincide, thus becoming exact. Because \(\rho _b\) is within the lower and upper bound it also becomes exact.
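The global bounds and their relation to \(\rho _b\) can be sketched numerically. The closed forms used below are read off from the proof of Lemma 4 and are assumptions of this sketch: \(l\) and \(u_1\) are the distances of the spatially isotropic metrics with spatial weight \(w_1\) and \(w_2\), respectively, and \(u_2 = w_1 r + w_3 \pi \) is the length bound of the constructed curve. The function names are hypothetical.

```python
import numpy as np


def global_bounds(x, y, theta, w1, w2, w3):
    """Assumed closed forms of l, u1 and u2 (with w1 <= w2, theta in [-pi, pi))."""
    r = np.hypot(x, y)
    lower = np.sqrt((w1 * r) ** 2 + (w3 * theta) ** 2)  # isotropic metric with weight w1
    upper1 = np.sqrt((w2 * r) ** 2 + (w3 * theta) ** 2)  # isotropic metric with weight w2
    upper2 = w1 * r + w3 * np.pi  # rotate, move along the line, rotate again
    return lower, upper1, upper2


def rho_b(x, y, theta, w1, w2, w3):
    # Assumed halfangle approximation: weighted norm of the halfangle coordinates.
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    b1, b2 = x * c + y * s, -x * s + y * c
    return np.sqrt((w1 * b1) ** 2 + (w2 * b2) ** 2 + (w3 * theta) ** 2)
```

Since \((b^1)^2 + (b^2)^2 = x^2 + y^2\) and \(w_1 \le w_2\), one gets \(l \le \rho _b \le u_1\). Moreover \(u_1 \le \zeta l\), so both \(d\) and \(\rho _b\) lie between \(l\) and \(\zeta l\), which gives \(\zeta ^{-1} \rho _b \le d \le \zeta \rho _b\).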
Corollary 3
(Global error kernel) Globally the error is independent of time \(t>0\) and is estimated by the spatial anisotropy \(\zeta \ge 1\) (5) as follows:
For \(\zeta =1\), there is no error.
Proof
We will only prove the second inequality, the first is done analogously.
\(\square \)
The previous result indicates that problems can arise if \(\zeta \rightarrow \infty \), which indeed turns out to be the case:
Corollary 4
(Observing the problem) If we restrict ourselves to \(x=\theta =0\), we have that \(u_1 = \rho _b = \rho _c = w_2\left| y \right| \). From this, we can deduce that one can be certain that both \(\rho _b\) and \(\rho _c\) become bad approximations away from \(\textbf{p}_0\). Namely, when \(\zeta > 1 \Leftrightarrow w_2 > w_1\) both approximations go above \(u_2\) if one looks far enough away from \(\textbf{p}_0\). How “fast” it goes bad is determined by all metric parameters. Namely, the intersection of the approximations \(\rho _b\) and \(\rho _c\), and \(u_2\) is at \(\left| y \right| = \frac{w_3\pi }{w_2 - w_1}\), or equivalently at \(\rho = \frac{w_3\pi }{1 - \zeta ^{-1}}\). This intersection is visible in Fig. 14 in the higher anisotropy cases. From this expression of the intersection, we see that in the cases \(w_3 \rightarrow 0\) and \(\zeta \rightarrow \infty \) the Riemannian distance approximations \(\rho _b\) and \(\rho _c\) quickly go bad. We will see exactly the same behavior in Lemma 7 and Remark 3.
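The crossover point in this corollary follows from solving \(w_2 \left| y \right| = w_1 \left| y \right| + w_3 \pi \), where the right-hand side is the assumed value of \(u_2\) on the y-axis. A small sketch with a hypothetical helper name:

```python
import math


def crossover_y(w1, w2, w3):
    """|y| beyond which w2 * |y| exceeds the assumed upper bound w1 * |y| + w3 * pi.

    Returns infinity in the isotropic case w1 == w2, where no crossover occurs.
    """
    if w2 < w1:
        raise ValueError("expected w1 <= w2")
    if w2 == w1:
        return math.inf
    return w3 * math.pi / (w2 - w1)
```

For example, with \(w_1 = w_3 = 1\) and \(w_2 = 2\) the crossover is at \(\left| y \right| = \pi \). Halving \(w_3\) or increasing \(w_2\) moves it closer to \(\textbf{p}_0\), in line with the statement that the approximations deteriorate quickly when \(w_3 \rightarrow 0\) or \(\zeta \rightarrow \infty \).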
Lemma 4 is visualized in Figs. 14 and 15. In Fig. 14, we consider the behavior of the exact distance and bounds along the y-axis, that is at \(x=\theta =0\). We have chosen to inspect the y-axis because it consists of points that are hard to reach from the reference point \(\textbf{p}_0\) when the spatial anisotropy is large, which makes it interesting. In contrast, along the x-axis \(l,d,\rho _b,\rho _c, u_1\) and \(w_1\left| x \right| \) all coincide, which makes it uninteresting. To provide more insight we also depict the bounds along the \(y=x\) axis, see Fig. 15. Observe that in both figures, the exact distance d is indeed always above the lower bound l and below the upper bounds \(u_1\) and \(u_2\).
5.3 Asymptotic Error Expansion
In this section, we provide an asymptotic expansion of the error between the exact distance d and the halfangle distance approximation \(\rho _b\) (Lemma 7). This error is then leveraged to an error between the exact morphological kernel k and the halfangle kernel \(k_b\) (Corollary 5). We also give a formula that determines a region for which the halfangle approximation \(\rho _b\) is appropriate given an a priori tolerance bound (Remark 3).
Lemma 5
Let \(\gamma :[0,1] \rightarrow \mathbb {M}_2\) be a minimizing geodesic from \(\textbf{p}_0\) to \(\textbf{p}\). We have that:
Proof
The fundamental theorem of calculus tells us that:
but one can also bound this expression as follows:
Putting the two together gives the desired result. \(\square \)
Lemma 6
One can bound \(\Vert d\rho _b\Vert \) around \(\textbf{p}_0\) by:
Proof
The proof is deferred to Appendix 1. \(\square \)
By combining the simple Lemmas 5 and 6, one can find an expression for the asymptotic error between the exact distance d and the halfangle approximation \(\rho _b\).
Lemma 7
On any compact neighborhood \(U\) of \(\textbf{p}_0\), we have that
for some \(C \ge 0\).
Proof
Let \(\textbf{p}\in U\) be given, and let \(\gamma : [0,1] \rightarrow \mathbb {M}_2\) be the geodesic from \(\textbf{p}_0\) to \(\textbf{p}\). For the distance, we know that
Making use of (42), we know that \(\frac{1}{\zeta } \rho _b \le d \le \zeta \rho _b\) so we can combine this with the previous equation to find:
from which we get that
Combining this fact with the above two lemmas allows us to conclude (43). \(\square \)
Remark 3
(Region for approximation \(\rho _b \approx d\)) Putting an a priori tolerance bound \(\varepsilon _{tol}\) on the error \(\varepsilon \) (and neglecting the \(\mathcal {O}(\theta ^3)\) term) gives rise to a region \(\Omega _0\) on which the local approximation \(\rho _b\) is appropriate:
Thereby we cannot guarantee a large region of acceptable relative error when \(w_3 \rightarrow 0\) or \(\zeta \rightarrow \infty \). We solve this problem by using \(\rho _{b, com}\), given by (26), instead of \(\rho _b\).
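The size of such a region can be sketched as follows, assuming that the leading term of the error in (33) is \(\frac{\zeta ^2 - 1}{2 w_3^2} \zeta ^4 \rho _b^2\) and that the higher-order term is neglected. Requiring this term to stay below \(\varepsilon _{tol}\) gives a bound on \(\rho _b\). The function name is hypothetical.

```python
import math


def tolerance_radius(zeta, w3, eps_tol):
    """Largest rho_b for which the assumed leading error term stays below eps_tol."""
    if zeta < 1.0 or w3 <= 0.0 or eps_tol < 0.0:
        raise ValueError("expected zeta >= 1, w3 > 0 and eps_tol >= 0")
    if zeta == 1.0:
        return math.inf  # isotropic case: the leading error term vanishes
    return math.sqrt(2.0 * w3 ** 2 * eps_tol / ((zeta ** 2 - 1.0) * zeta ** 4))
```

The radius is proportional to \(w_3\) and decays roughly like \(\zeta ^{-3}\) for large \(\zeta \), which makes the shrinking of the region in both limits explicit.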
Corollary 5
(Local error morphological kernel) Locally around \(\textbf{p}_0\), we have:
Proof
By Lemma 7, one has
\(\square \)
6 Experiments
6.1 Error of Half Angle Approximation
We can quantitatively analyze the error between any distance approximation \(\rho \) and the exact Riemannian distance d as follows. We first choose a region \(\Omega \subseteq \mathbb {M}_2\) in which we will analyze the approximation. Just as in Tables 1 and 2, we decided to inspect \(\Omega := [-3,3]\times [-3,3]\times [-\pi ,\pi ) \subseteq \mathbb {M}_2\). As for our exact measure of error \(\varepsilon \), we have decided on the mean relative error defined as:
where \(\mu \) is the induced Riemannian measure determined by the Riemannian metric \(\mathcal {G}\). We then discretized our domain \(\Omega \) into a grid of \(101 \times 101 \times 101\) equally spaced points \(\textbf{p}_i\ \in \Omega \) indexed by some index set \(i \in I\) and numerically solved for the exact distance d on this grid. This numerical scheme is of course not exact and we will refer to these values as \({\tilde{d}}_i \approx d(\textbf{p}_i)\). We also calculate the value of the distance approximation \(\rho \) on the grid points \(\rho _i:= \rho (\textbf{p}_i)\). Once we have these values, we can approximate the true mean relative error \(\varepsilon \) by calculating the numerical error \({\tilde{\varepsilon }}\) defined by:
In Table 6, the numerical mean relative error \({\tilde{\varepsilon }}\) between the halfangle approximation \(\rho _b\) and the numerical Riemannian distance \({\tilde{d}}\) can be seen for different spatial anisotropies \(\zeta \). We keep \(w_1=w_3=1\) constant and vary \(w_2\). We see, as shown visually in Tables 1 and 2, that \(\rho _b\) gets worse and worse when we increase the spatial anisotropy \(\zeta \).
There is a discrepancy in the table worth mentioning. We know from Remark 2 that when \(\zeta = 1\) then \(\rho _b = d\) and thus \(\varepsilon = 0\). But surprisingly we do not have \({\tilde{\varepsilon }} = 0\) in the \(\zeta = 1\) case in Table 6. This can be simply explained by the fact that the numerical solution \({\tilde{d}}\) is not exactly equal to the true distance d. We expect that \({\tilde{\varepsilon }}\) will go to 0 in the \(\zeta = 1\) case if we discretize our region \(\Omega \) more and more finely.
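The numerical error measure can be sketched as a plain average of pointwise relative errors over the grid. This is an assumption about the discretization: it relies on the Riemannian measure of a left-invariant metric being a constant multiple of \(\textrm{d}x\,\textrm{d}y\,\textrm{d}\theta \), so that equally spaced points carry equal weight. Skipping points with zero distance (the reference point itself) is also a choice made for this sketch. The sanity check uses the closed form of the distance in the isotropic case from Remark 2 instead of a numerical solver.

```python
import numpy as np


def mean_relative_error(rho_vals, d_vals):
    """Average of |rho - d| / d over all grid points with d > 0."""
    rho_vals = np.asarray(rho_vals, dtype=float)
    d_vals = np.asarray(d_vals, dtype=float)
    mask = d_vals > 0.0  # the relative error is undefined at the reference point
    return float(np.mean(np.abs(rho_vals[mask] - d_vals[mask]) / d_vals[mask]))


def isotropic_sanity_check(n=11, w1=1.0, w3=1.0):
    """Error of the halfangle approximation against the exact distance for zeta = 1."""
    xs = np.linspace(-3.0, 3.0, n)
    thetas = np.linspace(-np.pi, np.pi, n, endpoint=False)
    x, y, theta = np.meshgrid(xs, xs, thetas, indexing="ij")
    b1 = x * np.cos(theta / 2) + y * np.sin(theta / 2)
    b2 = -x * np.sin(theta / 2) + y * np.cos(theta / 2)
    rho = np.sqrt((w1 * b1) ** 2 + (w1 * b2) ** 2 + (w3 * theta) ** 2)
    d = np.sqrt(w1 ** 2 * (x ** 2 + y ** 2) + (w3 * theta) ** 2)
    return mean_relative_error(rho, d)
```

With an exact reference distance the isotropic check returns a value at the level of floating-point round-off, so a nonzero entry for \(\zeta = 1\) in a table such as Table 6 reflects the error of the numerical distance solver rather than of \(\rho _b\).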
We can compare these numerical results to our theoretical results. Namely, we can deduce from Equation (42) that:
which means
And so we expect this to also approximately hold for the numerical mean relative error \({\tilde{\varepsilon }}\). Indeed, in Table 6 we can see that \( {\tilde{\varepsilon }} \lessapprox \zeta - 1\).
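The step from (42) to the bound \(\zeta - 1\) can be sketched as follows, using only \(\frac{1}{\zeta } \rho _b \le d \le \zeta \rho _b\) and \(\zeta \ge 1\). This is a reconstruction of the argument under those two assumptions.

```latex
\begin{aligned}
\frac{1}{\zeta} \le \frac{\rho_b}{d} \le \zeta
\;&\Longrightarrow\;
-\frac{\zeta - 1}{\zeta} \le \frac{\rho_b - d}{d} \le \zeta - 1
\;\Longrightarrow\;
\frac{\left| \rho_b - d \right|}{d} \le \zeta - 1, \\
\varepsilon \;&\le\; \zeta - 1 .
\end{aligned}
```

The last implication in the first line uses \(\frac{\zeta - 1}{\zeta } \le \zeta - 1\) for \(\zeta \ge 1\), and the second line follows because a mean of quantities bounded by \(\zeta - 1\) is itself bounded by \(\zeta - 1\).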
Interestingly, we see that \({\tilde{\varepsilon }}\) is relatively small compared to our theoretical bound (47) even in the high anisotropy cases. However, this is only a consequence of the relatively small size of \(\Omega \). If we make \(\Omega \) bigger and bigger we can be certain that \(\varepsilon \) converges to \(\zeta - 1\). This follows from an argument similar to the reasoning in Corollary 4.
6.2 DCA1
The DCA1 dataset is a publicly available database “consisting of 130 X-ray coronary angiograms, and their corresponding ground-truth image outlined by an expert cardiologist” [59]. One such angiogram and ground truth can be seen in Fig. 18a and d.
We have split the DCA1 dataset [59] into a training and test set consisting of 125 and 10 images, respectively.
To establish a baseline, we ran a 3, 6, and 12 layer CNN, GCNN and PDEGCNN on DCA1. The exact architectures are identical/analogous to the ones used in [28, Fig. 15]. For the baseline, the logarithmic distance approximation \(\rho _c\) was used within the PDEGCNNs. This is the same approximation that was used in [28]. Every network was trained 10 times for 80 epochs. After every epoch, the average Dice coefficient on the test set is stored. After every full training, the maximum of the average Dice coefficients over all 80 epochs is calculated. The result is 10 maximum average Dice coefficients for every architecture. The result of this baseline can be seen in Fig. 16. The number of parameters of the networks can be found in Table 7. We see that PDEGCNNs consistently perform as well as, and sometimes outperform, GCNNs and CNNs, all the while having the fewest parameters of all architectures.
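The evaluation protocol described above can be sketched as follows. The Dice coefficient is the standard overlap measure \(2\left| A \cap B \right| / (\left| A \right| + \left| B \right| )\) for binary masks. The small smoothing constant and the function names are assumptions of this sketch and are not taken from the experiment code.

```python
import numpy as np


def dice_coefficient(pred, target, eps=1e-8):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps avoids a division by zero when both masks are empty (assumed convention).
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def best_average_dice(per_epoch_dice):
    """Maximum over epochs of the test-set average Dice coefficient.

    per_epoch_dice has shape (num_epochs, num_test_images).
    """
    per_epoch_dice = np.asarray(per_epoch_dice, dtype=float)
    return float(per_epoch_dice.mean(axis=1).max())
```

Repeating a full training run several times and calling `best_average_dice` once per run yields one score per run, matching the 10 maximum average Dice coefficients per architecture mentioned in the text.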
To compare the effect of using different approximative distances, we decided to train the 6 layer PDEGCNN (with 2560 parameters) 10 times for 80 epochs using each distance approximation. The results can be found in Figs. 17 and 18. We see that on DCA1 all distance approximations have a comparable performance. We notice a small dent in effectiveness when using \(\rho _{b,sr}\), and a small increase when using \(\rho _{b,com}\).
6.3 Lines
For the line completion problem, we created a dataset of 512 training images and 128 test images.^{Footnote 2} Fig. 21a and d shows one sample of the Lines dataset.
To establish a baseline, we ran a 6 layer CNN, GCNN and PDEGCNN. For this baseline we again used \(\rho _{c}\) within the PDEGCNN, but changed the number of channels to 30, and the kernel sizes to [9, 9, 9], making the total number of parameters 6018. By increasing the kernel size, we anticipate that the difference in effectiveness of using the different distance approximations, if there is any, becomes more pronounced. Every network was trained 15 times for 60 epochs. The result of this baseline can be seen in Fig. 19. The number of parameters of the networks can be found in Table 8. We again see that the PDEGCNN outperforms the GCNN, which in turn outperforms the CNN, while having the fewest parameters.
We again test the effect of using different approximative distances by training the 6 layer PDEGCNN 15 times for 60 epochs for every approximation. The results can be found in Fig. 20. We see that on the Lines dataset, all distance approximations again have a comparable performance. We again notice an increase in effectiveness when using \(\rho _{b,com}\), just as on the DCA1 dataset. Interestingly, using \(\rho _{b,sr}\) does not seem to hurt the performance on the Lines dataset, which is in contrast with DCA1. This is in line with what one would expect in view of the existing subRiemannian lineperception models in neurogeometry. Furthermore, in Fig. 21b,c,e and f some feature maps of a trained PDEGCNN are visualized.
7 Conclusion
In this article, we have carefully analyzed how well the nonlinear erosion and dilation parts of PDEGCNNs are actually solved on the homogeneous space of 2D positions and orientations \(\mathbb {M}_2\). According to Proposition 1, the Hamilton–Jacobi equations are solved by morphological kernels that are functions of only the exact (sub)Riemannian distance function. As a result, every approximation of the exact distance yields a corresponding approximative morphological kernel.
In Theorem 1, we use this to improve upon local and global approximations of the relative errors of the erosion and dilation kernels used in the papers [28, 60] where PDEGCNNs were first proposed (and shown to outperform GCNNs). Our new sharper estimates for distance on \(\mathbb {M}_2\) have bounds that explicitly depend on the metric tensor field coefficients. This allowed us to theoretically underpin the earlier worries expressed in [28, Fig. 10] that if spatial anisotropy becomes high the previous morphological kernel approximations [28] become less and less accurate.
Indeed, as we show qualitatively in Table 2 and quantitatively in Sect. 6.1, if the spatial anisotropy \(\zeta \) is high one must resort to subRiemannian approximations. Furthermore, we propose a single distance approximation \(\rho _{b,com}\) that works both for low and high spatial anisotropy.
Apart from how well the kernels approximate the PDEs, there is the issue of how well each of the distance approximations performs in applications within the PDEGCNNs. In practice, the analytic approximative kernels using \(\rho _b\), \(\rho _c\), \(\rho _{b,com}\) perform similarly. This is not surprising as our theoretical results Lemma 3 and Corollary 1 reveal that all morphological kernel approximations carry the correct 8 fundamental symmetries of the PDE. Nevertheless, Figs. 17 and 20 do reveal advantages of using the new kernel approximations (in particular \(\rho _{b,com}\)) over the previous kernel \(\rho _c\) in [28].
The experiments also show that the strictly subRiemannian distance approximation \(\rho _{b,sr}\) only performs well on applications where subRiemannian geometry really applies. For instance, as can be seen in Figs. 17 and 20, on the DCA1 dataset \(\rho _{b,sr}\) performs relatively poorly, whereas on the Lines dataset, \(\rho _{b,sr}\) performs well. This is what one would expect in view of subRiemannian models and findings in cortical line perception [37, 38, 40, 41, 46, 61] in neurogeometry.
Besides better accuracy and better performance of the approximative kernels, there is the issue of geometric interpretability. In GCNNs and CNNs, geometric interpretability is absent, as they include ad hoc nonlinearities like ReLUs. PDEGCNNs instead employ morphological convolutions with kernels that reflect association fields, as visualized in Fig. 5b. In Fig. 8, we see that as network depth increases association fields visually merge in the feature maps of PDEGCNNs toward adaptive line detectors, whereas such merging/grouping of association fields is not visible in normal CNNs.
In all cases, the PDEGCNNs still outperform GCNNs and CNNs on the DCA1 dataset and Lines dataset: they have a higher (or equal) performance, while having a huge reduction in network complexity, even when using 3 layers. Regardless of the choice of kernel \(\rho _c\), \(\rho _b\), \(\rho _{b,sr}\), \(\rho _{b,com}\), the advantage of PDEGCNNs over GCNNs and CNNs is significant, as can be clearly observed in Figs. 16 and 19 and Tables 7 and 8. This is in line with previous observations on other datasets [28].
Altogether, PDEGCNNs have better geometric reduction, performance, and geometric interpretation, than basic classical feedforward (G)CNN networks on various segmentation problems.
Extensive investigations on training data reduction, memory reduction (via U-Net versions of PDEGCNNs), and a topological description of the merging of association fields are beyond the scope of this article, and are left for future work.
Availability of Data and Code
The code of the experiments, and PDEGCNNs in general, can be found in the publicly available LieTorch package: https://gitlab.com/bsmetsjr/lietorch. The publicly available DCA1 dataset [59] can be found at https://personal.cimat.mx:8181/ivan.cruz/DB_Angiograms.html. The lines dataset is available from the authors on request.
References
Bekkers, E.J., Lafarge, M.W., Veta, M., Eppenhof, K.A.J., Pluim, J.P.W., Duits, R.: Rototranslation covariant convolutional networks for medical image analysis. In: International Conference on Medical Image Computing and ComputerAssisted Intervention, pp. 440–448. Springer (2018). arXiv:1804.03393
LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., Jackel, L.D.: Backpropagation applied to handwritten zip code recognition. Neural Comput. 1(4), 541–551 (1989)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 25. Curran Associates, Inc., Red Hook, New York (2012). https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45bPaper.pdf
Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., van der Laak, J.A.W.M., van Ginneken, B., Sánchez, C.I.: A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017)
Cohen, T.S., Welling, M.: Group equivariant convolutional networks. In: Proceedings of the 33rd International Conference on Machine Learning, vol. 48, pp. 1–12 (2016)
Dieleman, S., De Fauw, J., Kavukcuoglu, K.: Exploiting cyclic symmetry in convolutional neural networks. arXiv:1602.02660 (2016)
Dieleman, S., Willett, K.W., Dambre, J.: Rotationinvariant convolutional neural networks for galaxy morphology prediction. Mon. Not. R. Astron. Soc. 450(2), 1441–1459 (2015)
Winkels, M., Cohen, T.S.: 3D GCNNs for pulmonary nodule detection. MIDL, 1–11 (2018)
Worrall, D., Brostow, G.: CubeNet: equivariance to 3D rotation and translation. ECCV 2018, 585–602 (2018)
Oyallon, E., Mallat, S.: Deep rototranslation scattering for object classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2865–2873 (2015)
Weiler, M., Hamprecht, F.A., Storath, M.: Learning steerable filters for rotation equivariant CNNs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 849–858 (2018)
Bekkers, E.J.: B-spline CNNs on Lie groups (2019). arXiv:1909.12057
Finzi, M., Stanton, S., Izmailov, P., Wilson, A.G.: Generalizing convolutional neural networks for equivariance to Lie groups on arbitrary continuous data. In: III, H.D., Singh, A. (eds.) Proceedings of the 37th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 119, pp. 3165–3176. PMLR, Virtual (2020). http://proceedings.mlr.press/v119/finzi20a.html
Cohen, T.S., Geiger, M., Weiler, M.: A general theory of equivariant CNNs on homogeneous spaces. Adv. Neural Inf. Process. Syst. 32 (2019)
Worrall, D.E., Garbin, S.J., Turmukhambetov, D., Brostow, G.J.: Harmonic networks: Deep translation and rotation equivariance. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5028–5037 (2017)
Kondor, R., Trivedi, S.: On the generalization of equivariance and convolution in neural networks to the action of compact groups. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 2747–2755. PMLR, Stockholmsmässan, Stockholm Sweden (2018). http://proceedings.mlr.press/v80/kondor18a.html
Esteves, C., AllenBlanchette, C., Makadia, A., Daniilidis, K.: Learning SO(3) equivariant representations with spherical CNNs. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 52–68 (2018)
Weiler, M., Cesa, G.: General E(2)equivariant steerable CNNs. In: Advances in Neural Information Processing Systems, pp. 14334–14345 (2019)
Paoletti, M.E., Haut, J.M., Roy, S.K., Hendrix, E.M.T.: Rotation equivariant convolutional neural networks for hyperspectral image classification. IEEE Access 8, 179575–179591 (2020). https://doi.org/10.1109/ACCESS.2020.3027776
Weiler, M., Forré, P., Verlinde, E., Welling, M.: Coordinate Independent Convolutional Networks—Isometry and Gauge Equivariant Convolutions on Riemannian Manifolds (2021). arXiv:2106.06020
Cohen, T.S., Weiler, M., Kicanaoglu, B., Welling, M.: Gauge equivariant convolutional networks and the icosahedral CNN. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 1321–1330. PMLR, Long Beach, California (2019). https://proceedings.mlr.press/v97/cohen19d.html
Bogatskiy, A., Anderson, B., Offermann, J.T., Roussi, M., Miller, D.W., Kondor, R.: Lorentz Group Equivariant Neural Network for Particle Physics (2020). arXiv:2006.04780
Sifre, L., Mallat, S.: Rotation, scaling and deformation invariant scattering for texture discrimination. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1233–1240 (2013). https://doi.org/10.1109/CVPR.2013.163
Bekkers, E.J., Loog, M., ter Haar Romeny, B.M., Duits, R.: Template matching via densities on the rototranslation group. IEEE Trans. Pattern Anal. Mach. Intell. 40(2), 452–466 (2018). https://doi.org/10.1109/TPAMI.2017.2652452
Worrall, D., Welling, M.: Deep scalespaces: Equivariance over scale. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc., Red Hook, New York (2019). https://proceedings.neurips.cc/paper/2019/file/f04cd7399b2b0128970efb6d20b5c551Paper.pdf
Satorras, V.G., Hoogeboom, E., Welling, M.: E(n) equivariant graph neural networks. In: Meila, M., Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 9323–9332. PMLR, Virtual (2021). https://proceedings.mlr.press/v139/satorras21a.html
Bronstein, M.M., Bruna, J., Cohen, T.S., Veličković, P.: Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges (2021). arXiv:2104.13478
Smets, B.M.N., Portegies, J.W., Bekkers, E.J., Duits, R.: PDEbased group equivariant convolutional neural networks. J. Math. Imaging Vis. (2022). https://doi.org/10.1007/s1085102201114x
Duits, R., Franken, E.M.: Leftinvariant parabolic evolution equations on \({SE}(2)\) and contour enhancement via invertible orientation scores, part I: Linear leftinvariant diffusion equations on \({SE}(2)\). QAMAMS 68, 255–292 (2010)
Duits, R.: Perceptual organization in image analysis: a mathematical approach based on scale, orientation and curvature. PhD thesis, Eindhoven University of Technology (2005)
Duits, R., Dela Haije, T.C.J., Creusen, E., Ghosh, A.: Morphological and linear scale spaces for fiber enhancement in DWMRI. J. Math. Imaging Vis. 46(3), 326–368 (2013)
Duits, R., Burgeth, B.: Scale spaces on Lie groups. In: International Conference on Scale Space and Variational Methods in Computer Vision, pp. 300–312. Springer (2007)
Franken, E.M.: Enhancement of crossing elongated structures in images. PhD thesis, Eindhoven University of Technology (2008)
Bekkers, E.J.: Retinal image analysis using subRiemannian geometry in SE(2). PhD thesis, Eindhoven University of Technology (2017)
Hubel, D.H., Wiesel, T.N.: Receptive fields of single neurons in the cat’s striate cortex. J. Physiol. 148, 574–591 (1959)
Bosking, W.H., Zhang, Y., Schofield, B., Fitzpatrick, D.: Orientation selectivity and the arrangement of horizontal connections in tree shrew striate cortex. J. Neurosci. 17(6), 2112–2127 (1997)
Petitot, J.: The neurogeometry of pinwheels as a subRiemannian contact structure. J. Physiol. Paris 97, 265–309 (2003)
Citti, G., Sarti, A.: A cortical based model of perceptional completion in the rototranslation space. J. Math. Imaging Vis. 24(3), 307–326 (2006)
Field, D.J., Hayes, A., Hess, R.F.: Contour integration by the human visual system: evidence for a local “association field”. Vis. Res. 33(2), 173–193 (1993). https://doi.org/10.1016/00426989(93)90156Q
Baspinar, E., Calatroni, L., Franceschi, V., Prandi, D.: A corticalinspired subRiemannian model for Poggendorfftype visual illusions. Journal of Imaging 7, 41 (2021). https://doi.org/10.3390/jimaging7030041
Franceschiello, B., Mashtakov, A., Citti, G., Sarti, A.: Geometrical optical illusion via subRiemannian geodesics in the rototranslation group. Differ. Geom. Its Appl. 65, 55–77 (2019). https://doi.org/10.1016/j.difgeo.2019.03.007
Duits, R., Boscain, U., Rossi, F., Sachkov, Y.L.: Association fields via cuspless subRiemannian geodesics in SE(2). J. Math. Imaging Vis. 49(2), 384–417 (2014). https://doi.org/10.1007/s108510130475y
Sachkov, Y.L.: Cut locus and optimal synthesis in the subRiemannian problem on the group of motions of a plane. ESAIM Control Optim. Calc. Var. 17, 293–321 (2011)
Moiseev, I., Sachkov, Y.L.: Maxwell strata in subRiemannian problem on the group of motions of a plane. ESAIM Control Optim. Calc. Var. 16(2), 380–399 (2010). https://doi.org/10.1051/cocv/2009004
Duits, R., Meesters, S.P.L., Mirebeau, J.M., Portegies, J.M.: Optimal paths for variants of the 2D and 3D Reeds–Shepp car with applications in image analysis. J. Math. Imaging Vis. 60, 816–848 (2018)
Petitot, J.: Elements of Neurogeometry. Lecture Notes in Morphogenesis. Springer, London (2017). https://doi.org/10.1007/9783319655918
Bekkers, E.J., Duits, R., Mashtakov, A., Sanguinetti, G.R.: A PDE approach to datadriven subRiemannian geodesics in SE(2). SIAM J. Imaging Sci. 8(4), 2740–2770 (2015)
Bekkers, E.J., Chen, D., Portegies, J.M.: Nilpotent approximations of subRiemannian distances for fast perceptual grouping of blood vessels in 2D and 3D. J. Math. Imaging Vis. 60(6), 882–899 (2018). https://doi.org/10.1007/s108510180787z
Chirikjian, G.S., Kyatkin, A.B.: Engineering Applications of Noncommutative Harmonic Analysis: With Emphasis on Rotation and Motion Groups. CRC Press, Boca Raton (2000)
Schmidt, M., Weickert, J.: Morphological counterparts of linear shiftinvariant scalespaces. J. Math. Imaging Vis. 56(2), 352–366 (2016)
van den Boomgaard, R., Smeulders, A.: The morphological structure of images: the differential equations of morphological scalespace. IEEE Trans. Pattern Anal. Mach. Intell. 16(11), 1101–1113 (1994). https://doi.org/10.1109/34.334389
Evans, L.C.: Partial Differential Equations, vol. 19. American Mathematical Society, Providence (2010)
Diop, E.H.S., Mbengue, A., Manga, B., Seck, D.: Extension of mathematical morphology in Riemannian spaces. In: Scale Space and Variational Methods in Computer Vision, pp. 100–111. Springer, Cham (2021)
Fathi, A., Maderna, E.: Weak KAM theorem on non compact manifolds. Nonlinear Differ. Equ. Appl. NoDEA 14(1–2), 1–27 (2007). https://doi.org/10.1007/s0003000720476
Azagra, D., Ferrera, J., LópezMesas, F.: Nonsmooth analysis and Hamilton–Jacobi equations on Riemannian manifolds. J. Funct. Anal. 220(2), 304–361 (2005)
Lupi, G.: Kernel approximations in Lie groups and application to groupinvariant CNN. Master's thesis, University of Bologna (2021)
ter Elst, A.F.M., Robinson, D.W.: Weighted subcoercive operators on Lie groups. J. Funct. Anal. 157(1), 88–163 (1998). https://doi.org/10.1006/jfan.1998.3259
Mordohai, P., Medioni, G.: Tensor voting: a perceptual organization approach to computer vision and machine learning. Synth. Lect. Image Video Multimed. Process. 2(1), 1–136 (2006). https://doi.org/10.2200/S00049ED1V01Y200609IVM008
CervantesSanchez, F., CruzAceves, I., HernandezAguirre, A., HernandezGonzalez, M.A., SolorioMeza, S.E.: Automatic segmentation of coronary arteries in Xray angiograms using multiscale analysis and artificial neural networks. Appl. Sci. (2019). https://doi.org/10.3390/app9245507
Duits, R., Smets, B.M.N., Bekkers, E.J., Portegies, J.W.: Equivariant deep learning via morphological and linear scale space PDEs on the space of positions and orientations. LNCS 12679, 27–39 (2021)
Baspinar, E., Citti, G., Sarti, A.: A geometric model of multiscale orientation preference maps via Gabor functions. J. Math. Imaging Vis. 60(6), 900–912 (2018)
Acknowledgements
We thank Dr. Javier Oliván Bescós for pointing us to the publicly available DCA1 dataset [59].
Funding
We gratefully acknowledge the Dutch Foundation of Science NWO for its financial support by Talent Programme VICI 2020 Exact Sciences (Duits, Geometric learning for Image Analysis, VI.C. 202031).
Author information
Contributions
G. Bellaard is the first author and writer of the manuscript; he adapted the PDEGCNN code by B.M.N. Smets to include subRiemannian and combined morphological kernels, created most figures in the manuscript, set up (experiments on) the Lines dataset, and created the final proofs of all theorems. B.M.N. Smets created the vital PDEGCNN code, i.e., the publicly available LieTorch package, used in the practical experiments of the article, contributed to the main theoretical results, and provided important practical advice on all experiments. D.L.J. Bon contributed to part of the proof of the main theoretical result, created the pictures that visualize the feature maps, and contributed substantially to pretesting the experiments on the Lines dataset, as well as creating the dataset. G. Pai conducted all the final experiments in collaboration with the other authors (primarily with G. Bellaard), and made substantial contributions to the final exposition and presentation of the practical part of this article. R. Duits supervised the project, initiated the theory and theorem formulations, contributed to all proofs, polished the manuscript, and inserted the geometric interpretation of PDEGCNNs, linking them to neurogeometry. All authors collaborated closely and reviewed the manuscript carefully. Author main contributions per section (in order of appearance): Sect. 1: G. Bellaard & R. Duits; Sect. 2: G. Bellaard & R. Duits; Sect. 3: G. Bellaard & R. Duits; Sect. 4: G. Bellaard & R. Duits; Sect. 5: G. Bellaard & B.M.N. Smets & D.L.J. Bon & R. Duits; Sect. 6: G. Bellaard & B.M.N. Smets & G. Pai; Sect. 7: G. Bellaard & G. Pai & R. Duits; Appendix A: G. Bellaard & D.L.J. Bon & R. Duits; Appendix B: G. Bellaard & R. Duits.
Ethics declarations
Competing Interests
R. Duits is a member of the editorial board of JMIV.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Proof of Lemma 6
Proof
We start by writing out the explicit form of \(\Vert d \rho _b\Vert ^2\) in the left-invariant frame:
By replacing the left-invariant derivatives with half-angle coordinate derivatives, we can equivalently write this as:
where \(\psi = {{\,\textrm{arctan2}\,}}(b^2, b^1)\), \(\partial _\psi = b^2 \partial _{b^1} - b^1 \partial _{b^2} \), and we omitted the subscript b from \(\rho \) for conciseness. We Taylor expand the sine and cosine in the second term up to second order. This becomes
This allows us to write \(\left\Vert d \rho _b \right\Vert ^2\) as
Making use of the fact that the first part in this expression equals 1, we can thus write \(\left\Vert d \rho _b \right\Vert ^2 = 1 + \varepsilon \). The exact form of \(\varepsilon \) is as follows
Using that \(w_i \vert b^i \vert \le \rho _b\) we can bound the expression from above by
Finally the lemma follows by algebraic manipulations and the fact that \(w_1 \le w_2\). \(\square \)
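The statement \(\Vert d \rho _b\Vert ^2 = 1 + \varepsilon \) can be checked numerically. The sketch below is only an illustration and rests on assumptions that are not restated in this appendix: the half-angle coordinates are taken as \(b^1 = x\cos (\theta /2) + y\sin (\theta /2)\), \(b^2 = -x\sin (\theta /2) + y\cos (\theta /2)\), \(b^3 = \theta \); \(\rho _b\) is taken as the weighted Euclidean norm \(\sqrt{\sum _i (w_i b^i)^2}\); and the squared dual norm is taken as \(\sum _i (\mathcal {A}_i \rho _b)^2 / w_i^2\) with the usual left-invariant frame on SE(2). The function names and the weights are hypothetical.

```python
import numpy as np

def half_angle_coords(x, y, theta):
    """Assumed half-angle coordinates (b1, b2, b3) of the point (x, y, theta)."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([x * c + y * s, -x * s + y * c, theta])

def rho_b(x, y, theta, w):
    """Assumed form of rho_b: weighted Euclidean norm of the half-angle coordinates."""
    b = half_angle_coords(x, y, theta)
    return float(np.sqrt(np.sum((np.asarray(w) * b) ** 2)))

def grad_norm_sq(x, y, theta, w, h=1e-6):
    """Squared dual norm of d(rho_b), with partial derivatives from central differences."""
    dx = (rho_b(x + h, y, theta, w) - rho_b(x - h, y, theta, w)) / (2 * h)
    dy = (rho_b(x, y + h, theta, w) - rho_b(x, y - h, theta, w)) / (2 * h)
    dth = (rho_b(x, y, theta + h, w) - rho_b(x, y, theta - h, w)) / (2 * h)
    # Left-invariant derivatives: A1 = cos(th) dx + sin(th) dy, A2 = -sin(th) dx + cos(th) dy, A3 = dth.
    a1 = np.cos(theta) * dx + np.sin(theta) * dy
    a2 = -np.sin(theta) * dx + np.cos(theta) * dy
    a3 = dth
    return (a1 / w[0]) ** 2 + (a2 / w[1]) ** 2 + (a3 / w[2]) ** 2

w = (1.0, 2.0, 1.0)  # hypothetical weights with w1 <= w2
for scale in (0.5, 0.1, 0.02):
    p = (scale, 0.5 * scale, scale)
    # Deviation of the squared norm from 1, i.e., a numerical estimate of epsilon.
    print(scale, grad_norm_sq(*p, w) - 1.0)
```

Under these assumptions the printed deviation should shrink as the point approaches the origin, which is the qualitative behavior the lemma quantifies.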
Geometric Interpretation of PDEGCNN layers
In a PDEGCNN layer [28, 60], one first performs convection and then a morphological convolution (dilation/erosion). This has the interesting effect that we can interpret this equivalently as performing a morphological convolution with a shifted morphological kernel. To make this precise, we first define what convection exactly is:
Definition 2
(Convection) Let \(v \in T_{\textbf{p}_0} (\mathbb {M}_2)\) be a tangent vector at the reference point \(\textbf{p}_0\), and let \(c: \mathbb {M}_2 \rightarrow T(\mathbb {M}_2)\) be the corresponding left-invariant vector field obtained by pushing v forward with the left action \(L_g(\textbf{p}):= g \textbf{p}\), i.e., \(c(g \textbf{p}_0)=(L_g)_* v\). Convection is defined as:
where both W and U are scalar differentiable functions on \(\mathbb {M}_2\).
The solution of this left-invariant transport (‘convection’) is quite simple and we state it in the following proposition without proof:
Proposition 2
The solution to the convection equation is
where we identified v as a tangent vector in \(T_eSE(2)\).
For the proof, and more details on how convection is implemented in practice within the PDEGCNN framework, we refer to [28, Sec. 5.1]. The general idea is that the characteristics of left-invariant flow are Lie group exponential curves acting on the reference point \(\textbf{p}_0 \in \mathbb {M}_2\) in the homogeneous space.
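To make the exponential curves concrete, the following minimal sketch evaluates the SE(2) exponential map in closed form. It is not taken from the LieTorch package; the function names, the coordinate convention \((x, y, \theta )\), the choice \(\textbf{p}_0 = (0,0,0)\), and the example vector v are assumptions made for illustration, and angles are not wrapped modulo \(2\pi \).

```python
import math

def se2_exp(v, t=1.0):
    """exp(t v) in SE(2) for v = (v1, v2, v3), returned as (x, y, theta)."""
    a, b, omega = (t * c for c in v)
    if abs(omega) < 1e-9:
        # Vanishing angular part: the curve is a straight line.
        return (a, b, omega)
    s, c = math.sin(omega), math.cos(omega)
    return ((a * s - b * (1.0 - c)) / omega,
            (a * (1.0 - c) + b * s) / omega,
            omega)

def se2_mul(g, h):
    """Group product in SE(2): rotate h's position by g's angle, then translate."""
    x1, y1, th1 = g
    x2, y2, th2 = h
    c, s = math.cos(th1), math.sin(th1)
    return (x1 + c * x2 - s * y2, y1 + s * x2 + c * y2, th1 + th2)

# Hypothetical example: unit forward speed with angular speed 0.5.
v = (1.0, 0.0, 0.5)
# With p0 = (0, 0, 0), the point exp(t v) p0 has the coordinates of exp(t v) itself.
curve = [se2_exp(v, t) for t in (0.0, 0.5, 1.0, 1.5)]
```

For this choice of v the spatial projection of the curve is an arc of the circle with radius 2 centered at (0, 2), and the one-parameter subgroup property \(\exp (sv)\exp (tv)=\exp ((s+t)v)\) holds.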
We can now show that first performing convection and then a morphological convolution is the same as doing a morphological convolution with a shifted kernel:
Proposition 3
Let \(k: \mathbb {M}_2 \rightarrow \mathbb {R}\) be any morphological kernel. We have:
with shifted kernel \(\hat{k}(\textbf{p}, t):= k(\exp (t v) \textbf{p})\). In particular for time-dependent erosion PDE kernels
Proof
Indeed, by direct computations one has:
When applying this to the erosion kernels (1), the result (B.1) follows by left-invariance of the Riemannian metric: \(d_\mathcal {G}(e^{tv}\textbf{p},\textbf{p}_0)=d_\mathcal {G}(\textbf{p},e^{-tv}\textbf{p}_0)\) and the identity \((e^{tv})^{-1}=e^{-tv}\). \(\square \)
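The mechanism behind Proposition 3 can be illustrated on a much simpler group. The sketch below replaces \(\mathbb {M}_2\) by the cyclic translation group \(\mathbb {Z}_n\), which is commutative, so it is an analogue under that simplifying assumption and not the SE(2) setting of the proposition. The morphological convolution is taken in the infimal form \((k \,\square \, f)(x) = \min _y \{k(x-y) + f(y)\}\); the function names, the test signal, and the sign convention of the shift are hypothetical.

```python
import numpy as np

def morph_conv(k, f):
    """Cyclic morphological convolution: out[x] = min over y of k[x - y] + f[y]."""
    n = len(f)
    # idx[x, y] = (x - y) mod n
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return np.min(k[idx] + f[None, :], axis=1)

def convect(f, a):
    """Discrete convection on Z_n: shift by a samples, out[x] = f[x - a]."""
    return np.roll(f, a)

rng = np.random.default_rng(0)
n, a = 16, 3
f = rng.normal(size=n)                 # arbitrary input signal
x = np.arange(n)
d = np.minimum(x, n - x).astype(float) # cyclic distance to the origin
k = 0.5 * d ** 2                       # quadratic kernel, chosen only for illustration

lhs = morph_conv(k, convect(f, a))     # convection first, then morphological convolution
rhs = morph_conv(convect(k, a), f)     # morphological convolution with the shifted kernel
print(np.allclose(lhs, rhs))
```

Both sides reduce to \(\min _z \{k(x-a-z)+f(z)\}\) after the substitution \(z = y - a\), which is the one-dimensional counterpart of the direct computation in the proof.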
Recall the relation between (approximative) Riemannian balls and association fields, as visualized in Figs. 4, 5 and 9.
The top left corner of Fig. 22 shows a single PDEGCNN module (i.e., the operator between two nodes in the network). The top right shows the geometric rationale behind PDEGCNNs, which essentially perform perceptual grouping of association fields via training, and indeed the bottom two rows of Fig. 22 reveal how the grouping of association fields becomes visible in the feature maps of two input test images. By contrast, this geometric behavior, typical for PDEGCNNs, is absent in feature maps of CNNs applied to the same images; recall Fig. 8.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bellaard, G., Bon, D.L.J., Pai, G. et al. Analysis of (sub)Riemannian PDEGCNNs. J Math Imaging Vis 65, 819–843 (2023). https://doi.org/10.1007/s1085102301147w
DOI: https://doi.org/10.1007/s1085102301147w