1 Introduction

This paper is devoted to the study of cross-diffusion systems in image processing. Cross-diffusion models consist of evolutionary systems of diffusion type for at least two real-valued functions, where the evolution of each function is not independent of the others. Their use is widespread in areas like population dynamics (see Galiano et al. [10, 11] and references therein). In the case of image processing, two previous related approaches must be mentioned. The first is the so-called complex diffusion [14]. Here the image is represented by a complex function, and the filtering process is governed by a (in general nonlinear) partial differential equation (PDE) problem of diffusion type with a complex diffusion coefficient. The main properties of this approach that are relevant to the present paper are briefly described in the following (see Gilboa et al. [12,13,14] for details). On the one hand, the use of complex diffusion to model the filtering process assumes a distribution of the image features between the real and imaginary parts of a complex function, which evolve in a coupled way. This mutual influence is governed by the complex diffusion matrix. (In particular, the initial condition of the corresponding PDE is always a complex function whose real part is the actual image and whose imaginary part is zero.) A second point to be emphasized is that this distribution may give the imaginary part of the image a particular role as an edge detector. In the linear case, when the phase angle of the (constant) complex diffusion coefficient is small, the imaginary part approximates a smoothed Laplacian of the original image, scaled by time.
This property (called the small theta approximation) can be used in nonlinear models as well: in order to control the diffusion and the edge detection, the corresponding diffusion tensor is taken to depend only on the imaginary part of the image, instead of on the size of the image gradients as in many nonlinear real diffusion models, with the consequent computational advantages.

Complex diffusion can indeed be rewritten as a cross-diffusion model for the real and imaginary parts of the image function. This approach was considered by Lorenz et al. [24] to analyse the existence of a global solution of a related cross-diffusion system. In addition to the theoretical advantage of having a global solution, the numerical examples also suggest that filtering with the cross-diffusion system preserves the textures of the image better than Perona–Malik models [3, 27].

The previous references were the starting point of our research on cross-diffusion systems as mathematical models for image filtering. The general purpose of the present paper and the subsequent paper devoted to the nonlinear case is to extend the complex diffusion approach to more general cross-diffusion systems, analysing how to generalize known properties and deriving new ones. We point out the main contributions of the present paper.

  • Our first purpose is to study the cross-diffusion approach (regarding the complex diffusion as a particular case) as a scale-space representation, see e.g. Iijima [16], Witkin [32], Lindeberg [19, 23], Florack [9], Duits et al. [7]. When the features of the image are distributed in two components \(\mathbf{u}=(u,v)^{T}\), we assume that the linear filtering formulation is described by a matrix convolution of the form

    $$\begin{aligned} (K*\mathbf{u})(\mathbf{x})=\begin{pmatrix}(k_{11}*u)(\mathbf{x})+(k_{12}*v)(\mathbf{x})\\ (k_{21}*u)(\mathbf{x})+(k_{22}*v)(\mathbf{x})\end{pmatrix}, \end{aligned}$$
    (1.1)

    where

    $$\begin{aligned} K=\begin{pmatrix} k_{11}&{}\quad k_{12}\\ k_{21}&{}\quad k_{22} \end{pmatrix}, \end{aligned}$$
    (1.2)

    is the matrix kernel. For simplicity, and when no confusion is possible, \(*\) will denote both the matrix-vector operation on the left-hand side of (1.1) and the usual convolution of the entries on the right-hand side,

    $$\begin{aligned} (f*g)(\mathbf{x})=\int _{{\mathbb {R}}^{2}}f(\mathbf{x}-\mathbf{y})g(\mathbf{y})\hbox {d}{} \mathbf{y}. \end{aligned}$$

    In scale-space theory, two main formulations can be considered, see e.g. Weickert et al. [30, 31] and Lindeberg [21]. The first one is based on causality [17], a principle that can be interpreted as non-enhancement of local extrema. The linear filters satisfying this property in the continuous, one-dimensional case were characterized by Lindeberg [18]: they consist of compositions of two types of scale-space kernels, Gaussian kernels (some already observed by Witkin [32]) and truncated exponential functions. Our approach to building the scale-space axiomatics makes use of a formulation based on the principles of scale invariance ([8, 16, 25, 31]; see Lindeberg [21] for a comparison and synthesis of both theories). This theory characterizes the linear filtering kernels satisfying recursivity, grey-level shift, rotational and scale invariance. Our first contribution generalizes that of the scalar case (that is, when the image is represented by a single real-valued function of two space variables) in the sense that the kernels \(K=K_{t}\), with t the scale parameter, have a Fourier matrix representation of the form

    $$\begin{aligned} {\widehat{K}}(\mathbf{\xi },t)=\hbox {e}^{-t|\mathbf{\xi }|^{p}d},\quad p>0, \end{aligned}$$
    (1.3)

    where \({\widehat{K}}(\mathbf{\xi },t)\) stands for the \(2\times 2\) matrix with entries \({\widehat{k}}_{ij}(\mathbf{\xi },t),i,j=1,2\), \(d=(d_{ij})_{i,j=1,2}\) is a \(2\times 2\) positive definite matrix and p is a positive constant. Additional properties analysed here are the existence of an infinitesimal generator (and consequently the alternative formulation of (1.1) as an initial value problem of PDEs) and locality. The arguments in Pauwels et al. [25] and Duits et al. [7] will be adapted here to show that linear cross-diffusion filtering with convolution kernels of the form (1.3) admits an infinitesimal generator, which is local if and only if p is an even integer.

  • Since complex diffusion models can be written as cross-diffusion systems (for the real and imaginary parts of the image), we develop a generalization of the most relevant properties of the complex diffusion approach. These properties concern the way diffusion affects different features of the image, mainly the grey values and the edges. In this sense, the paper generalizes the small theta approximation with the aim of assigning to one of the components a role as edge detector similar to that of the imaginary part of the image in some cases of linear complex diffusion. The generalization is understood in the following terms. When the matrix d approaches a suitable spectral structure (see Sect. 2), one of the components of the image filtered by linear cross-diffusion behaves as the operator \(A=-(-{\varDelta })^{p/2}\) applied to a smoothed version of the original image, determined by A and the trace of d (the sum of the diagonal entries of d). A second point of generalization concerns the initial distribution of the image in two components. In the complex diffusion case, linear (and nonlinear) filtering usually starts from the real noisy image written as a complex function (that is, with zero imaginary part). This may be modified, in the cross-diffusion approach, by distributing the image in two components in different ways. This distribution affects the definition of relevant quantities for the problem in this context, such as the average grey-level value.

  • The previous results are complemented with a computational study of the performance of linear cross-diffusion filtering with kernels of the form (1.3). Our purpose here is to give, by numerical means, a first assessment of the behaviour of the models that may serve as a basis for more exhaustive computational work in the future. The numerical experiments presented are focused on illustrating some properties of the models, such as the generalized small theta approximation, and the influence of the initial distribution of the image features as well as of the parameter p and the matrix d in (1.3). The numerical experiments suggest that when the eigenvalues of d are real and different, the blurring effect in the filtering is delayed when compared with the linear complex case (where the eigenvalues form a complex conjugate pair). The first choice leads to an improvement in the quality of filtering, measured with the classical signal-to-noise ratio (SNR) and peak signal-to-noise ratio (PSNR) indexes, and to an edge detection which is independent of the parameter p, that is, of the local or nonlocal character of the infinitesimal generator.
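The kernel family (1.3) can already be explored numerically at this point. The following sketch is ours, not part of the paper: a one-dimensional analogue with NumPy/SciPy, an arbitrary positive definite matrix d with real distinct eigenvalues, and synthetic noisy data. It builds \(\hbox {e}^{-t|\xi |^{p}d}\) mode by mode with the matrix exponential, filters the pair \((u_{0},0)^{T}\) and measures the PSNR gain:

```python
import numpy as np
from scipy.linalg import expm

def cross_diffusion_filter(u0, v0, d, p, t):
    """Linear cross-diffusion filtering with Fourier kernel exp(-t|xi|^p d),
    cf. (1.3), in one space dimension with periodic boundary."""
    n = len(u0)
    xi = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / n)       # angular frequencies
    U = np.stack([np.fft.fft(u0), np.fft.fft(v0)])
    out = np.empty_like(U)
    for k in range(n):                                  # 2x2 matrix exponential per mode
        out[:, k] = expm(-t * abs(xi[k]) ** p * d) @ U[:, k]
    return np.real(np.fft.ifft(out[0])), np.real(np.fft.ifft(out[1]))

def psnr(clean, approx):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((clean - approx) ** 2)
    return 10 * np.log10(np.max(np.abs(clean)) ** 2 / mse)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 256, endpoint=False)
clean = np.sin(2 * np.pi * x) + 0.5 * np.sin(6 * np.pi * x)   # synthetic signal
noisy = clean + 0.1 * rng.standard_normal(x.size)
d = np.array([[1.0, 0.2], [0.1, 0.5]])   # positive definite, real distinct eigenvalues
u, v = cross_diffusion_filter(noisy, np.zeros_like(noisy), d, p=2, t=3e-4)
```

Since \({\widehat{K}}(\mathbf{0},t)=I\), the total mass of each component is preserved exactly (in agreement with Lemma 1 below), and with parameters of this order the filtered component should recover several dB of PSNR over the noisy input.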

The structure of the paper is as follows. Section 2 is devoted to the theoretical analysis of linear cross-diffusion models. In Sect. 2.1 the matrix convolution (1.1) is formulated as a scaling process and, in order to define the space where the convolution operators act, some assumptions on the kernels are specified. The inclusion of a scaling makes it possible to analyse, in Sect. 2.2, the formulation of some scale-space properties (grey-level shift invariance, rotational invariance, semigroup property and scale invariance) in the cross-diffusion setting. The derivation of (1.3) is then given in Sect. 2.3, along with a discussion of the locality property. The generalization of the small theta approximation, in Sect. 2.4, completes the theoretical approach. Some of these properties are illustrated in the numerical study in Sect. 3, where the models are applied to one- and two-dimensional filtering problems and their performance is analysed in terms of the parameters of the kernels (1.3). Particular attention will be paid to the comparison with Gaussian smoothing and linear complex diffusion. The main conclusions and perspectives are outlined in Sect. 4. A technical result on the reduction of \(2\times 2\) positive definite matrices, necessary for some proofs, is included as a lemma in the “Appendix”.

The present paper will be followed by a companion paper focused on nonlinear cross-diffusion models [2].

The following notation will be used throughout the paper. Grey-scale images will be represented in this linear case by real-valued mappings on \({\mathbb {R}}^{2}\). (The nonlinear case will be studied on bounded domains.) We denote by \(H^{k}=H^{k}({\mathbb {R}}^{2})\), where \(k\ge 0\) is an integer, the Sobolev space of order k, with \(H^{0}=L^{2}({\mathbb {R}}^{2})\). The inner product in \(H^{k}\) is denoted by \(\langle \cdot ,\cdot \rangle _{k}\) and the associated norm by \(||\cdot ||_{k}\). The space of integrable functions on \({\mathbb {R}}^{2}\) will be denoted as usual by \(L^{1}=L^{1}({\mathbb {R}}^{2})\), the space of real-valued infinitely continuously differentiable functions on \({\mathbb {R}}^{2}\) by \(C^{\infty }({\mathbb {R}}^{2})\), and the space of continuous functions on \({\mathbb {R}}^{2}\) vanishing at infinity by \(C_{0}({\mathbb {R}}^{2})\). Vector and scalar real-valued functions on \({\mathbb {R}}^{2}\) will be distinguished by the use of boldface for the former. By X we denote the space to which the scalar functions belong, so that a vector representation of the image is defined in \(X\times X\), with associated norm

$$\begin{aligned} ||\mathbf{u}||_{X\times X}=\left( ||u||_{X}^{2}+||v||_{X}^{2}\right) ^{1/2},\quad \mathbf{u}=(u,v)^{T}. \end{aligned}$$

We will assume that \(X=H^{0}\), although in some cases (which will be specified in the paper) \(X=H^{k}\), for \(k>0\), or \(X=L^{1}\cap H^{0}\) will be considered. For \(f\in X\), \({\widehat{f}}\) will denote the Fourier transform in \({\mathbb {R}}^{2}\),

$$\begin{aligned} {\widehat{f}}(\xi )=\int _{{\mathbb {R}}^{2}}f(\mathbf{x})\hbox {e}^{-i\mathbf{x}\cdot \xi }\hbox {d}{} \mathbf{x}, \quad \xi \in {\mathbb {R}}^{2}, \end{aligned}$$

where the dot \(\cdot \) stands for the Euclidean inner product in \({\mathbb {R}}^{2}\) with \(|\cdot |\) as the Euclidean norm. Finally, on the space of matrix kernels (1.2) with \(k_{ij}\in L^{1}, i,j=1,2\) we consider the norm

$$\begin{aligned} ||K||_{*}=\max _{i,j}\{||k_{ij}||_{L^{1}}\}, \end{aligned}$$
(1.4)

where \(||\cdot ||_{L^{1}}\) denotes the \(L^{1}\)-norm.

2 Linear Cross-Diffusion Filtering

2.1 Formulation as Scaling Process

In order to formalize (1.1) as a scale-space representation we introduce a family of convolution operators \(\{{\mathcal {K}}_{t}:X\times X\rightarrow X\times X, t\ge 0\}\) in such a way that (1.1) is rewritten as

$$\begin{aligned} \mathbf{u}(\mathbf{x},t)={\mathcal {K}}_{t}{} \mathbf{u}_{0}(\mathbf{x})=(K(\cdot ,t)*\mathbf{u}_{0})(\mathbf{x}),\quad \mathbf{x}\in {\mathbb {R}}^{2}, \end{aligned}$$
(2.1)

where the initial vector field \(\mathbf{u}_{0}(\mathbf{x})=(u_{0}(\mathbf{x}),v_{0}(\mathbf{x}))^{T}\in X\times X\) is composed from some original real-valued image \(f\in X\). Thus, \(\mathbf{u}(\mathbf{x},t)=(u(\mathbf{x},t),v(\mathbf{x},t))^{T}\) stands for the grey-level values of the image at the pixel \(\mathbf{x}\in {\mathbb {R}}^{2}\) and scale t. It is obtained from a convolution with a \(2\times 2\) matrix kernel \(K(\cdot ,t)\) with entries \(k_{ij}(\cdot ,t), i,j=1,2\), such that the vector representation (2.1) is written as

$$\begin{aligned} u(\mathbf{x},t)= & {} (k_{11}(\cdot ,t)*u_{0})(\mathbf{x})+(k_{12}(\cdot ,t)*v_{0})(\mathbf{x}),\nonumber \\ v(\mathbf{x},t)= & {} (k_{21}(\cdot ,t)*u_{0})(\mathbf{x})+(k_{22}(\cdot ,t)*v_{0})(\mathbf{x}),\nonumber \\&t\ge 0, \mathbf{x}\in {\mathbb {R}}^{2}. \end{aligned}$$
(2.2)
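The componentwise structure (2.2) is straightforward to implement. The following minimal sketch (ours, with periodic boundary conditions and illustrative kernels) realizes the matrix convolution via the FFT and checks two immediate properties: a discrete delta on the diagonal acts as the identity, and a unit-mass diagonal kernel with zero-mean off-diagonal entries preserves the total mass of each component (anticipating assumption (H3) below):

```python
import numpy as np

def conv2_periodic(k, f):
    """Circular (periodic) 2D convolution computed via the FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(k) * np.fft.fft2(f)))

def matrix_convolve(k11, k12, k21, k22, u0, v0):
    """Componentwise matrix convolution (2.2)."""
    u = conv2_periodic(k11, u0) + conv2_periodic(k12, v0)
    v = conv2_periodic(k21, u0) + conv2_periodic(k22, v0)
    return u, v

n = 64
rng = np.random.default_rng(1)
u0 = rng.standard_normal((n, n))
v0 = rng.standard_normal((n, n))

# (i) discrete delta on the diagonal: the matrix convolution is the identity
delta = np.zeros((n, n)); delta[0, 0] = 1.0
zero = np.zeros((n, n))
u_id, v_id = matrix_convolve(delta, zero, zero, delta, u0, v0)

# (ii) unit-mass diagonal and zero-mean off-diagonal kernels preserve the mass
ax = np.arange(n) - n // 2
X, Y = np.meshgrid(ax, ax, indexing="ij")
g = np.exp(-(X**2 + Y**2) / 18.0); g /= g.sum()   # unit-mass Gaussian kernel
h = X * g; h -= h.mean()                          # exactly zero-mean coupling kernel
u_s, v_s = matrix_convolve(g, h, -h, g, u0, v0)
```

For circular convolution, the mass identity \(\sum (k*f)=(\sum k)(\sum f)\) holds exactly, which is why the checks in (ii) are sharp rather than approximate.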

Recall that the matrix convolution operation is distributive and associative but, unlike the convolution of scalar functions, not commutative. Concerning the kernels \(k_{ij}, i,j=1,2\), a first group of assumptions is made:

  1. (H1)

    \(k_{ij}(\cdot ,t)\in L^{1}, {\widehat{k}}_{ij}(\cdot ,t)\in L^{1}, i,j=1,2, t> 0\).

  2. (H2)

    For each \(\mathbf{x}\in {\mathbb {R}}^{2}, i,j=1,2\), \(k_{ij}(\mathbf{x},\cdot ):(0,\infty )\rightarrow {\mathbb {R}}\) is continuous.

Note that, since \({\widehat{k}}_{ij}(\cdot ,t)\in L^{1}\), each \(k_{ij}(\cdot ,t): {\mathbb {R}}^{2}\rightarrow {\mathbb {R}}\) is continuous and bounded.

Remark 1

Hypotheses (H1), (H2) will be required for technical reasons (definition of convolution, inversion of Fourier transform) and also when some scale-space properties are imposed in (2.1). The satisfaction of these properties will require additional assumptions that will be specified in each case in Sect. 2.2.

Remark 2

In a similar way to the scalar case [21, 25, 31], it is not hard to see that the convolution kernel formulation (2.1) can be derived from the assumptions of linear integral operators (in the matrix sense) \({\mathcal {K}}_{t}, t\ge 0\) with matrix kernels \(K_{t}\) such that

$$\begin{aligned} {\mathcal {K}}_{t}{} \mathbf{f}(\mathbf{x})=\int _{{\mathbb {R}}^{2}}K_{t}(\mathbf{x},\mathbf{y}){\mathbf{f}}(\mathbf{y})\hbox {d}{} \mathbf{y},\quad \mathbf{x}\in {\mathbb {R}}^{2},t\ge 0, \end{aligned}$$

and satisfying the translation invariance

$$\begin{aligned} K_{t}(\mathbf{x}-\mathbf{y},\cdot )=K_{t}(\mathbf{x},\mathbf{y}+\cdot ),\quad \mathbf{x},\mathbf{y}\in {\mathbb {R}}^{2}, t\ge 0. \end{aligned}$$

Remark 3

The linear complex diffusion with coefficient \(c=r\hbox {e}^{i\theta }\) can be written as a convolution [12,13,14]

$$\begin{aligned} I(\mathbf{x},t)=(h(\cdot ,t)*I_{0})(\mathbf{x}),\quad \mathbf{x}\in {\mathbb {R}}^{2}, \end{aligned}$$
(2.3)

with kernel

$$\begin{aligned}&h(\mathbf{x},t)=g_{\sigma (t)}(\mathbf{x})\hbox {e}^{i\alpha (\mathbf{x},t)},\quad g_{\sigma }(\mathbf{x})=\frac{1}{2\pi \sigma ^{2}}\hbox {e}^{-\frac{|\mathbf{x}|^{2}}{2\sigma ^{2}}},\nonumber \\&\sigma (t)=\sqrt{\frac{2tr}{\cos \theta }},\quad \alpha (\mathbf{x},t)=\frac{|\mathbf{x}|^{2}\sin {\theta }}{4tr}. \end{aligned}$$
(2.4)

If \(I_{0}=I_{0R}+iI_{0I}, I=I_{R}+iI_{I}\) then (2.3) can be formulated as (2.2) for \(u=I_{R}, u_{0}=I_{0R}, v=I_{I}, v_{0}=I_{0I}\) and \(k_{11}(\mathbf{x},t)=k_{22}(\mathbf{x},t)=h_{R}(\mathbf{x},t), k_{12}(\mathbf{x},t)=-k_{21}(\mathbf{x},t)=h_{I}(\mathbf{x},t)\), where \(h_{R}, h_{I}\) stand, respectively, for the real and imaginary parts of h in (2.4).
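The identification in Remark 3 can be tested numerically. The sketch below is ours: it works directly with the Fourier symbol \(\hbox {e}^{-tc|\xi |^{2}}\) of linear complex diffusion rather than with the spatial kernel (2.4), on an illustrative grid and test image, and it also verifies the small theta approximation recalled in the Introduction: for small \(\theta \), the imaginary part approximates \(\theta t r\) times the Laplacian of a smoothed version of the original image.

```python
import numpy as np

n = 128
x = np.linspace(-1, 1, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
I0 = np.exp(-(X**2 + Y**2) / 0.1)                 # smooth synthetic test image

xi = 2 * np.pi * np.fft.fftfreq(n, d=x[1] - x[0])
XI, ETA = np.meshgrid(xi, xi, indexing="ij")
s2 = XI**2 + ETA**2                               # |xi|^2 on the frequency grid

r, theta, t = 1.0, 0.05, 5e-3                     # small phase angle theta
c = r * np.exp(1j * theta)
I = np.fft.ifft2(np.exp(-t * c * s2) * np.fft.fft2(I0))   # solves I_t = c * Laplacian(I)
I_R, I_I = I.real, I.imag

# small theta approximation: I_I ~ theta*t*r times the Laplacian of a smoothed I0
lap_smooth = np.real(np.fft.ifft2(-s2 * np.exp(-t * r * s2) * np.fft.fft2(I0)))
approx = theta * t * r * lap_smooth
```

Since the symbol equals 1 at \(\xi =\mathbf{0}\), the real part keeps the mass of \(I_{0}\) and the imaginary part has zero mass, as stated below in Remark 4.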

2.2 Scale-Space Properties

As mentioned in the Introduction, the image representation as a scale-space will be here analysed by using the principles of scale invariance. The purpose is then to characterize those matrix kernels \(K(\cdot ,t), t\ge 0\) in such a way that (2.1) satisfies shift invariance, rotational invariance, recursivity (semigroup property) and scale invariance. These four properties will be introduced in the context of cross-diffusion formulation and the requirements for the kernels to satisfy them will be imposed.

2.2.1 Grey-Level Shift Invariance

As in the scalar case, we assume that

  1. (H3)

    The matrix kernel \(K(\cdot ,t), t>0\) is ‘mass preserving’, Pauwels et al. [25], that is

    $$\begin{aligned} {\widehat{k}}_{ii}(\mathbf{0},t)= & {} \int _{{\mathbb {R}}^{2}}k_{ii}(\mathbf{x},t)\hbox {d}{} \mathbf{x}=1,\quad i=1,2,\nonumber \\ {\widehat{k}}_{ij}(\mathbf{0},t)= & {} \int _{{\mathbb {R}}^{2}}k_{ij}(\mathbf{x},t)\hbox {d}{} \mathbf{x}=0, \quad i\ne j. \end{aligned}$$
    (2.5)

Then, for any constant signal \(\mathbf{C}\in {\mathbb {R}}^{2}\) and \(t\ge 0\) we have \(K(\cdot ,t)*\mathbf{C}=\mathbf{C}\) and therefore grey-level shift invariance holds:

$$\begin{aligned} K(\cdot ,t)*(\mathbf{f}+\mathbf{C})=K(\cdot ,t)*\mathbf{f}+\mathbf{C},\quad \mathbf{C}\in {\mathbb {R}}^{2}, t\ge 0, \end{aligned}$$

for any input image \(\mathbf{f}\in X\times X\). Assumption (H3) has an additional consequence:

Lemma 1

Assume that (H1)–(H3) hold. For \(\mathbf{u}=(u,v)\in L^{1}\times L^{1}\) we define \(\mathbf{M}(\mathbf{u})=(m(u),m(v))^{T}\), where

$$\begin{aligned} m(f)=\left( \int _{{\mathbb {R}}^{2}}f(\mathbf{x})\hbox {d}{} \mathbf{x}\right) ,\quad f\in L^{1}. \end{aligned}$$

If \(\mathbf{u}_{0}\in L^{1}\times L^{1}\) and \(\mathbf{u}(\cdot ,t), t\ge 0\) satisfies (2.1) then

$$\begin{aligned} \mathbf{M}(\mathbf{u}(\cdot ,t))=\mathbf{M}(\mathbf{u}_{0}),\quad t\ge 0. \end{aligned}$$
(2.6)

Proof

Note that for any \(f\in L^{1}({\mathbb {R}}^{2})\) and \(i,j=1,2\),

$$\begin{aligned}&\int _{{\mathbb {R}}^{2}}\int _{{\mathbb {R}}^{2}}k_{ij}(\mathbf{x-y},t)f(\mathbf{y})\hbox {d}{} \mathbf{y}\hbox {d}{} \mathbf{x}\nonumber \\&\quad = \int _{{\mathbb {R}}^{2}}\left( \int _{{\mathbb {R}}^{2}}k_{ij}(\mathbf{x-\mathbf y},t)\hbox {d}{} \mathbf{x}\right) f(\mathbf{y})\hbox {d}{} \mathbf{y}\nonumber \\&\quad = \left( \int _{{\mathbb {R}}^{2}}k_{ij}(\mathbf{x},t)\hbox {d}\mathbf{x}\right) \left( \int _{{\mathbb {R}}^{2}}f(\mathbf{x})\hbox {d}\mathbf{x}\right) . \end{aligned}$$
(2.7)

Now, if \(\mathbf{u}_{0}\in L^{1}\times L^{1}\) then the solution of (2.1) satisfies \(\mathbf{u}(\cdot ,t)\in L^{1}\times L^{1}, t\ge 0\); the application of (2.7) to (2.2) and assumption (H3) imply (2.6). \(\square \)

Remark 4

Property (2.6) may be considered as the cross-diffusion version of the average grey-level invariance that holds when the image is represented by a single real-valued function [29]. Which scalar derived from \(\mathbf{M}\) plays the role of the average grey level may, however, depend on how an original real image f is initially distributed into the two components \(\mathbf{u}_{0}=(u_{0},v_{0})^{T}\). For example, in the case of linear complex diffusion, one typically takes \(\mathbf{u}_{0}=(f,0)^{T}\) and, due to the properties of the fundamental solution (2.3) [12,13,14],

$$\begin{aligned} \int _{{\mathbb {R}}^{2}}h_{R}(\mathbf{x},t)\hbox {d}{} \mathbf{x}=1,\quad \int _{{\mathbb {R}}^{2}}h_{I}(\mathbf{x},t)\hbox {d}{} \mathbf{x}=0, \end{aligned}$$

for \(t\ge 0\), we have that \(I=I_{R}+iI_{I}\) of (2.3) satisfies

$$\begin{aligned} m(I_{R}(\cdot ,t))=m(f),\quad m(I_{I}(\cdot ,t))=0. \end{aligned}$$

Then, the role of the average grey level might be played by the integral of the real part of the image, that is, of the first component in the corresponding formulation as a cross-diffusion system. Other choices of the initial distribution may, however, motivate a different definition of the average grey level, e.g. \(m(u)+m(v)\).

Remark 5

(Flat kernels) A second consequence of the mass-preserving assumption (H3) is that

$$\begin{aligned} \lim _{t\rightarrow \infty }k_{ij}(\cdot ,t)=0,\quad i,j=1,2, \end{aligned}$$

which means that the kernels are flat as \(t\rightarrow \infty \), see Weickert et al. [30, 31].

2.2.2 Rotational Invariance

The invariance of the image by rotation is obtained in a similar way to that of the scalar case, see Pauwels et al. [25].

Lemma 2

Under the assumption

  1. (H4)

    For any \(t>0\), \( i,j=1,2\), there exists \(\kappa _{ij}(\cdot ,t)\in L^{1}\) such that \(k_{ij}(\mathbf{x},t)=\kappa _{ij}(|\mathbf{x}|,t)\), \(\mathbf{x} \in {\mathbb {R}}^2\),

let \(T_{\theta }:{\mathbb {R}}^{2}\rightarrow {\mathbb {R}}^{2}\) be a rotation matrix of angle \(\theta \in {\mathbb {R}}\) and for \(\mathbf{u}\in X\times X\) let us define \({\mathcal {T}}_{\theta }:X\times X\rightarrow X\times X\) as

$$\begin{aligned} ({\mathcal {T}}_{\theta }{} \mathbf{u})(\mathbf{x})=\mathbf{u}(T_{\theta }\mathbf{x}),\quad \mathbf{x}\in {\mathbb {R}}^{2}. \end{aligned}$$

Then, for any \(\mathbf{u}_{0}\in X\times X\) and \(t>0\),

$$\begin{aligned} {\mathcal {K}}_{t}({\mathcal {T}}_{\theta }\mathbf{u}_{0})={\mathcal {T}}_{\theta }({\mathcal {K}}_{t}\mathbf{u}_{0}). \end{aligned}$$
(2.8)

Proof

The same arguments as those of the scalar case are applied here, since (H4) implies \(k_{ij}(T_{\theta }{} \mathbf{x},t)=k_{ij}(\mathbf{x},t), t\ge 0, \theta \in {\mathbb {R}}\) and this leads to (2.8). \(\square \)

Remark 6

As in the scalar case, (H4) also implies that for \(i,j=1,2, t>0\), there exists \({\widetilde{\kappa }}_{ij}={\widetilde{\kappa }}_{ij}(\cdot ,t)\in L^{1}\) such that \( {\widehat{k}}_{ij}(\mathbf{\xi },t)={\widetilde{\kappa }}_{ij}(|\mathbf{\xi }|,t), \xi \in {\mathbb {R}}^{2}. \) Moreover

$$\begin{aligned} {\widehat{k}}_{ij}(\mathbf{\xi },t) =2\pi \int _{0}^{\infty } \kappa _{ij}(\rho ,t)J_{0}(\rho |\xi |)\rho \,\hbox {d}\rho ,\quad t>0, \end{aligned}$$

where \(J_{0}(z)\) is the zeroth order Bessel function, see Pauwels et al. [25].
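This representation can be checked against a radial profile with a known transform. The following quadrature sketch (ours) uses a Gaussian radial profile, whose two-dimensional Fourier transform is \(2\pi \hbox {e}^{-|\xi |^{2}/2}\), together with the standard \(2\pi \) and \(\rho \,\hbox {d}\rho \) normalization of the two-dimensional radial (Hankel-type) Fourier transform:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0

def radial_ft(kappa, s):
    """2D Fourier transform of a radial function kappa(|x|) at |xi| = s,
    computed as 2*pi * int_0^inf kappa(rho) J0(rho*s) rho drho."""
    val, _ = quad(lambda rho: kappa(rho) * j0(rho * s) * rho, 0.0, np.inf)
    return 2.0 * np.pi * val

kappa = lambda rho: np.exp(-rho**2 / 2.0)      # Gaussian radial profile
# exact 2D transform of this profile: 2*pi*exp(-s^2/2)
vals = [(radial_ft(kappa, s), 2.0 * np.pi * np.exp(-s**2 / 2.0))
        for s in (0.0, 0.5, 1.0, 2.0)]
```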

2.2.3 Recursivity (Semigroup Property)

Here we assume that:

  1. (H5)

    The family of operators \(\{{\mathcal {K}}_{t}, t\ge 0\}\) satisfies the semigroup properties:

    $$\begin{aligned}&\lim _{t\rightarrow 0+}{\mathcal {K}}_{t}{} \mathbf{f}=\mathbf{f},\quad \mathbf{f}\in X\times X,\\&{\mathcal {K}}_{t+s}={\mathcal {K}}_{t}{\mathcal {K}}_{s}, \quad t,s\ge 0. \end{aligned}$$

Note that the first property in (H5) means that for \(i,j=1,2\)

$$\begin{aligned} \lim _{t\rightarrow 0+}k_{ij}(\cdot ,t)=\left\{ \begin{matrix}0&{}\quad i\ne j\\ \delta (\cdot )&{}\quad i=j,\end{matrix}\right. \end{aligned}$$

where \(\delta (\cdot )\) denotes the Dirac delta distribution. In terms of the convolution matrices, (H5) reads

$$\begin{aligned}&K(\cdot ,0)=\delta (\cdot )I,\nonumber \\&(K(\cdot ,t+s)*\mathbf{f})(\mathbf{x})=(K(\cdot ,t)*K(\cdot ,s)*\mathbf{f})(\mathbf{x}),\nonumber \\&\mathbf{x}\in {\mathbb {R}}^{2}, \mathbf{f}\in X\times X. \end{aligned}$$
(2.9)

On the other hand, the formulation of (2.1) as an initial value problem of PDE for \(\mathbf{u}\) is related to the existence of an infinitesimal generator of the semigroup \(\{{\mathcal {K}}_{t}, t\ge 0\}\), defined as [26, 34]

$$\begin{aligned} D\mathbf{f}=\lim _{h\rightarrow 0^{+}}\frac{{{\mathcal {K}}_{h}}\mathbf{f}-\mathbf{f}}{h}, \quad \mathbf{f}\in X\times X, \end{aligned}$$
(2.10)

and with domain \(\mathrm {dom}(D)\) consisting of those functions \(\mathbf{f}\in X\times X\) for which the limit in (2.10) exists. The existence of D is guaranteed under the additional assumption of continuity:

  1. (H6)

    If \(||\cdot ||{_{*}}\) is the norm defined in (1.4) then

    $$\begin{aligned} \lim _{t\rightarrow 0+} ||K(\cdot ,t)-K(\cdot ,0)||{_{*}}=0. \end{aligned}$$

Lemma 3

Under the assumptions (H1), (H2), (H5) and (H6) the function \(\mathbf{u}(t)=\mathbf{u}(\cdot ,t)\) given by (2.1) is a weak solution of the initial value problem

$$\begin{aligned} \mathbf{u}_{t}(t)= & {} D \mathbf{u}(t),\quad t>0,\nonumber \\ \mathbf{u}(0)= & {} \mathbf{u}_{0}(\mathbf{x}), \end{aligned}$$
(2.11)

where D is the linear operator (2.10).

Proof

Note that for \(\mathbf{f}\in X\times X\) (see e.g. Brezis [4], Pazy [26] and Yosida [34])

$$\begin{aligned}&||K(\cdot ,t)*\mathbf{f}-K(\cdot ,0)*\mathbf{f}||_{X\times X}\\&\quad \le ||K(\cdot ,t)-K(\cdot ,0)||_{*}||\mathbf{f}||_{X\times X}. \end{aligned}$$

Therefore, by (H5), (H6) we have that \(\{{\mathcal {K}}_{t}, t\ge 0\}\) is a \(C_{0}\)-semigroup of bounded linear operators on \(X\times X\). This implies that \(\mathbf{u}(t)={\mathcal {K}}_{t}{} \mathbf{u}_{0}\) is the unique weak solution of (2.11). (If \(\mathbf{u}_{0}\in \mathrm {dom}(D)\) then \(\mathbf{u}(t)\) is a strong solution, e.g. Pazy [26].) \(\square \)

It is worth mentioning the Fourier representation of the matrix operator (2.10). If we define

$$\begin{aligned} {\widehat{K}}(\xi ,t)=\begin{pmatrix}{\widehat{k}}_{11}(\xi ,t)&{}\quad {\widehat{k}}_{12}(\xi ,t)\\ {\widehat{k}}_{21}(\xi ,t)&{}\quad {\widehat{k}}_{22}(\xi ,t)\end{pmatrix},\quad \xi \in {\mathbb {R}}^{2}, \quad t\ge 0,\nonumber \\ \end{aligned}$$
(2.12)

then the second property in (2.9) can be written as

$$\begin{aligned} {\widehat{K}}(\xi ,t+s)={\widehat{K}}(\xi ,t){\widehat{K}}(\xi ,s), \quad t,s\ge 0. \end{aligned}$$
(2.13)

Now, denoting by \(D_{ij}, i,j=1,2,\) the entries of D, the system (2.11) can be written in terms of the Fourier symbols \({\widehat{D}}(\xi ):=\{{\widehat{D}}_{ij}(\xi )\}_{i,j=1,2}\) as the linear evolution problem

$$\begin{aligned} \frac{d}{dt}\begin{pmatrix}{\widehat{u}}(\mathbf{\xi },t)\\ {\widehat{v}}(\mathbf{\xi },t)\end{pmatrix}= & {} \begin{pmatrix}{\widehat{D}}_{11}(\mathbf{\xi })&{}\quad {\widehat{D}}_{12}(\mathbf{\xi })\\ {\widehat{D}}_{21}(\mathbf{\xi })&{}\quad {\widehat{D}}_{22}(\mathbf{\xi })\end{pmatrix}\begin{pmatrix}{\widehat{u}}(\mathbf{\xi },t)\\ {\widehat{v}}(\mathbf{\xi },t)\end{pmatrix},\nonumber \\&\mathbf{\xi }\in {\mathbb {R}}^{2}, \quad t>0, \end{aligned}$$
(2.14)

with \({\widehat{u}}(\mathbf{\xi },0)={\widehat{u}}_{0}(\mathbf{\xi }), {\widehat{v}}(\mathbf{\xi },0)={\widehat{v}}_{0}(\mathbf{\xi })\). By taking the Fourier transform in (2.2) and comparing with (2.14) we can write (2.12) in the form of the Lie-group exponential map

$$\begin{aligned} {\widehat{K}}(\mathbf{\xi },t)=\hbox {e}^{t{\widehat{D}}(\mathbf{\xi })},\quad \mathbf{\xi }\in {\mathbb {R}}^{2}, \quad t>0. \end{aligned}$$
(2.15)
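Since the matrices \(t{\widehat{D}}(\xi )\) and \(s{\widehat{D}}(\xi )\) commute, the representation (2.15) makes the Fourier semigroup identity (2.13) immediate. A quick numerical confirmation of ours, with an illustrative positive definite matrix and a symbol of the form \(-|\xi |^{2}d\) (as derived in Sect. 2.3):

```python
import numpy as np
from scipy.linalg import expm

d = np.array([[1.0, 0.3], [0.2, 0.8]])      # illustrative positive definite matrix

def K_hat(xi_mod, t, p=2):
    """Fourier kernel exp(t * D_hat) with symbol D_hat = -|xi|^p d."""
    return expm(-t * xi_mod**p * d)
```

Because the exponents are multiples of one matrix, \(\hbox {e}^{t{\widehat{D}}}\hbox {e}^{s{\widehat{D}}}=\hbox {e}^{(t+s){\widehat{D}}}\) holds exactly, mode by mode.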

Remark 7

We also note that the assumption for rotational invariance (H4) implies that the matrix \({\widehat{D}}(\xi )\) only depends on \(|\xi |\). Furthermore, (H1) implies that \({\widehat{k}}_{ij}(\cdot ,t)\in C_{0}({\mathbb {R}}^{2}), i,j=1,2\). Then, \({\widehat{k}}_{ij}(\xi ,t)\rightarrow 0\) as \(|\xi |\rightarrow \infty \). Due to the form of \({\widehat{K}}\) in (2.15), this necessarily leads to

$$\begin{aligned} \lim _{|\mathbf{\xi }|\rightarrow \infty }{\widehat{D}}_{ij}(\mathbf{\xi })=-\infty , \quad i,j=1,2, \end{aligned}$$

which forces \({\widehat{D}}(\xi )\) to be negative definite for all \(\xi \in {\mathbb {R}}^{2}\) (that is, the real part of its eigenvalues must be negative for all \(\xi \in {\mathbb {R}}^{2}\)).

2.2.4 Scale Invariance

The last property to formulate is the scale invariance [1]. We define

$$\begin{aligned} S_{\lambda }{} \mathbf{f}(\mathbf{x})=\mathbf{f}(\lambda \mathbf{x}),\quad \mathbf{x}\in {\mathbb {R}}^{2},\quad \mathbf{f}\in X\times X, \end{aligned}$$

and assume that:

  1. (H7)

    For any \(\lambda \in {\mathbb {R}}\) and \(t>0\) there is \(t^{\prime }= \phi (t)\) such that

    $$\begin{aligned} S_{\lambda }{\mathcal {K}}_{t^{\prime }}={\mathcal {K}}_{t}S_{\lambda }. \end{aligned}$$

In Lindeberg and ter Haar Romeny [20] it is argued that, in the context of image processing, the relation between t and the scale represented by the standard deviation \(\sigma \) in Gaussian filtering (\(t=\sigma ^{2}/2\)) can be generalized and assumed to exist from the beginning of the process, by postulating the existence of time (t) and scale (\(\sigma \)) parameters and some connection (\(\varphi \)) between them. Following this argument, we first introduce a scale parameter \(\sigma \), related to the semigroup parameter t by a suitable transformation (to be specified later)

$$\begin{aligned} t=\varphi (\sigma ). \end{aligned}$$
(2.16)

When the kernels are viewed as functions of the new parameter \(\sigma \) (for which, for simplicity, the same notation is used), the second property in (H5) reads

$$\begin{aligned} K(\cdot ,\sigma _{1})*K(\cdot ,\sigma _{2})=K(\cdot ,\varphi ^{-1}(\varphi (\sigma _{1})+\varphi (\sigma _{2}))), \end{aligned}$$
(2.17)

while the first one implies \(\varphi (0)=0\). In order to preserve the qualitative requirement (which is one of the bases of the scale-space theory, see Lindeberg [22]) that increasing values of the scale parameter should correspond to representations at coarser scales, we must assume that \(\varphi :{\mathbb {R}}_{+}\rightarrow {\mathbb {R}}_{+}\) is monotonically increasing (in particular, invertible). (In Pauwels et al. [25] this \(\varphi \) can be identified as \(\psi ^{-1}\) defined there.)
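With \(\varphi (\sigma )=\sigma ^{p}\) (the form obtained in Sect. 2.3 below), property (2.17) can be confirmed numerically even for a non-even exponent p. Another small check of ours, with illustrative parameter values and the Fourier-side kernel \(\hbox {e}^{-|\xi \sigma |^{p}d}\):

```python
import numpy as np
from scipy.linalg import expm

d = np.array([[1.0, 0.2], [0.1, 0.6]])       # illustrative positive definite matrix
p = 1.5                                      # non-even exponent (nonlocal generator)

def K_hat(xi_mod, sigma):
    """Kernel exp(-|xi*sigma|^p d) in Fourier variables."""
    return expm(-abs(xi_mod * sigma)**p * d)

phi = lambda sigma: sigma**p                 # scale-to-time map (2.16)
phi_inv = lambda t: t**(1.0 / p)
```

Convolution in space becomes a matrix product in Fourier variables, so the check below is exactly the recursivity (2.17) mode by mode.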

2.3 Linear Filtering Characterization

The characterization of those matrix kernels \(K(\cdot ,t)\), \(t\ge 0\), such that the scale-space representation (2.1) satisfies shift invariance, rotational invariance, recursivity and scale invariance is established in this section. Assume that in terms of the scale \(\sigma \), (2.1) is written in the matrix form

$$\begin{aligned} \mathbf{F}(\cdot ,\sigma )=K(\cdot ,\sigma )*\mathbf{f}, \end{aligned}$$

where \(\mathbf{f}=(f,g)^{T}, \mathbf{F}=(F,G)^{T}\); in Fourier representation this is

$$\begin{aligned} \widehat{\mathbf{F}}(\mathbf{\xi },\sigma )={\widehat{K}}(\mathbf{\xi },\sigma )\widehat{\mathbf{f}}(\mathbf{\xi }). \end{aligned}$$
(2.18)

Theorem 1

If we assume that \(\{{\mathcal {K}}_{t}, t\ge 0\}\), defined by (2.1), satisfies (H1)–(H7) then there exist \(p>0\) and a positive definite matrix d such that the kernels \(K(\cdot ,t), t>0\), must have the Fourier representation (2.15) with

$$\begin{aligned} {\widehat{D}}(\xi )=-|\xi |^{p}d. \end{aligned}$$
(2.19)

Proof

The proof follows an adaptation of the steps given in Lindeberg and ter Haar Romeny [20], for the scalar case.

  1. (A)

    Dimensional analysis Scale invariance (H7) will allow a simplification of (2.18) by using dimensional analysis, see e.g. Yarin [33]. In this case, taking e.g. the dimensionless variables \(\xi _{1}\sigma , \xi _{2}\sigma , {\widehat{f}}_{1}{\widehat{f}}_{2}^{-1}\) (where \(\mathbf{\xi }=(\xi _{1},\xi _{2})^{T}\in {\mathbb {R}}^{2}\)) and applying the Pi-Theorem, there is a matrix-valued function \({\widetilde{K}}:{\mathbb {R}}^{2}\rightarrow {\mathbb {R}}^{2\times 2}\) with \({\widetilde{K}}(\mathbf{0})=I\) (in order to have \(\widehat{\mathbf{F}}(\mathbf{\xi },0)=\widehat{\mathbf{f}}(\mathbf{\xi })\)) such that the system (2.18) can be written in the form

    $$\begin{aligned} \widehat{\mathbf{F}}(\mathbf{\xi },\sigma )={\widetilde{K}}(\mathbf{\xi }\sigma )\widehat{\mathbf{f}}(\mathbf{\xi }). \end{aligned}$$

    Furthermore, rotational invariance implies that \({\widetilde{K}}(\mathbf{\xi }\sigma )={\widetilde{K}}(|\mathbf{\xi }\sigma |)\) and therefore

    $$\begin{aligned} \widehat{\mathbf{F}}(\mathbf{\xi },\sigma )={\widetilde{K}}(|\mathbf{\xi }\sigma |)\widehat{\mathbf{f}}(\mathbf{\xi }). \end{aligned}$$
    (2.20)
  2. (B)

    Scale invariance According to (2.17) and (2.20), the semigroup condition (H5) is of the form

    $$\begin{aligned}&{\widetilde{K}}(|\mathbf{\xi }\sigma _{1}|){\widetilde{K}}(|\mathbf{\xi }\sigma _{2}|)\nonumber \\&\quad ={\widetilde{K}}(|\mathbf{\xi }\varphi ^{-1}(\varphi (\sigma _{1})+\varphi (\sigma _{2}))|), \end{aligned}$$
    (2.21)

    for \(\sigma _{1},\sigma _{2}\ge 0\). The same arguments as those in Lindeberg and ter Haar Romeny ([20], Section 1.5.6) can be used to show that scale invariance implies that \(\varphi \) in (2.16) must be of the form

    $$\begin{aligned} \varphi (\sigma )=C\sigma ^{p}, \end{aligned}$$

    for some constant \(C>0\) (which can be taken as \(C=1\)) and \(p>0\). (In Pauwels et al. [25], p is identified as \(\alpha \).) Hence, if \(H(x^{p})\equiv {\widetilde{K}}(x)\) then (2.21) reads

    $$\begin{aligned} H(|\mathbf{\xi }\sigma _{1}|^{p})H(|\mathbf{\xi }\sigma _{2}|^{p})= & {} {\widetilde{K}}(|\mathbf{\xi }\sigma _{1}|){\widetilde{K}}(|\mathbf{\xi }\sigma _{2}|)\\= & {} {\widetilde{K}}(|\mathbf{\xi }\varphi ^{-1}(\varphi (\sigma _{1})+\varphi (\sigma _{2}))|)\\= & {} {\widetilde{K}}((|\mathbf{\xi }\sigma _{1}|^{p}+|\mathbf{\xi }\sigma _{2}|^{p})^{1/p})\\= & {} H(((|\mathbf{\xi }\sigma _{1}|^{p}+|\mathbf{\xi }\sigma _{2}|^{p})^{1/p})^{p})\\= & {} H(|\mathbf{\xi }\sigma _{1}|^{p}+|\mathbf{\xi }\sigma _{2}|^{p}), \end{aligned}$$

    which is identified as the functional equation

    $$\begin{aligned} {\varPsi }(\alpha _{1}){\varPsi }(\alpha _{2})={\varPsi }(\alpha _{1}+\alpha _{2}) \end{aligned}$$

    characterizing the matrix exponential function. Therefore, \({\widetilde{K}}\) must be of the form

    $$\begin{aligned} {\widetilde{K}}(|\mathbf{\xi }\sigma |)=H(|\mathbf{\xi }\sigma |^{p})=\hbox {e}^{|\mathbf{\xi }\sigma |^{p}A}, \quad p>0, \end{aligned}$$

    for some \(2\times 2\) real matrix A. Now the arguments given in Remark 7 show that A must be negative definite; writing \(A=-d\), this means

    $$\begin{aligned} {\widehat{K}}(\mathbf{\xi },\sigma )={\widetilde{K}}(|\mathbf{\xi }\sigma |)=\hbox {e}^{-|\mathbf{\xi }\sigma |^{p}d},\quad \xi \in {\mathbb {R}}^{2}, \end{aligned}$$
    (2.22)

    where d is a \(2\times 2\) positive definite matrix. Writing (2.22) in terms of the original scale t leads to the representation (2.15) with \({\widehat{D}}(\xi )\) given by (2.19).\(\square \)
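The algebra closing step (B) can be checked numerically: since the exponents \(-|\mathbf{\xi }\sigma |^{p}d\) are all scalar multiples of one matrix, the kernels commute and the semigroup relation (2.21) with \(\varphi (\sigma )=\sigma ^{p}\) holds exactly for the symbol (2.22). A minimal sketch (the matrix d and the sampled values are illustrative choices), assuming NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import expm

# Illustrative positive definite (non-symmetric) matrix d and exponent p
d = np.array([[1.0, 0.1], [1.0, 1.1]])
p = 2.0

def K_hat(xi_sigma):
    """Fourier kernel (2.22): exp(-|xi*sigma|^p d)."""
    return expm(-abs(xi_sigma) ** p * d)

# Semigroup relation (2.21) with phi(sigma) = sigma^p: both exponents are
# scalar multiples of the same matrix d, so the factors commute and combine.
xi, s1, s2 = 0.7, 0.3, 0.5
lhs = K_hat(xi * s1) @ K_hat(xi * s2)
rhs = K_hat(xi * (s1 ** p + s2 ** p) ** (1.0 / p))
```

Here `lhs` and `rhs` agree to machine precision, and \({\widetilde{K}}(0)=I\) recovers the initial condition.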

Remark 8

The form (2.19) corresponds to specific forms of the infinitesimal generator D. Note that if \(\mathbf{f}\in X\times X\) then

$$\begin{aligned} \widehat{\left( \frac{{\mathcal {K}}_{h}{} \mathbf{f}-\mathbf{f}}{h}\right) }(\xi )=\left( \frac{\hbox {e}^{-h|\mathbf{\xi }|^{p}d}-I}{h}\right) \widehat{\mathbf{f}}(\mathbf{\xi }), \end{aligned}$$

and formally

$$\begin{aligned} \frac{\hbox {e}^{-h|\mathbf{\xi }|^{p}d}-I}{h}= & {} \sum _{j=1}^{\infty } \frac{(-1)^{j}h^{j-1}|\mathbf{\xi }|^{jp}}{j!}d^{j}. \end{aligned}$$

Thus,

$$\begin{aligned} \lim _{h\rightarrow 0^{+}}\frac{\hbox {e}^{-h|\mathbf{\xi }|^{p}d}-I}{h}=-|\mathbf{\xi }|^{p}d, \end{aligned}$$

and the limit is the Fourier symbol of the operator [26]

$$\begin{aligned} D\mathbf{f}=-(-{\varDelta })^{p/2}d\mathbf{f}, \end{aligned}$$
(2.23)

with \({\varDelta }\) standing for the Laplace operator and where \((-{\varDelta })^{p/2}\) multiplies each entry of d, cf. Pauwels et al. [25].

The explicit Formula (2.23) can be used to discuss the additional scale-space property of locality [30, 31]. A semigroup of operators \(T_{t}, t\ge 0\), satisfies the locality condition if for all smooth \(\mathbf{f}, \mathbf{g}\) in its domain and all \(\mathbf{x}\in {\mathbb {R}}^{2}\)

$$\begin{aligned} (T_{t}{} \mathbf{f}-T_{t}{} \mathbf{g})(\mathbf{x})=o(t),\quad t\rightarrow 0^{+}, \end{aligned}$$

whenever the derivatives of \(\mathbf{f}\) and \(\mathbf{g}\) of any nonnegative order are identical. Mathematically [1, 25], the locality condition implies that the corresponding infinitesimal generator is a local differential operator, which in the case of (2.23) means that \(p/2\) must be an integer, extending to the cross-diffusion framework the result obtained in Pauwels et al. [25] for the scalar case.
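The generator (2.23) is straightforward to realize spectrally, which also makes the role of p concrete: the scalar operator \(A=-(-{\varDelta })^{p/2}\) is the Fourier multiplier by \(-|\mathbf{\xi }|^{p}\), local exactly when p/2 is an integer. A one-dimensional sketch (grid and test signal are illustrative), assuming NumPy:

```python
import numpy as np

def frac_laplacian_1d(f, L, p):
    """Apply A = -(-Delta)^{p/2} on (-L, L) spectrally: multiply the DFT
    coefficients by the Fourier symbol -|xi|^p."""
    N = f.size
    xi = 2 * np.pi * np.fft.fftfreq(N, d=2 * L / N)  # angular frequencies
    return np.real(np.fft.ifft(-np.abs(xi) ** p * np.fft.fft(f)))

# Sanity check: for the local case p = 2, A is the Laplacian, so
# A sin = -sin on (-pi, pi); p = 4 also gives -sin, since |xi| = 1 there.
L = np.pi
x = -L + (2 * L / 256) * np.arange(256)
f = np.sin(x)
Af = frac_laplacian_1d(f, L, 2.0)
```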

Remark 9

Note that when d has complex conjugate eigenvalues, Lemma 4 in “Appendix” shows that there is a basis in \({\mathbb {R}}^{2}\) such that d is similar to a matrix of the form

$$\begin{aligned} \begin{pmatrix} \nu & -\mu \\ \mu & \nu \end{pmatrix}, \end{aligned}$$
(2.24)

with \(\nu >0, \mu \ne 0\), where the eigenvalues of d are \(\nu \pm i\mu \). Therefore, linear complex diffusion corresponds to the case of (2.23) with \(p=2\) and a matrix d of the form (2.24). The complex diffusion coefficient is given by \(c=\nu +i\mu \) or \(c=\nu -i\mu \). Formula (2.23) shows that this linear complex diffusion can be generalized by using other values of p.
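This correspondence is easy to verify: a matrix of the form (2.24) represents multiplication by \(c=\nu +i\mu \) on \({\mathbb {R}}^{2}\cong {\mathbb {C}}\), so its exponential represents multiplication by \(\hbox {e}^{c}\). A sketch with illustrative values, assuming NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import expm

nu, mu, a = 1.0, 0.5, -0.3            # a plays the role of the scalar t*A(xi)
c = nu + 1j * mu                      # complex diffusion coefficient
d = np.array([[nu, -mu], [mu, nu]])   # form (2.24)

# e^{a d} acting on (Re z, Im z)^T equals multiplication of z by e^{a c}
z0 = 0.8 - 0.2j
w = np.exp(a * c) * z0
v = expm(a * d) @ np.array([z0.real, z0.imag])
```

Here `v` and `(w.real, w.imag)` coincide, identifying the complex-diffusion semigroup with a two-component cross-diffusion semigroup.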

Remark 10

The nature of the semigroup \(\{{\mathcal {K}}_{t}, t\ge 0\}\) can be analysed from the spectrum and regularity of the infinitesimal generator (2.23) [26]. In this sense, the following result holds.

Theorem 2

Let \(k\ge 0\). The operator D in (2.23) with domain \(\mathrm {dom}(D)=H^{k+1}\times H^{k+1}\) is the infinitesimal generator of a \(C_{0}\)-semigroup \(\{{\mathcal {K}}_{t}, t\ge 0\}\) and there exists \(M>0\) such that the induced norm satisfies \(||{\mathcal {K}}_{t}||\le M\). Furthermore, if d is of one of the three reduced forms

$$\begin{aligned} d_{1}= & {} \begin{pmatrix}\lambda _{+}& 0\\ 0& \lambda _{-}\end{pmatrix},\quad \lambda _{+}\ge \lambda _{-}>0,\nonumber \\ d_{2}= & {} \begin{pmatrix}\alpha & \beta \\ 0& \alpha \end{pmatrix},\quad \alpha \ge \beta>0,\nonumber \\ d_{3}= & {} \begin{pmatrix}\nu & -\mu \\ \mu & \nu \end{pmatrix},\quad \nu >0, \mu \ne 0, \end{aligned}$$
(2.25)

then \(M\le 1\).

Proof

We first prove that for each reduced form \(d_{j}, j=1,2,3\) in (2.25), \(\{{\mathcal {K}}_{t}, t\ge 0\}\) is a \(C_{0}\)-semigroup of contractions. Consider the resolvent equation for D:

$$\begin{aligned} (\lambda I-D)\mathbf{u}=\mathbf{f}, \end{aligned}$$
(2.26)

where I is the \(2\times 2\) identity matrix, \(\mathbf{u}=(u,v)^{T}, \mathbf{f}=(f,g)^{T}\). Assume that in (2.19) \(d=d_{1}\). In terms of the Fourier transform, (2.26) reads

$$\begin{aligned} (\lambda +|\xi |^{p}\lambda _{+}){\widehat{u}}(\mathbf{\xi })= & {} {\widehat{f}}(\mathbf{\xi }),\nonumber \\ (\lambda +|\xi |^{p}\lambda _{-}){\widehat{v}}(\mathbf{\xi })= & {} {\widehat{g}}(\mathbf{\xi }),\quad \xi \in {\mathbb {R}}^{2}. \end{aligned}$$

Then, since \(\lambda _{+}\ge \lambda _{-}>0\), for any \(\lambda >0\) we have

$$\begin{aligned} \frac{1}{|\lambda +|\xi |^{p}\lambda _{\pm }|}\le \frac{1}{\lambda },\quad \xi \in {\mathbb {R}}^{2}. \end{aligned}$$
(2.27)

When \(d=d_{2}\), the Fourier system associated to (2.26) is now of the form

$$\begin{aligned} (\lambda +|\xi |^{p}\alpha ){\widehat{u}}(\mathbf{\xi })+|\xi |^{p}\beta {\widehat{v}}(\mathbf{\xi })= & {} {\widehat{f}}(\mathbf{\xi }),\nonumber \\ (\lambda +|\xi |^{p}\alpha ){\widehat{v}}(\mathbf{\xi })= & {} {\widehat{g}}(\mathbf{\xi }),\quad \xi \in {\mathbb {R}}^{2}. \end{aligned}$$

Now, since \(0<\beta \le \alpha \), for any \(\lambda >0\),

$$\begin{aligned} \frac{|\xi |^{p}\beta }{|\lambda +|\xi |^{p}\alpha |}\le 1,\quad \frac{1}{|\lambda +|\xi |^{p}\alpha |}\le \frac{1}{\lambda },\quad \xi \in {\mathbb {R}}^{2}. \end{aligned}$$
(2.28)

Finally, when \(d=d_{3}\), the Fourier representation of (2.26) has the form

$$\begin{aligned} (\lambda +\nu |\mathbf{\xi }|^{p}){\widehat{u}}(\mathbf{\xi })-\mu |\mathbf{\xi }|^{p}{\widehat{v}}(\mathbf{\xi })= & {} {\widehat{f}}(\mathbf{\xi }),\nonumber \\ \mu |\mathbf{\xi }|^{p}{\widehat{u}}(\mathbf{\xi })+(\lambda +\nu |\mathbf{\xi }|^{p}){\widehat{v}}(\mathbf{\xi })= & {} {\widehat{g}}(\mathbf{\xi }). \end{aligned}$$
(2.29)

Inverting (2.29) leads to

$$\begin{aligned} {\widehat{u}}(\mathbf{\xi })= & {} \frac{(\lambda +\nu |\mathbf{\xi }|^{p})}{m(\mathbf{\xi })}{\widehat{f}}(\mathbf{\xi })+\frac{\mu |\mathbf{\xi }|^{p}}{m(\mathbf{\xi })}{\widehat{g}}(\mathbf{\xi }),\\ {\widehat{v}}(\mathbf{\xi })= & {} -\frac{\mu |\mathbf{\xi }|^{p}}{m(\mathbf{\xi })}{\widehat{f}}(\mathbf{\xi })+\frac{(\lambda +\nu |\mathbf{\xi }|^{p})}{m(\mathbf{\xi })}{\widehat{g}}(\mathbf{\xi }),\\ m(\mathbf{\xi })= & {} (\lambda +\nu |\mathbf{\xi }|^{p})^{2}+(\mu |\mathbf{\xi }|^{p})^{2}. \end{aligned}$$

Note now that since \(\nu >0\), for any \(\lambda >0\) we have

$$\begin{aligned} \frac{(\lambda +\nu |\mathbf{\xi }|^{p})}{m(\mathbf{\xi })}\le \frac{1}{\lambda +\nu |\mathbf{\xi }|^{p}}\le \frac{1}{\lambda }, \end{aligned}$$
(2.30)

and also, since

$$\begin{aligned} |\lambda +\nu |\mathbf{\xi }|^{p}||\mu |\mathbf{\xi }|^{p}|\le \frac{m(\mathbf{\xi })}{2}, \end{aligned}$$

then

$$\begin{aligned} \frac{|\mu |\mathbf{\xi }|^{p}|}{m(\mathbf{\xi })}\le \frac{1}{2(\lambda +\nu |\mathbf{\xi }|^{p})}\le \frac{1}{2\lambda }. \end{aligned}$$
(2.31)

Finally, the application of the Hille–Yosida theorem to each case, using the corresponding estimates (2.27), (2.28), (2.30) and (2.31), proves the second part of the theorem. For the general case, we note that, using the eigenvalues of d, there is a nonsingular matrix P such that \({\varLambda }=PdP^{-1}\) is of one of the three forms in (2.25) (see Lemma 4). Then, the theorem follows with \(M=||P||\,||P^{-1}||\). \(\square \)
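The contraction property established in the proof can be probed numerically by sampling the symbol \(\hbox {e}^{-sd}\), with \(s=t|\mathbf{\xi }|^{p}\ge 0\), for the reduced forms (2.25): its spectral norm should never exceed one. A sketch with illustrative parameter choices, assuming NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import expm

# Reduced forms (2.25); illustrative parameters satisfying the constraints
d1 = np.diag([2.0, 1.0])                  # lambda_+ >= lambda_- > 0
d2 = np.array([[1.5, 1.0], [0.0, 1.5]])   # alpha >= beta > 0
d3 = np.array([[1.0, -0.7], [0.7, 1.0]])  # nu > 0, mu != 0

# s plays the role of t|xi|^p >= 0; sample the symbol e^{-s d}
samples = np.linspace(0.0, 20.0, 200)
max_norm = max(np.linalg.norm(expm(-s * d), 2)
               for d in (d1, d2, d3) for s in samples)
```

The maximum over all samples stays at one (attained at \(s=0\), where the symbol is the identity), consistent with \(M\le 1\).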

2.4 Generalized Small Theta Approximation

One of the arguments for considering complex diffusion as an alternative for image processing is the so-called small theta approximation [14]. This means that, for small values of the imaginary part of the complex diffusion coefficient, the corresponding imaginary part of the solution of the evolutionary diffusion problem behaves, in the limit, as a scaled and smoothed Laplacian of the initial signal. This idea can also be discussed in the context of cross-diffusion systems (2.1), where D is the infinitesimal generator (2.23), that is

$$\begin{aligned} \mathbf{u}(\mathbf{x},t)=\hbox {e}^{tD}{} \mathbf{u}_{0}(\mathbf{x})=\hbox {e}^{-t(-{\varDelta })^{p/2}d}{} \mathbf{u}_{0}(\mathbf{x}). \end{aligned}$$
(2.32)

By using the notation introduced in Lemma 4, we have the following result.

Theorem 3

Define the operator \(A=-(-{\varDelta })^{p/2}, p>0\) and let d be a positive definite matrix with eigenvalues and parameters given by (3.2). Assume that d satisfies one of the cases (i), (iii) or (iv) in Lemma 4. Let \(f\in X\) be a real-valued function. If \(\mathbf{u}(\mathbf{x},t)=(u(\mathbf{x},t),v(\mathbf{x},t))^{T}\) then (2.32) satisfies:

    (C1) If \(|s|\rightarrow 0\) and \(\mathbf{u}_{0}=(f,0)^{T}\) then

    $$\begin{aligned}&\lim _{d_{21}\rightarrow 0}{u(\mathbf{x},t)}=\hbox {e}^{\frac{q}{2} tA}f(\mathbf{x}),\\&\lim _{d_{21}\rightarrow 0}\frac{v(\mathbf{x},t)}{d_{21}}=tA\left( \hbox {e}^{\frac{q}{2} tA}f(\mathbf{x})\right) . \end{aligned}$$
    (C2) If \(|s|\rightarrow 0\) and \(\mathbf{u}_{0}=(0,f)^{T}\) then

    $$\begin{aligned}&\lim _{d_{12}\rightarrow 0}\frac{u(\mathbf{x},t)}{d_{12}}=tA\left( \hbox {e}^{\frac{q}{2} tA}f(\mathbf{x})\right) ,\\&\lim _{d_{12}\rightarrow 0}{v(\mathbf{x},t)}=\hbox {e}^{\frac{q}{2} tA}f(\mathbf{x}), \end{aligned}$$

where \(q=\mathrm {tr}(d)\) is the trace of d.

Proof

We can write (2.32) in the form

$$\begin{aligned} \mathbf{u}(\mathbf{x},t)=P\hbox {e}^{tA{\varLambda }}P^{-1}{} \mathbf{u}_{0}(\mathbf{x}), \end{aligned}$$
(2.33)

where P and \({\varLambda }\) are the corresponding matrices specified in Lemma 4 in each case. Specifically, after some tedious but straightforward calculations, we have:

  • In the case (i)

    $$\begin{aligned} u(\mathbf{x},t)= & {} \hbox {e}^{\frac{q}{2} tA}\left( \cosh (t\mu A)-\frac{r}{2\mu }\sinh (t\mu A)\right) u_{0}(\mathbf{x})\nonumber \\&+\,\hbox {e}^{\frac{q}{2} tA}\left( \frac{d_{12}}{\mu }\sinh (t\mu A)\right) v_{0}(\mathbf{x}),\nonumber \\ v(\mathbf{x},t)= & {} \hbox {e}^{\frac{q}{2} tA}\left( \cosh (t\mu A)+\frac{r}{2\mu }\sinh (t\mu A)\right) v_{0}(\mathbf{x})\nonumber \\&+\,\hbox {e}^{\frac{q}{2} tA}\left( \frac{d_{21}}{\mu }\sinh (t\mu A)\right) u_{0}(\mathbf{x}), \end{aligned}$$
    (2.34)
  • In the case (iii)

    $$\begin{aligned} u(\mathbf{x},t)= & {} \hbox {e}^{\frac{q}{2} tA}\left( (1-\frac{rt}{2}A)u_{0}(\mathbf{x})+d_{12}tAv_{0}(\mathbf{x})\right) ,\nonumber \\ v(\mathbf{x},t)= & {} \hbox {e}^{\frac{q}{2} tA}\left( (1+\frac{rt}{2}A)v_{0}(\mathbf{x})+d_{21}tAu_{0}(\mathbf{x})\right) ,\nonumber \\ \end{aligned}$$
    (2.35)
  • In the case (iv)

    $$\begin{aligned} u(\mathbf{x},t)= & {} \hbox {e}^{\frac{q}{2} tA}\left( \cos (t\mu A)-\frac{r}{2\mu }\sin (t\mu A)\right) u_{0}(\mathbf{x})\nonumber \\&+\,\hbox {e}^{\frac{q}{2} tA}\left( \frac{d_{12}}{\mu }\sin (t\mu A)\right) v_{0}(\mathbf{x}),\nonumber \\ v(\mathbf{x},t)= & {} \hbox {e}^{\frac{q}{2} tA}\left( \cos (t\mu A)+\frac{r}{2\mu }\sin (t\mu A)\right) v_{0}(\mathbf{x})\nonumber \\&+\,\hbox {e}^{\frac{q}{2} tA}\left( \frac{d_{21}}{\mu }\sin (t\mu A)\right) u_{0}(\mathbf{x}) , \end{aligned}$$
    (2.36)

where \(\mathbf{u}_{0}(\mathbf{x})=(u_{0}(\mathbf{x}),v_{0}(\mathbf{x}))^{T}\) and the cosine, sine, hyperbolic cosine and hyperbolic sine of the operator A are defined in the standard way from the exponential, see e.g. Yosida [34]. By using the approximations as \(z\rightarrow 0\),

$$\begin{aligned} \cos (z)\approx 1,\quad \sin (z)\approx z,\quad \cosh (z)\approx 1,\quad \sinh (z)\approx z, \end{aligned}$$

and taking the corresponding limits in (2.34)–(2.36), (C1) and (C2) follow. \(\square \)

Theorem 3 can be considered as a generalization of the small theta approximation property of linear complex diffusion. Under the conditions specified in the theorem, one of the components behaves as the operator A applied to a smoothed version of the original image f, scaled by t and with the smoothing effect determined by A and the trace of d. Actually, Formulas (2.34)–(2.36) can be used to extend the result to other initial distributions \(\mathbf{u}_{0}\). Finally, note that if d is similar to a matrix of the case (ii) in Lemma 4 then

$$\begin{aligned} u(\mathbf{x},t)=\hbox {e}^{\frac{q}{2} tA}u_{0}(\mathbf{x}),\quad v(\mathbf{x},t)=\hbox {e}^{\frac{q}{2} tA}v_{0}(\mathbf{x}), \end{aligned}$$

and this property does not hold. This case could be considered as a Gaussian smoothing for both components.
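The limits in (C1) can also be observed numerically: filter a one-dimensional signal through the Fourier symbol (2.22) with a small coupling entry \(d_{21}\) and compare \(v/d_{21}\) with \(tA\hbox {e}^{\frac{q}{2}tA}f\). A sketch with an illustrative Gaussian signal and \(p=2\), assuming NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import expm

# Illustrative 1D Gaussian signal on (-L, L), handled through its DFT
L, N, t, p = np.pi, 128, 0.5, 2.0
x = -L + (2 * L / N) * np.arange(N)
f = np.exp(-4 * x ** 2)
xi = 2 * np.pi * np.fft.fftfreq(N, d=2 * L / N)

# Cross-diffusion matrix with |s| and d21 small; q = tr(d) = 2
d21 = 1e-6
d = np.array([[1.0, 0.2], [d21, 1.0]])
q = np.trace(d)

# Second component of e^{-t|xi|^p d}(f, 0)^T, frequency by frequency
fh = np.fft.fft(f)
vh = np.array([expm(-t * abs(w) ** p * d)[1, 0] for w in xi]) * fh
v = np.real(np.fft.ifft(vh))

# (C1): v/d21 should approach t A e^{(q/2) t A} f, A = multiplier by -|xi|^p
th = t * (-np.abs(xi) ** p) * np.exp(-(q / 2) * t * np.abs(xi) ** p) * fh
target = np.real(np.fft.ifft(th))
```

With \(d_{21}=10^{-6}\) the rescaled second component and the limit operator agree to within the discretization error, illustrating its role as a (smoothed) edge detector.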

3 Numerical Experiments

This section illustrates numerically the behaviour of linear cross-diffusion filters. More specifically, the numerical experiments presented here, involving one- and two-dimensional signals, concern the influence, in a filtering problem with processes (2.1) and kernels satisfying (2.19), of the following elements:

  • The matrix d: According to the analysis of Sect. 2, the choice of d plays a role in the generalized small theta approximation, and the experiments will also attempt to discuss whether its influence extends to the quality of the filtering. Note that the matrix d in (2.23) must be positive definite, but not necessarily symmetric. (This means that \(\mathbf{x}^{T}d\mathbf{x}>0\) for all \(\mathbf{x}\ne 0\) or, equivalently, that the symmetric part \((d+d^{T})/2\) is positive definite.) This implies that the real part of each eigenvalue is positive. In terms of the entries of d, positive definiteness amounts to the two conditions

    $$\begin{aligned} d_{11}>0,\quad 4d_{11}d_{22}-(d_{12}+d_{21})^{2}>0, \end{aligned}$$
    (3.1)

    or equivalently, denoting by

    $$\begin{aligned} \lambda _{\pm }= & {} \frac{1}{2}\left( d_{11}+d_{22}\pm \sqrt{s}\right) , \nonumber \\ s= & {} r^{2}+4d_{12}d_{21},\quad r=d_{22}-d_{11}, \end{aligned}$$
    (3.2)

    the eigenvalues of d, to the condition \(Re(\lambda _{\pm })>0\). According to this, three types of matrices will be taken in the experiments, covering the different forms of the eigenvalues (see Lemma 4).

  • The initial distribution \(\mathbf{u}_{0}\): Besides the generalized small theta approximation, the choice of \(\mathbf{u}_{0}\) is also relevant to define the average grey value.

  • The parameter p: Here the purpose is to explore numerically if locality affects the filtering process, either in its quality or computational performance.
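The positive definiteness test (3.1) and the eigenvalue parameters (3.2) used throughout the experiments can be collected in a small helper to vet candidate matrices; a minimal sketch, assuming NumPy:

```python
import numpy as np

def check_matrix(d):
    """Check the positive definiteness conditions (3.1) and return the
    eigenvalue parameters (3.2): lambda_pm, s and r."""
    d11, d12 = d[0, 0], d[0, 1]
    d21, d22 = d[1, 0], d[1, 1]
    pos_def = d11 > 0 and 4 * d11 * d22 - (d12 + d21) ** 2 > 0
    r = d22 - d11
    s = r ** 2 + 4 * d12 * d21
    root = np.sqrt(complex(s))            # complex eigenvalues when s < 0
    lam = ((d11 + d22 + root) / 2, (d11 + d22 - root) / 2)
    return pos_def, lam, s, r
```

For instance, the matrix of Fig. 1 gives \(s>0\) (real distinct eigenvalues), while that of Fig. 3 gives \(s<0\) (complex conjugate eigenvalues), both positive definite.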

As mentioned in the Introduction, these experiments are a first approach to the numerical study of the behaviour of the filters. All the computations are made with Fourier techniques [6]. More precisely, for one-dimensional signals, an interval \((-L,L)\) with large L is discretized by Fourier collocation points \(x_{j}=-L+jh, j=0,\ldots ,N,\) with stepsize \(h>0\), and the signal is represented by the corresponding trigonometric interpolating polynomial, with the coefficients computed by the discrete Fourier transform (DFT) of the signal values at the collocation points. For experiments with images, the implementation follows the corresponding Fourier techniques in two dimensions: the domain \((-L_{x},L_{x})\times (-L_{y},L_{y})\), with \(L_{x}, L_{y}\) large, is discretized by Fourier collocation points \((x_{j},y_{k})\), with \(x_{j}=-L_{x}+jh_{x}, j=0,\ldots ,N_{x},\) \(y_{k}=-L_{y}+kh_{y}, k=0,\ldots ,N_{y}\), and the image is represented by the trigonometric interpolating polynomial at the collocation points, computed with the two-dimensional version of the DFT. In both cases, from the Fourier representation, the convolution (2.1) is implemented in the Fourier space by using (2.11).
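A one-dimensional sketch of this spectral implementation of (2.1) (function and parameter names are illustrative), assuming NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import expm

def cross_diffusion_1d(u0, v0, d, p, t, L):
    """Apply the linear cross-diffusion filter (2.1) spectrally:
    hat(u, v)(xi, t) = exp(-t |xi|^p d) hat(u0, v0)(xi)."""
    N = u0.size
    xi = 2 * np.pi * np.fft.fftfreq(N, d=2 * L / N)
    uh, vh = np.fft.fft(u0), np.fft.fft(v0)
    Uh = np.empty(N, complex)
    Vh = np.empty(N, complex)
    for k in range(N):
        E = expm(-t * np.abs(xi[k]) ** p * d)  # 2x2 symbol per frequency
        Uh[k] = E[0, 0] * uh[k] + E[0, 1] * vh[k]
        Vh[k] = E[1, 0] * uh[k] + E[1, 1] * vh[k]
    return np.real(np.fft.ifft(Uh)), np.real(np.fft.ifft(Vh))
```

Since the symbol at \(\mathbf{\xi }=0\) is the identity, constant signals (and hence the average grey value, for zero second component) are preserved, while nonconstant modes are damped.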

The experiments can be divided into two groups. The first one concerns the evolution of a clean signal and illustrates properties like the generalized small theta approximation and the effect of locality. The second group deals with image filtering problems, and the experiments are performed with the following strategy: from an original image s, noise of different types is added to generate an initial noisy signal. From this noisy signal an initial distribution is defined, and the restored image, given by (2.1), is monitored at several times \(t_{n}, n=0,1,\ldots \), in order to estimate the quality of restoration. This quality has been measured by using different standard metrics, namely:

  • signal-to-noise ratio (SNR):

    $$\begin{aligned} SNR(s,u^{n})=10\log _{10}\left( \frac{\mathrm{var}(u^{n})}{\mathrm{var}(s-u^{n})}\right) . \end{aligned}$$
    (3.3)
  • peak signal-to-noise ratio (PSNR):

    $$\begin{aligned} PSNR(s,u^{n})=10\log _{10}\left( \frac{l^{2}}{||s-u^{n}||^{2}}\right) . \end{aligned}$$
    (3.4)

In all the cases, \(u^{n}\) stands for the first component of the restored image at time \(t_{n}\), l is the length of the vectors for one-dimensional signals and \(l=255\) for two-dimensional signals, \(||\cdot ||\) stands for the Euclidean norm in 1D and the Frobenius norm in 2D, and \(\mathrm{var}(x)\) is the variance of the vector x (or of the matrix arranged as a vector). According to the formulas, the larger the values of these metrics, the better the filtering. Other metrics, like the root-mean-square error or the correlation coefficient, have been used in the experiments, although only the results corresponding to (3.3) and (3.4) will be shown here.
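Both metrics, as defined in (3.3) and (3.4) (note that (3.4) uses the squared norm of the error rather than the mean squared error), can be sketched as follows, assuming NumPy:

```python
import numpy as np

def snr(s, u):
    """Signal-to-noise ratio (3.3), in dB."""
    return 10 * np.log10(np.var(u) / np.var(s - u))

def psnr(s, u, l):
    """Peak signal-to-noise ratio (3.4); l is the vector length in 1D
    and l = 255 in 2D; the norm is Euclidean/Frobenius."""
    return 10 * np.log10(l ** 2 / np.sum((s - u) ** 2))
```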

3.1 Choice of Matrix d

3.1.1 Experiments in 1D

A unit function f is first considered. Taking \(\mathbf{u}_{0}=(f,0)^{T}\), the evolution (2.1) with kernels satisfying Theorem 1 and \(p=2\) is monitored at several times. Experiments were performed with three types of matrices d [covering the cases \(s>0, s=0, s<0\), where s is given by (3.2)] and, for each type, with different values according to the size of |s|. They are represented in Figs. 1, 2 and 3 and suggest two first conclusions:

Fig. 1 Cross-diffusion with \(p=2\) and \(d_{11}=1,d_{22}=1.1, d_{12}=0.1,d_{21}=1\). Profiles of (a) u and (b) v at times \(t=0,0.25,2.5,25\)

Fig. 2 Cross-diffusion with \(p=2\) and \(d_{11}=1,d_{22}=1.1, d_{12}=0.1,d_{21}=-0.25\). Profiles of (a) u and (b) v at times \(t=0,0.25,2.5,25\)

Fig. 3 Cross-diffusion with \(p=2\) and \(d_{11}=d_{22}=1, d_{12}=-0.1,d_{21}=0.1\). Profiles of (a) u and (b) v at times \(t=0,0.25,2.5,25\)

  1. The first component is affected by a smoothing effect. We observed that the regularization is stronger as \(d_{11}, d_{22}\) (which, by positive definiteness of d, must be positive) and \(|d_{12}|, |d_{21}|\) grow.

  2. Except in the case \(d_{12}=d_{21}=0\) (which may be associated with real Gaussian smoothing), the second component develops a sort of small-amplitude Gaussian derivative-type monopulse. Again, the amplitude depends on how large (in absolute value) the entries of d are: the larger they are, the taller the wave. In particular, this property illustrates the effect of the small theta approximation in complex diffusion (Fig. 3) and in more general cross-diffusion models (Figs. 1, 2).

3.1.2 Experiments in 2D

The illustration of the influence of the matrix d on 2D signals focuses mainly on two points: the small theta approximation and the behaviour of the filtering evolution of (2.1) with respect to the blurring effect, the detection of edges and the treatment of the textures of the image. From the experiments performed, we observe the following:

  • The property of generalized small theta approximation is illustrated in Fig. 4, which corresponds to applying (2.1) at \(t=0.1\) with \(p=2\) and matrices

    $$\begin{aligned} d_{1}=\begin{pmatrix}1& 10^{-5}\\ 1.99& 1\end{pmatrix},\quad d_{2}=\begin{pmatrix}1& -10^{-5}\\ 1.99& 1\end{pmatrix}, \end{aligned}$$
    (3.5)

    corresponding to the cases \(s>0\) and \(s<0\), respectively, with |s| small. The initial condition is \(\mathbf{u}_{0}=(f,0)^{T}\), where f is the original image displayed in Fig. 4a. According to Theorem 3, the entry \(d_{21}\) of d may be used as a natural scale for the second component of the solution of (2.1), which is shown in Fig. 4b, c for \(d=d_{1}\) and \(d=d_{2}\), respectively. As for this small theta approximation property, the processes displayed (and those performed with matrices for which \(s=0\), not shown here) do not show relevant differences: the second component is affected by a slight blurring. A comparison with some standard methods for edge detection [5, 28] is given in Fig. 5.

Fig. 4 Generalized small theta approximation. (a) Original image f. (b), (c) Second component of the solution of (2.1) at \(t=0.1\) with \(p=2, \mathbf{u}_{0}=(f,0)^{T}\) and (b) \(d=d_{1}\), (c) \(d=d_{2}\), see (3.5)

Fig. 5 Edge detection from f in Fig. 4a provided by: (a) the Prewitt method [28] and (b) the Canny method [5]

  • When the image is affected by some noise, the behaviour of the filtering process with (2.1) can show some differences depending on the choice of the matrix d. The first one concerns the blurring effect. Our numerical experiments suggest a better evolution of the filtering for matrices in the case (i) of Lemma 4 than for matrices in cases (ii)–(iv). (The last one includes linear complex diffusion.) This is illustrated by Fig. 6, which corresponds to the evolution of (2.1) from a noisy image f (Fig. 6a) with \(p=2, \mathbf{u}_{0}=(f,0)^{T}\) and matrices \(d=d_{1}\) (Fig. 6b) and \(d=d_{2}\) (Fig. 6c), now of the form

    $$\begin{aligned} d_{1}=\begin{pmatrix}1& 0.9\\ 1& 1\end{pmatrix},\quad d_{2}=\begin{pmatrix}1& -0.9\\ 1& 1\end{pmatrix}, \end{aligned}$$
    (3.6)

    at time \(t=15\). The experiment shows that filtering with \(d_{1}\) delays the blurring effect with respect to the behaviour observed in the case of \(d_{2}\). Similar experiments suggest that using matrices d for which the parameter \(s=(d_{22}-d_{11})^{2}+4d_{12}d_{21}\) is positive and moderately large (always subject to positive definiteness) improves the filtering in this sense. This is also confirmed by Fig. 7, where the evolution of the corresponding SNR and PSNR indexes (3.3) and (3.4) for (2.1) with \(d_{1}\) and \(d_{2}\) is shown.

Fig. 6 Image filtering problem. (a) Original noisy image f. (b), (c) First component of the solution of (2.1) at \(t=15\) with \(p=2, \mathbf{u}_{0}=(f,0)^{T}\) and (b) \(d=d_{1}\), (c) \(d=d_{2}\), see (3.6)

Fig. 7 (a) SNR versus t and (b) PSNR versus t for the filter (2.1) with \(p=2, \mathbf{u}_{0}=(f,0)^{T}\), f in Fig. 6a, for both \(d=d_{1}\) and \(d=d_{2}\), see (3.6)

  • The delay of the blurring effect also influences other features of the image. The first one is edge detection using the second component of (2.1), as observed in Fig. 8. The better performance of the process with \(d_{1}\) gives a less blurred detection of the edges than the one given by \(d_{2}\). On the other hand, the delay of the blurring effect may improve the identification of the textures of the image. Using nonlinear models, Lorenz et al. [24] observed a good behaviour of cross-diffusion with respect to textures. Numerical experiments in this sense were performed here and are illustrated in Table 1, which shows the entropy as a measure of texture. The entropy was computed as

    $$\begin{aligned} En=-\sum _{i,j}c_{ij}\log _{2}{c_{ij}}, \end{aligned}$$
    (3.7)

    where \(c_{ij}\) stands for the entries of the corresponding grey-level co-occurrence matrix [15].
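The texture measure of Table 1 can be reproduced with a simple co-occurrence count. A sketch of (3.7), with a hypothetical horizontal-offset co-occurrence matrix, images scaled to [0, 1] and the conventional minus sign (so that \(En\ge 0\)), assuming NumPy:

```python
import numpy as np

def glcm_entropy(img, levels=8):
    """Texture entropy (3.7) from a horizontal grey-level co-occurrence
    matrix; img is assumed scaled to [0, 1]."""
    g = np.clip((img * levels).astype(int), 0, levels - 1)  # quantize
    C = np.zeros((levels, levels))
    for a, b in zip(g[:, :-1].ravel(), g[:, 1:].ravel()):   # horizontal pairs
        C[a, b] += 1
    c = C / C.sum()          # normalize counts to probabilities c_ij
    nz = c[c > 0]
    return -np.sum(nz * np.log2(nz))
```

A constant image has a single co-occurrence entry and hence zero entropy; textured images give positive values.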

Fig. 8 Image filtering problem. Second component of the solution of (2.1) at \(t=15\) with \(p=2, \mathbf{u}_{0}=(f,0)^{T}\), f in Fig. 6a and: (a) \(d=d_{1}\), (b) \(d=d_{2}\), see (3.6)

3.2 Choice of the Parameter p

3.2.1 Experiments in 1D

The influence of the values of p is first illustrated in Figs. 9, 10 and 11 for 1D signals and matrices similar to those of cases (i), (iii) and (iv) in Lemma 4. Note that as p grows the first component develops small oscillations at the points of lower regularity. On the other hand, the second component develops a larger number of pulses.

3.2.2 Experiments in 2D

As in Sect. 3.1.2, a first point to study here is the effect of p on the generalized small theta approximation property. A similar experiment with \(d_{1}\) and \(d_{2}\) given by (3.5) illustrates the role of the second component as edge detector, under the conditions of Theorem 3 but when the operator D in (2.23) is nonlocal. The results, corresponding to \(p=3\), are shown in Fig. 12. Compared to Fig. 4 (for which \(p=2\)), no relevant differences are observed.

Table 1 Entropy (3.7) for the image in Fig. 4a and the corresponding images obtained with (2.1) at \(t=10\) with \(p=2, \mathbf{u}_{0}=(f,0)^{T}\), being f the corresponding original image, for both \(d=d_{1}\) and \(d=d_{2}\), see (3.6)
Fig. 9 Cross-diffusion with \(p=3\) and \(d_{11}=1,d_{22}=1.1, d_{12}=0.1,d_{21}=1\). Profiles of (a) u and (b) v at times \(t=0,0.25,2.5,25\)

Fig. 10 Cross-diffusion with \(p=4\) and \(d_{11}=1,d_{22}=1.1, d_{12}=0.1,d_{21}=-0.25\). Profiles of (a) u and (b) v at times \(t=0,0.25,2.5,25\)

Fig. 11 Cross-diffusion with \(p=5\) and \(d_{11}=d_{22}=1, d_{12}=-0.1,d_{21}=0.1\). Profiles of (a) u and (b) v at times \(t=0,0.25,2.5,25\)

Fig. 12 Generalized small theta approximation. Second component of the solution of (2.1) at \(t=0.1\) with \(p=3, \mathbf{u}_{0}=(f,0)^{T}\) and (b) \(d=d_{1}\), (c) \(d=d_{2}\), see (3.5)

The influence of the local or nonlocal character of (2.23) on the quality of filtering was studied by numerical means, and some of the experiments performed are shown here. The first one, displayed in Fig. 13, compares the SNR and PSNR values as functions of p obtained by computing (2.1) from a noisy image f with \(\mathbf{u}_{0}=(f,0)^{T}\), \(d=d_{1}\) given by (3.6), at time \(t=10\). Note that both metrics attain a maximum at \(p=4\) (for which the operator D in (2.23) is local), while among the nonlocal generators, those around the values \(p=3\) (in the case of PSNR) and \(p=5\) (in the case of SNR) show the best results. The same behaviour was observed in all the related experiments performed.

Fig. 13 SNR (a) and PSNR (b) versus p for the filter (2.1) at \(t=10\) with \(\mathbf{u}_{0}=(f,0)^{T}\), where f is the initial noisy image affected by additive Gaussian white noise with \(\sigma =40\), and \(d=d_{1}\), see (3.6)

As for the evolution of the filtering, a similar experiment to that of Fig. 7, but for different values of p, is illustrated in Fig. 14. The behaviour of the SNR and PSNR values suggests that the advantages of using (2.1) with matrices of the type of \(d_{1}\) are independent of p.

Fig. 14 (a) SNR versus t and (b) PSNR versus t for the filter (2.1) with \(p=3, \mathbf{u}_{0}=(f,0)^{T}\), f in Fig. 6a, for both \(d=d_{1}\) and \(d=d_{2}\), see (3.6)

3.3 Choice of the Initial Distribution

A final question in this numerical study is the influence of the initial distribution \(\mathbf{u}_{0}\) on the small theta approximation and on 2D filtering problems with (2.1). As far as the first one is concerned, note that Theorem 3 applies for \(\mathbf{u}_{0}=(f,0)^{T}\) or \(\mathbf{u}_{0}=(0,f)^{T}\), where f is the original image. As \(s\rightarrow 0\), in all the cases (2.34)–(2.36), \(u(\mathbf{x},t), v(\mathbf{x},t)\) behave as

$$\begin{aligned} u(\mathbf{x},t)\approx & {} \hbox {e}^{\frac{q}{2} tA}\left( (1-\frac{rt}{2}A)u_{0}(\mathbf{x})+d_{12}tAv_{0}(\mathbf{x})\right) ,\nonumber \\ v(\mathbf{x},t)\approx & {} \hbox {e}^{\frac{q}{2} tA}\left( (1+\frac{rt}{2}A)v_{0}(\mathbf{x})+d_{21}tAu_{0}(\mathbf{x})\right) . \end{aligned}$$
(3.8)

Then, the approximation (3.8) [which is actually exact in the case of (2.35)] suggests exploring, at least numerically, some other choices for \(\mathbf{u}_{0}\). Among the ones used in our numerical experiments, two are considered here by way of illustration, namely

$$\begin{aligned} \mathbf{u}_{0}^{(1)}=(f,|\nabla f|)^{T}, \mathbf{u}_{0}^{(2)}=(f,-|\nabla f|{\varDelta }f)^{T}, \end{aligned}$$
(3.9)

and we denote \(\mathbf{u}_{0}^{(0)}=(f,0)^{T}\).
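The three initial distributions can be sketched as follows, with the gradient and Laplacian computed by centred differences (the periodic treatment of the boundary and the function name are illustrative choices), assuming NumPy:

```python
import numpy as np

def initial_distributions(f, hx=1.0, hy=1.0):
    """The initial data u0^(0), u0^(1), u0^(2) of (3.9) for an image f,
    using centred differences with grid steps hx, hy."""
    fy, fx = np.gradient(f, hy, hx)
    grad = np.sqrt(fx ** 2 + fy ** 2)                 # |grad f|
    lap = (np.roll(f, 1, 0) + np.roll(f, -1, 0) - 2 * f) / hy ** 2 \
        + (np.roll(f, 1, 1) + np.roll(f, -1, 1) - 2 * f) / hx ** 2
    return (f, np.zeros_like(f)), (f, grad), (f, -grad * lap)
```

For a constant image all three second components vanish, so the three choices coincide there, and the differences in the experiments come only from regions with structure.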

Fig. 15 Generalized small theta approximation. Second component of the solution of (2.1) at \(t=0.1\) with \(p=2,d=d_{1}\) in (3.5) and: (a) \(\mathbf{u}_{0}=\mathbf{u}_{0}^{(1)}\); (b) \(\mathbf{u}_{0}=\mathbf{u}_{0}^{(2)}\), see (3.9)

Results of experiments similar to that of Fig. 4 in Sect. 3.1.2 are illustrated in Fig. 15, which corresponds to initial data given by (3.9). As for the quality of filtering, the experiments presented here can be compared to those of Sect. 3.1.2 for \(\mathbf{u}_{0}=\mathbf{u}_{0}^{(0)}\). Specifically, Figs. 16 and 17 correspond to the same experiment as in Fig. 6 but, respectively, with \(\mathbf{u}_{0}=\mathbf{u}_{0}^{(1)}\) and \(\mathbf{u}_{0}=\mathbf{u}_{0}^{(2)}\). In the three experiments we observe that the results do not improve those obtained with \(\mathbf{u}_{0}^{(0)}\) in Sect. 3.1.2. This is confirmed by a last experiment, where the original image from Fig. 4a is corrupted by additive Gaussian white noise with several values of the standard deviation \(\sigma \). Then, (2.1) with \(p=2, d=d_{1}\) in (3.6) and the three different initial distributions \(\mathbf{u}_{0}^{(j)}, j=0,1,2\), is applied, and the SNR and PSNR values (3.3) and (3.4) at time \(t=5\) are computed. The results are displayed in Table 2 and show that the best values of the metrics are obtained with \(\mathbf{u}_{0}^{(0)}\).

Fig. 16 Image filtering problem. First component of the solution of (2.1) at \(t=15\) with \(p=2, \mathbf{u}_{0}=\mathbf{u}_{0}^{(1)}\) and: (a) \(d=d_{1}\), (b) \(d=d_{2}\), see (3.6)

Fig. 17 Image filtering problem. First component of the solution of (2.1) at \(t=15\) with \(p=2, \mathbf{u}_{0}=\mathbf{u}_{0}^{(2)}\) and: (a) \(d=d_{1}\), (b) \(d=d_{2}\), see (3.6)

4 Conclusions and Perspectives

In the present paper, linear cross-diffusion systems for image processing are analysed. Viewed as convolution processes, the kernels satisfying fundamental scale-space properties are characterized in terms of a positive definite matrix, which controls the cross-diffusion, and a positive parameter, which determines the local character of the infinitesimal generator. The axiomatic approach is based on scale invariance and generalizes that of the scalar case. The cross-diffusion approach, viewed as a generalization of linear complex diffusion, is shown to satisfy more general versions of the small theta approximation property, which assigns the role of edge detector to one of the components.

In a second part, a numerical comparison of the kernels is carried out. This can be considered as a first approach, by computational means, to the performance of linear cross-diffusion models, to be analysed in a more exhaustive way in future works. The numerical experiments, performed for one- and two-dimensional signals, show the influence of the choice of the initial distribution of the image in a vector of two components, as well as of the matrix of the kernel, on the behaviour of the filtering process by cross-diffusion. The numerical results suggest that suitable choices of the positive definite matrix delay blurring, which can also be useful for a better identification of the edges, independently of the local or nonlocal character of the infinitesimal generator. Additionally, initial distributions other than the ones for which the generalized small theta approximation holds do not improve the results in our experiments.

Table 2 SNR and PSNR values for (2.1) at \(t=5\) with \(p=2, d=d_{1}\) in (3.6) and \(\mathbf{u}_{0}=\mathbf{u}_{0}^{(j)}, j=0,1,2\)

The present paper will be continued in a natural way by the introduction of nonlinear cross-diffusion models and the study of their behaviour in image restoration (see Araújo et al. [2]).