1 Introduction

A classical problem in image processing and, particularly, in pattern recognition, is to identify whether a large image contains copies of a small image, called the “template”, and, if so, how many, together with their locations and orientations. The resulting algorithms are generically known as template matching algorithms [3, 13, 14]. The most classical solution is based on cross-correlations, although there are other approaches based, for example, on metaheuristic algorithms [7] or on deep learning [9, 18, 21]. In this paper, we show the mathematical foundations of the cross-correlation-based template matching algorithm (TM in all that follows), and we introduce a new fast algorithm that solves the problem using tensors.

The main advantages of TM over machine-learning-based algorithms are that TM is a white-box model, that it is directly applicable when only one template and one larger image are available (requiring no training, which may be very difficult in some applications), and that it locates rotations with arbitrary precision. Current deep-learning-based algorithms for template matching in three-dimensional images are not able to estimate rotations accurately [18].

On the other hand, a major drawback of TM is its computational cost. The basic idea of TM is to compute the inner product between the (rotated) template and the (translated) image, and normalize the result. For each rotation, these computations are carried out in the Fourier domain to handle translations efficiently [2, 19, 24]. However, this process has to be repeated for every rotation to be investigated, so the resulting complexity grows with the number of rotations processed. The computational cost may become prohibitive for 3D images, since SO(3), the space of rotations of \(\mathbb {R}^3\), is a (compact) manifold of dimension 3. In application domains such as cryo-electron microscopy, more than ten thousand rotations are required to achieve an angular precision of a few degrees. An alternative approach is to apply steerable filters to compute the correlation at different rotations efficiently. However, although a method has recently been developed to generate steerable filters for any arbitrary kernel in 2D [12], there is no solution yet for 3D images.

We propose an algorithm called tensorial template matching, TTM, which integrates the information about the template at all rotations into a unique symmetric tensor. In other words, the tensor template incorporates in a single object the information about all rotations of the template, thus allowing us to find the position and rotation of instances of the template in any tomogram with just a few correlations with the linearly independent components of the tensor. The tensor template is computed only once per template and, once generated, it can be used to process any image.

2 Classical template matching

Let us introduce some notation. d-dimensional images are just elements of \(L^2(\mathbb {R}^d)\), which is a Hilbert space with the inner product \(\langle f,g\rangle =\int _{\mathbb {R}^d}f(x)g(x)dx\). It is natural to use the inner product to compare two images f, g of the same size. Concretely, we can use that \(\langle f,g\rangle =\Vert f\Vert _2\Vert g\Vert _2\cos \theta \), where \(\theta \) is the angle formed by f and g. In particular, \(f=\alpha g\) for some positive constant \(\alpha \) if and only if \(\frac{\langle f,g\rangle }{\Vert f\Vert _2\Vert g\Vert _2}=1\).
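On a discrete grid, the inner product above becomes a sum over pixels. The following minimal numpy sketch (the array names are ours, not from the text) computes the normalized inner product, i.e. \(\cos \theta \):

```python
import numpy as np

def normalized_inner(f, g):
    """Discrete analogue of <f, g> / (||f||_2 ||g||_2), i.e. cos(theta)."""
    return np.sum(f * g) / (np.linalg.norm(f) * np.linalg.norm(g))

rng = np.random.default_rng(0)
g = rng.standard_normal((32, 32))
f = 2.5 * g                      # f = alpha * g with alpha > 0
print(normalized_inner(f, g))    # 1.0 up to rounding, as stated above
```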

Template matching is typically used to study whether instances of a “small” image t (the template) are present in a larger image f, e.g. to look for instances of a specific macromolecule in a cryo-electron tomogram (3D volumetric image). The size of an image is connected to the set of points where the image does not vanish, the support of the image. That is, t is said to be “small” when the set \(K=supp(t)=\overline{\{x:t(x)\ne 0\}}^{\mathbb {R}^d}\) is small (e.g., it is a subset of a small ball \(\mathbb {D}\)). Let us assume that f and t have quite different sizes, so our interest is to compare t (the template, the small image) with just a part of f. In such a case we need to introduce some special operators \(S:L^2(\mathbb {R}^d)\rightarrow L^2(\mathbb {R}^d)\) that fix our attention on just a part of the domain of f. An interesting example of such operators is

$$\begin{aligned} S_r(f)(x)=U(1-\frac{1}{r}\Vert x\Vert _2)f(x)=\left\{ \begin{array}{llll} f(x),&{} \ \Vert x\Vert _2\le r \\ 0,&{} \text { otherwise} \end{array} \right. , \end{aligned}$$
(1)

where \(U:\mathbb {R}\rightarrow \mathbb {R}\) denotes Heaviside’s unit step function and \(r>0\). If the support of the template t is \(\mathbb {D}_{\textbf{0}}(r)=\{x:\Vert x\Vert _2\le r\}\), the ball of radius r centered at \(\textbf{0}\in \mathbb {R}^d\), the normalized inner product

$$\begin{aligned} \frac{\langle S_r(f),t\rangle }{\Vert S_r(f)\Vert _2\Vert t\Vert _2} \end{aligned}$$

informs about the similarity between t and the restriction of f to \(\mathbb {D}_{\textbf{0}}(r)\). Moreover, if we introduce the translation operator \(\tau _{x}: L^2(\mathbb {R}^d)\rightarrow L^2(\mathbb {R}^d)\), \(\tau _{x}(f)(z)=f(z+x)\) and compute

$$\begin{aligned} \frac{\langle S_r(\tau _{x}(f)),t\rangle }{\Vert S_r(\tau _{x}(f))\Vert _2\Vert t\Vert _2} \end{aligned}$$

the result informs about the similarity between t and the restriction of f to \(\mathbb {D}_{x}(r)=\{z:\Vert x-z\Vert _2\le r\}\). Of course, it may happen that f contains a copy of a rotated version of t, so rotations are also necessary for a complete discussion of the problem. Thus, given \(R\in SO(d)\), we define the operator \(O_R: L^2(\mathbb {R}^d)\rightarrow L^2(\mathbb {R}^d)\), \(O_R(t)(z)=t(Rz)\), and for \(t\in L^2(\mathbb {R}^d)\), we define a rotated version of t,

$$\begin{aligned} t_R=O_{R^{-1}}(t). \end{aligned}$$
(2)
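On a pixel grid, \(S_r\) is a ball-shaped mask and \(\tau _x\) a shift, so the normalized inner product above has a direct discrete counterpart. A sketch under our own conventions (grid origin at the array centre, periodic shifts):

```python
import numpy as np

def S_r(f, r):
    """S_r: zero out f outside the ball of radius r centred at the grid origin."""
    n = f.shape[0]
    c = n // 2                              # we place the "origin" at the grid centre
    yy, xx = np.mgrid[:n, :n]
    return np.where((xx - c) ** 2 + (yy - c) ** 2 <= r * r, f, 0.0)

def tau(f, x):
    """tau_x(f)(z) = f(z + x), with periodic boundary conditions."""
    return np.roll(f, shift=(-x[0], -x[1]), axis=(0, 1))

def local_score(f, t, x, r):
    """<S_r(tau_x(f)), t> / (||S_r(tau_x(f))||_2 ||t||_2)."""
    g = S_r(tau(f, x), r)
    return np.sum(g * t) / (np.linalg.norm(g) * np.linalg.norm(t))

n, r = 32, 5
t = S_r(np.random.default_rng(0).random((n, n)), r)  # template supported on the ball
f = tau(t, (-7, -3))                                 # f contains a copy of t shifted by (7, 3)
print(local_score(f, t, (7, 3), r))                  # ~1.0: a match at x = (7, 3)
```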

The normalized inner product

$$\begin{aligned} \frac{\langle S_r(\tau _{x}(f)),t_R\rangle }{\Vert S_r(\tau _{x}(f))\Vert _2\Vert t\Vert _2} \end{aligned}$$

informs about the similarity of \(t_R\) and the restriction of f to \(\mathbb {D}_{x}(r)\). It is important to notice that \(\Vert t\Vert _2=\Vert t_R\Vert _2\).
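On sampled images, the rotation operator is applied with an interpolating resampler. A 2D sketch using scipy.ndimage (the angle convention, spline order, and boundary mode are our choices) also illustrates that \(\Vert t\Vert _2=\Vert t_R\Vert _2\) up to interpolation error:

```python
import numpy as np
from scipy import ndimage

def rotate_template(t, angle_deg):
    """Discrete analogue of t_R = O_{R^{-1}}(t) for an in-plane rotation.

    reshape=False keeps the rotated template on the same grid; up to
    interpolation error, ||t||_2 = ||t_R||_2, as noted in the text.
    """
    return ndimage.rotate(t, angle_deg, reshape=False, order=1, mode="constant")

t = np.zeros((33, 33))
t[12:21, 12:21] = 1.0                           # a small square template
t90 = rotate_template(t, 90.0)
print(np.linalg.norm(t), np.linalg.norm(t90))   # nearly equal norms
```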

The operator \(S_r\) defined by (1) has some special properties. Concretely, it is symmetric, semidefinite positive, and commutes with rotations. Recall that, given a (real) inner product space \((X,\langle \cdot ,\cdot \rangle _X)\),Footnote 1 an operator \(S:X\rightarrow X\) is named:

  • Symmetric (also named self-adjoint) if

    $$\begin{aligned} \langle f,S(g)\rangle _X= \langle S(f),g\rangle _X \quad \text {for all } f,g\in X \end{aligned}$$
  • Semidefinite positive, if

    $$\begin{aligned} \langle f,S(f)\rangle _X\ge 0\quad \text {for all } f\in X \end{aligned}$$
  • Definite positive, if it is semidefinite positive and \(\langle f,S(f)\rangle =0\) implies \(f=0\).

If \(S:X\rightarrow X\) is a symmetric semidefinite positive operator (SSP, in all that follows), then X becomes a semi-normed space when endowed with the semi-inner product

$$\begin{aligned} \langle f,g\rangle _S:= \langle f,S(g)\rangle _X \end{aligned}$$
(3)

and the seminorm

$$\begin{aligned} \Vert f\Vert _S=\sqrt{\langle f,f\rangle _S} \end{aligned}$$
(4)

Observe, for example, that if S is given by (1), then \(\Vert f\Vert _S=0\) means that \(f_{|\mathbb {D}_{\textbf{0}}(r)}=0\) almost everywhere.

Theorem 2.1

Let X be an inner product space, \(S:X\rightarrow X\) be an SSP operator, and consider the inner product given by (3). Then

\(\mathrm{(a)}\):

\( \langle f,g\rangle _S\le \left| \langle f,g\rangle _S\right| \le \Vert f\Vert _S\Vert g\Vert _S\) for all \(f,g\in X\).

Moreover, if \(f,g\in X\) and \(\Vert g\Vert _S\ne 0\), the following statements are equivalent:

\(\mathrm{(b)}\):

\(\langle f,g\rangle _S=\Vert f\Vert _S\Vert g\Vert _S\).

\(\mathrm{(c)}\):

\(\Vert f-\frac{\Vert f\Vert _S}{\Vert g\Vert _S}g\Vert _S=0\).

Proof

As S is SSP, we have that, for all \(\alpha \in \mathbb {R}\),

$$\begin{aligned} 0\le \langle f+\alpha g,f+\alpha g\rangle _S=\Vert f\Vert _S^2+2\alpha \langle f,g\rangle _S+\alpha ^2\Vert g\Vert _S^2. \end{aligned}$$
(5)

Hence, if \(\Vert g\Vert _S\ne 0\), the only way that the quadratic polynomial (in \(\alpha \)) above is nonnegative everywhere is that

$$\begin{aligned} 4\langle f,g\rangle _S^2-4\Vert f\Vert _S^2\Vert g\Vert _S^2\le 0, \end{aligned}$$

which is equivalent to

$$\begin{aligned} \left| \langle f,g\rangle _S\right| \le \Vert f\Vert _S\Vert g\Vert _S. \end{aligned}$$
(6)

On the other hand, if \(\Vert g\Vert _S=0\), the only way to satisfy (5) is \(\langle f,g\rangle _S=0\), in which case (6) also holds. This proves (a).

Let us now demonstrate \((b)\Leftrightarrow (c)\) whenever \(\Vert g\Vert _S\ne 0\). Indeed, (c) is equivalent to

$$\begin{aligned} 0= & {} \left\| f-\frac{\Vert f\Vert _S}{\Vert g\Vert _S}g\right\| _S^2\\= & {} \left\langle f-\frac{\Vert f\Vert _S}{\Vert g\Vert _S}g, f-\frac{\Vert f\Vert _S}{\Vert g\Vert _S}g\right\rangle _S\\= & {} \Vert f\Vert _S^2+ \frac{\Vert f\Vert ^2_S}{\Vert g\Vert ^2_S}\Vert g\Vert _S^2-2 \frac{\Vert f\Vert _S}{\Vert g\Vert _S}\langle f,g\rangle _S\\= & {} 2\Vert f\Vert _S^2-2 \frac{\Vert f\Vert _S}{\Vert g\Vert _S}\langle f,g\rangle _S, \end{aligned}$$

which holds if and only if

$$\begin{aligned} \langle f,g\rangle _S=\Vert f\Vert _S\Vert g\Vert _S. \end{aligned}$$

Thus \((b)\Leftrightarrow (c)\). \(\square \)
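Inequality (a) can be checked numerically for a toy SSP operator, pointwise multiplication by a nonnegative mask (our own example, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
m = rng.random(100)                      # nonnegative mask: S(f) = m * f is SSP

def ip_S(f, g):
    """<f, g>_S = <f, S(g)> for the mask operator S."""
    return float(np.sum(f * m * g))

f = rng.standard_normal(100)
g = rng.standard_normal(100)
lhs = abs(ip_S(f, g))
rhs = np.sqrt(ip_S(f, f)) * np.sqrt(ip_S(g, g))
print(lhs <= rhs)                        # True: Cauchy-Schwarz for the semi-inner product
```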

Note that, if \(f,t\in L^2(\mathbb {R}^d)\) are two images, \(\alpha >0\), and we take \(S=S_r\) given by (1), then \(\Vert \tau _xf-\alpha O_{R^{-1}}(t)\Vert _S=0\) means that f has a match with \(t_R\) in the ball of radius r centered at x. Indeed, there are many ways to define operators S with the property that \(\Vert f-g\Vert _S=0\) means that \(f=g\) in a neighbourhood of \(\textbf{0}\), so that \(\Vert \tau _xf-\alpha O_{R^{-1}}(t)\Vert _S=0\) means that f has a match with a rotated version of t in a neighbourhood of x. Although arbitrary SSP operators may not enjoy this property, they provide a general framework for dealing with this kind of operator.

Thus, in all that follows, we assume that X is a vector subspace of \(L^2(\mathbb {R}^d)\) endowed with the inner product that it inherits from \(L^2(\mathbb {R}^d)\), that \(S:X\rightarrow X\) is an SSP operator, and that \(\Vert \textbf{1}\Vert _S^2=\langle \textbf{1},\textbf{1}\rangle _S>0\), where \(\textbf{1}(x)=1\) is the constant image.Footnote 2

Rotations and composition of operators will play an important role in this paper. Thus, it is natural to ask how the composition of rotations acts on the images. This is, indeed, a simple computation:

$$\begin{aligned} \begin{aligned} O_{R_1R_2}(t)(z)&=t(R_1R_2z) =t(R_1(R_2z))\\&=O_{R_1}(t)(R_2z)=O_{R_2}(O_{R_1}(t))(z) \end{aligned} \end{aligned}$$

Hence

$$\begin{aligned} O_{R_1R_2}=O_{R_2}\circ O_{R_1} \end{aligned}$$
(7)

and

$$\begin{aligned} t_{R_1R_2}= & {} O_{(R_1R_2)^{-1}}(t)= O_{R_2^{-1}R_1^{-1}}(t) \nonumber \\= & {} O_{R_1^{-1}}\circ O_{R_2^{-1}}(t) =(t_{R_2})_{R_1}. \end{aligned}$$
(8)

Given an image f, we consider its projection onto the space of images which are S-orthogonal to the constant image \(\textbf{1}\),

$$\begin{aligned} P_S(f)=f-\frac{\langle f,\textbf{1}\rangle _S}{\langle \textbf{1},\textbf{1}\rangle _S} \textbf{1}. \end{aligned}$$
(9)

Remark 1

These projections are important to study properties of translations and rotations that are invariant with respect to constant brightness changes in the images. Note that there is no “real” difference between an image f and the images of the form \(f+\alpha \textbf{1}\), \(\alpha \in \mathbb {R}\). When we modify the constant \(\alpha \), what we observe is a uniform change in the density or the brightness, not the appearance of new structures or forms in the image f. Thus, f and its projection \(P_S(f)\) essentially represent the very same image, since \(f=P_S(f)+\alpha \textbf{1}\) for a certain \(\alpha \in \mathbb {R}\).
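For the mask operator \(S(f)=m\cdot f\) used later in the section, the projection (9) simply removes the S-weighted mean, which is what makes the matching insensitive to uniform brightness changes. A sketch with our own array names:

```python
import numpy as np

def P_S(f, m):
    """Projection (9) onto the S-orthogonal complement of 1, for S(f) = m * f."""
    coef = np.sum(f * m) / np.sum(m)     # <f, 1>_S / <1, 1>_S
    return f - coef

rng = np.random.default_rng(2)
m = rng.random((16, 16))                 # nonnegative mask
f = rng.standard_normal((16, 16))
# f and f + alpha*1 project to the same image, as the remark explains
print(np.allclose(P_S(f, m), P_S(f + 3.7, m)))   # True
```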

Given two images f, t, we have that

$$\begin{aligned} f=P_S(f)+\alpha \textbf{1} \quad \text {and}\quad t=P_S(t)+\beta \textbf{1} \text { for certain constants } \alpha ,\beta . \end{aligned}$$

Hence

$$\begin{aligned} \langle f,t\rangle _S= & {} \langle P_S(f)+\alpha \textbf{1},P_S(t)+\beta \textbf{1}\rangle _S\\= & {} \langle P_S(f),P_S(t)\rangle _S+ \alpha \beta \langle \textbf{1},\textbf{1}\rangle _S \end{aligned}$$

since \(P_S(f),P_S(t) \perp _S \textbf{1}\). Consequently, if \(x\in \mathbb {R}^d\) and \(R\in SO(d)\), there are two constants \(\rho =\rho (x)\) and \(\delta =\delta (R)\) such that

$$\begin{aligned} \langle \tau _x(f),t_R\rangle _S= & {} \langle P_S(\tau _x(f)),P_S(t_R)\rangle _S+ \rho \delta \langle \textbf{1},\textbf{1}\rangle _S. \end{aligned}$$

Assume that S commutes with rotations, and take \(x\in \mathbb {R}^d\) fixed. Then, for each \(R\in SO(d)\) we have that

$$\begin{aligned} t_R= & {} O_{R^{-1}}(t)=O_{R^{-1}}(P_S(t)+\beta \textbf{1})\nonumber \\= & {} O_{R^{-1}}(P_S(t))+\beta \textbf{1} \end{aligned}$$
(10)

since \(O_{R^{-1}}(\textbf{1})=\textbf{1}\). Moreover

$$\begin{aligned}{} & {} \langle O_{R^{-1}}(P_S(t)),\textbf{1}\rangle _S \\{} & {} \quad = \int _{\mathbb {R}^d}P_S(t)(R^{-1}u)S(\textbf{1})(u)du \\{} & {} \quad = \int _{\mathbb {R}^d}P_S(t)(v)S(\textbf{1})(Rv)dv \end{aligned}$$

(just take \(v=R^{-1}u\) and use that \(\det R=1\) )

$$\begin{aligned}= & {} \int _{\mathbb {R}^d}P_S(t)(v)(O_R\circ S)(\textbf{1})(v)dv\\= & {} \int _{\mathbb {R}^d}P_S(t)(v)(S\circ O_R)(\textbf{1})(v)dv \end{aligned}$$

(since \(O_R\circ S= S\circ O_R\))

$$\begin{aligned}= & {} \int _{\mathbb {R}^d}P_S(t)(v)S(\textbf{1})(v)dv \end{aligned}$$

(since \(O_R(\textbf{1})=\textbf{1}\) )

$$\begin{aligned}= & {} \langle P_S(t),\textbf{1}\rangle _S = 0. \end{aligned}$$

Hence \(O_{R^{-1}}(P_S(t))\perp _S \textbf{1}\) and this, in conjunction with (10), implies that

$$\begin{aligned} P_S(t_R)=P_S(O_{R^{-1}}(t))= O_{R^{-1}}(P_S(t))=P_S(t)_R. \end{aligned}$$

Hence

$$\begin{aligned} t_R=P_S(t)_R+\beta \textbf{1} \end{aligned}$$

is an S-orthogonal decomposition of \(t_R\), which means that the constant \(\beta \) that multiplies \(\textbf{1}\) in the S-orthogonal decomposition of \(t_R\) does not depend on R, and

$$\begin{aligned} \langle \tau _x(f),t_R\rangle _S= & {} \langle P_S(\tau _x(f)),P_S(t)_R\rangle _S+ \rho \beta \langle \textbf{1},\textbf{1}\rangle _S\\= & {} \langle P_S(\tau _x(f)),P_S(t_R)\rangle _S+ \rho \beta \langle \textbf{1},\textbf{1}\rangle _S. \end{aligned}$$

In particular, for each \(x\in \mathbb {R}^d\), the problems:

  • Maximize \(\langle \tau _x(f),t_R\rangle _S \) over rotations R.

  • Maximize \(\langle P_S(\tau _x(f)),P_S(t)_R\rangle _S \) over rotations R.

  • Maximize \(\langle P_S(\tau _x(f)),P_S(t_R)\rangle _S \) over rotations R.

are equivalent.

Let us define:

$$\begin{aligned} f_{-x,R^{-1}}:=(O_R\circ \tau _x)(f). \end{aligned}$$
(11)

Lemma 2.2

If S is an SSP operator that commutes with rotations, the parameter \(\delta \) that appears in the S-orthogonal decomposition

$$\begin{aligned} f_{-x,R^{-1}}=P_S(f_{-x,R^{-1}})+\delta \textbf{1} \end{aligned}$$

does not depend on R. Consequently, given \(x\in \mathbb {R}^d\), the problems

  • Maximize \(\langle f_{-x,R^{-1}},t \rangle _S \) over rotations R.

  • Maximize \(\langle P_S(f_{-x,R^{-1}}),P_S(t)\rangle _S \) over rotations R.

are equivalent.

Proof

We know that \(\delta =\frac{\langle f_{-x,R^{-1}},\textbf{1} \rangle _S}{\langle \textbf{1},\textbf{1} \rangle _S}\), so that we only need to prove that \(\langle f_{-x,R^{-1}},\textbf{1} \rangle _S\) does not depend on R. Indeed,

$$\begin{aligned} \langle f_{-x,R^{-1}},\textbf{1} \rangle _S= & {} \int _{\mathbb {R}^d}f(Ry+x)S(\textbf{1})(y)dy \end{aligned}$$

(Make the change of variable \(z=Ry\) )

$$\begin{aligned}= & {} \int _{\mathbb {R}^d}f(z+x)S(\textbf{1})(R^{-1}z)dz \\= & {} \int _{\mathbb {R}^d}f(z+x)(O_{R^{-1}}\circ S)(\textbf{1})(z)dz \\= & {} \int _{\mathbb {R}^d}f(z+x)(S\circ O_{R^{-1}})(\textbf{1})(z)dz \end{aligned}$$

(since S commutes with \(O_{R^{-1}}\) )

$$\begin{aligned}= & {} \int _{\mathbb {R}^d}f(z+x)S(\textbf{1})(z)dz \end{aligned}$$

(since \(O_{R^{-1}}(\textbf{1})=\textbf{1}\) )

$$\begin{aligned}= & {} \langle \tau _x(f),\textbf{1}\rangle _S \end{aligned}$$

\(\square \)

We can now state and demonstrate the following:

Theorem 2.3

(Classical template matching) Let S be an SSP operator which commutes with rotations, and let \(x\in \mathbb {R}^d\) be fixed. Then the following problems are equivalent:

\(\mathrm{(a)}\):

Maximize \(\langle f_{-x,R^{-1}},t \rangle _S\) over rotations R.

\(\mathrm{(b)}\):

Maximize \(\langle \tau _x(f), t_R \rangle _S\) over rotations R.

\(\mathrm{(c)}\):

Maximize \(\langle P_S(\tau _x(f)), P_S(t_R) \rangle _S\) over rotations R.

\(\mathrm{(d)}\):

Maximize \(\langle P_S(f_{-x,R^{-1}}),P_S(t) \rangle _S \) over rotations R.

Moreover, if \(\Vert t\Vert _S>0\) and S also has the property that \(\Vert f\Vert _S=0\) implies \(f_{|\textbf{D}}=0\) for a certain neighborhood \(\textbf{D}\) of \(\textbf{0}\in \mathbb {R}^d\) which contains the supports of all the rotated templates \(t_Q\) with \(Q\in SO(d)\), then a match between f and \(t_R\) at x is obtained whenever any one of the following claims holds:

\((\textrm{a}^*)\):

\(\frac{\langle f_{-x,R^{-1}},t \rangle _S}{\Vert f_{-x,R^{-1}}\Vert _S\Vert t\Vert _S}=1\)

\((\textrm{b}^*)\):

\(\frac{\langle \tau _x(f), t_R \rangle _S}{\Vert \tau _x(f)\Vert _S\Vert t_R\Vert _S}=1\)

\((\textrm{c}^*)\):

\(\frac{\langle P_S(\tau _x(f)), P_S(t_R) \rangle _S}{\Vert P_S(\tau _x(f))\Vert _S \Vert P_S(t_R)\Vert _S} =1\)

\((\textrm{d}^*)\):

\(\frac{\langle P_S(f_{-x,R^{-1}}),P_S(t) \rangle _S}{\Vert P_S( f_{-x,R^{-1}})\Vert _S\Vert P_S(t)\Vert _S}=1\)

Finally, the normalized correlations described in \((a^*)\), \((b^*)\), \((c^*)\), and \((d^*)\) do not change when f is replaced by \(\alpha f+\beta \) and t by \(\delta t+\gamma \), with \(\alpha ,\beta ,\delta ,\gamma \in \mathbb {R}\), \(\alpha ,\delta \ne 0\).

Proof

The equivalences \((a)\Leftrightarrow (d)\) and \((b)\Leftrightarrow (c)\) have been already shown. The following identities demonstrate \((a)\Leftrightarrow (b)\):

$$\begin{aligned} \langle f_{-x,R^{-1}},t\rangle _S= & {} \int _{\mathbb {R}^d}f(Rz+x)S(t)(z)dz\\= & {} \int _{\mathbb {R}^d}f(y+x)S(t)(R^{-1}y)dy \end{aligned}$$

( just take \(Rz=y\) and use that \(\det R=1\) )

$$\begin{aligned}= & {} \int _{\mathbb {R}^d}f(y+x)(O_{R^{-1}}\circ S)(t)(y)dy\\= & {} \int _{\mathbb {R}^d}f(y+x)(S\circ O_{R^{-1}})(t)(y)dy \end{aligned}$$

(since S commutes with \(O_{R^{-1}}\) )

$$\begin{aligned}= & {} \langle \tau _x(f),t_R\rangle _S \\ \end{aligned}$$

The other claims are a direct consequence of Theorem 2.1. \(\square \)

In all that follows, we assume that S is an SSP operator that commutes with rotations and that t is normalized in the sense that \(t\perp _S \textbf{1}\) and \(\Vert t\Vert _S=1\). Then \(P_S(t_R)=t_R\) and \(\Vert t_R\Vert _S=1\) for every rotation R. Consequently,

$$\begin{aligned} \langle \tau _x(f),t_R\rangle _S= & {} \langle P_S(\tau _x(f))+\alpha \textbf{1},t_R\rangle _S \\= & {} \langle P_S(\tau _x(f)),t_R\rangle _S = \langle P_S(\tau _x(f)),P_S(t_R)\rangle _S \end{aligned}$$

and

$$\begin{aligned} c(x,R) =\frac{\langle P_S(\tau _x(f)),P_S(t_R)\rangle _S}{\Vert P_S(\tau _x(f))\Vert _S \Vert P_S(t_R)\Vert _S} = \frac{\langle \tau _x(f),t_R\rangle _S}{\Vert P_S(\tau _x(f))\Vert _S } \end{aligned}$$

attains its maximum (\(=1\)) if and only if there is a perfect match between f and \(t_R\) in x. Moreover, if we define \(w(x)=\frac{1}{\Vert P_S(\tau _x(f))\Vert _S }\) and consider the cross-correlation of functions \(f,g\in L^2(\mathbb {R}^d)\), which is defined by

$$\begin{aligned} (f\star g )(x)=\int _{\mathbb {R}^d}f(z+x)g(z)dz =\langle \tau _x(f),g\rangle , \end{aligned}$$
(12)

then

$$\begin{aligned} c(x,R) = w(x) (f\star S(t)_R)(x). \end{aligned}$$
(13)
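Formula (13) is what makes the approach practical: for a fixed rotation R, the whole map \(x\mapsto (f\star S(t)_R)(x)\) is a single FFT-based cross-correlation. A 2D sketch with periodic boundaries (and S taken as the identity, purely for illustration):

```python
import numpy as np

def cross_correlation_map(f, g):
    """(f star g)(x) = sum_z f(z + x) g(z) for all x, via the convolution theorem."""
    return np.real(np.fft.ifftn(np.fft.fftn(f) * np.conj(np.fft.fftn(g))))

rng = np.random.default_rng(3)
t = rng.standard_normal((8, 8))
f = np.zeros((64, 64))
f[20:28, 30:38] = t                      # plant a copy of the template at (20, 30)
g = np.zeros_like(f)
g[:8, :8] = t                            # template embedded at the grid origin
cc = cross_correlation_map(f, g)
print(np.unravel_index(np.argmax(cc), cc.shape))   # (20, 30): the planted position
```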

A perfect match is, in general, never attained. This is so because the desired image, represented by the template t, is usually supported on a strict subset \(\Omega \) of the domain D where the operator S is able to distinguish functions. Thus, the image f may well contain a copy of the image represented by \(t_R\), but in the neighbourhood of the support of \(t_R\), f will contain some information which is not present in \(t_R\). In addition, f is usually corrupted by noise and distortions. This means that the normalized correlations described in items \((a^*)-(d^*)\) of Theorem 2.3 will never equal 1. Consequently, a threshold should be introduced in order to decide whether a match has (or has not) been produced.

In order to find the rotation which maximizes c(x, R), the cross-correlation \((f\star S(t)_R)(x)\) must be computed for a huge number of rotations R, which makes classical template matching inefficient. Indeed, for \(d=3\), the size of the set of rotations R used to sample SO(3) well enough to guarantee a reliable result varies between \(10^4\) and \(5\cdot 10^5\) rotations [5].

For numerical reasons, high frequencies may be altered by the rotation transformation. Thus, in practice, we do not apply the operator S to the original images f, t but to filtered versions of them in which these high frequencies have been eliminated. Concretely, we apply an isotropic (i.e. rotation-invariant) low-pass filter h to both images and, after that, we apply the template matching algorithm to the resulting images. The idea behind this is that, if there is a match between f and t, there will be a match between \({\mathfrak {f}}=f*h\) and \({\mathfrak {t}}=t*h\) too. The operator S results from applying a rotationally symmetric mask \(m(x)=\rho (\Vert x\Vert )\) to the given image. Thus, we substitute \({\mathfrak {f}}=f*h\) for f and \({\mathfrak {t}}=t*h\) for t, and then apply the classical (or tensor) matching algorithm to the pair of images \({\mathfrak {f}},{\mathfrak {t}}\) using the SSP operator \(S({\mathfrak {f}})(x)=m(x) {\mathfrak {f}}(x)\). Usually, the mask m equals 1 within a certain radius around \(\textbf{0}\) and equals 0 outside a slightly larger radius; in between these radii the mask takes values between 0 and 1. Under these restrictions, it is clear that the operator S is SSP and commutes with rotations. Moreover, if \(0=\Vert f\Vert _S=\langle f,S(f)\rangle \ge \int _{\textbf{D}}f^2(x)dx\ge 0\), we have that \(f_{|\textbf{D}}=0\), where \(\textbf{D}\) is a ball of positive radius centered at \(\textbf{0}\). Let us compute the inner product

$$\begin{aligned} \langle \tau _x({\mathfrak {f}}),{\mathfrak {t}}_R\rangle _S= & {} \langle \tau _x({\mathfrak {f}}),m{\mathfrak {t}}_R\rangle \\= & {} \langle \tau _x(f*h),m(t*h)_R\rangle \\= & {} \langle \tau _x(f)*h,m(t_R*h)\rangle \end{aligned}$$

(since every filter is translation invariant, and h is isotropic)

$$\begin{aligned}= & {} \langle \tau _x(f),h *(m(t_R*h))\rangle \end{aligned}$$

(use \(\widetilde{h}(x):=h(-x)=h(x)\), which follows from isotropy of h )

$$\begin{aligned}= & {} \langle \tau _x(f),t_R\rangle _{\overline{S}} \end{aligned}$$

where

$$\begin{aligned} \overline{S}(f)=h *(m\cdot (f*h)) \end{aligned}$$

and we use \(\cdot \) to denote the standard product of real functions. This means that we would have the same effect just considering the template matching algorithm associated with the operator \(\overline{S}\) applied to the images ft. Moreover, the following holds:

Lemma 2.4

Let \(S:L^{2}(\mathbb {R}^d)\rightarrow L^{2}(\mathbb {R}^d)\) be given by

$$\begin{aligned} S(f)=h *(m\cdot (f*h)) \end{aligned}$$
(14)

with h defining an isotropic filter and m a rotationally symmetric mask as described above. Then S is SSP.

Proof

For the proof, we use the following (well-known) formulae: For functions \(a,b,c\in L^{2}(\mathbb {R}^d)\), we have that \((a\star b)(x)=\langle \tau _x(a),b\rangle \), so that \((a\star b)(0)=\langle a,b\rangle = (a*\widetilde{b})(0)\), \(a\star b=a*\widetilde{b}\), and \(a\star (b*c) = (a\star b)\star c\).

Let us now consider the product \(\langle f,S(f)\rangle \):

$$\begin{aligned} \langle f,S(f)\rangle= & {} (f\star S(f))(0) \\= & {} (f\star (h *(m\cdot (f*h))))(0) \\= & {} ((f\star h) \star (m\cdot (f*h)))(0) \\= & {} \langle f\star h, m\cdot (f*h)\rangle \\= & {} \langle f*h, m\cdot (f*h)\rangle \ge 0 \end{aligned}$$

(since \(h=\widetilde{h}\) and \(m\ge 0\)). This proves that S is semidefinite positive. Let us show the symmetry:

$$\begin{aligned} \langle f,S(g)\rangle= & {} (f\star S(g))(0) = (f\star (h *(m\cdot (g*h))))(0) \\= & {} ((f\star h) \star (m\cdot (g*h)))(0) \\= & {} \langle f\star h, m\cdot (g*h)\rangle \\= & {} \langle m\cdot (f\star h), (h*g)\rangle \end{aligned}$$

(since \(g*h= h*g \) and \(\cdot \) is the standard product of functions)

$$\begin{aligned}= & {} ((m\cdot (f\star h))\star (h*g))(0) \\= & {} (((m\cdot (f\star h))\star h)\star g)(0) \\= & {} \langle ((m\cdot (f\star h))\star h), g\rangle \\= & {} \langle ((m\cdot (f*h))*h), g\rangle \text { (since } h=\widetilde{h} \text { )}\\= & {} \langle S(f),g\rangle . \end{aligned}$$

\(\square \)

Remark 2

Lemma 2.4 also applies when we consider S as an operator on the space \(C_0(\mathbb {R}^d)\) of continuous functions with compact support defined on \(\mathbb {R}^d\), endowed with the scalar product of \(L^2(\mathbb {R}^d)\), so that \(S:C_0(\mathbb {R}^d)\rightarrow C_0(\mathbb {R}^d)\). This is so because \(C_0(\mathbb {R}^d)\) is a vector subspace of \(L^2(\mathbb {R}^d)\), and the convolution of continuous functions with compact support is also continuous with compact support. The space \(C_0(\mathbb {R}^d)\) is, in fact, a good model for images that can be used in many application domains.
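A concrete construction of such a pair h, m is sketched below: an isotropic Gaussian kernel normalised so that \(h*\textbf{1}=\textbf{1}\) (i.e. with unit sum), and a radial mask equal to 1 inside an inner radius, 0 outside an outer radius, with a smooth ramp in between. The specific profiles and radii are our choice:

```python
import numpy as np

def radial_grid(n):
    """Distance of each pixel to the grid centre."""
    c = n // 2
    yy, xx = np.mgrid[:n, :n]
    return np.hypot(xx - c, yy - c)

def isotropic_gaussian(n, sigma):
    """Isotropic low-pass kernel h with unit sum, so that h * 1 = 1."""
    h = np.exp(-0.5 * (radial_grid(n) / sigma) ** 2)
    return h / h.sum()

def soft_mask(n, r_in, r_out):
    """Rotationally symmetric mask m: 1 inside r_in, 0 outside r_out, cosine ramp between."""
    s = np.clip((r_out - radial_grid(n)) / (r_out - r_in), 0.0, 1.0)
    return 0.5 - 0.5 * np.cos(np.pi * s)

h = isotropic_gaussian(33, sigma=2.0)
m = soft_mask(33, r_in=10.0, r_out=14.0)
```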

In all that follows, we assume that the SSP operator S is of the form (14) with h, m verifying the hypotheses of Lemma 2.4. Thus, the template matching algorithm is applied with this operator and a fast computation of c(xR) is needed.

A direct computation leads to:

$$\begin{aligned} c(x,R)= & {} \frac{\langle \tau _x(f),t_R\rangle _S}{\Vert P_S(\tau _x(f))\Vert _S }\\= & {} \frac{1}{\Vert P_S(\tau _x(f))\Vert _S } \langle \tau _x(f),S(t_R)\rangle \\= & {} \frac{1}{\Vert P_S(\tau _x(f))\Vert _S } ( \tau _x(f)\star S(t_R))(0)\\= & {} \frac{1}{\Vert P_S(\tau _x(f))\Vert _S } ( \tau _x(f)\star (h*(m\cdot (h*t_R))))(0)\\= & {} \frac{1}{\Vert P_S(\tau _x(f))\Vert _S } ( (\tau _x(f)\star h)\star (m\cdot (h*t_R)))(0)\\= & {} \frac{1}{\Vert P_S(\tau _x(f))\Vert _S } ( (\tau _x(f)*h)\star (m\cdot (h*t_R)))(0) \end{aligned}$$

(since \(h=\widetilde{h}\))

$$\begin{aligned}= & {} \frac{1}{\Vert P_S(\tau _x(f))\Vert _S } \langle (\tau _x(f)*h), m\cdot (h*t_R)\rangle \\= & {} \frac{1}{\Vert P_S(\tau _x(f))\Vert _S } \langle (\tau _x(f)*h), (m\cdot (h*t))_R\rangle \end{aligned}$$

(since h is isotropic, and m is rotationally symmetric). Moreover,

$$\begin{aligned} \Vert P_S(\tau _x(f))\Vert _S^2= & {} \left\| \tau _x(f)-\frac{\langle \tau _x(f),\textbf{1}\rangle _S}{\Vert \textbf{1}\Vert _S^2}\textbf{1}\right\| _S^2\\= & {} \Vert \tau _x(f)\Vert _S^2 -2\left\langle \tau _x(f),\frac{\langle \tau _x(f),\textbf{1}\rangle _S}{\Vert \textbf{1}\Vert _S^2}\textbf{1}\right\rangle _S + \left\| \frac{\langle \tau _x(f),\textbf{1}\rangle _S}{\Vert \textbf{1}\Vert _S^2}\textbf{1} \right\| _S^2 \\= & {} \Vert \tau _x(f)\Vert _S^2-2\left\langle \tau _x(f),\frac{\langle \tau _x(f),S(\textbf{1})\rangle }{\Vert \textbf{1}\Vert _S^2}S(\textbf{1})\right\rangle + \left\| \frac{\langle \tau _x(f),S(\textbf{1})\rangle }{\Vert \textbf{1}\Vert _S^2}\textbf{1}\right\| _S^2 \\= & {} \Vert \tau _x(f)\Vert _S^2 -2\frac{(\langle \tau _x(f),S(\textbf{1})\rangle )^2}{\Vert \textbf{1}\Vert _S^2} + \frac{(\langle \tau _x(f),S(\textbf{1})\rangle )^2}{\Vert \textbf{1}\Vert _S^4}\Vert \textbf{1} \Vert _S^2 \\= & {} \Vert \tau _x(f)\Vert _S^2-\frac{(\langle \tau _x(f),S(\textbf{1})\rangle )^2}{\Vert \textbf{1}\Vert _S^2}\\= & {} \langle \tau _x(f), S(\tau _x(f))\rangle -\frac{(\langle \tau _x(f),S(\textbf{1})\rangle )^2}{\Vert \textbf{1}\Vert _S^2} \end{aligned}$$

Now, using the definition of S (and imposing \(h*\textbf{1}=\textbf{1}\)), we can simplify the computation as follows:

$$\begin{aligned} \Vert P_S(\tau _x(f))\Vert _S^2= & {} \langle \tau _x(f), h*(m\cdot (h*\tau _x(f))\rangle -\frac{(\langle \tau _x(f),h*(m\cdot (h*\textbf{1}))\rangle )^2}{\Vert \textbf{1}\Vert _S^2} \\= & {} \langle \tau _x(f)*h, m\cdot (\tau _x(f)*h)\rangle -\frac{(\langle \tau _x(f)*h, m\cdot (h*\textbf{1})\rangle )^2}{\Vert \textbf{1}\Vert _S^2} \\= & {} \langle (\tau _x(f)*h)^2, m\rangle -\frac{(\langle \tau _x(f)*h, m\cdot (h*\textbf{1})\rangle )^2}{\Vert \textbf{1}\Vert _S^2} \\= & {} \langle (\tau _x(f)*h)^2, m\rangle -\frac{(\langle \tau _x(f)*h, m\rangle )^2}{\Vert \textbf{1}\Vert _S^2} \end{aligned}$$

Note that the FFT algorithm can be used to compute the inner products appearing at the end of the formula above, which helps to speed up the algorithm. Indeed, if f, g are two images, \(\langle f,g\rangle = (f*\widetilde{g})(0)\), so that

$$\begin{aligned} \langle f,g\rangle = {\mathcal {F}}^{-1}( {\mathcal {F}}(f)\cdot {\mathcal {F}}(\widetilde{g}))(0)= {\mathcal {F}}^{-1}( {\mathcal {F}}(f)\cdot \overline{ {\mathcal {F}}(g)})(0). \end{aligned}$$
(15)
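On discrete periodic images the circular convolution theorem is exact, so identity (15) can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(4)
f = rng.standard_normal((16, 16))
g = rng.standard_normal((16, 16))

direct = np.sum(f * g)                   # <f, g>
via_fft = np.real(np.fft.ifftn(np.fft.fftn(f) * np.conj(np.fft.fftn(g))))[0, 0]
print(np.isclose(direct, via_fft))       # True: eq. (15) evaluated at x = 0
```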

Moreover, the following identities also hold:

$$\begin{aligned} \Vert \textbf{1}\Vert _S^2= & {} \langle \textbf{1},S(\textbf{1})\rangle = \langle \textbf{1},h*(m\cdot (h*\textbf{1}))\rangle \\= & {} \langle \textbf{1}*h, (m\cdot (h*\textbf{1}))\rangle = \langle \textbf{1}, m\rangle ,\\ \Vert P_S(t)\Vert _S= & {} \sqrt{\langle (h*t)^2,m\rangle -\frac{(\langle t*h, m\rangle )^2}{\langle \textbf{1}, m\rangle }}, \end{aligned}$$

and

$$\begin{aligned} \langle t,\textbf{1}\rangle _S= \langle h*t,m\rangle . \end{aligned}$$

Thus,

$$\begin{aligned} m\left( h*\frac{P_S(t)}{\Vert P_S(t)\Vert _S}\right)= & {} m\left( h*\frac{t-\frac{\langle t,\textbf{1}\rangle _S}{\Vert \textbf{1}\Vert _S^2}\textbf{1}}{\sqrt{\langle (h*t)^2,m\rangle -\frac{(\langle t*h, m\rangle )^2}{\langle \textbf{1}, m\rangle }}}\right) \\= & {} m \frac{h*t-\frac{\langle h*t,m\rangle }{\langle \textbf{1}, m\rangle }\textbf{1}}{\sqrt{\langle (h*t)^2,m\rangle -\frac{(\langle t*h, m\rangle )^2}{\langle \textbf{1}, m\rangle }}} \end{aligned}$$

The formulae above can be used to code an algorithm for classical template matching.
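Putting the formulae together, the correlation map \(c(\cdot ,R)\) for one rotated template can be sketched as follows (2D, periodic boundaries; the discretization choices, and the delta-kernel/trivial-mask toy check, are ours, not prescribed by the text):

```python
import numpy as np

def conv(a, b):
    """Circular convolution a * b via FFT."""
    return np.real(np.fft.ifftn(np.fft.fftn(a) * np.fft.fftn(b)))

def corr(a, b):
    """Circular cross-correlation (a star b) via FFT."""
    return np.real(np.fft.ifftn(np.fft.fftn(a) * np.conj(np.fft.fftn(b))))

def tm_correlation_map(f, t_R, h, m):
    """c(x, R) for one rotated template t_R, following the section's formulae.

    h must have unit sum (so that h * 1 = 1) and m must be a nonnegative mask."""
    fh = conv(f, h)                      # tau_x(f) * h, for all x at once
    th = conv(t_R, h)                    # h * t_R
    one_m = np.sum(m)                    # ||1||_S^2 = <1, m>
    # normalised projected template:  m * (h * P_S(t_R)) / ||P_S(t_R)||_S
    denom = np.sqrt(np.sum(th ** 2 * m) - np.sum(th * m) ** 2 / one_m)
    q = m * (th - np.sum(th * m) / one_m) / denom
    # ||P_S(tau_x(f))||_S^2 = <(tau_x(f)*h)^2, m> - <tau_x(f)*h, m>^2 / ||1||_S^2
    loc2 = corr(fh ** 2, m) - corr(fh, m) ** 2 / one_m
    return corr(fh, q) / np.sqrt(np.maximum(loc2, 1e-12))

# toy check: a scaled, brightness-shifted copy of the template planted at (20, 30)
n = 64
rng = np.random.default_rng(5)
T = np.zeros((n, n)); T[:8, :8] = rng.standard_normal((8, 8))
f = 2.0 * np.roll(T, (20, 30), axis=(0, 1)) + 5.0
h = np.zeros((n, n)); h[0, 0] = 1.0      # delta kernel: no smoothing
m = np.ones((n, n))                      # trivial mask
c = tm_correlation_map(f, T, h, m)
print(np.unravel_index(np.argmax(c), c.shape))   # (20, 30), with c there ~ 1
```

With the delta kernel and trivial mask this reduces to a global normalized cross-correlation; a realistic run would use a genuine low-pass h and a compactly supported m as described above, repeating the call once per sampled rotation.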

An important tool we will use in this paper is the set \(\mathbb {H}\) of quaternions. In particular, we will use that rotations can be parametrized by unit quaternions (which can be identified with the unit 3-sphere \(\mathbb {S}^3\)), as well as the following formulae (see, e.g. [11, 22]):

  • If \(x\in \mathbb {H}\) has norm 1, then \(x^{-1}=x^*\).

  • Given \(x\in \mathbb {H}\), \(x=a+b\textbf{i}+c\textbf{j}+d\textbf{k}\), we identify x with a pair \((a,v)\), where \(a\in \mathbb {R}\) and \(v=(b,c,d)\in \mathbb {R}^3\), and call a the real part of x, \(a={\textbf{R}}{\textbf{e}}(x)\). Then, if \(x=(a,v), y=(b,w)\in \mathbb {H}\), we have that

    $$\begin{aligned} {\textbf{R}}{\textbf{e}} (xy)=ab-\langle v, w \rangle \end{aligned}$$

    Consequently, if \(x,y\in \mathbb {H}\) have norm 1, then

    $$\begin{aligned} \langle x,y\rangle= & {} {\textbf{R}}{\textbf{e}}(y^{-1}x) = {\textbf{R}}{\textbf{e}}(xy^{-1})\nonumber \\= & {} {\textbf{R}}{\textbf{e}}(yx^{-1})= {\textbf{R}}{\textbf{e}}(x^{-1}y) \end{aligned}$$
    (16)
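These identities are easy to check numerically with an explicit Hamilton product (the helper functions below are our own):

```python
import numpy as np

def qmul(x, y):
    """Hamilton product of quaternions written as (a, v) with v in R^3."""
    a, v = x[0], x[1:]
    b, w = y[0], y[1:]
    real = a * b - v @ w                 # Re(xy) = ab - <v, w>
    vec = a * w + b * v + np.cross(v, w)
    return np.concatenate(([real], vec))

def qconj(x):
    """Conjugate x* = (a, -v); for unit quaternions, x^{-1} = x*."""
    return np.concatenate(([x[0]], -x[1:]))

rng = np.random.default_rng(6)
x = rng.standard_normal(4); x /= np.linalg.norm(x)
y = rng.standard_normal(4); y /= np.linalg.norm(y)
# eq. (16): the Euclidean inner product <x, y> equals Re(y^{-1} x)
print(np.isclose(x @ y, qmul(qconj(y), x)[0]))   # True
```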

We end this section with a result about the composition of SSP operators that will be used in the proof of the main theorem of the paper. We state the result for arbitrary inner product spaces and include its proof for the sake of completeness:

Lemma 2.5

Let X be an inner product vector space. If the operators \(T,S:X\rightarrow X\) are semidefinite positive, symmetric, and commute, then TS is also semidefinite positive and symmetric.

Proof

Let S, T satisfy the hypotheses of the lemma. Let us define \(S_1=S/\Vert S\Vert \) and \(S_{n+1}=S_n-S_n^2\), \(n=1,2,3,\ldots \). We prove by induction on n that \(0\le S_n\le I\) for all \(n\ge 1\), where I denotes the identity operator on X.

It is clear that \(S_1\ge 0\) since \(S\ge 0\). Moreover, given \(x\in X\),

$$\begin{aligned} \langle S_1(x),x\rangle = \frac{1}{\Vert S\Vert }\langle S(x),x\rangle \le \frac{1}{\Vert S\Vert }\Vert S(x)\Vert \Vert x\Vert \le \frac{1}{\Vert S\Vert }\Vert S\Vert \Vert x\Vert \Vert x\Vert =\Vert x\Vert ^2 \end{aligned}$$

so that \(S_1\le I\). Assume that \(0\le S_k\le I\) and consider the case \(k+1\): From \(S_k\le I\) we get \(0\le I-S_k\). From \(0\le S_k\), we get \(-S_k\le 0\) and, hence, \(I-S_k\le I\). Now, given \(x\in X\), we have that

$$\begin{aligned} \langle S_k^2(I-S_k)(x),x\rangle= & {} \langle S_k(I-S_k)(x),S_k(x)\rangle = \langle (I-S_k)S_k(x),S_k(x)\rangle \\= & {} \langle (I-S_k)(y_k),y_k\rangle \ge 0 \quad \text {(where }y_k=S_k(x) \text {)} \end{aligned}$$

since \(0\le I-S_k\). This proves \(S_k^2(I-S_k)\ge 0\). An analogous computation also shows that \(S_k(I-S_k)^2\ge 0\):

$$\begin{aligned} \langle S_k(I-S_k)^2(x),x\rangle= & {} \langle (I-S_k)S_k(I-S_k)(x),x\rangle \\= & {} \langle (I-S_k)S_k(x),(I-S_k)(x)\rangle \\= & {} \langle S_k((I-S_k)(x)),(I-S_k)(x)\rangle \\= & {} \langle S_k(y_k),y_k\rangle \ge 0 \quad \text {(where }y_k=(I-S_k)(x) \text {).} \end{aligned}$$

Hence

$$\begin{aligned} 0\le S_k^2(I-S_k)+S_k(I-S_k)^2=S_k-S_k^2 =S_{k+1}. \end{aligned}$$

On the other hand, \(S_k^2\ge 0\) and \(I-S_k\ge 0\) imply that

$$\begin{aligned} 0\le I-S_k+S_k^2=I-S_{k+1}. \end{aligned}$$

Thus \(0\le S_n\le I\) for all n.

Now, \(S_{n+1}=S_n-S_n^2\) can be written as \(S_n=S_n^2+S_{n+1}\), so that:

$$\begin{aligned} S_1=S_1^2+S_2=S_1^2+S_2^2+S_3=\cdots =S_1^2+\cdots +S_n^2 +S_{n+1}\quad \text {for all } n\ge 1. \end{aligned}$$

Hence

$$\begin{aligned} S_1^2+\cdots +S_n^2=S_1-S_{n+1}\le S_1, \quad n=1,2,\ldots . \end{aligned}$$

It follows that, for \(x\in X\),

$$\begin{aligned} \sum _{k=1}^n\Vert S_k(x)\Vert ^2= & {} \sum _{k=1}^n\langle S_k(x),S_k(x)\rangle \\= & {} \sum _{k=1}^n\langle S_k^*S_k(x),x\rangle \\= & {} \sum _{k=1}^n\langle S_k^2(x),x\rangle \\= & {} \left\langle \left( \sum _{k=1}^n S_k^2\right) (x),x\right\rangle \\\le & {} \langle S_1 (x),x\rangle , \quad n=1,2,\ldots . \end{aligned}$$

Thus, \(\sum _{k=1}^\infty \Vert S_k(x)\Vert ^2<\infty \) and \(\Vert S_n(x)\Vert \rightarrow 0\) as \(n\rightarrow \infty \), for all \(x\in X\). Consequently,

$$\begin{aligned} \sum _{k=1}^{\infty } S_k^2(x) = \lim _{n\rightarrow \infty }\sum _{k=1}^n S_k^2(x)=S_1(x)-\lim _{n\rightarrow \infty }S_{n+1}(x)=S_1(x), \quad x\in X. \end{aligned}$$

Let us now consider the product \(ST=TS\), and let \(x\in X\) be arbitrarily chosen. It follows from the definition of the operators \(S_n\) that they are symmetric and commute with T. Hence

$$\begin{aligned} \langle (TS)(x),x\rangle= & {} \Vert S\Vert \langle (TS_1)(x),x\rangle = \Vert S\Vert \left\langle T\left( \lim _{n\rightarrow \infty }\sum _{k=1}^n S_k^2(x)\right) ,x\right\rangle \\= & {} \Vert S\Vert \lim _{n\rightarrow \infty }\sum _{k=1}^n \langle (TS_k^2)(x),x\rangle \\= & {} \Vert S\Vert \lim _{n\rightarrow \infty }\sum _{k=1}^n \langle (S_kTS_k)(x),x\rangle \\= & {} \Vert S\Vert \lim _{n\rightarrow \infty }\sum _{k=1}^n \langle T(S_k(x)),S_k(x)\rangle \\= & {} \Vert S\Vert \lim _{n\rightarrow \infty }\sum _{k=1}^n \langle T(y_k),y_k\rangle \ge 0, \end{aligned}$$

which proves that \(ST=TS\) is SSP. \(\square \)
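In finite dimensions, Lemma 2.5 is the familiar fact that a product of commuting symmetric positive-semidefinite matrices is again symmetric positive semidefinite; a quick numerical illustration (the matrices are arbitrary examples, built as powers of one symmetric matrix so that they commute):

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((6, 6))
M = M + M.T                          # symmetric
S, T = M @ M, M @ M @ M @ M          # M^2 and M^4: symmetric, PSD, commuting

assert np.allclose(S @ T, T @ S)             # S and T commute
P = T @ S
assert np.allclose(P, P.T)                   # TS is symmetric
assert np.linalg.eigvalsh(P).min() > -1e-6   # TS is positive semidefinite
```

Without the commutativity hypothesis, the product of two symmetric PSD matrices is in general not even symmetric.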

3 Tensor template matching

This section introduces a tensorial template matching (TTM) algorithm. The purpose is to handle translations and rotations efficiently at the same time. First, we introduce some background necessary to understand the mathematical developments that follow. Second, we present the main theorem for TTM, which allows us to determine the optimal rotation of the template t at every match in the image f by computing certain tensors, without sampling SO(3). Finally, we explain how to determine match positions (template translations) directly from the computed tensors.

3.1 Tensor background

A tensor \(A\in T^n(\mathbb {R}^d)\) of order n and dimension d is just an array of the form \(A=(A_{i_1,\ldots ,i_n})_{1\le i_1,\ldots ,i_n\le d}\) where all the entries \(A_{i_1,\ldots ,i_n}\) are real numbers. The tensor A is called symmetric if \(A_{i_1,\ldots ,i_n}=A_{i_{\sigma (1)},\ldots ,i_{\sigma (n)}}\) for every permutation \(\sigma \in \Sigma _n\) (the set of permutations of \(\{1,\ldots ,n\}\)). We denote by \(S^n(\mathbb {R}^d)\) the set of symmetric tensors of order n and dimension d. An important example of a symmetric tensor of order n is the so-called n-th tensor power of a vector \(v=(v_1,\ldots ,v_d)\in \mathbb {R}^d\), which is defined as

$$\begin{aligned} v^{\odot n} =(v_{i_1}v_{i_2}\cdots v_{i_n})_{1\le i_1,\ldots , i_n\le d}. \end{aligned}$$
(17)

It is well known that \(T^n(\mathbb {R}^d)\) and \(S^n(\mathbb {R}^d)\) are real vector spaces with the natural operations (pointwise sum and multiplication by a scalar), and that \(\dim T^n(\mathbb {R}^d)=d^n\), \(\dim S^n(\mathbb {R}^d)=\left( {\begin{array}{c}n+d-1\\ n\end{array}}\right) \) for all \(n,d\ge 1\). For example, \(\dim T^4(\mathbb {R}^4)=4^4=256\), \(\dim S^4(\mathbb {R}^4)=\left( {\begin{array}{c}7\\ 4\end{array}}\right) =35\). Moreover, every symmetric tensor is a finite sum of tensor powers, which allows us to introduce the concept of the (symmetric) rank of a symmetric tensor as the minimal number of tensor powers used to represent the tensor with their sum [6].
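These definitions are easy to experiment with numerically; the helper below (ours, not a library routine) builds \(v^{\odot n}\) by repeated outer products:

```python
import numpy as np
from math import comb

def tensor_power(v, n):
    """n-th tensor power v^{(.)n} of Eq. (17): an order-n array of shape (d,)*n."""
    A = np.array(1.0)
    for _ in range(n):
        A = np.multiply.outer(A, v)
    return A

v = np.array([1.0, 2.0, -1.0, 0.5])             # d = 4
A = tensor_power(v, 4)
assert A.shape == (4, 4, 4, 4)                  # T^4(R^4) has 4^4 = 256 entries
assert np.allclose(A, A.transpose(1, 0, 2, 3))  # symmetric under index swaps
assert comb(4 + 4 - 1, 4) == 35                 # dim S^4(R^4) = C(7,4) = 35
```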

The map \(\langle \cdot ,\cdot \rangle : T^n(\mathbb {R}^d)\times T^n(\mathbb {R}^d) \rightarrow \mathbb {R}\) given by

$$\begin{aligned} \langle A, B\rangle = \sum _{i_1=1}^d \sum _{i_2=1}^d \cdots \sum _{i_n=1}^d A_{i_1,\ldots ,i_n}B_{i_1,\ldots ,i_n} \end{aligned}$$
(18)

defines an inner product. It is also usual to denote \(A\cdot B=\langle A,B\rangle \). Moreover, with this notation, if \(x,y\in \mathbb {R}^d\) are d-dimensional vectors, a direct application of the multinomial theorem shows that

$$\begin{aligned} x^{\odot n}\cdot y^{\odot n}=(x \cdot y)^n=(\langle x,y\rangle )^n. \end{aligned}$$
(19)
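Identity (19) can be verified directly; `tensor_power` below is an ad hoc helper that builds \(v^{\odot n}\) by repeated outer products:

```python
import numpy as np

def tensor_power(v, n):
    A = np.array(1.0)
    for _ in range(n):
        A = np.multiply.outer(A, v)   # v^{(.)n} as in Eq. (17)
    return A

rng = np.random.default_rng(1)
x, y = rng.standard_normal(4), rng.standard_normal(4)
n = 4
lhs = np.sum(tensor_power(x, n) * tensor_power(y, n))  # full contraction (18)
rhs = np.dot(x, y) ** n                                # right-hand side of (19)
assert abs(lhs - rhs) < 1e-8
```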

Moreover, if \(A\in S^n(\mathbb {R}^d)\), and \(x=(x_1,\ldots ,x_d)\in \mathbb {R}^d\), we can also consider the inner product

$$\begin{aligned} A\cdot x^{\odot n} =\langle A, x^{\odot n}\rangle = \sum _{i_1=1}^d \sum _{i_2=1}^d \cdots \sum _{i_n=1}^d A_{i_1,\ldots ,i_n}x_{i_1} \cdots x_{i_n}, \end{aligned}$$
(20)

which can be seen as an homogeneous polynomial in d variables, of degree n, which justifies using the notation \(Ax^n=A\cdot x^{\odot n}\). Moreover, if \(k<n\), \(Ax^{k}\in S^{n-k}(\mathbb {R}^d)\) denotes the symmetric tensor whose components are

$$\begin{aligned}{} & {} (Ax^k)_{i_1,\ldots ,i_{n-k}}\nonumber \\{} & {} \quad =\sum _{j_1=1}^d \sum _{j_2=1}^d \cdots \sum _{j_k=1}^d A_{i_1,\ldots ,i_{n-k},j_1,\ldots ,j_k}x_{j_1}\cdots x_{j_k}. \end{aligned}$$
(21)

In particular, \(Ax^{n-1}\in S^1(\mathbb {R}^d)=\mathbb {R}^d\) is a vector whose i-th component is

$$\begin{aligned} (Ax^{n-1})_{i}= \sum _{j_1=1}^d \sum _{j_2=1}^d \cdots \sum _{j_{n-1}=1}^d A_{i,j_1,\ldots ,j_{n-1}}x_{j_1}\cdots x_{j_{n-1}}. \end{aligned}$$
(22)

Indeed, if \(\varphi (x)=Ax^n\) then

$$\begin{aligned} \nabla \varphi (x)=nA x^{n-1}, \end{aligned}$$

where \(\nabla \varphi \) denotes the gradient of the function \(\varphi :\mathbb {R}^d\rightarrow \mathbb {R}\).
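The contraction (21)-(22) and the gradient identity \(\nabla (Ax^n)=nAx^{n-1}\) can be checked numerically, e.g. for \(n=4\), \(d=3\) (a finite-difference sketch on a random symmetric tensor):

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(2)
d, n = 3, 4

# random symmetric order-4 tensor: symmetrize over all index permutations
A = rng.standard_normal((d,) * n)
A = sum(A.transpose(p) for p in permutations(range(n))) / 24.0

x = rng.standard_normal(d)
Ax3 = np.einsum('ijkl,j,k,l->i', A, x, x, x)   # A x^{n-1}, Eq. (22)

# central finite differences of phi(x) = A x^n confirm grad(phi) = n A x^{n-1}
eps = 1e-6
for i in range(d):
    e = np.zeros(d); e[i] = eps
    fp = np.einsum('ijkl,i,j,k,l->', A, *(4 * [x + e]))
    fm = np.einsum('ijkl,i,j,k,l->', A, *(4 * [x - e]))
    assert abs((fp - fm) / (2 * eps) - n * Ax3[i]) < 1e-4
```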

Note that the vector x can be chosen from \(\mathbb {C}^d\) in the definitions above. This justifies the following definition (see [8]): Given \(A\in S^n(\mathbb {R}^d)\) and \(B\in S^m(\mathbb {R}^d)\), we say that \(\lambda \in \mathbb {C}\) is a B-eigenvalue of A and \(u\in \mathbb {C}^d\) is its associated B-eigenvector (equivalently, that \((\lambda ,u)\) is a B-eigenpair of A) if \(Au^{n-1}=\lambda Bu^{m-1}\) and \(Bu^m=1\).

Using gradients, we can rewrite the equation \(Au^{n-1}=\lambda Bu^{m-1}\) as

$$\begin{aligned} \frac{1}{n}\nabla Au^n=\lambda \frac{1}{m}\nabla Bu^m. \end{aligned}$$

Hence u is a B-eigenvector of A if and only if it is a critical point of the following optimization problem:

$$\begin{aligned} \left\{ \begin{array}{llll} \text {Maximize: } &{} Ax^n\\ \text {under the restriction:} &{} Bx^m=1.\end{array}\right. \end{aligned}$$
(23)

Two particularly important cases are the H-eigenvectors

$$\begin{aligned} \left\{ \begin{array}{llll} \text {Maximize: } &{} Ax^n\\ \text {under the restriction:} &{} \sum _{i=1}^dx_i^m=1\end{array}\right. \end{aligned}$$
(24)

and the Z-eigenvectors

$$\begin{aligned} \left\{ \begin{array}{llll} \text {Maximize: } &{} Ax^n\\ \text {under the restriction:} &{} \sum _{i=1}^dx_i^2=1.\end{array}\right. \end{aligned}$$
(25)

The optimization problem associated with finding Z-eigenvectors of a given symmetric tensor is particularly important for us, since the tensor matching algorithm we propose reduces to one of these problems at each position and, fortunately, there are good iterative algorithms to approximate the solutions of (25) (see e.g. [15, 16]). These algorithms have a linear rate of convergence. In Sect. 3.4 we present a heuristic that can be used to select the positions where a match is probable, so that (25) needs to be solved only at those positions.
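A minimal sketch of a shifted symmetric higher-order power iteration in the spirit of SS-HOPM [15, 16] (the fixed shift, iteration count, and starting point are simplifications of ours; the actual algorithm chooses the shift adaptively):

```python
import numpy as np

def z_eigenpair(A, x0, alpha=2.0, iters=200):
    """Approximate a Z-eigenpair of a symmetric order-4 tensor A.

    Iterates x <- normalize(A x^3 + alpha x); for a suitable shift alpha the
    iterates converge to a pair (lambda, x) with A x^3 = lambda x, ||x|| = 1,
    a critical point of the optimization problem (25).
    """
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        y = np.einsum('ijkl,j,k,l->i', A, x, x, x) + alpha * x
        x = y / np.linalg.norm(y)
    lam = np.einsum('ijkl,i,j,k,l->', A, x, x, x, x)
    return lam, x

# sanity check on A = v^{(.)4}: the dominant Z-eigenpair is (||v||^4, v/||v||)
v = np.array([3.0, 0.0, 4.0])
A = np.einsum('i,j,k,l->ijkl', v, v, v, v)
lam, x = z_eigenpair(A, np.ones(3))
assert abs(lam - 625.0) < 1e-6          # ||v||^4 = 5^4
assert np.linalg.norm(x - v / 5.0) < 1e-6
```

Note that only one rotation candidate is processed per iteration, which is the source of the speedup discussed below.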

3.2 Defining the tensor template

In all that follows in this paper, our inner product space is \(X=C_0(\mathbb {R}^d)\), and \(S:X\rightarrow X\) denotes an SSP operator which commutes with rotations. Moreover, we also assume that the template \(t\in X\) is normalized by \(t\perp _S \textbf{1}\) and \(\Vert t\Vert _S=1\). In Sect. 2 we proved that, for each \(x\in \mathbb {R}^d\), \(c(x,R) = w(x) (f\star S(t)_R)(x)\) attains its maximum value on rotation R (and this value equals 1) if and only if there is a match between f and t at (x, R) (i.e., a match between \(\tau _x(f)\) and \(t_R\)). Let us define the symmetric tensor \(C_n(x)\in S^{n}(\mathbb {R}^{d'})\), where \(d'\) is the number of parameters used to describe the rotations SO(d) (in particular, for \(d=3\), we get \(d'=4\)), by the formula:

$$\begin{aligned} C_n(x)=\int _{SO(d)}R^{\odot n}c(x,R)dR \end{aligned}$$
(26)

This means that

$$\begin{aligned} (C_n(x))_{i_1,\ldots ,i_n}=\int _{SO(d)}R_{i_1}R_{i_2}\cdots R_{i_n}c(x,R)dR \end{aligned}$$
(27)

for all \(1\le i_1,\ldots ,i_n\le d'\). Hence

$$\begin{aligned} (C_n(x))_{i_1,\ldots ,i_n}= & {} \int _{SO(d)}R_{i_1}\cdots R_{i_n}c(x,R)dR \\= & {} w(x)\int _{SO(d)}R_{i_1}\cdots R_{i_n} (f\star S(t)_R)(x)dR \\= & {} w(x)\int _{SO(d)}R_{i_1}\cdots R_{i_n} \langle \tau _x(f),S(t)_R\rangle dR \\= & {} w(x)\int _{SO(d)}R_{i_1}\cdots R_{i_n} \int _{\mathbb {R}^d} \tau _x(f)(z)S(t)_R(z)dz dR \\= & {} w(x)\int _{\mathbb {R}^d} \tau _x(f)(z) \left( \int _{SO(d)}R_{i_1}\cdots R_{i_n} S(t)_R(z) dR \right) dz\\= & {} w(x) \langle \tau _x(f),(T(z))_{i_1,\ldots ,i_n}\rangle , \end{aligned}$$

where

$$\begin{aligned} T(z)=\int _{SO(d)}R^{\odot n}S(t)_R(z) dR \in S^{n}(\mathbb {R}^{d'}) \end{aligned}$$
(28)

is a tensor template (or tensorial needle).

It is of fundamental importance to observe that T(z) is computed only once and contains a reduced number of components, since \(\dim S^n(\mathbb {R}^{d'})=\left( {\begin{array}{c}n+d'-1\\ n\end{array}}\right) \) (in particular, \(\dim S^4(\mathbb {R}^{4})=\left( {\begin{array}{c}7\\ 4\end{array}}\right) =35\)). Indeed, this is the main reason why the tensor template matching algorithm we introduce in this paper is fast. Another reason is that the rotations R defining a match between f and \(t_R\) at x are Z-eigenvectors of the symmetric tensor \(C_n(x)\), which is remarkable because the power method used in [15, 16] for the solution of the corresponding optimization problem is fast. Moreover, the corresponding algorithm does not require evaluating thousands or even millions of rotations, as is the case with classical matching algorithms, but just a reduced set of them: one per iteration.
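As a quick sanity check of (26), the integral can be approximated by Monte Carlo over uniformly sampled unit quaternions; by identity (19), contracting the resulting 35-component tensor with \(Q^{\odot n}\) then reproduces the angularly smoothed correlation \(\int _{SO(3)}(Q\cdot R)^n c(x,R)dR\) for any rotation Q (the correlation values below are synthetic, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(3)
n, N = 4, 2000

# uniform unit quaternions (Haar-distributed rotations via normalized Gaussians)
R = rng.standard_normal((N, 4))
R /= np.linalg.norm(R, axis=1, keepdims=True)
c = rng.random(N)                        # stand-in for c(x, R) at a fixed x

# Monte Carlo version of Eq. (26): C_n(x) ~ (1/N) sum_r R_r^{(.)n} c_r
C = np.einsum('ri,rj,rk,rl,r->ijkl', R, R, R, R, c) / N

# one small symmetric tensor now answers queries for ANY rotation Q
Q = rng.standard_normal(4)
Q /= np.linalg.norm(Q)
via_tensor = np.einsum('ijkl,i,j,k,l->', C, Q, Q, Q, Q)
direct = np.mean((R @ Q) ** n * c)       # sampled version of the integral
assert abs(via_tensor - direct) < 1e-10
```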

3.3 Finding the correct rotation

Let us state the main result of this paper:

Theorem 3.1

Let \(f,t\in C_0(\mathbb {R}^3)\), \(x\in \mathbb {R}^3\), and \(n\in 2\mathbb {N}\) be given. If there is a match between f and \(t_R\) at x, the function \(\varphi (Q)=C_n(x)\cdot Q^{\odot n}\), defined on rotations of \(\mathbb {R}^3\), when parametrized by unit quaternions Q, attains its global maximum at \(Q=R\).

Proof

We prove the result as a consequence of Theorem 2.3. Thus, our main goal is to represent \(\varphi (Q)\) in terms of a scalar product, \(\varphi (Q)=w(x)\langle \tau _x(f), t_Q \rangle _{S'}\), for some SSP operator \(S'\), which would guarantee that if there is a match between f and \(t_R\) at x, then \(\varphi (Q)\) attains its global maximum at \(Q=R\).

Let us compute \(\varphi (Q)\):

$$\begin{aligned} \varphi (Q)= & {} C_n(x)\cdot Q^{\odot n} = Q^{\odot n}\cdot \int _{SO(3)}R^{\odot n}c(x,R)dR \\= & {} \int _{SO(3)}Q^{\odot n}\cdot R^{\odot n}c(x,R)dR \\= & {} \int _{SO(3)}(Q\cdot R)^{n}c(x,R)dR \\= & {} \int _{SO(3)}({\textbf{R}}{\textbf{e}} (R^{-1}Q))^{n}c(x,R)dR \quad [\text {using} (16)]\\= & {} \int _{SO(3)}({\textbf{R}}{\textbf{e}} (R^{-1}Q))^{n}w(x) \langle \tau _x(f),t_R\rangle _S dR \end{aligned}$$

(by definition of c(xR)). Hence, dividing by w(x), we get:

$$\begin{aligned}{} & {} \frac{1}{w(x)}\varphi (Q)\\{} & {} \quad = \int _{SO(3)}({\textbf{R}}{\textbf{e}} (R^{-1}Q))^{n}\int _{\mathbb {R}^3} f(z+x)(O_{R^{-1}}\circ S)(t)(z) dz dR\\{} & {} \quad = \int _{SO(3)}\int _{\mathbb {R}^3} f(z+x)(O_{R^{-1}}\circ S)(t)(z) ({\textbf{R}}{\textbf{e}} (R^{-1}Q))^{n}dz dR\\{} & {} \quad = \int _{\mathbb {R}^3} f(z+x) \int _{SO(3)}(O_{R^{-1}}\circ S)(t)(z) ({\textbf{R}}{\textbf{e}} (R^{-1}Q))^{n}dR dz\\{} & {} \quad = \int _{\mathbb {R}^3} f(z+x) \left( \int _{SO(3)}S(t)(z)(R) ({\textbf{R}}{\textbf{e}} (R^{-1}Q))^{n}dR \right) dz \end{aligned}$$

(where \(S(t)(z)(R)=(O_{R^{-1}}\circ S)(t)(z)\))

$$\begin{aligned}= & {} \int _{\mathbb {R}^3} f(z+x) (S(t)(z)\circledast _{SO(3)} K)(Q) dz \end{aligned}$$

where \(K(R)=({\textbf{R}}{\textbf{e}} (R))^n\) and

$$\begin{aligned} (a\circledast _{SO(3)} b)(Q)=\int _{ SO(3)} a(R)b(R^{-1}Q)dR \end{aligned}$$

denotes the convolution of functions \(a,b\in L^2(SO(3))\).

In other words,

$$\begin{aligned} \varphi (Q)=Q^{\odot n} \cdot C_n(x) = w(x)\langle \tau _x(f), (S(t)\circledast _{SO(3)} K)(Q)\rangle . \end{aligned}$$
(29)

Here,

$$\begin{aligned} (S(t)\circledast _{SO(3)} K)(Q)=\int _{SO(3)}S(t)_R K(R^{-1}Q)dR \end{aligned}$$

must be interpreted as a function defined on \(\mathbb {R}^3\) with values on \(\mathbb {R}\) (indeed, it is an element of \(C_0(\mathbb {R}^3)\)):

$$\begin{aligned} (S(t)\circledast _{SO(3)} K)(Q)(z)= & {} \left( \int _{SO(3)}S(t)_R K(R^{-1}Q)dR\right) (z)\\:= & {} \int _{SO(3)}S(t)_R(z) K(R^{-1}Q)dR \\= & {} (S(t)(z)\circledast _{SO(3)} K)(Q), \end{aligned}$$

where \(S(t)(z):SO(3)\rightarrow \mathbb {R}\) is given by

$$\begin{aligned} S(t)(z)(R)=S(t)_R(z)=(O_{R^{-1}}\circ S)(t)(z)=S(t)(R^{-1}z). \end{aligned}$$

Indeed, in general, every element \(t\in C_0(\mathbb {R}^3)\) can be interpreted, for each \(z\in \mathbb {R}^3\), as an element of \(L^2(SO(3))\) just by setting \(t(z)(R)=t_R(z)\). Consequently, \((t\circledast _{SO(3)} K)(I_d)\) is an element of \(C_0(\mathbb {R}^3)\),

$$\begin{aligned} (t\circledast _{SO(3)} K)(I_d)(z)=(t(z)\circledast _{SO(3)} K)(I_d). \end{aligned}$$
(30)

It follows that

$$\begin{aligned} (S(t)\circledast _{SO(3)} K)(Q)(z)= & {} \int _{SO(3)}S(t)_R(z) K(R^{-1}Q)dR\\= & {} \int _{SO(3)}S(t)_{QP}(z) K(P^{-1})dP \end{aligned}$$

(in the last equality, set \(R=QP\) and use that \(|Q|=1\))

$$\begin{aligned}= & {} \int _{SO(3)}(O_{(QP)^{-1}}\circ S)(t)(z) K(P^{-1})dP \\= & {} \int _{SO(3)}(O_{P^{-1}Q^{-1}}\circ S)(t)(z) K(P^{-1})dP \\= & {} \int _{SO(3)}(O_{Q^{-1}}\circ O_{P^{-1}}\circ S)(t)(z) K(P^{-1})dP \\= & {} \int _{SO(3)}O_{Q^{-1}}(S(t)_P)(z) K(P^{-1})dP\\= & {} O_{Q^{-1}}\left( \int _{SO(3)}S(t)_PK(P^{-1})dP\right) (z)\\= & {} \left( \int _{SO(3)}S(t)_P K(P^{-1})dP\right) _Q(z) \\= & {} (S(t)\circledast _{SO(3)} K)(I_d)_Q(z), \end{aligned}$$

where \(I_d\) denotes the identity rotation.

Let us now denote by \(S_2:C_0(\mathbb {R}^3)\rightarrow C_0(\mathbb {R}^3)\) the operator given by

$$\begin{aligned} S_2(t)=(t\circledast _{SO(3)} K)(I_d) \end{aligned}$$

and let \(S'=S_2\circ S\). Then

$$\begin{aligned} Q^{\odot n} \cdot C_n(x)= & {} w(x)\langle \tau _x(f), (S(t)\circledast _{SO(3)} K)(Q)\rangle \\= & {} w(x)\langle \tau _x(f), (S(t)\circledast _{SO(3)} K)(I_d)_Q\rangle \\= & {} w(x)\langle \tau _x(f), S_2(S(t))_Q \rangle \\= & {} w(x)\langle \tau _x(f), S_2(S(t_Q)) \rangle \end{aligned}$$

(since \(S, S_2\) commute with rotations)

$$\begin{aligned}= & {} w(x)\langle \tau _x(f), S'(t_Q) \rangle \\= & {} w(x)\langle \tau _x(f), t_Q \rangle _{S'} \end{aligned}$$

Thus, the proof ends as soon as we demonstrate that \(S'\) is an SSP operator, and it is for this that we need to use Lemma 2.5. Indeed, \(S'=S_2\circ S\) is a composition of operators, S is, by hypothesis, symmetric semidefinite positive, and S, \(S_2\) commute because S commutes with rotations and \(S_2\) is defined in terms of convolution in SO(3). Thus, Lemma 2.5 implies that \(S'\) is SSP whenever \(S_2\) is SSP.

To prove that \(S_2\) is symmetric semidefinite positive, we use the properties of the convolution on SO(3) when interpreted as a hyperspherical convolution on \(S^3\), the unit sphere of \(\mathbb {R}^4\). Recall that if \(S^{d-1}=\{x\in \mathbb {R}^d: x\cdot x=1\}\) denotes the (unit) sphere of \(\mathbb {R}^d\), then SO(d) acts transitively on \(S^{d-1}\) (which means that, given \(z_1,z_2\in S^{d-1}\), there is a rotation \(R\in SO(d)\) such that \(R(z_1)=z_2\)). This makes \(S^{d-1}\) a homogeneous space and allows us to introduce the convolution of functions defined on \(S^{d-1}\) as follows:

$$\begin{aligned} (f*_{S^{d-1}} g)(z)=\int _{SO(d)}f(R\eta )g(R^{-1}z)dR, \end{aligned}$$

where \(\eta \in S^{d-1}\) is the north pole of the sphere and \(f,g\in L^2(S^{d-1})\). Thus, if we use that the elements of SO(3) are parametrized by quaternions of norm 1, which can be identified with the elements of the sphere \(S^3=\{x\in \mathbb {H}: x\overline{x}=|x|=1\}\) in four-dimensional space, then, assuming that the north pole of \(S^3\) is given precisely by the identity rotation \(I_d\), the convolution of \(f,g\in L^2(SO(3))\) can be interpreted as a hyperspherical convolution on \(S^3\):

$$\begin{aligned} (f\circledast _{SO(3)} g)(Q)= & {} \int _{SO(3)}f(RI_d)g(R^{-1}Q)dR \nonumber \\= & {} (f*_{S^3} g)(Q). \end{aligned}$$
(31)

Now, as it is well known, \(L^2(S^3)\) is a Hilbert space and the so-called hyperspherical harmonics, \(\{\Xi _{M}^\ell \}\), form an orthonormal basis of this space. Thus, every function \(f\in L^2(S^3)\) admits a Fourier expansion

$$\begin{aligned}{} & {} f(z)=\sum _{\ell , M}{\hat{f}}(\ell ,M) \Xi _{M}^\ell (z) \end{aligned}$$
(32)
$$\begin{aligned}{} & {} {\hat{f}}(\ell ,M)= \langle f,\Xi _{M}^\ell (z)\rangle = \int _{S^{3}}f(\xi ) \overline{\Xi _{M}^\ell (\xi )}d\xi . \end{aligned}$$
(33)

Moreover, in [10], it was proven that, if \(f,g\in L^2(S^3)\) and \(\mathfrak {f}=f*_{S^3} g\), then

$$\begin{aligned} \hat{\mathfrak {f}}(\ell ,M)= (\ell +1){\hat{f}}(\ell ,M){\hat{g}}(\ell ,0) \end{aligned}$$
(34)

It follows that, given a template \(t\in C_0(\mathbb {R}^3)\), for each \(z\in \mathbb {R}^3\), the map \(t(z)(R)=t_R(z)\) belongs to \(L^2(S^3)\) (here the rotations R are parametrized as unit quaternions, so that \(R\in S^3\)) and

$$\begin{aligned} t(z)(R)= & {} \sum _{\ell , M}\widehat{t(z)}(\ell ,M) \Xi _{M}^\ell (R) \end{aligned}$$
(35)
$$\begin{aligned} K(R)= & {} \sum _{\ell , M}{\widehat{K}}(\ell ,M) \Xi _{M}^\ell (R), \end{aligned}$$
(36)

and

$$\begin{aligned} (t(z)\circledast _{SO(3)} K)(R) = \sum _{\ell , M} \widehat{t(z)}(\ell ,M) {\widehat{K}}(\ell ,0) (\ell +1)\Xi _{M}^\ell (R) \end{aligned}$$
(37)

We need the following Lemma, whose proof is included in Sect. 4:

Lemma 3.2

\({\widehat{K}}(\ell ,0)\ge 0\) for all \(\ell \).

Assuming this lemma, we have

$$\begin{aligned} \langle t(z), (t(z)\circledast _{SO(3)} K)(R)\rangle _{L^2(SO(3))} = \sum _{\ell , M} |\widehat{t(z)}(\ell ,M)|^2 {\widehat{K}}(\ell ,0) (\ell +1) \ge 0\qquad \end{aligned}$$
(38)

This means that convolution with K, which is an operator \(C_K:L^2(SO(3))\rightarrow L^2(SO(3))\), \(C_K(f)=f\circledast _{SO(3)}K\), is semidefinite positive. Moreover, it is well known that this operator is symmetric (and we will use both facts in our computations below).

In order to prove that \(S_2\) is SSP, we introduce the operator \(L:C_0(\mathbb {R}^3)\rightarrow C(SO(3), C_0(\mathbb {R}^3))\) defined by \(L(t)(R)=t_R\), as well as the operator \(L^*: C(SO(3), C_0(\mathbb {R}^3))\rightarrow C_0(\mathbb {R}^3)\) defined by \(L^*(a)(z)=\int _{SO(3)}a(R)_{R^{-1}}(z)dR\).

Then

$$\begin{aligned} \langle f,L^*(a)\rangle= & {} \int _{\mathbb {R}^3}f(z)\left( \int _{SO(3)}a(R)_{R^{-1}}(z)dR\right) dz\\= & {} \int _{\mathbb {R}^3}\int _{SO(3)}f(z)a(R)(Rz)dRdz \\= & {} \int _{SO(3)}\int _{\mathbb {R}^3}f(z)a(R)(Rz)dzdR \\= & {} \int _{SO(3)}\int _{\mathbb {R}^3}f(R^{-1}w)a(R)(w)dwdR \quad \text {(just take } w=Rz \text { )} \\= & {} \int _{\mathbb {R}^3}\int _{SO(3)}f(R^{-1}w)a(R)(w)dRdw \\= & {} \int _{\mathbb {R}^3}\left( \int _{SO(3)}L(f)(R)(w)a(R)(w)dR\right) dw \\= & {} \int _{\mathbb {R}^3}\langle L(f)(w),a(w)\rangle _{SO(3)}dw,\\ \end{aligned}$$

where \(L(f)(w)(R):=L(f)(R)(w)=f_R(w)=f(R^{-1}w)\) and \(a(w)(R):=a(R)(w)\). On the other hand,

$$\begin{aligned} S_2(t)= & {} (t\circledast _{SO(3)} K)(I_d) \\= & {} \int _{SO(3)}t_R K(R^{-1})dR \\= & {} \int _{SO(3)}L(t)(R)K(R^{-1})dR \\= & {} (L(t)\circledast _{SO(3)} K)(I_d) \end{aligned}$$

Thus, if \(V=\int _{SO(3)}dR\) is the volume of SO(3), then

$$\begin{aligned} V S_2(t)= & {} (L(t)\circledast _{SO(3)} K)(I_d) \int _{SO(3)}dR \\= & {} \int _{SO(3)}(L(t)\circledast _{SO(3)} K)(I_d) dR \\= & {} \int _{SO(3)}(L(t)\circledast _{SO(3)} K)(I_d)_{R^{-1}R} dR \\= & {} \int _{SO(3)}(L(t)\circledast _{SO(3)} K)(R)_{R^{-1}} dR \\= & {} L^*(L(t)\circledast _{SO(3)} K), \end{aligned}$$

where we have used that

$$\begin{aligned} (L(t)\circledast _{SO(3)}K)(I_d)_{R^{-1}R}=((L(t)\circledast _{SO(3)}K)(I_d)_R)_{R^{-1}} \end{aligned}$$

and that

$$\begin{aligned} (L(t)\circledast _{SO(3)}K)(I_d)(z)= & {} (L(t)(z)\circledast _{SO(3)}K)(I_d)\\= & {} \int _{SO(3)}L(t)(z)(Q)K(Q^{-1}I_d)dQ\\= & {} \int _{SO(3)}t_Q(z)K(Q^{-1}I_d)dQ\\= & {} \int _{SO(3)}t(Q^{-1}z)K(Q^{-1}I_d)dQ \end{aligned}$$

so that

$$\begin{aligned} (L(t)\circledast _{SO(3)}K)(I_d)_{R}(z)= & {} (L(t)\circledast _{SO(3)}K)(I_d)(R^{-1}z)\\= & {} \int _{SO(3)}t(Q^{-1}R^{-1}z)K(Q^{-1}I_d)dQ\\= & {} \int _{SO(3)}t(\Theta ^{-1}z)K(\Theta ^{-1} R)d\Theta \end{aligned}$$

(set \(\Theta ^{-1}=Q^{-1} R^{-1}\), so that \(Q^{-1} =\Theta ^{-1} R\); by the invariance of the Haar measure, \(d\Theta =dQ\))

$$\begin{aligned}= & {} \int _{SO(3)}L(t)(z)(\Theta )K(\Theta ^{-1} R)d\Theta \\= & {} (L(t)(z)\circledast _{SO(3)}K)(R) \\= & {} (L(t)\circledast _{SO(3)}K)(R)(z). \end{aligned}$$

It follows that

$$\begin{aligned} \langle t,S_2(t)\rangle= & {} \frac{1}{V} \langle t, L^*(L(t)\circledast _{SO(3)} K)\rangle \\= & {} \frac{1}{V} \int _{\mathbb {R}^3}\langle L(t)(w),(L(t)\circledast _{SO(3)} K)(w)\rangle _{SO(3)}dw\\= & {} \frac{1}{V} \int _{\mathbb {R}^3}\langle L(t)(w),(L(t)(w)\circledast _{SO(3)} K)\rangle _{SO(3)}dw \ge 0. \end{aligned}$$

Thus, \(S_2(t)\) is semidefinite positive.

Moreover, the same type of computation shows that

$$\begin{aligned} \langle f,S_2(g)\rangle= & {} \frac{1}{V} \langle f, L^*(L(g)\circledast _{SO(3)} K)\rangle \\= & {} \frac{1}{V} \int _{\mathbb {R}^3}\langle L(f)(w),(L(g)(w)\circledast _{SO(3)} K)\rangle _{SO(3)}dw\\= & {} \frac{1}{V} \int _{\mathbb {R}^3}\langle (L(f)(w)\circledast _{SO(3)} K),L(g)(w)\rangle _{SO(3)}dw\\= & {} \frac{1}{V} \int _{\mathbb {R}^3}\langle L(g)(w),(L(f)(w)\circledast _{SO(3)} K)\rangle _{SO(3)}dw\\= & {} \frac{1}{V} \langle g, L^*(L(f)\circledast _{SO(3)} K)\rangle \\= & {} \langle g,S_2(f)\rangle , \end{aligned}$$

which proves that \(S_2\) is symmetric. This ends the proof of the theorem. \(\square \)

Note that Theorem 3.1 connects the problem of finding, at a given position x, the rotation R which gives a match between f and \(t_R\) at x with the problem of finding the dominant Z-eigenvalue-eigenvector pair by solving (25) with \(A=C_n(x)\in S^n(\mathbb {R}^4)\) and n even.

3.4 Finding the correct position

Although we could find the spatial positions of peaks by running an algorithm to find the dominant Z-eigenvalue-eigenvector pair at each and every voxel, this is fairly expensive with current decomposition algorithms for higher-degree tensors. However, the Frobenius norm of a tensor is related to its spectral norm, and in practice it turns out to be an excellent proxy for finding the spatial locations of peaks. Indeed, we know that \(C_n(x)\in S^n(\mathbb {R}^{d'})=S^n(\mathbb {R}^{4})\). Now, if \(\Vert T\Vert _{\sigma }\) denotes the spectral norm of the tensor T and \(\Vert T\Vert _F\) denotes its Frobenius norm, it is well known that the largest singular value of T equals its spectral norm, and that

$$\begin{aligned} \Vert T\Vert _{\sigma }\ge \Vert T\Vert _F\frac{1}{\sqrt{4^{n-1}}}= \Vert T\Vert _F\frac{1}{2^{n-1}} \end{aligned}$$

(see e.g., [4, 17]).

In fact, the connection between \(\Vert T\Vert _{\sigma }\) and \(\Vert T\Vert _F\) is stronger than just this inequality. As is well-known, every tensor is a finite sum of tensors of rank 1 (indeed, if the tensor is symmetric, the tensors of rank one can also be chosen symmetric) [6]. Moreover, if \(W_1\) is a tensor of rank 1 satisfying

$$\begin{aligned} \Vert T-W_1\Vert _F=E_1(T):=\min _{\text {rank}(W)=1}\Vert T-W\Vert _F \end{aligned}$$

then (see, e.g., [23])

$$\begin{aligned} \Vert W_1\Vert _F=\Vert T\Vert _{\sigma } \end{aligned}$$

and

$$\begin{aligned} E_1(T)^2= \Vert T\Vert _F^2-\Vert T\Vert _{\sigma }^2 \end{aligned}$$

Thus,

$$\begin{aligned} \Vert T\Vert _F^2=\Vert T\Vert _{\sigma }^2+E_1(T)^2. \end{aligned}$$

Hence, if \(E_1(T)\) is kept fixed, an increase in \(\Vert T\Vert _{\sigma }\) translates into an increase in \(\Vert T\Vert _F\), and vice versa.

Moreover, in 1938 Banach demonstrated (see [1]) that, for any symmetric tensor T,

$$\begin{aligned} \Vert T\Vert _{\sigma }=\max _{\Vert Q\Vert =1}\left| \langle T,Q^{\odot n}\rangle \right| = \max _{\Vert Q\Vert =1}\left| T\cdot Q^{\odot n} \right| . \end{aligned}$$

Thus, large \(\Vert T\Vert _F\) implies large spectral norm of T, and the spectral norm of \(C_n(x)\) is strongly connected to the optimization problem solved in Theorem 3.1, which justifies using the Frobenius norm of \(C_n(x)\) as a parameter to select positions x where a match is possible.
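A small worked check of these facts, on a hypothetical tensor \(T=a\,u^{\odot 4}+b\,w^{\odot 4}\) with orthonormal \(u\perp w\), for which \(\Vert T\Vert _{\sigma }=a\) and \(E_1(T)=b\):

```python
import numpy as np

def tpow4(v):
    return np.einsum('i,j,k,l->ijkl', v, v, v, v)   # v^{(.)4}

u = np.array([1.0, 0.0, 0.0, 0.0])
w = np.array([0.0, 1.0, 0.0, 0.0])
a, b = 3.0, 2.0
T = a * tpow4(u) + b * tpow4(w)

fro = np.linalg.norm(T.ravel())        # ||T||_F = sqrt(a^2 + b^2), since u ⟂ w

# Banach: ||T||_sigma = max_{|Q|=1} |T.Q^{(.)4}|; here T.Q^{(.)4} = a(u.Q)^4 + b(w.Q)^4
Qs = np.random.default_rng(4).standard_normal((20000, 4))
Qs /= np.linalg.norm(Qs, axis=1, keepdims=True)
vals = np.abs(a * (Qs @ u) ** 4 + b * (Qs @ w) ** 4)
assert vals.max() <= a + 1e-9                          # the max is ||T||_sigma = a

sigma, E1 = a, b                  # best rank-1 approx is a*u^{(.)4}, error b
assert abs(fro ** 2 - (sigma ** 2 + E1 ** 2)) < 1e-9   # ||T||_F^2 = ||T||_sigma^2 + E_1^2
assert sigma >= fro / 2 ** 3                           # ||T||_sigma >= ||T||_F / 2^{n-1}
```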

For each position x identified as a potential peak, the SS-HOPM algorithm (see [15, 16, 23] for precise definition and implementation of this algorithm) is used to find the exact dominant Z-eigenvalue and its associated Z-eigenvector, which is the rotation R candidate to give a match at x.

We have just explained a heuristic to locate the positions -and, after that, the rotations- where a match is possible. Now, sometimes a false positive may occur. Indeed, in the previous subsection we showed that the tensor-based correlation function \(\langle C_n(x),Q^{\odot n}\rangle \) can be seen as using a slightly different degenerate inner product, based on the operator \(S'\) from the proof of Theorem 3.1 rather than on S. Concretely, we proved that

$$\begin{aligned} \langle C_n(x),Q^{\odot n}\rangle = w(x)\langle \tau _x(f), t_Q \rangle _{S'} \end{aligned}$$

where \(w(x)=\frac{1}{\Vert P_S(\tau _x(f))\Vert _S}\), \(\Vert t_R\Vert _S=1\) and \(t_R=P_S(t_R)\). This implies that the relation \(-1\le \langle C_n(x),Q^{\odot n}\rangle \le 1\) does not necessarily hold because the normalizations were taken in terms of S instead of \(S'\). Taking \(S'\) into account would lead to the equality

$$\begin{aligned} \langle C_n(x),Q^{\odot n}\rangle = \frac{\langle \tau _x(f), t_Q \rangle _{S'} }{\Vert \tau _x(f)\Vert _{S'}\Vert t_Q\Vert _{S'}} w(x) \Vert \tau _x(f)\Vert _{S'}\Vert t_Q\Vert _{S'} \end{aligned}$$

where \(-1\le \frac{\langle \tau _x(f), t_Q \rangle _{S'} }{\Vert \tau _x(f)\Vert _{S'}\Vert t_Q\Vert _{S'}}\le 1\) (and it is equal to 1 when we have a match).

So what is the impact of this? First of all, observe that the operation that is missing from S in the normalization is effectively a kind of convolution, so that its effect on the constant component of an image is to scale it. Consequently, if an image is S-orthogonal to \(\textbf{1}\), it will also be \(S'\)-orthogonal to \(\textbf{1}\). However, the norms are affected.

For the template t, this means that the normalization is off by a certain factor, but this factor is the same everywhere. For the image f, the impact is less benign though, as \(\Vert \tau _x(f)\Vert _S\) will differ from \(\Vert \tau _x(f)\Vert _{S'}\) in a nonuniform way.

When would this shortcoming cause a false positive? For this to happen, the normalization factor used at a non-match position would have to be much higher than the “correct” normalization factor, and/or the normalization factor would have to be too low at a match position. Since the difference between S and \(S'\) is essentially a smoothing operation, and the normalization factor is the reciprocal of the norm of the projected image, the image would thus have to be (very) smooth at the non-match position, while exhibiting a lot of high-frequency energy around the match position. Such a situation is not impossible, but it would at the very least be unusual in the context of a typical application like the analysis of electron microscopy images.

4 Proof of Lemma 3.2

Let us start recalling the formulae associated to Fourier expansions in hyperspherical harmonics on the sphere \(S^3\). The parametrization of the sphere we consider is the following one:

$$\begin{aligned} \left\{ \begin{array}{llll} a &{} =&{} \cos \theta \\ b &{} =&{} \sin \theta \cos \phi \\ c &{} =&{} \sin \theta \sin \phi \cos \varphi \\ d &{} =&{} \sin \theta \sin \phi \sin \varphi \\ \end{array}\right. \quad \text {with} \quad \left\{ \begin{array}{llll} 0 \le \theta \le \pi \\ 0 \le \phi \le \pi \\ 0 \le \varphi < 2\pi \\ \end{array}\right. \end{aligned}$$

where \((a,b,c,d)\in S^3\) is identified with the unit quaternion \(Q=a+b\textbf{i}+c\textbf{j}+d\textbf{k}\), which represents a rotation of three dimensional Euclidean space \(\mathbb {R}^3\). The volume element (used for integration on \(S^3\) and, henceforth, also in SO(3)) is then given by

$$\begin{aligned} dV=\sin ^2\theta \sin \phi d\theta d\phi d\varphi . \end{aligned}$$

Then every function \(f\in L^2(S^3)\) can be decomposed as

$$\begin{aligned} f(\theta ,\phi ,\varphi )=\sum _{\ell =0}^{\infty } \sum _{k_2=-\ell }^{\ell } \sum _{k_1=|k_2|}^{\ell } {\hat{f}}(\ell ,(k_1,k_2))\Xi _{(k_1,k_2)}^{\ell } (\theta ,\phi ,\varphi ), \end{aligned}$$

where \(\{\Xi _{(k_1,k_2)}^{\ell }\}\) denotes the orthonormal basis of \(L^2(S^3)\) formed by the hyperspherical harmonics and \({\hat{f}}(\ell ,(k_1,k_2))=\langle f, \Xi _{(k_1,k_2)}^{\ell }\rangle _{S^3}\) are the Fourier coefficients of f in this basis. We want to prove that \({\hat{K}}(\ell ,(0,0))\ge 0\) for all \(\ell \). Now, \(K(Q)=(\text {Re}(Q))^n=a^n=(\cos \theta )^n\) and

$$\begin{aligned} \Xi _{(0,0)}^{\ell } = A_{(0,0)}^{\ell } C_{\ell }^{1}(\cos \theta ), \end{aligned}$$

where \(A_{(0,0)}^{\ell }\) is a positive constant and \(C_{\ell }^{\lambda }(t)\) denotes the Gegenbauer polynomial of degree \(\ell \), which, for \(\lambda =1\), appears as the \(\ell \)-th Taylor coefficient in the expansion \((1-2tz+z^2)^{-1}= \sum _{\ell =0}^{\infty } C_{\ell }^{1}(t)z^{\ell }\). It is well known that \(C_{\ell }^{1}(t)=U_{\ell }(t)\) (the \(\ell \)-th Chebyshev polynomial of the second kind) and that \(U_{\ell }(\cos \theta )=\frac{\sin ((\ell +1) \theta )}{\sin \theta }\). Hence \(\Xi _{(0,0)}^{\ell } = A_{(0,0)}^{\ell } \frac{\sin ((\ell +1) \theta )}{\sin \theta }\) and

$$\begin{aligned} {\hat{K}}(\ell ,(0,0)) &= A_{(0,0)}^{\ell } \left\langle (\cos \theta )^n, \frac{\sin ((\ell +1) \theta )}{\sin \theta }\right\rangle _{S^3} \\ &= A_{(0,0)}^{\ell } \int _{0}^{\pi } \int _{0}^{\pi } \int _{0}^{2\pi } (\cos \theta )^n \frac{\sin ((\ell +1) \theta )}{\sin \theta } \sin ^2\theta \sin \phi \, d\theta \, d\phi \, d\varphi \\ &= A_{(0,0)}^{\ell } \left( \int _{0}^{\pi } (\cos \theta )^n \frac{\sin ((\ell +1) \theta )}{\sin \theta } \sin ^2\theta \, d\theta \right) \left( \int _{0}^{\pi } \sin \phi \, d\phi \right) \left( \int _{0}^{2\pi } d\varphi \right) \\ &= 4\pi A_{(0,0)}^{\ell } \int _{0}^{\pi } (\cos \theta )^n \frac{\sin ((\ell +1) \theta )}{\sin \theta } \sin ^2\theta \, d\theta \\ &= 4\pi A_{(0,0)}^{\ell } \int _{0}^{\pi } (\cos \theta )^n \sin ((\ell +1) \theta )\sin \theta \, d\theta . \end{aligned}$$
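The Chebyshev identity \(U_{\ell }(\cos \theta )=\sin ((\ell +1)\theta )/\sin \theta \) used above is easy to confirm numerically via the standard three-term recurrence \(U_{\ell +1}(t)=2tU_{\ell }(t)-U_{\ell -1}(t)\); a minimal sketch:

```python
import math

def chebyshev_u(ell, t):
    """Chebyshev polynomial of the second kind via the three-term recurrence
    U_0(t) = 1, U_1(t) = 2t, U_{l+1}(t) = 2t*U_l(t) - U_{l-1}(t)."""
    if ell == 0:
        return 1.0
    u_prev, u = 1.0, 2.0 * t
    for _ in range(ell - 1):
        u_prev, u = u, 2.0 * t * u - u_prev
    return u

# Compare with sin((ell+1)*theta)/sin(theta) on a grid away from sin(theta) = 0
for ell in range(6):
    for i in range(1, 10):
        theta = i * math.pi / 10
        lhs = chebyshev_u(ell, math.cos(theta))
        rhs = math.sin((ell + 1) * theta) / math.sin(theta)
        assert abs(lhs - rhs) < 1e-9
```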

To estimate the integral above, we use a few trigonometric identities, together with the hypothesis that n is even. Concretely, since n is even, n/2 is an integer and \((\cos \theta )^n=(\cos (\pi -\theta ))^n\). Moreover, for \(\ell \) odd, we have that

$$\begin{aligned} \sin ((\ell +1) \theta )\sin (\theta )=-\sin ((\ell +1) (\pi -\theta ))\sin (\pi -\theta ), \end{aligned}$$

so the integrand is antisymmetric under the change of variables \(\theta \mapsto \pi -\theta \), and the integral vanishes for \(\ell \in 2\mathbb {N}+1\).
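This antisymmetry can be checked pointwise; a small numeric sketch (even n, odd \(\ell \)):

```python
import math

def integrand(n, ell, theta):
    """The integrand cos(theta)^n * sin((ell+1)*theta) * sin(theta) from above."""
    return math.cos(theta) ** n * math.sin((ell + 1) * theta) * math.sin(theta)

# For even n and odd ell the integrand is antisymmetric about theta = pi/2,
# hence its integral over [0, pi] is 0.
for n in (2, 4, 6):
    for ell in (1, 3, 5):
        for i in range(1, 20):
            theta = i * math.pi / 20
            assert abs(integrand(n, ell, math.pi - theta) + integrand(n, ell, theta)) < 1e-12
```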

Assume \(\ell \in 2\mathbb {N}\). Then

$$\begin{aligned} (\cos \theta )^n &= \left( \frac{e^{i\theta }+e^{-i\theta }}{2}\right) ^n =\frac{1}{2^n}\sum _{k=0}^n\left( {\begin{array}{c}n\\ k\end{array}}\right) e^{i\theta (n-k)}e^{-i\theta k} = \frac{1}{2^n}\sum _{k=0}^n\left( {\begin{array}{c}n\\ k\end{array}}\right) e^{i\theta (n-2k)}\\ &= \frac{1}{2^n}\left[ \sum _{s=0}^{n/2}\left( {\begin{array}{c}n\\ \frac{n}{2}-s\end{array}}\right) e^{i\theta (n-2(\frac{n}{2}-s))} + \sum _{s=1}^{n/2}\left( {\begin{array}{c}n\\ \frac{n}{2}+s\end{array}}\right) e^{i\theta (n-2(\frac{n}{2}+s))} \right] \\ &= \frac{1}{2^n}\left[ \left( {\begin{array}{c}n\\ \frac{n}{2}\end{array}}\right) +\sum _{s=1}^{n/2}\left( {\begin{array}{c}n\\ \frac{n}{2}-s\end{array}}\right) e^{2i\theta s} + \sum _{s=1}^{n/2}\left( {\begin{array}{c}n\\ \frac{n}{2}+s\end{array}}\right) e^{-2i\theta s} \right] \\ &= \frac{1}{2^n}\left[ \left( {\begin{array}{c}n\\ \frac{n}{2}\end{array}}\right) +2 \sum _{s=1}^{n/2}\left( {\begin{array}{c}n\\ \frac{n}{2}-s\end{array}}\right) \frac{e^{2i\theta s} + e^{-2i\theta s}}{2} \right] = \frac{1}{2^n}\left[ \left( {\begin{array}{c}n\\ \frac{n}{2}\end{array}}\right) +2 \sum _{s=1}^{n/2}\left( {\begin{array}{c}n\\ \frac{n}{2}-s\end{array}}\right) \cos (2\theta s) \right] \\ &= \frac{1}{2^n}\left[ \left( {\begin{array}{c}n\\ \frac{n}{2}\end{array}}\right) +2 \sum _{k=0}^{n/2-1}\left( {\begin{array}{c}n\\ k\end{array}}\right) \cos (\theta (n-2k)) \right] \end{aligned}$$

(for the last line, set \(k=n/2-s\), so that k runs from 0 to \(n/2-1\) and \(2s=n-2k\)). Moreover, it is well known that
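The final expansion can be verified numerically; a brief sketch assuming even n, where the half-range sum is taken over \(k=0,\dots ,n/2-1\), the index range that reproduces \((\cos \theta )^n\) exactly:

```python
import math

def cos_power_expansion(n, theta):
    """Evaluate 2^{-n} * [C(n, n/2) + 2 * sum_{k=0}^{n/2-1} C(n, k) * cos((n-2k)*theta)]
    for even n; this should reproduce cos(theta)^n exactly."""
    assert n % 2 == 0
    total = math.comb(n, n // 2)
    for k in range(n // 2):  # k = 0, 1, ..., n/2 - 1
        total += 2 * math.comb(n, k) * math.cos((n - 2 * k) * theta)
    return total / 2 ** n

# Agreement with cos(theta)^n on a grid, for several even n
for n in (2, 4, 8):
    for i in range(50):
        theta = i * math.pi / 49
        assert abs(cos_power_expansion(n, theta) - math.cos(theta) ** n) < 1e-12
```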

$$\begin{aligned} \sin (\theta )\sin ((\ell +1)\theta )= \frac{1}{2}(\cos (\ell \theta )-\cos ((\ell +2)\theta )), \end{aligned}$$

so that, by a direct substitution in the formula defining \({\hat{K}}(\ell ,(0,0))\), we get

$$\begin{aligned} \frac{{\hat{K}}(\ell ,(0,0))}{4\pi A_{(0,0)}^{\ell }} &= \int _{0}^{\pi } \frac{1}{2}(\cos (\ell \theta )-\cos ((\ell +2)\theta )) \cdot \frac{1}{2^n}\left[ \left( {\begin{array}{c}n\\ \frac{n}{2}\end{array}}\right) +2 \sum _{k=0}^{n/2-1}\left( {\begin{array}{c}n\\ k\end{array}}\right) \cos (\theta (n-2k)) \right] d\theta \\ &= \frac{1}{2^{n+1}} \int _{0}^{\pi } (\cos (\ell \theta )-\cos ((\ell +2)\theta )) \left[ \left( {\begin{array}{c}n\\ \frac{n}{2}\end{array}}\right) +2 \sum _{k=0}^{n/2-1}\left( {\begin{array}{c}n\\ k\end{array}}\right) \cos (\theta (n-2k)) \right] d\theta . \end{aligned}$$

We can now use that

$$\begin{aligned} \cos (x)\cos (y)=\frac{1}{2}(\cos (x+y)+\cos (x-y)) \end{aligned}$$

to claim that

$$\begin{aligned} \frac{{\hat{K}}(\ell ,(0,0))}{4\pi A_{(0,0)}^{\ell }} &= \frac{1}{2^{n+1}} \int _{0}^{\pi } \left( {\begin{array}{c}n\\ \frac{n}{2}\end{array}}\right) (\cos (\ell \theta )-\cos ((\ell +2)\theta ))\, d\theta \\ &\quad +\frac{1}{2^{n}}\int _0^\pi \sum _{k=0}^{n/2-1}\left( {\begin{array}{c}n\\ k\end{array}}\right) (\cos (\ell \theta )-\cos ((\ell +2)\theta )) \cos (\theta (n-2k))\, d\theta \\ &= \frac{1}{2^{n+1}} \int _{0}^{\pi } \left( {\begin{array}{c}n\\ \frac{n}{2}\end{array}}\right) (\cos (\ell \theta )-\cos ((\ell +2)\theta ))\, d\theta \\ &\quad +\frac{1}{2^{n+1}}\int _0^\pi \sum _{k=0}^{n/2-1}\left( {\begin{array}{c}n\\ k\end{array}}\right) [\cos (\theta (\ell +n-2k)) + \cos (\theta (\ell +2k-n)) \\ &\qquad -\cos (\theta (\ell +2+n-2k))-\cos (\theta (\ell +2+2k-n))]\, d\theta \end{aligned}$$

The parity of \(\ell \) and n implies that all the integer factors multiplying the variable \(\theta \) inside the cosine functions are even. Hence the corresponding integrals (on \([0,\pi ]\)) vanish, except when the factor itself is 0. In that case, \(\cos (0)=1\), so only the cosine functions preceded by a minus sign in the formula can contribute a negative quantity to the integral. For \(\ell =0\), nonnegativity is immediate, since \({\hat{K}}(0,(0,0))=4\pi A_{(0,0)}^{0}\int _0^\pi (\cos \theta )^n\sin ^2\theta \,d\theta \ge 0\); hence we may assume \(\ell \ge 2\). Now, \(\ell +2+n-2k=0\) would imply \(2k=n+\ell +2>n\), so that \(k>n/2\), which is impossible since the sum’s range goes from \(k=0\) to \(k=n/2-1\). This means that the term \(-\cos (\theta (\ell +2+n-2k))\) never contributes a negative quantity to the sum. On the other hand, if the cosine function with factor \(\ell +2+2k-n\) contributes, which means that \(\ell +2+2k-n=0\), then \(k=(n-\ell -2)/2<n/2\). Taking \(k^*=k+1=(n-\ell )/2\), we have that \(1\le k^*\le n/2-1\) (because \(\ell \ge 2\)), so that \(\cos (\theta (\ell +2k^*-n))= \cos (\theta (\ell +2k+2-n))=\cos (0)=1\) and the corresponding term effectively appears in the sum. In particular, adding these two terms of the sum we get

$$\begin{aligned} &\frac{1}{2^{n+1}}\int _0^\pi \left[ \left( {\begin{array}{c}n\\ k^*\end{array}}\right) \cos (\theta (\ell +2k^*-n)) -\left( {\begin{array}{c}n\\ k\end{array}}\right) \cos (\theta (\ell +2+2k-n))\right] d\theta \\ &\quad = \frac{1}{2^{n+1}}\int _0^\pi \left[ \left( {\begin{array}{c}n\\ k+1\end{array}}\right) -\left( {\begin{array}{c}n\\ k\end{array}}\right) \right] d\theta = \frac{\pi }{2^{n+1}}\left( \left( {\begin{array}{c}n\\ k+1\end{array}}\right) -\left( {\begin{array}{c}n\\ k\end{array}}\right) \right) >0 \end{aligned}$$

since n is even and \(k<n/2\), so that \(\left( {\begin{array}{c}n\\ k+1\end{array}}\right) >\left( {\begin{array}{c}n\\ k\end{array}}\right) \). This ends the proof of Lemma 3.2. \(\square \)
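As a closing numerical check of Lemma 3.2 (not part of the proof), the reduced coefficient \(\int _0^{\pi }(\cos \theta )^n\sin ((\ell +1)\theta )\sin \theta \,d\theta \), which equals \({\hat{K}}(\ell ,(0,0))\) up to the positive factor \(4\pi A_{(0,0)}^{\ell }\), can be evaluated for several even n and a range of \(\ell \); all values come out nonnegative, as the lemma asserts. A minimal sketch:

```python
import math

def reduced_coeff(n, ell, steps=4000):
    """Midpoint-rule value of the integral
    int_0^pi cos(theta)^n * sin((ell+1)*theta) * sin(theta) dtheta,
    equal to K_hat(ell, (0,0)) up to the positive factor 4*pi*A."""
    h = math.pi / steps
    return h * sum(
        math.cos((i + 0.5) * h) ** n
        * math.sin((ell + 1) * (i + 0.5) * h)
        * math.sin((i + 0.5) * h)
        for i in range(steps)
    )

# Nonnegative (up to quadrature error) for several even n and a range of ell
for n in (2, 4, 6, 8):
    for ell in range(n + 5):
        assert reduced_coeff(n, ell) > -1e-8
```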

5 Conclusions

We have presented the mathematical foundations of classical template matching with rotations. Moreover, an alternative to the classical algorithm, named tensorial template matching (TTM), has been introduced. TTM integrates the information of all rotated versions of a template t into a unique symmetric tensor template T, which is computed only once per template. The main theorem of the paper, Theorem 3.1, shows that finding an exact match between an image f and a rotated version \(t_R\) of the template t at a given position x is equivalent to finding a best rank-1 approximation (in the Frobenius norm) of a certain tensor \(C_n(x)\). The resulting algorithm has a reduced computational complexity when compared to the classical one: TTM finds the positions and rotations of the instances of the template in any tomogram with just a few correlations, one per linearly independent component of T. In particular, cryo-electron tomography (3D images) for macromolecular detection requires 7112, 45,123 and 553,680 rotations to achieve an angular accuracy of 13\(^\circ \), 7\(^\circ \) and 3\(^\circ \), respectively [5]. Therefore, considering degree-4 tensors (35 linearly independent components), the potential speed-up of our approach with respect to TM is about 203\(\times \), 1289\(\times \) and 15,819\(\times \) in these cases, while the angular accuracy of TTM remains constant, limited only by the computation of the tensorial template.
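The speed-up figures quoted above follow from dividing the number of TM rotations by the 35 linearly independent components of a degree-4 symmetric tensor over \(\mathbb {R}^4\) (quaternion space); a quick arithmetic sketch of this reading (the ratio \(N_{\text {rot}}/35\) is our assumption of how the figures are derived):

```python
import math

# A symmetric tensor of degree 4 over R^4 (unit quaternions live in R^4) has
# C(4 + 4 - 1, 4) = 35 linearly independent components.
components = math.comb(4 + 4 - 1, 4)
assert components == 35

# Angular accuracy (degrees) -> number of rotations required by classical TM [5]
rotations = {13: 7112, 7: 45123, 3: 553680}

for degrees, n_rot in rotations.items():
    print(f"{degrees} deg: {n_rot} rotations -> ~{n_rot / components:.0f}x potential speed-up")
```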

In [20], we develop a practical implementation of TTM, showing on both synthetic and real data that our method is able to find template instances and determine their rotations with a computational complexity that is independent of the rotation accuracy.