Abstract
Normalized cross-correlation is the reference approach for template matching on images. When it is computed in Fourier space, it handles template translations efficiently, but it cannot do the same with template rotations: including rotations requires sampling the whole space of rotations and repeating the computation of the correlation for each sample. This article develops an alternative mathematical theory that handles rotations and translations efficiently at the same time. Our proposal has a reduced computational complexity because it does not require repeatedly sampling the space of rotations. To achieve this, we integrate the information of all rotated versions of the template into a unique symmetric tensor template, which is computed only once per template. Afterward, we demonstrate that the correlation between the image to be processed and the linearly independent components of the tensor template contains enough information to recover the positions and rotations of the template instances. Our proposed method has the potential to speed up conventional template matching computations by several orders of magnitude for the case of 3D images.
1 Introduction
A classical problem in image processing and, particularly, in pattern recognition, is to determine whether a large image contains copies (and, if so, how many, and at which locations and orientations) of a small image, named “template”. The resulting algorithms are generically known as template matching algorithms [3, 13, 14]. The most classical solution is based on cross-correlations, although there are other approaches based, for example, on metaheuristic algorithms [7] or on deep learning [9, 18, 21]. In this paper, we show the mathematical foundations of the cross-correlation-based template matching algorithm (TM in all that follows), and we introduce a new fast algorithm that solves the problem using tensors.
The main advantages of TM, when compared to algorithms based on machine learning, are that TM is a white-box model, it is directly applicable when only one template and one larger image are available (requiring no training, which may be a very difficult task in some applications), and it locates rotations with arbitrary precision. Current deep-learning-based algorithms for template matching in three-dimensional images are not able to estimate rotations accurately [18].
On the other hand, a major drawback of TM is its computational cost. The basic idea of TM is to compute the inner product between the (rotated) template and the (translated) image, and normalize the result. These computations are made, for each rotation, in the Fourier domain to address translations efficiently [2, 19, 24]. However, this process has to be repeated for every rotation to be investigated, so the resulting complexity depends on the number of rotations processed. The computational cost of this process may become restrictive for 3D images, since SO(3), the space of rotations of \(\mathbb {R}^3\), is a (compact) manifold of dimension 3. In application domains such as cryo-electron microscopy, more than ten thousand rotations are required to achieve an angular precision of a few degrees. An alternative approach is to apply steerable filters to compute the correlation at different rotations efficiently. However, although a new method has been developed to generate steerable filters for any arbitrary kernel in 2D [12], there is no solution yet for 3D images.
We propose an algorithm called tensorial template matching, TTM, which integrates the information relative to the template in all rotations into a unique symmetric tensor. In other words, the tensor template incorporates in a unique object the information about all rotations of the template, thus allowing us to find the position and rotation of instances of the template in any tomogram with just a few correlations with the linearly independent components of the tensor. The tensor template is computed only once per template and, once it is generated, it makes it possible to process any image.
2 Classical template matching
Let us introduce some notation. d-dimensional images are just elements of \(L^2(\mathbb {R}^d)\), which is a Hilbert space with the inner product \(\langle f,g\rangle =\int _{\mathbb {R}^d}f(x)g(x)dx\). It is natural to use the inner product to compare two images f, g of the same size. Concretely, we can use that \(\langle f,g\rangle =\Vert f\Vert _2\Vert g\Vert _2\cos \theta \), where \(\theta \) is the angle formed by f and g. In particular, \(f=\alpha g\) for some positive constant \(\alpha \) if \(\frac{\langle f,g\rangle }{\Vert f\Vert _2\Vert g\Vert _2}=1\).
Template matching is typically used to study whether instances of a “small” image t (the template) are present in a larger image f, e.g. to look for instances of a specific macromolecule in a cryo-electron tomogram (3D volumetric image). The size of an image is connected to the set of points where the image does not vanish, the support of the image. That is, t is considered “small” when the set \(K=supp(t)=\overline{\{x:t(x)\ne 0\}}^{\mathbb {R}^d}\) is small (e.g., a subset of a small ball \(\mathbb {D}\)). Let us assume that f and t have quite different sizes, so our interest is to compare t (the template, the small image) with just a part of f. In that case we need to introduce some special operators \(S:L^2(\mathbb {R}^d)\rightarrow L^2(\mathbb {R}^d)\) that fix our attention on just a part of the domain of f. An interesting example of such operators is
where \(U:\mathbb {R}\rightarrow \mathbb {R}\) denotes Heaviside’s unit step function and \(r>0\). If the support of the template t is \(\mathbb {D}_{\textbf{0}}(r)=\{x:\Vert x\Vert _2\le r\}\), the ball of radius r centered at \(\textbf{0}\in \mathbb {R}^d\), the normalized inner product
informs about the similarity between t and the restriction of f to \(\mathbb {D}_{\textbf{0}}(r)\). Moreover, if we introduce the translation operator \(\tau _{x}: L^2(\mathbb {R}^d)\rightarrow L^2(\mathbb {R}^d)\), \(\tau _{x}(f)(z)=f(z+x)\) and compute
the result informs about the similarity between t and the restriction of f to \(\mathbb {D}_{x}(r)=\{z:\Vert x-z\Vert _2\le r\}\). Of course, it may happen that f contains a copy of a rotated version of t, so rotations are also necessary for a complete discussion of the problem. Thus, given \(R\in SO(d)\), we define the operator \(O_R: L^2(\mathbb {R}^d)\rightarrow L^2(\mathbb {R}^d)\), \(O_R(t)(z)=t(Rz)\), and for \(t\in L^2(\mathbb {R}^d)\), we define a rotated version of t,
The normalized inner product
informs about the similarity of \(t_R\) and the restriction of f to \(\mathbb {D}_{x}(r)\). It is important to notice that \(\Vert t\Vert _2=\Vert t_R\Vert _2\).
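To make the discussion concrete, here is a minimal numpy sketch (the image, template, position, and radius are invented for the example) that evaluates the normalized inner product between a template t and the restriction of f to the ball \(\mathbb {D}_{x}(r)\), using a Heaviside-style ball mask as the operator \(S_r\):

```python
import numpy as np

def ball_mask(shape, center, r):
    """Indicator of the ball D_x(r) on the pixel grid (discrete Heaviside mask)."""
    grid = np.indices(shape)
    dist2 = sum((g - c) ** 2 for g, c in zip(grid, center))
    return (dist2 <= r * r).astype(float)

def local_score(f, t, x, r):
    """Normalized inner product between t and the restriction of f to D_x(r)."""
    m = ball_mask(f.shape, x, r)
    patch = f * m                            # S_r applied to the image around x
    tt = np.zeros_like(f)                    # embed t, centered at x, in the full grid
    h = np.array(t.shape) // 2
    sl = tuple(slice(c - hh, c - hh + s) for c, hh, s in zip(x, h, t.shape))
    tt[sl] = t
    tt *= m
    num = np.sum(patch * tt)
    den = np.linalg.norm(patch) * np.linalg.norm(tt)
    return num / den if den > 0 else 0.0

rng = np.random.default_rng(0)
t = rng.random((7, 7))                       # template supported in a small ball
f = np.zeros((32, 32))
f[10:17, 10:17] = t                          # plant an exact copy centered at (13, 13)
print(local_score(f, t, (13, 13), 5))        # close to 1 at the true position
```

A score of 1 signals a perfect local match, as in the identity \(\langle f,g\rangle =\Vert f\Vert _2\Vert g\Vert _2\cos \theta \) with \(\theta =0\).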
The operator \(S_r\) defined by (1) has some special properties. Concretely, it is symmetric, semidefinite positive, and commutes with rotations. Recall that, given \((X,\langle \cdot ,\cdot \rangle _X)\) a (real) inner product space,Footnote 1 an operator \(S:X\rightarrow X\) is named:
- Symmetric (also named self-adjoint) if
  $$\begin{aligned} \langle f,S(g)\rangle _X= \langle S(f),g\rangle _X \quad \text {for all } f,g\in X \end{aligned}$$
- Semidefinite positive, if
  $$\begin{aligned} \langle f,S(f)\rangle _X\ge 0\quad \text {for all } f\in X \end{aligned}$$
- Definite positive, if it is semidefinite positive and \(\langle f,S(f)\rangle =0\) implies \(f=0\).
If \(S:X\rightarrow X\) is a symmetric semidefinite positive operator (SSP, in all what follows), then X becomes a semi-normed space with the inner product
and the seminorm
Observe, for example, that if S is given by (1), then \(\Vert f\Vert _S=0\) means that \(f_{|\mathbb {D}_{\textbf{0}}(r)}=0\) almost everywhere.
Theorem 2.1
Let X be an inner product space, \(S:X\rightarrow X\) be an SSP operator, and consider the inner product given by (3). Then
- \(\mathrm{(a)}\): \( \langle f,g\rangle _S\le \left| \langle f,g\rangle _S\right| \le \Vert f\Vert _S\Vert g\Vert _S\) for all \(f,g\in L^2(\mathbb {R}^d)\).

Moreover, if \(f,g\in L^2(\mathbb {R}^d)\), \(\Vert g\Vert _S\ne 0\), the following are equivalent statements:

- \(\mathrm{(b)}\): \(\langle f,g\rangle _S=\Vert f\Vert _S\Vert g\Vert _S\).
- \(\mathrm{(c)}\): \(\Vert f-\frac{\Vert f\Vert _S}{\Vert g\Vert _S}g\Vert _S=0\).
Proof
As S is SSP, we have that, for all \(\alpha \in \mathbb {R}\),
Hence, if \(\Vert g\Vert _S\ne 0\), the only way that the quadratic polynomial (in \(\alpha \)) above is nonnegative everywhere is that
which is equivalent to
On the other hand, if \(\Vert g\Vert _S=0\), the only way to satisfy (5) is \(\langle f,g\rangle _S=0\), in whose case (6) also holds. This proves (a).
Let us now demonstrate \((b)\Leftrightarrow (c)\) whenever \(\Vert g\Vert _S\ne 0\). Indeed, (c) is equivalent to
which holds if and only if
Thus \((b)\Leftrightarrow (c)\). \(\square \)
Note that, if \(f,t\in L^2(\mathbb {R}^d)\) are two images, \(\alpha >0\), and we take \(S=S_r\) given by (1), then \(\Vert \tau _xf-\alpha O_{R^{-1}}(t)\Vert _S=0\) means that f has a match with \(t_R\) in the ball of radius r centered at x. Indeed, there are many ways to define operators S with the property that \(\Vert f-g\Vert _S=0\) means that \(f=g\) in a neighbourhood of \(\textbf{0}\), so that \(\Vert \tau _xf-\alpha O_{R^{-1}}(t)\Vert _S=0\) means that f has a match with a rotated version of t in a neighbourhood of \(\textbf{0}\). Although arbitrary SSP operators may not enjoy this property, they provide a general framework to deal with this kind of operator.
Thus, in all what follows, we assume that X is a vector subspace of \(L^2(\mathbb {R}^d)\) endowed with the inner product that it inherits from \(L^2(\mathbb {R}^d)\), \(S:X\rightarrow X\) is an SSP operator, and \(\Vert \textbf{1}\Vert _S^2=\langle \textbf{1},\textbf{1}\rangle _S>0\), where \(\textbf{1}(x)=1\) is the constant image.Footnote 2
Rotations and composition of operators will play an important role in this paper. Thus, it is natural to ask how the composition of rotations acts on the images. This is, indeed, a simple computation:
Hence
and
Given an image f, we consider its projection onto the space of images which are \(S-\) orthogonal to the constant image \(\textbf{1}\),
Remark 1
These projections are important to obtain invariance with respect to constant brightness changes of the images, under translations and rotations. Note that there is no “real” difference between an image f and the images of the form \(f+\alpha \textbf{1}\), \(\alpha \in \mathbb {R}\). When we modify the constant \(\alpha \), what we observe is a uniform change in the density or the brightness, but not the appearance of new structures or forms in the image f. Thus, f and its projection \(P_S(f)\) essentially represent the very same image, since \(f=P_S(f)+\alpha \textbf{1}\) for a certain \(\alpha \in \mathbb {R}\).
Given two images f, t, we have that
Hence
since \(P_S(f),P_S(t) \perp _S \textbf{1}\). Consequently, if \(x\in \mathbb {R}^d\) and \(R\in SO(d)\), there are two constants \(\rho =\rho (x)\) and \(\delta =\delta (R)\) such that
Assume that S commutes with rotations, and take \(x\in \mathbb {R}^d\) fixed. Then, for each \(R\in SO(d)\) we have that
since \(O_{R^{-1}}(\textbf{1})=\textbf{1}\). Moreover
(just take \(v=R^{-1}u\) and use that \(\det R=1\) )
(since \(O_R\circ S= S\circ O_R\))
(since \(O_R(\textbf{1})=\textbf{1}\) )
Hence \(O_{R^{-1}}(P_S(t))\perp _S \textbf{1}\) and this, in conjunction with (10), implies that
Hence
is an S-orthogonal decomposition of \(t_R\), which means that the constant \(\beta \) that multiplies \(\textbf{1}\) in the S-orthogonal decomposition of \(t_R\) does not depend on R, and
In particular, for each \(x\in \mathbb {R}^d\), the problems:
- Maximize \(\langle \tau _x(f),t_R\rangle _S \) over rotations R.
- Maximize \(\langle P_S(\tau _x(f)),P_S(t)_R\rangle _S \) over rotations R.
- Maximize \(\langle P_S(\tau _x(f)),P_S(t_R)\rangle _S \) over rotations R.
are equivalent.
Let us define:
Lemma 2.2
If S is an SSP operator that commutes with rotations, the parameter \(\delta \) that appears in the S-orthogonal decomposition
does not depend on R. Consequently, given \(x\in \mathbb {R}^d\), the problems
- Maximize \(\langle f_{-x,R^{-1}},t \rangle _S \) over rotations R.
- Maximize \(\langle P_S(f_{-x,R^{-1}}),P_S(t)\rangle _S \) over rotations R.
are equivalent.
Proof
We know that \(\delta =\frac{\langle f_{-x,R^{-1}},\textbf{1} \rangle _S}{\langle \textbf{1},\textbf{1} \rangle _S}\), so that we only need to prove that \(\langle f_{-x,R^{-1}},\textbf{1} \rangle _S\) does not depend on R. Indeed,
(Make the change of variable \(z=Ry\) )
(since S commutes with \(O_{R^{-1}}\) )
(since \(O_{R^{-1}}(\textbf{1})=\textbf{1}\) )
\(\square \)
We can now state and demonstrate the following:
Theorem 2.3
(Classical template matching) Let S be an SSP operator which commutes with rotations and let \(x\in \mathbb {R}^d\) be fixed. Then the following are equivalent problems:
- \(\mathrm{(a)}\): Maximize \(\langle f_{-x,R^{-1}},t \rangle _S\) over rotations R.
- \(\mathrm{(b)}\): Maximize \(\langle \tau _x(f), t_R \rangle _S\) over rotations R.
- \(\mathrm{(c)}\): Maximize \(\langle P_S(\tau _x(f)), P_S(t_R) \rangle _S\) over rotations R.
- \(\mathrm{(d)}\): Maximize \(\langle P_S(f_{-x,R^{-1}}),P_S(t) \rangle _S \) over rotations R.
Moreover, if \(\Vert t\Vert _S>0\) and S also has the property that \(\Vert f\Vert _S=0\) implies \(f_{|\textbf{D}}=0\) for a certain neighborhood \(\textbf{D}\) of \(\textbf{0}\in \mathbb {R}^d\) containing the supports of all the rotated templates \(t_Q\) with \(Q\in SO(d)\), then a match between f and \(t_R\) at x is obtained whenever any one of the following claims holds:
- \((\textrm{a}^*)\): \(\frac{\langle f_{-x,R^{-1}},t \rangle _S}{\Vert f_{-x,R^{-1}}\Vert _S\Vert t\Vert _S}=1\)
- \((\textrm{b}^*)\): \(\frac{\langle \tau _x(f), t_R \rangle _S}{\Vert \tau _x(f)\Vert _S\Vert t_R\Vert _S}=1\)
- \((\textrm{c}^*)\): \(\frac{\langle P_S(\tau _x(f)), P_S(t_R) \rangle _S}{\Vert P_S(\tau _x(f))\Vert _S \Vert P_S(t_R)\Vert _S} =1\)
- \((\textrm{d}^*)\): \(\frac{\langle P_S(f_{-x,R^{-1}}),P_S(t) \rangle _S}{\Vert P_S( f_{-x,R^{-1}})\Vert _S\Vert P_S(t)\Vert _S}=1\)
Finally, the normalized correlations described in \((a^*)\), \((b^*)\), \((c^*)\), and \((d^*)\) do not change when we substitute f by \(\alpha f+\beta \), and t by \(\delta t+\gamma \), with \(\alpha ,\beta ,\delta ,\gamma \in \mathbb {R}\), \(\alpha ,\delta \ne 0\).
Proof
The equivalences \((a)\Leftrightarrow (d)\) and \((b)\Leftrightarrow (c)\) have been already shown. The following identities demonstrate \((a)\Leftrightarrow (b)\):
( just take \(Rz=y\) and use that \(\det R=1\) )
(since S commutes with \(O_{R^{-1}}\) )
The other claims are a direct consequence of Theorem 2.1. \(\square \)
In all that follows, we assume that S is an SSP operator that commutes with rotations and t is normalized in the sense that \(t\perp _S \textbf{1}\) and \(\Vert t\Vert _S=1\). Then \(P_S(t_R)=t_R\) and \(\Vert t_R\Vert _S=1\) for every rotation R. Consequently,
and
attains its maximum (\(=1\)) if and only if there is a perfect match between f and \(t_R\) in x. Moreover, if we define \(w(x)=\frac{1}{\Vert P_S(\tau _x(f))\Vert _S }\) and consider the cross-correlation of functions \(f,g\in L^2(\mathbb {R}^d)\), which is defined by
then
A perfect match is, in general, never attained. This is so because the desired image, represented by the template t, is usually supported on a strict subset \(\Omega \) of the domain D where the operator S is able to distinguish functions. Thus, the image f may well contain a copy of the image represented by \(t_R\), but in the neighbourhood of the support of \(t_R\), f will contain some information which is not present in \(t_R\). In addition, f is usually corrupted by noise and distortions. This means that the normalized correlations described in items \((a^*)-(d^*)\) of Theorem 2.3 will never equal 1. Consequently, a threshold must be introduced in order to decide whether a match has (or has not) been produced.
In order to find the rotation which maximizes c(x, R), the cross-correlation \((f\star S(t)_R)(x)\) must be computed for a huge number of rotations R, which makes classical matching an inefficient approach for template matching. Indeed, for \(d=3\), the size of the set of rotations R used to sample SO(3) well enough to guarantee a reliable result varies between \(10^4\) and \(5\cdot 10^5\) rotations [5].
For numerical reasons, high frequencies may be altered during the rotation transformation. Thus, in practice, we do not apply the operator S to the original images f, t but to filtered versions of them from which these high frequencies have been removed. Concretely, we apply an isotropic (i.e. rotation-invariant) low-pass filter h to both images and, after that, we apply the template matching algorithm to the resulting images. The idea behind this is that, if there is a match between f and t, there will be a match between \({\mathfrak {f}}=f*h\) and \({\mathfrak {t}}=t*h\) too. The operator S results from applying a rotationally symmetric mask \(m(x)=\rho (\Vert x\Vert )\) to the given image. Thus, we substitute f by \({\mathfrak {f}}=f*h\) and t by \({\mathfrak {t}}=t*h\), and then apply the classical (or tensor) matching algorithm to the pair of images \({\mathfrak {f}},{\mathfrak {t}}\) using the SSP operator \(S({\mathfrak {f}})(x)=m(x) {\mathfrak {f}}(x)\). Usually, the mask m equals 1 within a certain radius around \(\textbf{0}\) and equals 0 outside a slightly larger radius; in between these radii the mask takes values between 0 and 1. Under these restrictions, it is clear that the operator S is SSP and commutes with rotations. Moreover, if \(0=\Vert f\Vert _S^2=\langle f,S(f)\rangle \ge \int _{\textbf{D}}f^2(x)dx\ge 0\), we have that \(f_{|\textbf{D}}=0\), where \(\textbf{D}\) is a ball of positive radius centered at \(\textbf{0}\). Let us compute the inner product
(since every filter is translation invariant, and h is isotropic)
(use \(\widetilde{h}(x):=h(-x)=h(x)\), which follows from isotropy of h )
where
and we use \(\cdot \) to denote the standard product of real functions. This means that we would have the same effect just considering the template matching algorithm associated with the operator \(\overline{S}\) applied to the images f, t. Moreover, the following holds:
Lemma 2.4
Let \(S:L^{2}(\mathbb {R}^d)\rightarrow L^{2}(\mathbb {R}^d)\) be given by
with h defining an isotropic filter and m a rotationally symmetric mask as described above. Then S is SSP.
Proof
For the proof, we use the following (well-known) formulae: For functions \(a,b,c\in L^{2}(\mathbb {R}^d)\), we have that \((a\star b)(x)=\langle \tau _x(a),b\rangle \), so that \((a\star b)(0)=\langle a,b\rangle = (a*\widetilde{b})(0)\), \(a\star b=a*\widetilde{b}\), and \(a\star (b*c) = (a\star b)\star c\).
Let us now consider the product \(\langle f,S(f)\rangle \):
(since \(h=\widetilde{h}\) and \(m\ge 0\)). This proves that S is semidefinite positive. Let us show the symmetry:
(since \(g*h= h*g \) and \(\cdot \) is the standard product of functions)
\(\square \)
Remark 2
Lemma 2.4 also applies when we consider S as an operator on the space \(C_0(\mathbb {R}^d)\) of continuous functions with compact support defined on \(\mathbb {R}^d\), endowed with the scalar product of \(L^2(\mathbb {R}^d)\), so that \(S:C_0(\mathbb {R}^d)\rightarrow C_0(\mathbb {R}^d)\). This is so because \(C_0(\mathbb {R}^d)\) is a vector subspace of \(L^2(\mathbb {R}^d)\), and the convolution of continuous functions with compact support is also continuous with compact support. The space \(C_0(\mathbb {R}^d)\) is, in fact, a good model for images in many application domains.
In all that follows, we assume that the SSP operator S is of the form (14) with h, m verifying the hypotheses of Lemma 2.4. Thus, the template matching algorithm is applied with this operator and a fast computation of c(x, R) is needed.
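A discrete version of this operator can be sketched in a few lines of numpy. The sketch below assumes, consistently with the derivation preceding Lemma 2.4, that (14) reads \(S(f)=h*(m\cdot (h*f))\); the Gaussian width of h and the radii of the mask m are illustrative choices, and the Fourier-domain filtering is circular:

```python
import numpy as np

def radial_mask(shape, r_in, r_out):
    """Rotationally symmetric mask: 1 inside r_in, 0 outside r_out, linear ramp between."""
    grid = np.indices(shape) - (np.array(shape) // 2).reshape(-1, 1, 1)
    rad = np.sqrt((grid ** 2).sum(axis=0))
    return np.clip((r_out - rad) / (r_out - r_in), 0.0, 1.0)

def gaussian_lowpass(f, sigma):
    """Isotropic low-pass h, applied as a pointwise multiplication in Fourier space."""
    freqs = np.meshgrid(*[np.fft.fftfreq(s) for s in f.shape], indexing="ij")
    h_hat = np.exp(-2 * (np.pi * sigma) ** 2 * sum(u ** 2 for u in freqs))
    return np.real(np.fft.ifftn(np.fft.fftn(f) * h_hat))

def S(f, sigma=1.0, r_in=8, r_out=12):
    """S(f) = h * (m . (h * f)): filter, mask, filter again."""
    m = radial_mask(f.shape, r_in, r_out)
    return gaussian_lowpass(m * gaussian_lowpass(f, sigma), sigma)

rng = np.random.default_rng(1)
f = rng.random((32, 32))
g = rng.random((32, 32))
# numerical check of symmetry: <f, S(g)> = <S(f), g>
assert np.isclose(np.sum(f * S(g)), np.sum(S(f) * g))
# numerical check of positive semidefiniteness: <f, S(f)> = sum m.(h*f)^2 >= 0
assert np.sum(f * S(f)) > 0
```

The two assertions mirror exactly the two properties proved in Lemma 2.4.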
A direct computation leads to:
(since \(h=\widetilde{h}\))
(since h is isotropic, and m is rotationally symmetric). Moreover,
Now, using the definition of S (and imposing \(h*\textbf{1}=\textbf{1}\)), we can simplify the computation as follows:
Note that the FFT algorithm can be used to compute the inner products appearing at the end of the formula above, which helps to speed up the algorithm. Indeed, if f, g are two images, \(\langle f,g\rangle = (f*\widetilde{g})(0)\), so that
Moreover, the following identities also hold:
and
Thus,
The formulae above can be used to code an algorithm for classical template matching.
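As a sketch of such an algorithm (a simplified 2D version, not the authors' implementation: the mask is the template's support box, rotations are sampled as quarter-turns to avoid interpolation, and local mean/variance normalization plays the role of the projection \(P_S\) and the weight w(x); all sizes are invented):

```python
import numpy as np

def xcorr(f, g):
    """Circular cross-correlation (f * g)(x) = sum_z f(z + x) g(z), via the FFT."""
    F = np.fft.fftn(f)
    G = np.fft.fftn(g, s=f.shape)             # zero-pad g to the size of f
    return np.real(np.fft.ifftn(F * np.conj(G)))

def ncc_map(f, t, m):
    """Normalized cross-correlation with local mean and variance removed."""
    n = m.sum()
    t0 = (t - t.sum() / n) * m                # centred, masked template
    num = xcorr(f, t0)
    s1 = xcorr(f, m)                          # local sums of f over the mask
    s2 = xcorr(f * f, m)                      # local sums of f^2
    var = np.maximum(s2 - s1 * s1 / n, 1e-12) # local centred energy
    return num / (np.sqrt(var) * np.linalg.norm(t0))

def classical_tm(f, t, rotations):
    """One FFT correlation per sampled rotation; keep, at every position,
    the best score and the index of the best rotation (the classical TM loop)."""
    m = np.ones_like(t)
    best = -np.inf * np.ones(f.shape)
    best_rot = np.zeros(f.shape, dtype=int)
    for k, rot in enumerate(rotations):
        c = ncc_map(f, rot(t), m)
        better = c > best
        best = np.where(better, c, best)
        best_rot = np.where(better, k, best_rot)
    return best, best_rot

rng = np.random.default_rng(2)
t = rng.random((8, 8))
f = 0.05 * rng.random((64, 64))
f[20:28, 30:38] += np.rot90(t)                # plant a copy rotated by 90 degrees
rots = [lambda a: a, np.rot90, lambda a: np.rot90(a, 2), lambda a: np.rot90(a, 3)]
score, rot_idx = classical_tm(f, t, rots)
y, x = np.unravel_index(np.argmax(score), score.shape)
print((y, x), rot_idx[y, x])                  # position and rotation of the match
```

Note that the loop over rotations is the bottleneck discussed above: in 3D it would run over \(10^4\)–\(5\cdot 10^5\) rotations instead of 4.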
An important tool we will use in this paper is the set \(\mathbb {H}\) of quaternions. In particular, we will use that rotations can be parametrized by unit quaternions (which can be identified with the unit 3-sphere \(\mathbb {S}^3\)), as well as the following formulae (see, e.g. [11, 22]):
- If \(x\in \mathbb {H}\) has norm 1, then \(x^{-1}=x^*\).
- Given \(x\in \mathbb {H}\), \(x=a+b\textbf{i}+c\textbf{j}+d\textbf{k}\), we identify x with a pair (a, v), where \(a\in \mathbb {R}\) and \(v=(b,c,d)\in \mathbb {R}^3\), and call a the real part of x, \(a={\textbf{R}}{\textbf{e}}(x)\). Then, if \(x=(a,v), y=(b,w)\in \mathbb {H}\), we have that
  $$\begin{aligned} {\textbf{R}}{\textbf{e}} (xy)=ab-\langle v, w \rangle \end{aligned}$$
  Consequently, if \(x,y\in \mathbb {H}\) have norm 1, then
  $$\begin{aligned} \langle x,y\rangle= & {} {\textbf{R}}{\textbf{e}}(y^{-1}x) = {\textbf{R}}{\textbf{e}}(xy^{-1})\nonumber \\= & {} {\textbf{R}}{\textbf{e}}(yx^{-1})= {\textbf{R}}{\textbf{e}}(x^{-1}y) \end{aligned}$$ (16)
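These identities are easy to verify numerically; a minimal numpy sketch (the unit quaternions are random, drawn only for the check):

```python
import numpy as np

def qmul(x, y):
    """Hamilton product of quaternions stored as arrays (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

def qconj(x):
    """Conjugate x*; for |x| = 1 this is also the inverse x^{-1}."""
    return np.array([x[0], -x[1], -x[2], -x[3]])

rng = np.random.default_rng(3)
x = rng.normal(size=4); x /= np.linalg.norm(x)
y = rng.normal(size=4); y /= np.linalg.norm(y)

# Re(xy) = ab - <v, w> for x = (a, v), y = (b, w)
assert np.isclose(qmul(x, y)[0], x[0] * y[0] - np.dot(x[1:], y[1:]))

# formula (16): <x, y> = Re(y^{-1}x) = Re(xy^{-1}) = Re(yx^{-1}) = Re(x^{-1}y)
for p in (qmul(qconj(y), x), qmul(x, qconj(y)),
          qmul(y, qconj(x)), qmul(qconj(x), y)):
    assert np.isclose(p[0], np.dot(x, y))
```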
We end this section with a result about composition of SSP operators that will be used in the proof of the main theorem of the paper. We state the result for arbitrary inner product spaces, and include its proof for the sake of completeness:
Lemma 2.5
Let X be an inner product vector space. If the operators \(T,S:X\rightarrow X\) are semidefinite positive, symmetric, and commute, then TS is also semidefinite positive and symmetric.
Proof
Let S, T satisfy the hypothesis of the lemma. Let us define \(S_1=S/\Vert S\Vert \) and \(S_{n+1}=S_n-S_n^2\), \(n=1,2,3,\ldots \). We prove by induction on n that \(0\le S_n\le I\) for all \(n\ge 1\), where I denotes the identity operator on X.
It is clear that \(S_1\ge 0\) since \(S\ge 0\). Moreover, given \(x\in X\),
so that \(S_1\le I\). Assume that \(0\le S_k\le I\) and consider the case \(k+1\): From \(S_k\le I\) we get \(0\le I-S_k\). From \(0\le S_k\), we get \(-S_k\le 0\) and, henceforth, \(I-S_k\le I\). Now, given \(x\in X\), we have that
since \(0\le I-S_k\). This proves \(S_k^2(I-S_k)\ge 0\). An analogous computation also shows that \(S_k(I-S_k)^2\ge 0\):
Hence
On the other hand, \(S_k^2\ge 0\) and \(I-S_k\ge 0\) imply that
Thus \(0\le S_n\le I\) for all n.
Now, \(S_{n+1}=S_n-S_n^2\) can be written as \(S_n=S_n^2+S_{n+1}\), so that:
Hence
It follows that, for \(x\in X\),
Thus, \(\sum _{k=1}^\infty \Vert S_k(x)\Vert ^2<\infty \) and \(\Vert S_n(x)\Vert \rightarrow 0\) as \(n\rightarrow \infty \), for all \(x\in X\). Consequently,
Let us now consider the product \(ST=TS\), and let \(x\in X\) be arbitrarily chosen. It follows from the definition of the operators \(S_n\) that they are symmetric and commute with T. Hence
which proves that \(ST=TS\) is SSP. \(\square \)
3 Tensor template matching
This section introduces the tensorial template matching (TTM) algorithm. The purpose is to handle translations and rotations efficiently at the same time. First, we introduce some background necessary to understand the subsequent mathematical developments. Second, we present the main theorem for TTM, which allows us to determine the optimal rotation of the template, t, at every match in the image f by computing some tensors, without sampling SO(3). Finally, we explain how to determine match positions (template translations) directly from the computed tensors.
3.1 Tensor background
A tensor \(A\in T^n(\mathbb {R}^d)\) of order n and dimension d is just an array of the form \(A=(A_{i_1,\ldots ,i_n})_{1\le i_1,\ldots ,i_n\le d}\) where all the entries \(A_{i_1,\ldots ,i_n}\) are real numbers. The tensor A is named symmetric if \(A_{i_1,\ldots ,i_n}=A_{i_{\sigma (1)},\ldots ,i_{\sigma (n)}}\) for every permutation \(\sigma \in \Sigma _n\) (the set of permutations of \(\{1,\ldots ,n\}\)). We denote by \(S^n(\mathbb {R}^d)\) the set of symmetric tensors of order n and dimension d. An important example of symmetric tensor of order n is the so called n-th tensor power of a vector \(v=(v_1,\ldots ,v_d)\in \mathbb {R}^d\), which is defined as
It is well known that \(T^n(\mathbb {R}^d)\) and \(S^n(\mathbb {R}^d)\) are real vector spaces with the natural operations (pointwise sum and multiplication by a scalar), and that \(\dim T^n(\mathbb {R}^d)=d^n\), \(\dim S^n(\mathbb {R}^d)=\left( {\begin{array}{c}n+d-1\\ n\end{array}}\right) \) for all \(n,d\ge 1\). For example, \(\dim T^4(\mathbb {R}^4)=4^4=256\), \(\dim S^4(\mathbb {R}^4)=\left( {\begin{array}{c}7\\ 4\end{array}}\right) =35\). Moreover, every symmetric tensor is a finite sum of tensor powers, which allows us to introduce the concept of the (symmetric) rank of a symmetric tensor as the minimal number of tensor powers used to represent the tensor with their sum [6].
The map \(\langle \cdot ,\cdot \rangle : T^n(\mathbb {R}^d)\times T^n(\mathbb {R}^d) \rightarrow \mathbb {R}\) given by
defines an inner product. It is also usual to denote \(A\cdot B=\langle A,B\rangle \). Moreover, with this notation, if \(x,y\in \mathbb {R}^d\) are d-dimensional vectors, a direct application of the multinomial theorem shows that
Moreover, if \(A\in S^n(\mathbb {R}^d)\), and \(x=(x_1,\ldots ,x_d)\in \mathbb {R}^d\), we can also consider the inner product
which can be seen as an homogeneous polynomial in d variables, of degree n, which justifies using the notation \(Ax^n=A\cdot x^{\odot n}\). Moreover, if \(k<n\), \(Ax^{k}\in S^{n-k}(\mathbb {R}^d)\) denotes the symmetric tensor whose components are
In particular, \(Ax^{n-1}\in S^1(\mathbb {R}^d)=\mathbb {R}^d\) is a vector whose i-th component is
Indeed, if \(\varphi (x)=Ax^n\) then
where \(\nabla \varphi \) denotes the gradient of the function \(\varphi :\mathbb {R}^d\rightarrow \mathbb {R}\).
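The identities above can be checked numerically; a short numpy sketch (dimensions and vectors chosen only for illustration, with \(d=d'=4\) and \(n=4\) as in the 3D case discussed later):

```python
import numpy as np
from functools import reduce
from math import comb

def tensor_power(v, n):
    """n-th tensor power v^{(*)n}: the symmetric tensor with entries v_{i1}...v_{in}."""
    return reduce(np.multiply.outer, [v] * n)

d, n = 4, 4
assert comb(n + d - 1, n) == 35          # dim S^4(R^4) = C(7, 4) = 35, as in the text

rng = np.random.default_rng(4)
x, y = rng.normal(size=d), rng.normal(size=d)

# multinomial identity: x^{(*)n} . y^{(*)n} = <x, y>^n
A, B = tensor_power(x, n), tensor_power(y, n)
assert np.isclose(np.sum(A * B), np.dot(x, y) ** n)

# contraction A x^{n-1} (a vector); for A = x^{(*)n} it equals <x, x>^{n-1} x
Axn1 = np.einsum('ijkl,j,k,l->i', A, x, x, x)
assert np.allclose(Axn1, np.dot(x, x) ** (n - 1) * x)
```

The last assertion is a special case of \(Ax^{n-1}=\frac{1}{n}\nabla \varphi (x)\) with \(\varphi (x)=Ax^n=\langle x,x\rangle ^n\).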
Note that the vector x can be chosen from \(\mathbb {C}^d\) in the definitions above. This justifies the following definition (see [8]): Given \(A\in S^n(\mathbb {R}^d)\), \(B\in S^m(\mathbb {R}^d)\). We say that \(\lambda \in \mathbb {C}\) is a B-eigenvalue of A and \(u\in \mathbb {C}^d\) is its associated B-eigenvector (equivalently, that \((\lambda ,u)\) is a B-eigenpair of A) if \(Au^{n-1}=\lambda Bu^{m-1}\) and \(Bu^m=1\).
Using gradients, we can rewrite the equation \(Au^{n-1}=\lambda Bu^{m-1}\) as
Hence u is a B-eigenvector of A if and only if it is a critical point of the following optimization problem:
Two particularly important cases are the H-eigenvectors
and the Z-eigenvectors
The optimization problem associated with finding Z-eigenvectors of a given symmetric tensor is particularly important for us, since the tensor matching algorithm we propose reduces to one of these problems at each position and, fortunately, there are good iterative algorithms to approximate the solutions of (25) (see e.g. [15, 16]). These algorithms have a linear rate of convergence. In Section 3.4 we present a heuristic that can be used to select the positions where a match is probable, so that (25) only needs to be solved at those positions.
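A minimal sketch of the shifted symmetric higher-order power method of [15, 16] for order-4 tensors follows; the shift, iteration count, and the rank-one test tensor are illustrative choices, not the paper's configuration:

```python
import numpy as np

def z_eigenpair(A, x0, alpha=1.0, iters=200):
    """Shifted symmetric higher-order power method (SS-HOPM) for Z-eigenpairs of a
    symmetric order-4 tensor A: iterate x <- normalize(A x^3 + alpha x)."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        g = np.einsum('ijkl,j,k,l->i', A, x, x, x)    # the contraction A x^{n-1}
        x = g + alpha * x                             # shift keeps the map monotone
        x /= np.linalg.norm(x)
    lam = np.einsum('ijkl,i,j,k,l->', A, x, x, x, x)  # lambda = A x^n for |x| = 1
    return lam, x

# toy check: for A = v^{(*)4}, the unit vector v/|v| is a Z-eigenvector
# with Z-eigenvalue |v|^4
rng = np.random.default_rng(5)
v = rng.normal(size=4)
A = np.einsum('i,j,k,l->ijkl', v, v, v, v)
lam, x = z_eigenpair(A, rng.normal(size=4))
print(lam, np.linalg.norm(v) ** 4)
```

Each iteration uses a single tensor contraction, i.e. effectively one rotation candidate, which is the source of the speed-up claimed below.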
3.2 Defining the tensor template
In all that follows in this paper, our inner product space is \(X=C_0(\mathbb {R}^d)\), and \(S:X\rightarrow X\) denotes an SSP operator which commutes with rotations. Moreover, we also assume that the template \(t\in X\) is normalized by \(t\perp _S \textbf{1}\) and \(\Vert t\Vert _S=1\). In section 2 we proved that, for each \(x\in \mathbb {R}^d\), \(c(x,R) = w(x) (f\star S(t)_R)(x)\) attains its maximum value on rotation R (and this value equals 1) if and only if there is a match between f and t at (x, R) (i.e., a match between \(\tau _x(f)\) and \(t_R\)). Let us define the symmetric tensor \(C_n(x)\in S^{n}(\mathbb {R}^{d'})\), where \(d'\) is the number of parameters used to describe the rotations SO(d) (in particular, for \(d=3\), we get \(d'=4\)) by the formula:
This means that
for all \(1\le i_1,\ldots ,i_n\le d'\). Hence
where
is a tensor template (or tensorial needle).
It is of fundamental importance to observe that T(z) is computed only once and contains a reduced number of components, since \(\dim S^n(\mathbb {R}^{d'})=\left( {\begin{array}{c}n+d'-1\\ n\end{array}}\right) \) (in particular, \(\dim S^4(\mathbb {R}^{4})=\left( {\begin{array}{c}7\\ 4\end{array}}\right) =35\)). Indeed, this is the main reason why the tensor template matching algorithm we introduce in this paper is fast. Another reason is that the rotations R defining a match between f and \(t_R\) at x are Z-eigenvectors of the symmetric tensor \(C_n(x)\), which is remarkable because the power method used in [15, 16] to solve the corresponding optimization problem is fast. Moreover, the corresponding algorithm does not require evaluating myriads (or even millions) of rotations, as is the case with classical matching algorithms, but just a reduced set of them: one per iteration.
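The construction of the tensor template can be illustrated with a conceptual numpy sketch. This is not the authors' implementation: the integral over SO(3) is replaced by a crude Monte Carlo average over random unit quaternions instead of a proper quadrature, the operator S is taken as the identity on the template's support, and the template is resampled with nearest-neighbour interpolation. Note that taking n even makes \(R^{\odot n}\) insensitive to the sign ambiguity \(q\sim -q\) of unit quaternions:

```python
import numpy as np

def quat_to_matrix(q):
    """Rotation matrix of the unit quaternion q = (a, b, c, d)."""
    a, b, c, d = q
    return np.array([
        [a*a + b*b - c*c - d*d, 2*(b*c - a*d),         2*(b*d + a*c)],
        [2*(b*c + a*d),         a*a - b*b + c*c - d*d, 2*(c*d - a*b)],
        [2*(b*d - a*c),         2*(c*d + a*b),         a*a - b*b - c*c + d*d]])

def rotate_template(t, q):
    """Nearest-neighbour resampling of t(R^{-1}(z - c) + c): t rotated about its centre."""
    R = quat_to_matrix(q)
    c = (np.array(t.shape) - 1) / 2.0
    idx = np.indices(t.shape).reshape(3, -1).T - c
    src = np.rint(idx @ R + c).astype(int)          # rows become R^{-1}(z - c) + c
    valid = np.all((src >= 0) & (src < np.array(t.shape)), axis=1)
    out = np.zeros(t.size)
    out[valid] = t[tuple(src[valid].T)]
    return out.reshape(t.shape)

def tensor_template(t, quats, n=4):
    """Monte Carlo stand-in for T(z) = integral over SO(3) of S(t)(z)(R) R^{(*)n} dR,
    with S omitted (identity) and the integral replaced by an average."""
    T = np.zeros(t.shape + (4,) * n)
    for q in quats:
        q_pow = np.einsum('i,j,k,l->ijkl', q, q, q, q)   # the tensor power R^{(*)4}
        T += np.multiply.outer(rotate_template(t, q), q_pow)
    return T / len(quats)

rng = np.random.default_rng(6)
quats = rng.normal(size=(50, 4))
quats /= np.linalg.norm(quats, axis=1, keepdims=True)
t = rng.random((9, 9, 9))
T = tensor_template(t, quats)
print(T.shape)      # one order-4 symmetric tensor per voxel
```

Only the 35 linearly independent components of each symmetric tensor would need to be stored and correlated in practice; the sketch keeps the full array for clarity.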
3.3 Finding the correct rotation
Let us state the main result of this paper:
Theorem 3.1
Let \(f,t\in C_0(\mathbb {R}^3)\), \(x\in \mathbb {R}^3\), and \(n\in 2\mathbb {N}\) be given. If there is a match between f and \(t_R\) at x, the function \(\varphi (Q)=C_n(x)\cdot Q^{\odot n}\), defined on rotations of \(\mathbb {R}^3\), when parametrized by unit quaternions Q, attains its global maximum at \(Q=R\).
Proof
We prove the result as a consequence of Theorem 2.3. Thus, our main goal is to represent \(\varphi (Q)\) in terms of a scalar product, \(\varphi (Q)=w(x)\langle \tau _x(f), t_Q \rangle _{S'}\), for some SSP operator \(S'\), which would guarantee that if there is a match between f and \(t_R\) at x, then \(\varphi (Q)\) attains its global maximum at \(Q=R\).
Let us compute \(\varphi (Q)\):
(by definition of c(x, R)). Hence, dividing by w(x), we get:
(where \(S(t)(z)(R)=(O_{R^{-1}}\circ S)(t)(z)\))
where \(K(R)=({\textbf{R}}{\textbf{e}} (R))^n\) and
denotes the convolution of functions \(a,b\in L^2(SO(3))\).
In other words,
Here,
must be interpreted as a function defined on \(\mathbb {R}^3\) with values on \(\mathbb {R}\) (indeed, it is an element of \(C_0(\mathbb {R}^3)\)):
where \(S(t)(z):SO(3)\rightarrow \mathbb {R}\) is given by
Indeed, in general, every element \(t\in C_0(\mathbb {R}^3)\) can be interpreted, for each \(z\in \mathbb {R}^3\), as an element of \(L^2(SO(3))\) just making \(t(z)(R)=t_R(z)\). Consequently, \((t\circledast _{SO(3)} K)(I_d)\) is an element of \(C_0(\mathbb {R}^3)\),
It follows that
(in the last equality, set \(R=QP\) and use that \(|Q|=1\))
where \(I_d\) denotes the identity rotation.
Let us now denote by \(S_2:C_0(\mathbb {R}^3)\rightarrow C_0(\mathbb {R}^3)\) the operator given by
and let \(S'=S_2\circ S\). Then
(since \(S, S_2\) commute with rotations)
Thus, the proof ends as soon as we demonstrate that \(S'\) is an SSP operator, and it is for this that we need to use Lemma 2.5. Indeed, \(S'=S_2\circ S\) is a composition of operators, S is, by hypothesis, symmetric semidefinite positive, and S, \(S_2\) commute because S commutes with rotations and \(S_2\) is defined in terms of convolution in SO(3). Thus, Lemma 2.5 implies that \(S'\) is SSP whenever \(S_2\) is SSP.
To prove that \(S_2\) is symmetric semidefinite positive, we use the properties of the convolution on SO(3) when interpreted as a hyperspherical convolution on \(S^3\), the unit sphere of \(\mathbb {R}^4\). Recall that if \(S^{d-1}=\{x\in \mathbb {R}^d: x\cdot x^t=1\}\) denotes the (unit) sphere of \(\mathbb {R}^d\), then SO(d) acts transitively on \(S^{d-1}\) (which means that, given \(z_1,z_2\in S^{d-1}\), there is a rotation \(R\in SO(d)\) such that \(R(z_1)=z_2\)); this makes \(S^{d-1}\) a homogeneous space and allows us to introduce the convolution of functions defined on \(S^{d-1}\) as follows:
where \(\eta \in S^{d-1}\) is the north pole of the sphere and \(f,g\in L^2(S^{d-1})\). Thus, if we use that the elements of SO(3) are parametrized by quaternions of norm 1, which can be identified with the elements of the sphere \(S^3=\{x\in \mathbb {H}: x\overline{x}=|x|=1\}\) in four-dimensional space, then, assuming that the north pole of \(S^3\) is given precisely by the identity rotation \(I_d\), the convolution of \(f,g\in L^2(SO(3))\) can be interpreted as a hyperspherical convolution on \(S^3\):
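To make this identification concrete, the following sketch (a sanity check using the standard quaternion-to-rotation-matrix formula, not code from this paper) verifies numerically that a unit quaternion \(Q=a+b\textbf{i}+c\textbf{j}+d\textbf{k}\), i.e., a point of \(S^3\), yields an element of SO(3), and that its real part encodes the rotation angle:

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix of the unit quaternion q = a + b i + c j + d k."""
    a, b, c, d = q
    return np.array([
        [a*a + b*b - c*c - d*d, 2*(b*c - a*d),         2*(b*d + a*c)],
        [2*(b*c + a*d),         a*a - b*b + c*c - d*d, 2*(c*d - a*b)],
        [2*(b*d - a*c),         2*(c*d + a*b),         a*a - b*b - c*c + d*d],
    ])

q = np.array([1.0, 2.0, 3.0, 4.0])
q /= np.linalg.norm(q)              # a point of S^3
R = quat_to_rot(q)

# R is a genuine rotation: orthogonal with determinant +1.
assert np.allclose(R @ R.T, np.eye(3))
assert np.isclose(np.linalg.det(R), 1.0)

# Re(Q) = cos(theta) encodes the rotation angle phi = 2*theta,
# since trace(R) = 1 + 2*cos(phi) = 4*a^2 - 1.
assert np.isclose(np.trace(R), 4 * q[0]**2 - 1)
```

Note that \(Q\) and \(-Q\) give the same rotation matrix, which is precisely the double cover of SO(3) by \(S^3\) implicit in this parametrization.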
Now, as is well known, \(L^2(S^3)\) is a Hilbert space and the so-called hyperspherical harmonics, \(\{\Xi _{M}^\ell \}\), form an orthonormal basis of this space. Thus, every function \(f\in L^2(S^3)\) admits a Fourier expansion
Moreover, in [10], it was proven that, if \(f,g\in L^2(S^3)\) and \(\mathfrak {f}=f*_{S^3} g\), then
It follows that, given a template \(t\in C_0(\mathbb {R}^3)\), for each \(z\in \mathbb {R}^3\), the map \(t(z)(R)=t_R(z)\) belongs to \(L^2(S^3)\) (here the rotations R are parametrized as unit quaternions, so that \(R\in S^3\)) and
and
We need the following Lemma, whose proof is included in Sect. 4:
Lemma 3.2
\({\widehat{K}}(\ell ,(0,0))\ge 0\) for all \(\ell \).
Proof
Then
This means that convolution with K, which is an operator \(C_K:L^2(SO(3))\rightarrow L^2(SO(3))\), \(C_K(f)=f\circledast _{SO(3)}K\), is semidefinite positive. Moreover, it is well known that this operator is symmetric (we will use both facts in the computations below).
In order to prove that \(S_2\) is SSP, we introduce the operator \(L:C_0(\mathbb {R}^3)\rightarrow C(SO(3), C_0(\mathbb {R}^3))\) defined by \(L(t)(R)=t_R\), as well as the operator \(L^*: C(SO(3), C_0(\mathbb {R}^3))\rightarrow C_0(\mathbb {R}^3)\) defined by \(L^*(a)(z)=\int _{SO(3)}a(R)_{R^{-1}}(z)dR\).
Then
where \(L(f)(w)(R):=L(f)(R)(w)=f_R(w)=f(R^{-1}w)\) and \(a(w)(R):=a(R)(w)\). On the other hand,
Thus, if \(V=\int _{SO(3)}dR\) is the volume of SO(3), then
where we have used that
and that
so that
(set \(\Theta ^{-1}=Q^{-1} R^{-1}\), so that \(Q^{-1} =\Theta ^{-1} R\))
It follows that
Thus, \(S_2\) is semidefinite positive.
Moreover, the same type of computation shows that
which proves that \(S_2\) is symmetric. This ends the proof of the theorem. \(\square \)
Note that Theorem 3.1 connects the problem of finding, at a given position x, the rotation R which gives a match between f and \(t_R\) at x with the problem of finding the dominant Z-eigenvalue-eigenvector pair by solving (25) with \(A=C_n(x)\in S^n(\mathbb {R}^4)\) and n even.
3.4 Finding the correct position
Although we could find the spatial positions of peaks by running an algorithm to find the dominant Z-eigenvalue-eigenvector pair for each and every voxel, this is fairly expensive with the current decomposition algorithms for higher-degree tensors. However, the Frobenius norm of a tensor is related to its spectral norm and, in practice, it can be used as an excellent proxy for finding the spatial locations of peaks. Indeed, we know that \(C_n(x)\in S^n(\mathbb {R}^{d'})=S^n(\mathbb {R}^{4})\). Now, if \(\Vert T\Vert _{\sigma }\) denotes the spectral norm of tensor T and \(\Vert T\Vert _F\) denotes its Frobenius norm, it is well known that the largest singular value of T equals its spectral norm, and that \(\Vert T\Vert _{\sigma }\le \Vert T\Vert _F\).
In fact, the connection between \(\Vert T\Vert _{\sigma }\) and \(\Vert T\Vert _F\) is stronger than just this inequality. As is well known, every tensor is a finite sum of rank-1 tensors (indeed, if the tensor is symmetric, the rank-1 tensors can also be chosen symmetric) [6]. Moreover, if \(W_1\) is a tensor of rank 1 satisfying
then (see, e.g., [23])
and
Thus,
Hence, if \(E_1(T)\) is held fixed, an increase in \(\Vert T\Vert _{\sigma }\) translates into an increase in \(\Vert T\Vert _F\), and vice versa.
Moreover, in 1938 Banach demonstrated (see [1]) that, for any symmetric tensor T of degree n, \(\Vert T\Vert _{\sigma }=\max _{\Vert x\Vert =1}|\langle T,x^{\odot n}\rangle |\).
Thus, large \(\Vert T\Vert _F\) implies large spectral norm of T, and the spectral norm of \(C_n(x)\) is strongly connected to the optimization problem solved in Theorem 3.1, which justifies using the Frobenius norm of \(C_n(x)\) as a parameter to select positions x where a match is possible.
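As an illustration (a minimal numerical sketch, not the paper's implementation), one can check the inequality between the two norms for a random symmetric tensor in \(S^4(\mathbb {R}^4)\), estimating the spectral norm from below via Banach's characterization \(\Vert T\Vert _{\sigma }=\max _{\Vert x\Vert =1}|\langle T,x^{\odot 4}\rangle |\) by sampling on the unit sphere:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4  # tensors over R^4, as for C_n(x) in S^n(R^4)

# Random symmetric degree-4 tensor: a sum of symmetric rank-1 terms v^{o4}.
T = np.zeros((d, d, d, d))
for _ in range(3):
    v = rng.standard_normal(d)
    T += np.einsum('i,j,k,l->ijkl', v, v, v, v)

frob = np.linalg.norm(T.ravel())    # Frobenius norm ||T||_F

# Sampling-based lower bound on ||T||_sigma, using Banach's theorem:
# the spectral norm is the maximum of |<T, x^{o4}>| over unit vectors x.
spec_est = 0.0
for _ in range(5000):
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    spec_est = max(spec_est, abs(np.einsum('ijkl,i,j,k,l->', T, x, x, x, x)))

assert spec_est <= frob             # ||T||_sigma <= ||T||_F for every tensor
```

Since every sampled value is a lower bound for \(\Vert T\Vert _{\sigma }\), the final assertion holds for any sample, illustrating why \(\Vert T\Vert _F\) can screen positions cheaply before the exact eigenpair computation.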
For each position x identified as a potential peak, the SS-HOPM algorithm (see [15, 16, 23] for its precise definition and implementation) is used to find the exact dominant Z-eigenvalue and its associated Z-eigenvector, which gives the candidate rotation R for a match at x.
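The following is a minimal sketch of the shifted power iteration underlying SS-HOPM for a symmetric degree-4 tensor; the function name and the fixed shift \(\alpha \) are illustrative choices, and for the actual algorithm (adaptive shifts, convergence guarantees) see [15, 16, 23]:

```python
import numpy as np

def ss_hopm(T, alpha=1.0, iters=500, seed=0):
    """Shifted symmetric higher-order power iteration (sketch).

    Returns a candidate dominant Z-eigenpair (lam, x) of the symmetric
    4th-order tensor T, i.e. T x^3 = lam * x with |x| = 1.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(T.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        g = np.einsum('ijkl,j,k,l->i', T, x, x, x)  # T applied to x three times
        y = g + alpha * x       # a positive shift keeps the ascent monotone
        x = y / np.linalg.norm(y)
    lam = np.einsum('ijkl,i,j,k,l->', T, x, x, x, x)  # Z-eigenvalue <T, x^{o4}>
    return lam, x

# For T = v^{o4} with |v| = 1, the dominant Z-eigenpair is (1, +/-v).
v = np.array([1.0, 2.0, 3.0, 4.0])
v /= np.linalg.norm(v)
T = np.einsum('i,j,k,l->ijkl', v, v, v, v)
lam, x = ss_hopm(T)
assert abs(lam - 1.0) < 1e-8 and abs(x @ v) > 1 - 1e-8
```

In the TTM setting, the recovered unit eigenvector of \(C_n(x)\) is a unit quaternion, i.e., the candidate rotation R.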
We have just described a heuristic to locate the positions, and subsequently the rotations, where a match is possible. However, false positives may occasionally occur. Indeed, in the previous subsection we showed that the tensor-based correlation function \(\langle C_n(x),Q^{\odot n}\rangle \) can be seen as using a slightly different degenerate inner product, based on \(S'\) from the proof of Theorem 3.1, rather than S. Concretely, we proved that
where \(w(x)=\frac{1}{\Vert P_S(\tau _x(f))\Vert _S}\), \(\Vert t_R\Vert _S=1\) and \(t_R=P_S(t_R)\). This implies that the relation \(-1\le \langle C_n(x),Q^{\odot n}\rangle \le 1\) does not necessarily hold because the normalizations were taken in terms of S instead of \(S'\). Taking \(S'\) into account would lead to the equality
where \(-1\le \frac{\langle \tau _x(f), t_Q \rangle _{S'} }{\Vert \tau _x(f)\Vert _{S'}\Vert t_Q\Vert _{S'}}\le 1\) (and it is equal to 1 when we have a match).
So what is the impact of this? First of all, observe that the operation that is missing from S in the normalization is effectively a kind of convolution, so that its effect on the constant component of an image is to scale it. Consequently, if an image is S-orthogonal to \(\textbf{1}\), it will also be \(S'\)-orthogonal to \(\textbf{1}\). However, the norms are affected.
For the template t, this means that the normalization is off by a certain factor, but this factor is the same everywhere. For the image f, the impact is less benign though, as \(\Vert \tau _x(f)\Vert _S\) will differ from \(\Vert \tau _x(f)\Vert _{S'}\) in a nonuniform way.
When would this shortcoming cause a false positive? For this to happen, the normalization factor used at a non-match position would have to be much higher than the “correct” normalization factor, and/or the normalization factor would have to be too low at a match position. Since the difference between S and \(S'\) is essentially a smoothing operation, and the normalization factor is the reciprocal of the norm of the projected image, the image would have to be (very) smooth at the non-match position while exhibiting a lot of high-frequency energy around the match position. Such a situation is not impossible, but it would at the very least be unusual in a typical application such as the analysis of electron microscopy images.
4 Proof of Lemma 3.2
Let us start by recalling the formulae associated with Fourier expansions in hyperspherical harmonics on the sphere \(S^3\). The parametrization of the sphere we consider is the following:
where \((a,b,c,d)\in S^3\) is identified with the unit quaternion \(Q=a+b\textbf{i}+c\textbf{j}+d\textbf{k}\), which represents a rotation of three-dimensional Euclidean space \(\mathbb {R}^3\). The volume element (used for integration on \(S^3\) and, hence, also on SO(3)) is then given by
Then every function \(f\in L^2(S^3)\) can be decomposed as
where \(\{\Xi _{(k_1,k_2)}^{\ell }\}\) denotes the orthonormal basis of \(L^2(S^3)\) formed by the hyperspherical harmonics and \({\hat{f}}(\ell ,(k_1,k_2))=\langle f, \Xi _{(k_1,k_2)}^{\ell }\rangle _{S^3}\) are the Fourier coefficients of f in this basis. We want to prove that \({\hat{K}}(\ell ,(0,0))\ge 0\) for all \(\ell \). Now, \(K(Q)=(\text {Re}(Q))^n=a^n=(\cos \theta )^n\) and
where \(A_{(0,0)}^{\ell }\) is a positive constant and \(C_{\ell }^{\lambda }(t)\) denotes the Gegenbauer polynomial of degree \(\ell \) and parameter \(\lambda \), which appears as the \(\ell \)-th Taylor coefficient in the expansion \((1-2tz+z^2)^{-\lambda }= \sum _{\ell =0}^{\infty } C_{\ell }^{\lambda }(t)z^{\ell }\). It is well known that \(C_{\ell }^{1}(t)=U_{\ell }(t)\) (the \(\ell \)-th Chebyshev polynomial of the second kind) and that \(U_{\ell }(\cos \theta )=\frac{\sin ((\ell +1) \theta )}{\sin \theta }\). Hence \(\Xi _{(0,0)}^{\ell } = A_{(0,0)}^{\ell } \frac{\sin ((\ell +1) \theta )}{\sin \theta }\) and
To estimate the integral above, we need to use a few trigonometric formulas, as well as the hypothesis that n is even. Concretely, n even implies that n/2 is an integer and \((\cos (\theta ))^n=(\cos (\pi -\theta ))^n\). Moreover, for \(\ell \) odd, we have that
This makes the integral equal to 0 for \(\ell \in 2\mathbb {N}+1\).
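The Chebyshev identity \(U_{\ell }(\cos \theta )=\sin ((\ell +1)\theta )/\sin \theta \) used above is easy to verify numerically; a small sanity check using the standard three-term recurrence:

```python
import numpy as np

def cheb_u(ell, t):
    """Chebyshev polynomial of the second kind U_ell(t), via the
    recurrence U_0 = 1, U_1 = 2t, U_{k+1} = 2 t U_k - U_{k-1}."""
    u_prev, u = 1.0, 2.0 * t
    if ell == 0:
        return u_prev
    for _ in range(ell - 1):
        u_prev, u = u, 2.0 * t * u - u_prev
    return u

# Check U_ell(cos theta) = sin((ell+1) theta) / sin(theta) at several angles.
for theta in (0.3, 0.7, 1.9):
    for ell in range(8):
        lhs = cheb_u(ell, np.cos(theta))
        rhs = np.sin((ell + 1) * theta) / np.sin(theta)
        assert abs(lhs - rhs) < 1e-10
```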
Assume \(\ell \in 2\mathbb {N}\). Then
(for the last line, just set \(k=n/2-s\)). Moreover, it is well-known that
so that, by a direct substitution in the formula defining \({\hat{K}}(\ell ,(0,0))\) we get
We can now use that
to claim that
The parity of \(\ell \) and n implies that all the factors multiplying the variable \(\theta \) inside the cosine functions are even numbers. This makes the corresponding integrals (on \([0,\pi ]\)) equal to 0, except when the factor itself is 0. In that case, \(\cos (0)=1\) implies that only the cosine functions that appear with a minus sign in front of them in the formula can contribute a negative number to the integral. Now, clearly \(\ell +2>0\) always, since \(\ell \ge 0\), and \(\ell +2+n-2k=0\) would imply \(2k=n+\ell +2>n\), so that \(k>n/2\), which is impossible since the sum ranges from \(k=1\) to \(k=n/2\). This means that the term \(-\cos (\theta (\ell +2+n-2k))\) never contributes a negative number to the sum. On the other hand, if the cosine function with factor \(\ell +2+2k-n\) contributes, which means that \(\ell +2+2k-n=0\), then \(k=(n-\ell -2)/2<n/2\). In particular, taking \(k^*=k+1\), we have that \(1\le k^*\le n/2\), so that \(\cos (\theta (\ell +2k^*-n))= \cos (\theta (\ell +2k+2-n))=\cos (0)=1\) and the corresponding term effectively appears in the sum. In particular, adding these two terms of the sum we get
since n is even and \(k<n/2\). This ends the proof of Lemma 3.2. \(\square \)
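Lemma 3.2 can also be cross-checked numerically. Up to the positive constants \(A_{(0,0)}^{\ell }\) and the \(S^3\) volume element, \({\hat{K}}(\ell ,(0,0))\) is proportional to \(\int _0^\pi (\cos \theta )^n \sin ((\ell +1)\theta )\sin \theta \, d\theta \) (a sketch under this reduction, with the positive factor absorbed), and the integral is indeed nonnegative for even n and vanishes for odd \(\ell \):

```python
import numpy as np

theta = np.linspace(0.0, np.pi, 200001)
dt = theta[1] - theta[0]
for n in (2, 4, 6, 8):                # even tensor degrees, as in the lemma
    for ell in range(12):
        integrand = np.cos(theta)**n * np.sin((ell + 1) * theta) * np.sin(theta)
        val = np.sum(integrand) * dt  # the integrand vanishes at both endpoints
        # val is proportional to K_hat(ell, (0,0)) up to a positive constant
        assert val > -1e-9            # nonnegativity, as the lemma claims
        if ell % 2 == 1:
            assert abs(val) < 1e-9    # odd ell: the coefficient vanishes
```

For instance, with \(n=2\) and \(\ell =0\) the integral equals \(\pi /8>0\), while for \(n=2\) and \(\ell =4\) it is exactly 0, consistent with the parity analysis above.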
5 Conclusions
We have presented the mathematics of classical template matching with rotations and introduced an alternative to the classical algorithm, named tensorial template matching (TTM). TTM integrates the information relative to all rotated versions of a template t into a unique symmetric tensor template T, which is computed only once per template. The main theorem of the paper, Theorem 3.1, shows that finding an exact match between an image f and a rotated version \(t_R\) of the template t at a given position x is equivalent to finding a best rank-1 approximation (in the Frobenius norm) to a certain tensor \(C_n(x)\). The resulting algorithm has reduced computational complexity when compared to the classical one. TTM finds the position and rotation of instances of the template in any tomogram with just a few correlations with the linearly independent components of T. In particular, cryo-electron tomography (3D images) for macromolecular detection requires 7112, 45,123 and 553,680 rotations to achieve an accuracy of 13\(^\circ \), 7\(^\circ \) and 3\(^\circ \), respectively [5]. Therefore, considering degree-4 tensors (35 linearly independent components), the potential speed-up of our approach with respect to TM is 203\(\times \), 1239\(\times \) and 184,560\(\times \) in these cases, while the angular accuracy of TTM remains constant, limited only by the computation of the tensorial template.
In [20], we develop a practical implementation of TTM, showing on both synthetic and real data that our method is able to find template instances and determine their rotations with a computational complexity independent of the rotation accuracy.
Notes
In all that follows, we will also use the notation \(x\cdot y\) to denote \(\langle x,y\rangle \), if this simplifies computations.
Note that \(\textbf{1}\) is not an element of X since \(X\subseteq L^2(\mathbb {R}^d)\), but this can be managed in several ways. In fact, in practice we only consider images f with compact support K. Then, when we compute \(\langle f,\textbf{1}\rangle \), we mean \(\langle f,\textbf{1}\rangle = \int _{\mathbb {R}^d}f(x)dx= \int _{K}f(x)dx=\langle f,\textbf{1}\chi _K\rangle \). Moreover, since our interest is in operators S that vanish on functions vanishing outside of a certain neighbourhood \(\textbf{D}\) of 0, by \(\langle \textbf{1},\textbf{1}\rangle _S\) we mean \(\langle \textbf{1}\chi _{\textbf{D}},\textbf{1}\chi _{\textbf{D}}\rangle _S\).
References
Banach, S.: Über homogene Polynome in (\(\text{L}^2\)). Stud. Math. 7, 36–44 (1938). http://eudml.org/doc/218624
Böhm, J., Frangakis, A., Hegerl, R., Nickell, S., Typke, D., Baumeister, W.: Toward detecting and identifying macromolecules in a cellular context: template matching applied to electron tomograms. Proc. Natl. Acad. Sci. 97, 14245–14250 (2000). https://doi.org/10.1073/pnas.230282097
Brunelli, R.: Template Matching Techniques in Computer Vision: Theory and Practice. Wiley (2009). https://doi.org/10.1002/9780470744055
Cao, S., He, S., Li, Z., Wang, Z.: Extreme ratio between spectral and Frobenius norms of nonnegative tensors. SIAM J. Matrix Anal. Appl. 44, 919–944 (2023). https://doi.org/10.1137/22M1502951
Chaillet, M., van der Schot, G., Gubins, I., Roet, S., Veltkamp, R., Förster, F.: Extensive angular sampling enables the sensitive localization of macromolecules in electron tomograms. Int. J. Mol. Sci. 24, 13375 (2023). https://doi.org/10.3390/ijms241713375
Comon, P., Golub, G., Lim, L.-H., Mourrain, B.: Symmetric tensors and symmetric tensor rank. SIAM J. Matrix Anal. Appl. (2008). https://doi.org/10.1137/060661569
Corona, G., Maciel-Castillo, O., Morales-Castaneda, J., Gonzalez, A., Cuevas, E.: A new method to solve rotated template matching using metaheuristic algorithms and the structural similarity index. Math. Comput. Simul. (MATCOM) 206, 130–146 (2023). https://ideas.repec.org/a/eee/matcom/v206y2023icp130-146.html. https://doi.org/10.1016/j.matcom.2022.11
Cui, C., Dai, Y.H., Nie, J.: All real eigenvalues of symmetric tensors. SIAM J. Matrix Anal. Appl. 35, 1582–1601 (2014). https://doi.org/10.1137/140962292
de Teresa-Trueba, I., Goetz, S.K., Mattausch, A., Stojanovska, F., Zimmerli, C.E., Toro-Nahuelpan, M., Cheng, D.W., Tollervey, F., Pape, C., Beck, M., et al.: Convolutional networks for supervised mining of molecular patterns within cellular context. Nat. Methods 20, 284–294 (2023). https://doi.org/10.1038/s41592-022-01746-2
Dokmanic, I., Petrinovic, D.: Convolution on the \( n \)-sphere with application to PDF modeling. IEEE Trans. Signal Process. 58, 1157–1170 (2009). https://doi.org/10.1109/TSP.2009.2033329
Ebbinghaus, H.D., Hermes, H., Hirzebruch, F., Koecher, M., Mainzer, K., Neukirch, J., Prestel, A., Remmert, R.: Numbers. Springer (1991). https://doi.org/10.1007/978-1-4612-1005-4
Fageot, J., Uhlmann, V., Püspöki, Z., Beck, B., Unser, M., Depeursinge, A.: Principled design and implementation of steerable detectors. IEEE Trans. Image Process. 30, 4465–4478 (2021). https://doi.org/10.1109/TIP.2021.3072499
Forsyth, D., Ponce, J.: Computer Vision: A Modern Approach. Prentice Hall (2002). https://dl.acm.org/doi/book/10.5555/580035
Gonzalez, R., Woods, R.: Digital Image Processing, 4th Global edn. Pearson Education (2017). https://elibrary.pearson.de/book/99.150005/9781292223070
Kofidis, E., Regalia, P.: On the best rank-1 approximation of higher-order supersymmetric tensors. SIAM J. Matrix Anal. Appl. 23, 863–884 (2001). https://doi.org/10.1137/S0895479801387413
Kolda, T., Mayo, J.: Shifted power method for computing tensor eigenpairs. SIAM J. Matrix Anal. Appl. 32, 1095–1124 (2010). https://doi.org/10.1137/100801482
Kozhasov, K., Tonelli-Cueto, J.: Probabilistic bounds on best rank-one approximation ratio (2022). arXiv:2201.02191
Lamm, L., Righetto, R., Wietrzynski, W., Pöge, M., Martinez-Sanchez, A., Peng, T., Engel, B.: Membrain: a deep learning-aided pipeline for detection of membrane proteins in cryo-electron tomograms. Comput. Methods Programs Biomed. 224, 106990 (2022). https://doi.org/10.1016/j.cmpb.2022.106990
Lewis, J.: Fast template matching. In: Denis Laurendeau (Université Laval) and Claudette Cédras (Université Laval) (eds.) Proceedings of Vision Interface ’95 conference, pp. 15–19. Canadian Image Processing and Pattern Recognition Society, Canada (1995)
Martinez-Sanchez, A., Almira, J.M., Homberg, U., Phelippeau, H.: Tensorial template matching with rotations and its application for tomography (in preparation)
Moebel, E., Martinez-Sanchez, A., Lamm, L., Righetto, R., Wietrzynski, W., Albert, S., Lariviere, D., Fourmentin, E., Pfeffer, S., Ortiz, J., Baumeister, W., Peng, T., Engel, B., Kervrann, C.: Deep learning improves macromolecule identification in 3D cellular cryo-electron tomograms. Nat. Methods 18, 1386–1394 (2021). https://doi.org/10.1038/s41592-021-01275-4
Pontryagin, L.: Generalization of Numbers. CreateSpace (2010)
Regalia, P., Kofidis, E.: The higher-order power method revisited: convergence proofs and effective initialization. In: 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 00CH37100), vol. 5, pp. 2709–2712 (2000). https://doi.org/10.1109/ICASSP.2000.861047
Roseman, A.: Particle finding in electron micrographs using a fast local correlation algorithm. Ultramicroscopy 94, 225–236 (2003). https://doi.org/10.1016/s0304-3991(02)00333-9
Acknowledgements
This work is based on unpublished ideas of Jasper van de Gronde from his time as a researcher at the University of Groningen. We also want to thank Holger Kohr, Erik Franken and Remco Schoenmakers from Thermo Fisher Scientific for their support and feedback on the potential of tensorial template matching. This work was supported by the Ramón y Cajal program [Grant RYC2021-032626-I funded by MICIU/AEI/10.13039/501100011033 and the European Union NextGenerationEU/PRTR]; and the University of Murcia [Attract-RYC, 2023].
Funding
Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
Ethics declarations
Conflict of interest
H.P. is an employee of Thermo Fisher Scientific.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Almira, J.M., Phelippeau, H. & Martinez-Sanchez, A. Fast normalized cross-correlation for template matching with rotations. J. Appl. Math. Comput. (2024). https://doi.org/10.1007/s12190-024-02157-6
Keywords
- Template matching
- Tensors
- Rotations and Quaternions
- 3D images
- Cross-correlation
- Convolution
- Hyperspherical harmonics
- Cryo-electron microscopy
- Tomography