1 Introduction

Template recognition, matching, and tracking, as well as object detection, are major signal and image processing techniques with applications in many disciplines, see e.g. [6, 17, 27, 39]. Such applications include, but are not limited to, the detection of colonic polyps [22, 23], detection of anatomical landmarks in brain Magnetic Resonance (MR) images [47], identification of film defect types [53], fingerprint matching [20, 46], analysis of stock market behavior [16], detection of ventricular tachycardia [8, 9, 19, 24], detection of cancerous masses in various types of images [2, 5, 13, 38], and detection of defects of printed circuit boards (PCB) [3, 4, 10]. See Table 1 for a complete list of the acronyms used in the paper.

Table 1 List of acronyms of the paper

In more detail, Kilic et al. [23] introduced a computer-aided detection system to detect colonic polyps in computed tomography images using cellular neural networks, a genetic algorithm, and a three-dimensional (3D) template matching (TM) technique based on fuzzy-rule-based thresholding. For this purpose, three different templates are generated genetically and used in the 3D-TM algorithm. See also [22].

Yoon et al. [53] presented an effective defect inspection system that identifies film defects and determines their types in the production of polarized films for TFT-LCDs (thin film transistor-liquid crystal displays). The system is designed and implemented to find defects in polarized film images using image segmentation techniques and to determine defect types through image analysis of the detected defects using TM techniques. They extracted features of the defects, such as shape and texture, and compared them to the features of reference defect images stored in a template database.

An automated fingerprint matching system was established by Uz et al. [46] to handle low-quality fingerprints, which may also be affected by various distortions introduced by the acquisition filters and other enhancement procedures. Their approach accounts for within-class variations by capturing multiple enrollment impressions of each finger. Combining minutiae information from multiple impressions of the same finger increases the coverage area, recovers missing minutiae, and removes spurious minutiae; a super-template is therefore produced for each finger for TM applications. Passos et al. [40] established an eye detection system using an ensemble of weak classifiers based on a correlation filter, and Peng et al. [41] presented a corner detection and scale estimation algorithm.

In [5], Bator and Nieniewski implemented the correlation coefficient to detect cancerous masses in mammograms, and Osman et al. [38] used a 3D convolution TM approach to detect lung nodules, see also [13]. In the work of Ambrosini et al. [2], spherical tumor appearance models are created to match the expected geometry of brain metastases, while accounting for partial volume effects and offsets due to the cut of the MR image sampling planes. A 3D normalized cross-correlation (NCC) then measures the similarity between the brain volume and the created spherical templates of varying radii to detect the positions of the brain metastases.

Kurosaki et al. [24] used the NCC as a similarity measure in an automated TM system to detect right ventricular outflow tachycardia. Ciaccio et al. [8] adapted a TM technique based on the quantification of beat-to-beat changes in electrograms to locate functional reentrant circuits that are relatively stable and cause monomorphic ventricular tachycardia, cf. [9, 19].

Aggarwal and Kumar [1] used a convolutional neural network to classify image surface texture. In [11], Dastanova et al. presented the hardware implementation of a novel algorithm for moving-object detection, which can be integrated with complementary metal oxide semiconductor (CMOS) image sensors. Bit planes of consecutive frames are stored in memristive crossbar arrays and compared using threshold-logic XOR gates. The resulting outputs are combined using weighted summation circuits and thresholded using comparators to obtain binary images.

In this paper two new and fast TM and object detection (OD) algorithms are established. We implement two different approaches using the φ-correlation coefficient and logic circuits. The proposed techniques operate on the bit-plane slices of grayscale or color images. While these approaches reduce the execution time, they retain accuracy comparable to state-of-the-art relevant methods. In addition, both proposed techniques show robustness against various types of noise. The next section gives a brief account of closely related works as well as some basic mathematical formulations. The proposed techniques of this paper are introduced in Section 3. It contains two subsections, one establishing a TM and OD algorithm via the φ-correlation coefficient, and another using Boolean functions between bit-plane slices to establish TM and OD schemes. Section 4 is devoted to the performance analysis in comparison with four state-of-the-art techniques. We conclude that, while the TM execution time is remarkably accelerated, the accuracy and robustness are efficiently maintained. To sum up, the main contributions of this paper are:

  • Introducing two novel, fast and robust TM and OD techniques.

  • Computing the running time in comparison with relevant techniques.

  • Comparing accuracy with four relevant techniques.

  • Investigating and verifying robustness against various types of noise, including, for example, occlusions, geometric deformation, illumination variation, change of background, and artificial noise.

2 Related works

In image TM techniques, with which this work is concerned, a reference image f(x,y) is given together with a template image T(u,v) to be detected and matched. Usually, the association between T and a sub-image fc of f is measured at every pixel of f(x,y), where the sizes of fc and T coincide. Consequently, the matched template is \( T\equiv f_{c}^{*}\), where \(f_{c}^{*}\) is the sub-image with the highest association, or similarity, with T.

The normalized cross-correlation (NCC) and its zero-mean variant (ZNCC) are considered basic measures of association in TM algorithms, cf. [6]. However, due to their very high computational cost, many techniques have been derived to reduce this cost as well as to enhance accuracy, cf. e.g. [12, 14, 33, 43, 50]. To avoid the inaccuracy of the NCC, Choi and Kim [7] gave a two-stage template matching method for rotation and illumination invariance: in the first stage, matching candidates are selected using computationally cheap features, while in the second, a rotation-invariant TM technique is performed on these candidates using Zernike moments. Lei and Zhang [26] used an adaptive low-cost ZNCC approach to derive a rotation-invariant TM algorithm, provided that the rotation angle is limited to the range [−20, 20], cf. [15, 28, 42].
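For reference, the standard formulations of these two measures (quoted here from common usage rather than from a specific equation of the cited works) are, for a candidate window fc at position (x,y),

$$ \mathrm{NCC}(x,y)=\frac{\sum_{k,l} f_{c}(k,l)\,T(k,l)}{\sqrt{\sum_{k,l} f_{c}(k,l)^{2}}\,\sqrt{\sum_{k,l} T(k,l)^{2}}},\qquad \mathrm{ZNCC}(x,y)=\frac{\sum_{k,l}\bigl(f_{c}(k,l)-\bar{f}_{c}\bigr)\bigl(T(k,l)-\bar{T}\bigr)}{\sqrt{\sum_{k,l}\bigl(f_{c}(k,l)-\bar{f}_{c}\bigr)^{2}}\,\sqrt{\sum_{k,l}\bigl(T(k,l)-\bar{T}\bigr)^{2}}}, $$

where \(\bar{f}_{c}\) and \(\bar{T}\) denote the mean gray values of the candidate window and of the template, respectively.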

In many fast TM algorithms based on the NCC and ZNCC, researchers implement partial elimination to reduce the number of compared pixels at which the similarity between T and fc(x,y) is measured. For instance, Muramatsu et al. [35] sped up the application of the NCC as a measure of association in TM algorithms by shrinking the compared images in a way that retains their main features. This is done in [43] for NCC computations and in [44] for ZNCC computations by neglecting pixels whose correlation with T falls beyond a certain bound. Di Stefano et al. [44] established a fast ZNCC-based TM algorithm using bounded partial correlation, and Mattoccia et al. [33] implemented this technique for TM of multichannel images. However, as noted in [31], due to the non-monotonicity of the NCC or ZNCC over pixels, fast elimination techniques may not accelerate the search; in addition, the search may suffer premature termination at pixels that cannot compete with the best-match locations, cf. [31]. These issues are treated in [31] by creating a monotonic formulation over pixels; see also [30] for other elimination-based algorithms.

As indicated above, partial elimination methods have proved useful in many TM techniques, as in [33]. Lee and Chen [25] implemented the technique of [43], with the help of the box-filtering technique of [34], to derive a rotation-invariant TM technique. In [37], a chamfer TM method based on image segmentation is established. Super-pixels are utilized in [51], and a search based on comparing target signatures is implemented in [36] using the NCC. In [50] the authors enhanced computation on tensor cores by using low-level descriptions together with a local normalization. In [21] the TM procedure is accelerated using an integral image. See [3] for computing the NCC on 1-D feature vectors of images and [52] for computing the NCC using an addition-based criterion on 1-D vectors.

In the present work, novel approaches to fast TM and object detection algorithms are created. Instead of dealing with images in their gray or color representations, the algorithms measure similarity/dissimilarity between binarized images. If \({f_{c}^{r}}(x,y)\) and Tr(u,v) denote the r th-level bit-plane images of fc and T respectively, then we measure the association between fc and T by computing the association between \({f_{c}^{r}}\) and \(T^{r}\). If the image and template are in 8-bit gray scale, then \({f_{c}^{7}}, T^{7}\) and/or \({f_{c}^{6}}, T^{6}\) are compared, as they carry the major properties of the images. In this respect, we use the low-cost φ-correlation coefficient [48] instead of the computationally expensive NCC or ZNCC. Moreover, a suitably chosen Boolean circuit is implemented as the measure in a second TM algorithm. Both techniques are computationally fast and give accurate results compared to known relevant techniques. It is worthwhile to mention here that both the φ-correlation and Boolean functions have been efficiently implemented for defect detection in [4].

3 The proposed methods

This section presents two TM techniques, one based on the φ-coefficient and one based on Boolean circuits.

3.1 A φ-correlation TM algorithm

It is known that correlation criteria, such as the NCC and ZNCC, are effective tools in various pattern recognition applications, cf. [6, 43, 44]. Nevertheless, the implementation of the NCC or ZNCC is based on pixel-by-pixel comparisons and is consequently computationally expensive; fast TM and OD techniques are therefore required, see also [45]. From another point of view, the NCC and ZNCC measure association between data only when they are linearly correlated. To overcome these disadvantages, the first proposed technique of this paper implements the φ-correlation coefficient as a measure of similarity/dissimilarity.

The φ-correlation coefficient is a measure of association for 2 × 2 contingency tables; it measures association between data that are not necessarily linearly related. In addition, cf. [29], it remains reliable when the margins deviate strongly from each other. Using the φ-coefficient as the main measure of association remarkably reduces the matching time, as indicated below, while maintaining accuracy and robustness similar to state-of-the-art techniques. The main reason for the reduction in execution time of the matching procedure is that, instead of performing comparisons on the gray-level image f(x,y), comparisons are carried out on its bit-plane slices. Thus each pixel is represented by a single bit, 0 or 1, instead of n bits in the n-bit gray level. In the following, and without any loss of generality, n is taken to be 8.

Let f(x,y) be a given n-bit grayscale image. Then f(x,y) can be sliced into n bit-plane binary images fr(x,y), r = 0,…,n − 1, given by

$$ f^{r}(x,y)\equiv_{2} \lfloor f(x,y)/2^{r}\rfloor , r=0,1, \ldots, n-1. $$
(1)

It is well known [18] that f0(x,y) contains the lowest-order bits of the bit patterns comprising the pixels of f(x,y), while fn− 2(x,y) and fn− 1(x,y) contain the highest-order bits and preserve the major properties of f(x,y).
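As a minimal illustration only (not part of the authors' implementation), the slicing of Eq. (1) can be carried out with integer bit shifts; the helper name bit_planes and the random stand-in image below are hypothetical:

```python
import numpy as np

def bit_planes(f, n_bits=8):
    """Slice an n-bit grayscale image into its bit-plane binary images,
    following Eq. (1): f^r(x, y) = floor(f(x, y) / 2^r) mod 2."""
    f = np.asarray(f, dtype=np.uint16)
    return [((f >> r) & 1).astype(np.uint8) for r in range(n_bits)]  # index 0 = least significant plane

# The two most significant planes f^6 and f^7 carry the coarse image structure.
img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)  # stand-in for a real 8-bit image
planes = bit_planes(img)
f6, f7 = planes[6], planes[7]
```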

Now let us describe how the φ-correlation is used on the binary images to achieve the TM process. Let I(i,j) be the reference image under examination of size M × N pixels, T(k,l) be the target template of size m × n pixels, and I(x,y)(k,l) be the sub-image of I(i,j) of size m × n located at pixel coordinates (x,y) in I(i,j), where m < M and n < N. In real applications, we want to find the sub-image I(x,y)(k,l) of the original image I(i,j) that is closest to the given template T(k,l). To formalize the proposed algorithm in mathematical form, suppose that the reference I(i,j) is defined for 0 ≤ i ≤ M − 1, 0 ≤ j ≤ N − 1 and the template T(k,l) for 0 ≤ k ≤ m − 1, 0 ≤ l ≤ n − 1, where m,n are relatively small compared to M,N, respectively. We define I(x,y)(k,l) to be the m × n sub-image of I(i,j) given by

$$ I_{(x,y)}(k,l)=I(x+k,y+l) $$
(2)

where 0 ≤ k ≤ m − 1, 0 ≤ l ≤ n − 1. Thus, the domain of (x,y) over which the block image I(x,y)(k,l) is defined is given by

$$ 0\leq x \leq M-m, 0\leq y \leq N-n. $$
(3)

As is mentioned above, the r th-bit plane images of I,T,I(x,y) are denoted by \(I^{r},T^{r},I^{r}_{(x,y)}\) respectively, 0 ≤ r ≤ 7. Using (1) the r th-bit plane image of I(i,j) is obtained via

$$ I^{r}(i,j)\equiv_{2}\lfloor I(i,j)/2^{r} \rfloor ,r=0{\ldots} 7. $$
(4)

Likewise, \(T^{r}(k,l),I^{r}_{(x,y)}(k,l)\) are sliced.

The current TM algorithm is based on measuring the association between the binary images \(T^{r}(k,l),I^{r}_{(x,y)}(k,l)\) at each pixel (x,y) of the domain (3). Then we select the optimum (x0,y0) at which T(k,l) and I(x,y)(k,l) nearly coincide. The φ-correlation coefficients are computed between \(T^{7}(k,l)\) and \(I^{7}_{(x,y)}(k,l)\) for each pixel (x,y) of the domain (3). To this end, we form the contingency table with the similarity/dissimilarity values

$$ \lambda_{pq}(x,y)=\vert{\Lambda}_{pq}(x,y)\vert,0\leq p,q\leq 1, $$
(5)

i.e. λpq(x,y) are the cardinalities of the sets

$$ {\Lambda}_{pq}(x,y)=\lbrace (k,l): T^{7}(k,l)=p, I^{7}_{(x,y)}(k,l)=q \rbrace. $$
(6)

Then, we define the margins, cf. [48],

$$ \left.\begin{array}{cccc} \mu_{1}=\lambda_{00}+\lambda_{01}, \mu_{2}=\lambda_{10}+\lambda_{11},\\ \mu_{3}=\lambda_{00}+\lambda_{10}, \mu_{4}=\lambda_{01}+\lambda_{11}. \end{array}\right\} $$
(7)

Thus, we form the contingency table, Table 2.

Table 2 Contingency table of \(T^{7},I^{7}_{(x,y)}\)

The rest of the technique is based on computing the φ-correlation coefficient between \(T^{7},I^{7}_{(x,y)}\), which is

$$ \varphi(x,y)=\varphi(T^{7},I^{7}_{(x,y)})=\frac{\lambda_{00}\lambda_{11}-\lambda_{01}\lambda_{10}}{\sqrt{\mu_{1}\mu_{2}\mu_{3}\mu_{4}}}, $$
(8)

where none of the marginal numbers is zero.
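A minimal NumPy sketch of Eqs. (5)–(8) is given below (the function name phi_coefficient and the use of NumPy are our choices, not the paper's); the fallback for degenerate margins anticipates Eq. (9), given later in this subsection:

```python
import numpy as np

def phi_coefficient(T7, I7):
    """phi-correlation of Eq. (8) between two equal-size binary patches,
    built from the 2x2 contingency table of Eqs. (5)-(7).  The degenerate
    case mu1*mu2*mu3*mu4 = 0 falls back to Eq. (9)."""
    T7 = np.asarray(T7, dtype=bool)
    I7 = np.asarray(I7, dtype=bool)
    lam00 = int(np.sum(~T7 & ~I7))           # lambda_00: T = 0, I = 0
    lam01 = int(np.sum(~T7 &  I7))           # lambda_01: T = 0, I = 1
    lam10 = int(np.sum( T7 & ~I7))           # lambda_10: T = 1, I = 0
    lam11 = int(np.sum( T7 &  I7))           # lambda_11: T = 1, I = 1
    mu1, mu2 = lam00 + lam01, lam10 + lam11  # margins, Eq. (7)
    mu3, mu4 = lam00 + lam10, lam01 + lam11
    denom = float(mu1) * mu2 * mu3 * mu4
    if denom == 0:                           # degenerate margin: Eq. (9)
        mn = T7.size
        if mu1 == 0:
            return (lam11 - lam01) / mn
        if mu2 == 0:
            return (lam00 - lam10) / mn
        if mu3 == 0:
            return (lam11 - lam10) / mn
        return (lam00 - lam01) / mn
    return (lam00 * lam11 - lam01 * lam10) / np.sqrt(denom)
```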

To give a sense of the φ-correlation values in the proposed method, an example from dataset1 is displayed in Fig. 1. Figure 1(a) and (b) show a 10 × 10 patch in a printed circuit board (PCB) reference image and the corresponding template sub-image, respectively; Fig. 1(c) and (d) present the grayscale values of the two 10 × 10 image patches A1 and B1; and Fig. 1(e) and (f) show the binary values of patches A1 and B1, respectively, extracted from the 7th bit-plane slice. Finally, Fig. 1(g) lists the φ-correlation coefficients between the template image in Fig. 1(b) and 100 sub-images of the reference image in Fig. 1(a); the number typed in boldface is the maximum φ-correlation value. See Section 4 for the introduction of the different datasets used in this paper.

Fig. 1
figure 1

Illustrating φ-correlation values for patches from a PCB image taken from dataset1: (a) 10 × 10 image patch A1 in the reference image. (b) 10 × 10 image patch B1 in the template image. (c) Intensity values of A1. (d) Intensity values of B1. (e) Binary version of A1. (f) Binary version of B1. (g) φ-correlation values between the template in (b) and 100 sub-images of the reference image in (a)

In (8), the φ-coefficient overcomes the problem encountered by percentage statistics when the margins are extremely scattered. As with the NCC, \(\varphi (I^{r}_{(x,y)},T^{r})=1\) expresses the strongest correlation between \(I^{r}_{(x,y)}\) and Tr, \(\varphi (I^{r}_{(x,y)},T^{r})=-1\) represents the weakest similarity (i.e. perfect negative association) between \(I^{r}_{(x,y)}\) and Tr, and a zero value allows no conclusion in this sense. It is known, cf. [29], that exact correlation is attained if λ00 = λ11 = 0 or λ01 = λ10 = 0, or if μ1 = μ3 or μ2 = μ4. If μ1μ2μ3μ4 = 0, we redefine \(\varphi (T^{7},I^{7}_{(x,y)})\) as follows:

$$ \varphi(T^{7},I^{7}_{(x,y)})=\frac{1}{m\times n} \begin{cases} \lambda_{11}-\lambda_{01}, & \text{if } \mu_{1}=0, \\ \lambda_{00}-\lambda_{10}, & \text{if } \mu_{2}=0, \\ \lambda_{11}-\lambda_{10}, & \text{if } \mu_{3}=0, \\ \lambda_{00}-\lambda_{01}, & \text{if } \mu_{4}=0. \end{cases} $$
(9)

This definition keeps φ between −1 and 1. We also notice that negative and positive associations are preserved. For instance, if μ1 = 0, then λ00 = λ01 = 0, and the similarity depends on the difference λ11 − λ01, i.e. there is a positive correlation when λ11 > λ01, a negative correlation when λ01 > λ11, and no correlation when λ11 = λ01. The other cases are similar. The larger the value of \(\varphi (T^{7},I^{7}_{(x,y)})\), the more similar the template T(k,l) and the sub-image I(x,y)(k,l) at position (x,y) are. When \(\varphi (T^{7},I^{7}_{(x,y)})\) is very close to 1, the best matching is obtained. Therefore the template matched in I(i,j) is \(I_{(x_{0},y_{0})}(i,j)\), where

$$ (x_{0},y_{0})=\underset{(x,y)}{\arg\max}\; \varphi(x,y) $$
(10)

The pseudocode of the proposed technique is summarized in Algorithm 1, and Fig. 2 gives a sketch of the technique.

Algorithm 1
figure a

The proposed φ-correlation TM algorithm.
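Purely as an illustrative sketch (the authoritative description is the pseudocode of Algorithm 1 above), the exhaustive search of Eq. (10) could be written in Python as follows, reusing the hypothetical phi_coefficient helper sketched earlier; the function and variable names are ours:

```python
import numpy as np

def phi_template_match(I, T, r=7):
    """Exhaustive phi-correlation TM over the domain of Eq. (3).
    I: 8-bit grayscale reference image, T: template; returns the (x0, y0) of Eq. (10)."""
    Ir = (np.asarray(I, dtype=np.uint8) >> r) & 1   # r-th bit plane of the reference, Eq. (4)
    Tr = (np.asarray(T, dtype=np.uint8) >> r) & 1   # r-th bit plane of the template
    M, N = Ir.shape
    m, n = Tr.shape
    best_phi, best_pos = -np.inf, (0, 0)
    for x in range(M - m + 1):                      # 0 <= x <= M - m, Eq. (3)
        for y in range(N - n + 1):                  # 0 <= y <= N - n
            phi = phi_coefficient(Tr, Ir[x:x + m, y:y + n])
            if phi > best_phi:
                best_phi, best_pos = phi, (x, y)
    return best_pos, best_phi
```

In this sketch each candidate window needs only four co-occurrence counts on single-bit data, which is where the speed advantage over gray-level correlation comes from.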

Fig. 2
figure 2

Sketch for the φ-correlation TM technique of Algorithm 1

3.2 TM via Boolean functions

In this subsection we propose another novel TM-OD technique, based on Boolean functions. Boolean functions are ubiquitous in signal and image processing, where they model the logical operations performed by computers on digital signals, and they can be used with binary images to solve the TM problem. Recall that for two binary images η(i,j) and 𝜗(i,j), both of size m × n, the exclusive OR (XOR) between η and 𝜗 is also a binary image, given by:

$$ \left( \eta \oplus \vartheta \right)_{(i,j)}= \begin{cases} 1, & \eta(i,j)\neq \vartheta(i,j), \\ 0, & \eta(i,j) = \vartheta(i,j). \end{cases} $$
(11)

Using the XOR circuit (11), the two binary images η and 𝜗 are matched if the number of zeros of η ⊕ 𝜗 is very close to m × n. As indicated above, bit-plane slicing is a vital technique for obtaining a sequence of binary images from a grayscale or color image; the number of these binary images depends on the length of the bit pattern representing each pixel of the original image. The first binary images in the bit-plane sequence consist of the lowest-order bits of each pixel's gray value; these bits have the least effect on the magnitude of the gray value, so we call the corresponding binary images the least significant images of the bit-plane sequence. Conversely, the last binary images in the sequence consist of the highest-order bits, which have the greatest effect on the magnitude of the gray value; we call them the most significant binary images. On a clean dataset, the Boolean function in (11) can be applied directly to either the least or the most significant binary images to solve the TM problem. However, the simple Boolean function in (11) applied to the least significant binary images is too weak to find the template in a noisy reference image. For these reasons, we propose a more effective Boolean function and choose the most significant binary images to solve the TM problem for noisy images.

Instead of working on the least significant binary images, we choose the most significant binary images I6 and I7 from the reference image and T6 and T7 from the template image. A robust similarity measure ψ is then built using the logical OR and XOR functions and applied to I6, I7, T6, and T7 as follows:

$$ \psi(T,I_{(x,y)})_{(k,l)}=\left[(T^{7}\oplus I^{7}_{(x,y)})\vee (T^{6}\oplus I^{6}_{(x,y)})\right]_{(k,l)}, $$
(12)

0 ≤ k ≤ m − 1, 0 ≤ l ≤ n − 1. The output of ψ at a location is 1 if either \((T^{7}\oplus I^{7}_{(x,y)})\) or \((T^{6}\oplus I^{6}_{(x,y)})\) contains a non-zero element at that location; otherwise it is 0. The sub-images \(I^{7}_{(x,y)}\) and T7 are matched if \(T^{7}\oplus I^{7}_{(x,y)}\) is zero at every pixel of the domain, and likewise for \(I^{6}_{(x,y)}\) and T6. Therefore, the number of zeros in ψ indicates the degree of similarity between the template and a selected sub-image of the reference image. Now, for each (x,y) of the domain (3), we compute the cardinal numbers

$$ \sigma(x,y) =\vert \lbrace (k,l): \psi (T,I_{(x,y)})_{(k,l)}=0 \rbrace \vert. $$
(13)

Notice that 0 ≤ σ(x,y) ≤ mn, and the best possible match occurs when σ(x0,y0) = mn, since ψ(T,I(x,y))(k,l) = 0 if and only if T7, T6 and their corresponding \(I^{7}_{(x,y)},I^{6}_{(x,y)}\) take the same value at (k,l). The sub-image \(I_{(x_{0},y_{0})}\) is the best match if

$$ (x_{0},y_{0})=\underset{(x,y)}{\arg\max}\; \sigma(x,y). $$
(14)

The sketch and pseudocode of the Boolean template matching are outlined in Fig. 3 and Algorithm 2.

Algorithm 2
figure b

The Boolean TM algorithm.
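For illustration only, a minimal Python sketch of Algorithm 2 under the same assumptions as before (8-bit grayscale inputs; the helper name boolean_template_match is ours, not the paper's):

```python
import numpy as np

def boolean_template_match(I, T):
    """Boolean-circuit TM (sketch of Algorithm 2): sigma(x, y) of Eq. (13) counts the
    zeros of psi = (T^7 XOR I^7) OR (T^6 XOR I^6), Eq. (12); the best match maximizes sigma."""
    I = np.asarray(I, dtype=np.uint8)
    T = np.asarray(T, dtype=np.uint8)
    I7, I6 = (I >> 7) & 1, (I >> 6) & 1    # two most significant bit planes of the reference
    T7, T6 = (T >> 7) & 1, (T >> 6) & 1    # ... and of the template
    M, N = I.shape
    m, n = T.shape
    best_sigma, best_pos = -1, (0, 0)
    for x in range(M - m + 1):
        for y in range(N - n + 1):
            psi = (T7 ^ I7[x:x + m, y:y + n]) | (T6 ^ I6[x:x + m, y:y + n])  # Eq. (12)
            sigma = m * n - int(psi.sum())                                   # zeros of psi, Eq. (13)
            if sigma > best_sigma:
                best_sigma, best_pos = sigma, (x, y)
    return best_pos, best_sigma                                              # Eq. (14)
```

Because ψ involves only bitwise XOR/OR operations and a count per window, the inner loop avoids floating-point arithmetic altogether, which is consistent with the speed advantage reported in Section 4.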

Fig. 3
figure 3

The Boolean circuit ψ(k,l)

It is worthwhile to mention that the application of both techniques, i.e. Algorithm 1 and Algorithm 2, does not hinge entirely on slicing images into their bit-plane binary images; they are also applicable if any thresholding technique is used to create binary images from color or grayscale images.

4 Performance and comparison analysis

This section investigates and assesses the performance of the algorithms introduced in the previous section. Several experiments are carried out to investigate the accuracy of the proposed methods, and we also discuss their robustness in the presence of real and artificial noise. A detailed comparison with four different TM and object detection techniques is also carried out. Before discussing the performance, robustness, and comparison analysis, we define the datasets, the compared methods, and the machine used in these experiments.

The performance of the two proposed methods introduced above is assessed and compared against two classic methods, namely the NCC and the ZNCC, and two methods introduced by Yoo et al. [52] and Xia et al. [49], hereafter called Yoo2010 and Xia2019, respectively. Together with the proposed algorithms, all six methods are run on two different datasets. The first was prepared by Mattoccia et al. [32], which we call dataset1, and the second is due to Xia et al. [49], which we call dataset2. Experiments are carried out using Matlab 2016 on a laptop with an Intel Pentium Core2 Duo 3.00 GHz processor.

As the detailed experiments below indicate, the proposed systems retain accuracy because they measure the association between the template and candidate windows in the source images using a Boolean function and the φ-correlation, both of which are robust against noise; this is because we use the two highest bit-plane slices, which preserve the major properties of the original image. The performance analysis presented here also shows that the proposed techniques are very fast compared to standard relevant methods. The main reason for the reduction in execution time of the matching procedure is that, instead of performing comparisons on the gray-level images f(x,y), comparisons are carried out on the compact bit-plane slices extracted from f(x,y). Thus each pixel is represented by a single bit, 0 or 1, instead of n bits in the n-bit gray level.

4.1 Performance in presence of real noise

To validate the robustness of the proposed algorithms against real noise, we have chosen dataset1, which exhibits real distortions typically occurring in TM applications, such as changes in viewpoint and camera noise. Dataset1 consists of seven grayscale source images and twelve templates of various sizes, as shown in Table 3. Source images and templates are listed with their sizes together with the top-left corner (x0,y0) of the template in the source image and the values φ(x0,y0), σ(x0,y0). The results indicate that the proposed techniques match the templates accurately. It is noted that φ(x0,y0) and σ(x0,y0) are not always close to 1 and to the size of the template, respectively; this is because of the real noise in the images. Figures 4, 5 and 6 exhibit the performance of the proposed algorithms visually on the Board, Wafer, and Catalonia images, respectively.

Table 3 Template matching results for the proposed algorithms applied on dataset1
Fig. 4
figure 4

The performance of the proposed techniques on the Board image: (a) Test Board image. (b) Test templates board2 and board3 (top to bottom). (c) Matching of board2 by φ-correlation, matching of board3 by Binary circuit

Fig. 5
figure 5

The performance of the proposed techniques on the Wafer image: (a) Test Wafer image. (b) Test templates wafer1 and wafer3 (top to bottom). (c) Matching of wafer1 by φ-correlation, matching of wafer3 by Binary circuit

Fig. 6
figure 6

The performance of the proposed techniques on the Catalonia image: (a) Test Catalonia image. (b) Test templates cata1 and cata2 (top to bottom). (c) Matching of cata1 by φ-correlation, matching of cata2 by Binary circuit

Table 4 displays the execution time required, in seconds, to search for each template in its corresponding source image when the six methods are applied to the dataset1 of Table 3. The experiments show that Yoo2010 failed to find the correct match for the ringo template in both source images ringo1 and ringo2; it also failed to find the correct match for wafer1 and wafer3 in the wafer image. On the other hand, the NCC, ZNCC, Xia2019 and the proposed φ-correlation and Binary circuit methods correctly locate the optimal solution for every template in its corresponding source image.

Table 4 Execution time (in seconds) for the six methods applied on dataset1

All experimental results on dataset1 show a significant improvement in efficiency (accuracy and speed) of the proposed Binary circuit algorithm compared to the NCC, ZNCC, Xia2019 and Yoo2010 methods. It is noted that the φ-correlation method comes third in terms of time cost, i.e. after the Binary circuit and NCC methods. In addition, the accuracy of the proposed algorithms is identical to that of the standard ZNCC, but our methods outperform ZNCC, Yoo2010, and Xia2019 in terms of the running time needed to find the optimal solutions in the presence of real noise. From the above results, we can see that the proposed φ-correlation and Boolean circuit algorithms are robust against real distortion, while Yoo2010 remains sensitive to it.

4.2 Performance in the presence of artificial noise

In this subsection we present additional experimental results to check the performance of the proposed φ-correlation and Boolean circuit algorithms when artificial noise is imposed and larger templates are used. For this purpose, three different templates Cata1, Cata2, and Cata3, of sizes 162 × 118, 152 × 128, and 160 × 164, respectively, are manually selected from the Catalonia image of dataset1. These three templates are then matched in the source Catalonia image after it has been contaminated by salt-and-pepper noise, used as outliers, with an outlier ratio of 0.15. Figure 7 visualizes the Catalonia image with salt-and-pepper noise at an outlier ratio of 0.15. The templates have been located accurately using the proposed methods.

Fig. 7
figure 7

Results on artificial noise: (a) Matching results of the proposed methods applied to the Catalonia image after contamination with salt-and-pepper noise at an outlier ratio of 0.15. (b) Test templates cata1, cata2, and cata3 (top to bottom)

The execution times of the six methods applied to the noisy Catalonia image are shown in Table 5. In the artificial noise case, the top-left corner coordinates found for Cata1, Cata2, and Cata3 by the six methods are (725,167), (434,152), and (258,134), respectively. These coordinates agree with the results in Table 3, which shows that the six methods can detect the correct locations of the templates in the presence of artificial noise; all six pattern matching algorithms are thus robust against this noise. The proposed Binary circuit algorithm comes out as the fastest, with and without artificial noise, for the three templates. Table 5 reveals that the execution time increases after imposing the salt-and-pepper noise: when the noise is added at 15%, the average running times of NCC, ZNCC, Yoo2010, Xia2019, φ-correlation, and Boolean circuit increase by 1.25%, 2.99%, 2.37%, 3.36%, 11.78%, and 3.99%, respectively. It can be readily seen that, owing to their computational advantages with larger templates, the proposed techniques generally run notably faster than the other four methods.

Table 5 Execution time (in seconds) with an artificial noise imposed to Catalonia image

4.3 Performance in unconstrained environments

In this subsection, we investigate the robustness of the proposed algorithms on dataset2, which contains more complicated noise. In these experiments the template is extracted from the first image and is used to find the best matching location in a sequence of five further images of the same scene subject to different disturbances such as partial occlusions, geometric deformation, illumination variation and change of background. A bounding box of a target object has been determined manually within each test image. A total of 210 (35 × 6) images are generated from 35 color videos previously used to evaluate the performance of the best-buddies similarity (BBS) template matching algorithm of [49]. Since these images are in color, they are converted to grayscale to extract the bit-plane binary images. The bounding box in the initial frame of each video sequence is used to define the template. The proposed template matching algorithms are applied to calculate the similarity between the template and every position in the other five frames of the video sequence. The single position with the highest similarity is taken as the estimated position of the target. A bounding box of the same size as the one in the initial frame is placed around this estimated position, and the overlap between it and the ground-truth bounding box is computed and used as a measure of accuracy. This overlap is calculated as the intersection over union of the two bounding boxes.

Template matching results for some examples from dataset2 are shown in Figs. 8, 9, 10, 11, 12 and 13. The image in the top row is the initial frame of the video sequence, in which the template is marked by a green rectangle. The red box in the remaining rows shows the location of the object estimated by the different algorithms. We can see that the accuracy of NCC and Yoo2010 is low, see for example the second and fourth rows of Figs. 8–11; the accuracy is moderate for ZNCC, Xia2019, and φ-correlation, see for example the third, fifth, and sixth rows of Figs. 9–13; but the accuracy is high for the Binary circuit algorithm, see the bottom row of Figs. 8–13. The accuracy results across all 210 images of dataset2 are averaged, summarized and depicted in Fig. 14. This graph shows the average proportion of overlap between the ground-truth box and the box estimated by each algorithm. It is seen that the accuracy rate of the Binary circuit algorithm exceeds that of the other algorithms, followed by the Xia2019, φ-correlation, and ZNCC algorithms, while NCC and Yoo2010 have the lowest accuracy rates. It is worthwhile to point out that the performance of the φ-correlation is similar to that of the ZNCC. This is expected, since both methods compute the correlation of intensity values between the template and each image patch based on mean and variance values. Having said that, we would like to mention that the φ-correlation technique is implemented on one bit-plane slice, not two as in the case of the Boolean function algorithm; the accuracy of the φ-correlation method is expected to be enhanced if it is implemented on two bit-plane slices.

Fig. 8
figure 8

Template matching results for different algorithms when applied to find corresponding locations in five different Olympic video frames. The first row shows the target template (outlined in green) in the initial frame of the video sequence. The second to seventh rows show the location of the target (outlined in red) identified by NCC, ZNCC, Yoo2010, Xia2019, φ-correlation, and Binary circuit, respectively

Fig. 9
figure 9

Template matching results for different algorithms when applied to find corresponding locations in five different Subway video frames. The first row shows the target template (outlined in green) in the initial frame of the video sequence. The second to seventh rows show the location of the target (outlined in red) identified by NCC, ZNCC, Yoo2010, Xia2019, φ-correlation, and Binary circuit, respectively

Fig. 10
figure 10

Template matching results for different algorithms when applied to find corresponding locations in five different Couple video frames. The first row shows the target template (outlined in green) in the initial frame of the video sequence. The second to seventh rows show the location of the target (outlined in red) identified by NCC, ZNCC, Yoo2010, Xia2019, φ-correlation, and Binary circuit, respectively

Fig. 11
figure 11

Template matching results for different algorithms when applied to find corresponding locations in five different David video frames. The first row shows the target template (outlined in green) in the initial frame of the video sequence. The second to seventh rows show the location of the target (outlined in red) identified by NCC, ZNCC, Yoo2010, Xia2019, φ-correlation, and Binary circuit, respectively

Fig. 12
figure 12

Template matching results for different algorithms when applied to find corresponding locations in five different Face video frames. The first row shows the target template (outlined in green) in the initial frame of the video sequence. The second to seventh rows show the location of the target (outlined in red) identified by NCC, ZNCC, Yoo2010, Xia2019, φ-correlation, and Binary circuit, respectively

Fig. 13
figure 13

Template matching results for different algorithms when applied to find corresponding locations in five different Jogging video frames. The first row shows the target template (outlined in green) in the initial frame of the video sequence. The second to seventh rows show the location of the target (outlined in red) identified by NCC, ZNCC, Yoo2010, Xia2019, φ-correlation, and Binary circuit, respectively

Fig. 14
figure 14

The average accuracy of the different algorithms when applied to find corresponding locations in 210 color video frames. Each bar shows the average overlap proportion between the ground-truth and estimated bounding boxes for the specified algorithm

To evaluate time efficiency, many experiments have been carried out on templates of different sizes extracted from the images of dataset2, with template sizes varying from 71 × 31 to 151 × 111. The total running times are presented in Table 6. It can be seen from this table that the proposed φ-correlation method is faster than the ZNCC, Yoo2010, and Xia2019 methods, and the proposed Binary circuit algorithm is substantially faster than all the other methods. As expected, the computational benefits increase when binary bit-plane slices are used, since Boolean functions on binary images are faster than the traditional techniques. It is also noted that the proposed Binary circuit algorithm is faster than the compared algorithms for all template and reference image sizes, and its relative advantage grows with larger source images, because with larger images the computational overhead of the NCC and ZNCC becomes more expensive.

Table 6 Comparison results on the total running time (sec) for the six methods studied

4.4 Object detection application

This subsection is devoted to the last experimental results of this paper, concerning the efficiency of the proposed φ-correlation and Boolean circuit algorithms in an object detection application. For this purpose, fifteen image pairs are randomly taken from the real-world dataset2 previously used by [49] to evaluate accuracy and running time. Each image pair consists of a reference image and a target image. The target images are affected by real disturbances such as partial occlusions, geometric deformation, illumination variation and change of background, occurring in one object of the reference image; this object is considered the object of interest. Samples of reference and target images from this set are shown in Fig. 15.

Fig. 15
figure 15

Sample of ten pairs from dataset2: (a) and (c) Reference images, with the object of interest marked by a green box. (b) and (d) Target images, with the ground truth for (a) and (c), respectively, marked by a red box

A ground-truth bounding box is determined manually in the target image to compute the accuracy. The ground-truth bounding box Bg and the bounding box Be estimated by the method under evaluation are used to compute the accuracy of the method as:

$$ \text{Accuracy}= \frac{\text{area}\left( B_{g}\cap B_{e}\right)}{\text{area}\left( B_{g} \cup B_{e}\right)}. $$
(15)
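For concreteness, a small Python sketch of Eq. (15), assuming boxes are given as (x, y, width, height) tuples with a top-left origin (a representation chosen here only for illustration):

```python
def overlap_accuracy(box_g, box_e):
    """Intersection over union of Eq. (15); boxes are (x, y, w, h) tuples."""
    xg, yg, wg, hg = box_g
    xe, ye, we, he = box_e
    iw = max(0, min(xg + wg, xe + we) - max(xg, xe))   # width of the intersection
    ih = max(0, min(yg + hg, ye + he) - max(yg, ye))   # height of the intersection
    inter = iw * ih
    union = wg * hg + we * he - inter
    return inter / union if union > 0 else 0.0
```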

The proposed methods were compared with the four TM methods NCC, ZNCC, Yoo2010 and Xia2019. Figures 16 and 17 demonstrate the detection results of these methods applied to the selected sample from dataset2. It is found from Fig. 16 that the proposed methods and Xia2019 detect the correct position of the Coke object in the Coke images in the first column, while the other methods fail to detect the object. For the lemming images in the second and third columns, NCC and ZNCC fail to detect the lemming object, while the other methods detect it with overlap percentages ranging from 10% to 90%. The liquor object is detected well by all methods in the images in the last column, but only the Binary circuit and Xia2019 methods detect its correct position in the images in the fourth column. Overall, the Boolean circuit method is the only one of the compared methods that successfully matches the object of interest in all cases.

Fig. 16
figure 16

Object detection results marked by black box for the six methods: (a) Target images marked ground truth by red box, (b) NCC, (c) ZNCC, (d) Yoo2010, (e) Xia2019, (f) φ-correlation, (g) Binary circuit

Fig. 17
figure 17

Additional object detection results marked by black box for the six methods: (a) Target images marked ground truth by red box, (b) NCC, (c) ZNCC, (d) Yoo2010, (e) Xia2019, (f) φ-correlation, (g) Binary circuit

Table 7 lists the average time consumed and the average accuracy of the six methods applied to the fifteen randomly picked pairs from dataset2 to detect the objects of interest. As can be seen, the Boolean circuit method achieves the highest accuracy (84.08), dominating the other compared methods. The Boolean function together with the most significant binary images produces distinct and well-localized confidence modes for detecting the object of interest. For the other methods, whose modes are not well localized and for which the difference in confidence between the correct location and the estimated locations is relatively large, we expect a more rapid drop in accuracy. Xia2019 consumes excessive time because it depends on the best-buddies similarity measure, which computes the nearest neighbor between every pixel in the template and all pixels in every possible window of the target image, and vice versa. On the other hand, the proposed Binary circuit method outperforms the other methods in terms of running time because it is based on Boolean functions, which operate very efficiently on binary images.

Table 7 The comparisons between the proposed methods and four other methods in terms of the time cost and accuracy on test images from dataset2

5 Conclusions

Two fast, efficient and robust TM algorithms are introduced, tested and compared with two classic methods, the NCC and ZNCC, and two recent schemes, the Yoo2010 and Xia2019 methods. The algorithms are executed on whole images, pixel by pixel, without any partial elimination techniques. However, the proposed algorithms operate on lower representations of the grayscale and color values of both images and templates, namely the highest-order bit-plane images, which are binary. The approaches' novelty lies not only in the use of these binary representations, but also in the use of the φ-correlation coefficient, which is rarely used in TM techniques. Boolean circuits are also implemented, giving the fastest results among the compared methods. Both methods are fast and accurate, and their robustness is tested against real and artificial noise. Partial elimination techniques adapted to both methods are expected to accelerate them further, once they are worked out. It is worth noting that the use of Boolean circuits is a very promising approach, as it improves accuracy and robustness and shortens the running time. The φ-correlation coefficient is faster than the ZNCC and comparable to the NCC. The use of measures of similarity/dissimilarity other than the NCC and ZNCC has recently been recognized as an efficient tool.