1 Introduction

Diverse biometric methods are used for authentication in security-critical systems [59, 61], and facial image classification/recognition is one of the most widely used among them [44]. Image classification is an influential task in machine learning [52,53,54, 68] and computer vision, and it has attracted broad attention from researchers worldwide [3]. Image processing methods have been widely applied to facial images for authentication [43]. Face recognition methods are used in a wide range of areas, for instance military, social media, mobile platforms and urbanism [18, 34, 44, 50]. Different machine learning methods have been utilized for face recognition, and the most important families of face recognition methods are listed as follows:

  • Deep learning-based face recognition [85]

  • Geometrical face recognition [73]

  • 3D face recognition [1]

  • Local pattern-based face recognition [12, 74, 82]

Local pattern-based face recognition methods perform recognition using salient facial features, and high success rates have been reported for them in the literature [8]. These methods define a special pattern and scan the texture image with it. The texture image is then classified using the features obtained by this scan [37, 60]. Facial image recognition is one of the most widely used biometric methods in the literature, and proposing an effective hand-crafted recognition method remains one of its crucial problems.

Face recognition methods are widely used in biometric and web-based applications, and many mobile phones also support face recognition. Deep learning methods achieve high success rates, but they have high computational costs and need more data. The main aim of this research is to propose a lightweight, highly accurate and effective face recognition method. The proposed method is inspired by deep learning networks and tries to address the high computational cost of deep learning-based face recognition. Therefore, the presented model combines lightweight and effective components. Perceptual hash methods have generally been used in information security, especially in image authentication. The main goal of a perceptual hash is to extract salient features that represent an image. In this view, these methods can be utilized as a preprocessor and feature generator. Therefore, a novel perceptual hash method is proposed to perform preprocessing and feature extraction. Within this perceptual hash, a novel local structure called the quintet-triple binary pattern (QTBP) is presented. QTBP is an LBP-like [47, 48] local feature extractor, so the presented method is a local feature extractor-based, perceptual hash-based face recognition method. The main goal of such descriptors is to generate globally informative features from local relationships. The presented method consists of feature extraction using the perceptual hash, followed by classification, and it achieves high success rates on 5 well-known facial image datasets. The main purpose of this work is to provide robust face recognition with high classification accuracy.
Graph theory [5, 10] is also used to define variable patterns that extract informative features. QTBP uses two basic shapes (a pentagon and a triangle) as a Hamilton graph [65] and builds its pattern from these shapes. QTBP indicates that valuable features can be generated by using basic shapes. Moreover, both DWT and SVD are used for salient feature extraction. The technical contributions of this paper are given below.

  • A novel perceptual hash is presented and utilized for preprocessing and feature generation. To obtain numerical results for the proposed perceptual hash-based face recognition method, 5 face image databases are chosen, and 10 widely used descriptors are used for comparison. The results show that the perceptual hash-based method has high facial image classification accuracy.

  • QTBP is a novel graph-based local descriptor. Its main aim is to obtain informative textural features from facial images. The literature contains many Hamilton graph-based, LBP-like descriptors, which are called local graph structures (LGS). To extract discriminative and informative features, we present a new LGS in this paper. This research aims to show the strength of a novel LGS and that such LGSs can be built from basic shapes. Therefore, QTBP is presented.

  • Moreover, a lightweight face recognition method based on perceptual hashing is proposed to obtain high classification accuracy on small facial image datasets, whereas deep learning methods perform well on big datasets. The main aim of the proposed facial recognition method is to reach high classification accuracies. The obtained results support the efficacy of this hand-crafted method.

The remainder of this work is organized as follows. Section 2 reviews related work, the databases are described in Section 3, the presented local feature generation function (QTBP) is given in Section 4, details of the proposed perceptual hash-based face classification method are given in Section 5, results are given and discussed in Section 6, and Section 7 presents conclusions and future work.

2 Related work

Some existing methods applied to the face recognition domain are reviewed here. Zhou et al. [84] proposed a Huffman and Local Binary Pattern (LBP) based face categorization method. In this work, interest points were detected first. Using these interest points, face images were detected, and the proposed LBP-Huffman based feature extraction method was applied to the face images. Ou et al. [49] presented a robust discriminative nonnegative-based face recognition method, in which a regression matrix and a nonnegative matrix were used to classify face images. Liang et al. [35] suggested a 3D face recognition method using half face matching. Lv et al. [40] presented a latent face image recognition method to classify 3D, sketch, low-resolution and high-resolution face images, with a Bayesian method used for classification. Wang et al. [77] proposed adaptive SVD and 2D DWT (2-dimensional discrete wavelet transform) based skin detection and face recognition methods. Tang et al. [70] suggested a face recognition method using fractal codes. In their study, a short execution time was achieved by using fractal codes, and the authors claimed that the method can be used in real-world applications. Vazquez-Fernandez and Gonzalez-Jimenez [72] discussed the importance of facial recognition on mobile platforms, where it is generally used for authentication. However, the authors noted that biometric data protection is important because face and fingerprint templates are often stolen, and that the confidentiality of biometric data should be ensured. Jain et al. [23] used a hybrid deep neural network for facial expression recognition, combining a CNN (Convolutional Neural Network) and an RNN (Recurrent Neural Network).

Rakshit et al. [58] presented the 6 local graph structures (LGS) for face recognition, which were vertical, vertical-symmetric, zigzag horizontal, zigzag horizontal middle, zigzag vertical, zigzag vertical middle and logically extended LGSs. The KNN (K-nearest neighbor) classification method was tested on 5 face databases. In another study, Zhou et al. [86] proposed a method based on LBP, enhanced with improved pairwise-constrained multiple metric learning. A nearest neighbor classifier was chosen at the classification stage. Ding et al. [12] proposed the Dual Cross Pattern (DCP) to extract textural face features with high classification accuracy. DCP uses overlapping blocks of size 5 × 5 and extracts 512-dimensional features. Vishwakarma and Dalal [74] presented a robust method for face recognition that operates on the Discrete Cosine Transform coefficients of the image. Six face datasets were used in their study. Chakraborty et al. [8] suggested the local quadruple pattern (LQPAT) for facial image retrieval and recognition. In this method, a local structure is proposed for textural feature extraction, and LQPAT uses overlapping blocks of size 4 × 4 to extract 512-dimensional features. Liu et al. [37] presented an extended LBP version to achieve high classification accuracy. Peng et al. [51] suggested a method based on LBP and ensemble learning for face recognition. Their method was applied to face presentation attack detection (using three different face datasets) and had a suitable time cost.

3 Databases

To obtain numerical results for the proposed facial recognition method, 5 well-known facial image databases are used: AT&T, Face94, CIE, AR and LFW. These are labeled face image databases [13, 51, 62, 78], and their properties are described below.

3.1 AT&T

The AT&T dataset consists of 300 images of 30 people. The face images are gray level with a size of 92 × 112. Several sample images from the AT&T dataset are shown in Fig. 1 [62].

Fig. 1 Sample images of AT&T database

3.2 Face94

The Face94 dataset contains 300 images of 30 people. The face images are in color with a size of 180 × 200. Some random face images from the Face94 dataset are shown in Fig. 2 [32].

Fig. 2 Example images of Face94 database

3.3 CIE

Like the Face94 dataset, the CIE dataset consists of 300 images of 30 people. The face images are in color with a size of 2048 × 1536. Some random face images from the CIE dataset are shown in Fig. 3 [24, 63].

Fig. 3 Example images of CIE database

3.4 AR

The AR dataset contains 310 face images of 31 people. These images are in color with a size of 768 × 536. Some random face images from the AR dataset are shown in Fig. 4 [41, 42].

Fig. 4 Example face images of AR database

3.5 Labeled Faces in the Wild (LFW)

The Labeled Faces in the Wild (LFW) dataset is a widely used benchmark, and many methods have been evaluated on it. It is a heterogeneous dataset that contains more than 13,000 images, and 1680 of the pictured people have two or more images. Some random face images from the LFW dataset are shown in Fig. 5 [28].

Fig. 5 Example face images of LFW database

4 The proposed graph-based quintet-triple binary pattern

Textural descriptors are very effective for face classification because they extract valuable features from a face image. Therefore, textural image descriptors have been widely used for face recognition. However, one of the most important problems is the selection of an effective pattern. To address this problem, we use a graph-based pattern built from two basic shapes, a triangle and a pentagon. The main goal of the proposed graph-based local image descriptor is to generate valuable features from face images. This pattern is inspired by shape-based (rate-based) facial image recognition. Therefore, we present a hypothesis, which is to use basic shape rates as a pattern, as described below.

Finding the optimum pattern for face recognition is a nondeterministic polynomial (NP) problem. Therefore, researchers can propose a nature-inspired pattern such as the one shown in Fig. 6. In this view, we present a nature-inspired pattern to achieve a high performance rate. One of the fundamental aims of this research is to show the success of the nature-inspired pattern for face recognition. This pattern consists of a pentagon and a triangle. Therefore, the proposed pattern is called the quintet-triple binary pattern (QTBP). The main objective of QTBP is to obtain distinctive features from images. Overlapping blocks of size 5 × 5 are used to implement the proposed QTBP. Figure 7 shows a numerical example of the QTBP graph-based binary pattern. The hypothesis behind QTBP is as follows. Recognition rates are widely used features for face and facial expression recognition, but calculating them is a very hard task. Image descriptors, in turn, are effective, simple and fast methods for face recognition. The proposed QTBP approach combines these two ideas: a graph-based descriptor is presented whose graph is similar to the feature lines of recognition rates.

Fig. 6 The inspired shape to create pattern of the QTBP

Fig. 7 A numerical example of QTBP graph-based binary pattern

Eqs. 1–10 describe the mathematical background of the QTBP.

$$ {b}_1=S\left({p}_{k+3,l+2},{p}_{k+2,l}\right) $$
(1)
$$ {b}_2=S\left({p}_{k+2,l},{p}_{k,l+1}\right) $$
(2)
$$ {b}_3=S\left({p}_{k,l+1},{p}_{k,l+3}\right) $$
(3)
$$ {b}_4=S\left({p}_{k,l+3},{p}_{k+2,l+4}\right) $$
(4)
$$ {b}_5=S\left({p}_{k+2,l+4},{p}_{k+3,l+2}\right) $$
(5)
$$ {b}_6=S\left({p}_{k+3,l+2},{p}_{k+4,l}\right) $$
(6)
$$ {b}_7=S\left({p}_{k+4,l},{p}_{k+4,l+4}\right) $$
(7)
$$ {b}_8=S\left({p}_{k+4,l+4},{p}_{k+3,l+2}\right) $$
(8)
$$ S\left(a,b\right)=\left\{\begin{array}{ll}1, & a\ge b\\ 0, & a<b\end{array}\right. $$
(9)
$$ val=\sum \limits_{t=1}^{8}{b}_t\times {2}^{8-t} $$
(10)

where b = {b1, b2, …, b8} is the set of binary values, S(.,.) is the signum-like comparison function of Eq. 9, p are the pixel values of the 5 × 5 block that starts at (k, l), and val is the resulting decimal value.
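Eqs. 1–10 can be illustrated with a short Python sketch. It is not the authors' implementation: the function names are hypothetical, the first subscript of p is assumed to be the row offset, and the 256-bin histogram follows from the 8-bit value of Eq. 10.

```python
# Vertex pairs (row offset, column offset) inside a 5 x 5 block, read from Eqs. 1-8.
# Assumption: the first subscript of p is the row index, the second the column index.
QTBP_EDGES = [
    ((3, 2), (2, 0)),  # b1
    ((2, 0), (0, 1)),  # b2
    ((0, 1), (0, 3)),  # b3
    ((0, 3), (2, 4)),  # b4
    ((2, 4), (3, 2)),  # b5
    ((3, 2), (4, 0)),  # b6
    ((4, 0), (4, 4)),  # b7
    ((4, 4), (3, 2)),  # b8
]


def qtbp_value(block):
    """Return the decimal QTBP code (Eq. 10) of one 5 x 5 block."""
    val = 0
    for t, ((r1, c1), (r2, c2)) in enumerate(QTBP_EDGES, start=1):
        bit = 1 if block[r1][c1] >= block[r2][c2] else 0  # S(a, b) of Eq. 9
        val += bit << (8 - t)  # b_t * 2^(8 - t)
    return val


def qtbp_image(img):
    """Slide an overlapping 5 x 5 window over img (a list of rows)."""
    rows, cols = len(img), len(img[0])
    out = []
    for k in range(rows - 4):
        out.append([
            qtbp_value([img[k + i][l:l + 5] for i in range(5)])
            for l in range(cols - 4)
        ])
    return out


def qtbp_histogram(img):
    """256-bin histogram of the QTBP image, usable as a feature vector."""
    hist = [0] * 256
    for row in qtbp_image(img):
        for v in row:
            hist[v] += 1
    return hist
```

In this sketch a constant block yields the code 255, because every comparison in Eq. 9 is satisfied with equality.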

An example of the proposed QTBP is shown in Fig. 8, which has three parts: (a) raw face image, (b) QTBP image and (c) histogram of the QTBP image.

Fig. 8 The proposed QTBP

The proposed QTBP is an image descriptor designed for face recognition. Therefore, we use it together with the perceptual hash as the descriptor of the proposed face recognition method. More distinctive features can be obtained by using QTBP. The proposed QTBP is a graph-based descriptor, and it uses lines like recognition rates. Our hypothesis is therefore that it is a suitable descriptor for facial recognition, which is supported by the experimental results in Section 6. The procedure of the proposed QTBP is given in Algorithm 1.

Algorithm 1 Procedure of the proposed QTBP

The proposed perceptual hash, which uses QTBP, is described in Section 5, and its results are given in Section 6.

5 The proposed perceptual hash-based face recognition method

Perceptual hashing [79] is a recent decomposition and feature extraction approach for images, and it has been widely used in multimedia information security applications. According to the literature, perceptual hashes generate salient features for image authentication [81], and these salient features can be used in machine learning methods. LBP-like image descriptors are also effective feature extractors for textural and facial image classification. Therefore, we use a perceptual hash with a novel image descriptor to generate meaningful features. The proposed perceptual hash is a block-based method that contains both preprocessing and feature extraction. The preprocessing phase comprises 3LSBs elimination, RGB to grayscale conversion, image resizing (bilinear interpolation) and median filtering, and it makes the extracted facial features robust against filtering attacks. The 3-level 2D DWT [9], SVD [67] and QTBP are used in the feature extraction phase. Features that are robust against JPEG compression [30] are extracted using the 3-level DWT. The secondary image is created using SVD on 4 × 4 blocks. SVD provides robustness against geometrical attacks, and the final features are obtained using the proposed QTBP. Briefly, the proposed perceptual hash extracts robust and salient features. The graphical outline of the proposed method is shown in Fig. 9.

Fig. 9 Schematic description of the proposed perceptual hash-based feature generator

The steps of the proposed perceptual hash-based feature extraction are presented below.

  Step 1:

    Set the 3 least significant bits (3LSBs) of the raw image to 0. LSB attacks can manipulate most methods, for instance deep learning methods. This step is applied to provide robustness against these attacks. Its mathematical notation is given in Eq. 11.

$$ LI=\left\lfloor \frac{RI}{8}\right\rfloor \times 8 $$
(11)

where LI is the 3LSBs-eliminated image and RI is the raw image.

  Step 2:

    Convert the RGB image to grayscale.

$$ gray= rgb2 gray(LI) $$
(12)

where rgb2gray(.) is an RGB to grayscale conversion function.

  Step 3:

    Resize the image using bilinear interpolation. This step provides robustness against resizing attacks, which are critical for artificial intelligence methods. The method proposed in this work is a facial image categorization method built on a perceptual hash from information security, and it is a semantic authentication method. To ensure robustness, the image is resized twice, first to 1024 × 1024 and then to 512 × 512.

$$ gray= resize\left( gray,1024\times 1024\right) $$
(13)
$$ gray= resize\left( gray,512\times 512\right) $$
(14)

where resize(.) is the image resizing function, which uses bilinear interpolation.

  Step 4:

    Apply a 5 × 5 median filter to the gray level image. The filter size is chosen as 5 × 5 because the block size of the proposed QTBP is 5 × 5.

  Step 5:

    Apply the 3-level 2D DWT. According to the literature, the 3-level 2D DWT is one of the most effective discrete wavelet transform configurations. This step is used to achieve robustness against JPEG compression attacks.

$$ \left[l{l}^1,l{h}^1,h{l}^1,h{h}^1\right]= DWT2(gray) $$
(15)
$$ \left[l{l}^2,l{h}^2,h{l}^2,h{h}^2\right]= DWT2\left(l{l}^1\right) $$
(16)
$$ \left[l{l}^3,l{h}^3,h{l}^3,h{h}^3\right]= DWT2\left(l{l}^2\right) $$
(17)

  Step 6:

    Divide ll3 into non-overlapping blocks of size 4 × 4.

  Step 7:

    Apply SVD to each block. SVD is commonly used in image processing because it extracts invariant features. Therefore, we use SVD to achieve invariance. The singular value matrix of each block is calculated by SVD. This matrix is diagonal, and its maximum value is S1.

  Step 8:

    Store the S1 value of each block and create the secondary image. Most perceptual hash methods use SVD to generate an invariant image, which is called the secondary image.

  Step 9:

    Apply QTBP to the secondary image and calculate the QTBP image.

  Step 10:

    Extract the histogram of the QTBP image and use it as the feature vector. The salient features of the facial images are obtained in this way. The transitions of the proposed perceptual hash are presented in Table 1.

  Step 11:

    Classify the extracted features using SVM or KNN with stratified 10-fold cross validation.
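Steps 1 and 5 to 8 can be sketched in Python with NumPy as follows. This is an illustrative sketch rather than the authors' MATLAB implementation: the function names are hypothetical, and the Haar wavelet is an assumption because the text does not name the wavelet family. Grayscale conversion, resizing and median filtering (Steps 2 to 4) are omitted.

```python
import numpy as np


def eliminate_3lsb(raw):
    """Step 1 (Eq. 11): zero the three least significant bits of each pixel."""
    raw = np.asarray(raw, dtype=np.uint8)
    return (raw // 8) * 8


def haar_ll(x):
    """LL sub-band of one 2D DWT level.

    Assumption: an orthonormal Haar wavelet; the paper does not state
    which wavelet was used. Both image dimensions must be even.
    """
    x = np.asarray(x, dtype=np.float64)
    return (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 2.0


def third_level_ll(gray):
    """Step 5 (Eqs. 15-17): keep only the LL band through three levels."""
    band = gray
    for _ in range(3):
        band = haar_ll(band)
    return band


def secondary_image(band, block=4):
    """Steps 6-8: largest singular value S1 of each non-overlapping block."""
    h, w = band.shape
    # Rearrange into an (h/block, w/block) grid of block x block matrices.
    blocks = band.reshape(h // block, block, w // block, block).swapaxes(1, 2)
    singular_values = np.linalg.svd(blocks, compute_uv=False)
    return singular_values[..., 0]  # values are sorted in descending order
```

Under these assumptions, a 512 × 512 input gives a 64 × 64 ll3 band and a 16 × 16 secondary image, to which QTBP is then applied in Step 9.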

Table 1 Transitions of the proposed perceptual hash

We used the MATLAB Classification Learner toolbox, which contains various classifiers, for the experiments of this study. The extracted features were tested on SVM and KNN classifiers, and the best accuracy rates were obtained with a quadratic kernel SVM and a city block 1NN. The latter is the simplest version of KNN, with K selected as 1, and is therefore called 1NN. In this study, 10-fold cross validation is used to obtain classification accuracies. Briefly, preprocessing and feature extraction are performed by the presented perceptual hash, and quadratic kernel SVM and KNN classifiers are used in the classification phase.
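The city block 1NN rule with stratified cross validation can be sketched in plain Python as follows. The experiments in this study used the MATLAB toolbox, so this stand-in only illustrates the idea. The function names are hypothetical, and the fold assignment is a simple deterministic round-robin per class without shuffling, which is an assumption.

```python
def city_block(u, v):
    """City block (Manhattan, L1) distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def predict_1nn(train_x, train_y, query):
    """Return the label of the training sample closest to query (K = 1)."""
    best = min(range(len(train_x)), key=lambda i: city_block(train_x[i], query))
    return train_y[best]


def stratified_folds(labels, k=10):
    """Deal the samples of each class round-robin into k folds."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def cross_validated_accuracy(features, labels, k=10):
    """Accuracy in percent: correctly predicted samples over all samples."""
    correct = 0
    for test_idx in stratified_folds(labels, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        for i in test_idx:
            if predict_1nn(train_x, train_y, features[i]) == labels[i]:
                correct += 1
    return 100.0 * correct / len(labels)
```

With 10 images per person, as in the AT&T, Face94, CIE and AR sets described in Section 3, k = 10 places one image of each person in every fold of this sketch.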

6 Experimental results and discussions

In this paper, 5 well-known face databases were used in the experiments. MATLAB 2018a was used as the programming platform to implement the proposed method and the existing methods.

Accuracy (Acc) is used to obtain numerical results for this method. The mathematical definition of Acc is given in Eq. 18 [71].

$$ Acc\left(\%\right)=\frac{\#\ \mathrm{True\ predicted\ images}}{\#\ \mathrm{Total\ images}}\times 100 $$
(18)

Stratified 10-fold cross-validation is used for classification, and the accuracies of the classifiers are listed in Table 2.

Table 2 Classification accuracy of the proposed method

Table 2 shows that the proposed perceptual hash-based facial image categorization method has a high recognition capability. To better understand its performance, the proposed method is compared with some existing face recognition methods. The comparison results for the AT&T, Face94, CIE and AR databases are given in Table 3. For these comparisons, the widely used image descriptors for face recognition were implemented.

Table 3 Comparison of results for the selected databases

The proposed method reached a 100% recognition rate on the AT&T, Face94 and CIE databases. On the AR database, the recognition rate reached 99.4%. This high recognition rate was achieved through the proposed perceptual hash. On these databases, the recognition performance of the proposed method is superior to that of the compared methods.

The performance of the method was also compared with state-of-the-art methods on the AR and LFW datasets. As indicated in Table 5, the proposed method was successful on the small and homogeneous face datasets. LFW is a big and heterogeneous dataset. Using 1NN with city block distance, a 97.1% success rate is achieved for LFW. The results of state-of-the-art methods on the AR and LFW datasets are presented in Tables 4 and 5, respectively.

Table 4 Comparison of the average recognition rates of the proposed method with other state of art descriptor-based methods by using the AR dataset
Table 5 The average classification accuracies of the proposed method and the other existing methods for the LFW dataset

As seen from Table 4, the proposed method has the best classification capability among all the descriptor-based methods.

In Table 5, the performances of the proposed method and other widely used descriptors are compared on the LFW dataset. Other widely used methods and deep methods are also listed in Table 5.

Table 5 illustrates that the proposed perceptual hash feature generation-based face classification method has a high classification performance and can be considered one of the best methods among the descriptors. However, FaceNet and the Multi-Directional Local Gradient Descriptor (MDLGD) have higher accuracy rates than our method, because FaceNet is a deep learning method and MDLGD has a complex mathematical background. They also have a high time cost, whereas our method is a cognitive and lightweight method. The proposed perceptual hash-based face recognition method also achieved better results than MDLGD on the AR dataset.

The time cost of this method is listed in Table 6. Big O notation is used to express the complexity.

Table 6 Time-cost of the proposed perceptual hash-based method

As can be seen from Table 6, the proposed method has low computational complexity. The proposed perceptual hash has O(n log n) time complexity, where n is the size of the image, because it decreases the size of the image step by step.

The experimental outcomes are summarized as follows:

  • A novel perceptual hash-based face recognition method is presented in this study. The proposed method extracts very robust and informative features. The proposed method combines preprocessing and feature extraction phases.

  • We propose a novel descriptor in this paper, which is called QTBP. The proposed descriptor is more suitable than LBP for this task, because it is proposed for faces whereas LBP was presented for textural image classification. Table 4 shows that the proposed method achieves an accuracy rate of 0.994 on the AR dataset, while LBP achieves 0.988. For LFW, classification accuracy rates of 0.832 and 0.971 were computed using the LBP-based and QTBP-based methods, respectively. These results support the success of the proposed QTBP. Many deep methods have been used to classify LFW images, but we used the conventional KNN and SVM classifiers.

  • In the classification phase, KNN and SVM classifiers are used. These classifiers are conventional. We use them to show discriminative effect of the extracted features.

  • Quadratic kernel is suitable for facial image classification and KNN (K = 1 and distance metric is city block) is the simplest classifier. Therefore, quadratic kernel SVM and KNN are chosen as classifiers.

  • An accuracy rate of 100.0% is achieved on the AT&T, Face94 and CIE datasets using SVM, and 99.4% accuracy is obtained on the AR dataset using SVM.

  • SVM classifier exhibits better performance than KNN in this work.

  • The proposed method extracts universal features. It is simple and has a high success rate.

  • The proposed method is tested on the LFW dataset, which is a heterogeneous and big dataset. According to Table 5, the proposed QTBP and perceptual hash-based method has a high classification rate and the best success rate among the descriptors except MDLGD. FaceNet is more successful than our method. However, our method is lightweight and does not use any exemplar, pyramid-like or multilayer structure to extract features. The proposed method is applied to images without any face detection phase, and the extracted features are classified using only the simplest classifier. It can be said that the proposed method is a purely cognitive method.

  • The proposed method has approximately 9% higher accuracy rate than the best of the other descriptor-based methods on LFW dataset (except MDLGD).

The main advantages of the proposed method are as follows:

  • This method has a basic mathematical structure and can be programmed easily.

  • The proposed method has a low time cost. Therefore, real-time face recognition can also be built using the proposed perceptual hash-based method.

  • The positive effects of the perceptual hash are used directly in this method. This study clearly shows that perceptual hash methods are useful for face recognition.

  • The proposed method is simple.

  • The proposed method extracts universal features. It was evaluated on 5 facial image datasets, and the obtained results support this claim.

The main limitation of this study is that the proposed method cannot achieve a higher classification rate than FaceNet and MDLGD. To address this issue, we aim to apply various evolutionary algorithms (EAs) to optimize different parameters of the base classification algorithms used in this study [2, 6, 19, 46, 55, 56, 87].

7 Conclusion and future work

In this study, a novel perceptual hash-based facial recognition method is presented with the aim of highly accurate face classification. Preprocessing and feature extraction were performed by the proposed perceptual hash, whose purpose was to extract robust and informative facial features across different image datasets. In addition, a new graph-based descriptor was defined to derive textural features. SVM and KNN classifiers were then used in the classification phase. The proposed methodology was applied to 5 well-known face databases, and other state-of-the-art face classification methods were used for comparison. The experimental results illustrated that the proposed method has a very high face image recognition capability with low computational complexity. Furthermore, the results were compared with several methods available in the literature, and the proposed method outperformed the other descriptor-based methods except MDLGD on the LFW dataset. Briefly, the proposed method achieved 100%, 100%, 100%, 99.40% and 97.10% classification accuracies for AT&T, Face94, CIE, AR and LFW, respectively.

In our future studies, novel lightweight deep learning methods can be proposed for face recognition. In addition, other descriptors and perceptual hashes can be used in novel face recognition methods. Descriptors have also been used in textural image recognition methods, so novel textural image classification, retrieval and recognition methods will be proposed using perceptual hashes and image descriptors together.