1 Introduction

The fast growth of modern human civilization has led to an increasing demand for new and efficient technologies to sustain it. Alongside, security and privacy concerns have emerged, and the usage of highly reliable and accessible individual authentication and identification techniques became crucial. Biometrics has emerged to address this need and has become a science that studies physiological and behavioral characteristics of the human body to recognize an individual’s identity.

Biometric technologies focus on techniques that automatically authenticate individuals using both physiological traits, such as DNA, fingerprint [1], face [2], iris [3], and palmprint, and behavioral traits, such as gait [4], voice [5], keystroke [6], and signature [7]. Among these, palmprint recognition has attracted significant attention as one of the essential biometric technologies.

Palmprint images contain rich features such as principal lines, wrinkles, and minutiae. These features are relatively stable, and the images are easy to capture [8, 9]. Palmprint images can be divided into two categories according to how they are acquired: contact-based and contactless. The main difference between them is whether the hand touches the acquisition device or not [10]. Contact-based images are gathered by placing the palm on the device and using user-pegs, while contactless images are obtained without touching the device’s surface [11].

A biometric system can be divided into two categories, unimodal and multimodal biometric systems. The unimodal biometric system is designed to recognize individuals based on a single biometric trait’s information. That system suffers from some limitations and cannot provide satisfactory recognition accuracy. The multimodal biometric system integrates information from multiple biometric traits. It is more secure than a unimodal system and can solve a variety of problems, including noisy sensor data, non-universality, distinctiveness, and the lack of biometric traits. The combination of modalities provides efficient means for improving the performance and reliability of the biometric system [12].

The main component of a biometric system is feature extraction. It extracts only the discriminant information from the acquired image to create a new representation that should be essentially unique to each person [13]. Deep learning techniques, which provide a better representation of the image, have become popular methods for this purpose. The main benefit of deep learning methods is their ability to generate efficient and discriminative features from the biometric image. Recently, many feature extraction methods based on deep learning techniques have been proposed in the literature [14, 15].

Since feature extraction is an essential part of the recognition task, Feature Selection (FS) can be used to improve predictive performance and reduce the computation required. FS is an essential component of machine learning and data mining, which has been studied for many years under different conditions and in diverse scenarios [16]. FS algorithms aim to rank and select a subset of relevant features based on their degrees of preference, relevance, or importance, as defined in a particular application. Since feature selection may reduce the number of features needed to train classification models, it mitigates the curse of dimensionality, speeds up the learning process, improves model performance, and promotes data understanding.

In this paper, inspired by PalmNet [17], we propose efficient biometric identification systems based on palmprint traits. To accomplish this, we apply Log-Gabor filters in the preprocessing step to adjust the pixel luminance of palmprint images. We then extract the discriminant information using an adaptive Gabor-based filter tuning procedure [17]. To improve the recognition rate and reduce both the size of the large feature vectors and the computational time, we use feature selection and dimensionality reduction procedures. Additionally, we employ a Support Vector Machine (SVM) classifier instead of the K-Nearest Neighbor (KNN) classifier based on the Euclidean distance with K=1 (denoted by 1-NN in the following). For the multimodal system, we fuse modalities at the matching score level to improve system performance. To validate our method, we applied it to several public palmprint databases containing images of distinct qualities, resolutions, and dynamic ranges. We also carried out comparisons with several recent state-of-the-art methods.

The main contributions of this work are as follows:

  • The development of effective unimodal and multimodal biometric systems for palmprint recognition.

  • Improvement of palmprint images using log-Gabor filters by adjusting pixel luminance.

  • Feature selection combined with dimensionality reduction significantly reduces the feature vector size (reduction rate of 0.003 %), which reduces computational time without degrading the performance of our recognition systems.

  • Employing the SVM classifier instead of the 1-NN classifier as in the paper [17]. Experimental results demonstrate that our methodology scored a higher recognition accuracy than existing approaches in the literature.

The rest of the paper is organized as follows: Sect. 2 summarizes earlier works on biometric systems based on multispectral and contactless palmprint databases. Section 3 introduces the proposed approach of unimodal and multimodal identification systems for palmprint recognition. Section 4 reports the experimental results. Finally, Sect. 5 concludes the paper. To facilitate reading, Table 1 presents the list of abbreviations and definitions.

Table 1 Table of Abbreviations

2 Related works

Palmprint recognition has enjoyed great research popularity for identity authentication and identification in recent years. It has many unique advantages, e.g., the richness of features, high user-friendliness, and suitability for private security [18]. Many palmprint identification systems exhibit encouraging results, but the performance of the existing systems still needs to be improved. State-of-the-art methods can be broadly organized into two main categories: handcrafted-feature-based and deep-learning-based approaches.

Texture is an important low-level feature in palmprint recognition [19] that can describe the contents and details of a specific region in an image. For that reason, several handcrafted-feature-based approaches analyse image texture information to obtain precise features and a high palmprint recognition rate. Zhang et al. [20] presented a multispectral palmprint recognition approach that captured palmprint images under four bands: red, green, blue, and near-infrared light. A score-level fusion of these bands achieved superior performance compared to any single band. Jing et al. [21] used a two-dimensional (2D) separability judgment to select DCT frequency bands with appropriate linear separability. From the selected bands, their method extracts linear discriminant features with an optimized Fisherface method and classifies them with a nearest neighbor classifier. Luo et al. [22] proposed a new image descriptor, local line directional patterns (LLDP), and showed that different implementations of LLDP descriptors perform competitively in palmprint recognition. Kang et al. [23] presented a recognition approach for contact-free palm-vein recognition that performs feature extraction and matching on all vein textures distributed over the palm surface, including finger veins and palm veins, to minimize the loss of feature information. First, a hierarchical enhancement algorithm, which combines a DOG filter and histogram equalization, is adopted to alleviate uneven illumination and highlight vein textures. Second, Root Scale Invariant Feature Transform (RootSIFT), a more stable local invariant feature extraction method than Scale Invariant Feature Transform (SIFT), is used to overcome the projection transformation in contact-free mode.

Recently, many systems and applications have used deep learning for biometric identification. The deep network is trained on a variety of patterns. Once the network has learned all the unique features of the dataset, it can be used to recognize similar patterns. Deep learning approaches have been used primarily to learn features for palmprint recognition. Deep learning can also be very efficient in classification (supervised learning) and clustering (unsupervised learning) tasks. In a classification task, the system classifies the input instances based on their corresponding class labels, while in clustering, the instances are grouped based on their similarity without the need for class labels. Clustering can be used for several well-known problems, such as recommender systems [24] [25] [26]. Several approaches described below are based on deep learning with classification and clustering.

Wang et al. [27] applied 2D Gabor wavelets to palmprint images. They used a Pulse-Coupled Neural Network (PCNN) to imitate the biological visual perception process and decompose each Gabor subband into a series of binary images. Entropies of these binary images are calculated and regarded as features, and an SVM classifier is employed for classification. Minaee and Wang [15] proposed a two-layer deep scattering convolutional network for palmprint recognition. Principal Component Analysis (PCA) is then applied to reduce the dimensionality of the data. For classification, a multi-class SVM and the nearest neighbor classifier are used. Svoboda [28] proposed a Convolutional Neural Network (CNN) based on the AlexNet model and trained by optimizing a loss function related to the d-prime index to achieve a better separation of the genuine and impostor score distributions on touchless palmprint databases. Meraoumia et al. [14] proposed a two-stage Principal Component Analysis Network (PCANet) for deep learning-based feature extraction. Four classifiers (SVM, Radial Basis Function - RBF, Random Forest Transform - RFT, and KNN) are then used in a supervised procedure. The testing was performed on multispectral palmprint databases.

Cheng et al. [29] proposed Deep Convolutional Features-Based Supervised Hashing (DCFSH). They used CNN-F architecture to extract the palmprint convolutional features, followed by learning binary coding from distilled deep features. DCFSH is evaluated on a multispectral palmprint database. The Hamming distance is employed in the matching steps. Zhong et al. [30] proposed a new method to achieve end-to-end palmprint recognition using Siamese network. In their network, two parameter-sharing Visual Geometry Group-16 (VGG-16) networks were used to extract convolutional features of two input palmprints images, and the top network directly got the similarity of two input palmprints based on their convolutional features. Bensid et al. [19] proposed a simple new deep learning feature extraction algorithm for an efficient multispectral palmprint identification system called Discrete Cosine Transform Network (DCTNet).

Genovese et al. [17] proposed PalmNet, a convolutional network that uses Gabor responses and PCA filters through an unsupervised procedure applied to different touchless palmprint databases, and uses the 1-NN classifier based on the Euclidean distance for the classification step. Besides, Zhao et al. [33] proposed a joint constrained least-square regression (JCLSR) model with deep convolutional neural networks to solve the under-sampling classification problem by extracting different deep local convolution features from different patches of the same palmprint image. The experiments with JCLSR were performed on touchless and multispectral palmprint databases. Table 2 includes a summary of deep learning methods for palmprint recognition.

Fei et al. [31] proposed LRRIPLD which is a new Low-Rank Representation (LRR) model integrated with principal line distance for contactless palmprint recognition. LRRIPLD generates a graph that is more distinct than LRR because main line distances effectively improve clustering results by increasing the weights of the links between similar samples. The approach is tested on three palmprint databases IITD-Touchless, GPDS-Touchless, and CASIA.

Arora et al. [32] introduced PalmHashNet, a novel indexing method that learns compact feature vectors for palmprint identification. They used the Softmax loss function with additive margin to train the model to index the palmprint database and to simultaneously learn the feature vector embeddings. Furthermore, to generate an index table, the learned embeddings are indexed using the k-means clustering and locality sensitive hashing techniques. PalmHashNet is evaluated on four publicly available palmprint databases CASIA, IITD-Touchless, Tongji-Contactless, and PolyU II.

As the above review of the state of the art in palmprint recognition shows, the reviewed deep learning approaches rely on feature extraction. This yields a large number of features and, consequently, high computational complexity. To address this problem, this paper presents feature selection and dimensionality reduction schemes that achieve higher performance and reduce the computational time.

Table 2 A summary of published deep learning approaches for Palmprint Recognition

3 Proposed approach

Figure 1 shows the block diagram of the proposed unimodal palmprint identification system, composed of five steps: Preprocessing; feature extraction; feature selection and dimensionality reduction; classification and matching; decision. Every unimodal system calculates its own matching score. For the multimodal system, these individual scores are eventually combined or fused at the matching score level into a total score used by the decision module. A final decision is made based on this matching score (the user is identified or not). This structure can improve the proficiency of a unimodal system and be used to solve some of its limitations.

Fig. 1 Flowchart of the proposed approach

3.1 Preprocessing

The preprocessing step can be divided into three separate tasks: (a) extracting the Region of Interest (ROI) from the palmprint image, (b) resizing the ROI, and (c) applying the Log-Gabor filter (see Fig. 2). First, the palm image is segmented to extract the ROI, which keeps only the area of the hand that carries useful information. To do so, the palmprints are aligned using the algorithm described in [34], and the central \(128 \times 128\) part of the image is cropped to represent the whole palmprint. Second, we resize the ROI to \(32 \times 32\) pixels in order to reduce the computational time. Finally, to enhance the ROI image, a Log-Gabor filter is applied (Fig. 2d); it was chosen for its good smoothing characteristics, based on empirically observed performance and quality measurements. The Log-Gabor filter is a derivative of the standard Gabor filter. Its frequency response is Gaussian on a logarithmic frequency scale, as opposed to the standard Gabor filter, whose frequency response is Gaussian on a linear scale. The Log-Gabor frequency response is described by Eq. 1:

$$\begin{aligned} G(w) = \exp \left( \frac{-\left( \log (w/w_0)\right) ^2}{2\left( \log (k /w_0) \right) ^2}\right) \end{aligned}$$
(1)

Where \(w_0\) is the centre frequency of the filter, and the bandwidth is determined by the \(\frac{k}{w_0}\) term. The parameters of the Log-Gabor filter were experimentally selected as \(w_0 = 1/3\) and \(k = 0.65\).
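The frequency response of Eq. 1 can be sketched numerically as follows. This is only an illustration under stated assumptions, not the implementation used in the experiments: the filter is assumed to be purely radial (no orientation component), frequencies are assumed to be in cycles per pixel, the response at zero frequency is set to 0 because the logarithm is undefined there, and the function names and the random test image are hypothetical.

```python
import numpy as np

def log_gabor_response(w, w0=1.0 / 3.0, k=0.65):
    """Eq. 1: G(w) = exp(-(log(w/w0))^2 / (2 (log(k/w0))^2)), with G(0) = 0."""
    w = np.asarray(w, dtype=float)
    g = np.zeros_like(w)
    pos = w > 0  # log(w / w0) is undefined at w = 0, so the DC term stays 0
    g[pos] = np.exp(-(np.log(w[pos] / w0) ** 2) / (2.0 * np.log(k / w0) ** 2))
    return g

def apply_log_gabor(image, w0=1.0 / 3.0, k=0.65):
    """Filter an image in the frequency domain with a radial log-Gabor mask."""
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows)[:, None]  # vertical frequencies (assumed cycles/pixel)
    fx = np.fft.fftfreq(cols)[None, :]  # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)
    mask = log_gabor_response(radius, w0, k)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * mask))

# Hypothetical 32 x 32 ROI standing in for a resized palmprint image.
roi = np.random.default_rng(0).random((32, 32))
enhanced = apply_log_gabor(roi)
```

The response peaks at \(w = w_0\), and because the zero-frequency term is suppressed, the filtered image has zero mean.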

Fig. 2 The main tasks of the preprocessing step. a Input image, b ROI palmprint extraction, c image resizing, and d result of the Log-Gabor filter

3.2 Features extraction

Feature extraction is a key module for recognition systems. The acquired biometric data are processed, and only the salient information is extracted to form a new representation of the data. Ideally, this new representation should be unique for each person. In our scheme, to extract highly discriminative palmprint features, the PalmNet deep learning network is used to extract the feature vector of each data type. It is a particular case of an image classification deep learning baseline and consists of three stages: (1) convolutional stage, (2) binarization stage, and (3) histograms stage. The block diagram of the PalmNet algorithm is presented in Fig. 3 and can be summarized as follows [17]:

3.2.1 Convolutional stage

As shown in Fig. 3, the convolution process is ensured by two layers of the Gabor filter bank:

  • First convolutional layer (\(L_1\)) consisting of \(k_1\) filters. Each filter is convolved with the input image (layer \(L_0\) with dimensions of \(u \times v\)). Thus, the output of this layer consists of \(k_1\) images with dimensions of \(u \times v\). So,

$$\begin{aligned} I_{l_{1}}(i,j) = \sum _{m}\sum _{n}h_{l_{1}}(m,n)\,I_{l_{0}}(i-m,j-n), \qquad 1\le l_{1} \le k_{1} \end{aligned}$$
(2)

Where \(I_{l_0}\) represents the input image of layer \(L_0\) to be convolved with the filter \(h_{l_1}\) to produce the output image \(I_{l_1}\) of layer \(L_1\). The indices i and j run over the image pixels, while m and n run over the filter coefficients.

  • Second convolutional layer (\(L_2\)) consisting of \(k_2\) filters. Each filter is convolved with the output of layer \(L_1\). Thus, the output of this layer consists of \(k_1 k_2\) images with dimensions of \(u \times v\).

$$\begin{aligned} I_{l_{1}l_{2}}(i,j) = \sum _{m}\sum _{n}h_{l_{2}}(m,n)\,I_{l_{1}}(i-m,j-n), \qquad 1\le l_{1} \le k_{1}\ \text{and}\ 1\le l_{2} \le k_{2} \end{aligned}$$
(3)

Where \(I_{l_{1}l_{2}}\) (abbreviated \(I_{l_2}\) in the following) represents one of the \(k_1 k_2\) filtered output images, and \(h_{l_2}\) is a filter of layer \(L_2\).

The significance of this part is that we used two types of filters: fixed-scale Gabor filters and adaptive multiscale Gabor filters. In the first type, we created a set of fixed-scale 2-D Gabor filters with dimensions of \(h_{1} \times h_{2}\), as products of a sinusoidal wave with a Gaussian function [17]. In the second type, we first computed a set of adaptive orientations from the training subset of palmprint ROIs. Then, we computed a bank of multiscale Gabor filters with the computed orientations. Finally, we selected the filters that give the greatest magnitude responses, with dimensions of \(g_{1} \times g_{2}\) where \(g_{1}\times g_{2}=4\cdot 2^{m_{f}}\). The value of \(m_{f}\) is taken from \(m_{f}=\left[ 0,1,\cdots ,M \right]\), where \(M=\left[ \log _{2} (u/2) \right]\) and u is the horizontal size of the ROI images [17]. So, the number of filters in each layer of the network, i.e., \(k_1\) and \(k_2\), is \(F + A'\), corresponding to F fixed-scale 2-D Gabor filters and \(A'\) adaptive multiscale 2-D Gabor filters.
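The two-layer convolution of Eqs. 2 and 3 can be illustrated with a short sketch. It assumes zero padding so that every output keeps the \(u \times v\) size of the input, and it uses random placeholder kernels in place of the fixed-scale and adaptive Gabor filters; the helper names are hypothetical and not taken from [17].

```python
import numpy as np

def conv2d_same(image, kernel):
    """2-D convolution with zero padding; output has the same size as image.

    The kernel is assumed to have odd dimensions and to be centred."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    flipped = kernel[::-1, ::-1]  # flipping turns correlation into convolution
    out = np.zeros(image.shape, dtype=float)
    for m in range(kh):
        for n in range(kw):
            out += flipped[m, n] * padded[m:m + image.shape[0], n:n + image.shape[1]]
    return out

def two_layer_responses(image, filters_l1, filters_l2):
    """Return k1 groups, each holding the k2 second-layer responses."""
    layer1 = [conv2d_same(image, h1) for h1 in filters_l1]        # Eq. 2
    return [[conv2d_same(out1, h2) for h2 in filters_l2]          # Eq. 3
            for out1 in layer1]

rng = np.random.default_rng(0)
image = rng.random((8, 8))                                        # toy ROI
filters_l1 = [rng.standard_normal((3, 3)) for _ in range(2)]      # k1 = 2 placeholders
filters_l2 = [rng.standard_normal((3, 3)) for _ in range(3)]      # k2 = 3 placeholders
responses = two_layer_responses(image, filters_l1, filters_l2)
```

With \(k_1 = 2\) and \(k_2 = 3\), the sketch produces \(k_1 k_2 = 6\) response images, grouped by the first-layer filter that generated them.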

3.2.2 Binarization stage

In this stage, the \(k_1k_2\) output images obtained from the output of the second layer are converted to binary format using a Heaviside step function illustrated by the following equation:

$$\begin{aligned} B_{l_{2}}(i,j) = \left\{ \begin{array}{ll} 1~&{}~if\,I_{l_{2}}(i,j) > 0\\ 0~&{}~otherwise \end{array} \right. \end{aligned}$$
(4)

Where \(B_{l_2}\) is a binary image. In total, we obtain \(k_1\) groups of binary images, each containing \(k_2\) binary images \(B_i\), with \(i=1, 2,\ldots ,k_2\). These images have the same dimensions of \(u \times v\). For each position \((i,j)\), by concatenating the binary values of all \(k_2\) binary images, we obtain:

$$\begin{aligned} b=\left[ B_{1}(i,j),B_{2}(i,j),\cdots ,B_{k_{2}}(i,j) \right] \end{aligned}$$
(5)

We convert the binary vector b into a decimal number as follows:

$$\begin{aligned} d=\sum _{k=1}^{k_2}2^{k-1}b(k) \end{aligned}$$
(6)

This process is repeated for each position \((i,j)\). Finally, we obtain a decimal matrix \(D(i,j)\) that describes the whole group of \(k_2\) binary output images.

Likewise, the decimal matrices \(D_l\) are determined for all \(k_1\) binary images groups, with \(l=1, 2,\cdots , k_1\).
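As a small worked example of Eqs. 4 to 6, the sketch below binarizes one group of \(k_2 = 3\) response images and encodes them into a decimal matrix. The response values are made up for illustration, and the function names are hypothetical.

```python
import numpy as np

def binarize(responses):
    """Eq. 4: Heaviside step, 1 where the response is > 0 and 0 otherwise."""
    return (np.asarray(responses) > 0).astype(np.int64)

def encode_group(binary_group):
    """Eqs. 5-6: combine k2 binary images (shape k2 x u x v) into one matrix D."""
    k2 = binary_group.shape[0]
    weights = 2 ** np.arange(k2)  # 2^(k-1) for k = 1, ..., k2
    return np.tensordot(weights, binary_group, axes=1)

# One group of k2 = 3 response images of size 1 x 2 (illustrative values).
responses = np.array([[[0.5, -0.2]],
                      [[-1.0, 0.3]],
                      [[2.0, -4.0]]])
D = encode_group(binarize(responses))
```

At the first position the binary vector is b = [1, 0, 1], so d = 1 + 4 = 5; at the second it is b = [0, 1, 0], so d = 2. Every entry of D lies between 0 and \(2^{k_2} - 1\).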

3.2.3 Histograms stage

In this stage, each \(D_l\) matrix is partitioned into \(n_B\) non-overlapping (disjoint) blocks with dimensions of \(b_1 \times b_2\), and their histograms are computed. Each histogram consists of \(2^{k_{2}}\) bins. Thus, a feature vector H is obtained by concatenating the histograms for all blocks of all images \(D_l\), where:

$$\begin{aligned} \left| H\right| =k_{1}n_{B}2^{k_{2}} \end{aligned}$$
(7)

In conclusion, the hyperparameters of PalmNet are the number of filters in each layer, \(k_1\) and \(k_2\), and the block dimensions \(b_1\) and \(b_2\); they are experimentally tuned to achieve the best recognition accuracy.
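The block-histogram construction and the length given by Eq. 7 can be checked with a short sketch. It assumes that the dimensions of each \(D_l\) are exact multiples of the block size and uses random decimal matrices as stand-ins for real encoded responses; the function name is hypothetical.

```python
import numpy as np

def block_histograms(D, b1, b2, k2):
    """Concatenate the 2^k2-bin histograms of all disjoint b1 x b2 blocks of D."""
    u, v = D.shape
    feats = []
    for r in range(0, u - b1 + 1, b1):
        for c in range(0, v - b2 + 1, b2):
            block = D[r:r + b1, c:c + b2]
            feats.append(np.bincount(block.ravel(), minlength=2 ** k2))
    return np.concatenate(feats)

k1, k2 = 2, 3                      # illustrative numbers of filters
rng = np.random.default_rng(0)
# Random stand-ins for the k1 decimal matrices D_l (values in [0, 2^k2 - 1]).
D_list = [rng.integers(0, 2 ** k2, size=(8, 8)) for _ in range(k1)]
H = np.concatenate([block_histograms(D, 4, 4, k2) for D in D_list])
```

Here each \(8 \times 8\) matrix is split into \(n_B = 4\) blocks of \(4 \times 4\), so \(|H| = k_1 n_B 2^{k_2} = 2 \cdot 4 \cdot 8 = 64\), which matches Eq. 7.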

Fig. 3 Topology of the proposed network

3.3 Feature selection and dimensionality reduction

Feature selection (FS) is a significant component of machine learning, computer vision, artificial intelligence, and data analysis. The aim of feature selection is to select useful features and remove redundant information. In this section, we use the following feature selection and dimensionality reduction methods: (i) the Fisher score algorithm (feature selection), (ii) the ReliefF algorithm (feature selection), and (iii) the Whitening Principal Component Analysis (WPCA) algorithm (dimensionality reduction).

Feature selection algorithms can be divided into three groups [35]:

  • wrapper methods that use classifiers to score a particular subset of features;

  • embedded methods that insert the selection process into the classifier’s learning process;

  • filter methods that analyse intrinsic properties of data, ignoring the classifier [36].

We use feature selection algorithms to rank and select a subset of pertinent features based on their degree of importance, preference, or significance as specified in an application, since selecting features decreases the number of features used for training classification models. Furthermore, dimensionality reduction reduces the impact of the curse of dimensionality, reduces training time and over-fitting, and improves both the trained model and data comprehension.

While feature selection can be used in both supervised and unsupervised learning, we focus our study on supervised learning (classification) methods, in which the class labels are known ahead of time. Feature selection for unsupervised learning (clustering) is a more complicated issue, and research in this field has recently been receiving more attention in several communities. In recent years, a variety of feature selection methods have been exploited for clustering paradigms, for example [37,38,39,40,41].

3.3.1 Fisher score algorithm

Fisher score is one of the most common supervised feature selection methods. We used a linear discriminant approach based on Fisher’s score, which evaluates the discriminating power of features. The score is given by:

$$\begin{aligned} W_i=\frac{\sum _{j=1}^{c}N_{j}.(m_{j}-{\bar{m}})^2}{\sum _{j=1}^{c} N_{j}\sigma _{j}^{2}} \end{aligned}$$
(8)

Where \(W_i\) is the score of feature i, c is the number of classes, \(N_j\) is the number of samples in class j, and \({\bar{m}}\) is the overall mean of the feature. \(m_j\) and \(\sigma _{j}^{2}\) are the mean and the variance of the feature within class j.
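A minimal sketch of Eq. 8 is given below, applied to all features at once. The small constant added to the denominator is an assumption introduced here to avoid division by zero for features that are constant within every class, and the toy data are made up for illustration.

```python
import numpy as np

def fisher_scores(X, y, eps=1e-12):
    """Eq. 8 for every column of X (samples x features); eps is an added guard."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        n_j = Xc.shape[0]
        between += n_j * (Xc.mean(axis=0) - overall_mean) ** 2  # numerator term
        within += n_j * Xc.var(axis=0)                          # denominator term
    return between / (within + eps)

# Toy data: feature 0 separates the two classes, feature 1 does not.
X_toy = np.array([[0.0, 1.0], [0.1, 5.0], [1.0, 1.2], [1.1, 4.8]])
y_toy = np.array([0, 0, 1, 1])
scores = fisher_scores(X_toy, y_toy)
ranking = np.argsort(scores)[::-1]  # feature indices, most discriminative first
```

Selecting the top-ranked entries of `ranking` then yields the reduced feature subset.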

3.3.2 ReliefF algorithm

Kira and Rendell [42] formulated the original Relief algorithm, inspired by instance-based learning, which is designed for two-class problems without missing values. The basic idea of the algorithm, when analysing learning instances, is to take into account not just the difference in feature values and the variation in classes but also the distance between the instances. Distance is calculated in the feature space, so that similar instances are close to each other and dissimilar ones are far apart. By taking into account the similarity of instances, the context of all the features is implicitly considered [35]. For each instance \(x_k\) from a random subset of m (\(m\leqslant M\)) learning instances, the algorithm finds the closest instance of the same class (near hit \(x_H\)) and the closest instance of the opposite class (near miss \(x_M\)). Then it updates the quality of each feature (see Eq. 9).

$$\begin{aligned} W[i]= W[i]-diff(i,x_k,x_H )/m+ diff(i,x_k,x_M )/m \end{aligned}$$
(9)

Kononenko et al. [43] proposed a number of updates to Relief. First, they found the near hit \((x_H)\) and near miss \((x_M)\) instances using the Manhattan (L1) norm rather than the Euclidean (L2) norm, although the rationale is not specified. Second, they found that taking the absolute differences between \(x_k\) and the near hit \((x_H)\), and between \(x_k\) and the near miss \((x_M)\), is sufficient when updating the weight vector (rather than the square of those differences). The resulting ReliefF algorithm can also deal with multiclass problems.
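The weight update of Eq. 9 can be sketched for the two-class case as follows, using the Manhattan distance and absolute differences as described above. This is a simplified illustration rather than the full ReliefF algorithm: it uses a single nearest hit and miss, assumes at least two samples per class, and scales each feature to [0, 1] so that the diff terms are comparable; the function name and toy data are hypothetical.

```python
import numpy as np

def relief_weights(X, y, m, seed=0):
    """Two-class Relief update (Eq. 9) with L1 distance and absolute differences."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # avoid dividing by zero
    Xn = (X - X.min(axis=0)) / span            # each feature scaled to [0, 1]
    rng = np.random.default_rng(seed)
    W = np.zeros(X.shape[1])
    for k in rng.choice(len(X), size=m, replace=False):
        dist = np.abs(Xn - Xn[k]).sum(axis=1)  # Manhattan distance to x_k
        dist[k] = np.inf                       # never pick x_k itself
        same = (y == y[k])
        hit = np.argmin(np.where(same, dist, np.inf))    # near hit x_H
        miss = np.argmin(np.where(~same, dist, np.inf))  # near miss x_M
        W += (-np.abs(Xn[k] - Xn[hit]) + np.abs(Xn[k] - Xn[miss])) / m
    return W

# Toy data: feature 0 separates the two classes, feature 1 does not.
X_toy = np.array([[0.0, 1.0], [0.1, 5.0], [1.0, 1.2], [1.1, 4.8]])
y_toy = np.array([0, 0, 1, 1])
W = relief_weights(X_toy, y_toy, m=4)
```

Features with large positive weights differ between classes but agree within a class, so they are the ones retained.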

3.3.3 Whitening principal component analysis (WPCA) algorithm

Principal Component Analysis (PCA) is the most popular dimensionality reduction technique and is widely used in machine learning to reduce feature redundancy for efficient palmprint recognition. However, PCA has two weaknesses: its leading eigenvalues degrade recognition performance, and its eigenvectors have weak discriminating power. WPCA is PCA with an extra step: whitening the eigenvectors by their eigenvalues. The whitening step is simple but very effective; it helps to rectify the deficiencies of PCA, because afterwards (i) the features are less correlated with each other, and (ii) the features all have the same variance. As a result, the palmprint recognition system achieves better performance.
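A minimal sketch of the whitening step is shown below, assuming that WPCA projects the centred data onto the leading principal axes and divides each projection by the square root of its eigenvalue. The function names and the random data are hypothetical.

```python
import numpy as np

def wpca_fit(X, n_components):
    """Return the mean, the leading principal axes, and their eigenvalues."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centred data: the rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s[:n_components] ** 2 / (X.shape[0] - 1)  # covariance eigenvalues
    return mean, Vt[:n_components], eigvals

def wpca_transform(X, mean, components, eigvals):
    """Project onto the principal axes and whiten by the eigenvalues."""
    return (X - mean) @ components.T / np.sqrt(eigvals)

rng = np.random.default_rng(0)
X_train = rng.standard_normal((50, 6)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.1])
mean, components, eigvals = wpca_fit(X_train, n_components=3)
Z = wpca_transform(X_train, mean, components, eigvals)
```

On the training data, the whitened features are uncorrelated and have unit variance, which corresponds to properties (i) and (ii) above.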

3.4 Classification and feature matching procedures

A person’s identity can be ascertained through the classification and feature matching process. The feature vectors of the training set produced by the feature extraction module are used for classification. In our work, we used two classifiers: a 1-NN classifier based on the Euclidean distance and a multi-class SVM classifier with a Gaussian kernel. The first classifier assigns a sample to the category of its nearest neighbour. It basically consists of finding the similarity between the test sample and each sample of the training set. The second classifier uses the training data to find the hyperplane that best separates the classes [44].

A matching process is intended to compare the test features vectors against the stored templates (training set) to generate match scores. The match score is a measure of the similarity or dissimilarity between the template and the test. Therefore, a higher match score indicates a greater similarity between the template and the query. If a matcher measures the dissimilarity between the two feature sets, the score is denoted as a distance score. A lower distance score points to higher similarity.
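The 1-NN matching with Euclidean distance scores can be illustrated with a few lines. The templates, labels, and query below are made-up values, and the function names are hypothetical.

```python
import numpy as np

def distance_scores(query, templates):
    """Euclidean distance between the query and every stored template."""
    return np.linalg.norm(np.asarray(templates) - np.asarray(query), axis=1)

def identify_1nn(query, templates, labels):
    """Return the label of the closest template and all distance scores."""
    scores = distance_scores(query, templates)
    best = int(np.argmin(scores))  # lower distance score = higher similarity
    return labels[best], scores

templates = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])  # stored feature vectors
labels = ["A", "B", "C"]                                     # their identities
label, scores = identify_1nn([0.9, 1.2], templates, labels)
```

Because these are distance scores, the identity with the lowest score is returned; the full score vector is what would later be normalized and fused in a multimodal system.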

3.5 Normalization and fusion procedures

In multimodal systems, score normalization converts each measured score into a common interval. Min-Max is the normalization most often used in biometric recognition systems. This technique is most appropriate where the limits (minimum and maximum values) of the scores produced by the systems are known [45]. It maps the minimum and maximum values of the score vector to 0 and 1, respectively. The following formula gives the score normalized by the Min-Max method.

$$\begin{aligned} \widehat{V_{d}} =\frac{V_{d}-min(V_{d})}{max(V_{d})-min(V_{d})} \end{aligned}$$
(10)

Where the vector \(V_{d}\) includes all the scores calculated between the test and all the stored feature vectors, while the vector \(\widehat{V_{d}}\) comprises the normalized scores.
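Eq. 10 translates directly into code. The sketch below is illustrative only; returning zeros when all scores are equal is an assumption made here to avoid division by zero, and the function name is hypothetical.

```python
def min_max_normalize(scores):
    """Eq. 10: map the smallest score to 0 and the largest to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Degenerate case (assumption): all scores equal, so return zeros.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

normalized = min_max_normalize([2.0, 4.0, 6.0])
```

For the score vector [2, 4, 6], the normalized scores are [0, 0.5, 1].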

Score level fusion is the most commonly used biometric information fusion strategy since matching scores are readily available, and they retain enough information to distinguish genuine matching from impostor matching. There are several matching score fusion rules that integrate normalized matching scores of a user to produce the final matching score. In our work, we conducted the experiment with four fusion rules: sum-score rule (SUM), product score rule (MUL), SUM-weighted-score (SUM MUL), and product-weighted-score (WHT MUL) [45].

  • Simple Sum rule: This rule takes the sum of the N unimodal systems matching scores of the \(k\) th user as the final matching score \(S_{k}\) of this user. \(S_{k}\) is calculated as follows:

    $$\begin{aligned} S_{k}=\sum _{i=1}^{N} S_{k_{i}} \end{aligned}$$
    (11)
  • The product rule: This rule presents the multiplication result of the N unimodal systems matching scores of the \(k\) th user as the final matching score of this user, which is expressed as follows:

    $$\begin{aligned} S_{k}=\prod _{i=1,2,...,N}S_{k_{i}} \end{aligned}$$
    (12)
  • The weighted Sum rule: This rule can define the final matching score of the \(k\) th user, which is calculated as follows:

    $$\begin{aligned} S_{k}=\sum _{i=1}^{N} w_{i} S_{k_{i}} \end{aligned}$$
    (13)
  • The weighted Product rule: This rule can determine the final matching score of the \(k\) th user, which is shown as follows:

    $$\begin{aligned} S_{k}=\prod _{i=1,2,\ldots,N}S_{_{k_{i}}}^{w_{i}} \end{aligned}$$
    (14)

    Where \(w_i\) represents the weight of the matching score of the \(i\) th biometric trait of the \(k\) th user, which is calculated as follows:

    $$\begin{aligned} w_i = \frac{\frac{1}{EER_i}}{\sum _{j=1}^{N} (\frac{1}{EER_j})} \end{aligned}$$
    (15)
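The four fusion rules of Eqs. 11 to 14 and the EER-based weights of Eq. 15 can be sketched as follows. The scores and EER values are made up for illustration, the scores are assumed to be already normalized, and the function and rule names are hypothetical.

```python
import math

def eer_weights(eers):
    """Eq. 15: weights proportional to 1 / EER, normalized to sum to 1."""
    inv = [1.0 / e for e in eers]
    total = sum(inv)
    return [v / total for v in inv]

def fuse(scores, rule="sum", weights=None):
    """Fuse the N normalized unimodal scores of one user."""
    if rule == "sum":                 # Eq. 11
        return sum(scores)
    if rule == "product":             # Eq. 12
        return math.prod(scores)
    if rule == "weighted_sum":        # Eq. 13
        return sum(w * s for w, s in zip(weights, scores))
    if rule == "weighted_product":    # Eq. 14
        return math.prod(s ** w for w, s in zip(weights, scores))
    raise ValueError(f"unknown fusion rule: {rule}")

scores = [0.8, 0.4]                   # two modalities, illustrative values
weights = eer_weights([0.01, 0.03])   # illustrative EERs of the two systems
fused = {rule: fuse(scores, rule, weights)
         for rule in ("sum", "product", "weighted_sum", "weighted_product")}
```

With EERs of 0.01 and 0.03, the weights are 0.75 and 0.25, so the more accurate modality contributes more to the weighted rules.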

3.6 Simplified PalmNet Gabor algorithm

The proposed approach uses an innovative procedure based on deep learning and feature selection for palmprint recognition. First, we apply Log-Gabor filters in the preprocessing step to adjust the pixel luminance of palmprint images. Then, in order to extract discriminative palmprint features, we use the PalmNet Gabor network [17] in feature extraction. Additionally, we employ a Support Vector Machines (SVM) classifier and K-Nearest Neighbour (KNN) classifiers. The key idea of our approach is to use feature selection and dimensionality reduction procedures to improve PalmNet Gabor performance and reduce feature vector size. Therefore, we use the Fisher score and ReliefF feature selection algorithms and dimensionality reduction WPCA algorithm. For the multimodal system, we combined modalities at the matching score level to improve system performance. To validate our method, we applied it to several public palmprint databases.

4 Experiments and results

This section presents the experimental evaluation, by conducting experiments on four popular and publicly available databases which are CASIA multispectral palmprint, PolyU multispectral palmprint, Tongji contactless palmprint, and PolyU 2D/3D contactless palmprint. First, we give a brief description of the adopted palmprint databases (Sect. 4.1). Second, we present the setup of our approach in the experimental setup (Sect. 4.2). Finally, in the experimental results (Sect. 4.3), we discuss and analyse the results.

4.1 Databases

The proposed method is tested using four publicly available palmprint databases: the CASIA and PolyU multispectral databases and the Tongji and PolyU 2D/3D contactless databases. The aim of employing contact-based and contactless databases is to verify the robustness and high efficiency of our method. A description of these databases is given below.

4.1.1 CASIA multispectral palmprint database V1.0

The CASIA Multispectral Palmprint Image Database comprises 7,200 palm images obtained from 100 different individuals using the self-designed multispectral imaging device described in [46]. In this database, the images of each hand are collected in two separate sessions. The time interval between the two sessions is more than one month. In each session, there are three samples. Each sample includes six palm images, which are captured at the same time under six different electromagnetic spectra. The illuminator wavelengths corresponding to the six spectra are 460, 630, 700, 850, and 940 nm, and white light, respectively (see Fig. 4). Between two samples, a certain degree of variation in hand posture is allowed. All palm images are low resolution (< 150 dpi) and stored as 8-bit gray-level images per band with dimensions of \(128\times 128\).

Fig. 4 Palmprint ROI samples from the multispectral CASIA database. a 460 nm, b 630 nm, c 700 nm, d 850 nm, e 940 nm, and f White light

4.1.2 PolyU multispectral palmprint images database

The PolyU Multispectral Palmprint Images Database comprises 6000 images per band, obtained from 500 different palms using the palmprint image capturing device designed by Hong Kong Polytechnic University researchers and described in [47]. The database contains cropped multispectral palmprint images of four different bands (Red, Green, Blue, and NIR), as shown in Fig. 5. The images were collected in two separate sessions. In each session, the person provides 6 images per palm, so there are 12 images for each palm in each band. Therefore, 48 spectral images (12 in each of the four bands) were collected from each palm. The average time interval between the first and the second sessions was about nine days. All palm images are low resolution (\(<150\) dpi) and stored as 8-bit gray-level images per band with dimensions of \(128\times 128\).

Fig. 5 Palmprint ROI samples from the multispectral PolyU database. a Red, b Green, c Blue, and d NIR

4.1.3 Tongji contactless palmprint dataset

The Tongji Contactless Palmprint Dataset comprises 12,000 images obtained from 600 different palms using the proprietary touchless acquisition device described in [48]. Tongji University collected images from 300 volunteers, including 192 males and 108 females. Among them, 235 people were between the ages of 20 and 30, and the others were between the ages of 30 and 50. The left and right samples were collected in two separate sessions (see Fig. 6). In each session, the person provides ten images per palm. Therefore, 40 images from 2 palms were collected from each person. The average period of time between the first and second sessions was approximately 61 days. The minimum and maximum time intervals were 21 days and 106 days, respectively. All palm images are stored as 8-bit gray-level images with dimensions of \(128\times 128\).

Fig. 6 Palmprint ROI samples from the Contactless Tongji database. a Left and b Right

4.1.4 PolyU 2D/3D contactless palmprint dataset

The PolyU 2D/3D Contactless Palmprint database contains 8000 images collected from 400 palms of 200 volunteers. It was created by the Bio-Research Center (UGC/CRC) of Hong Kong Polytechnic University [49]. The participants include 136 males and 64 females, aged from 18 to 50 years. Each person provided twenty samples for each of the left and right palms. The left and right palms of the same person can be considered as belonging to different classes. Thus, there are 400 classes of 2D/3D palmprint image samples. The samples were collected in two sessions, with ten samples captured in each session, and the average time between the two sessions is one month. All 2D images in this database are stored as 8-bit gray-level images with dimensions of \(128\times 128\). In this work, we use only the 2D ROI images. Fig. 7 shows 2D ROI images from this database.

Fig. 7 Palmprint 2D ROI samples from the Contactless PolyU 2D/3D database

4.2 Experimental setup

An identification system can operate in two modes: open-set and closed-set identification. In open-set mode, the person to be identified is not guaranteed to exist in the database, whereas in closed-set mode the person is assumed to exist. The proposed approach has been tested in both modes. Three principal criteria characterize the performance of a biometric system in open-set identification:

  • False Rejection Rate (FRR): the percentage of genuine users who should be accepted but are rejected by the system. It is given by the following equation:

    $$\begin{aligned} \mathrm{FRR}\,(\%)=\frac{\mathrm{Number~of~rejected~genuine~accesses~(FR)}}{\mathrm{Total~number~of~genuine~accesses}}\times 100 \end{aligned}$$
    (16)
  • False Acceptance Rate (FAR): the percentage of impostors who should be rejected but are accepted by the system. It is given by the following equation:

    $$\begin{aligned} \mathrm{FAR}\,(\%)=\frac{\mathrm{Number~of~accepted~impostor~accesses~(FA)}}{\mathrm{Total~number~of~impostor~accesses}}\times 100 \end{aligned}$$
    (17)
  • Equal Error Rate (EER): the error rate at the operating point where the false acceptance rate and the false rejection rate are equal (FAR = FRR). It corresponds to the threshold at which false rejections and false acceptances are balanced.

We can use another performance measurement, which is the Genuine Acceptance Rate (GAR). It represents the system identification rate. This rate is defined as the percentage of genuine users accepted by the system, which is expressed as follows:

$$\begin{aligned} \mathrm{GAR}\,(\%)=100-\mathrm{FRR}\,(\%) \end{aligned}$$
(18)

The Receiver Operating Characteristic (ROC) curve can then be plotted as the GAR against the FAR or, alternatively, as the FRR against the FAR [50].
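As an illustration of the open-set measures above, the following minimal Python sketch computes FAR, FRR, GAR, and an approximate EER from two lists of matching scores. It assumes similarity scores (higher means more similar) and an accept-if-at-or-above-threshold rule; the function names and the toy scores are hypothetical and are not taken from the experiments reported here.

```python
from typing import Sequence, Tuple


def far_frr(genuine: Sequence[float], impostor: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Return (FAR %, FRR %) when a score >= threshold is accepted."""
    false_accepts = sum(1 for s in impostor if s >= threshold)
    false_rejects = sum(1 for s in genuine if s < threshold)
    far = 100.0 * false_accepts / len(impostor)
    frr = 100.0 * false_rejects / len(genuine)
    return far, frr


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Approximate the EER by trying every observed score as a threshold.

    Returns (EER %, threshold) at the threshold where |FAR - FRR| is
    smallest; the EER is taken as the mean of FAR and FRR at that point.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Toy scores, for illustration only.
genuine_scores = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor_scores = [0.12, 0.33, 0.41, 0.70, 0.25]

far, frr = far_frr(genuine_scores, impostor_scores, threshold=0.5)
gar = 100.0 - frr  # Eq. 18
eer, eer_threshold = equal_error_rate(genuine_scores, impostor_scores)
print(far, frr, gar, eer, eer_threshold)
```

Sweeping the threshold over all observed scores and plotting the resulting (FAR, FRR) or (FAR, GAR) pairs yields the ROC curve described above.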

In closed-set identification, the accuracy of a biometric system is measured with the Cumulative Match Characteristic (CMC) curve, which plots the identification rate against the rank. Two criteria are associated with this curve: the Rank of Perfect Rate (RPR), defined as the rank at which the identification rate reaches \(100\%\), and the Rank-One Recognition (ROR), defined as the percentage of persons correctly recognized by the system at rank 1.
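The closed-set measures can be sketched as follows. This is a minimal illustration, assuming a similarity matrix with one row per probe and one column per gallery identity (a single template per identity); ROR is read as the identification rate at rank 1 and RPR as the first rank at which the rate reaches 100 %. The helper names and toy values are hypothetical.

```python
from typing import List, Sequence


def match_ranks(similarity: Sequence[Sequence[float]],
                gallery_labels: Sequence[int],
                probe_labels: Sequence[int]) -> List[int]:
    """Rank (1 = best) at which each probe's true identity appears."""
    ranks = []
    for scores, true_label in zip(similarity, probe_labels):
        # Gallery indices sorted from most to least similar.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        ranked_labels = [gallery_labels[j] for j in order]
        ranks.append(ranked_labels.index(true_label) + 1)
    return ranks


def cmc_curve(ranks: Sequence[int], n_gallery: int) -> List[float]:
    """Identification rate (%) at ranks 1..n_gallery."""
    total = len(ranks)
    return [100.0 * sum(1 for r in ranks if r <= k) / total
            for k in range(1, n_gallery + 1)]


# Toy example: three gallery identities, four probes.
gallery = [0, 1, 2]
probes = [0, 1, 2, 1]
similarity = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.4],
    [0.6, 0.1, 0.5],  # true identity 2 is only the second-best match
    [0.2, 0.7, 0.3],
]

ranks = match_ranks(similarity, gallery, probes)
cmc = cmc_curve(ranks, n_gallery=len(gallery))
ror = cmc[0]  # rank-one recognition rate
rpr = next(k for k, rate in enumerate(cmc, start=1) if rate == 100.0)
print(ranks, cmc, ror, rpr)
```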

Moreover, to evaluate the computational requirements of each algorithm, we report the time indicator, which is the CPU time in seconds needed to classify one palmprint image.

To evaluate the efficiency of our proposed method, the experiments were conducted on two sub-datasets. The first sub-dataset is used for the training phase, while the second is for the testing phase. Each sub-dataset contained 50% of the images in the database.

To reach the best possible recognition accuracy on the considered datasets, we experimentally tuned some filter parameters and selected the others by considering the optimal values found in the literature [17]. For the Gabor filters, the numbers of filters in the two stages were set to \(k_1=k_2=13\). The number of fixed-scale 2-D Gabor filters F is 10, with dimensions \(h_{1}= h_{2}=31\), and the number of adaptive multiscale 2-D Gabor filters \(A^{'}\) is 3.

The feature vector size is computed using Eq. 7 as follows: \(\left| H\right| =k_{1}n_{B}2^{k_{2}} = 13 \times 4 \times 2^{13}=425984\), where \(n_B=4\) is the number of non-overlapping blocks of size \(b_1=b_2=15\) that fit in the input ROI image of size \(u=v=32\). The network parameters are summarized in Table 3.
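The arithmetic above can be checked with a short Python sketch. It assumes that the number of non-overlapping blocks is obtained by integer division of the image side by the block side in each direction, which reproduces \(n_B=4\) for the stated values; the function name is hypothetical.

```python
def palmnet_feature_length(k1: int, k2: int, u: int, v: int,
                           b1: int, b2: int) -> int:
    """Feature vector size |H| = k1 * n_B * 2**k2.

    n_B is taken as the number of whole b1 x b2 blocks that fit in a
    u x v image (an assumption consistent with n_B = 4 in the text).
    """
    n_blocks = (u // b1) * (v // b2)
    return k1 * n_blocks * 2 ** k2


n_features = palmnet_feature_length(k1=13, k2=13, u=32, v=32, b1=15, b2=15)
print(n_features)  # 13 * 4 * 8192
```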

Table 3 Network parameters

4.3 Experimental results

In our experiments, after extracting the Region of Interest (ROI) from the palmprint images, we resized the images to \(32 \times 32\) and applied the Log-Gabor filters to adjust pixel luminance. We used two layers of Gabor filters with an adaptive Gabor-based filter tuning technique to extract palmprint-specific informative features. To increase recognition accuracy and reduce computation time, we used the Fisher score and ReliefF feature selection algorithms together with the whitening Principal Component Analysis (WPCA) dimensionality reduction algorithm. For classification, we used the Support Vector Machine (SVM) classifier and the Nearest Neighbour classifier (1-NN). Finally, we fused the spectral bands at the matching score level to improve identification system performance.
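To make the feature selection step more concrete, the sketch below ranks features with the commonly used form of the Fisher score, i.e. the ratio of between-class scatter to within-class scatter computed independently for each feature. This is an illustration under that assumption, not necessarily the exact variant used in our implementation; the function names and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import List, Sequence


def fisher_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Per-feature Fisher score: sum_c n_c (mu_c - mu)^2 / sum_c n_c var_c."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between, within = 0.0, 0.0
        for rows in by_class.values():
            values = [r[j] for r in rows]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        # A feature with zero within-class variance separates perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_k_features(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 1.0], [3.0, 4.0], [3.2, 2.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
selected = top_k_features(scores, k=1)
print(scores, selected)
```

In the pipeline described above, the retained feature subset would then be passed to the dimensionality reduction step and to the classifier.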

All the computation times presented in this paper were obtained with MATLAB\(\circledR\) 2018a on a PC with an Intel Core i7-4710MQ processor (2.50 GHz) and 16 GB of RAM.

The experimental results are divided into three subparts. The first presents the results of the unimodal identification system evaluated on the CASIA and PolyU multispectral databases and on the contactless Tongji and PolyU 2D/3D databases. The second presents the results of the multimodal identification system. The third provides a comparison with related works in the literature.

4.3.1 Unimodal biometric system performance

The unimodal system was evaluated on four public multispectral and contactless palmprint databases.

  1. CASIA multispectral database results

    To evaluate the effectiveness of the proposed method, experiments were carried out on the CASIA multispectral database with 100 persons and six spectral bands (460, 630, 700, 850, 940 nm, and white light). Since our evaluation adopts random splits for training and testing, we studied the effect of the amount of training data on the method’s performance. To this end, we ran our method using three different random training-testing splits without feature selection (see Table 4). The results show that the 50% training-testing split gives the best results for all spectral bands; therefore, the remaining experiments use this split.

    To assess the stability of the proposed approach, we computed the mean and standard deviation of the main performance indicators, EER and ROR, over ten random splits, each containing six images for training and the remaining six for testing. Two cases were considered with the SVM classifier: without feature selection (FS), and with FS (Fisher score) and dimensionality reduction (WPCA). The number of genuine and impostor comparisons for each spectral band is 600 and 29,700, respectively. Table 5 reports the EER (%) and ROR (%) obtained on the ten random splits in both cases for the six spectral bands of the CASIA database, together with their mean and standard deviation. Comparing the two cases shows that feature selection and dimensionality reduction improved the average performance (i.e., EER and ROR) and reduced the standard deviations, implying that performance is relatively stable. Moreover, without feature selection the best results were obtained with the 630 nm band, whereas with feature selection they were obtained with the 460 nm band.

    Table 6 shows the results of the proposed method with two classifiers (1-NN and SVM) and two modes of identification (open-set and closed-set), as well as the mean EER (%), ROR (%), and time over all spectral bands. Comparing all the findings, the 460 nm spectral band gives the best results in terms of EER and ROR. First, the number of features of the unimodal identification system is reduced from 12,288,000 in [17] to 425,984 in our work. To reduce the features further, applying the Fisher score algorithm, with or without the ReliefF algorithm, decreased the number of features while maintaining good identification accuracy. Figures 8 and 9 show an example of feature selection with the Fisher score algorithm and an SVM classifier, plotting EER (%) and GAR (%), respectively, against the number of ranked features. The Fisher score with the SVM classifier achieves perfect results, with an EER of 0.000 % in the open set and a ROR of 100 % in the closed set for the 460 nm spectral band with 11,500 features. Fig. 10 illustrates the ROC and CMC curves of this case for all spectral bands. In the last case, the Fisher score with WPCA, the system yielded better results than without FS: it achieves an EER of 0.003 % in the open set and a ROR of 99.83 % in the closed set for the 460 nm spectral band, while reducing the feature vector size to 410 and the CPU time to 0.001 s instead of 0.478 s. Thus, FS and dimensionality reduction allowed us to reduce the number of features and improve the identification accuracy.

Table 4 Performances comparison of different random training-testing splits
Table 5 Mean and standard deviation results using 50% for training and 50% for testing
Table 6 The unimodal identification system performance for the CASIA database using 50% for training and 50% for testing
Fig. 8 The performance of EER (%) against the number of ranked features on the CASIA database using 50% for training and 50% for testing

Fig. 9 The performance of GAR (%) against the number of ranked features on the CASIA database using 50% for training and 50% for testing

Fig. 10 Unimodal system results for the CASIA database using 50% for training and 50% for testing. a ROC curves (FRR against FAR) and b CMC curves (Identification rate against Rank)

  2. PolyU multispectral database results

    In the following, we present the experimental results of the proposed system on the PolyU multispectral database, which contains 500 persons and four modalities (Red, Green, Blue, and NIR). In our experiment, six images of each person are selected for training and the other six for testing; that is, 3000 images are used for training and 3000 for testing in each modality. This generates 3000 genuine comparisons and 748,500 impostor comparisons for each band. Table 7 presents the performance of the biometric system in the two identification modes, together with the CPU time needed to classify one palmprint image.

    The NIR and Blue spectral bands give the best results in terms of EER (%) and ROR (%). For example, open-set identification with the 1-NN classifier and without FS gives EERs of \(4.008\times 10^{-4}\) % and \(8.016\times 10^{-4}\) % for the NIR and Blue spectral bands, respectively. For closed-set identification, the system achieved a ROR of \(99.96~\%\) with a CPU time of 0.121 s for the NIR band, and a ROR of \(99.96~\%\) with a CPU time of 0.118 s for the Blue band. It is also clear that feature selection and dimensionality reduction improve the system’s accuracy and reduce computational time. The Fisher score with the SVM classifier achieves perfect results, with an EER = \(0.000~\%\) in the open-set and a ROR = \(100~\%\) in the closed-set for the NIR spectral band. The ROC and CMC curves of all spectral bands are shown in Fig. 11. The Fisher score + ReliefF with the SVM classifier reduced the feature vector size to 3000 and gave a perfect result, with an EER = \(0.000~\%\) in the open-set and a ROR of \(100~\%\) in the closed-set for the Blue spectral band. Similarly, the Fisher score \(+\) WPCA yielded perfect results, with an EER = \(0.000~\%\) in the open-set and a ROR = \(100~\%\) in the closed-set, and reduced the feature vector size to 410 and the CPU time to 0.007 s for the NIR spectral band.

Table 7 The unimodal identification system performance for the PolyU database using 50% for training and 50% for testing
Fig. 11 Unimodal system results for the PolyU database using 50% for training and 50% for testing. a ROC curves (FRR against FAR) and b CMC curves (Identification rate against Rank)

  3. Tongji contactless database results

    To verify the robustness and efficiency of the proposed method on a contactless database, we used the Tongji database, which contains 300 persons with two modalities (Left and Right hands). In our experiment, we use ten images of each person for training and the other ten for testing; that is, 3000 images are used for training and 3000 for testing in each modality. Thus, 3000 genuine comparisons and 448,500 impostor comparisons are generated. Table 8 shows the performance of our unimodal biometric system in the two identification modes (open-set and closed-set) and the CPU time needed to classify one palmprint image. Almost all configurations give perfect accuracy, and feature selection and dimensionality reduction with the two classifiers did not decrease the performance of our identification system. In particular, with the SVM classifier, the Fisher score with WPCA yielded perfect results, with an EER of \(0.000~\%\) in the open-set and a ROR of \(100~\%\) in the closed-set. It also reduced the feature vector size to 410 and the CPU time to 0.004 s instead of 6.987 s (without FS) for the Left hand and to 0.004 s instead of 7.184 s (without FS) for the Right hand.

Table 8 The performance of unimodal identification systems for the Tongji database using 50% for training and 50% for testing

In our case, fusion is not needed for the PolyU and Tongji databases because the unimodal system already gives perfect results (EER = \(0.000 ~\%\) and ROR = \(100~\%\)).

  4. PolyU 2D/3D contactless database results

    To evaluate the efficiency of the proposed method on a second contactless database, we used the PolyU 2D/3D database, which contains 400 classes (the left and right palms of 200 people). In our experiment, we used the 2D images from this database, splitting the 20 images of each class into ten for training and ten for testing. Thus, 4000 images are used for training and 4000 for testing. As a result, there are 4000 genuine comparisons and 798,000 impostor comparisons. Table 9 shows the performance of our unimodal biometric system in the two identification modes (open-set and closed-set) and the CPU time needed to classify one palmprint image. The SVM classifier achieves better performance than the NN classifier, especially with the Fisher score. Furthermore, the Fisher score with WPCA provided perfect results, with an EER of \(0.000~\%\) in the open-set and a ROR of \(100~\%\) in the closed-set, while reducing the feature vector size to 410 and the CPU time to 0.007 s instead of 2.394 s (without FS).

Table 9 The performance of unimodal identification systems for the Contactless PolyU 2D/3D database using 50% for training and 50% for testing
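The genuine and impostor comparison counts reported for the four databases are consistent with a simple counting rule: one genuine comparison per test image, and one impostor comparison per test image index for every unordered pair of distinct classes. The rule is inferred here from the reported numbers rather than stated as the evaluation protocol, and the function name below is hypothetical.

```python
from math import comb
from typing import Tuple


def comparison_counts(n_classes: int, test_per_class: int) -> Tuple[int, int]:
    """Return (genuine, impostor) comparison counts under the inferred rule."""
    genuine = n_classes * test_per_class
    impostor = test_per_class * comb(n_classes, 2)
    return genuine, impostor


counts = {
    "CASIA": comparison_counts(100, 6),
    "PolyU": comparison_counts(500, 6),
    "Tongji": comparison_counts(300, 10),
    "PolyU 2D/3D": comparison_counts(400, 10),
}
for name, (genuine, impostor) in counts.items():
    print(f"{name}: {genuine} genuine, {impostor} impostor")
```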

4.3.2 Multimodal biometric system performance

Unimodal systems suffer from some limitations and cannot provide satisfactory recognition performance in several cases, such as noise in the biometric modality and its non-universality [51], which increase the system error (EER). Intra-class dissimilarity and inter-class similarity can also affect a unimodal biometric system and hence the identification result [52]. An excellent biometric identification system requires a very low EER, which can be achieved by a multimodal system [53, 54]. Such a system combines several features of each modality at different levels to improve performance; matching score level fusion is the most widely used in biometric systems. In our work, we fuse only the spectral bands of the CASIA palmprint database at the matching score level, because the other databases already gave perfect results with unimodal biometric systems (EER = \(0.000 ~\%\) and ROR = \(100~\%\)). The experiment was conducted with four fusion methods: the SUM rule (SUM), the product rule (MUL), and their weighted versions, that is, the weighted SUM rule (WHT SUM) and the weighted product rule (WHT MUL). The performance of our multimodal identification system is shown in Tables 10 and 11. The results of the multimodal fusion were much better than those of the unimodal biometric systems. The lowest EER of multimodal identification was obtained by combining all spectral bands, and it is always better than the best result of the unimodal system. In addition, the best multimodal results were obtained with an EER = \(0.000~\%\). In contrast, the best result of the unimodal biometric system was \(0.003~\%\) (Fisher score \(+\) WPCA with the SVM classifier). Figure 12 illustrates the CMC and ROC curves for the multimodal identification system.
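The four fusion rules can be sketched as follows. This is a minimal illustration that assumes the per-band matching scores have already been normalised to a common range such as [0, 1]; the way the weights are chosen is not specified in this section, so the weights below, like the function name and the toy scores, are purely hypothetical.

```python
from math import prod
from typing import Optional, Sequence


def fuse_scores(scores: Sequence[float], rule: str = "sum",
                weights: Optional[Sequence[float]] = None) -> float:
    """Fuse the matching scores of several spectral bands into one score."""
    if rule == "sum":        # SUM rule
        return sum(scores)
    if rule == "mul":        # product rule (MUL)
        return prod(scores)
    if weights is None or len(weights) != len(scores):
        raise ValueError("weighted rules need one weight per score")
    if rule == "wht_sum":    # weighted SUM rule
        return sum(w * s for w, s in zip(weights, scores))
    if rule == "wht_mul":    # weighted product rule
        return prod(s ** w for w, s in zip(weights, scores))
    raise ValueError(f"unknown rule: {rule}")


# Toy scores for two bands and hypothetical weights summing to 1.
band_scores = [0.5, 0.25]
band_weights = [0.75, 0.25]
fused = {rule: fuse_scores(band_scores, rule, band_weights)
         for rule in ("sum", "mul", "wht_sum", "wht_mul")}
print(fused)
```

The fused score then replaces the single-band score in the open-set thresholding or closed-set ranking step.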

Table 10 The performance of multimodal identification system fusion between (460, 630, 700) and (850, 940, WHT) using 50% for training and 50% for testing
Table 11 The performance of multimodal identification system (fusion between all spectral bands) using 50% for training and 50% for testing
Fig. 12 Multimodal system results for the CASIA database (SUM rule) using 50% for training and 50% for testing. a ROC curves (FRR against FAR) and b CMC curves (Identification rate against Rank)

4.3.3 Comparative study

To demonstrate the effectiveness of the proposed approach against alternative methods, we carried out a comparative study with works found in the literature. In this paper, we provided unimodal and multimodal identification systems using multispectral and contactless palmprint images. The results validated the robustness and effectiveness of the proposed method, and the system reached very high identification accuracy (\(EER = 0.000~ \%\) and \(ROR = 100~\%\)). A comparison with related existing techniques on the same databases is therefore needed. Table 12 summarizes the works on unimodal biometric systems evaluated on the multispectral CASIA and PolyU databases and the contactless Tongji and PolyU 2D/3D databases. It lists the recognition accuracies of the proposed method and of the most recent methods published in the literature, together with the number of features and the computing time of the different classifiers. The computing time represents the CPU time needed to classify one palmprint image. To obtain an equitable comparison, we chose works where the percentage of training and testing is \(50~ \%\). From this table, we observe that the proposed algorithm (Simplified PalmNet-Gabor) provides high identification performance for the multispectral PolyU database and the contactless Tongji and PolyU 2D/3D databases, using a substantially lower number of features than the other methods listed in the same table. On the other hand, for the multispectral CASIA database, a fusion process is used to improve the identification performance.

Table 12 Performance comparison of the unimodal systems using 50% for training and 50% for testing

For the multimodal biometric system, Table 13 presents a comparative study with the works that used fusion at the matching score level and a 50% train-test split. The proposed algorithm also gives perfect identification performance (\(\mathrm{EER} = 0.000~\%\), \(\mathrm{ROR} = 100~\%\)), like the two other works, but it reduces the feature vector size to 410 instead of 46,060 for the two other works.

Table 13 Performances comparison of the multimodal systems using 50% for training and 50% for testing

5 Conclusion

In this work, we proposed efficient unimodal and multimodal identification systems for fast palmprint recognition. The proposed method, namely simplified PalmNet-Gabor, adopts the PalmNet network with enhancements based on feature selection and dimensionality reduction methods. Specifically, we used the Fisher score and ReliefF methods to select a subset of relevant PalmNet features and the WPCA method for dimensionality reduction, in order to reduce the computational time and improve the recognition accuracy. Furthermore, we applied log-Gabor filters to adjust the pixel luminance of palmprint images. For the multimodal system, we used matching score fusion to improve the performance of the unimodal system. The proposed method effectively improves the accuracy of PalmNet, reduces the number of features, and addresses the problem of computational complexity. The proposed approach was evaluated using four popular and publicly available palmprint databases. The extensive experiments presented in this work have validated the robustness and effectiveness of our method by achieving high recognition accuracy with a significantly smaller number of features. The unimodal identification system evaluated on the multispectral PolyU database and the contactless Tongji and PolyU 2D/3D databases obtained a very high identification accuracy (\(\mathrm{EER} = 0.000~\%\) and \(\mathrm{ROR} = 100~\%\)), with a CPU time of less than 0.009 s and a feature vector size reduced to 410. Likewise, the multimodal identification system evaluated on the CASIA database offers perfect results, with \(\mathrm{EER} = 0.000~\%\) for open-set identification and \(\mathrm{ROR} = 100~\%\) for closed-set identification. In the future, we will test our proposed method on other large databases, such as medical images. We will also employ additional clustering and dimensionality reduction techniques. Furthermore, we will use the graphics processing unit (GPU), a highly valuable tool for accelerating computationally intensive algorithms, to reduce processing time.