1 Introduction

Human biometrics, in the form of signals and images, are widely used in applications such as human recognition, identification, and authentication. Among them, the most extensively used traits are fingerprint, iris, finger vein, and facial images. Today, these traits are ubiquitous for human identification in banks, airports, and hospitals, and in applications ranging from surveillance and litigation to education, electronic healthcare, and many other areas. More recently, human biometrics have assumed important roles in engineering technology and security, including blockchains, the Internet of Things (IoT), and cloud computing (Gudeme et al. 2020). The preponderance and ubiquity of human biometric information make it imperative to invest in securing such biometric data. This is the primary objective of this study, which presents a scheme to generate and evaluate cancellable biometric templates. Our choice of cancellable biometric systems (CBS) is motivated by their proven capability to facilitate updates to biometric information whenever needed without affecting the rest of the system (Gudeme et al. 2020; Peng et al. 2021; Kaur and Khanna 2020; Wang et al. 2017; Akdogan et al. 2018). This makes CBS attractive for deployment in advanced biometric identification and authentication frameworks, especially in platforms of emerging technologies such as cloud services and the IoT. To deliver on our outlined contribution, the rest of the study is organised as follows. Section 2 presents an overview of the work related to the contributions of this study. Section 3 presents an overview of cancellable biometric systems as the foundation for our proposed multi-biometric cancellable scheme (MBCS), which is presented in later parts of the same section. Section 4 is devoted to extensive performance evaluation, with simulation results reporting visual, quantitative, qualitative, and complexity analyses of our proposed MBCS relative to similar state-of-the-art methods reported in the literature.

2 Related work

Cancellable biometric templates provide frameworks to safeguard biometric information in which different inputs can be revised or updated without affecting the rest of the system (Peng et al. 2021). Furthermore, CBS are known to have low storage demands (Kaur and Khanna 2020), which has led to their consideration in different applications and domains. For example, a bi-level biometric system employing secure key agreement (SKA) protocols was proposed by Kaur et al. in Kaur and Khanna (2020). Therein, the first protocol uses pure biometrics for key agreement (SKA-PB), while the second uses cancellable biometrics (SKA-CB). Both protocols operate on biometrics with unordered features, but SKA-PB establishes a symmetric cryptographic key between client and server, so the resulting key is composed of random information. Unlike SKA-PB, SKA-CB operates on cancelled variants of the client's biometrics instead of the pure biometrics, and the ability to generate cancellable templates is acquired through a binary string technique. As a safeguard against template compromise, biometric templates in the SKA-CB protocol can be revoked at any time. The protocols were validated using two fingerprint databases, with multi-criteria security and complexity analyses.

Similar to Kaur and Khanna (2020), Akdogan et al. (2018) proposed a CBS based on the correlation-invariant random filtering (CIRF) protocol. Their scheme is considered a promising template-protection approach that represents the features of a given biometric using the two-dimensional number theoretic transform (NTT) and random filtering. However, despite its utility, CIRF incurs heavy computational demands attributable to the inverse 2D NTT used in the matching stage. This makes its development into a biometric identification system very costly. Consequently, in Akdogan et al. (2018), the concept of cancellable indexing was proposed to circumvent these issues around CIRF. The resulting protocol uses low-rank approximations of biometric traits in a minimum spanning tree representation. Akdogan et al.'s scheme is thus a two-stage score-level fusion approach that combines the scores obtained from cancellable templates of different biometric modalities. At the first level, scores from different matchers are combined using a mean-closure weighting (MCW) strategy. Subsequently, at the second level, scores from the various modalities are fused using a rectangular area weighting (RAW) technique.

Elsewhere, in Murakami et al. (2019), a cancellable biometric approach based on Gaussian random vectors (GRV) and hashing was proposed by Murakami et al. The proposed framework stores and authenticates transformed variants of the original biometrics rather than the original biometrics themselves. The study reports assessments of performance, non-invertibility, and uniqueness, while validation was accomplished using face and palmprint biometric modalities.

In their contribution, Yang et al. proposed a CBS based on a random slope strategy in Yang et al. (2018). This technique reportedly creates a secure and revocable system with non-invertible cancelled templates. Structurally, the technique consists of two stages, random slope version one (RS-V1) and version two (RS-V2), which produce secured templates with a two-thirds reduction in dimensionality. The technique was validated using different biometric modalities, including visible and thermal face, fingerprint, palm vein, and finger vein images.

CBS techniques have also been used to secure access to cloud-based platforms, where biometric authentication safeguards different levels of cloud resources (Gudeme et al. 2020). Among others, pseudo-biometric identities derived from cancellable biometrics have been used to circumvent unauthorised access (Peng et al. 2021), with the cancelled biometric identities generated using a random distance method. Additionally, the generated biometrics exhibited the properties of revocability, diversity, and security for users in diverse applications. Similarly, in Rathgeb and Busch (2014), Rathgeb and Busch proposed a cancellable biometric approach based on adaptive Bloom filter-based transforms. This technique integrates features of binary iris biometric templates to generate an irreversible fused template that obscures the information of both iris templates. The framework was implemented on Iris Database version 1.0 (Rathgeb and Busch 2014). In Trivedi et al. (2020), a CBS was proposed for fingerprint biometrics based on information extracted from the Delaunay triangulation of minutiae points. Further, to generate a user-specific cancellable biometric, a random binary string key is generated for each user. As with the other highlighted CBS methods, the generated cancellable template exhibits the properties of revocability, diversity, and security.

Meanwhile, in their contribution in Dwivedi and Dey (2019), Dwivedi and Dey proposed a non-invertible cancellable biometric technique based on the partial Hadamard transform. To address security, the transformation is carried out on the DFT-transformed biometric instead of the original biometric, specifically on fingerprint biometrics, with the generated cancellable templates required to satisfy revocability, diversity, non-invertibility, and performance. Similarly, Kaur and Khanna proposed a multi-biometric CBS based on both fingerprint and finger vein modalities in Kaur and Khanna (2015). That study aimed at meeting the protection and revocability requirements of the generated templates, which combine a minutia-based fingerprint feature set and an image-based finger vein feature set. Subsequently, a fusion process is carried out on both sets, and the proposed approach was evaluated and analysed with respect to matching and security. They then compared their proposed method against the original partial discrete Fourier transform (P-DFT). In Kaur and Khanna (2019), Kaur and Khanna also proposed the random slope concept to generate cancellable features, designing two biometric template protection methods based on this concept for various biometric modalities to assess the efficacy and security of the approaches.

Recently, however, the potency of deep learning models (DLMs) has led to their emergence as the go-to strategy in applications across wide-ranging domains, including self-driving cars, natural language processing, entertainment, visual recognition, fraud detection, and healthcare. In healthcare, recent applications of DLMs have been reported in COVID-19 detection (Sedik et al. 2020b), e-healthcare for smart cities (Alghamdi et al. 2020), security (Al-Azrak et al. 2020; Elaskily et al. 2020; Sallam et al. 2019), biometric recognition (El-Rahiem et al. 2020; El-Moneim et al. 2019) and wireless communication (El-Ashkar et al. 2019).

Building on the highlighted contributions, in this study we explore the practicality of DLMs for generating secure and efficient cancellable biometric templates from iris, fingerprint, and finger vein modalities. Specifically, our strategy is to fuse these modalities into a unique template, yielding an efficient multi-biometric CBS.

Consequently, the main contributions of our proposed study include:

  1. Building a multi-exposure deep fusion module to generate a fused aggregate of different input biometric modalities.

  2. Deploying a deep dream module to generate a cancellable template from the fused biometric image modalities.

  3. Utilising standard metrics and techniques to evaluate the performance of the generated cancellable template relative to available state-of-the-art methods.

3 Proposed multi-biometric cancellable scheme

Cancellable biometric systems (CBS) safeguard biometrics by generating an alternative, encrypted template and storing it in place of the original biometric images. The generated cancellable templates are considered secure and unique for each identity.

Our proposed multi-biometric cancellable scheme (MBCS) is based on multi-exposure biometric fusion using deep dream. To elaborate, we employ a deep learning strategy to fuse the multi-exposure biometric modalities from iris, fingerprint, and finger vein biometrics that are subsequently transformed into a cancellable template. Figure 1 presents the layout of the proposed technique whose details are discussed in the remainder of this section.

Fig. 1

Outline of proposed multi-biometric cancellable system (MBCS)

3.1 Deep multi-exposure technique

As outlined in Fig. 1, for an enrollee, the multi-exposure deep fusion unit takes as inputs the separate iris, fingerprint, and finger vein modalities, consisting of an image of each finger of a hand, its vein, and the iris, making seven input biometrics altogether. The fusion process uses a sequence of convolutional layers to extract relevant features from these input biometric images. Following that, an additional layer generates a feature map from each image, and, finally, a reconstruction network generates the fused image.

Meanwhile, as elucidated in Fig. 2, in the case of multi-exposure fusion, this process is repeated recursively for each pair of images. Execution of the convolutional neural network (CNN)-based deep fusion process for each pair of biometric images is accomplished in three phases. In the first phase, targeted features are extracted from the input images and used to generate a feature mapping; in the second phase, the input images are fused; and in the third phase, the fused feature map is reconstructed using a reconstruction network. In terms of configuration, the feature extraction phase consists of two channels, \({C}_{1}\) and \({C}_{2}\), each composed of two convolution layers (\({C}_{1,1}\) and \({C}_{1,2}\) for the first channel, and \({C}_{2,1}\) and \({C}_{2,2}\) for the second). On its part, the fusion layer uses an addition operation to fuse feature maps \({F}_{1}\) and \({F}_{2}\). Finally, the reconstruction phase consists of three convolution layers, \({C}_{3}\), \({C}_{4}\) and \({C}_{5}\). These three phases combine to accomplish the multi-exposure deep fusion process that feeds the deep dream cancellable unit of our proposed MBCS scheme; a sketch of this architecture is given after Fig. 2, and further details of the multi-exposure deep fusion and deep dream cancellable processes are outlined in the sequel.

Fig. 2

Description of multi-exposure deep fusion process
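To make the configuration above concrete, the following sketch illustrates the two-channel fusion network in Keras. This is an assumption on our part: the paper does not publish its code, and the input shape, filter counts, and kernel sizes here are illustrative rather than the authors' exact settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_fusion_network(shape=(256, 256, 1)):
    """Sketch of the pairwise deep fusion network: two feature-extraction
    channels (C1, C2), additive fusion of the feature maps F1 and F2,
    and a three-layer reconstruction network (C3, C4, C5)."""
    in_1 = layers.Input(shape=shape)  # first image of the pair
    in_2 = layers.Input(shape=shape)  # second image of the pair

    # Channel C1: convolution layers C1,1 and C1,2
    f1 = layers.Conv2D(16, 3, padding="same", activation="relu")(in_1)
    f1 = layers.Conv2D(32, 3, padding="same", activation="relu")(f1)  # F1

    # Channel C2: convolution layers C2,1 and C2,2
    f2 = layers.Conv2D(16, 3, padding="same", activation="relu")(in_2)
    f2 = layers.Conv2D(32, 3, padding="same", activation="relu")(f2)  # F2

    fused = layers.Add()([f1, f2])  # fusion layer: element-wise addition

    # Reconstruction network: C3, C4, C5 map the fused features to an image
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(fused)
    x = layers.Conv2D(16, 3, padding="same", activation="relu")(x)
    out = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)
    return Model([in_1, in_2], out)
```

For the seven input biometrics, this network is applied recursively to successive image pairs, with each fused output paired with the next input, as described above.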

3.1.1 Loss function and optimisation

As a deep learning algorithm, the proposed multi-exposure fusion (MEF) process is optimised during the training phase. The loss function measures the error between real and estimated targets for an instance of data, and its choice depends on whether the application is classification or fusion. In classification applications, loss functions such as cross-entropy and mean square error are deployed to minimise the error between the real and estimated classes. In contrast, since the main goal of fusion applications (Fig. 3) is to generate an image whose quality is as close as possible to that of the input images, the structural similarity index measure (SSIM) is a suitable loss function. SSIM computes the structural similarity between images, so a high SSIM indicates concordance between the fused and input images, as deduced from Eqs. (1)–(5).

Fig. 3

Illustration of fusion process for image pairs

$$\begin{aligned} l\left( y_k,y_f\right) =\frac{2{\mu }_{y_k}{\mu }_{y_f}+C_1}{{\mu }^2_{y_k}+{\mu }^2_{y_f}+C_1}, \end{aligned}$$
(1)
$$\begin{aligned} c\left( y_k,y_f\right) =\frac{2{\sigma }_{y_k}{\sigma }_{y_f}+C_2}{{\sigma }^2_{y_k}+{\sigma }^2_{y_f}+C_2}, \end{aligned}$$
(2)
$$\begin{aligned} s\left( y_k,y_f\right) =\frac{{\sigma }_{y_k y_f}+C_3}{{\sigma }_{y_k}{\sigma }_{y_f}+C_3},\ \text {and} \end{aligned}$$
(3)
$$\begin{aligned} SSIM\left( y_k,y_f\right) ={\left[ l\left( y_k,y_f\right) \right] }^{\alpha }\cdot {\left[ c\left( y_k,y_f\right) \right] }^{\beta }\cdot {\left[ s\left( y_k,y_f\right) \right] }^{\gamma }, \end{aligned}$$
(4)

where \({\mu }_{y_k}\ {\rm and}\ {\mu }_{y_f},\ {\sigma }_{y_k}\ {\rm and}\ {\sigma }_{y_f},\) and \({\sigma }_{y_k y_f}\) are the local means, standard deviations, and cross-covariance for the input and output image patches \(y_k\) and \(y_f\), respectively, while \({C}_{1}\), \({C}_{2}\), and \({C}_{3}\) are stabilisation constants. If the parameters \(\alpha\) = \(\beta\) = \(\gamma\) = 1 (i.e., the default values used in the exponents) and \({C}_{3}\) = \({C}_{2}/2\) (i.e., the default selection of \({C}_{3}\)), then the simplified SSIM index is defined as:

$$\begin{aligned} SSIM\left( y_k,y_f\right) =\frac{\left( 2{\mu }_{y_k}{\mu }_{y_f}+C_1\right) \left( 2{\sigma }_{y_k y_f}+C_2\right) }{\left( {\mu }^2_{y_k}+{\mu }^2_{y_f}+C_1\right) \left( {\sigma }^2_{y_k}+{\sigma }^2_{y_f}+C_2\right) }, \end{aligned}$$
(5)

whose score at a certain pixel p is:

$$\begin{aligned} score\left( p\right) =SSIM\left( y_k,y_f\right) , \end{aligned}$$
(6)

and the total loss is calculated using (7).

$$\begin{aligned} Loss=\frac{1}{N}\sum _{p\in P}{score(p)}, \end{aligned}$$
(7)

where N is the total number of pixels in the image and P is the set of all pixels in the input image.

The computed loss is backpropagated during network training. The choice of SSIM as the performance indicator for MEF is attributed to its role as an objective function that maximises structural consistency between each fused and input image pairing.
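As a minimal illustration of Eqs. (5)–(7), the sketch below computes the simplified SSIM globally over two grayscale images. We assume here that the trainable objective is taken as 1 − SSIM per image pair, so that minimising the loss maximises structural consistency; the exact windowing and reduction used by the authors are not specified.

```python
import numpy as np

def ssim(y_k, y_f, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Simplified SSIM of Eq. (5), computed globally over two equally
    sized grayscale images given as float arrays in [0, 255]."""
    mu_k, mu_f = y_k.mean(), y_f.mean()
    var_k, var_f = y_k.var(), y_f.var()            # sigma^2 terms
    cov_kf = ((y_k - mu_k) * (y_f - mu_f)).mean()  # cross-covariance
    return ((2 * mu_k * mu_f + c1) * (2 * cov_kf + c2)) / (
        (mu_k ** 2 + mu_f ** 2 + c1) * (var_k + var_f + c2))

def fusion_loss(fused, inputs):
    """Assumed training objective: average of (1 - SSIM) between the
    fused image and each input exposure, so lower is better."""
    return float(np.mean([1.0 - ssim(y, fused) for y in inputs]))
```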

3.2 Deep dream algorithm

As presented earlier in Fig. 1 and the discussion that followed it, the second unit of our proposed multi-biometric cancellable system (MBCS) is the so-called deep dream algorithm (Cox 2019), which operates on losses defined over the convolutional layers of our deep learning architecture (these losses are maximised rather than minimised, as described in Sect. 3.2.1). These losses arise because a CNN performs digital filtering operations in which each filter extracts features from input images. Since a CNN consists of multiple layers and each layer includes a certain number of filters selected by a dedicated design, a layer comprising \({\varvec{N}}_{\varvec{l}}\) distinct filters has \({\varvec{N}}_{\varvec{l}}\) feature maps, each of size \({\varvec{M}}_{\varvec{l}}\), where \({\varvec{M}}_{\varvec{l}}\) is the product of the height and width of the feature map. Furthermore, the responses in a layer l can be stored in a matrix \({\varvec{F}}^{\varvec{l}}\) \(\in\) \({\varvec{R}}^{{\varvec{N}}_{\varvec{l}}\varvec{{\times }}{\varvec{M}}_{\varvec{l}}}\), where \({\varvec{F}}^{\varvec{l}}_{\varvec{ij}}\) is the activation of the \({\varvec{i}}{{th}}\) filter at position \(\varvec{j}\) in layer \(\varvec{l}\).

Consequently, the new value of a certain pixel \({\varvec{p}}_{\varvec{new}}\) is computed as the summation of the surrounding old pixels \(\varvec{p}\) multiplied by the applied filter elements \(\varvec{w}\), as formalised in (8).

$$\begin{aligned} p_{new}=\sum _{i\in s}{p_i\cdot w_i} \end{aligned}$$
(8)

If we consider \(\widetilde{\varvec{p}}\) and \(\widetilde{\varvec{x}}\) as the original and generated images and \({\varvec{P}}^{\varvec{l}}\) and \({\varvec{F}}^{\varvec{l}}\) as their respective feature representations at layer \(\varvec{l}\), then the squared-error loss between the two feature representations could be defined in the form presented in (9).

$$\begin{aligned} L\left( \tilde{p},\tilde{x},l\right) =\frac{1}{2}\sum _{i,j}{{\left( F^l_{ij}-P^l_{ij}\right) }^2}, \end{aligned}$$
(9)

and the derivative of this loss with respect to the activations in layer l is computed using:

$$\begin{aligned} \frac{\partial L}{\partial F^l_{ij}}=\left\{ \begin{array}{ll} F^l_{ij}-P^l_{ij} &{} \text {if}\quad F^l_{ij}>0 \\ 0 &{} \text {if}\quad F^l_{ij}<0 \end{array}\right. \end{aligned}$$
(10)

3.2.1 Implementation of deep dream algorithm

As outlined in Cox (2019), the deep dream algorithm (DDA) can be executed on any convnet pretrained on ImageNet, such as VGG16, VGG19, Xception, or ResNet50. While DDA works with any of them, the chosen convnet naturally affects the visualisations because different convnet architectures learn different features. Moreover, Inception is known to produce good-looking deep dreams out of the box (Cox 2019). Motivated by this, we utilise the Inception V3 model, whose layout is presented in Fig. 4, for the DDA unit of our proposed MBCS scheme.

Fig. 4

Layout demonstrating the architecture and layers of Inception V3 deep learning model (Cox 2019)

Subsequently, the loss value is maximised during the gradient-ascent process. Whereas filter visualisation targets the value of a single filter in a specific layer, here the activations of filters in several layers are maximised simultaneously. Specifically, the objective is to maximise a weighted sum of the \(\varvec{L}\varvec{2}\) norms of the activations in a set of high-level layers. Since the exact set of layers chosen (as well as their contributions to the final loss) has a major influence on the produced visuals, these parameters must be easily configurable. Moreover, it has been established that lower layers result in geometric patterns, whereas higher layers produce visuals in which some ImageNet classes can be recognised (Cox 2019). Consequently, the implementation starts with a somewhat arbitrary configuration involving four layers and can later be tuned across many other configurations. A hedged sketch of this layer-weighted loss follows.
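The sketch below follows the standard Keras deep dream recipe with the pretrained Inception V3 model; the four layer names and their weights are illustrative assumptions, not the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras.applications import inception_v3

base = inception_v3.InceptionV3(weights="imagenet", include_top=False)

# Illustrative four-layer configuration: names and weights are assumptions.
layer_weights = {"mixed4": 1.0, "mixed5": 1.5, "mixed6": 2.0, "mixed7": 2.5}
features = tf.keras.Model(
    base.input, [base.get_layer(n).output for n in layer_weights])

def dream_loss(image):
    """Weighted sum of the mean squared activations of the chosen layers."""
    loss = tf.zeros(())
    for w, act in zip(layer_weights.values(), features(image)):
        loss += w * tf.reduce_mean(tf.square(act))
    return loss

@tf.function
def gradient_ascent_step(image, step_size):
    """One gradient-ascent step: nudge the image to increase dream_loss."""
    with tf.GradientTape() as tape:
        tape.watch(image)
        loss = dream_loss(image)
    grads = tape.gradient(loss, image)
    grads /= tf.maximum(tf.reduce_mean(tf.abs(grads)), 1e-8)  # normalise
    return image + step_size * grads, loss
```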

Furthermore, the maximisation process is performed on the loss gradients of the convnet layers, where three main parameters control the gradient ascent process: the maximum loss \(L_{max}\), the gradient step \(S\), and the number of iterations \(I\). In this regard, the gradient ascent \(X\) for the loss gradients of layer \(l\) at iteration \(i\) can be computed using (11).

$$\begin{aligned} X=\sum _{i,l}{S\times L_{i,l}}\quad \text {if}\quad X<L_{max} \end{aligned}$$
(11)

Finally, a list of scales (i.e., octaves) generated by dream reinjection at each layer defines the successive sizes at which the image is processed. Moreover, the reinjection can be used to upscale the input image at each octave and increase its cancellability. Each successive scale is larger than the previous one by a factor of 1.4 (i.e., a 40% increase in the image dimensions), so the reinjection process starts with a small image that is successively upscaled, as demonstrated in Fig. 5.

Fig. 5

Illustration of deep dream image generation (Cox 2019)

As illustrated in the figure, at each step of the deep dream generation, from the smallest image to the largest, gradient ascent is performed to maximise the previously defined loss function. Additionally, since the resulting image is upscaled by 40% at each step (which makes it increasingly blurry or pixelated), there is a need to reinject some of the lost detail back into the image to avoid losing a lot of image detail. Therefore, given a small image \(I_s\) and a larger image \(I_L\), the difference between the original image resized to size \(I_L\) and the original image first resized to size \(I_s\) and then upscaled back to \(I_L\) quantifies the image detail lost when going from \(I_s\) to \(I_L\). A sketch of this octave loop follows.
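Continuing the sketch above (and again following the standard recipe rather than the authors' exact code), the octave loop below upscales the image by a factor of 1.4 per octave, runs gradient ascent at each scale subject to the stopping loss \(L_{max}\) of Eq. (11), and reinjects the detail lost between scales.

```python
import tensorflow as tf  # reuses gradient_ascent_step from the sketch above

def run_deep_dream(original, num_octaves=3, octave_scale=1.4,
                   iterations=20, step_size=0.01, max_loss=15.0):
    """Octave loop with detail reinjection; `original` is a float tensor
    of shape (1, H, W, 3). All parameter values are illustrative."""
    base_shape = tf.cast(tf.shape(original)[1:3], tf.float32)
    shapes = [tf.cast(base_shape / octave_scale ** i, tf.int32)
              for i in reversed(range(num_octaves))]  # small -> large
    img = tf.image.resize(original, shapes[0])
    shrunk = tf.image.resize(original, shapes[0])
    for shape in shapes:
        img = tf.image.resize(img, shape)
        for _ in range(iterations):
            img, loss = gradient_ascent_step(img, step_size)
            if loss > max_loss:  # stopping criterion L_max from Eq. (11)
                break
        # Detail lost between scales: the original at this size minus the
        # previously shrunk original upscaled to this size.
        lost_detail = (tf.image.resize(original, shape)
                       - tf.image.resize(shrunk, shape))
        img += lost_detail  # reinjection
        shrunk = tf.image.resize(original, shape)
    return img
```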

4 Simulation results

Guided by Sedik et al. (2020a), the validation and performance evaluation of our proposed MBCS scheme are undertaken using three biometric inputs, namely fingerprint, finger vein, and iris. The dataset comprises nine images from each modality, i.e., fingerprint, finger vein, and iris, as presented in Fig. 6a–c. Finger veins are collected using an infrared (IR) sensor, while the fingerprints are collected using a special device equipped with a light-emitting diode (LED) to generate the fingerprint image (Peng et al. 2014). Finally, the iris images are obtained from the CASIA iris dataset described in Wang et al. (2013), where further details regarding its collection are presented.

As presented in the preceding sections of this study, the objective of our MBCS scheme is to generate a secure cancellable template to replace the hitherto separate biometric input images. Therefore, applying the outlined MBCS scheme to the dataset in Fig. 6a–c produces the cancellable templates in Fig. 6d for each block of biometric inputs.

Fig. 6

Input biometric images and generated cancellable template

Our proposed MBCS scheme is implemented in Python on a workstation equipped with an Intel® Core \({{}^{TM}}\) i7 processor and an NVIDIA GPU with 4 GB of memory, and each analysis is undertaken using standard metrics as reported in subsequent subsections. In this section, we report the outcomes of our extensive evaluation covering visual and statistical performance indicators, with the former (i.e., visual evaluation) further divided into histogram and correlation analyses, and the latter (i.e., statistical evaluation) further divided into quantitative, qualitative, and complexity analyses. These metrics and their outcomes are presented in the sequel.

4.1 Visual evaluation

The visual evaluation includes the histogram, genuine and imposter correlation scores, and the receiver operating characteristic (ROC), which are aimed at establishing the visual performance of the proposed scheme.

4.1.1 Histogram analysis

Histograms provide an insightful visualisation of the pixel-wise intensity distribution in an image (Abd-El-Atty et al. 2021). A good cancellable scheme should exhibit near-identical distributions across the generated templates. Figure 7a–c present the respective histograms of the fingerprint, finger vein, and iris biometrics (i.e., those presented earlier in Fig. 6a–c), whereas Fig. 7d presents the histograms of the generated cancellable templates, i.e., those in Fig. 6d. As observed in Abd-El-Atty et al. (2021), the histogram plots indicate the permeation of noise through the scheme, which confirms its capacity to withstand statistical attacks. A minimal plotting sketch follows Fig. 7.

Fig. 7

Histograms for the three input biometrics and their generated templates
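For reference, a histogram such as those in Fig. 7 can be produced directly from the pixel intensities; the sketch below is illustrative, and the file name is a placeholder rather than part of the authors' pipeline.

```python
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Placeholder path: any generated cancellable template image
template = np.asarray(Image.open("cancellable_template.png").convert("L"))
plt.hist(template.ravel(), bins=256, range=(0, 255))
plt.xlabel("Pixel intensity")
plt.ylabel("Frequency")
plt.title("Histogram of the generated cancellable template")
plt.show()
```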

4.1.2 Correlation analysis

The correlation coefficient is another metric useful for visualising the decorrelation expected in cancellable schemes. The correlation coefficient of neighbouring pixels, \({C}_{A,B}\), is used in correlation analysis to assess the relationship between an input biometric image and its generated cancellable template (Kaur and Khanna 2020; Akdogan et al. 2018; Wang et al. 2017). In this regard, neighbouring pixels of the pristine input biometric should be highly correlated, with values close to unity, whilst those in the cancellable template should have values closer to zero (Yan et al. 2013).

We compute \({C}_{A,B}\) using (12) and report plots of \({C}_{A,B}\) for auto and cross correlation in Fig. 8a, b, respectively.

$$\begin{aligned} R_{xy}=\frac{\frac{1}{N}\sum ^N_{i=1}{\left( x_i-\overline{x}\right) \left( y_i-\overline{y}\right) }}{{\sigma }_x{\sigma }_y}, \end{aligned}$$
(12)

where N is the total number of pixels, and x and y are the encrypted template stored in the database and the new subject's encrypted template, respectively.
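Eq. (12) is the standard Pearson correlation; a direct NumPy sketch over flattened templates is:

```python
import numpy as np

def correlation_score(x, y):
    """Pearson correlation of Eq. (12) between two templates,
    flattened to 1-D pixel vectors."""
    x = np.ravel(x).astype(float)
    y = np.ravel(y).astype(float)
    return float(np.mean((x - x.mean()) * (y - y.mean()))
                 / (x.std() * y.std()))
```

Genuine scores correlate the stored encrypted template with a fresh template from the same subject, while impostor scores correlate it with templates from other subjects.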

Figure 8c presents the correlation scores for the original and generated images, while the ROC curve is plotted in Fig. 8d. Based on the area under the curve (AUC) metric, this curve shows that an accuracy of 99% was achieved, which suggests potential applications of the proposed MBCS in safeguarding the confidentiality of cybersecurity systems.

Fig. 8

Plots for visual evaluation of proposed MBCS scheme

4.2 Statistical evaluation

Outcomes of our statistical evaluation, comprising qualitative and quantitative analyses that use different metrics to assess the quality of the generated cancellable template and its signal-to-noise characteristics, are presented and discussed in this section.

4.2.1 Quantitative analysis

The performance of our proposed MBCS scheme is evaluated in terms of three quantitative metrics, namely the number of pixel change rate (NPCR), unified average changing intensity (UACI), and peak signal-to-noise ratio (PSNR). Implementation sketches follow the equations below.

Given two images \({I}_{1}\) and \({I}_{2}\), the NPCR, UACI and PSNR are computed using (13), (15) and (16), respectively, where M and N represent the width and height of the images.

$$\begin{aligned} NPCR\left( \%\right) =\ \frac{1}{M\times N\times 3}\sum ^M_{i=1}{\sum ^N_{j=1}{\sum ^3_{k=1}{S\left( i,j,k\right) \times 100,}}} \end{aligned}$$
(13)

where

$$\begin{aligned} S\left( i,j,k\right) =\left\{ \begin{array}{ll} 1, &{} I_1(i,j,k)\ne I_2(i,j,k) \\ 0,&{} \text {elsewhere} \end{array} \right. . \end{aligned}$$
(14)
$$\begin{aligned} UACI\left( \%\right) =\frac{1}{M\times N\times 3}\sum ^M_{i=1}{\sum ^N_{j=1}{\sum ^3_{k=1}{\frac{\left| I_1\left( i,j,k\right) -I_2\left( i,j,k\right) \right| }{255}\times 100.}}} \end{aligned}$$
(15)
$$\begin{aligned} PSNR=20{{\rm log}}_{10}\left[ \frac{I_{MAX}}{\sqrt{MSE}}\right] , \end{aligned}$$
(16)

where \({I}_{MAX}\) is the maximum possible pixel value and MSE is the mean square error defined as:

$$\begin{aligned} MSE=\frac{1}{MN}\sum ^{M-1}_{i=0}{\sum ^{N-1}_{j=0}{{\left[ I_2\left( i,j\right) -I_1\left( i,j\right) \right] }^2}}. \end{aligned}$$
(17)
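Direct NumPy implementations of Eqs. (13)–(17) can be sketched as follows; the boolean mean in `npcr` implements the triple sum and the \(M\times N\times 3\) normalisation, with S taken as 1 for differing pixels per the corrected Eq. (14).

```python
import numpy as np

def npcr(i1, i2):
    """NPCR (Eqs. 13-14): percentage of pixels that differ."""
    return 100.0 * np.mean(i1 != i2)

def uaci(i1, i2):
    """UACI (Eq. 15): mean absolute intensity change, as a percentage."""
    return 100.0 * np.mean(np.abs(i1.astype(float) - i2.astype(float)) / 255.0)

def psnr(i1, i2, i_max=255.0):
    """PSNR (Eq. 16) using the MSE of Eq. (17)."""
    mse = np.mean((i1.astype(float) - i2.astype(float)) ** 2)
    return 20.0 * np.log10(i_max / np.sqrt(mse))
```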

4.2.2 Qualitative analysis

Qualitative evaluation involves the use of quality metrics to analyse the performance of the scheme. Employing the spectral distortion (SD) and universal image quality index (UIQ) metrics, we report the performance of our proposed MBCS scheme in the remainder of this subsection.

(a) Spectral distortion

Spectral distortion (SD) provides a qualitative assessment of the similarity between the spectral information in two images (Benrhouma et al. 2015). Lower SD values indicate greater congruence between the assessed images (Abd-El-Atty et al. 2021). Mathematically, SD is defined as:

$$\begin{aligned} SD=\frac{\sum _{i=1}^{M}\sum _{j=1}^{N}\left| x(i,j)-y(i,j)\right| }{M\times N}, \end{aligned}$$
(18)

where \(M\times N\) is the total number of pixels in the image, and x(ij) and y(ij) are the original image and the encrypted image, respectively.

(b) Universal image quality index

The universal image quality index (UQI) is another metric that assesses the structural concordance between two images (El-Latif et al. 2020). Mathematically, UQI, whose values vary in the range \(-1\) to 1, is computed using (19), with values closer to 1 indicating greater congruity between the images (El-Latif et al. 2020).

$$\begin{aligned} UQI(i,j)=\frac{Cov_{ij} }{\sigma _{i} \sigma _{j} } \cdot \frac{2\mu _{i} \mu _{j} }{\mu _{i}^{2} +\mu _{j}^{2} } \cdot \frac{2\sigma _{i} \sigma _{j} }{\sigma _{i}^{2} +\sigma _{j}^{2} }, \end{aligned}$$
(19)

where \(\mu _i {\ {\rm and}}\ \mu _j,\ {\sigma }^2_i\ {\rm and}\ {\sigma }^2_j,\) and \(Cov_{ij}\) are the means, variances, and covariance of i and j, respectively. Sketches of both metrics follow.
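Matching sketches for Eqs. (18) and (19), again computed globally over two equally sized images:

```python
import numpy as np

def spectral_distortion(x, y):
    """SD (Eq. 18): mean absolute difference between the two images."""
    return float(np.mean(np.abs(x.astype(float) - y.astype(float))))

def uqi(i, j):
    """UQI (Eq. 19): correlation x luminance x contrast terms."""
    i = i.astype(float).ravel()
    j = j.astype(float).ravel()
    mu_i, mu_j = i.mean(), j.mean()
    var_i, var_j = i.var(), j.var()
    cov_ij = np.mean((i - mu_i) * (j - mu_j))
    return ((cov_ij / np.sqrt(var_i * var_j))
            * (2 * mu_i * mu_j / (mu_i ** 2 + mu_j ** 2))
            * (2 * np.sqrt(var_i * var_j) / (var_i + var_j)))
```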

To conclude the quantitative and qualitative evaluation, we present in Table 1 the outcomes of the defined metrics for our proposed MBCS scheme. From the table, we deduce average performance of 99.158% (NPCR), 24.52 dB (PSNR), 0.0783 (SSIM), 0.9093 (UIQ), 59.5817 (SD), and 23.6268 (UACI), which indicates the proposed scheme's utility as an efficient platform for biometric authentication with potential applications in cloud services and IoT frameworks.

Table 1 Quantitative and qualitative evaluation of proposed MBCS scheme

5 Discussion of results and complexity evaluation

The resources and operations required to implement an algorithm are assessed in terms of its complexity. In this section, we evaluate the performance of our proposed MBCS scheme in terms of its execution time and its asymptotic cost, i.e., big-O analysis.

The time required, in seconds, to implement our MBCS scheme is computed in terms of the steps required for its execution for every user (Gudeme et al. 2020; Peng et al. 2021; Kaur and Khanna 2020), where each biometric is an M by N image, as enumerated below.

Steps performed for each user:

  1. O(1) operations to register the current biometrics of the user.

  2. O(n\({\times }\)(M\({\times }\)N)) operations to perform feature extraction on an M\({\times }\)N image, where n is an integer.

  3. O(2\({\times }\)n\({\times }\)(M\({\times }\)N)) operations to fuse the features.

  4. O(M\({\times }\)N) operations to reconstruct the fused image.

  5. O(5\({\times }\)n\({\times }\)(M\({\times }\)N)) operations to perform the deep dream (which has five steps).

  6. O(n\({\times }\)(M\({\times }\)N)) operations to perform the authentication process leading to acceptance or rejection of the user.

Furthermore, the actual time required to execute the proposed scheme is tabulated in Table 2, from which we infer a moderate execution time, especially since generation of the cancellable template is an off-line process (Peng et al. 2021). Noting that the complexity analysis is performed in terms of the central processing unit (CPU) operations required to execute the proposed MBCS scheme, the running time for each step is estimated accordingly.

Table 2 Execution time (in seconds)

As reported in Table 3, the authentication process of our MBCS scheme requires O(max(n, n\({\times }\)M\({\times }\)N)) steps (or operations), which, relative to a recent study in Peng et al. (2021), is considered moderate.

Table 3 Comparison of computational complexity

Additionally, Table 4 presents a comparison between the proposed method and others reported in the literature in terms of equal error rate (EER), false acceptance rate (FAR), false rejection rate (FRR), and area under the ROC curve (AROC). As deduced therefrom, with values of 0.0032 (EER), 0.0006 (FAR), 0.001 (FRR) and 0.99 (AROC), our proposed scheme outperforms similar ones reported in Soliman et al. (2018b), Soliman et al. (2018a), Algarni et al. (2020), Tarif et al. (2017), Sree and Radha (2016), Dang et al. (2016), Kumar et al. (2011) and Refregier and Javidi (1995). A sketch of how EER can be estimated follows Table 4.

Table 4 A comparison between the proposed method and works in the literature
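For context, the EER is the operating point at which FAR and FRR coincide; the following is a hedged sketch of how it can be estimated from genuine and impostor matching scores (the score arrays are placeholders, not the paper's data).

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores, num_thresholds=1000):
    """Sweep a decision threshold and return the point where the FAR
    (impostors accepted) and FRR (genuines rejected) curves cross."""
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    lo = min(genuine.min(), impostor.min())
    hi = max(genuine.max(), impostor.max())
    thresholds = np.linspace(lo, hi, num_thresholds)
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(genuine < t).mean() for t in thresholds])
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0
```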

6 Concluding remarks

A multi-biometric cancellable scheme (MBCS) has been proposed to generate secure and efficient cancellable templates for the fingerprint, finger vein, and iris biometric modalities. In doing so, we exploited the potency of deep learning models to build a multi-exposure deep fusion module that generates a fused biometric template, which is subsequently transformed into the final cancellable template by the deep dream module. Extensive performance evaluation comprising visual and statistical analyses produced average values of 99.158%, 24.523 dB, 0.079, 0.909, 59.582 and 23.627 for NPCR, PSNR, SSIM, UIQ, SD and UACI, respectively, which validate the effectiveness of our proposed scheme. In ongoing efforts, we are expanding the study by integrating new recognition and encryption protocols to be validated on larger and more robust datasets. Furthermore, we are exploring deployment of this refined version of the scheme for real-time biometric applications in airports, banks, surveillance, etc.