Among the various traits used for human identification, the iris pattern has gained increasing attention for its accuracy, reliability, and noninvasive characteristics. In addition, iris patterns possess a high degree of randomness and uniqueness, even between identical twins, and the iris remains stable throughout an adult’s life [1, 2].

The pioneering work on iris recognition, which is the basis of many functioning commercial systems, was conducted by Daugman [1]. The performance of iris recognition systems is impressive, as demonstrated by Daugman [3], who reported false acceptance rates of only 10^-6 in a study of 200 billion cross-comparisons. The potential of iris biometrics has also been affirmed by tests involving 1.2 trillion comparisons carried out by the National Institute of Standards and Technology (NIST), which confirmed that iris biometrics has the best balance between accuracy, template size, and speed compared to other biometric traits [4].

Iris recognition technology is now widely deployed in various large-scale applications such as the border crossing system in the United Arab Emirates, the Mexico national ID program, and the Unique Identification Authority of India (UIDAI) project [5]. As a case in point, more than one billion residents have been enrolled in the UIDAI project, where about 10^15 all-to-all check operations are carried out daily for identity de-duplication using iris biometrics as the main modality [5, 6].

Nearly all currently deployed iris recognition systems operate predominantly in the near-infrared (NIR) spectrum, capturing images at 800–900 nm wavelength. This is because there are fewer reflections coming from the cornea and dark-pigmented irides look clearer under NIR light. In addition, external factors such as shadows and diffuse reflections are reduced under NIR light [7, 8].

The color of the iris is governed by two molecules: eumelanin (black/brown) and pheomelanin (red/yellow). Dark-pigmented irides have a high concentration of eumelanin. Because eumelanin strongly absorbs visible light (VL), the stromal features of such irides are hidden in VL and are revealed only under NIR, where the texture rather than the pigmentation is captured. On the other hand, pheomelanin is dominant in light-pigmented irides. Capturing such irides under NIR light eliminates most of the rich pheomelanin information because the chromophore of the human iris is only visible under VL [8, 9]. Consequently, capturing iris images under different light conditions reveals different textural information.

Research in VL iris recognition has been gaining more attention in recent years due to the interest in iris recognition at a distance [10, 11]. In addition, competitions such as the Noisy Iris Challenge Evaluation (NICE) [12] and the Mobile Iris Challenge Evaluation [13] focus on the processing of VL iris images. This attention to visible wavelength-based iris recognition is boosted by two factors: (1) visible range cameras can acquire images from a long distance and are cheaper than NIR cameras and (2) surveillance systems work in the visible range by capturing images of the body, face, and iris which could be used later for authentication [14].

Since both VL and NIR iris recognition systems are now widely deployed, studying the performance difference of iris recognition systems exploiting NIR and VL images is important because it gives insight into the essential features in each wavelength which in turn helps to develop a robust automatic identification system. On the other hand, cross-spectral iris recognition is essential in security applications when matching images from different lighting conditions is desired.

In this paper, we therefore propose a method for cross-spectral iris image matching. To the best of our knowledge, this attempt is amongst the first in the literature to investigate the problem of VL to NIR iris recognition (and vice versa) dealing with unregistered iris images belonging to the same subject. In addition, we investigate the difference in iris recognition performance with NIR and VL imaging. In particular, we investigate iris performance in each channel (red, green, blue, and NIR) and the feasibility of cross-channel authentication (i.e., NIR vs. VL). Furthermore, we show that multi-channel fusion enhances the iris recognition performance.

In summary, the main contributions of the paper are as follows:

  • A novel framework for cross-spectral iris recognition capable of matching unregistered iris images captured under different lighting conditions

  • Filling the gap in multi-spectral iris recognition by exploring the performance difference in iris biometrics under NIR and VL imaging

  • Boosting iris recognition performance with multi-channel fusion

The rest of this paper is organized as follows: related works are given in Section 2. The proposed framework for cross-spectral iris matching is explained in Section 3. Section 4 presents the experimental results and the discussion while Section 5 concludes this paper.

Related work

Iris recognition technology has witnessed rapid development over the last decade, driven by its wide range of applications. At the outset, Daugman [1] proposed the first working iris recognition system, which was later adopted by several commercial companies such as IBM, Iridian, and Oki. In this work, the integro-differential operator is applied for iris segmentation and 2D Gabor filters are utilized for feature extraction, while Hamming distance scores serve as the comparator. The second algorithm is due to Wildes [15], who applied the Hough transform for localizing the iris and the Laplacian pyramid to encode the iris pattern. However, this algorithm has a high computational demand.

Another interesting approach was proposed by Sun and Tan [2] exploiting ordinal measures for iris feature representation. Unlike the traditional approaches that use quantitative values, the ordinal measure focuses on qualitative values to represent features. The multi-lobe differential filters have been applied for iris feature extraction to generate a 128-byte ordinal code for each iris image. Then, the error rates have been calculated based on the measured Hamming distances between two ordinal templates of the same class.

All the previous work assessed iris recognition performance under NIR. The demand for more accurate and robust biometric systems has increased with the expanded deployment of large-scale national identity programs. Hence, researchers have investigated iris recognition performance under different wavelengths or the possibility of fusing NIR and VL iris images to enhance recognition performance. Nevertheless, inspecting the correlation of NIR and VL iris images has been understudied, and the problem of cross-spectral iris recognition is still unsolved.

Boyce et al. [16] explored iris recognition performance under different wavelengths on a small multi-spectral iris database consisting of 120 images from 24 subjects. According to the authors, higher accuracy was achieved for the red channel compared to the green and blue channels. The study also suggested that cross-channel matching is feasible. However, the iris images were fully registered and captured under ideal conditions. In [17], the authors employed a feature fusion approach to enhance the recognition performance of iris images captured under both VL and NIR. The wavelet transform and discrete cosine transform were used for feature extraction, while the features were augmented with the ordered weighted average method to enhance the performance.

In Ngo et al. [18], a multi-spectral iris recognition system was implemented which employed eight wavelengths ranging from 405 to 1550 nm. The results on a database of 392 iris images showed that the best performance was achieved with a wavelength of 800 nm. Cross-spectral experiment results demonstrated that the performance degraded with larger wavelength difference. Ross et al. [19] explored the performance of iris recognition in wavelengths beyond 900 nm. In their experiments, they investigated the possibility of observing different iris structures under different wavelengths and the potential of performing multi-spectral fusion for enhancing iris recognition performance. Similarly, Ives et al. [20] examined the performance of iris recognition under a wide range of wavelengths between 405 and 1070 nm. The study suggests that illumination wavelength has a significant effect on iris recognition performance. Hosseini et al. [8] proposed a feature extraction method for iris images taken under VL using a shape analysis method. Potential improvement in recognition performance was reported when combining features from both NIR and VL iris images taken from the same subject.

Recently, Alonso-Fernandez et al. [21] conducted comparisons on the iris and periocular modalities and their fusion under NIR and VL imaging. However, the images were not taken from the same subjects as the experiments were carried out on different databases (three databases contained close-up NIR images, and two others contained VL images). Unfortunately, this may not give an accurate indication about the iris performance as the images do not belong to the same subject. In [22], the authors suggested enhancing iris recognition performance in non-frontal images through multi-spectral fusion of iris pattern and scleral texture. Since the scleral texture is better seen in VL and the iris pattern is observed in NIR, multi-spectral fusion could improve the overall performance.

In terms of cross-spectral iris matching, the authors in [14] proposed an adaptive method to predict the NIR channel image from VL iris images using neural networks. Similarly, Burge and Monaco [23, 24] proposed a model to predict NIR iris images using features derived from the color and structure of the visible light iris images. Although the aforementioned approaches ([14, 23, 24]) achieved good results, their methods require the iris images to be fully registered. Unfortunately, this is not applicable in reality because it is very difficult to capture registered iris images from the same subject simultaneously.

In our previous work [25], we explored the differences in iris recognition performance across the VL and NIR spectra. In addition, we investigated the possibility of cross-channel matching between VL and NIR imaging. Cross-spectral matching turned out to be challenging, with an equal error rate (EER) larger than 27 %. Lately, Ramaiah and Kumar [26] emphasized the need for cross-spectral iris recognition, introduced a database of registered iris images, and conducted experiments on iris recognition performance under both NIR and VL. This database is not available yet. Their cross-spectral matching results showed an EER larger than 34 %, which confirms the challenge of cross-spectral matching. The authors concluded that “it is reasonable to argue that cross-spectral iris matching seriously degrades the iris matching accuracy”.

Proposed cross-spectral iris matching framework

Matching across iris images captured in VL and NIR is a challenging task because there are considerable differences among such images pertaining to different wavelength bands. Although iris images from different spectra look different, the underlying structure is the same as they belong to the same person. Therefore, we exploited various photometric normalization techniques and descriptors to alleviate these differences. In this context, we employed the Binarized Statistical Image Features (BSIF) descriptor [27] and DoG filtering, in addition to a collection of the photometric normalization techniques available from the INface Toolbox [28, 29]: adaptive single scale retinex, non-local means, wavelet based normalization, homomorphic filtering, multi-scale quotient, Tan and Triggs normalization, and multi-scale Weberface (MSW).

Among these illumination normalization techniques and descriptors, the DoG, BSIF, and MSW were observed to reduce the iris cross-spectral variations. These models are described in the next subsections.

Difference of Gaussian (DoG)

The DoG is a feature enhancement technique which relies on the difference of Gaussians filter to generate a normalized image by acting as a bandpass filter. This is achieved by subtracting two blurred versions of the original image from each other [30]. The blurred versions are obtained by convolving the original image I(x,y) with two Gaussian kernels G(x,y|σ) having different standard deviations, as shown in Eq. (1):

$$ D\left(x,y|\sigma_{0},\sigma_{1}\right)=\left[G\left(x,y|\sigma_{0}\right)-G\left(x,y|\sigma_{1}\right)\right]*I(x,y), $$

where * is the convolution operator and G(x,y|σ) is the Gaussian kernel function with standard deviation σ, which is defined as

$$ G(x,y|\sigma)=\frac{1}{\sqrt{2\pi\sigma^{2}}}e^{-\left(x^{2}+y^{2}\right)/2\sigma^{2}} $$

Here, σ_0 < σ_1 to construct a bandpass filter. The values of σ_0 and σ_1 are empirically set to 1 and 2, respectively. The DoG filter has a low computational complexity and is able to alleviate illumination variation and aliasing. As there are variations in the frequency content between VL and NIR images, the DoG filter is effective because it suppresses these variations and alleviates noise and aliasing, which paves the way for better cross-spectral matching [30].
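To make Eq. (1) concrete, the following is a minimal sketch of DoG filtering in plain Python, using the stated values σ_0 = 1 and σ_1 = 2. It is illustrative only and is not the implementation used in the experiments. The function names, the truncation of each kernel at three standard deviations, the replication of border pixels, and the normalization of each kernel to unit sum are all assumptions made for this example.

```python
import math

def gaussian_kernel_1d(sigma):
    """Sampled 1D Gaussian, truncated at 3*sigma and normalized to sum to 1
    (the truncation and unit-sum normalization are assumptions of this sketch)."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Convolve every row with a symmetric kernel, replicating border pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2D Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel_1d(sigma)
    blurred = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(blurred), kernel))

def difference_of_gaussians(image, sigma0=1.0, sigma1=2.0):
    """Band-pass an image by subtracting a wide blur from a narrow blur."""
    narrow = gaussian_blur(image, sigma0)
    wide = gaussian_blur(image, sigma1)
    return [[a - b for a, b in zip(r0, r1)] for r0, r1 in zip(narrow, wide)]
```

Because both kernels sum to one in this sketch, a uniformly lit region maps to zero, which is the sense in which the filter suppresses slowly varying illumination.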

Binarized statistical image features (BSIF)

The BSIF have been employed due to their ability to tolerate image degradation such as rotation and blurring [27]. Generally speaking, feature extraction methods filter the images with a set of linear filters and then quantize the responses of these filters. In this context, BSIF filters are learned by exploiting the statistics of natural images rather than being manually designed. This has led to promising results for classifying the texture in different biometric traits [31, 32].

For an image patch X of size l×l pixels and a linear filter W_i of the same size, the filter response s_i is obtained by

$$ s_{i}=\sum_{u,v} W_{i}(u,v)X(u,v)=w_{i}^{T}x. $$

Here, the vectors w_i and x contain the pixels of W_i and X, respectively. The binarized feature b_i is obtained from the response values by setting b_i = 1 if s_i > 0 and b_i = 0 otherwise. The filters are learned from natural images using independent component analysis by maximizing the statistical independence of s_i. Two parameters control the BSIF descriptor: the number of filters (the length n of the bit string) and the filter size l. In our approach, we used the default set of filters, which were learned from 5000 patches. Empirical results demonstrated that a filter size of 7×7 with 8 bits gives the best results.
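As an illustration of the binarization rule, the sketch below computes a BSIF-style code for every position where the filters fit entirely inside the image, packing bit b_i into position i of an integer. The two 3×3 filters are hypothetical hand-made stand-ins chosen so that the result can be checked by hand; they are not the ICA-learned filters used in this work, and the function name is likewise an assumption.

```python
def bsif_code_image(image, filters):
    """Return an integer code per pixel position where the filters fit."""
    size = len(filters[0])                  # filter side length l
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(rows - size + 1):
        code_row = []
        for c in range(cols - size + 1):
            code = 0
            for i, filt in enumerate(filters):
                # s_i = sum over (u, v) of W_i(u, v) * X(u, v)
                s = sum(filt[u][v] * image[r + u][c + v]
                        for u in range(size) for v in range(size))
                if s > 0:                   # b_i = 1 if s_i > 0, else 0
                    code |= 1 << i
            code_row.append(code)
        codes.append(code_row)
    return codes

# Hypothetical zero-mean filters standing in for the learned ICA filters.
horizontal_edge = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]
vertical_edge = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

# Toy image whose lower half is brighter than its upper half.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [9, 9, 9, 9],
    [9, 9, 9, 9],
]
codes = bsif_code_image(image, [horizontal_edge, vertical_edge])
```

With n filters, each code lies in the range 0 to 2^n − 1, so the 8-bit configuration reported above would yield codes between 0 and 255.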

Multi-scale Weberfaces (MSW)

Inspired by Weber’s law which states that the ratio of the increment threshold to the background intensity is a constant [33], the authors in [34] showed that the ratio between local intensity of a pixel and its surrounding variations is constant. Hence, in [34], the face image is represented by its reflectance and the illumination factor is normalized and removed using the Weberface model. Following this, we applied the Weberface model to the iris images to remove the illumination variations that result from the differences between the VL and NIR imaging, thus making the iris images illumination invariant.

Following the works of [28, 29], the Weberface algorithm has been applied with three scales using the following values: σ= [1 0.75 0.5], Neighbor=[9 25 49] and alfa= [2 0.2 0.02]. The steps of the Weberface algorithm are listed in Algorithm 1.
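For orientation, a single-scale Weberface step can be sketched as follows. The sketch assumes the commonly given formulation in which each pixel is replaced by the arctangent of a weighted sum of relative intensity differences to its neighbors. The function name, the border replication, and the small constant eps are assumptions, the Gaussian pre-smoothing controlled by σ is omitted, and the combination of the three scales is not shown, so this is not a substitute for Algorithm 1 or for the INface implementation.

```python
import math

def weberface_single_scale(image, alpha=2.0, radius=1, eps=1e-6):
    """Single-scale Weberface sketch (no pre-smoothing, border replicated).

    radius=1 corresponds to a 3x3 neighborhood (9 pixels); radius=2 and
    radius=3 would give the 25- and 49-pixel neighborhoods.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            center = image[r][c]
            total = 0.0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    # relative difference between the pixel and its neighbor
                    total += (center - image[rr][cc]) / (center + eps)
            out[r][c] = math.atan(alpha * total)
    return out
```

Since every term is a ratio of intensities, multiplying the whole image by a constant leaves the output essentially unchanged, which illustrates the illumination invariance that motivates applying the model to VL and NIR iris images.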

Proposed scheme

The variations in iris appearance due to different sensors, spectral bands, and illumination conditions are believed to significantly degrade iris recognition performance. To overcome these variations, a robust method must be carefully designed. Extensive experiments demonstrated that using any one of the aforementioned methods alone is not sufficient to achieve acceptable iris recognition performance, as the EER remained above 17 %. In addition, using the phase information of the Gabor filter rather than its amplitude is known to provide robustness to variations in illumination, imaging contrast, and camera gain [7]. Hence, we propose to integrate the 1D log-Gabor filter [35] with DoG, BSIF, and MSW to produce G-DoG, G-BSIF, and G-MSW (where G stands for Gabor), and to combine them with decision level fusion to achieve robust cross-spectral iris recognition. The block diagram of the proposed framework is depicted in Fig. 1.

Fig. 1 Block diagram of the proposed cross-spectral matching framework

Unlike previous works [14, 23, 24], which require fully registered iris images and learn models that lack the ability to generalize, our framework does not require any training and works on unregistered iris images. This combination, along with its decision level fusion, achieved encouraging results as illustrated in the next section.

Results and discussion

In this work, our aim is to ascertain true cross-spectral iris matching using images taken from the same subject under the VL and NIR spectra. In addition, we investigate the iris biometric performance under different imaging conditions and the fusion of VL+NIR images to boost the recognition performance. The recognition performance is measured with the EER and the receiver operating characteristic (ROC) curves.
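Since the EER is the main figure of merit in what follows, a minimal sketch of how it can be estimated from two sets of distance scores is given here. The function name and the simple threshold sweep are assumptions for illustration; the toy scores are made up and are not results from this paper.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER for distance scores (lower means more similar).

    Sweeps every observed score as a threshold and returns the mean of FAR
    and FRR at the threshold where the two rates are closest, together with
    that threshold.
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best = None
    for t in thresholds:
        # genuine comparisons rejected because their distance exceeds t
        frr = sum(1 for g in genuine if g > t) / len(genuine)
        # impostor comparisons accepted because their distance is within t
        far = sum(1 for i in impostor if i <= t) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]

# Toy, made-up scores: one genuine and one impostor comparison overlap.
eer, threshold = equal_error_rate([0.1, 0.2, 0.3, 0.4], [0.35, 0.5, 0.6, 0.7])
```

Plotting the (FAR, 1 − FRR) pairs visited by the same sweep gives the ROC curve.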


The experiments are conducted on the UTIRIS database [8] from the University of Tehran. This database contains two sessions with 1540 images; the first session was captured under VL while the second session was captured under NIR. Each session has 770 images taken from the left and right eye of 79 subjects where each subject has an average of five iris images.

Pre-processing and feature extraction

Typically, an iris recognition system operates by extracting and comparing the pattern of the iris in the eye image. These operations involve five main steps, namely, image acquisition, iris segmentation, normalization, feature extraction, and matching [7].

The UTIRIS database includes two types of iris images, half of which are captured in the NIR spectrum while the other half are captured under the VL spectrum. The VL session contains images in the sRGB color space, which are then decomposed into the red, green, and blue channels. To segment the iris in the eye image, the circular Hough transform (CHT) is applied because the images used in our experiments were captured in a controlled environment and can therefore be segmented with circular approaches [36, 37].

We observed that the red channel gives the best segmentation results because the pupil region in this channel contains the smallest amount of reflection, as shown in Figs. 2 and 3. The images in the VL session were down-sampled by two in each dimension to obtain the same size as the images in the NIR session. The segmented iris images are normalized to a resolution of 60×450 using the rubber sheet method [7].
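The rubber sheet normalization to 60×450 can be sketched as follows. The sketch assumes that segmentation has produced two circles, each given as a hypothetical (x, y, radius) tuple, and it uses nearest-neighbor sampling for brevity; the function name and these choices are illustrative assumptions rather than the implementation used in the experiments.

```python
import math

def rubber_sheet(image, pupil, iris, radial_res=60, angular_res=450):
    """Unwrap the iris ring into a radial_res x angular_res rectangle.

    pupil and iris are (center_x, center_y, radius) circles; each output row
    corresponds to a normalized radius between the two boundaries.
    """
    rows, cols = len(image), len(image[0])
    (px, py, pr), (ix, iy, ir) = pupil, iris
    out = []
    for i in range(radial_res):
        rho = i / (radial_res - 1)          # 0 at the pupil, 1 at the limbus
        row = []
        for j in range(angular_res):
            theta = 2.0 * math.pi * j / angular_res
            # boundary points on the pupil and iris circles at this angle
            xp, yp = px + pr * math.cos(theta), py + pr * math.sin(theta)
            xi, yi = ix + ir * math.cos(theta), iy + ir * math.sin(theta)
            # linear interpolation between the two boundaries
            x = (1.0 - rho) * xp + rho * xi
            y = (1.0 - rho) * yp + rho * yi
            c = min(max(int(round(x)), 0), cols - 1)
            r = min(max(int(round(y)), 0), rows - 1)
            row.append(image[r][c])
        out.append(row)
    return out
```

Because the sampling grid is defined relative to the two boundaries, the output size is the same regardless of pupil dilation or the size of the iris in the image.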

Fig. 2 Green-yellow iris image decomposed into red, green, blue, and grayscale with the NIR counterpart

Fig. 3 Brown iris image decomposed into red, green, blue, and grayscale with the NIR counterpart

After feature extraction, the Hamming distance is used to find the similarity between two IrisCodes in order to decide if the vectors belong to the same person or not. Then, the ROC curves and the EER are used to judge the iris recognition performance for the images in each channel as illustrated in the next subsections.
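A minimal sketch of the comparison step is shown below: the fractional Hamming distance between two equal-length binary templates. The function name is an assumption, and refinements such as occlusion masks or circular shifts to compensate for eye rotation are deliberately left out because they are not described here.

```python
def hamming_distance(code_a, code_b):
    """Fraction of bit positions at which two binary templates disagree."""
    if len(code_a) != len(code_b):
        raise ValueError("templates must have the same length")
    disagreements = sum(1 for a, b in zip(code_a, code_b) if a != b)
    return disagreements / len(code_a)

# Two toy 4-bit templates that differ in two positions.
distance = hamming_distance([1, 0, 1, 1], [1, 1, 1, 0])
```

A distance near 0 indicates templates from the same iris, whereas statistically independent templates are expected to give distances near 0.5.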

NIR vs. VL performance

For feature extraction, the normalized iris image is convolved with the 1D log-Gabor filter to extract the features where the output of the filter is phase quantized to four levels to form the binary iris vector [35].
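The encoding step can be sketched for a single row of the normalized image as follows. The row is filtered in the frequency domain with a log-Gabor transfer function that is non-zero only on positive frequencies, so the response is complex, and each response sample is then quantized into one of four phase quadrants, that is, two bits. The parameter names and default values (wavelength, sigma_on_f) and the naive DFT are assumptions for illustration and are not the settings used in the experiments.

```python
import cmath
import math

def log_gabor_response(row, wavelength=18.0, sigma_on_f=0.5):
    """Filter one row with a 1D log-Gabor filter via a naive DFT."""
    n = len(row)
    f0 = 1.0 / wavelength                    # center frequency
    spectrum = [sum(row[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    filtered = []
    for k in range(n):
        if 0 < k <= n // 2:                  # positive frequencies only
            f = k / n
            gain = math.exp(-(math.log(f / f0) ** 2)
                            / (2.0 * math.log(sigma_on_f) ** 2))
        else:                                # DC and negative frequencies
            gain = 0.0
        filtered.append(spectrum[k] * gain)
    return [sum(filtered[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]

def quantize_phase(response):
    """Two bits per sample: the signs of the real and imaginary parts."""
    bits = []
    for z in response:
        bits.append(1 if z.real > 0 else 0)
        bits.append(1 if z.imag > 0 else 0)
    return bits
```

Applying both functions to every row of a 60×450 normalized image would give a template of 2 × 60 × 450 bits under these assumptions.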

We carried out experiments on each channel (i.e., NIR, red, green, and blue) and measured the performance using ROC and EER. Figure 4 and Table 1 illustrate the EER and ROC curves for each channel. It can be seen that the best performance is achieved under the red channel with EER = 2.92 %, followed by the green channel with EER = 3.11 % and the grayscale channel with EER = 3.26 %, while the blue channel achieved the worst results with EER = 6.33 %. It is also noticed that NIR images did not give the best performance for this database (EER = 3.45 %).

Fig. 4 The performance of the iris recognition under red, green, blue, and NIR spectra

Table 1 EER (%) of different channels comparison on the UTIRIS database

This result is consistent with the observation that light-pigmented irides reveal more information under VL than under NIR [8, 9]: the red channel images achieved better results than the NIR images because most of the iris images in the UTIRIS database are light pigmented. Figure 5 shows the distribution of iris colors in the UTIRIS database.

Fig. 5 The color distributions of the irides of the 79 subjects in the UTIRIS database

Light-eyed vs. dark-eyed

As mentioned before, capturing iris images under NIR light eliminates most of the rich melanin information because the chromophore of the human iris is only visible under VL [8, 9]. Therefore, light-pigmented irides exhibit more information under visible light. Figure 2 shows a green-yellow iris image captured under NIR and VL. It can be seen that the red channel reveals more information than the NIR image. So, intuitively, the recognition performance would be better for such images in the VL rather than the NIR spectrum.

On the contrary, with dark-pigmented irides, the stromal features of the iris are hidden in VL and are revealed only under NIR, where the texture rather than the pigmentation is captured, as shown in Fig. 3. Therefore, the recognition performance for dark-pigmented irides would be better if the images were captured under the NIR spectrum.

Cross-spectral experiments

Cross-spectral study is important because it shows the feasibility of performing iris recognition in several security applications such as information forensics, security surveillance, and hazard assessment. Typically a person’s iris images are captured under NIR but most of the security cameras operate in the VL spectrum. Hence, NIR vs. VL matching is desired.

In this context, we carried out the following comparisons using the traditional 1D log-Gabor filter: NIR vs. red, NIR vs. green, and NIR vs. blue. Figure 6 depicts the ROC curves of these comparisons. According to Fig. 6, the green and blue channels resulted in poor performance due to the large gap in the electromagnetic spectrum between these channels and the NIR spectrum.

Fig. 6 Cross-channel matching

On the contrary, the red channel gave the best performance compared to the green and blue channels. This can be attributed to the small gap between the wavelength of the red channel (780 nm) and that of the NIR (850 nm). Therefore, the comparison of red vs. NIR is considered as the baseline for cross-spectral matching. Table 1 shows the EER of the cross-channel matching experiments.

Cross-spectral matching

Cross-spectral performance turned out to be a challenging task with EER >27 % which is attributable to matching unregistered iris images from different spectral bands. Hence, to achieve an efficient cross-spectral matching, adequate transformations before the feature extraction are needed.

Different feature enhancement techniques were employed, out of which the DoG, MSW, and BSIF recorded the best results as shown in Table 2. Therefore, our proposed framework, which is depicted in Fig. 1, is based on these descriptors.

Table 2 Experiments on different descriptors for cross-spectral matching

For all cross-spectral experiments, we have adopted the leave-one-out approach to obtain the comparison results [38]. Hence, for each subject with (m) iris samples, we have set one sample as a probe and the comparison is repeated iteratively by swapping the probe with the remaining (m−1) samples. The experiments for each subject are repeated (m(m−1)/2) times, and the final performance is measured in terms of EER by taking the minimum of the obtained comparison scores of each subject.
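One possible reading of this protocol, for a single subject, is sketched below: every unordered pair among the m samples is compared once, which gives m(m−1)/2 comparisons, and the minimum distance is retained. How probes and galleries are drawn across the two spectra is not spelled out above, so the pairing used here, the function names, and the toy templates are assumptions.

```python
from itertools import combinations

def hamming_distance(code_a, code_b):
    """Fraction of disagreeing bits between two equal-length templates."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b) / len(code_a)

def min_pairwise_score(templates, distance=hamming_distance):
    """Compare every unordered pair of a subject's templates.

    Returns the minimum distance and the number of comparisons made,
    which equals m * (m - 1) / 2 for m templates.
    """
    scores = [distance(a, b) for a, b in combinations(templates, 2)]
    return min(scores), len(scores)

# Four toy 4-bit templates for one hypothetical subject.
subject = [[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1], [0, 0, 1, 1]]
best_score, n_comparisons = min_pairwise_score(subject)
```

With four samples the sketch performs six comparisons, in line with the m(m−1)/2 count stated above.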

Cross-spectral fusion

To further enhance the performance of cross-spectral matching, the fusion of the G-DoG, G-BSIF, and G-MSW is considered. Different fusion methods are investigated namely, feature fusion, score fusion and decision fusion, out of which the decision fusion is observed to be the most effective.

Table 3 shows the performance of different fusion strategies for cross-spectral matching in terms of EER. Feature fusion resulted in poor results where the EER varied from 14 to 18 %. Score level fusion with minimum rule achieved better results. On the other hand, AND rule decision level fusion achieved the best results with EER = 6.81 %.

Table 3 Experiments on different fusion strategies for cross-spectral matching

A low false accept rate (FAR) is preferred to achieve a secure biometric system. To enhance the performance of our system and reduce the FAR, a fusion at the decision level is performed. Thus, the conjunction “AND” rule is used to combine the decisions from the G-DoG, G-BSIF, and G-MSW. This means that a false accept can only happen when all the previous descriptors produce a false accept [39].

Let P_D(FA), P_S(FA), and P_M(FA) represent the probability of a false accept using G-DoG, G-BSIF, and G-MSW, respectively. Similarly, P_D(FR), P_S(FR), and P_M(FR) represent the probability of a false reject. Assuming that the three descriptors err independently, the combined probability of a false accept P_C(FA) is the product of the three probabilities of the descriptors:

$$ P_{C}(FA)=P_{D}(FA)\cdot P_{S}(FA)\cdot P_{M}(FA). $$

On the other hand, the combined probability of a false reject P C(F R) can be expressed as the complement of the probability that none of the descriptors produce a false reject:

$$ \begin{aligned} P_{C}(FR)&=\left(P_{D}(FR)'\cdot P_{S}(FR)'\cdot P_{M}(FR)'\right)' \\ &=1-\left(1-P_{D}(FR)\right)\left(1-P_{S}(FR)\right)\left(1-P_{M}(FR)\right) \\ &=P_{D}(FR)+P_{S}(FR)+P_{M}(FR) \\ &\quad-P_{D}(FR)\cdot P_{S}(FR)-P_{D}(FR)\cdot P_{M}(FR) \\ &\quad-P_{S}(FR)\cdot P_{M}(FR)+P_{D}(FR)\cdot P_{S}(FR)\cdot P_{M}(FR). \end{aligned} $$

It can be seen from the previous equations that the joint probability of false rejection increases while the joint probability of false acceptance decreases when using the AND conjunction rule.
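The trade-off can be checked numerically with a short sketch. The first function applies the AND rule to individual accept or reject decisions, and the second computes the combined error probabilities as a product and as a complement of a product, which assumes that the descriptors err independently. The function names and the per-descriptor rates in the example are hypothetical and are not measurements from this work.

```python
def and_rule_decision(distances, thresholds):
    """Accept only if every descriptor's distance is within its threshold."""
    return all(d <= t for d, t in zip(distances, thresholds))

def and_rule_error_rates(far_rates, frr_rates):
    """Combined FAR and FRR under the AND rule, assuming independence."""
    combined_far = 1.0
    for p in far_rates:
        combined_far *= p                  # all descriptors must falsely accept
    all_accept_genuine = 1.0
    for p in frr_rates:
        all_accept_genuine *= (1.0 - p)    # no descriptor falsely rejects
    return combined_far, 1.0 - all_accept_genuine

# Hypothetical per-descriptor rates for G-DoG, G-BSIF, and G-MSW.
far_c, frr_c = and_rule_error_rates([0.1, 0.2, 0.05], [0.1, 0.2, 0.05])
```

For these made-up rates the combined FAR falls to about 0.001 while the combined FRR rises to about 0.316, which is larger than any individual FRR.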

All the previous descriptors (G-DoG, G-BSIF, and G-MSW) are considered local descriptors. It can be argued that the fusion of local and global features could enhance the performance further. However, fusing local and global features would require further stages to combine the resultant global and local scores, as they would be of different ranges and types [40]. Such stages would increase the complexity of the cross-spectral framework. We have carefully designed the proposed framework so that all three descriptors (G-DoG, G-BSIF, and G-MSW) generate homogeneous outputs (binary templates). Therefore, a single comparator (the Hamming distance) can be used for fast score matching.

Multi-spectral iris recognition

The VL and NIR images in the UTIRIS database are not registered. Therefore, they provide different iris texture information. The cross-channel comparisons demonstrated that the red and NIR channels are the most suitable candidates for fusion as they gave the lowest EER compared to other channels, as shown in Figs. 4 and 6, so fusing them is a natural choice to boost the recognition performance. Score level fusion is adopted in this paper due to its efficiency and low complexity [41]. Hence, we combined the matching scores (Hamming distances) from both the red and NIR images using sum rule-based fusion with equal weights to generate a single matching score. After that, the recognition performance is evaluated again with the ROC curves and EER.
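A minimal sketch of equal-weight sum rule fusion is given below. It assumes that the red and NIR score lists are aligned, so that position k in both lists refers to the same pair of eyes being compared; the function name and the toy scores are illustrative assumptions.

```python
def sum_rule_fusion(red_scores, nir_scores, w_red=0.5, w_nir=0.5):
    """Fuse aligned Hamming distances from two channels by a weighted sum."""
    if len(red_scores) != len(nir_scores):
        raise ValueError("score lists must be aligned and of equal length")
    return [w_red * r + w_nir * n for r, n in zip(red_scores, nir_scores)]

# Toy scores for three comparisons; the second is ambiguous in the red
# channel alone but is pulled down by a confident NIR score.
fused = sum_rule_fusion([0.20, 0.42, 0.48], [0.24, 0.22, 0.46])
```

With equal weights summing to one, the fused score stays in the same range as the original Hamming distances, so the same EER and ROC evaluation applies unchanged.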

It is evident from Fig. 7 that such fusion is useful to the iris biometric as there is a significant improvement in the recognition performance after the fusion with EER of only 0.54 % compared to 2.92 and 3.45 % before the fusion.

Fig. 7 ROC curves showing the iris recognition performance before and after fusing the information of the red and NIR channel

Comparisons with related work

Although the previous works [14, 23, 24] reported good results in terms of cross-spectral iris matching, it must be noted that these works used fully registered iris images and learned models that lack the ability to generalize.

In the works of [25, 42], the results of cross-spectral matching on unregistered iris images were reported. However, no models were proposed to enhance the cross-spectral iris matching. Table 4 shows the comparison results of the aforementioned works compared to our method.

Table 4 Cross-spectral matching comparison with different methods

Processing time

All experiments were conducted on a 3.2-GHz Core i5 PC with 8 GB of RAM under the Matlab environment. The proposed framework consists of four main descriptors, namely, BSIF, DoG, MSW, and the 1D log-Gabor filter. The processing times of the 1D log-Gabor filter, BSIF, and DoG descriptors are 10, 20, and 70 ms, respectively, while the MSW processing time is 330 ms. Therefore, the total computation time of the proposed method is about 430 ms, i.e., less than half a second, which implies its suitability for real-time applications.


In this paper, a novel framework for cross-spectral iris matching was proposed. In addition, this work highlights the applications and benefits of using multi-spectral iris information in iris recognition systems. We investigated iris recognition performance under different imaging channels: red, green, blue, and NIR. The experiments were carried out on the UTIRIS database, and the performance of the iris biometric was measured.

We drew the following conclusions from the results. According to Table 2, among a variety of descriptors, the difference of Gaussian (DoG), BSIF, and multi-scale Weberface (MSW) were found to give good cross-spectral performance after integrating them with the 1D log-Gabor filter. Table 4 and Fig. 6 showed a significant improvement in the cross-spectral matching performance using the proposed framework.

In terms of multi-spectral iris performance, Fig. 4 showed that the red channel achieved better performance compared to other channels or the NIR imaging. This can be attributed to the large number of the light-pigmented irides in the UTIRIS database. It was also noticed from Fig. 6 that the performance of the iris recognition varied as a function of the difference in wavelength among the image channels. Fusion of the iris images from the red and NIR channels notably improved the recognition performance. The results implied that both the VL and NIR imaging were important to form a robust iris recognition system as they provided complementary features for the iris pattern.