Ear Recognition Using Block-Based Principal Component Analysis and Decision Fusion
Abstract
In this paper, we propose a fast and accurate ear recognition system based on principal component analysis (PCA) and fusion at classification and feature levels. Conventional PCA suffers from time and space complexity when dealing with highdimensional data sets. Our proposed algorithm divides a large image into smaller blocks, and then applies PCA on each block separately, followed by classification using a minimum distance classifier. While the recognition rates on small blocks are lower than that on the whole ear image, combining the outputs of the classifiers is shown to increase the recognition rate. Experimental results confirm that our proposed algorithm is fast and achieves recognition performance superior to that yielded when using whole ear images.
Keywords
Ear recognition; Principal component analysis (PCA); Feature fusion; Classifier fusion
1 Introduction
Ear recognition systems are a relatively recent biometric technique, and are challenging to implement in practice due to difficulties in controlling occlusions, pose, illumination, etc. Ears have played a significant role in forensic science, especially in the United States, where an ear classification system based on manual measurements has been used for more than 40 years [5]. Using a collection of over 10,000 ears, Iannarelli found them to be distinguishable based on only 12 measurements.
Chang et al. [3] compared ear recognition with face recognition using a standard PCA technique on face and ear images, and reported accuracies of 71.6 % and 70.5 % for ear and face recognition, respectively. They also presented results with varying lighting, which resulted in lower recognition accuracies of 64.9 % and 68.5 % for face and ear images, respectively. Combining ear and face images led to a significant improvement and an accuracy of 90.9 %. Kumar and Zhang [7] employed different feature extraction methods and different classification algorithms, namely feed-forward artificial neural networks and three classifiers based on a nearest neighbour rule. The experiments they performed yielded recognition rates ranging from 76.5 % to 94.1 %. Tharwat et al. [13] used feature combination to improve the performance of an ear recognition system, and achieved recognition rates between 85.9 % and 96.1 %.
Principal component analysis (PCA) is widely used for dimensionality reduction, feature extraction, compression, visualisation, and other tasks. PCA finds the \(c\) principal orthonormal vectors which describe an eigenspace. In many applications, \(c\) is much smaller than the original dimensionality of the data. The computation of PCA can be implemented using an eigenvalue decomposition (EVD) of the covariance matrix of the data matrix [16]. However, PCA has relatively high computational complexity and memory requirements, especially for large datasets [1, 11].
To address this, Golub and van Loan [4] used Jacobi’s method, which diagonalises a symmetric matrix and requires about \(O(d^3 + d^2 n)\) computations, where \(n\) represents the number of feature vectors or samples and \(d\) represents the dimensionality of the vectors. Roweis [9] proposed an expectation maximisation (EM) algorithm for PCA, which is shown to be computationally more effective than the EVD method for PCA. However, calculating PCA based on EM is still expensive, while the EM algorithm may not converge to the global maximum but only a local one, and is dependent on the initialisation. The power method can also be used to find leading eigenvectors, and is less expensive, but can compute only the single most dominant eigenvector [10]. Also, Skarbek [12] and Liu et al. [8] proposed eigenspace merging, where it is not necessary to store the covariance matrix of previous training samples.
In this paper, we propose a fast and accurate ear recognition system based on PCA on subimages and fusion at feature and classification level. Our algorithm aims to decrease the dimensionality, and hence the complexity, of a PCA-based algorithm by dividing the ear image into small blocks. PCA is then applied on the image blocks separately and classification is performed using a minimum distance classifier. The outputs of these classifiers are combined at abstract, rank, and score level, while we also investigate combining the block features at the feature level. Experimental results confirm that our proposed algorithm is fast and achieves recognition performance superior to that yielded when using whole ear images.
The rest of the paper is organised as follows. In Sect. 2, we summarise some of the background on principal component analysis and current fusion methods. Section 3 then details our proposed algorithm. Section 4 gives experimental results, while Sect. 5 concludes the paper.
2 Background
2.1 Principal Component Analysis
Principal component analysis (PCA) is a popular linear subspace method that finds a linear transformation which reduces the \(d\)-dimensional feature vectors to \(h\)-dimensional feature vectors with \(h < d\). It is possible to reconstruct the \(d\)-dimensional feature vectors from the \(h\)-dimensional reduced representation with some finite error known as the reconstruction error. Of the resulting \(h\) basis vectors, the first one is in the direction of the maximum variance of the given data, while the remaining basis vectors are mutually orthogonal and maximise the remaining variance. Each basis vector represents a principal axis. These principal axes can be obtained as the dominant/leading eigenvectors (i.e. those with the largest associated eigenvalues) of the covariance matrix of the original data. In PCA, the original feature space is characterised by these basis vectors, and the number of basis vectors is usually much smaller than the dimensionality \(d\) of the feature space [12].
An ear image \(\varGamma\) of size \(M \times N\), where \(M\) and \(N\) are the width and height of the image, is first transformed into a vector of length \(M \times N\). The feature matrix of \(K\) training ear images is then given by \(\varGamma = [\varGamma_1, \varGamma_2, \dots, \varGamma_K]\) and the average of the training set is calculated as \(\psi = \frac{1}{K}\sum_i \varGamma_i\). The average is subtracted from each image, i.e. \(\phi_i = \varGamma_i - \psi\), and the data matrix created as \(A = [\phi_1, \phi_2, \dots, \phi_K]\) of size \((M \times N) \times K\).
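The construction above can be sketched in Python as follows. The function names `pca_fit` and `pca_project` are illustrative assumptions, and the covariance matrix is diagonalised directly, which presumes the vector dimensionality is small enough for an explicit EVD (as is the case for the small blocks used later in the paper).

```python
import numpy as np

def pca_fit(Gamma, h):
    """Fit PCA: subtract the mean image, then keep the h leading
    eigenvectors of the covariance matrix of the centred data."""
    psi = Gamma.mean(axis=1, keepdims=True)   # average image psi
    A = Gamma - psi                           # columns phi_i = Gamma_i - psi
    C = (A @ A.T) / A.shape[1]                # covariance matrix (d x d)
    eigvals, eigvecs = np.linalg.eigh(C)      # EVD, eigenvalues ascending
    leading = np.argsort(eigvals)[::-1][:h]   # indices of h largest eigenvalues
    return psi, eigvecs[:, leading]           # mean and (d x h) basis W

def pca_project(x, psi, W):
    """Project a (d x 1) image vector onto the h-dimensional eigenspace."""
    return W.T @ (x - psi)
```

Since `np.linalg.eigh` returns orthonormal eigenvectors of a symmetric matrix, the selected columns form the orthonormal basis described in Sect. 2.1.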
2.2 Fusion Methods
Combining different and independent resources can increase the accuracy of biometric (or other) systems. Misclassification of some samples by a method or classifier can be compensated by combining different resources which in turn can be performed at different levels. There are various approaches for such fusion methods, of which we summarise the most common in the following.
In multi-sensorial systems, samples from the same instance are captured using two or more different sensors (e.g., both visible light and infrared cameras) and combined in a sensor level fusion approach to increase the robustness of the biometric system [6].
Combination at feature level can lead to improved performance as more information is available (compared to fusion at classification level, which is discussed below). Fusion of features is usually implemented by concatenating two or more feature vectors, i.e. if \(f_1 = \{x_1, \dots, x_n\}\) and \(f_2 = \{y_1, \dots, y_m\}\) are two feature vectors of lengths \(n\) and \(m\), respectively, then the fused feature vector \(f = \{x_1, \dots, x_n, y_1, \dots, y_m\}\) is obtained by concatenation of \(f_1\) and \(f_2\).
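As a minimal illustration of such concatenation (the feature values shown are made up for the example):

```python
import numpy as np

f1 = np.array([0.2, 0.5, 0.1])   # feature vector of length n = 3
f2 = np.array([0.7, 0.3])        # feature vector of length m = 2
f = np.concatenate([f1, f2])     # fused vector of length n + m = 5
```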

Abstract Level Fusion: Abstract or decision level fusion makes a decision by combining the output decisions of the different classifiers for a test sample. It is the simplest fusion approach, with majority voting being the most commonly employed rule.
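Majority voting over per-classifier decisions can be sketched in a few lines; `majority_vote` is a hypothetical helper name:

```python
from collections import Counter

def majority_vote(decisions):
    """Abstract-level fusion: each classifier contributes one class label,
    and the most frequent label wins."""
    return Counter(decisions).most_common(1)[0][0]
```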

Rank Level Fusion: Here, the output of each classifier (a subset of possible matches) is sorted in decreasing order of confidence so that each class has its own rank. Fusion is performed by summing up the ranks of each class, and the decision is given by the class with the best combined rank.
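One common reading of this scheme is a Borda-style count, where the most confident class per classifier receives the highest rank value and the decision is the class with the largest rank sum; `rank_fusion` is a hypothetical sketch under that assumption:

```python
import numpy as np

def rank_fusion(score_lists):
    """Rank-level fusion: per classifier, the best class gets rank m,
    the worst gets rank 1; the decision maximises the rank sum."""
    scores = np.asarray(score_lists)              # (R, m), higher = more confident
    ranks = np.argsort(np.argsort(scores, axis=1), axis=1) + 1
    return int(np.argmax(ranks.sum(axis=0)))      # winning class index
```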
Score (Measurement) Level Fusion: Here, the output of each classifier is a vector of scores or measurements representing the distances between the test image and the training images. Fusion at this level combines these score vectors, and the decision is given by the class with the minimum value. Assume that we want to classify an input pattern \(Z\) into one of \(m\) possible classes based on the evidence provided by \(R\) different classifiers. Let \(\acute{x}_i\) be the score derived for \(Z\) from the \(i\)th classifier, and let the outputs of the individual classifiers be \(P(\omega_j \mid \acute{x}_i)\), i.e., the posterior probability of pattern \(Z\) belonging to class \(\omega_j\) given the score \(\acute{x}_i\). If \(c \in \{1, 2, \dots, m\}\) is the class to which \(Z\) is finally assigned, then the assignment can be made by one of the following rules [14]:
$$\begin{aligned} c&= \arg \max _j\, \max _i\, P(\omega _j \mid \acute{x}_i) \\ c&= \arg \max _j\, \min _i\, P(\omega _j \mid \acute{x}_i) \\ c&= \arg \max _j\, \text {med}_i\, P(\omega _j \mid \acute{x}_i) \\ c&= \arg \max _j\, \text {avg}_i\, P(\omega _j \mid \acute{x}_i) \\ c&= \arg \max _j\, \prod _i\, P(\omega _j \mid \acute{x}_i) \end{aligned}$$ (4)
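Assuming the classifier scores have already been converted to posteriors, the five combination rules above can be implemented over an \(R \times m\) posterior matrix; `score_fusion` and its `rule` parameter are illustrative names:

```python
import numpy as np

def score_fusion(P, rule="avg"):
    """Score-level fusion over a posterior matrix P of shape (R, m),
    where P[i, j] = P(omega_j | x_i). Returns the winning class index."""
    combine = {
        "max":  lambda P: P.max(axis=0),
        "min":  lambda P: P.min(axis=0),
        "med":  lambda P: np.median(P, axis=0),
        "avg":  lambda P: P.mean(axis=0),
        "prod": lambda P: P.prod(axis=0),
    }[rule]
    return int(np.argmax(combine(np.asarray(P))))
```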
3 Proposed Algorithm
In this paper, we propose a fast and accurate ear recognition system based on principal component analysis (PCA) and fusion at classification and feature levels. Figure 1 gives an overview of our proposed algorithm.
The first step in our approach is to divide each ear image into nonoverlapping blocks. In each block, PCA features are extracted and each subimage is matched separately using a minimum distance classifier.
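The block division step can be sketched as follows; `split_into_blocks` is a hypothetical helper that assumes the image dimensions are divisible by the chosen grid:

```python
import numpy as np

def split_into_blocks(img, rows, cols):
    """Divide an image into rows x cols non-overlapping blocks and
    return each block as a flattened feature vector."""
    M, N = img.shape
    bh, bw = M // rows, N // cols   # block height and width
    return [img[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw].ravel()
            for r in range(rows) for c in range(cols)]
```

For example, a \(2 \times 2\) grid yields 4 blocks, matching the smallest configuration evaluated in Sect. 4.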
In the first model, the features of the blocks are combined at feature level. Here, the PCA algorithm is applied on each subimage and then the features of all blocks combined to form a feature vector. Classification is performed using a minimum distance classifier.
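A minimal sketch of such a minimum distance classifier, i.e. a nearest-neighbour rule under the metrics used in the experiments (function and parameter names are assumptions):

```python
import numpy as np

def min_distance_classify(x, train_feats, train_labels, metric="euclidean"):
    """Assign the label of the closest training feature vector."""
    T = np.asarray(train_feats, dtype=float)
    x = np.asarray(x, dtype=float)
    if metric == "euclidean":
        d = np.linalg.norm(T - x, axis=1)
    elif metric == "cityblock":
        d = np.abs(T - x).sum(axis=1)
    elif metric == "cosine":
        d = 1.0 - (T @ x) / (np.linalg.norm(T, axis=1) * np.linalg.norm(x))
    return train_labels[int(np.argmin(d))]
```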
4 Experimental Results
In our experiments, we use a dataset of 102 ear images, 6 images for each of 17 subjects [2]. In particular, six views of the left profile from each subject were taken under uniform diffuse lighting conditions, while there are slight changes in the head position from image to image.
Table 1. Recognition rates [%] using different minimum distance classifiers and PCA on whole ear images
Metric  2 training images  4 training images
Euclidean  89.7059  94.1176
Cityblock  88.2353  94.1176
Cosine  89.7059  94.1176
Table 2. Comparison of required CPU times of applying PCA on whole ear images and on image blocks
Blocks  2 training images  4 training images
1 (whole image)  607  615
\(2\times 2=4\)  36  39
\(3\times 3=9\)  9  10
\(4\times 4=16\)  2  3
In the second experiment, the proposed PCA algorithm is applied. Here, the ear images are divided into 4, 9, and 16 blocks, respectively, to reduce the computational complexity of the proposed model. The resulting computation times are given in Table 2. As can be seen, the time required for PCA calculation on blocks is very small compared to that of performing PCA on the whole images, while the computation time decreases with decreasing block size.
The final experiment is conducted to investigate the proposed classifier fusion models. Here, the features that result from each block are matched separately, while we combine the results of the classifiers at abstract level, rank level, and score level, respectively. The corresponding recognition results are shown in Figs. 3, 4, and 5, respectively.
From Figs. 3, 4, and 5, we can see that our proposed classifier fusion model achieves accuracies that exceed those obtained based on the whole ear images. Moreover, the accuracy of our model in general decreases with an increasing number of blocks. When the number of blocks is increased, some of the resulting blocks will contain only background or small parts of the ear, which decreases the accuracy for these parts and hence affects the overall accuracy. This problem can in particular be observed for abstract level fusion, while for rank and score level fusion, using many ranks or scores may compensate for the problem. Score level fusion leads to recognition performance that is significantly better than abstract and rank level fusion. Not surprisingly, abstract level fusion yields the lowest accuracy, as it is based purely on the decisions without further information.
5 Conclusions
In this paper, we have presented an algorithm to identify persons from 2D ear images based on principal component analysis. Crucially, the computational complexity of PCA is addressed by partitioning the images into small blocks and performing PCA on the subimages separately. We then combine the blocks at feature and classification level, respectively, with the latter leading to the best results and significantly improved performance compared to performing PCA-based recognition on the whole ear images. In addition to this increased classification accuracy, our approach also significantly reduces the required computation time, giving a fast and accurate ear recognition algorithm as demonstrated by a series of experimental results.
References
1. Candès, E.J., Li, X., Ma, Y., Wright, J.: Robust principal component analysis? J. ACM 58(3), 11 (2011)
2. Carreira-Perpinan, M.: Compression neural networks for feature extraction: application to human recognition from ear images. MS thesis, Faculty of Informatics, Technical University of Madrid, Spain (1995)
3. Chang, K., Bowyer, K.W., Sarkar, S., Victor, B.: Comparison and combination of ear and face images in appearance-based biometrics. IEEE Trans. Pattern Anal. Mach. Intell. 25(9), 1160–1165 (2003)
4. Golub, G.H., van Loan, C.F.: Matrix Computations, vol. 3. JHU Press, Baltimore (2012)
5. Iannarelli, A.V.: Ear Identification. Paramont Publishing Company, Fremont (1989)
6. Jain, A., Nandakumar, K., Ross, A.: Score normalization in multimodal biometric systems. Pattern Recogn. 38(12), 2270–2285 (2005)
7. Kumar, A., Zhang, D.: Ear authentication using log-Gabor wavelets. In: Defense and Security Symposium, p. 65390A (2007)
8. Liu, L., Wang, Y., Wang, Q., Tan, T.: Fast principal component analysis using eigenspace merging. In: IEEE International Conference on Image Processing (ICIP), vol. 6, pp. VI-457 (2007)
9. Roweis, S.: EM algorithms for PCA and SPCA. In: Advances in Neural Information Processing Systems, vol. 10, pp. 626–632. MIT Press, Cambridge (1998)
10. Schilling, H.A., Harris, S.L.: Applied Numerical Methods for Engineers Using MATLAB. Brooks/Cole Publishing Co., Pacific Grove (1999)
11. Shlens, J.: A tutorial on principal component analysis. arXiv preprint arXiv:1404.1100 (2014)
12. Skarbek, W.: Merging subspace models for face recognition. In: Petkov, N., Westenberg, M.A. (eds.) CAIP 2003. LNCS, vol. 2756, pp. 606–613. Springer, Heidelberg (2003)
13. Tharwat, A., Hashad, A., Salama, G.: Human ear recognition based on parallel combination of feature extraction methods. Mediterr. J. Comput. Netw. 6(4), 133–137 (2010)
14. Tharwat, A., Ibrahim, A.F., Ali, H.A.: Multimodal biometric authentication algorithm using ear and finger knuckle images. In: 7th International Conference on Computer Engineering & Systems, pp. 176–179 (2012)
15. Toygar, Ö., Acan, A.: Boosting face recognition speed with a novel divide-and-conquer approach. In: Aykanat, C., Dayar, T., Körpeoğlu, I. (eds.) ISCIS 2004. LNCS, vol. 3280, pp. 430–439. Springer, Heidelberg (2004)
16. Turk, M., Pentland, A.: Eigenfaces for recognition. J. Cogn. Neurosci. 3(1), 71–86 (1991)