Classification by Principal Component Regression in the Real and Hypercomplex Domains

Linear regression is a simple and widely used machine learning algorithm. It is a statistical approach for modeling the relationship between a scalar response variable and one or more explanatory variables. In this paper, a classification by principal component regression (CbPCR) strategy is proposed. This strategy performs regression of each data class in terms of its principal components. The CbPCR formulation leads to a new formulation of the Linear Regression Classification (LRC) problem that preserves the key information of the data classes while providing more compact closed-form solutions. For image classification, this strategy is also extended to the 4D hypercomplex domains to take into account the color information of the image. Quaternion and reduced biquaternion CbPCR strategies are proposed by representing each channel of the color image as one of the imaginary parts of a quaternion or reduced biquaternion number. Experiments on two color face recognition benchmark databases show that the proposed methods achieve better accuracies by a margin of about 3% over the original LRC and similar methods.


Moumen T. El-Melegy, moumen@aun.edu.eg

Introduction
Linear regression is a simple and widely used machine learning algorithm that has received a lot of attention in many fields. In the image recognition area, Naseem et al. [1] proposed a Linear Regression Classification (LRC) algorithm that represents each class's training images independently, assuming a linear regression relationship. The algorithm applies the least squares method to find the regression coefficients and then selects the class label that gives the smallest reconstruction error. To enhance its performance, Huang and Yang [2] and Zhu et al. [3] proposed to apply principal component analysis (PCA) [4] to extract the vital information from images and reduce the feature vector dimensions. The original data are then transformed into a low-dimensional subspace. Finally, LRC is performed on the projected data.
This paper contributes to this literature by proposing a new strategy for classification by performing regression of data in terms of its principal components. A novel formulation of the LRC problem, called Classification by Principal Component Regression (CbPCR), is presented. Moreover, a novel closed-form solution based on this formulation is derived. This classification strategy preserves key data class information and removes redundant and correlated details, while yielding a more compact solution. Several experiments on public face recognition benchmark databases are reported to provide evidence that the proposed strategy outperforms the original LRC method [1] and its recent variants [2,3].
The proposed strategy is also extended to color images. PCA techniques [4][5][6] and the existing methods [1][2][3] work in principle on grayscale, single-channel images. They may operate on color images after converting them to grayscale images, thus losing the important color information. Some methods (e.g., [7]) apply LRC to every color channel separately, then select the class having the smallest total prediction error over all color channels. Unlike those methods, and inspired by several studies [8][9][10][11][12][13][14], the current paper proposes to use 4D hypercomplex numbers to represent color images. This allows treating the color components of each image pixel as one entity, thus considering the correlation between color components. Among the studies [8][9][10][11][12][13][14], two address the LRC problem. Zou et al. [11] propose a quaternion LRC (QLRC) method that extends the classical LRC algorithm to quaternion space. QLRC converts the quaternion quantities to real ones to circumvent using quaternion derivatives. The recent paper [14] develops closed-form solutions for QLRC from the principles of quaternion calculus. In addition, the current paper proposes novel solutions based on reduced biquaternions (RBs), another hypercomplex space consisting of one real component and three imaginary ones. In addition to having a commutative algebra, in contrast to quaternions, RBs may be represented using the so-called $e_1$-$e_2$ form [15], which can lead to more time-efficient computation.
The proposed CbPCR formulation is extended to both the quaternion and RB domains to process color images. To that end, the current paper exploits an efficient algorithm derived by the authors in [10] for computing the principal components (eigenvectors) of an RB matrix by casting it as an x + y selection problem [16,17]. The experimental results on public benchmark databases for color face recognition demonstrate the better performances of the new quaternion and RB-based CbPCR algorithms over competing algorithms [7,11,14].
The rest of this paper is organized as follows: Sect. 2 gives a brief history of using 4D hypercomplex domains, namely quaternions and reduced biquaternions, in color image processing, with a focus on color face recognition. Section 3 gives some notations and formally defines the problem of concern here. Section 4 briefly reviews the quaternion and RB domains. Section 5 describes the proposed CbPCR method and its extension to the quaternion and the RB domains. The classification results on two benchmark color face databases are reported in Sect. 6. Section 7 concludes the paper.

Related Work
This section briefly reviews the use of 4D hypercomplex domains, namely quaternions and reduced biquaternions, in representing color images, with a focus on their application to color face recognition.
In 1996, Sangwine [18] introduced the idea of using 4D hypercomplex numbers (Quaternion numbers) in color image processing by encoding the pixel's color components into the three imaginary parts of a quaternion number. Bihan and Sangwine [8] and Pei et al. [19] proposed a Quaternion PCA method, which extracts more informative and robust features from the color image than conventional PCA. In 2011, Sun et al. [20] proposed 2DPCA and bi-dimensional PCA (BDPCA) based on quaternion representation. Also, Javier et al. [21] proposed an independent component analysis algorithm based on quaternions. Years later, Jia et al. [22] presented a 2DPCA based on a quaternion model (2DQPCA) that depends on reducing feature dimension in a row direction. In addition, Liu et al. [23] proposed a quaternion-based maximum margin criterion algorithm. The methods proposed in [20,22,23] mainly targeted color face recognition and showed an enhancement in the recognition rate over their grayscale-based counterpart methods.
Using the other 4D domain of reduced biquaternions for the same purpose started rather late. A reduced biquaternion PCA based on reduced biquaternion representation was proposed [10] to represent color images in the typical PCA framework, which takes full advantage of the color characteristics of the image. Recently, a color occlusion face recognition method [24] based on the quaternion non-convex sparse constraint mechanism was proposed.
On the other hand, LRC was originally designed for grayscale images. Zou et al. [11] proposed an LRC model based on quaternion representation to consider the color image information. In [14], an LRC model was developed based on a reduced biquaternion representation. Both works [11,14] were also applied to the face recognition problem and achieved better performances than the original LRC.

Notations and Problem Statement
Let us first give some notations to distinguish between real, quaternion, and reduced biquaternion quantities. Scalars and vectors are denoted using italic and bold lowercase letters, respectively, while matrices are indicated by bold uppercase letters. The number of dots on top of a symbol indicates its intended domain: real ($\mathbb{R}$), complex ($\mathbb{C}$), quaternion ($\mathbb{Q}$), or reduced biquaternion ($\mathbb{B}$). Symbols without any dots on top indicate real or complex quantities, where the intended domain is disambiguated from the context. A quaternion is represented by a symbol with one dot above, while an RB quantity has two dots.
In the context of image classification, the problem of our concern can be defined as follows. Suppose there are $L$ classes in a training set of images, where the $l$-th class consists of $n_l$ images. Each image is of size $m \times n$. The typical goal of LRC is to find the label of a new probe image $\mathbf{Y}$ from the given training samples and their labels. Assuming grayscale images, each image $\mathbf{X} \in \mathbb{R}^{m \times n}$ is represented as a 1D vector $\mathbf{x} \in \mathbb{R}^{mn}$ by stacking the rows, one after another. As such, one can form a matrix $\mathbf{A}_l = [\mathbf{x}^l_1, \mathbf{x}^l_2, \ldots, \mathbf{x}^l_{n_l}] \in \mathbb{R}^{mn \times n_l}$ that represents samples from the $l$-th class. For our work here, a color image is represented as a 1D pure quaternion vector $\dot{\mathbf{x}} \in \mathbb{Q}^{mn}$. Analogously, the color image can be represented as a 1D pure reduced biquaternion vector $\ddot{\mathbf{x}} \in \mathbb{B}^{mn}$.
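The data layout described above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names are hypothetical, and the assignment of the R, G, and B channels to the $i$, $j$, and $k$ parts (with a zero real part) is an assumed convention, since the text only states that each channel occupies one imaginary part.

```python
import numpy as np

def gray_to_vector(image):
    """Stack the rows of an m x n grayscale image into a vector of length m*n."""
    return np.asarray(image, dtype=float).reshape(-1)  # row-major: row after row

def class_matrix(images):
    """Form A_l = [x_1, ..., x_{n_l}] with one vectorized image per column."""
    return np.stack([gray_to_vector(img) for img in images], axis=1)

def color_to_pure_hypercomplex(image_rgb):
    """Encode an m x n x 3 color image as an (m*n, 4) array of components.

    Column 0 is the (zero) real part; columns 1..3 hold the R, G, B values as
    the i, j, k parts (assumed channel order, for illustration only).
    """
    rgb = np.asarray(image_rgb, dtype=float).reshape(-1, 3)
    return np.concatenate([np.zeros((rgb.shape[0], 1)), rgb], axis=1)

# Toy data: five 4 x 3 grayscale images and one 4 x 3 color image.
rng = np.random.default_rng(0)
gray_images = [rng.random((4, 3)) for _ in range(5)]
color_image = rng.random((4, 3, 3))

A_l = class_matrix(gray_images)                  # shape (12, 5), i.e. mn x n_l
x_dot = color_to_pure_hypercomplex(color_image)  # shape (12, 4), zero real part
```

The same four-column array can hold either a pure quaternion or a pure RB vector; the two domains differ only in the multiplication rules applied to the components.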

Quaternions
A quaternion number [25] consists of one real and three imaginary parts: $\dot{q} = q_r + q_i i + q_j j + q_k k$ (a quaternion number with no real part is called a pure quaternion). The three imaginary units satisfy $i^2 = j^2 = k^2 = ijk = -1$. The quaternion conjugate is $\bar{\dot{q}} = q_r - q_i i - q_j j - q_k k$, and the quaternion norm is $|\dot{q}| = \sqrt{\dot{q}\,\bar{\dot{q}}}$. The Hermitian (conjugate transpose) satisfies $(\dot{\mathbf{P}}\dot{\mathbf{Q}})^H = \dot{\mathbf{Q}}^H \dot{\mathbf{P}}^H$; see [26] for more details.
Due to the noncommutativity of quaternion multiplication, a quaternion matrix has left and right eigenvalues that may be different [15,27]. Any quaternion matrix $\dot{\mathbf{Q}} \in \mathbb{Q}^{n \times n}$ can be expressed as $\dot{\mathbf{Q}} = \mathbf{Q}_r + \mathbf{Q}_i i + \mathbf{Q}_j j + \mathbf{Q}_k k$, with $\mathbf{Q}_r$, $\mathbf{Q}_i$, $\mathbf{Q}_j$, and $\mathbf{Q}_k$ being the real and the three imaginary parts of the quaternion matrix $\dot{\mathbf{Q}}$. The eigenvalues (and eigenvectors) can be calculated from the equivalent complex matrix $H(\dot{\mathbf{Q}})$ [27], where the mapping $H(\cdot)$ transforms a quaternion matrix into an equivalent complex matrix. Thus, there are $2n$ eigenvalues (eigenvectors) for any $n \times n$ quaternion matrix.
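To make the count of $2n$ eigenvalues concrete, the sketch below builds one commonly used complex adjoint of a quaternion matrix. The specific block layout is an assumption: it is a standard choice in the quaternion literature and may differ from the exact mapping $H(\cdot)$ used in the paper. The function name is hypothetical.

```python
import numpy as np

def complex_adjoint(Qr, Qi, Qj, Qk):
    """Map Q = Qr + Qi i + Qj j + Qk k (each n x n, real) to a 2n x 2n complex matrix.

    Writing Q = Qa + Qb j with Qa = Qr + Qi i and Qb = Qj + Qk i, the adjoint
    used here is [[Qa, Qb], [-conj(Qb), conj(Qa)]] (assumed layout).
    """
    Qa = Qr + 1j * Qi
    Qb = Qj + 1j * Qk
    return np.block([[Qa, Qb], [-np.conj(Qb), np.conj(Qa)]])

# Toy example: a random 3 x 3 quaternion matrix given by its four real parts.
n = 3
rng = np.random.default_rng(0)
Qr, Qi, Qj, Qk = (rng.standard_normal((n, n)) for _ in range(4))

H = complex_adjoint(Qr, Qi, Qj, Qk)
eigenvalues = np.linalg.eigvals(H)   # 2n complex eigenvalues
print(H.shape, eigenvalues.shape)
```

With this layout the spectrum is closed under complex conjugation, so the $2n$ eigenvalues of the adjoint come in conjugate pairs.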

Reduced Biquaternions
A reduced biquaternion number also has one real and three imaginary parts: $\ddot{p} = p_r + p_i i + p_j j + p_k k$, where $i^2 = k^2 = -1$, $j^2 = 1$, $ij = ji = k$, $jk = kj = i$, and $ik = ki = -j$. In contrast to quaternions, multiplication on the RB domain is commutative. There are two special numbers $e_1 = (1 + j)/2$ and $e_2 = (1 - j)/2$ [15] such that any RB number can be represented as $\ddot{p} = p_1 e_1 + p_2 e_2$, with the complex numbers $p_1 = (p_r + p_j) + (p_i + p_k) i$ and $p_2 = (p_r - p_j) + (p_i - p_k) i$. Expressing many operations in terms of the $e_1$-$e_2$ form reduces their complexity. For example, direct RB multiplication requires 16 real multiplications, while applying the $e_1$-$e_2$ form requires only 8. The RB norm and Hermitian can be defined in a similar way as for quaternion numbers. The RB conjugate [28] is defined as:
$$\bar{\ddot{p}} = p_r - p_i i + p_j j - p_k k. \tag{4}$$
There are other definitions of the conjugate [28][29][30][31], but (4) is the only definition satisfying $\bar{\ddot{p}} = \bar{p}_1 e_1 + \bar{p}_2 e_2$.
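The saving from 16 to 8 real multiplications can be checked with a small sketch. It assumes the standard RB rules $i^2 = k^2 = -1$, $j^2 = 1$, $ij = k$ and the idempotents $e_1 = (1 + j)/2$, $e_2 = (1 - j)/2$; the function names are hypothetical, and an RB number is stored as a plain tuple `(r, i, j, k)`.

```python
def rb_mul_direct(p, q):
    """Multiply two RB numbers component-wise (16 real multiplications)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (
        a0 * b0 - a1 * b1 + a2 * b2 - a3 * b3,   # real part
        a0 * b1 + a1 * b0 + a2 * b3 + a3 * b2,   # i part
        a0 * b2 + a2 * b0 - a1 * b3 - a3 * b1,   # j part
        a0 * b3 + a3 * b0 + a1 * b2 + a2 * b1,   # k part
    )

def to_e1e2(p):
    """Return the complex pair (p1, p2) with p = p1*e1 + p2*e2."""
    r, i, j, k = p
    return complex(r + j, i + k), complex(r - j, i - k)

def from_e1e2(p1, p2):
    """Inverse of to_e1e2: recover the (r, i, j, k) components."""
    pa = (p1 + p2) / 2   # r + i*i
    pb = (p1 - p2) / 2   # j + k*i
    return (pa.real, pa.imag, pb.real, pb.imag)

def rb_mul_e1e2(p, q):
    """Multiply via the e1-e2 form: two complex products (8 real multiplications)."""
    p1, p2 = to_e1e2(p)
    q1, q2 = to_e1e2(q)
    return from_e1e2(p1 * q1, p2 * q2)

p = (1.0, 2.0, 3.0, 4.0)
q = (0.5, -1.0, 2.0, 0.25)
direct = rb_mul_direct(p, q)
via_e1e2 = rb_mul_e1e2(p, q)
print(direct, via_e1e2)
```

Because $e_1^2 = e_1$, $e_2^2 = e_2$, and $e_1 e_2 = 0$, the product decouples into two independent complex multiplications, which is what makes the $e_1$-$e_2$ form attractive for matrix operations as well.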
For any $n \times n$ RB matrix there are $n^2$ eigenvalues (eigenvectors) (see [10,15] for proof). Computing all $n^2$ eigenvalues and their corresponding eigenvectors would increase the computational cost. The time complexity of finding the $t$ largest eigenvalues will be $O(n^2 t)$. A more efficient algorithm for this purpose was derived in our paper [10] based on the well-known computer science problem of $x + y$ selection [16,17], with a time complexity of $O(nt + n \log n)$.
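The idea behind $x + y$ selection can be illustrated with a simple heap-based sketch that returns the $t$ largest pairwise sums $x_a + y_b$ of two length-$n$ lists without enumerating all $n^2$ pairs. This is not the algorithm of [10] and does not attain the complexity quoted above; it is a generic textbook variant, and the function name and the interpretation of the inputs as two lists of per-component scores are assumptions made for illustration.

```python
import heapq

def top_t_pair_sums(x, y, t):
    """Return the t largest sums x[a] + y[b] as (sum, a, b), largest first."""
    if len(x) == 0 or len(y) == 0 or t <= 0:
        return []
    # Index orders that sort each list in descending order.
    xs = sorted(range(len(x)), key=lambda a: x[a], reverse=True)
    ys = sorted(range(len(y)), key=lambda b: y[b], reverse=True)

    # Max-heap (via negated sums) over positions in the sorted orders.
    heap = [(-(x[xs[0]] + y[ys[0]]), 0, 0)]
    seen = {(0, 0)}
    result = []
    while heap and len(result) < t:
        neg_sum, a, b = heapq.heappop(heap)
        result.append((-neg_sum, xs[a], ys[b]))
        # Only the two successors of a popped pair can be the next largest.
        for na, nb in ((a + 1, b), (a, b + 1)):
            if na < len(xs) and nb < len(ys) and (na, nb) not in seen:
                seen.add((na, nb))
                heapq.heappush(heap, (-(x[xs[na]] + y[ys[nb]]), na, nb))
    return result

x = [5.0, 1.0, 3.0]
y = [4.0, 2.0, 0.5]
top = top_t_pair_sums(x, y, 4)
print([s for s, _, _ in top])
```

Sorting costs $O(n \log n)$ and each of the $t$ extractions costs $O(\log t)$, so only a small part of the $n^2$ candidate pairs is ever touched.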
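For an $m \times n$ RB matrix $\ddot{\mathbf{P}}$, the Frobenius norm is $\|\ddot{\mathbf{P}}\|_F = \frac{1}{\sqrt{2}} \|M(\ddot{\mathbf{P}})\|_F$, where $M(\cdot)$ maps any RB matrix to its complex equivalent matrix:
$$M(\ddot{\mathbf{P}}) = \begin{bmatrix} \mathbf{P}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{P}_2 \end{bmatrix},$$
where $\mathbf{P}_1, \mathbf{P}_2 \in \mathbb{C}^{m \times n}$ are defined as $\mathbf{P}_1 = (\mathbf{P}_r + \mathbf{P}_j) + (\mathbf{P}_i + \mathbf{P}_k) i$ and $\mathbf{P}_2 = (\mathbf{P}_r - \mathbf{P}_j) + (\mathbf{P}_i - \mathbf{P}_k) i$.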

Proposed Methods
In this section, we review LRC [1], derive the CbPCR for grayscale images and then extend it for color images using the theory of quaternions and RBs.

Linear Regression Classification (LRC)
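The goal of LRC is to infer the correct label of any new probe image $\mathbf{y} \in \mathbb{R}^{mn}$. LRC [1] seeks to represent $\mathbf{y}$ as a linear combination of the training images of each class by setting up the following real-valued linear regression problem:
$$\mathbf{y} = \mathbf{A}_l \mathbf{c}_l, \quad l = 1, 2, \ldots, L,$$
for which a closed-form solution is found via least squares:
$$\mathbf{c}_l = (\mathbf{A}_l^T \mathbf{A}_l)^{-1} \mathbf{A}_l^T \mathbf{y}.$$
Then, the predicted vector $\hat{\mathbf{y}}_l$ is given by:
$$\hat{\mathbf{y}}_l = \mathbf{A}_l \mathbf{c}_l.$$
The test sample $\mathbf{y}$ is eventually assigned to the class with the minimal distance $\min_l \|\mathbf{y} - \hat{\mathbf{y}}_l\|$, $l = 1, 2, \ldots, L$.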
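The LRC decision rule can be sketched as follows. The function name and the toy data are hypothetical; `np.linalg.lstsq` is used in place of the explicit normal-equation inverse, which gives the same coefficients when each class matrix has full column rank.

```python
import numpy as np

def lrc_predict(class_matrices, y):
    """Return (label, distances) for probe y given per-class matrices A_l (mn x n_l)."""
    distances = []
    for A in class_matrices:
        # Least-squares coefficients c_l minimizing ||y - A_l c_l||.
        c, *_ = np.linalg.lstsq(A, y, rcond=None)
        y_hat = A @ c                              # class-specific prediction
        distances.append(float(np.linalg.norm(y - y_hat)))
    return int(np.argmin(distances)), distances

# Toy example: two classes with three 20-dimensional training vectors each.
rng = np.random.default_rng(0)
A0 = rng.random((20, 3))
A1 = rng.random((20, 3))
probe = A1 @ np.array([0.2, 0.5, 0.3])   # lies in the span of class 1

label, distances = lrc_predict([A0, A1], probe)
print(label, distances)
```

In this toy case the probe is an exact linear combination of the class-1 samples, so its class-1 residual is numerically zero and the probe is assigned to that class.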

Classification by Principal Component Regression
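The idea in our proposed methods is based on the fact that an image (column) in $\mathbf{A}_l$ can be represented as [4]
$$\mathbf{x}^l_g = \boldsymbol{\mu}_l + \mathbf{U}_l \mathbf{w}_g, \quad \mathbf{w}_g = [w^1_g, \ldots, w^t_g]^T,$$
where $\boldsymbol{\mu}_l$ is the mean vector of the $l$-th class, and $\mathbf{U}_l \in \mathbb{R}^{mn \times t}$ represents the $t$ largest principal components of the class. That is, an image can be represented as a linear combination of the eigen-components of the class-specific scatter matrix. The real weights $\{w^b_g\}_{b=1}^{t}$ represent the projections of the $g$-th image along these components. Huang and Yang [2] and Zhu et al. [3] apply the standard LRC on these weights. In contrast, our CbPCR model is defined as:
$$\min_{\mathbf{c}_l} \|\mathbf{y} - \boldsymbol{\mu}_l - \mathbf{U}_l \mathbf{c}_l\|_2^2. \tag{11}$$
Forcing the gradient of the objective function (11) with respect to $\mathbf{c}_l$ to vanish, the closed form is found as:
$$\mathbf{c}_l = (\mathbf{U}_l^T \mathbf{U}_l)^{-1} \mathbf{U}_l^T (\mathbf{y} - \boldsymbol{\mu}_l).$$
Since $\mathbf{U}_l$ has orthonormal columns,
$$\mathbf{c}_l = \mathbf{U}_l^T (\mathbf{y} - \boldsymbol{\mu}_l).$$
The response vector $\hat{\mathbf{y}}_l$ is predicted as:
$$\hat{\mathbf{y}}_l = \boldsymbol{\mu}_l + \mathbf{U}_l \mathbf{c}_l.$$
The distance between $\mathbf{y}$ and the predicted response vector $\hat{\mathbf{y}}_l$ is computed as:
$$d_l(\mathbf{y}) = \|\mathbf{y} - \hat{\mathbf{y}}_l\|_2. \tag{15}$$
The test image $\mathbf{y}$ is assigned to the class minimizing (15).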
To reduce the computational burden of finding the largest principal components of the class scatter matrix, the common practice [4] is followed: the eigenvectors $\mathbf{V}_l \in \mathbb{R}^{n_l \times t}$ of the much smaller matrix $\mathbf{A}_l^T \mathbf{A}_l$ are found first, and the target eigenvectors are then computed as $\mathbf{U}_l = \mathbf{A}_l \mathbf{V}_l$, up to normalization of each column.
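A minimal NumPy sketch of the grayscale CbPCR pipeline is given below, under stated assumptions: the class matrix is mean-centered before the small $n_l \times n_l$ eigenproblem is solved, the mapped eigenvectors are rescaled to unit norm, and $t \le n_l - 1$ so that the retained eigenvalues are nonzero. The function names and toy data are hypothetical.

```python
import numpy as np

def fit_class(A, t):
    """Return (mu, U): class mean and t leading orthonormal principal components.

    A is mn x n_l with one training image per column; requires t <= n_l - 1.
    """
    mu = A.mean(axis=1)
    D = A - mu[:, None]                      # mean-centered class data
    G = D.T @ D                              # small n_l x n_l matrix
    evals, V = np.linalg.eigh(G)             # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:t]      # indices of the t largest
    U = D @ V[:, order]                      # map back to image space
    U /= np.linalg.norm(U, axis=0, keepdims=True)   # unit-norm columns
    return mu, U

def cbpcr_predict(models, y):
    """Assign y to the class with the smallest reconstruction distance."""
    distances = []
    for mu, U in models:
        c = U.T @ (y - mu)                   # closed-form coefficients
        y_hat = mu + U @ c                   # class-specific reconstruction
        distances.append(float(np.linalg.norm(y - y_hat)))
    return int(np.argmin(distances)), distances

# Toy example: two classes, six 30-dimensional samples per class.
rng = np.random.default_rng(0)
means = 10.0 * rng.standard_normal((2, 30))
train = [means[l][:, None] + rng.standard_normal((30, 6)) for l in range(2)]
models = [fit_class(A, t=3) for A in train]

probe = means[1] + rng.standard_normal(30)   # drawn around the class-1 mean
label, distances = cbpcr_predict(models, probe)
print(label, distances)
```

Because the columns of `D @ V` are mutually orthogonal, rescaling them to unit norm is enough to obtain an orthonormal basis, which is what allows the coefficients to be computed without a matrix inverse.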

Quaternion-based CbPCR
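Q-CbPCR is based on the algebra and calculus of quaternion matrices [32] to identify the class to which a new color image belongs. Here, an $m \times n$ training color image is represented as a 1D pure quaternion vector $\dot{\mathbf{x}} \in \mathbb{Q}^{mn}$. The columns of the matrix $\dot{\mathbf{A}}_l \in \mathbb{Q}^{mn \times n_l}$ represent samples from the $l$-th class, where each column is the difference between a training image and the class mean $\dot{\boldsymbol{\mu}}_l$. A test image $\dot{\mathbf{y}} \in \mathbb{Q}^{mn}$ is represented by setting up the following quaternion regression problem:
$$\min_{\dot{\mathbf{c}}_l} \|\dot{\mathbf{y}} - \dot{\boldsymbol{\mu}}_l - \dot{\mathbf{U}}_l \dot{\mathbf{c}}_l\|^2, \tag{17}$$
where $\dot{\mathbf{U}}_l$ encompasses the key eigenvectors of the quaternion scatter matrix $\dot{\mathbf{A}}_l \dot{\mathbf{A}}_l^H$ corresponding to the $t$ largest eigenvalues in terms of the quaternion norm, as computed by a QPCA technique [8]. The reconstructed query color image $\hat{\dot{\mathbf{y}}}_l$ is computed by:
$$\hat{\dot{\mathbf{y}}}_l = \dot{\boldsymbol{\mu}}_l + \dot{\mathbf{U}}_l \dot{\mathbf{c}}_l. \tag{18}$$
Eventually, $\dot{\mathbf{y}}$ is given the label of the class with the minimal quaternion norm $\min_l \|\dot{\mathbf{y}} - \hat{\dot{\mathbf{y}}}_l\|$, $l = 1, 2, \ldots, L$.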
The proposed quaternion-based representation is depicted in Fig. 1, where a color face image is represented as a linear combination of the mean image and the best $t$ eigenvectors of its class.
Fig. 1 Quaternion-based color representation: Column (a) shows the original color image. Column (b) gives the closed-form solution of (17) with $t = 3$. Column (c) depicts the real parts of the principal components in $\dot{\mathbf{U}}_l$ as grayscale images. Column (d) depicts the imaginary parts of the principal components as color images. The first row depicts the real (zero) and imaginary parts of the mean of the training images. Column (e) represents the reconstructed image from (18).

Reduced Biquaternion-based CbPCR
Analogously, RB-CbPCR relies on color image representation using RBs in place of quaternions. Our goal here is to find the correct label of a query color image $\ddot{\mathbf{y}} \in \mathbb{B}^{mn}$ from the given training data matrices $\ddot{\mathbf{A}}_l \in \mathbb{B}^{mn \times n_l}$ and their true labels. The proposed RB-CbPCR model is set up as:
$$\min_{\ddot{\mathbf{c}}_l} \|\ddot{\mathbf{y}} - \ddot{\boldsymbol{\mu}}_l - \ddot{\mathbf{U}}_l \ddot{\mathbf{c}}_l\|^2, \tag{20}$$
where $\ddot{\boldsymbol{\mu}}_l$ is the mean of the $l$-th class, and $\ddot{\mathbf{U}}_l$ represents the eigenvectors of the RB scatter matrix $\ddot{\mathbf{A}}_l \ddot{\mathbf{A}}_l^H$ corresponding to the $t$ largest eigenvalues in terms of the RB norm, as obtained via our efficient RB-based PCA algorithm [10]. By Lemma 2 in [14], it can be proved that (20) is equivalent to an alternative form (21), for which a closed-form solution is derived in Proposition 2.

Proposition 2:
The closed-form solution of (20) is $\ddot{\mathbf{c}}_l = \ddot{\mathbf{U}}_l^H (\ddot{\mathbf{y}} - \ddot{\boldsymbol{\mu}}_l)$.

Proof: See Appendix 2.
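The class-specific reconstructed test image is
$$\hat{\ddot{\mathbf{y}}}_l = \ddot{\boldsymbol{\mu}}_l + \ddot{\mathbf{U}}_l \ddot{\mathbf{c}}_l.$$
The test image $\ddot{\mathbf{y}}$ is finally assigned to the class with the minimal RB norm:
$$\min_l \|\ddot{\mathbf{y}} - \hat{\ddot{\mathbf{y}}}_l\|, \quad l = 1, 2, \ldots, L. \tag{23}$$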

Experiments
In this section, the proposed methods are evaluated on two color face recognition benchmark databases: the GATech database [33] and the FERET database [34]. The proposed CbPCR, Q-CbPCR, and RB-CbPCR methods are compared with LRC [1], the quaternion-based QLRC method [11,14], the RB-based RBLRC method [14], IPCRC [2,3], and the CLRC method [7]. Note that LRC, IPCRC, and CbPCR work on grayscale images, while the other methods operate on color images. All experiments are carried out on a PC with an Intel i7 CPU at 2.5 GHz and 8 GB RAM using MATLAB 2015. Quaternion computations are done using the quaternion MATLAB toolbox [35], while RB computations are carried out using our own MATLAB toolbox.

Experiments on GATech Database
The GATech database [33] consists of 50 subjects (of which only 38 are available to us) with 15 images per subject taken in two or three sessions. The images exhibit several variations in facial expression, pose, illumination, and scale; see Fig. 2a. Following [36], 10 images from each subject were used for training and the remaining 5 for testing.
Two factors were studied on the recognition performance: the image size and the number of principal components t. Figure 3 graphs the rank-1 recognition accuracy by all methods versus the image size ranging from 5 to 100% (size 54 × 39 pixels) in steps of 5%. We fix t=7 per class in our proposed methods and use 266 principal components from the whole training data in the IPCRC method. Expectedly, the performances of the grayscale-based methods are generally worse than those of color-based ones. CbPCR achieves 3% and 4.7% improvement over LRC and IPCRC, respectively. IPCRC is the worst among the three and achieves a peak accuracy of 84.2% compared with respective peak accuracies of 88.9% and 85.8% by CbPCR and LRC.
The color methods offer better performances than the grayscale-based methods, except for CLRC, which has worse accuracy than CbPCR and performs close to LRC, with a peak accuracy of 85.8%. The new grayscale-based CbPCR method performs better than QLRC and RBLRC up to an image size of 50% and has almost the same performance afterward. The new Q-CbPCR and RB-CbPCR perform similarly to each other and better than all other methods. Both have a peak accuracy of 88.95%, yielding about 2.28% improvement over QLRC and RBLRC.
Figure 4 shows the rank-1 recognition accuracy of all methods against $t$ = 1 to 9, fixing the image size at 54 × 39 pixels. The number of principal components in IPCRC is taken as the number of classes (38) times $t$. The accuracies of CLRC, LRC, QLRC, and RBLRC do not depend on $t$. IPCRC is the worst overall. As more principal components are used, the proposed CbPCR, Q-CbPCR, and RB-CbPCR offer higher accuracies. For $t \le 4$, CbPCR and RB-CbPCR have similar performance, while Q-CbPCR tops both methods. Afterward, the performances of Q-CbPCR and RB-CbPCR are better than that of CbPCR. Q-CbPCR tends to offer a slightly better accuracy than RB-CbPCR, where the former has a peak accuracy of 88.95% versus 87.37% for the latter. CbPCR has a peak accuracy of 86.84%.
An additional experiment is carried out to compare the proposed methods (Q-CbPCR and RB-CbPCR) with existing PCA-based methods: QPCA [8], RBPCA [10], 2DQPCA [22], and 2DRBPCA [10]. Figure 5 graphs the rank-1 recognition accuracies of all these methods versus an image size from 5 to 100% (size 54 × 39 pixels) in steps of 5%. Seven principal components per class are used in the proposed methods, while 266 principal components are employed in QPCA, RBPCA, 2DQPCA, and 2DRBPCA. The performance of 1D PCA methods is the worst. QPCA is worse than RBPCA, with a peak accuracy of 82.63%, while RBPCA has a peak accuracy of 85.78%. Q-CbPCR and RB-CbPCR are the best overall methods, with a peak accuracy of 88.95% compared to 2DQPCA and 2DRBPCA, which achieve peak accuracies of 85.79% and 87.36%, respectively.
Finally, the recognition time (in seconds) was studied by taking the average of running each algorithm 10 times on image size 54 × 39, with $t$ = 7 in the proposed methods and 266 principal components in IPCRC, QPCA, RBPCA, 2DQPCA, and 2DRBPCA. As shown in Table 1, QPCA and RBPCA are the slowest, while the grayscale-based LRC and IPCRC are the fastest due to their simpler computation. 2DQPCA and 2DRBPCA are faster than Q-CbPCR and RB-CbPCR since the scatter matrix in the 2D methods is smaller than the scatter matrix in the 1D methods. RB-CbPCR is around 1.8× faster than Q-CbPCR. This is due to the faster computations of RB operations taking advantage of the $e_1$-$e_2$ form and to the faster computation of the RB principal components by our efficient RB-based PCA algorithm [10].

Experiments on FERET database
The FERET database [34] contains more than 14,000 face images having pose and light variations. In this study, we consider a subset consisting of 115 subjects with 4 images from each subject captured in 3 poses; see Fig. 2b. Two faces are captured at 0°, while the other two faces are captured at 15° and −15°. One frontal image and one with the head rotated 15° were chosen for training, while testing is performed on the other 2 images. Figure 6 shows the rank-1 recognition accuracy for various image sizes varying from 5 to 100% (48 × 32 pixels) in steps of 5% using $t$ = 1 in our proposed methods and 115 principal components in IPCRC. Expectedly, the grayscale-based methods have lower accuracy. IPCRC and LRC are close to each other with respective peak accuracies of 79.6% and 80.9%. CbPCR achieves 1.79% and 1.18% improvements over LRC and IPCRC, respectively. QLRC and RBLRC exhibit nearly the same performance with a top accuracy of 81.74%. CLRC performs better than QLRC and RBLRC with a peak of 82.1%. Both Q-CbPCR and RB-CbPCR offer the best overall accuracy of 82.61% at 20% image size. Q-CbPCR shows a slightly better performance than RB-CbPCR for larger image sizes.
Moreover, the average recognition time (in seconds) was assessed by running each algorithm 10 times on image size 48 × 32 using $t$ = 1 in the proposed methods and 115 principal components in IPCRC. As shown in Table 2, LRC and IPCRC are the fastest, while CLRC is the slowest. Q-CbPCR and RB-CbPCR take less time than QLRC and RBLRC. Moreover, RB-CbPCR is about 1.5× faster than Q-CbPCR.

Conclusions
In this paper, a novel formulation of LRC based on principal component regression has been proposed. This formulation keeps the key information of the data classes while providing more compact closed-form solutions. This formulation is also extended to the quaternion and RB domains to take into account the color information. The specific contributions of this paper are:
• A CbPCR strategy is proposed by performing regression of each data class in terms of its principal components.
• This CbPCR strategy is extended to the hypercomplex domains of quaternions and RBs to consider color images.
• The CbPCR closed-form solutions are developed from the principles of the real, quaternion, and RB domains.
• Experiments on two color face recognition benchmark databases have shown that the proposed Q-CbPCR and RB-CbPCR have the highest overall accuracy among eight different methods, including very recent ones [7,11,14]. Moreover, RB-CbPCR is about 1.8× faster than Q-CbPCR. The grayscale-based CbPCR algorithm has even outperformed some color-based algorithms in the literature, in addition to the original grayscale LRC method [1] and its recent variants [2,3].
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.