
1 Introduction

Human faces present rich textural and morphological cues that allow peers to recognize their demographic group. Since the 1990s, computer vision researchers have been addressing the problem of image-based automatic facial Soft-Biometrics recognition, e.g. gender [7,8,9] and ethnicity [9, 10]. While conventional works usually rely on color images, a new trend based on the 3D face has emerged in the past ten years. In addition to its invariance to illumination and facial makeup, the 3D modality has also demonstrated superior Soft-Biometric recognition performance in comparison with 2D texture-based methods [11, 12]. Remarkably, since the appearance of the Face Recognition Grand Challenge dataset (FRGCv2) [2], abundant approaches and results have been reported on it. In [5], Lahoucine et al. used geodesic lengths of facial curves in a boosting scheme for gender recognition with the 466 earliest scans of the FRGCv2 subjects. In [14], Toderici et al. explored wavelets for gender and ethnicity recognition on the FRGCv2 dataset. In [6], Xia et al. deployed DSF features to capture 3D facial symmetry and averageness for gender classification on FRGCv2. In [16], Gilani et al. used distances between 3D facial landmarks of FRGCv2 for gender classification. In [15], Wang et al. performed gender classification with the 3D coordinates of FRGCv2 meshes. In [17], Huang et al. established a common recognition framework for gender and ethnicity classification on FRGCv2, using Local Circular Pattern (LCP) features on both the depth and color scans.

Despite the success and prevalence of the 3D modality, the common thought underlying all these works [5, 6, 14,15,16,17] is to use the whole 3D face as the basis for Soft-Biometric recognition. However, research from other domains has revealed that demographic cues also exist in individual facial parts. For example, researchers in sexual dimorphism (male/female differences) [18, 19] have found that male faces usually possess more prominent facial features than female faces: male faces usually have more protuberant noses and eyebrows, and more prominent chins and jaws; the forehead is more backward sloping, and the distance between the top lip and the nose base is longer. In the study of ethnic differences [20], researchers have found that, compared to North American Whites, Asians usually have broader faces and noses and more widely spaced eyes, and exhibit the greatest difference in the anatomical orbital regions (around the eyes and the eyebrows). In the clinical study reported in [19], Alphonse et al. have revealed that Caucasians have significantly lower fetal Fronto-Maxillary Facial Angle (FMFA) measurements than Asians. These findings indicate that significant demographic cues exist at the facial-part level, in features such as the nose, the forehead, the mouth and the chin. However, very limited research attention has been given to specific facial parts and to testing their individual capability in revealing Soft-Biometric traits. In the 3D domain, Han et al. have proposed to investigate gender recognition with the face divided into four arbitrary regions [23]. In [22], Hu et al. have integrated the volume and area information of facial parts and constructed a sparse feature for 3D face gender recognition. Neither of these two works has actually explored a natural facial part alone for Soft-Biometric recognition. To the best of our knowledge, there has been only one work, proposed by Yasmina et al. [21] in the 2D domain, which investigates gender classification with 2D facial parts on the FERET dataset. With an SVM classifier, they achieved around an 81.50% gender classification rate with the chin, each eye and the mouth parts. With the nose part, they achieved an 86.40% classification rate, which suggests that the nose region is more discriminative for gender recognition.

From the analysis above, we find that the nose region has remarkable discriminative power for gender and ethnicity. Yet there is a lack of research on 3D facial-part-based Soft-Biometrics recognition, and on 3D nose-based recognition in particular. We therefore propose to explore the idea of 3D nasal Soft-Biometric recognition, in particular for gender and ethnicity. Our research is further motivated by the fact that, when using only the nasal region, we can effectively avoid the influence of hair and glasses occlusions and of facial expressions, which are challenging issues when working with full face scans [3].

2 Methodology and Contributions

Our approach consists of a 3D feature extraction step followed by a standard machine learning step for Soft-Biometrics classification. First, given a preprocessed 3D face, a 3D sphere centered at the nosetip is built to retain only the nasal region. Next, a collection of parameterized radial curves emanating from the nosetip is extracted to represent the 3D nasal region. We then build our features from the 3D coordinates of interpolated points on each curve. The features are fed to machine learning techniques, e.g. the Support Vector Machine (SVM), the Random Forest (RF) and Linear Discriminant Analysis (LDA), to examine their strength in gender and ethnicity recognition. We note that the proposed approach is fully automatic, without the need for human interaction at any step. The main contributions of this work are the following:

  • We propose the idea of 3D nasal Soft-Biometrics recognition; to the best of our knowledge, this is the first work in the literature to address gender and ethnicity recognition from the 3D nose alone. We demonstrate experimentally that the nasal shape contains rich demographic cues for Soft-Biometrics recognition.

  • With the 466 earliest scans of the FRGCv2 dataset (mainly neutral), we achieve a 91% gender recognition rate and a 94% ethnicity recognition rate in 10-fold cross-validation using simple 3D coordinate features, which demonstrates the effectiveness of the 3D nasal shape in revealing gender and ethnicity. When testing on the FU3D dataset, we achieve a 90.57% gender classification rate and a 100% ethnicity classification rate, which shows the generalization ability of the proposed approach.

  • With the whole FRGCv2 dataset, which involves significant expression changes, we again achieve a 91% gender recognition rate and a 94% ethnicity classification rate. These results demonstrate that the nasal shape based approach is robust to expressions, echoing the common belief that the nasal shape changes little under facial expression changes.

The rest of the paper is organized as follows. In Sect. 3, we detail the construction of the 3D nasal dataset and the feature extraction method, together with a preliminary analysis of the features’ competence in revealing the two demographic traits. Nasal Soft-Biometrics recognition experiments and analysis are presented in Sect. 4. Section 5 concludes the work and states future directions.

3 Feature Extraction and Preliminary Analysis

3.1 3D Nasal Dataset Construction and Feature Extraction

As there is no dedicated 3D nose dataset, we define and extract the nasal region from 3D face scans and construct the 3D nasal dataset ourselves. We base our study on the FRGCv2 dataset [2], which has been extensively used in 3D face analysis. This dataset contains 4,007 near-frontal 3D face scans of 466 subjects: 1,848 scans of 203 female subjects and 2,159 scans of 265 male subjects. For ethnicity, 1,213 scans of 112 subjects belong to the Asian group, and the remaining 2,794 scans of 354 subjects are Non-asian. About 60% of the scans have a neutral expression; the others show expressions of disgust, happiness, sadness and surprise.

Given a near-frontal 3D face scan from FRGCv2, after preprocessing [3] the face is in a normalized position, and we can detect the nosetip simply as the point closest to the camera. The nose region is then cropped out by a 3D sphere centered at the nosetip. In practice, we set the sphere radius to 45 mm, in accordance with the statistics of adult nasal bridge length [1]. Following this, we employ the radial curve extraction technique to represent the 3D nasal mesh; this technique has been widely and successfully used in 3D face related studies [3,4,5, 13]. In our work, we densely extract 100 radial curves emanating from the nosetip on the nasal mesh, one per equally spaced angular profile, and represent each radial curve with a resolution of 50 interpolated 3D points. The extracted nasal region contains mainly the nose and the region between the upper lip and the nose. The facial parts most deformable under expression, such as the mouth and cheek regions, are effectively excluded, and the forehead and eye regions, which suffer significantly in the presence of hair or glasses occlusions, are removed as well. The dense radial curve representation yields a close approximation of the original nasal shape and a simple 3D-coordinate feature, which forms the basis of the investigations in this work.
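For concreteness, the sketch below implements a simplified version of this extraction in Python/NumPy, operating on a point cloud rather than a mesh. It assumes the scan is a preprocessed, pose-normalized N×3 array in millimeters with the camera along the +z axis (so the nosetip is the point of maximum z); the function name, the angular-sector approximation of the profiles, and the omission of robustness checks (e.g. empty sectors) are simplifications for illustration, not the exact implementation used in our experiments.

```python
import numpy as np

def extract_nasal_features(scan, n_curves=100, n_points=50, radius=45.0):
    """Crop a 45 mm sphere around the nosetip, sample radial curves at
    equally spaced angles, and interpolate each curve to n_points 3D points.

    scan: (N, 3) array of a preprocessed frontal 3D face scan in mm,
          with the camera along +z (nosetip = point of maximum z).
    Returns a flat feature vector of length n_curves * n_points * 3.
    """
    # Nosetip: closest point to the camera after pose normalization.
    nosetip = scan[np.argmax(scan[:, 2])]

    # Keep only points inside the sphere (nasal region), centered at nosetip.
    dists = np.linalg.norm(scan - nosetip, axis=1)
    nose = scan[dists <= radius] - nosetip

    angles = np.linspace(0, 2 * np.pi, n_curves, endpoint=False)
    half_bin = np.pi / n_curves                   # angular half-width of a profile
    azimuth = np.arctan2(nose[:, 1], nose[:, 0])  # angle of each point in x-y plane

    curves = []
    for theta in angles:
        # Points whose azimuth falls inside this angular profile (wrapped diff).
        diff = np.angle(np.exp(1j * (azimuth - theta)))
        sector = nose[np.abs(diff) <= half_bin]
        # Order by planar distance from the nosetip, then resample to n_points.
        r = np.linalg.norm(sector[:, :2], axis=1)
        order = np.argsort(r)
        sector, r = sector[order], r[order]
        r_new = np.linspace(r.min(), r.max(), n_points)
        curve = np.stack([np.interp(r_new, r, sector[:, k]) for k in range(3)],
                         axis=1)
        curves.append(curve)
    # Simple 3D-coordinate feature: 100 curves x 50 points x 3 coordinates.
    return np.concatenate(curves).ravel()
```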

4 Experiments

We adopt the experimental settings proposed in the 3D face study [6], which defines two separate experimental sessions on the FRGCv2 dataset. The first session performs 10-fold cross-validation on the 466 earliest scans of the FRGCv2 subjects; this subset contains mainly neutral scans, and we term this the Expression-Dependent setting. The second session performs 10-fold cross-validation on the whole FRGCv2 dataset, of which about 40% of the scans are expressive. This setting tests the robustness of the proposed approach to facial expression challenges and, in our study, also allows us to examine the general belief that the nasal region is robust to facial expressions. We term this the Expression-Independent setting. In both settings, no subject appears in both the training and testing stages of the same experiment. Experimental results from three classifiers, namely the Support Vector Machine (SVM) with a linear kernel, the Random Forest (RF), and Linear Discriminant Analysis (LDA), are reported in the following.
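As an illustration of this protocol, the following sketch runs subject-independent 10-fold cross-validation with the three classifiers in scikit-learn, where GroupKFold keeps all scans of a subject inside a single fold. The feature matrix, labels, subject IDs and classifier hyperparameters are synthetic placeholders and assumptions, not the exact values of our experiments.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score
from sklearn.svm import SVC

# Placeholder data: in practice X holds one 100*50*3 nasal feature per scan.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 100 * 50 * 3))
subjects = rng.integers(0, 50, size=400)  # subject ID of each scan
y = subjects % 2                          # per-subject binary labels (e.g. gender)

classifiers = {
    "SVM (linear)": SVC(kernel="linear"),
    "RF": RandomForestClassifier(n_estimators=500, random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
}

# GroupKFold keeps every subject's scans in one fold, so no subject is used
# in both the training and testing stages of the same experiment.
cv = GroupKFold(n_splits=10)
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, groups=subjects, cv=cv)
    print(f"{name}: {scores.mean():.2%} (+/- {scores.std():.2%})")
```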

4.1 Expression-Dependent Gender and Ethnicity Recognition

Expression-Dependent gender classification results are depicted in the left of Fig. 1. The height of each bar gives the mean classification rate, and the red line shows the standard deviation over the 10 folds. All classifiers achieve a mean classification rate above 85%. With the linear-kernel SVM classifier, we achieve a 90.99% gender classification rate with a standard deviation of 4.29%. This result is further detailed in the confusion matrix in Table 1. The recognition rates for male (91.63%) and female (90.15%) are very close, which means the proposed approach performs in a balanced way across the two gender groups. These results demonstrate that the nasal shape contains rich demographic cues for gender recognition and that the proposed approach performs effective nasal shape based gender recognition. To confirm these findings, we also perform a cross-dataset experiment: training on the 466 scans of FRGCv2 and testing on the Florence University 3D face dataset (FU3D [24]). The FU3D dataset contains 53 neutral frontal face scans of 53 Caucasian (Non-asian) subjects; 14 of the subjects are female and 39 are male. The results are shown in the right of Fig. 1. Despite the significant difference in data acquisition techniques (a laser scanner for FRGCv2 versus a multi-camera stereo system for FU3D), the linear-kernel SVM classifier achieves a 90.57% gender classification rate in the cross-dataset experiment. This also demonstrates the generalization ability of the proposed approach in nasal shape based gender classification.
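In code, a cross-dataset experiment of this kind reduces to fitting on one dataset and scoring on the other. The sketch below shows the pattern with scikit-learn; the arrays are synthetic placeholders standing in for the FRGCv2 and FU3D nasal features and gender labels, so all names and sizes are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.svm import SVC

# Placeholders: X_frgc/y_frgc stand for the 466 FRGCv2 nasal features and
# gender labels, X_fu3d/y_fu3d for the 53 FU3D ones (0 = female, 1 = male).
rng = np.random.default_rng(0)
X_frgc, y_frgc = rng.normal(size=(466, 15000)), rng.integers(0, 2, 466)
X_fu3d, y_fu3d = rng.normal(size=(53, 15000)), rng.integers(0, 2, 53)

clf = SVC(kernel="linear").fit(X_frgc, y_frgc)   # train on FRGCv2 only
pred = clf.predict(X_fu3d)                       # test on unseen FU3D scans
print("accuracy:", accuracy_score(y_fu3d, pred))
print(confusion_matrix(y_fu3d, pred, normalize="true"))  # per-class rates
```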

Fig. 1. Expression-Dependent gender classification (Expression-Dependent: experiment on 466 FRGCv2 scans; Cross-Dataset: train on 466 FRGCv2 scans, test on FU3D)

Table 1. Confusion matrix of Expression-Dependent gender classification (T: Truth, P: Prediction)

            P: Male   P: Female
T: Male      91.63%      8.37%
T: Female     9.85%     90.15%

Table 2. Confusion matrix of Expression-Dependent ethnicity classification (T: Truth, P: Prediction)

                P: Asian   P: Non-asian
T: Asian          88.39%         11.61%
T: Non-asian       3.95%         96.05%

For Expression-Dependent ethnicity recognition, the results are depicted in the left of Fig. 2. All classifiers achieve a mean classification rate above 90%. With the linear-kernel SVM classifier, we achieve a 94.21% ethnicity classification rate with a standard deviation of 4.06%. Details of this result are presented in Table 2. The recognition rates for Asian (88.39%) and Non-asian (96.05%) are both high but considerably different; we assume that the unequal numbers of available scans for Asians (112) and Non-asians (354) account largely for this imbalance. In conclusion, these results highlight that the nasal shape contains rich cues for ethnicity recognition and that the proposed ethnicity recognition approach is effective. As for gender, we perform cross-dataset ethnicity recognition on the FU3D dataset. As shown in the right of Fig. 2, the Random Forest classifier classifies all 53 scans correctly (100%), and the linear-kernel SVM reaches a 98.11% ethnicity classification rate. This again demonstrates the generalization ability of our approach in nasal shape based ethnicity classification.

Fig. 2. Expression-Dependent ethnicity classification (Expression-Dependent: experiment on 466 FRGCv2 scans; Cross-Dataset: train on 466 FRGCv2 scans, test on FU3D)

4.2 Expression-Independent Gender and Ethnicity Recognition

In this section, experiments are carried out on the whole FRGCv2 dataset, of which about 40% of the scans are expressive. As shown in Fig. 3 (A), for the three classifiers the gender classification rates stay comparable to the Expression-Dependent setting, while the standard deviations decrease significantly. The linear-kernel SVM classifier achieves a 91.26% gender classification rate with a standard deviation of 1.59%. The confusion matrix in Table 3 shows balanced recognition performance for both male (91.71%) and female (91.07%). For ethnicity recognition, the results are depicted in Fig. 3 (B). The linear-kernel SVM achieves a 93.96% Asian versus Non-asian classification rate. The confusion matrix in Table 4 shows that the performance for Asians (88.43%) is lower than that for Non-asians (96.34%); however, these results are no worse than those of the Expression-Dependent experiments. In conclusion, the above results demonstrate that the proposed approaches for recognizing the two Soft-Biometric traits are robust to facial expression challenges.

Fig. 3. Gender and ethnicity recognition results under the Expression-Independent setting. (A) Results for gender classification; (B) Results for ethnicity classification

Table 3. Confusion matrix of Expression-Independent gender classification (T: Truth, P: Prediction)

            P: Male   P: Female
T: Male      91.71%      8.29%
T: Female     8.93%     91.07%

Table 4. Confusion matrix of Expression-Independent ethnicity classification (T: Truth, P: Prediction)

                P: Asian   P: Non-asian
T: Asian          88.43%         11.57%
T: Non-asian       3.66%         96.34%

5 Conclusion

In this work, we have proposed the idea of 3D nasal Soft-Biometrics, in particular for gender and ethnicity classification. The nasal shape is free from hair and glasses occlusions, and stays relatively rigid under expression changes. Using simple 3D coordinate features under the Expression-Dependent setting, the cross-validation experiments on the FRGCv2 dataset achieve a 91% gender recognition rate and a 94% ethnicity recognition rate, which confirm the usefulness of the nasal shape and the effectiveness of the proposed approach. Cross-dataset experiments on the FU3D dataset reach a 90.57% gender recognition rate and a 100% ethnicity recognition rate, which demonstrate the generalization ability of the proposed approach to other datasets. Under the Expression-Independent setting on the whole FRGCv2 dataset, comparable recognition rates and smaller standard deviations are achieved for both gender and ethnicity. These results demonstrate that the proposed approach stays stable under expression challenges.