1 Introduction

Soft-biometrics provides complementary information about an individual without being sufficient to fully authenticate them. It includes various traits such as gender, skin color, eye color and ethnicity. These characteristics have recently been used to reinforce biometric identification systems. Besides, in forensic applications, soft-biometrics allows investigations to be restricted to a limited category of persons or suspects [1]. The first works in this field extracted soft-biometrics by analyzing face images, which constitute the most practical identification tool [2, 3]. Soft-biometrics can also bring useful information to forensic and handwriting recognition applications. For instance, when analyzing an anonymous threat letter, soft-biometric information such as the writer’s gender, handedness, age range and educational level makes a valuable contribution to the investigation. Since 2001, researchers in the handwriting recognition field have been predicting soft-biometric traits from handwritten text. The first work, proposed by Cha et al. [4], tried to classify the US population into demographic sub-categories defined by gender, ethnicity and educational level. Other works followed, dealing with various traits such as gender, handedness, age range and nationality [5,6,7,8,9,10,11,12,13,14]. However, in the state of the art, soft-biometrics prediction systems were developed to deal with a single trait. This is mainly due to two reasons. First, there is a lack of datasets providing several soft-biometric characteristics; second, predicting even one characteristic is a challenging task. In fact, the accuracies reported on several benchmark datasets vary from 55% to 85% [5,6,7,8,9,10,11,12,13,14]. Hence the following questions arise: can we predict two or more characteristics from the same analysis and, if so, what prediction score can be expected?
The present work attempts to answer these questions by proposing two multiclass prediction schemes. In the first scheme, we employ the same handwritten text to develop individual systems that predict the writer’s gender, handedness and age range. The predictions obtained are then grouped to get a global prediction on the three characteristics. The second scheme, by contrast, directly adopts a multiclass prediction based on the one-against-all implementation. In both schemes, the prediction process is based on an SVM classifier associated with gradient features. The rest of this paper is arranged as follows: Sect. 2 introduces the multi-trait prediction schemes, Sect. 3 presents the experimental evaluation, and the last section reports the main conclusions of this work.

2 Multiclass Prediction of Soft-Biometrics

Gender is the social definition of a man and a woman, while handedness defines the preference for use of a hand, known as the dominant hand (left or right). As to age, it is perceived by ranges. Until now, predicting such characteristics from handwriting has been performed for only one characteristic at a time. In this work we investigate the feasibility of a multiclass prediction from the analysis of the same handwritten text. Whatever the adopted scheme, the prediction task rests on two main steps: feature generation and prediction. For feature generation, several texture, gradient, shape and geometric features have been proposed [10, 14]. For the prediction step, several classifiers have been employed, such as artificial neural networks, Support Vector Machines (SVM) and decision tree algorithms [8, 10]. Nevertheless, findings report that SVM is the best candidate for solving the prediction task [10]. So, in this work we employ SVM associated with Gradient Local Binary Patterns (GLBP), which showed high performance for predicting a single soft-biometric trait [10].

2.1 Dataset Description

Up to now, the prediction of the writer’s soft-biometrics has not been widely investigated because of the lack of public datasets. Specifically, for the Latin script, IAM is the only public dataset that provides the gender, handedness and age range of writers. IAM was developed by a research group on computer vision and artificial intelligence at Bern University in Switzerland. It contains handwritten sentences from more than 200 writers grouped into two age categories, “25–34 years” and “35–56 years”. So, by considering the three available traits, we can define 8 classification categories as shown in Table 1. In the present work, 534 samples are collected to perform multiclass soft-biometrics prediction. Specifically, we considered only one handwritten sentence per writer for right-handed writers. For classes involving left-handed writers, we considered more than one sample per writer, since the IAM dataset contains only 20 left-handed writers.

Table 1. Data distribution for multiclass prediction.

2.2 Multiclass Soft-Biometrics Prediction Based on Individual Systems

Since soft-biometrics prediction is commonly addressed by systems predicting a single characteristic, the direct extension to multiclass prediction consists of grouping the individual decisions of such systems. The idea of this scheme is to predict each soft-biometric characteristic independently of the others, so that the global system is composed of “j” individual binary systems if there are “j” characteristics to predict. In this respect, three systems are developed to predict gender, handedness and age range by grouping the training data according to the considered characteristic. Hence, we use 176 female samples and 180 male samples for gender prediction, 168 left-handed samples and 188 right-handed samples for handedness prediction, and finally, 178 samples for age range prediction. For the test stage, each sample is simultaneously presented to the three systems as shown in Fig. 1. Then, the predictions on gender, handedness and age range are grouped and compared to the ground truth of the considered sample. In the experiments, the classes in Table 1 were grouped to perform a binary classification. For gender prediction, for example, the training samples of classes 1, 2, 3 and 4 were grouped to constitute the Female class, while the remaining classes were grouped to form the Male class.
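The grouping of the three binary decisions can be illustrated by a minimal sketch. The class numbering used here is an assumption, since Table 1 is not reproduced: it is only inferred from the text (classes 1 to 4 are female, classes 3, 4, 7 and 8 are right-handed), and the ordering of the two age ranges within each pair is hypothetical. The function names and the toy samples are illustrative as well.

```python
from itertools import product

# Assumed class numbering (Table 1 is not reproduced): classes 1-4 are female,
# 5-8 male; within each gender, left-handed classes come before right-handed
# ones, and age range A1 comes before A2.
GENDERS = ("F", "M")
HANDS = ("L", "R")
AGES = ("A1", "A2")
CLASS_ID = {combo: i + 1 for i, combo in enumerate(product(GENDERS, HANDS, AGES))}


def combine(gender, hand, age):
    """Group the three binary decisions into one of the 8 global classes."""
    return CLASS_ID[(gender, hand, age)]


def joint_accuracy(predictions, ground_truth):
    """Fraction of samples for which all three traits are predicted correctly."""
    hits = sum(combine(*p) == combine(*t) for p, t in zip(predictions, ground_truth))
    return hits / len(ground_truth)


# Toy data (hypothetical): one age error and one gender error out of 4 samples.
truth = [("F", "R", "A1"), ("M", "L", "A2"), ("F", "L", "A1"), ("M", "R", "A2")]
preds = [("F", "R", "A2"), ("M", "L", "A2"), ("M", "L", "A1"), ("M", "R", "A2")]
print(joint_accuracy(preds, truth))
```

In this toy case each individual trait is correct for at least 3 of the 4 samples, yet only 2 samples are correct on all three traits, which is the effect that penalizes the grouped decision.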

Fig. 1. Multiclass soft-biometrics prediction based on the individual systems.

2.3 Multiclass Soft-Biometrics Prediction Based on One-Against-All SVM

The One-Against-All (OAA) SVM builds “j” binary SVMs to solve a j-class classification problem. Each SVM is dedicated to separating one class from all the other classes. After the training stage, a test sample is presented to all SVMs, which here produces 8 decisions, one per class of interest. The sample is then assigned to the class with the highest decision value, as depicted in Fig. 2.
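The OAA rule can be sketched in a few lines. This is only an illustration of the relabeling and of the final decision; the binary SVMs themselves are not implemented, and the decision values below are hypothetical numbers, not outputs of the system described here.

```python
def one_vs_rest_labels(labels, target):
    """Relabel a multiclass problem as binary: +1 for `target`, -1 for the rest."""
    return [1 if y == target else -1 for y in labels]


def oaa_predict(decision_values):
    """Return the class whose binary machine gives the highest decision value.

    `decision_values` maps a class id to the signed output of that class's SVM.
    """
    return max(decision_values, key=decision_values.get)


# Training labels as seen by the SVM dedicated to class 3 (toy example).
labels = [1, 3, 3, 8, 5]
binary_for_class_3 = one_vs_rest_labels(labels, 3)

# Hypothetical decision values of the 8 binary SVMs for one test sample.
scores = {1: -0.7, 2: -1.2, 3: 0.4, 4: 0.1, 5: -0.9, 6: -1.5, 7: -0.2, 8: -0.3}
predicted = oaa_predict(scores)
print(binary_for_class_3, predicted)
```

Taking the maximum of the raw decision values, rather than counting positive votes, always yields a single class even when several machines, or none of them, respond positively.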

Note that the test samples are common to the two schemes, in order to get a fair comparison of the prediction scores.

Fig. 2. OAA SVM for multiclass soft-biometrics prediction system; F: female, M: male, R: right, L: left, A1: 25–34 years, A2: 35–56 years.

3 Experimental Evaluation

The proposed multiclass prediction schemes are evaluated on the selected IAM sub-set. For performance evaluation, the confusion matrix is used to highlight the precision per class and the global prediction accuracy. Recall that the closer the confusion matrix is to a diagonal matrix, the better the prediction system.
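As an illustration of these two measures, the sketch below builds a confusion matrix in percent and the overall accuracy from lists of true and predicted class ids. It assumes that rows index the true class and that each row is normalized to 100%, which is one common convention and is not confirmed by the text; the labels are toy values.

```python
def confusion_matrix_percent(y_true, y_pred, n_classes):
    """Row-normalized confusion matrix (%), rows = true class, columns = predicted."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        counts[t - 1][p - 1] += 1          # class ids start at 1
    percent = []
    for row in counts:
        total = sum(row)
        percent.append([100.0 * c / total if total else 0.0 for c in row])
    return percent


def overall_accuracy(y_true, y_pred):
    """Percentage of samples whose predicted class equals the true class."""
    return 100.0 * sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy 3-class example (hypothetical labels).
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
matrix = confusion_matrix_percent(y_true, y_pred, 3)
accuracy = overall_accuracy(y_true, y_pred)
for row in matrix:
    print(row)
print(accuracy)
```

Under this convention the diagonal entry of a row is the per-class score, and the off-diagonal entries show which classes a given class is confused with.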

3.1 Prediction Based on Individual Systems

Prediction results expressed through overall accuracies are shown in Fig. 3. The overall accuracy based on the combination of individual decisions, 32.48%, is much lower than those given by the individual predictors. This can be explained by the accumulation of the prediction errors of each binary system when aggregating the final decision. Specifically, the decrease in prediction accuracy is mainly due to the age range prediction system, which gives only a moderate accuracy of about 52.86%. This finding leads us to move towards a multiclass implementation to improve these results.
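The accumulation of errors can be made concrete with a short calculation. It relies on a strong simplifying assumption that is not made in the text, namely that the three binary systems err independently, in which case the combined accuracy is the product of the individual accuracies. Only the two figures reported above are used; the function name is illustrative.

```python
from math import prod


def joint_accuracy_if_independent(accuracies):
    """Combined accuracy when every system must be right and errors are independent."""
    return prod(accuracies)


age_accuracy = 0.5286        # age range system, as reported above
combined_accuracy = 0.3248   # combined decision, as reported above

# Under the independence assumption, the product of the gender and handedness
# accuracies that would be consistent with the two reported figures.
implied_gender_times_hand = combined_accuracy / age_accuracy
print(round(implied_gender_times_hand, 2))
```

Under this assumption the combined accuracy can never exceed the weakest individual accuracy, so an age range system close to 50% caps the grouped decision regardless of how well the two other traits are predicted.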

Fig. 3. Multiclass prediction results of combined individual systems.

3.2 Prediction Based on the OAA SVM

Compared to the first scheme, the OAA implementation improves the overall prediction accuracy to 54.09%. To understand this result, which remains low, we present the confusion matrix in Table 2. This table shows that classes 3, 4, 7 and 8, which correspond to right-handed writers, are poorly predicted. These classes are problematic because adding the age range characteristic, especially for right-handed writers, increases the complexity of the prediction task. More precisely, for right-handed females (classes 3 and 4) the two age ranges do not show any behavioral differences. This means that the handwriting of writers in this category remains almost stable across the age ranges. Similar behavior is observed for right-handed males, who are highly confused with right-handed females, at a rate of 36.36%. To get a more precise interpretation of the soft-biometric behavior, we repeated the OAA test by considering only two soft-biometrics, gender and handedness. This was done by merging the two age-range classes of each gender and handedness combination, which yields a 4-class prediction problem. Experiments report an overall prediction score of 66.17%. Based on this outcome as well as the results of the first scheme, we conclude that age range is the most critical trait. So, to improve the description of age range information, we developed a fuzzy membership model that gives additional knowledge about the writer’s age.

Table 2. Confusion matrix for multiclass prediction (%).
  • Membership degree for age modeling

Age range is modeled through a membership degree that automatically gives information about the affinity of a sample to each of the two age ranges. Inspired by a work carried out in a remote sensing application [15], we define a fuzzy membership degree to the age categories based on the Mahalanobis distance. Specifically, all training samples were grouped into two sets according to the age range. For each set, we calculated the mean and the covariance matrix. Then, for each sample, the fuzzy membership degree to the age range categories is calculated according to the steps presented in Algorithm 1.
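Since the steps of Algorithm 1 are not reproduced in the text, the following is only a sketch of one possible Mahalanobis-based membership, not the authors' algorithm. The per-class mean and covariance and the Mahalanobis distance follow the description above; the normalization (inverse distances scaled to sum to 1) is an assumption, and all function names and the 2-D toy data are hypothetical.

```python
def mean_and_covariance(samples):
    """Sample mean vector and covariance matrix of a list of feature vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^{-1} (x - mean)), without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return sum(d * y for d, y in zip(diff, solve(cov, diff))) ** 0.5


def age_memberships(x, class_stats, eps=1e-12):
    """Assumed rule: membership proportional to the inverse Mahalanobis distance."""
    inverse = [1.0 / (mahalanobis(x, mean, cov) + eps) for mean, cov in class_stats]
    total = sum(inverse)
    return [v / total for v in inverse]


# Toy 2-D training sets standing in for the two age ranges.
range_1 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
range_2 = [[4.0, 4.0], [5.0, 4.0], [4.0, 5.0], [5.0, 5.0]]
stats = [mean_and_covariance(range_1), mean_and_covariance(range_2)]
memberships = age_memberships([1.0, 1.0], stats)
print(memberships)
```

The sketch requires a non-singular covariance matrix. With high-dimensional descriptors and few samples per class, the covariance estimate is typically ill-conditioned, so some form of regularization or dimensionality reduction would be needed before the distance can be computed.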

The membership degree is generated for all samples and concatenated with the GLBP features. Table 3 shows the confusion matrix obtained by adding the fuzzy memberships of age range.

Table 3. Confusion matrix for multiclass prediction with membership degree contribution (%).

As can be seen, the membership degree brings a gain of 7.42%, giving an overall prediction accuracy of about 61.51%. Indeed, improvements of 13.64% and 27.27% are reached for right-handed females of the two age ranges. Moreover, an improvement of 24% is observed for right-handed males of the second age range. However, we observe a negative effect on left-handed writers. For instance, the precision for left-handed females aged between 25 and 34 years drops from 90.91% to 77.27%, which corresponds to 3 additional samples wrongly predicted.

In summary, predicting gender, handedness and age range simultaneously is very challenging, as it is limited by the difficulty of separating sub-categories according to the age characteristic. For this reason, the prediction accuracy drops from 66.17% without the age characteristic to 54.09% with it. All these outcomes reveal that right-handed writers, who represent the majority of writers in the database, are not distinguished according to age. This suggests that the handwriting characteristic that is supposed to evolve over time has generally stagnated for this category of writers.

4 Concluding Remarks

This work addresses the possibility of simultaneously predicting the writer’s gender, handedness and age range from a single analysis of the handwriting, which gives an 8-class prediction problem. In this respect, using a set of data extracted from an English benchmark dataset, we investigate two prediction schemes. In the first scheme, three systems, each designed to predict a single characteristic, are developed. Their predictions are then grouped to give an overall prediction of the three characteristics. The second scheme adopts a multiclass prediction based on the one-against-all implementation of SVM to solve the 8-class prediction problem. Experimental findings reveal that the age characteristic is problematic, as it seems to be stable and unchanged over time, especially for right-handed writers. This is perhaps due to the fact that all contributors are adults, since the dataset doesn’t contain young or very old writers. So, it seems necessary to perform other experiments with wider age range categories to get more conclusive results. Nevertheless, the best multiclass prediction accuracy is about 61.51%. This remains a promising result and can be improved by finding a better modeling of the age characteristic, not necessarily through a classifier but through a model representation such as regression. Other features or classification methods may also deal better with this characteristic.