1 Introduction

Age progression has been an ever-growing field for several decades. It has been applied to the search for missing children [28, 30], entertainment [32], cosmetics [1, 3] and dermatology research [1, 24]. In such applications, artificial facial aging must account for age-related morphological changes as well as changes in skin appearance in order to provide realistic results. The most dramatic change of the face with age is morphological and results from facial growth; it occurs from birth to early adulthood [8]. Another age-related morphological change concerns facial volumes, which vary throughout life, from birth to late adulthood, because of variations in fat distribution [7]. During adulthood, facial skin also undergoes dramatic changes with age, including wrinkling, sagging, and an increase in pigmented irregularities [39]. All these age-related skin features are key to the perception of facial age in adults [5, 9, 22, 27]. Our objective is to build a model of the age-related changes in visual cues that affect age perception on older women's faces, in order to better predict them. As we will see in the next section, many age progression methods change shape and appearance without incorporating specific aging signs such as wrinkles.

For this reason, we propose WOAAM (Wrinkle Oriented Active Appearance Model): we base our work on the Active Appearance Model to simulate facial aging (Section 2.1), to which we add a specific channel to analyze and synthesize wrinkles (Sections 2.2 and 2.3), before explaining the computation of an aging trajectory (Section 2.4). Afterwards, we show images resulting from the aging and rejuvenating of faces (Section 3.1), and finally show, using an age estimation Convolutional Neural Network, that this approach increases/decreases perceived age more precisely than the unmodified Active Appearance Model (Section 3.2).

1.1 Related works

Given the diversity of potential applications of facial aging and the growing variety of computer vision techniques, many methods have been developed in recent decades [10, 21, 36].

Ramanathan and Chellappa [23] propose a craniofacial growth model to analyze shape variations due to age for children under 18 years of age. Shapes are defined by a set of facial landmarks, and a model of facial deformation for aging during childhood is introduced. Faces are then warped according to the deformation model to rejuvenate or age them. This model allows them to estimate age from a face and to simulate the face aging process for children. It only accounts for shape variations, because shape is considered the principal source of variation from birth to adolescence.

When elaborating a model for facial aging during adulthood, in addition to shape, texture changes also need to be considered. The work of Lanitis et al. [16, 17] is the first to use Active Appearance Model on age progression. They use AAM to create a subspace modeling both texture and shape variations of faces. Regression of coordinates from this newly created space on age indicates the direction of facial aging. Finally, they can project a new face in this subspace, translate it in the face aging direction and reconstruct a shape and texture to obtain an aged appearance. Nevertheless, AAM-based age progression is known to produce a blurry texture because wrinkles and spots are never perfectly aligned between people.

Facing this problem, more recent approaches [4, 35] use AAM to produce appearance and shape, and add a post-processing step on appearance to superimpose patches of high-frequency details. While faces produced are plausible, details added are not statistically learned for age progression, as texture patches that contain details are chosen with a similarity measure, and not with respect to a precise age.

Jinli Suo et al. [13] divide faces into several patches to create an And-Or graph containing every patch at five age intervals, spaced over ten years. And nodes represent different parts of the face, whereas Or nodes represent the different realizations of these parts for the population in every age group. They use a first-order Markov chain to model the aging of parts of the face. Wrinkles are annotated and each of their properties (number, length, position...) is modeled by a Poisson distribution. Artificial aging can be created by decomposing a face from age group t into an And-Or graph \(G_{t}\) and sampling the probability \(p(G_{t+1} \mid G_{t})\) with the Gibbs sampling algorithm; the graph \(G_{t+1}\) can then be collapsed to generate a new face.

Another approach creates a prototype [5, 26], an average face computed from faces within a constrained age group and meant to represent the typical features of this group. A younger face can then be warped into the mean shape, and the prototype blended onto the texture of the younger face to make it look older. Prototype-based methods suffer from the same problem as AAM-based methods: averaging faces blurs out every non-aligned high-frequency detail. Tiddeman et al. [33, 34] propose to add a post-processing step to enhance high-frequency information on the average face. They extract fine details with wavelet decomposition [33] for every face and add them to the final average face, with a parameter σ controlling the level of detail to transfer. In [34], they combine wavelet decomposition with a Markov Random Field to regenerate fine details on the average face, which produces more realistic results. Although they show more wrinkles, the final results are neither completely realistic nor completely oriented toward facial aging, as the generated details are not chosen with respect to age.

Shu et al. [29] propose to encode aging pattern of faces in age-group specific dictionaries. Every two neighboring dictionaries are learned jointly taking into consideration extra personalized facial characteristics, e.g. mole, which are invariant in the aging process. However, faces produced are still blurry and no wrinkles appear, even for long-term aging (+ 40 years).

Promising approaches [2, 20, 37, 38, 40] propose to use Deep Neural Networks to produce aged faces.

Antipov et al. [2] propose age conditional Generative Adversarial Network (acGAN). Generative Adversarial Networks (GAN) are known to produce images with sharper textures because the reconstruction metric is not defined in the pixels space, but in the latent variables space. They combine a GAN with a face recognition neural network to preserve identity during reconstruction and aging.

Wang et al. [37, 38] introduce a Recurrent Face Aging (RFA) framework using a Recurrent Neural Network which takes as input a single image and automatically outputs a series of aged faces.

Zhang et al. [40] presented Conditional Adversarial Autoencoder (CAAE). They use an Autoencoder combined with 2 discriminators working on latent variables and output images to impose photo-realistic results. The first discriminator Dz imposes latent variables z to be uniformly distributed to avoid “holes” in the latent space, and thus to produce a smooth age progression. The second discriminator Dimg, inspired by the GAN architecture, discriminates between real images and generated images, and its loss is used to improve the photo-realism of pictures. Age progression is achieved by regressing the latent variables with respect to age.

However, age progression algorithms based on neural networks can in some cases produce unrealistic faces (e.g., the two eyes of a reconstructed face can have different shapes). In addition, many of these algorithms work on low-resolution faces, at most 128 × 128 [2, 20, 40]. As such faces are too small to show fine details, these face aging systems cannot generate faces with fine wrinkles.

In addition to facial aging, many applications aim to estimate age from faces.

Early works have been made by Kwon and Lobo [14, 15]; they computed several distance ratios between landmarks at specific locations on faces to distinguish between 3 age classes, babies, young adults, and seniors.

Lanitis et al. [17, 18] proposed to obtain a compact parametric description of face images using Active Appearance Model and to use this description to estimate ages. Shapes are normalized with Procrustes Analysis and parametrized with Principal Component Analysis. Thereafter, faces are warped in the mean shape before being also parametrized with Principal Component Analysis. Shape and appearance parameters are then concatenated and a third Principal Component Analysis is performed. Finally, the authors tested a range of classifiers and regressions like linear regression, quadratic regression, cubic regression, and artificial neural network.

Guo et al. [11] proposed the Biologically Inspired Features. Face images are first convolved with several Gabor kernels that extract details at specific scales and orientations. The result then undergoes max pooling, which compensates for small translations and small rotations. Finally, the pooled feature is used with Support Vector Machines to estimate age with a low Mean Absolute Error.

Recent uses of deep convolutional neural networks have demonstrated great performance and robustness on big datasets with large variations in pose and illumination. Rothe et al. [25] proposed to use the ConvNet VGG-16 [31] pretrained on the ImageNet database for image classification. Thereafter, they finetuned it with a database of 500k celebrity faces to estimate biological age. Finally, they finetuned it again on the database of the ChaLearn LAP 2015 challenge which they won.

In view of the current state of the art and our constraints, we base our work on the Active Appearance Model to simulate facial aging (Section 2.1), to which we add a specific channel to fully integrate wrinkles (Section 2.2); in this subspace, computed aging trajectories take into account shape, appearance and wrinkles, unlike other methods which use the classic AAM and add a post-processing step to include wrinkles.

Afterwards, we detail how to synthesize aged faces from our new wrinkle oriented AAM (Section 2.3) before explaining the computation of an aging trajectory (Section 2.4).

Finally, we study the quality of our aging system by presenting images resulting from the aging and rejuvenating of faces (Section 3.1). Then, using an age estimation convolutional neural network, we show that this approach increases/decreases perceived age more precisely than the unmodified Active Appearance Model (Section 3.2).

To analyze faces in the light of facial aging, we propose 3 contributions.

The first contribution is the parametrization of each wrinkle, in which shape and texture are represented together by an interpretable vector of length 7. Conversely, such a vector can be used to produce the shape and texture of a wrinkle from its parameters alone.

To represent a group of wrinkles in one facial zone, we propose an approximation of an arbitrary joint probability of n random variables, as the set of every joint probability for every random variable taken two at a time; that is our second contribution.

Our third and last contribution is a new method of sampling for our approximated density mentioned above.

2 Proposed method

We propose a parametric model based on an Active Appearance Model (AAM) able to project a face into a latent space integrating high-frequency facial details such as wrinkles. The face is translated in this latent space, in a direction identified as an aging direction, and reconstructed to synthesize an aged face.

We will firstly describe AAM (Section 2.1), before explaining how to integrate high-frequency details like wrinkles (Section 2.2) and synthesize them (Section 2.3). Finally, we will present how to identify an aging trajectory in the latent space and use it to make a face look younger/older (Section 2.4).

2.1 Active appearance model

Active Appearance Model [6] is a statistical model which creates a subspace modelling appearance and shape variations in an annotated dataset of faces.

For shape, we place landmarks on key points, and a Procrustes analysis is then performed to align the shapes on the mean shape using translation, rotation and uniform scaling (homothety). Appearance information is then computed by warping every image into the mean shape, using each individual annotation.

After that, according to the AAM algorithm, Principal Component Analysis (PCA) is carried out separately for shape and appearance, and a final PCA is made on the concatenation of shape weights and appearance weights. This creates a subspace, which models variations present in the dataset of shape and appearance (see Fig. 1). Because the PCA has the advantage of being perfectly invertible, we can reconstruct a shape and an appearance from any point in the newly created subspace.
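The chain of PCAs described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the random `shapes` and `appearances` arrays are hypothetical stand-ins for aligned landmarks and shape-normalized pixels, and the helper names `fit_pca`, `project` and `reconstruct` are ours. The sketch only shows that, when all components are kept, the projection can be inverted exactly.

```python
import numpy as np

def fit_pca(data):
    """Return (mean, components) with one principal axis per row."""
    mean = data.mean(axis=0)
    # SVD of the centered data gives the principal axes as rows of vt.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt

def project(x, mean, components):
    return (x - mean) @ components.T

def reconstruct(w, mean, components):
    return w @ components + mean

rng = np.random.default_rng(0)
n_faces = 20
shapes = rng.normal(size=(n_faces, 8))        # hypothetical flattened, aligned landmarks
appearances = rng.normal(size=(n_faces, 12))  # hypothetical shape-normalized pixels

# One PCA per channel.
shape_mean, shape_axes = fit_pca(shapes)
app_mean, app_axes = fit_pca(appearances)
w_shape = project(shapes, shape_mean, shape_axes)
w_app = project(appearances, app_mean, app_axes)

# Final PCA on the concatenated weights of both channels.
w_concat = np.hstack([w_shape, w_app])
comb_mean, comb_axes = fit_pca(w_concat)
w_final = project(w_concat, comb_mean, comb_axes)

# Invert the chain: final weights -> channel weights -> shape and appearance.
w_back = reconstruct(w_final, comb_mean, comb_axes)
n_shape = w_shape.shape[1]
shape_back = reconstruct(w_back[:, :n_shape], shape_mean, shape_axes)
app_back = reconstruct(w_back[:, n_shape:], app_mean, app_axes)
```

In practice components would usually be truncated to reduce dimensionality, in which case the reconstruction becomes approximate.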

Fig. 1. AAM scheme

2.2 Analyzing wrinkles

As mentioned in [4], aged faces produced by AAM will always seem blurry. This is because high frequency details, like wrinkles, must be perfectly aligned between faces for the PCA to capture their variations and thus to reconstruct them.

We propose a new framework (Fig. 2), first to represent each wrinkle by a compact vector (Section 2.2.1), and after that to represent all wrinkles on a face by a feature vector which is robust enough for PCA and thus able to still retain wrinkle information after PCA reconstruction (Section 2.2.2).

Fig. 2. Wrinkle Oriented AAM scheme

2.2.1 Wrinkle model

We propose a separate model to analyze the shape and texture variations of wrinkles.

First, wrinkles are annotated with 5 points for each wrinkle. Afterwards, these 5 points are transformed into more explainable pose parameters containing:

  • center (cx, cy) of wrinkle

  • length ℓ, equal to the geodesic distance between the first point and the last point of the annotation

  • angle a in degrees

  • curvature \(\mathcal{C}\), computed by least squares minimization of

    $$ \min_{\mathcal{C}} \| Y - \mathcal{C}X^{2} \|_{2}^{2} $$
    (1)

    with Y (resp. X) the ordinates (resp. abscissas) of the wrinkle centered on the origin, with its first and last points horizontally aligned.

In this way, the shape of a wrinkle is transformed into a vector of length 5, \((c_{x}, c_{y}, \ell, a, \mathcal{C})\).
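As an illustration, the pose parameters can be computed from an annotation as in the following sketch. Several choices here are assumptions of ours rather than details given in the text: the center is taken as the middle annotated point, the angle as the direction of the chord joining the end points, and the geodesic length as the length of the annotated polyline. The curvature uses the closed-form minimizer of Eq. 1, \(\mathcal{C} = \sum x^{2}y / \sum x^{4}\).

```python
import math

def wrinkle_shape_params(points):
    """Turn annotated (x, y) points into (cx, cy, length, angle_deg, curvature)."""
    # Assumption: the center is the middle annotated point.
    cx, cy = points[len(points) // 2]
    # Assumption: the polyline length approximates the geodesic length.
    length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    # Assumption: the angle is the direction of the chord joining the end points.
    (x0, y0), (x1, y1) = points[0], points[-1]
    theta = math.atan2(y1 - y0, x1 - x0)
    # Center on the origin and rotate by -theta so the end points are horizontally aligned.
    cos_t, sin_t = math.cos(-theta), math.sin(-theta)
    xs, ys = [], []
    for x, y in points:
        dx, dy = x - cx, y - cy
        xs.append(dx * cos_t - dy * sin_t)
        ys.append(dx * sin_t + dy * cos_t)
    # Closed-form minimizer of ||Y - C X^2||^2 (Eq. 1).
    curvature = sum(x * x * y for x, y in zip(xs, ys)) / sum(x ** 4 for x in xs)
    return cx, cy, length, math.degrees(theta), curvature

# Five points lying exactly on y = 0.1 x^2, shifted to (10, 20).
pts = [(10 + x, 20 + 0.1 * x * x) for x in (-2, -1, 0, 1, 2)]
params = wrinkle_shape_params(pts)
```

For points lying exactly on a parabola centered on its apex, the fitted curvature equals the parabola coefficient.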

In addition, the texture of each wrinkle is extracted by placing a bounding box around the annotation and keeping only high-frequency information with a Difference of Gaussians (see Fig. 3). We blur the texture with parameter σb = 6 and subtract the blurred result from the untouched texture, which acts as a high-pass filter and extracts wrinkles. This filter has the advantage that the original image can be perfectly reconstructed by simply summing its low- and high-frequency versions. As a wrinkle is high-frequency information, we only keep the high-frequency image and drop the low-frequency version, which contains skin color.
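The following sketch illustrates this high-pass step with NumPy only. The separable blur, the kernel radius of 3σ and the reflective padding are assumptions of ours, and the random `image` is a stand-in for a wrinkle texture patch; any Gaussian blur routine could be substituted.

```python
import numpy as np

def gaussian_kernel(sigma):
    radius = int(3 * sigma)  # assumption: truncate the kernel at 3 sigma
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2-D array, with reflective borders."""
    k = gaussian_kernel(sigma)
    radius = len(k) // 2
    padded = np.pad(image, radius, mode="reflect")
    # Blur along rows, then along columns; "valid" removes the padding again.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

rng = np.random.default_rng(0)
image = rng.random((64, 64))          # hypothetical grayscale patch
low = gaussian_blur(image, sigma=6)   # low frequencies (skin tone)
high = image - low                    # high frequencies (wrinkles)
```

Because `high` is defined as the residual of the blur, `low + high` restores the original patch exactly.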

Fig. 3. High frequencies extraction. Left: original image. Middle: image Gaussian blurred with σb = 6. Right: difference of the left and middle images, extracting high frequencies. The parameter σb is relative to image resolution (i.e., higher resolution implies higher σb), and can be found empirically

After that, the wrinkle appearance is warped into the mean shape and then transformed into pose parameters. A second derivative Lorentzian function (Eq. 2) is fitted on each column, and the average of each fitted parameter over all columns is kept (Fig. 4).

$$ A * \frac{2\sigma\left( 3\left( x-\mu\right)^{2}-\sigma^{2}\right)}{\left( \left( x-\mu\right)^{2}+\sigma^{2}\right)^{3}} + o $$
(2)

where μ and σ are respectively the location and scale of the second derivative Lorentzian function, and A and o are an amplitude and an offset used to adjust the curve. Only A and σ are kept, to characterize respectively the depth and width of wrinkles.
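As a quick check of Eq. 2, the sketch below evaluates the function and compares it with a finite-difference second derivative of the unnormalized Lorentzian \(\sigma / ((x-\mu)^{2} + \sigma^{2})\). The function names are ours, and the fitting step itself (which would require a non-linear least-squares routine) is not shown.

```python
def lorentzian(x, mu, sigma):
    """Unnormalized Lorentzian with location mu and scale sigma."""
    return sigma / ((x - mu) ** 2 + sigma ** 2)

def lorentzian_d2(x, amplitude, mu, sigma, offset):
    """Second derivative Lorentzian of Eq. 2, scaled by A and shifted by o."""
    d2 = (x - mu) ** 2
    return amplitude * 2 * sigma * (3 * d2 - sigma ** 2) / (d2 + sigma ** 2) ** 3 + offset

# Intensity profile of one synthetic column crossing a wrinkle centered at row 10.
column = [lorentzian_d2(x, 1.0, 10.0, 2.0, 0.0) for x in range(21)]
```

The profile reaches its minimum, \(-2A/\sigma^{3} + o\), at x = μ, so a larger A deepens the wrinkle and a larger σ widens and flattens it.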

Fig. 4. Texture fitting example. Left: warped wrinkle; the fitted column is highlighted. Right: in blue, the pixel intensity variations and, in green, the fitting result

Thus, we constructed a model able to transform a wrinkle into a set of 7 interpretable parameters \((c_{x}, c_{y}, \ell, a, \mathcal{C}, A, \sigma)\), 5 for shape and 2 for appearance. As a side note, other pose parameters could have been computed. Taking the curvature parameter \(\mathcal{C}\) as the minimizer of Eq. 1 implicitly models wrinkle shapes as second-order polynomials. For more accurate but more complex modeling, third- or fourth-order polynomials, or any parametric curve, could be used. Also, concerning the appearance pose parameters, our modeling implicitly defines wrinkles as having uniform intensity and width. Instead of taking the average parameters (A, σ), several parameters (Ai, σi) could have been taken at different locations along each wrinkle.

2.2.2 Robust feature

The objective remains to obtain a representation of wrinkles for each face and to analyze them by applying PCA. As people have different numbers of wrinkles, we cannot just compute parameters for each wrinkle in a face and concatenate them to create a fixed-length representation usable with PCA. We have to find a fixed-length representation vector of wrinkles for each face.

We propose to estimate the probability density modeling the structure of wrinkles for each face and each zone.

Using the system introduced in Section 2.2.1, each wrinkle is represented by a vector of length 7. We divide faces into 15 zones (forehead, nasolabial folds, chin, cheeks…), aiming to compute a joint probability P(d1,…, d7) of wrinkles for each zone and each face. Unfortunately, such joint probabilities can have a very large memory footprint, as the memory size of a density grows exponentially with its dimensionality. To circumvent this problem, we propose to approximate an arbitrary joint probability of n random variables by the set of joint probabilities of the random variables taken two at a time (Fig. 5). More precisely, we approximate P(d1,…, dn) by the set {P(d1, d2), P(d1, d3),…, P(dn−1, dn)}. With this approximation, as the number of dimensions n grows, the memory required no longer grows exponentially but quadratically, as \({\Theta}(\frac{n(n-1)}{2})\).
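The saving can be made concrete with a short calculation. The sketch below enumerates the pairs of the 7 wrinkle parameters and compares the number of cells of a full joint histogram with that of the pairwise approximation, assuming 60 bins per dimension (the parameter names in `names` are labels of ours).

```python
from itertools import combinations

names = ["cx", "cy", "length", "angle", "curvature", "A", "sigma"]
pairs = list(combinations(range(len(names)), 2))  # every variable pair, taken once

bins = 60                                   # assumed number of bins per dimension
full_joint_cells = bins ** len(names)       # one 7-D density
pairwise_cells = len(pairs) * bins ** 2     # 21 two-dimensional densities
```

Under this assumption, the pairwise representation needs 75,600 cells per zone instead of roughly 2.8 × 10^12.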

Fig. 5. Ensemble of joint probabilities for the frown lines for one person. With n = 7, there are \(\frac{n(n-1)}{2} = 21\) densities; however, we only show 10 densities for convenience

Joint probabilities are computed by Kernel Density Estimation (KDE) with a Gaussian kernel of standard deviation σkde = 1.5 for densities of size 60 × 60. The parameter σkde controls the tradeoff between an accurate representation of wrinkles (low σkde) and generalization (higher σkde).
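A minimal sketch of such a 2-D density is given below. It assumes that the two wrinkle parameters have already been mapped to grid coordinates in [0, 60), and it rescales the density so that its peak equals 1; both choices, like the function name `kde_2d`, are ours and are not specified in the text.

```python
import numpy as np

def kde_2d(points, size=60, sigma=1.5):
    """Gaussian KDE on a size x size grid.

    points: (n, 2) float array of grid coordinates in [0, size).
    """
    axis = np.arange(size)
    # One 1-D Gaussian bump per point and per axis (the 2-D kernel is separable).
    gx = np.exp(-(axis[None, :] - points[:, 0:1]) ** 2 / (2 * sigma ** 2))
    gy = np.exp(-(axis[None, :] - points[:, 1:2]) ** 2 / (2 * sigma ** 2))
    # density[i, j] = sum over points of gx[k, i] * gy[k, j]
    density = gx.T @ gy
    # Assumption: rescale so that the highest peak equals 1.
    return density / density.max()

# Three hypothetical wrinkles described by one pair of parameters.
points = np.array([[39.0, 41.0], [39.0, 41.0], [10.0, 12.0]])
density = kde_2d(points)
peak = np.unravel_index(density.argmax(), density.shape)
```

A smaller `sigma` keeps the bumps narrow around the observed wrinkles, while a larger one spreads probability to neighbouring parameter values.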

Thus, for one face, we propose to extract a vector containing, for each of the 15 zones:

  • number of wrinkles nw in current zone,

  • average wrinkle,

  • densities computed with KDE on wrinkles where the average wrinkle was subtracted,

and to concatenate all 15 vectors to create the representation of wrinkles in one face.

2.3 Synthesizing wrinkles

We now have a representation of wrinkles that we are able to incorporate in the classic AAM as seen on Fig. 2. PCA being perfectly invertible, we can reconstruct a shape, an appearance and a wrinkles representation vector from any point in the final PCA space. However we must define how to generate wrinkles from our wrinkles representation vector.

We propose a new sampling method able to extract plausible wrinkles from our wrinkles representation vector, which is composed of joint probabilities. The main idea of the algorithm is to build a point iteratively, dimension after dimension, whose projection in each density is above a probability threshold; the threshold is progressively decreased from 0.9 to 0.1 to find the best candidate. The precise algorithm is given in the Appendix.

First of all, peaks are found in P(cx, cy), and the function Sample is called for each peak, with the peak coordinates as parameters px and py, from the peak with the highest probability to the peak with the lowest.

We now present a step-by-step run of the function Sample for a given peak (39,41). A vector p = (39,41,0,0,0,0,0) is created, which will contain the coordinates of the point built by the function (Fig. 6).

Fig. 6. The first two values of p are found by peak detection (the green point)

After that, the function Get_argmax_min extracts two 1-D densities, P(cx = 39, ℓ) and P(cy = 41, ℓ), applies the minimum operator element-wise on them, and finally finds the coordinate i with the highest probability, i.e. i = argmax(min(P(cx = 39, ℓ), P(cy = 41, ℓ))) (Figs. 7 and 8). With p3 = i, if P(p3) is below the reference Pref = 0.9, Pref is decreased to 0.8; otherwise the search for p4 begins with Pref still equal to 0.9 and p = (39,41, p3,0,0,0,0).

Fig. 7. The algorithm has to assign p3 a value that maximizes the probability in P(cx = 39, ℓ) and P(cy = 41, ℓ)

Fig. 8. The two extracted red lines of Fig. 7 are the first two curves at the top; the third curve is the result of the element-wise minimum operator. We find that the maximum is obtained for ℓ = 1

Here, p3 = 1 and P(p3) = 0.52, so Pref is sequentially decreased from 0.9 to 0.8, then 0.7, then 0.6, and finally 0.5, where the value of P(p3) is accepted and the search for p4 begins with Pref equal to 0.5 and p = (39,41,1,0,0,0,0).

For p4, the same processing is applied to the three 1-D densities P(cx = 39, a), P(cy = 41, a) and P(ℓ = 1, a). With p4 found (Figs. 9 and 10), if P(p4) is below the reference Pref = 0.5, then backtracking starts: P(cx = 39, ℓ = 1) and P(cy = 41, ℓ = 1) are set to 0 and a new p3 has to be found; otherwise the search for p5 begins with p = (39,41,1, p4,0,0,0).

Fig. 9. The algorithm has to assign p4 a value that maximizes the probability in P(cx = 39, a), P(cy = 41, a) and P(ℓ = 1, a)

Fig. 10. The three extracted red lines of Fig. 9 are the first three curves at the top; the fourth curve is the result of the element-wise minimum operator. We find that the maximum is obtained for a = 25

As the algorithm keeps running, more and more cases are explored to finally get a point p which maximizes probabilities in densities given the starting peak (px, py), and thus corresponds to a plausible wrinkle.
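The core step of this walk-through can be sketched as follows. This is a simplified illustration, not the algorithm of the Appendix: it only covers the selection of one coordinate and the lowering of the threshold, and omits backtracking. The dictionary layout of `densities`, the helper `accept_threshold` and the toy 4 × 4 densities are assumptions of ours.

```python
import numpy as np

def get_argmax_min(densities, p, k):
    """Choose the value of dimension k given the fixed coordinates p[0..k-1].

    densities[(i, j)] with i < j is a 2-D array indexed as [value_i, value_j].
    Returns the chosen index and its worst-case probability.
    """
    # One 1-D slice per fixed dimension: P(d_i = p[i], d_k = .)
    slices = [densities[(i, k)][p[i], :] for i in range(k)]
    # The element-wise minimum keeps only values plausible in every slice.
    worst_case = np.minimum.reduce(slices)
    best = int(np.argmax(worst_case))
    return best, float(worst_case[best])

def accept_threshold(prob):
    """Lower the reference from 0.9 to 0.1 in steps of 0.1 until prob passes."""
    for level in range(9, 0, -1):
        if prob >= level / 10:
            return level / 10
    return None  # never accepted; a full sampler would backtrack here

# Toy densities over 4 bins; only the rows used in the example are filled.
d02 = np.zeros((4, 4))
d02[1] = [0.2, 0.9, 0.4, 0.1]
d12 = np.zeros((4, 4))
d12[2] = [0.3, 0.52, 0.8, 0.0]
densities = {(0, 2): d02, (1, 2): d12}

value, prob = get_argmax_min(densities, p=[1, 2], k=2)
p_ref = accept_threshold(prob)
```

In this toy case the element-wise minimum is (0.2, 0.52, 0.4, 0.0), so index 1 is selected with probability 0.52 and the reference settles at 0.5, mirroring the values of the example above.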

The wrinkle representation vector contains the number of wrinkles nw to generate, the average wrinkle and the densities. We can create the nw wrinkles parameters by running this algorithm nw times and adding them to the average wrinkle.

Afterwards, the shape and texture of each wrinkle are produced from its parameters (see Section 2.2.1 for the definition of these parameters).

Shape is created from \((c_{x}, c_{y}, \ell, a, \mathcal{C})\) by sampling the polynomial defined by the curvature \(\mathcal{C}\) until the specified geodesic length ℓ is reached. After that, the points composing the shape are rotated according to the angle a, and finally the center (cx, cy) is added to the shape.
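A minimal sketch of this shape synthesis is given below. It assumes that the curve is the parabola y = C x², sampled symmetrically around its apex with a fixed step along x, and that the apex is placed at the center (cx, cy); these choices and the function name are ours.

```python
import math

def synthesize_wrinkle_shape(cx, cy, length, angle_deg, curvature, step=0.01):
    """Return the points of a wrinkle built from its five shape parameters."""
    # Walk outward from the apex until the half-curve reaches length / 2.
    half = [(0.0, 0.0)]
    travelled = 0.0
    x = 0.0
    while travelled < length / 2:
        x_next = x + step
        travelled += math.hypot(step, curvature * (x_next ** 2 - x ** 2))
        x = x_next
        half.append((x, curvature * x * x))
    # Mirror the half-curve to obtain the full symmetric arc.
    curve = [(-px, py) for px, py in reversed(half[1:])] + half
    # Rotate by the wrinkle angle, then translate to the center.
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(cx + px * cos_t - py * sin_t, cy + px * sin_t + py * cos_t)
            for px, py in curve]

# A straight wrinkle (zero curvature) of length 10, rotated to the vertical.
shape = synthesize_wrinkle_shape(5.0, 5.0, 10.0, 90.0, 0.0)
```

The resulting polyline overshoots the requested length by at most a couple of steps, which can be reduced by choosing a smaller `step`.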

Texture is produced by creating an empty image and assigning to each column the values of a second derivative Lorentzian function (see Eq. 2) with parameters (A, σ).

Finally, texture is warped in the newly created shape, for every wrinkle, and wrinkles are subsequently blended by merging the gradient of wrinkles with gradient of the underlying face (Fig. 11).

Fig. 11. Before and after aging wrinkles under the left eye. As we can see, the method neither produces artifacts nor suppresses micro-texture

2.4 Aging trajectory

The final subspace of the system (Fig. 2) models variations of faces in shape, appearance and wrinkles. Original pictures are projected into it, and can be back-projected and perfectly reconstructed as we keep all components. As we can see in Fig. 2, we drop the final PCA of the classic AAM (Fig. 1). Because PCA is unsupervised, it could combine on the same component variations correlated with age and variations uncorrelated with age, which would perturb the subsequent computation of trajectories. We therefore keep the first 3 PCAs, which reduce the dimensionality of our data and thus make the computation of trajectories possible, and drop the last PCA. As a consequence, the PCA weights W in the final subspace correspond to the concatenation of the PCA weights from the 3 channels: \((W_{shape}, W_{appearance}, W_{wrinkles})\). As our objective is to identify variations correlated with perceived age, we perform a regression f of the PCA weights W on the perceived ages \(\mathcal{A}\). We chose a cubic polynomial regression to model facial aging, as this choice gave us the best results:

$$ f(W) = A^{T}W^{3} + B^{T}W^{2} + C^{T}W + D = \mathcal{A} $$
(3)

To make a face with a perceived age a look older/younger by y years, we project it onto the final subspace to obtain weights \(W_{current}\), then apply this formula:

$$ W_{new} = W_{current} + (f^{-1}(a + y) - f^{-1}(a)) $$
(4)

with \(f^{-1}(a) = W_{mean,a}\), and reconstruct a new face from \(W_{new}\). As multiple different faces can match the same age, \(f^{-1}(a)\) returns the average PCA weights \(W_{mean,a}\) for this specific age.

As in [17], we use a Monte Carlo simulation to invert f: we generate a large set W of plausible weights; the corresponding age for each weight w ∈ W is found by applying f(w), and \(f^{-1}\) is a lookup table where, for a given age a, \(f^{-1}(a)\) is the average of all weights \(W_{a}\) ∈ W such that \(f(W_{a}) = a\).
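The inversion and Eq. 4 can be sketched as follows. The uniform sampling box, the rounding of ages to the nearest year, and the toy coefficients (a regression that is linear in the first weight only) are assumptions made for illustration; the coefficients of the actual model come from the regression of Eq. 3.

```python
import numpy as np

def f(W, A, B, C, D):
    """Cubic regression of Eq. 3, with powers applied element-wise to the weights."""
    return (W ** 3) @ A + (W ** 2) @ B + W @ C + D

def build_inverse(A, B, C, D, n_samples=200_000, scale=30.0, seed=0):
    """Monte Carlo lookup table: age (rounded to a year) -> mean weights."""
    rng = np.random.default_rng(seed)
    # Assumption: plausible weights are drawn uniformly in a box around the origin.
    W = rng.uniform(-scale, scale, size=(n_samples, len(C)))
    ages = np.rint(f(W, A, B, C, D)).astype(int)
    return {int(a): W[ages == a].mean(axis=0) for a in np.unique(ages)}

def shift_age(w_current, age, years, f_inv):
    """Eq. 4: move the weights along the average trajectory by `years`."""
    return w_current + (f_inv[age + years] - f_inv[age])

# Toy regression: the age grows by one year per unit of the first weight.
A = np.zeros(2)
B = np.zeros(2)
C = np.array([1.0, 0.0])
D = 50.0
f_inv = build_inverse(A, B, C, D)
w_new = shift_age(np.zeros(2), age=50, years=10, f_inv=f_inv)
```

In this toy setting a face at the origin has a predicted age of 50, and shifting it by 10 years moves its first weight by about 10, so that f returns about 60.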

3 Analyzing results

In this section, we first present examples of aged and rejuvenated faces resulting from our model (Section 3.1), and after that we quantify the correlation between age progressed faces and the perception of these faces by an independent age estimation algorithm (Section 3.2). We show that our system is better correlated with the perception of age than the classic AAM (Section 3.2.2).

Our database consists of pictures of 400 Caucasian women taken in 2014, in frontal pose with a neutral expression and with the same lighting (Fig. 12). All faces are resized to a resolution of 667 × 1000 and annotated with 270 landmarks to locate the eyebrows, eyes, mouth, nose, and facial contour. In addition, 5 landmarks are placed on each wrinkle. Each face has been rated by 30 untrained raters to obtain a precise perceived age; perceived ages in the dataset range from 43 to 85 years with an average of 69 years.

Fig. 12. A subsample of our database with the corresponding perceived ages

3.1 Qualitative results

As seen on Fig. 13, aging changes several known cues on a face [7, 8, 39].

Fig. 13. Face aging results. Left: rejuvenated by 20 years. Middle: original. Right: aged by 20 years

Concerning shape, the size of the mouth is reduced, especially the height of the lower mouth; eyebrows and eyes are both reduced as well, and we can see facial sagging at the lower ends of the jaw.

Concerning appearance, the face globally becomes whiter and yellowish, eyebrows and eyelashes are less present, and the mouth loses its red color as aging progresses.

With aging, more wrinkles appear and existing wrinkles are deeper, wider and longer. As we can see, new wrinkles created by our system are plausibly located with realistic texture.

3.2 Quantitative results

3.2.1 Age estimation

As in [25], we employ a pre-trained VGG-16 CNN [31] to create a face representation less sensitive to pose and illumination: we feed as input a picture in which the face has been cropped, and the representation produced is the output of block5_pool, the last pooling layer.

Afterwards, a Ridge regression is evaluated with 40-fold cross-validation. As seen in Fig. 14, we obtain an R² score of 0.92, and an average absolute error and maximum absolute error of 2.8 years and 13.7 years, respectively. On the very same database, the average human estimates perceived age with an average absolute error and maximum absolute error of 5.5 years and 17.1 years, respectively.
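The evaluation protocol can be sketched with NumPy alone, as below. The random features stand in for the CNN representation, the synthetic ages for the perceived ages, and the regularization strength `alpha=1.0` is an arbitrary choice; none of these values come from our experiments.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression on centered data (intercept not penalized)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def cross_val_predict(X, y, n_folds=40, alpha=1.0, seed=0):
    """Predict every sample with a model trained on the other folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    pred = np.empty(len(y))
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        w, b = ridge_fit(X[train], y[train], alpha)
        pred[test_idx] = X[test_idx] @ w + b
    return pred

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 16))   # stand-in for the CNN features of 400 faces
true_w = rng.normal(size=16)
y = 65 + X @ true_w + rng.normal(scale=0.5, size=400)  # synthetic ages

pred = cross_val_predict(X, y)
mae = np.abs(pred - y).mean()
max_err = np.abs(pred - y).max()
r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

Since every prediction is made by a model that never saw the corresponding face, the reported errors are out-of-sample estimates.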

Fig. 14. Performance of our age estimation algorithm

3.2.2 Comparison with prior works

For this experiment, we compare the perception of aged faces and the perception of rejuvenated faces for the Active Appearance Model (AAM) [17], the Conditional Adversarial Autoencoder (CAAE) [40] and our method, the Wrinkle Oriented Active Appearance Model (WOAAM). To test facial aging, we use faces with a perceived age of less than 60 years and, to test rejuvenation, faces with a perceived age of 70 years or more. For AAM and WOAAM, each face is aged/rejuvenated 2 years at a time, and we compare, on average, the difference between estimated and expected age. For CAAE, each face is aged/rejuvenated 10 years at a time because this method uses 10 discrete labels, and each label accounts for a 10-year interval.

As we can see on Figs. 15 and 16, our method produces faces that are perceived as older than classic AAM and CAAE for aging, and younger for rejuvenating. In other words, a facial aging with WOAAM of y years better reduces the gap between the expected age and the age estimated by the age estimation system than a classic AAM or CAAE. For a 10-year aging period, the estimation of age has increased by 4.9 years for WOAAM, by 3.4 years for AAM, and by 2.9 years for CAAE. Also, for a 10-year rejuvenating period, the estimation of age has decreased by 4 years for WOAAM, by 2.3 years for AAM, and by 1.5 years for CAAE. On average, we improved performance by a factor of 1.5 over AAM, and by a factor of 2.5 over CAAE.

Fig. 15. Perception of faces aged by y years, as a function of y going from 0 to 30 years, for the classic AAM, CAAE, and our Wrinkle Oriented AAM

Fig. 16. Perception of faces rejuvenated by y years, as a function of y going from 0 to -30 years, for the classic AAM, CAAE, and our Wrinkle Oriented AAM

However, we note that for a 10-year period of aging and rejuvenating, the estimated age changes only modestly: by only 4 years and -3.4 years, respectively. This can be explained by the fact that we used only one aging trajectory, and that our model does not consider age spots.

Age spots could be incorporated in our model by creating a dedicated channel in our system, as we did for wrinkles. The pose parameters of each age spot's shape could be computed by fitting an ellipse to the shape and taking the parameters of the fitted ellipse. The pose parameters of each age spot's appearance could be computed by taking its mean RGB color. After that, we could carry out the same processing as for wrinkles. First, we would estimate the probability density modeling the structure of age spots for each face and each zone. Second, we would compute a PCA on our age spot representation vectors and connect its output to the final subspace. Thus, aging trajectories would take into account age spots, in addition to shape, appearance and wrinkles.

4 Conclusion

We presented a new framework to analyze facial aging taking into account shape, appearance and wrinkles. We showed that the system can generate realistic faces for aging and rejuvenating, and such age-progressed faces better influence age perception than with Active Appearance Model or Conditional Adversarial Autoencoder. On average, we demonstrated an improvement factor of 2.0 over prior works.

Nevertheless, the model can be improved in several ways. Firstly, the realism of the faces produced by the model has not been rated in this study. Moreover, we know that facial aging is influenced by environmental factors like sun exposure, alcohol consumption or eating practices [12, 19]. A potential improvement could be to compute multiple trajectories as a function of those factors. In addition, dark spots must be included in the model to increase the accuracy of facial aging. We are confident that dark spots can be integrated in the same way as wrinkles. This is the objective of future research.