Dynamic Distance-Based Shape Features for Gait Recognition

Whytock, Tenika; Belyaev, Alexander; Robertson, Neil M.

doi:10.1007/s10851-014-0501-8


Open access
Published: 04 March 2014

Volume 50, pages 314–326, (2014)

Journal of Mathematical Imaging and Vision


Abstract

We propose a novel skeleton-based approach to gait recognition using our Skeleton Variance Image. The core of our approach consists of employing the screened Poisson equation to construct a family of smooth distance functions associated with a given shape. The screened Poisson distance function approximation nicely absorbs and is relatively stable to shape boundary perturbations which allows us to define a rough shape skeleton. We demonstrate how our Skeleton Variance Image is a powerful gait cycle descriptor leading to a significant improvement over the existing state of the art gait recognition rate.


1 Introduction

Defining and extracting proper object features is a key component of any object recognition pipeline. In this paper, we deal with the problem of gait recognition and propose utilisation of dynamic properties of rough shape skeletons as a gait cycle descriptor. Our contribution is threefold.

We introduce the concept of the Skeleton Variance Image and demonstrate that it stores important information about moving human silhouette figures. We show that the Skeleton Variance Image is a powerful gait cycle descriptor which leads us to a significant improvement over the existing state of the art gait recognition rate.
We demonstrate that smooth distance fields yield robust extraction of rough skeletal structures which promote stability with respect to shape boundary perturbations.
In particular, we demonstrate that solving the so-called screened Poisson equation yields a computationally efficient way to define a family of smooth distance functions with simple and efficient control over their smoothness yielding a skeleton which is significantly more robust compared to the exact distance function.

1.1 Gait Recognition

Gait recognition seeks to identify a person by their walking manner and posture [45]. With applications including surveillance and access control, gait as a behavioural biometric is advantageous over physical biometrics, e.g. fingerprint, since it can be captured without consent or cooperation, unobtrusively, at low resolution and at a distance. Early studies in medicine [48] and psychophysics [15] demonstrate the uniqueness of gait, and gait recognition has developed significantly since the first computer-based approach by Niyogi and Adelson [49] in 1994. In practical terms, we require robustness to real world covariate factors which alter gait appearance and motion and are detrimental to performance, e.g. clothing, bags, shoe type and even elapsed time between capture.

Approaches are split into model-based, model-free and multi-information fusion. Model-based approaches [41, 72] construct gait signatures by modelling or tracking human body segments via anthropometrics [17, 19], model-free approaches [27, 28] disregard human body structure in favour of silhouette-based representations, while multi-information fusion approaches replicate human vision perception by utilising multiple features [40, 69] or biometrics, e.g. face [32, 35]. We currently consider single-feature, single-biometric gait recognition; this is not to say that the performance of our proposed approach could not be boosted with such fusion efforts. We also find that the benefits of low computational cost and image quality insensitivity associated with model-free approaches outweigh the benefits of view and scale invariance associated with model-based approaches.

Considering model-free approaches in more detail, silhouettes commonly serve as the foundation and can be extracted easily from sources such as time-of-flight cameras, Microsoft Kinect and Lidar; colour and texture are rejected, which ensures that no bias towards appearance occurs during gait recognition, given that motion is more consistent over time.

Skeleton-based gait representations, compared to silhouette-based ones, are few and far between, especially those founded on distance functions. The lack of implementations is linked to boundary perturbation sensitivity arising from imperfectly extracted silhouettes and the self-occluding nature of gait. For example, an oversimplified skeleton can be constructed by connecting the silhouette figure centroid to its head and limbs [13], whereas anthropometrics enable a more realistic six-joint skeleton [72]. Both examples utilise a gait cycle's worth of skeletons, which is uneconomical with respect to memory and computational costs; the alternative is to perform the increasingly popular space- and time-normalisation techniques to yield a single, compact 2D gait representation [6, 27, 33, 68, 71, 74].

1.2 Generalised Distance Fields and Distance-Based Shape Features

A generalised distance field is a scalar (vector) field approximating the minimum distance (minimum distance and direction) to a shape with respect to a certain metric. Generalised distance fields and distance-related shape features such as skeletons [9] are widely used in pure mathematics in relation to analysis of Hamilton–Jacobi equations and curvature-driven manifold evolutions [1, 43], computational mathematics [50] in connection to level set methods, computer vision, pattern recognition, and image processing [23–25, 54, 76], shape matching [51], computer graphics and geometric modeling [11, 14, 34, 52, 53], computational mechanics [21], CFD and turbulence modelling [66] (the so-called wall distance, the minimum distance to a solid wall is a key parameter in several turbulence models), medical image processing, analysis, and visualisation [36], and many other areas.

Our gait recognition approach deals with smooth distance fields approximated by solutions to the Poisson equation, their normalised counterparts, and solutions to the screened Poisson equation; results suggest that our approach yields an efficient way of extracting rough shape skeletons associated with the smooth distance fields. Given a sequence of silhouettes representing a gait cycle, the pixel-wise variance of their corresponding skeletons reflects dynamic gait patterns and turns out to be a powerful gait descriptor.

1.3 Validation

Validation of our proposed approach is performed on the largest, latest and most covariate factor rich, standardised publicly available database: TUM Gait from Audio, Image and Depth (GAID). Overall, our representation significantly boosts robustness as we focus on gait motion which is more consistent over time than gait appearance.

2 Smooth Distance Functions

It is well known that the true Euclidean distance function and its corresponding skeleton (medial axis) are very sensitive to small boundary perturbations. In our study, imperfect silhouette segmentation leads to an abundance of boundary noise. As a possible remedy, one can hope that a properly defined smoothed distance function and its corresponding skeleton are less sensitive to segmentation inaccuracies and silhouette boundary noise. Below we exploit a partial differential equation (PDE) approach and consider several PDE-based schemes for generating smooth distance functions.

To the best of our knowledge, the idea of using diffusion-type PDEs for skeleton extraction purposes was first proposed in [64] where the so-called screened Poisson equations were used. While we consider some other PDE-driven schemes for the distance function approximation and skeleton extraction, the screened Poisson equations serve as our main workhorse.

2.1 Screened Poisson Distance Function

Our first approach to constructing a family of smooth distance functions explores an asymptotic relationship between the distance function and solutions to screened Poisson equations [67, Theorem 2.3].

Consider a Dirichlet boundary value problem for a screened Poisson equation in a bounded domain $\varOmega $.

$$\begin{aligned} v-t\Delta v=0\quad \text{ in }\quad \varOmega , \qquad v=1\quad \text{ on }\quad \partial \varOmega , \end{aligned}$$

(1)

where $t$ is a small, positive parameter. Then, as shown in [67],

$$\begin{aligned} \lim \limits _{t\rightarrow 0}-\sqrt{t}\ln [v({\varvec{x}})] = d ({\varvec{x}},\,\partial \varOmega ), \end{aligned}$$

(2)

where $d({\varvec{x}},\partial \varOmega )$ is the distance from ${\varvec{x}}\in \varOmega $ to $\partial \varOmega $. In other words, $d({\varvec{x}},\partial \varOmega )$ is approximated by

$$\begin{aligned} u({\varvec{x}})=-\sqrt{t}\ln v({\varvec{x}}), \end{aligned}$$

(3)

which defines a smooth distance field and parameter $t$ controls the smoothing properties of $u({\varvec{x}})$.

Distance function approximation (3) has been previously employed to extract skeletal structures from grayscale images [64]. An inhomogeneous version of the screened Poisson equation in (1) has been employed [26, 56] to estimate the distance function from a point set. An anisotropic version of (1) was used very recently [14] for tracing geodesics on triangulated surfaces.

It is interesting that the energy corresponding to (1) is a part of the Ambrosio-Tortorelli elliptic regularisation [2] of the Mumford-Shah functional [47]. See also, for example, [57] and [5, Sect. 4.2].

An intuitive explanation of (2) is given in [26] and uses a variant of the so-called Hopf-Cole transformation [20]. Substituting

$$\begin{aligned} v({\varvec{x}})=\exp \left\{ -u({\varvec{x}})/\sqrt{t}\right\} \end{aligned}$$

(4)

in (1) yields

$$\begin{aligned} \frac{\partial v}{\partial x_i}=-\frac{v}{\sqrt{t}}\frac{\partial u}{\partial x_i}, \qquad \frac{\partial ^2 v}{\partial x_i^2}=\frac{v}{t} \left| \frac{\partial u}{\partial x_i}\right| ^2 -\frac{v}{\sqrt{t}}\frac{\partial ^2 u}{\partial x_i^2}. \end{aligned}$$

Thus (1) can be rewritten as

$$\begin{aligned} 0=v-t\Delta v =v\left[ \left( 1-|\nabla u|^2\right) +\sqrt{t}\Delta u\right] . \end{aligned}$$

(5)

This gives a regularised eikonal equation for ${u}({\varvec{x}})$

$$\begin{aligned} \left( 1-|\nabla u|^2\right) +\sqrt{t}\,\Delta u=0 \text{ in }\varOmega , \quad u=0\text{ on }\partial \varOmega . \end{aligned}$$

(6)

Thus it is natural to expect that $u({\varvec{x}})$, the solution to (6), approximates the true distance function $d({\varvec{x}},\partial \varOmega )$ which satisfies the eikonal equation

$$\begin{aligned} |\nabla d|^2=1 \quad \text{ in }\quad \varOmega , \qquad d=0\quad \text{ on }\quad \partial \varOmega . \end{aligned}$$

(7)

Note that (1) is linear and can therefore be easily and efficiently solved numerically by using a sparse system of linear equations. Figure 1 shows the graphs of smooth distance functions (3) for various values of smoothing parameter $t$.
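The construction in (1) and (3) can be sketched on a pixel grid as follows. This is a minimal illustration, not the authors' implementation: it assumes unit pixel spacing, a five-point Laplacian, background pixels acting as the boundary where $v=1$, and a dense solver for brevity (a sparse solver would be the natural choice for full-size silhouettes). The function name `screened_poisson_distance` is hypothetical.

```python
import numpy as np

def screened_poisson_distance(mask, t):
    """Smooth distance u = -sqrt(t) * ln(v), where v - t * Laplacian(v) = 0
    on the True pixels of `mask` and v = 1 on every background pixel."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with background so every foreground pixel has four neighbours.
    padded = np.pad(mask, 1, constant_values=False)
    inside = np.argwhere(padded)              # row-major list of unknowns
    n = len(inside)
    idx = -np.ones(padded.shape, dtype=int)   # -1 marks boundary pixels
    idx[padded] = np.arange(n)

    A = np.zeros((n, n))
    b = np.zeros(n)
    for k, (r, c) in enumerate(inside):
        A[k, k] = 1.0 + 4.0 * t               # v - t * (sum of neighbours - 4 v)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            j = idx[r + dr, c + dc]
            if j >= 0:
                A[k, j] = -t                  # unknown neighbour
            else:
                b[k] += t                     # boundary neighbour with v = 1
    v = np.linalg.solve(A, b)

    u = np.zeros(padded.shape)
    u[padded] = -np.sqrt(t) * np.log(v)       # Eq. (3)
    return u[1:-1, 1:-1]

# Example: a 21 x 21 square; u grows from the border towards the centre.
u = screened_poisson_distance(np.ones((21, 21), dtype=bool), t=0.5)
print(u[10, 0], u[10, 10])
```

Smaller values of `t` give a field closer to the true distance, while larger values smooth it more strongly, which is the behaviour the smoothing parameter is meant to control.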

2.2 Screened Poisson Distance and Mean Curvature Flow

In the two-dimensional case an interesting relationship between $v({\varvec{x}})$, the solution to (1), and its level set curvature was derived in [64] and utilised for grayscale image skeletonisation purposes. Below we informally extend the relationship to the multidimensional case.

Let $\partial \varOmega $ be oriented by its inner normal $ {\varvec{n}}$. It is not difficult to show [22, Appendix B] that the minus Laplacian of the distance function $d({\varvec{x}},\partial \varOmega )$ yields the mean curvature $H({\varvec{x}})$ of the distance function level set passing through ${\varvec{x}}$

$$\begin{aligned} \Delta d = -H, \end{aligned}$$

where we assume that the level set of $d({\varvec{x}},\partial \varOmega )$ passing through ${\varvec{x}}$ is smooth at ${\varvec{x}}$.

Since $u({\varvec{x}})$ tends to $d({\varvec{x}},\partial \varOmega )$ as $t\rightarrow 0$, it is natural to expect that $\Delta u$ is close to $\Delta d$ for small $t$ values. Thus (5) implies that

$$\begin{aligned} |\nabla u|^2\approx 1-H\sqrt{t}\,,\quad \text{ as }\quad t\rightarrow 0. \end{aligned}$$

Now taking into account that $t|\nabla v|^2=v^2|\nabla u|^2$, we arrive at

$$\begin{aligned} \left| \frac{\partial v}{\partial {\varvec{n}}}\right| ^2 \approx \frac{1}{t}\left( 1-H\sqrt{t}\right) v^2,\quad \text{ as }\quad t\rightarrow 0. \end{aligned}$$

Therefore, since $v({\varvec{x}})$ is decreasing in the direction of $ {\varvec{n}}$, we have

$$\begin{aligned} v({\varvec{x}})\approx -\sqrt{t}\left( 1+H\sqrt{t}/2\right) \frac{\partial v}{\partial {\varvec{n}}}\,, \quad \text{ as }\quad t\rightarrow 0. \end{aligned}$$

(8)

In the two-dimensional case, a much more accurate asymptotic relation was derived in [47, Appendix 3, Theorem B].

Similar to [64, Section 2], (8) can be linked to a surface evolution with the normal speed component equal to

$$\begin{aligned} 2/\sqrt{t}+H. \end{aligned}$$

While (8) is not directly related to our study of gait recognition problems, it supports works on multidimensional shape symmetries [63] and may also be useful for investigating properties of more general diffuse distance fields [62].
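Relation (8) can be sanity-checked on a disc, where the solution to (1) is radial. The sketch below assumes the standard closed form $v(r)=I_0(r/\sqrt{t})/I_0(R/\sqrt{t})$ for a disc of radius $R$ (with $I_0$ the modified Bessel function) and uses $H=1/r$ for the circular level set of radius $r$, which follows from $\Delta d=-H$ with $d=R-r$. The helper name `check_relation_8` and the finite-difference step are illustrative choices.

```python
import numpy as np

def check_relation_8(radius, t, h=1e-4):
    """Compare v / (-sqrt(t) * dv/dn) with 1 + H * sqrt(t) / 2 on a circle.

    For the radial solution the left-hand side reduces to I0(z) / I1(z)
    with z = radius / sqrt(t); I1 is obtained as the derivative of I0.
    """
    z = np.atleast_1d(radius / np.sqrt(t))
    i0 = np.i0(z)[0]
    i1 = (np.i0(z + h)[0] - np.i0(z - h)[0]) / (2.0 * h)  # I0'(z) = I1(z)
    lhs = i0 / i1
    rhs = 1.0 + np.sqrt(t) / (2.0 * radius)               # H = 1 / radius
    return float(lhs), float(rhs)

for t in (0.04, 0.0025):
    lhs, rhs = check_relation_8(1.0, t)
    print(t, lhs, rhs, abs(lhs - rhs))
```

The gap between the two sides shrinks as $t$ decreases, in line with the asymptotic nature of (8).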

2.3 Poisson and Normalised Poisson Distance Functions

Now let us consider a simpler approach to smooth distance function generation. The approach is based on solving a Dirichlet boundary value problem for a Poisson equation

$$\begin{aligned} \Delta \varphi =-1 \quad \text{ in }\quad \varOmega , \qquad \varphi =0\quad \text{ on }\quad \partial \varOmega . \end{aligned}$$

(9)

This problem serves as a basic mathematical model describing Brownian motion of particles which are born at a constant rate inside $\varOmega $ and die on $\partial \varOmega $. The solution to (9), the so-called Poisson distance function, is proportional to the particle density and therefore can be considered as a smooth approximation of the true distance function $d({\varvec{x}},\partial \varOmega )$ from $\partial \varOmega $.

Poisson distance functions have been employed for action recognition [24, 25], skeleton extraction [4], turbulence modelling applications [65], and geometric de-featuring purposes [70].

Although the Poisson distance function $\varphi ({\varvec{x}})$ does not deliver an accurate approximation of the distance function $d({\varvec{x}},\partial \varOmega )$, a simple normalisation procedure applied to $\varphi ({\varvec{x}})$ can significantly improve the approximation of $d({\varvec{x}},\partial \varOmega )$ near $\partial \varOmega $. Namely, following [61, 65] let us introduce

$$\begin{aligned} \psi ({\varvec{x}})=-|\nabla \varphi |+\sqrt{|\nabla \varphi |^2+2\varphi }. \end{aligned}$$

(10)

Normalisation procedure (10) is inspired by the fact that in the one-dimensional case (9) and (10) reconstruct the distance function precisely [65]. It is straightforward to rewrite (10) as

$$\begin{aligned} \psi ({\varvec{x}})=\frac{2\varphi }{\sqrt{|\nabla \varphi |^2+2\varphi }+|\nabla \varphi |}, \end{aligned}$$

and check that

$$\begin{aligned} \psi =0\quad \text{ and }\quad \partial \psi /\partial {\varvec{n}}=1 \quad \text{ on }\quad \partial \varOmega , \end{aligned}$$

(11)

where $ {\varvec{n}}$ is the inner unit normal to $\partial \varOmega $ (since $\psi $ is positive inside $\varOmega $ and vanishes on $\partial \varOmega $, it increases in the inward direction). Thus $\psi ({\varvec{x}})$ approximates $d({\varvec{x}},\partial \varOmega )$ very accurately near $\partial \varOmega $. It seems that the second normalisation condition in (11) has not been noticed before.
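The one-dimensional exactness of the normalisation (10) can be checked numerically. The sketch below solves (9) on an interval with a standard second-difference matrix and applies (10); the interval length, grid size and function name are illustrative assumptions.

```python
import numpy as np

def normalised_poisson_1d(length=2.0, n=101):
    """Solve phi'' = -1 on [0, length] with phi = 0 at both ends, then
    apply psi = -|phi'| + sqrt(|phi'|^2 + 2 * phi) from Eq. (10)."""
    x = np.linspace(0.0, length, n)
    h = x[1] - x[0]
    m = n - 2                                   # number of interior nodes
    # -phi'' discretised by the usual tridiagonal second-difference matrix.
    A = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
    phi = np.zeros(n)
    phi[1:-1] = np.linalg.solve(A, np.ones(m))
    grad = np.gradient(phi, h, edge_order=2)    # exact for a quadratic phi
    psi = -np.abs(grad) + np.sqrt(grad**2 + 2.0 * phi)
    return x, phi, psi

x, phi, psi = normalised_poisson_1d()
true_distance = np.minimum(x, 2.0 - x)
print(np.max(np.abs(phi - true_distance)))  # Poisson distance: visibly off
print(np.max(np.abs(psi - true_distance)))  # normalised: agrees to rounding
```

In this setting $\varphi (x)=x(L-x)/2$, so $|\varphi '|^2+2\varphi =L^2/4$ is constant and $\psi $ collapses to $\min (x, L-x)$.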

Of course there are many other possibilities to achieve a similar effect. For example, one can apply the normalisation procedure considered for geometric modelling purposes [55, 58]

$$\begin{aligned} \frac{\varphi ({\varvec{x}})}{\sqrt{|\nabla \varphi ({\varvec{x}})|^2+\varphi ({\varvec{x}})^2}}. \end{aligned}$$

While this and similar normalisation schemes lead to accurate approximations of the distance function near the boundary, they fail to achieve a satisfactory behavior far from the boundary. In contrast, as demonstrated by Fig. 2, (9) combined with (10) generates a good approximation of the distance function.

Both the Poisson and normalised Poisson distance functions have lower computational costs compared to the screened Poisson distance functions (1, 3). On the other hand, the latter provides us with an ability to control the amount of smoothing by tuning parameter $t$ in (1). For example, as shown in the right of Fig. 2, for a sufficiently small $t$, the screened Poisson distance function delivers a better approximation of true distance $d({\varvec{x}},\partial \varOmega )$ than the normalised Poisson and Poisson distance functions.

Figure 3 demonstrates a comparison of the Poisson, normalised Poisson, and screened Poisson distance functions. While all the smoothed distance functions demonstrate excellent properties in absorbing boundary perturbations, as shown later in this paper, the possibility to control smoothing properties of distance function approximations is vital for significant improvements in gait recognition.

2.4 $p$-Laplacian Distance Functions and $L_p\!$ Distance Fields

One more approach to approximate the distance function uses a quasi-linear generalisation of the Poisson equation. Namely, let us consider a Dirichlet boundary value problem for the $p$-Laplacian

$$\begin{aligned} \mathop {\mathrm {div}}\left( \left| \nabla \varphi _p\right| ^{p-2}\nabla \varphi _p\right) =-1 \text{ in } \varOmega ,\quad \varphi _p=0 \text{ on } \partial \varOmega , \end{aligned}$$

(12)

with $1\le p<\infty $. Then it can be shown [8, 37] that

$$\begin{aligned} \varphi _p({\varvec{x}})\rightarrow d({\varvec{x}},\partial \varOmega )\quad \text{ as }\quad p\rightarrow \infty . \end{aligned}$$

Moreover, as demonstrated in [8], for arbitrary $m>1$, this convergence is strong in the Sobolev space $W^{1,m}(\varOmega )$.

While $\varphi _p$ for sufficiently large $p$ delivers an accurate approximation of the distance function (see Fig. 4 for a simple example), achieving an accurate numerical approximation of the solution to (12) is a complex task compared with the linear PDE problems considered before.
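In one dimension problem (12) integrates in closed form, which gives a quick way to see the convergence to the distance function as $p$ grows. The formula used below, $\varphi _p(x)=\frac{1}{q}\left[ (L/2)^q-|x-L/2|^q\right] $ with $q=p/(p-1)$ on the interval $[0,L]$, is our own worked example (it is not taken from the cited references) and assumes $p\ge 2$.

```python
import numpy as np

def p_laplacian_distance_1d(x, length, p):
    """Closed-form solution of (|phi'|^(p-2) phi')' = -1 on [0, length]
    with phi = 0 at both ends (one-dimensional case of Eq. (12))."""
    q = p / (p - 1.0)
    half = length / 2.0
    return (half**q - np.abs(x - half)**q) / q

x = np.linspace(0.0, 2.0, 201)
true_distance = np.minimum(x, 2.0 - x)
for p in (2, 10, 100):
    err = np.max(np.abs(p_laplacian_distance_1d(x, 2.0, p) - true_distance))
    print(p, err)
```

For $p=2$ the formula reduces to the Poisson distance function of (9), and on this interval the maximum error is attained at the midpoint and decays like $1/p$.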

It is also worth mentioning that the so-called $L_p$-distance fields introduced recently in [7] also allow the user to control an amount of smoothing added to the true distance function. However, according to our numerical experiments, the screened Poisson distance functions tend to distribute smoothing uniformly over the domain, while the $L_p$-distance fields apply less smoothing near the boundary and more smoothing far from the boundary.

3 Rough Skeletons

After the pioneering work of Blum [9], skeleton-based shape representations have been widely utilised for the analysis and processing of static and dynamic 2D and 3D shapes [59]. Strong correlations between medial shape structures and perceptual shape organisation [38, 39] remain a subject of intensive research [3].

While the classical medial axis [9] reflects shape organisation, its main drawback is high sensitivity to small-scale boundary perturbations. As the medial axis of an object is closely connected to the distance function from the boundary of the object (the medial axis can be defined as the set of singularities of the distance function), it is natural to expect that a smooth distance function may lead to a more robust shape skeletonisation scheme. Indeed attempts of using smooth distance functions for better (less sensitive) skeletonisation have been made, for example in [4, 18, 25, 64].

Our approach to shape skeletonisation is conceptually similar to those developed in [4, 25, 64], but instead of using second-order derivative operators (e.g. Laplacian, curvature, or curvature-based operators) as employed in these papers, we compute the squared gradient $|\nabla u|^2$ of a smooth distance function $u({\varvec{x}})$. We choose the gradient due to the following observation. Assume that the boundary $C_0=\partial \varOmega $ of $\varOmega $ is oriented by its inner normal $ {\varvec{n}}$ and consider offset curves $C_\rho $ obtained from $C_0$ by shifting each point of $C_0$ in the direction of $ {\varvec{n}}$ by distance $\rho $. Then the skeleton $S$ of $\varOmega $ is formed by the first self-intersections of $C_\rho $, as $\rho $ increases. One can easily see that these self-intersections move along $S$ faster than the offset curves $C_\rho $ move along their normals. Namely, if curve $C_\rho $ moves with unit speed, then its self-intersection point moves along $S$ with speed equal to $1/\sin \theta $, where $\theta $ is the angle between $C_\rho $ and $S$. This means that the rate of change of the distance function $d({\varvec{x}},\partial \varOmega )$ along $S$ at that self-intersection point $ {\varvec{x}}\in S$ is given by $\sin \theta $. Further, if $\theta $ is small at $ {\varvec{x}}\in S$ (and therefore $\sin \theta $ is small as well), then the orientation normals at the boundary points corresponding to ${\varvec{x}}$ have almost opposite directions and the part of $S$ near ${\varvec{x}}$ reflects important bilateral symmetry properties of $\partial \varOmega $. Figure 5 illustrates these simple ideas.

Figure 6 demonstrates the advantage of the smoothed distance function gradient for extracting a fuzzy skeleton of a given shape. In practice we use the standard $3\times 3$ Sobel kernels to estimate the gradient. Figure 7 demonstrates how the squared gradient map $|\nabla u({\varvec{x}})|^2$ depends on smoothing parameter $t$ in (1) and (3).

A rough skeleton is obtained from the fuzzy skeleton by thresholding. In practice, as seen in the left image of Fig. 8, it also detects the silhouette boundary which is subsequently removed by rejecting a small number of $u({\varvec{x}})$ boundary layers. The resulting skeleton is considerably less sensitive to boundary noise than the true medial axis.

Note that in contrast to the classical medial axis, our rough skeleton is not a deformation retract of the original shape. For example, the rough skeleton shown in Fig. 8 contains gaps while the silhouette is a simply connected 2D shape. If necessary, Canny’s hysteresis thresholding procedure [12] can be utilised to remove such gaps.
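The skeletonisation steps described in this section can be sketched as follows. The sketch assumes that skeleton pixels are those where the squared gradient falls below a threshold (consistent with the $\sin \theta $ argument above) and that boundary layers are rejected by discarding pixels with small $u$; the threshold values and function names are hypothetical, and the Sobel kernels are scaled by $1/8$ so that a unit-slope ramp has unit gradient.

```python
import numpy as np

def sobel_gradient(u):
    """Estimate (du/dx, du/dy) with 3x3 Sobel kernels scaled by 1/8."""
    p = np.pad(np.asarray(u, dtype=float), 1, mode="edge")
    # Right column minus left column, rows weighted 1, 2, 1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2]) / 8.0
    # Bottom row minus top row, columns weighted 1, 2, 1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:]) / 8.0
    return gx, gy

def rough_skeleton(u, grad_threshold=0.5, boundary_depth=1.5):
    """Threshold the fuzzy skeleton |grad u|^2 and drop boundary layers."""
    gx, gy = sobel_gradient(u)
    fuzzy = gx**2 + gy**2                  # squared gradient map
    low_gradient = fuzzy < grad_threshold  # candidate skeleton pixels
    away_from_boundary = u > boundary_depth
    return low_gradient & away_from_boundary

# Example: a horizontal strip whose distance field is a ridge along row 5.
rows = np.arange(11)
profile = np.minimum(rows + 1, 11 - rows).astype(float)
u = np.repeat(profile[:, None], 20, axis=1)
print(rough_skeleton(u).astype(int))
```

In the strip example the first and last rows also have a low estimated gradient, which mimics the boundary detection mentioned above; the `boundary_depth` test removes them and leaves only the central ridge.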

4 Skeleton Variance Image

Over a complete gait cycle, skeleton motion can be extracted by considering how pixel intensity values vary during the skeleton sequence; this prompts our primary contribution—Skeleton Variance Image (SVIM) gait representation.

While in silhouette form, we perform (1) size normalisation to ensure constant height silhouettes and (2) horizontal alignment to centre silhouettes with the centroid of top 10 % figure height as a reference. We perform time-normalisation post skeleton construction to condense the skeleton sequence into a single, compact 2D gait representation by computing the pixel-wise variance. The resulting representation, seen in the rightmost column of Fig. 9, enables visualisation of high and low pixel intensity values corresponding to higher and lower degrees of body motion respectively.
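A minimal sketch of the alignment and time-normalisation steps is given below. It assumes binary silhouettes that have already been size-normalised to a common height, uses a circular shift for the horizontal alignment (adequate only when the figure does not touch the image sides), and treats the skeleton frames as arrays of equal shape; the function names are hypothetical.

```python
import numpy as np

def centre_on_head(silhouette, top_fraction=0.10):
    """Shift a binary silhouette horizontally so that the centroid of the
    top `top_fraction` of the figure height lies on the centre column."""
    silhouette = np.asarray(silhouette, dtype=bool)
    rows = np.flatnonzero(silhouette.any(axis=1))
    top, bottom = rows[0], rows[-1]
    n_top = max(1, int(round(top_fraction * (bottom - top + 1))))
    cols = np.nonzero(silhouette[top:top + n_top])[1]
    shift = silhouette.shape[1] // 2 - int(round(cols.mean()))
    return np.roll(silhouette, shift, axis=1)

def skeleton_variance_image(skeletons):
    """Pixel-wise variance over a gait cycle of skeleton frames (SVIM)."""
    return np.var(np.stack(skeletons).astype(float), axis=0)

# Example: a pixel present in one of two frames has variance 0.25.
frame_a = np.array([[1, 1], [0, 0]])
frame_b = np.array([[1, 0], [0, 0]])
print(skeleton_variance_image([frame_a, frame_b]))
```

Pixels that are constant across the cycle get zero variance, so the resulting image emphasises the parts of the skeleton that move.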

5 Experimental Procedure

5.1 Validation

The TUM GAID database [28, 30], seen in Fig. 10, is one of the latest, largest and most covariate-factor-rich databases and the first to utilise depth images extracted with the Microsoft Kinect. The database freely provides depth images which have been converted into silhouettes, thus enabling research to concentrate on the gait recognition problem as opposed to data preprocessing problems such as silhouette segmentation. Training sequences are based on 155 persons and contain four normal, i.e. covariate-factor-free, sequences; test sequences contain two sequences each for: normal (N), carrying a bag (B, consistent across the database) and shoes, i.e. wearing overshoe covers (S); see Fig. 10. Time-based test sequences are also captured three months later and contain 16 persons in two sequences each for: time and normal (TN), time and carrying a bag (TB) and time and shoes (TS); these sequences contain coupled covariate factors, i.e. time and clothing, given the change in weather season. Depth-based silhouettes are chosen over RGB-based silhouettes [30] given their cleaner appearance due to ease of extraction. We focus on persons captured from side views given the greater visibility of dynamic limb motion, which is associated with a higher discriminative nature and greater robustness [44]; this is commonplace in gait recognition. Viewpoint is another common covariate factor, often addressed separately; however, the TUM GAID database considers side views only (see Fig. 10).

5.2 Baseline and Comparable Representations

The Gait Energy Image (GEI) [27], seen in the leftmost column of Fig. 9, is our baseline and applies the same procedures outlined in Sect. 4, but uses the pixel-wise mean and silhouettes in place of the pixel-wise variance and skeletons respectively. This appearance-based representation permits visualisation of static and dynamic information corresponding to high and low pixel intensity values respectively. We also present two new related representations for enhanced comparison: Skeleton Energy Image (SEIM) and Gait Variance Image (GVI), seen in the middle left and middle right columns of Fig. 9 respectively. The SEIM and GVI are analogous to the SVIM and GEI respectively, where the pixel-wise mean replaces the pixel-wise variance and vice versa. These representations permit equal comparison of appearance-based (GEI and SEIM) vs. motion-based (GVI and SVIM) representations as well as silhouette (GEI and GVI) vs. skeleton (SEIM and SVIM) representations.
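The four representations differ only in the input (silhouette or skeleton) and the temporal statistic (mean or variance), which a short sketch makes explicit. It assumes both sequences are already aligned and of equal frame size; the function name is hypothetical.

```python
import numpy as np

def gait_representations(silhouettes, skeletons):
    """Return the 2 x 2 grid of representations compared in this section."""
    sil = np.stack(silhouettes).astype(float)
    ske = np.stack(skeletons).astype(float)
    return {
        "GEI": sil.mean(axis=0),   # silhouette, appearance-based (mean)
        "GVI": sil.var(axis=0),    # silhouette, motion-based (variance)
        "SEIM": ske.mean(axis=0),  # skeleton, appearance-based (mean)
        "SVIM": ske.var(axis=0),   # skeleton, motion-based (variance)
    }
```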

5.3 Distance Function

We compare the behaviour of distance functions extracted via the Poisson, normalised Poisson and screened Poisson schemes.

5.4 Smoothing Parameter

Since smoothing parameter $t$ dictates the skeleton thickness produced by the screened Poisson distance function, as demonstrated in Fig. 7, we choose a broad range of values to evaluate its effect on gait recognition: small values {t = 0.1, 0.5, 5} correspond to a thinner, more traditional looking skeleton, whereas large values {t = 10–90 in steps of 10} correspond to a thicker skeleton tending towards a silhouette appearance.

5.5 Dimensionality Reduction and Classification

The GEI, GVI, SEIM and SVIM serve as a means to represent gait ($128\times 178$, typical for the TUM GAID database [31]) and describe gait when reshaped to a 1D feature vector (22784D). Dimensionality reduction transforms the feature vector into a lower dimensional space (154D) by maximising variance and class separability with Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) respectively [42]. Nearest Neighbour classification utilises the cosine distance measure [31], where rank 1 and rank 5 results are presented, demonstrating the correct identity occurring first or in the top five matches respectively. This dimensionality reduction and classification combination is commonly employed by approaches utilising single, compact 2D gait representations like our baseline [27], and is advantageous in situations where training sequences are few.
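The PCA, LDA and cosine nearest-neighbour pipeline can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions rather than the exact procedure used in the experiments: the intermediate PCA dimension `n_pca` is a hypothetical parameter (the text only specifies the final 154D space), LDA is solved through a pseudo-inverse of the within-class scatter, and rank-$k$ is counted over distinct gallery identities.

```python
import numpy as np

def fit_pca_lda(X, y, n_pca, n_lda):
    """Return (mean, W) so that (x - mean) @ W is the reduced feature."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA: leading right singular vectors of the centred data.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    w_pca = vt[:n_pca].T
    Z = Xc @ w_pca
    d = Z.shape[1]
    sw = np.zeros((d, d))                     # within-class scatter
    sb = np.zeros((d, d))                     # between-class scatter
    for c in np.unique(y):
        zc = Z[y == c]
        mc = zc.mean(axis=0)
        sw += (zc - mc).T @ (zc - mc)
        sb += len(zc) * np.outer(mc, mc)      # global mean of Z is zero
    evals, evecs = np.linalg.eig(np.linalg.pinv(sw) @ sb)
    order = np.argsort(-evals.real)[:n_lda]
    return mean, w_pca @ evecs[:, order].real

def rank_k_accuracy(gallery, gallery_ids, probes, probe_ids, k):
    """Fraction of probes whose identity is among the k nearest distinct
    gallery identities under cosine similarity."""
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    p = probes / np.linalg.norm(probes, axis=1, keepdims=True)
    order = np.argsort(-(p @ g.T), axis=1)    # most similar first
    hits = 0
    for i, row in enumerate(order):
        ranked = []
        for j in row:
            if gallery_ids[j] not in ranked:
                ranked.append(gallery_ids[j])
            if len(ranked) == k:
                break
        hits += probe_ids[i] in ranked
    return hits / len(probes)
```

With the setting described above, each row of `X` would be a flattened 22784D representation and `n_lda` would be 154, i.e. one less than the 155 training identities.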

6 Results and Discussion

6.1 Smoothing Parameter $t$ Behaviour

We first consider how $t$ affects the performance of the SEIM and SVIM representations seen in Fig. 11. Across covariate factors we can see that only the normal (N) and shoe (S) sequences behave consistently across $t$, which may be attributed to their similarity to training sequences; the remaining sequences (B, TN, TB, TS) contain significant silhouette-based appearance differences compared to training sequences, and the resulting skeleton variations cause covariate factors to prefer varying $t$. Given how differently covariate factors affect silhouettes and therefore our skeletons, inconsistency in the preferred $t$ can be seen as advantageous as we could more effectively target covariate factors, especially should covariate factor detection be applied as a future pre-processing stage.

We are currently interested in the weighted average performance as we desire a $t$ which is most effective over a varying range of covariate factors. The first thing to notice is the significant performance jump, regardless of covariate factor, from $t=0.1$ to $t=0.5$, which is attributed to $t=0.1$ producing an overly thin skeleton that risks considerable fragmentation, especially at branch points. In terms of weighted average performance, we can see a subtle trend where the SVIM and SEIM decrease and increase respectively with larger $t$ values; this is linked to the SVIM preferring a thinner, more traditional looking skeleton and the SEIM preferring a thicker skeleton tending towards a silhouette appearance. We therefore suggest a small value ($t = 5$) for the SVIM; however, as pointed out to us by one of the reviewers of the paper, scaling the image $\varOmega $ by a factor $s$ (while keeping its resolution fixed) and assuming that the solution $v({\varvec{x}})$ to (1) remains invariant leads to scaling the smoothing parameter $t$ by $s^2$. This means that in our current model no optimal $t$ exists if the image size and resolution are not specified; note that $t$ may also be database dependent.
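The $s^2$ scaling of the smoothing parameter can be illustrated on an interval, where the solution to (1) has the closed form $v(x)=\cosh \left( (x-L/2)/\sqrt{t}\right) /\cosh \left( L/(2\sqrt{t})\right) $. The sketch below is a one-dimensional illustration only; the function name and the numerical values are arbitrary.

```python
import numpy as np

def v_interval(x, length, t):
    """Solution of v - t * v'' = 0 on [0, length] with v = 1 at both ends."""
    s = np.sqrt(t)
    return np.cosh((x - length / 2.0) / s) / np.cosh(length / (2.0 * s))

x = np.linspace(0.0, 1.0, 11)
scale = 3.0
original = v_interval(x, 1.0, 0.01)
rescaled = v_interval(scale * x, scale * 1.0, scale**2 * 0.01)
unscaled_t = v_interval(scale * x, scale * 1.0, 0.01)
print(np.allclose(original, rescaled))    # same profile when t scales by s^2
print(np.allclose(original, unscaled_t))  # different profile when t is kept
```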

6.2 Comparison to GEI Baseline

Table 1 compares rank 1 and rank 5 SVIM, SEIM, GVI and GEI performances across covariate factors with respect to Poisson, screened Poisson and normalised Poisson distance functions. For this table we choose the screened Poisson distance function yielding the best weighted average performance with respect to smoothing parameter $t$.

Table 1 TUM GAID database rank 1 and rank 5 performance for representations: GEI (baseline), and SEIM, GVI and SVIM, and sequences: normal (N), carrying a bag (B), shoes (S), time and normal (TN), time and carrying a bag (TB), time and shoes (TS), weighted average; distance functions are based on: Poisson, normalised Poisson and screened Poisson schemes


6.3 Covariate Factor Performance Trends

Normal (N) and shoe (S) sequences perform highly given their appearance similarities to training sequences. Note the shoe sequences cause little gait appearance and motion alterations, whereas shoe types such as heels and flip flops may cause greater alterations and subsequently cause increased misclassification [10]. Bag carrying (B) sequences show poorer performances given the significant appearance alterations caused; bags appear as a mass of pixels around the back or a bend in silhouettes and skeletons respectively, see Fig. 9—note that bags also cause the body to lean due to compensation for a shifted centre of gravity. Time-based sequences (TN, TB, TS) cause significant issues performance wise, halving performance in some cases; see [46] for further information regarding time as a covariate factor during gait recognition. The primary cause of misclassification is due to appearance alterations caused by clothing which is a hidden covariate factor given the time (months) between capture. Clothing as a covariate factor is often addressed separately e.g. in the CASIA B database [73, 75]. Overall, these trends apply to both appearance-based and motion-based, and silhouette and skeleton approaches.

6.4 Appearance- vs. Motion-Based Representations

We can see significant performance differences between appearance-based and motion-based representations across the database. Especially during time-based sequences, motion-based representations often achieve double the performance of appearance-based representations; this occurs because gait motion is considerably more consistent over time than gait appearance. This observation leads us to recommend motion-based representations given their ability to overcome the majority of the real world covariate factors presented by the database, and especially the more complex ones.

6.5 Silhouette vs. Skeleton Representations

A pattern emerges: pairing silhouettes with an appearance-based representation (GEI) is favourable, whereas pairing skeletons with a motion-based representation (SVIM) is superior overall, and it is the latter that we recommend for gait recognition. The SVIM is successful because it emphasises body motion rather than covariate factor motion. For example, a rucksack moves as a consequence of natural gait motion (visible especially in the GVI in Fig. 9); the skeleton represents the rucksack as a mere bend, whereas silhouette representations record it as a mass of static and dynamic pixel values.
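
The emphasis on motion can be illustrated with a per-pixel variance taken over one gait cycle. The following is a minimal sketch under stated assumptions, not the implementation evaluated in this paper: it assumes that the frames of a cycle are already aligned, equally sized arrays (binary silhouettes or skeleton maps), and the function name `variance_image` is hypothetical. Pixels that never change across the cycle receive zero variance, so static structure is suppressed while moving parts are retained.

```python
import numpy as np

def variance_image(frames):
    """Per-pixel variance over a stack of aligned frames (hypothetical helper)."""
    stack = np.asarray(frames, dtype=float)   # shape: (num_frames, height, width)
    if stack.ndim != 3:
        raise ValueError("expected a stack of 2D frames")
    return stack.var(axis=0)

# Toy cycle of four 2x2 frames: one static pixel and one pixel that toggles.
frames = np.zeros((4, 2, 2))
frames[:, 0, 0] = 1.0      # always on: static structure
frames[::2, 1, 1] = 1.0    # on in frames 0 and 2 only: moving structure
vi = variance_image(frames)
```

In this toy example the static pixel contributes nothing to `vi`, whereas the toggling pixel does, which mirrors the argument that a variance-based descriptor favours motion over static appearance.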

6.6 Distance Function Behaviour

Although the distance function constructed from the normalised Poisson scheme improves on that of the Poisson scheme, we find the screened Poisson distance function superior; it benefits from the tunable smoothing parameter $t$, which provides a further performance boost. In terms of computation time, the Poisson distance function is the fastest, while the normalised and screened Poisson distance functions are successively slower to compute. However, given that our gait recognition approach is not geared towards real-time processing, we favour the screened Poisson for its superior person discrimination.
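
The role of $t$ can be made concrete with a small numerical sketch. It assumes the screened Poisson formulation $v - t\Delta v = 0$ inside the shape with $v = 1$ on the boundary, followed by the logarithmic transform $-\sqrt{t}\ln v$ motivated by Varadhan's result [67]. The unit grid spacing, the Jacobi solver, the iteration count and the name `screened_poisson_distance` are illustrative choices and are not claimed to match the implementation evaluated above.

```python
import numpy as np

def screened_poisson_distance(mask, t=1.0, iters=2000):
    """Approximate interior distance to the boundary of a binary shape.

    Assumed formulation: v - t * Laplace(v) = 0 inside the shape, v = 1 outside,
    distance estimate d = -sqrt(t) * ln(v). Five-point stencil, Jacobi iteration.
    """
    inside = np.asarray(mask, dtype=bool)
    v = np.ones(inside.shape, dtype=float)       # v = 1 outside the shape
    for _ in range(iters):
        p = np.pad(v, 1, constant_values=1.0)    # treat the image border as exterior
        nb = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
        # Discrete equation v - t * (nb - 4 v) = 0 solved for the centre value.
        v = np.where(inside, t * nb / (1.0 + 4.0 * t), 1.0)
    return np.where(inside, -np.sqrt(t) * np.log(v), 0.0)

# Usage on a toy shape: a 15x15 square inside a 21x21 image.
mask = np.zeros((21, 21), dtype=bool)
mask[3:18, 3:18] = True
d = screened_poisson_distance(mask, t=1.0)
```

The estimate is zero outside the shape and grows towards the centre of the square. Jacobi iteration is chosen only for brevity; it converges more slowly as $t$ grows, so a sparse direct or multigrid solver would be a more practical choice for full-size silhouettes.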

6.7 General Recommendations

We have demonstrated that the variance aspect of our SVIM is a useful tool for gait recognition, given that gait motion is more consistent over time than gait appearance. The SVIM paired with the screened Poisson distance function offers significant flexibility through the tunable smoothing parameter $t$. Note that we give only a general recommendation for $t$ rather than promoting an explicitly optimised value, because performance changes with (a) silhouette quality, e.g. missing head or limbs due to imperfect extraction, (b) silhouette creation, i.e. RGB versus depth images, (c) image size, (d) the database and even (e) the application. While this means that greater performance could be achieved with alternative values of $t$, we have nonetheless demonstrated the effectiveness of the SVIM with the screened Poisson distance function.

7 Comparison to State of the Art

Table 2 compares our SVIM to state of the art approaches including the Gait Energy Volume (GEV) [60] and the Depth Gradient Histogram Energy Image (DGHEI) [29]. The GEV is analogous to the GEI, with 3D binary voxels averaged in place of 2D silhouettes, while the DGHEI averages Histograms of Oriented Gradients (HOG) [16] descriptors computed over an image sequence. While the DGHEI outranks the GEV due to its covariate factor generalisation, the SVIM is superior overall, especially on time-based sequences, primarily because gait motion is more consistent over time than gait appearance. The SVIM achieves a 9.9% weighted average performance increase over the DGHEI, which we attribute to the combination of skeleton and motion-based representations providing superior covariate factor handling and generalisation.
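
For readers reproducing figures of this kind, the rank-$k$ rate and the weighted average can be sketched as follows. This is an illustration under assumptions: it presumes matching on a precomputed probe-to-gallery distance matrix and weights each condition by its number of test sequences. The function names and the toy numbers are hypothetical and are not taken from Table 1 or Table 2.

```python
import numpy as np

def rank_k_rate(dist, probe_ids, gallery_ids, k=1):
    """Fraction of probes whose true identity is among the k nearest gallery entries."""
    dist = np.asarray(dist, dtype=float)        # shape: (num_probes, num_gallery)
    gallery_ids = np.asarray(gallery_ids)
    order = np.argsort(dist, axis=1)[:, :k]     # indices of the k closest gallery items
    hits = (gallery_ids[order] == np.asarray(probe_ids)[:, None]).any(axis=1)
    return float(hits.mean())

def weighted_average(rates, counts):
    """Average of per-condition rates, weighted by the number of test sequences."""
    return float(np.average(rates, weights=counts))

# Toy values, purely illustrative.
dist = np.array([[0.1, 0.9, 0.5],
                 [0.8, 0.7, 0.2],
                 [0.6, 0.3, 0.4]])
gallery_ids = ["a", "b", "c"]
probe_ids = ["a", "b", "b"]
r1 = rank_k_rate(dist, probe_ids, gallery_ids, k=1)   # two of three probes correct
r2 = rank_k_rate(dist, probe_ids, gallery_ids, k=2)   # all probes correct within top 2
avg = weighted_average([1.0, 0.5], counts=[3, 1])
```

Weighting by sequence count means that conditions with many test sequences dominate the average, so a gain on the smaller time-based subsets moves the weighted figure less than the same gain on a larger subset.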

Table 2 Existing versus proposed TUM GAID database performances: normal (N), carrying a bag (B), shoes (S), time and normal (TN), time and carrying a bag (TB), time and shoes (TS), weighted average

8 Conclusion and Future Work

We have demonstrated an efficient approach to extracting skeletons via the screened Poisson equation with tunable smoothing parameter $t$. Combining this with skeleton and motion-based representations yields our proposed SVIM, which achieves superior covariate factor generalisation despite the tough time-based covariate factors posed by the TUM GAID database. The SVIM owes its success to (a) utilising gait motion, which is more consistent over time than gait appearance, and (b) skeletons, which emphasise gait motion as opposed to covariate factor motion and so handle covariate factors better than silhouette-based representations. Future work will consider extension to action recognition combined with more advanced learning and classification tools (e.g. SVM).

References

Ambrosio, L., Mantegazza, C.: Curvature and distance function from a manifold. J. Geom. Anal. 8, 723–748 (1998)
Ambrosio, L., Tortorelli, V.M.: Approximation of functionals depending on jumps by elliptic functionals via $\gamma $-convergence. Comm. Pure. Appl. Math. XLIII, 999–1036 (1990)
Ardila, D., Mihalas, S., von der Heydt, R., Niebur, E.: Medial axis generation in a model of perceptual organization. In: 46th Annual Conference on Information Sciences and Systems (CISS), pp. 1–4. IEEE, New York (2012)
Aubert, G., Aujol, J.F.: Poisson skeleton revisited: a new mathematical perspective. J. Math. Imaging. Vis. 4, 1–11 (2012)
Aubert, G., Kornprobst, P.: Mathematical Problems in Image Processing. Springer, Heidelberg (2002)
Bashir, K., Xiang, T., Gong, S.: Gait recognition using Gait Entropy Image. In: 3rd International Conference on Crime Detection and Prevention, pp. 1–6. Springer, London (2009)
Belyaev, A., Fayolle, P.A., Pasko, A.: Signed $L_p$-distance fields. Comput. Aided. Des. 45, 523–528 (2013)
Bhattacharya, T., DiBenedetto, E., Manfredi, J.: Limits as $p\rightarrow \infty $ of $\Delta _pu_p=f$ and related extremal problems. In: Fascicolo Speciale Nonlinear PDEs, pp. 15–68. Rendiconti del Seminario Matematico Università e Politecnico di Torino (1989)
Blum, H.: Transformation for extracting new descriptors of shape. In: Wathen-Dunn, W. (ed.) Models for the Perception of Speech and Visual Form. MIT Press, Cambridge (1967)
Bouchrika, I., Nixon, M.: Exploratory factor analysis of gait recognition. In: 8th IEEE International Conference on Automatic Face and Gesture Recognition (2008)
Calakli, F., Taubin, G.: SSD: smooth signed distance surface reconstruction. Comput. Graph. Forum. 30(7), 1993–2002 (2011)
Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern. Anal. Mach. Intell. 8, 679–698 (1986)
Chen, H.S., Chen, H.T., Chen, Y.W., Lee, S.Y.: Human action recognition using star skeleton. In: Proceedings of the 4th ACM International Workshop on Video Surveillance and Sensor Networks, pp. 171–178 (2006)
Crane, K., Weischedel, C., Wardetzky, M.: Geodesics in heat: a new approach to computing distance based on heat flow. ACM Trans. Graph. 32, 152:1–152:11 (2013)
Cutting, J., Kozlowski, L.: Recognising friends by their walk: Gait perception without familiarity cues. Bull. Psychon. Soc. 9(5), 353–356 (1977)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, pp. 886–893 (2005)
Dempster, W., Gaughran, G.: Properties of body segments based on size and weight. Am. J. Anat. 120(1), 33–54 (1967)
Direkoglu, C., Dahyot, R., Manzke, M.: On using anisotropic diffusion for skeleton extraction. Int. J. Comput. Vis. 100, 170–189 (2012)
Drillis, R., Contini, R.: Body segment parameters. Report No. 1163.03. Office of Vocational Rehabilitation, Department of Health, Education and Welfare, New York (1966)
Evans, L.C.: Partial Differential Equations. American Mathematical Society, New York (1998)
Freytag, M., Shapiro, V., Tsukanov, I.: Finite element analysis in situ. Finite. Elem. Anal. Des. 47(9), 957–972 (2011)
Giusti, E.: Minimal Surfaces and Functions of Bounded Variation. Monographs in Mathematics, Vol. 80. Birkhäuser, Boston (1984)
Gomes, J., Faugeras, O.: The vector distance functions. Int. J. Comput. Vis. 52(2/3), 161–187 (2003)
Gorelick, L., Blank, M., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. IEEE Trans. Pattern Anal. Mach. Intell. 29(12), 2247–2253 (2007)
Gorelick, L., Galun, M., Sharon, E., Basri, R., Brandt, A.: Shape representation and classification using the Poisson equation. IEEE Trans. Pattern Anal. Mach. Intell. 28(12), 1991–2005 (2006)
Gurumoorthy, K.S., Rangarajan, A.: A Schrödinger equation for the fast computation of approximate Euclidean distance functions. In: Scale Space and Variational Methods in Computer Vision (SSVM 2009). LNCS, vol. 5567, pp. 100–111. Springer (2009)
Han, J., Bhanu, B.: Individual recognition using Gait energy image. IEEE Trans. Pattern Anal. Mach. Intell. 28(2), 316–322 (2006)
Hofmann, M., Bachmann, S., Rigoll, G.: 2.5d Gait biometrics using the depth gradient histogram energy image. In: IEEE 5th International Conference on Biometrics: Theory, Applications and Systems (BTAS), pp. 399–403 (2012)
Hofmann, M., Bachmann, S., Rigoll, G.: 2.5D Gait biometrics using the depth gradient histogram energy image. In: 5th IEEE International Conference on Biometrics: Theory, Applications and Systems, pp. 399–403 (2012)
Hofmann, M., Geiger, J., Bachmann, S., Schuller, B., Rigoll, G.: The TUM gait from audio, image and depth (GAID) database: multimodal recognition of subjects and traits. J. Vis. Commun. Image Represent. 117(2), 130–144 (2013)
Hofmann, M., Geiger, J., Bachmann, S., Schuller, B., Rigoll, G.: The TUM Gait from Audio, Image and Depth (GAID) Database: Multimodal Recognition of Subjects and Traits. Special Issue on Visual Understanding and Applications with RGB-D Cameras. J. Vis. Commun. Image Represent. (2013)
Hofmann, M., Schmidt, S., Rajagopalan, A., Rigoll, G.: Combined face and Gait recognition using Alpha matte preprocessing. In: IAPR/IEEE International Conference on Biometrics, pp. 390–395 (2012)
Huang, X., Boulgouris, N.: Gait recognition with shifted energy image and structural feature extraction. IEEE Trans. Image Process. 21(4), 2256–2268 (2012)
Jones, M.W., Baerentzen, J.A., Sramek, M.: 3D distance fields: a survey of techniques and applications. IEEE Trans. Visual. Comput. Graph. 12(4), 581–599 (2006)
Kale, A., Roychowdhury, A., Chellappa, R.: Fusion of gait and face for human identification. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. V901–904 (2004)
Karimov, A., Mistelbauer, G., Schmidt, J., Mindek, P., Schmidt, E., Timur Sharipov, T., Bruckner, S., Gröller, M.E.: ViviSection Skeleton-based volume editing. Comput. Graph. Forum. 32(3), 461–470 (2013)
Kawohl, B.: On a family of torsional creep problems. J. Reine Angew. Math. 410(1), 1–22 (1990)
Kimia, B.B.: On the role of medial geometry in human vision. J. Physiol. 97(2–3), 155 (2003)
Kovács, I., Fehér, Á., Julesz, B.: Medial-point description of shape: a representation for action coding and its psychophysical correlates. Vis. Res. 38(15), 2323–2333 (1998)
Lam, T., Lee, R., Zhang, D.: Human gait recognition by the fusion of motion and static spatio-temporal templates. Pattern. Recognit. 40(9), 2563–2573 (2007)
Lee, L., Grimson, W.: Gait analysis for recognition and classification. In: Proceedings of the 5th IEEE International Conference on Automatic Face and Gesture Recognition, pp. 148–155 (2002)
van der Maaten, L.: Matlab toolbox for dimensionality reduction. MIT Press, Cambridge
Mantegazza, C., Mennucci, A.C.: Hamilton-Jacobi equations and distance functions on Riemannian manifolds. Appl. Math. Optim. 47, 1–25 (2003)
Martín-Félez, R., Xiang, T.: Gait Recognition by Ranking. Comput. Vis. ECCV, Lect. Notes. Comput. Sci. 7572, 328–341 (2012)
Matovski, D., Nixon, M., Carter, J.: Encyclopedia of Computer vision, chap. Gait recognition. Springer Science+Business Media, Dordrecht (2013, in press)
Matovski, D., Nixon, M., Mahmoodi, S., Carter, J.: The effect of time on Gait recognition performance. IEEE Trans. Inf. Forensics Secur. 7(2), 543–552 (2012)
Mumford, D., Shah, J.: Optimal approximations by piecewise smooth functions and associated variational problems. Commun. Pure and Appl. Math. 42(5), 577–685 (1989)
Murray, M., Drought, A., Kory, R.: Walking patterns of normal men. J. Bone Jt. Surg. 46(2), 335–360 (1964)
Niyogi, S., Adelson, E.: Analyzing gait with spatiotemporal surfaces. In: Proceedings of the IEEE Workshop on Motion of Non-Rigid and Articulated Objects, pp. 64–69 (1994)
Osher, S., Fedkiw, R.P.: Level set methods: an overview and some recent results. J. Comput. Phys. 169, 463–502 (2001)
Paragios, N., Taron, M., Huang, X., M., R., Metaxas, D.: On the representation of shapes using implicit functions. In: Statistics and Analysis of Shapes, pp. 167–200. Birkhäuser (2006)
Peng, J., Kristjansson, D., Zorin, D.: Interactive modeling of topologically complex geometric detail. ACM Trans. Graph. 23, 635–643 (2004). ACM SIGGRAPH
Petrovic, L., Henne, M.J.A.: Volumetric methods for simulation and rendering of hair. Technical Report, Pixar Animation Studios, Emeryville (2005)
Peyré, G., Cohen, L.D.: Geodesic methods for shape and surface processing. In: Advances in Computational Vision and Medical Image Processing, pp. 29–56. Springer (2009)
Rvachev, V.L.: Theory of R-functions and Some Applications. Naukova Dumka, Kiev (1982) (in Russian)
Sethi, M., A., R., Gurumoorthy, K.S.: The Schrödinger distance transform (SDT) for point-sets and curves. In: CVPR, pp. 198–205 (2012)
Shah, J.: Segmentation by nonlinear diffusion. In: CVPR, pp. 202–207 (1991)
Shapiro, V.: Semi-analytic geometry with R-functions. Acta. Numerica. 16, 239–303 (2007)
Siddiqi, K., Pizer, S.M.: Medial Representations: Mathematics, Algorithms and Applications, vol. 37. Springer, New York (2008)
Sivapalan, S., Chen, D., Denman, S., Sridharan, S., Fookes, C.: Gait energy volumes and frontal gait recognition using depth images. In: International Joint Conference on Biometrics, pp. 1–6 (2011)
Spalding, D.B.: Calculation of turbulent heat transfer in cluttered spaces. In: Proceedings 10th Int. Heat Transfer Conference, Brighton (1994)
Tari, S., Genctav, M.: From a non-local Ambrosio-Tortorelli phase field to a randomized part hierarchy tree. J. Math. Imaging Vis. 32(2), 161–179 (2013)
Tari, S., Shah, J.: Local symmetries of shapes in arbitrary dimension. In: Sixth International Conference on Computer Vision (ICCV’98), pp. 1123–1128. Bombay (1998)
Tari, Z.S.G., Shah, J., Pien, H.: Extraction of shape skeletons from grayscale images. Comput. Vis. Image. Underst. 66(2), 133–146 (1997)
Tucker, P.G.: Assessment of geometric multilevel convergence and a wall distance method for flows with multiple internal boundaries. Appl. Math. Model. 22, 293–311 (1998)
Tucker, P.G.: Hybrid Hamilton-Jacobi-Poisson wall distance function model. Comput. Fluids. 44(1), 130–142 (2011)
Varadhan, S.R.S.: On the behavior of the fundamental solution of the heat equation with variable coefficients. Comm. Pure Appl. Math. 20, 431–455 (1967)
Wang, C., Zhang, J., Wang, L., Pu, J., Yuan, X.: Human identification using temporal information preserving Gait template. IEEE Trans. Pattern. Anal. Mach. Intell. 34(11), 2164–2176 (2012)
Wang, L., Ning, H., Tan, T., Hu, W.: Fusion of static and dynamic body biometrics for gait recognition. IEEE Trans. Circuits. Syst. Video. Technol. 14(2), 149–158 (2004)
Xia, H., Tucker, P.G., Coughlin, G.: Novel applications of BEM based Poisson level set approach. Eng. Anal. Bound. Elem. 36, 907–912 (2012)
Yogarajah, P., Condell, J., Prasad, G.: PRWGEI: Poisson random walk based gait recognition. In: 7th International Symposium on Image and Signal Processing and Analysis (ISPA), pp. 662–667 (2011)
Yoo, J., Nixon, M.: Automated markerless analysis of human Gait motion for recognition and classification. ETRI J. 33(3), 259–266 (2011)
Yu, S., Tan, D., Tan, T.: A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition. In: 18th International Conference on Pattern Recognition, ICPR 2006, vol. 4, pp. 441–444 (2006)
Zhang, E., Zhao, Y., Xiong, W.: Active energy image plus 2DLPP for gait recognition. Signal. Process. 90(7), 2295–2302 (2010)
Zheng, S., Zhang, J., Huang, J., He, R., Tan, T.: Robust view transformation model for gait recognition. In: 18th IEEE International Conference on Image Processing (ICIP), pp. 2073–2076 (2011)
Zucker, S.W.: Distance images and the enclosure field: applications in intermediate-level computer and biological vision. In: Innovations for Shape Analysis, pp. 301–323. Springer (2013)

Acknowledgments

We would like to thank the anonymous reviewers for extensively reading this paper and providing their valuable and constructive feedback. Tenika Whytock is supported by an EPSRC DTA studentship. Neil M. Robertson is supported by the MOD University Defence Research Collaboration in Signal Processing (EPSRC grant number EP/J015180/1).

Author information

Authors and Affiliations

Institute of Sensors, Signals and Systems, School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh, EH14 4AS, Scotland, UK
Tenika Whytock, Alexander Belyaev & Neil M. Robertson

Corresponding author

Correspondence to Tenika Whytock.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

About this article

Cite this article

Whytock, T., Belyaev, A. & Robertson, N.M. Dynamic Distance-Based Shape Features for Gait Recognition. J Math Imaging Vis 50, 314–326 (2014). https://doi.org/10.1007/s10851-014-0501-8

Received: 11 October 2013
Accepted: 30 January 2014
Published: 04 March 2014
Issue Date: November 2014
DOI: https://doi.org/10.1007/s10851-014-0501-8
