1 Introduction

Anthropometric measurements, such as chest and hip circumference or shoulder-to-shoulder distance, provide detailed information about body shape. This information is essential for industrial design [18], clothing design [8], medical sciences [16] and ergonomics [19]. The measurements have traditionally been taken manually from a physical subject using a tape measure, but the rise of online shopping and personalized tools has created new demand for computerized anthropometric measurements.

Fig. 1 The proposed pipeline for measuring anthropometric clothing measurements from 3D body scans. A 3D point cloud is produced by a set of depth sensors (body scanner). A body template is fitted (registered) to the 3D point cloud (step-2); circumference measurements are computed on the model surface (step-3); supervised regression is adopted to provide estimates of anthropometric measurements (step-4)

A standard pipeline for computerized anthropometric measurements is the following [3, 7, 10, 25, 26, 27]: (1) a 2D or 3D body scan producing a 3D point cloud or an initial model, (2) fitting of a pre-defined model and (3) measurements from the fitted model. The main challenge is step two, which should provide an accurate and watertight volumetric model of a subject so that important measurements can be made on the model surface. Challenges arise from different sensor modalities, poses and occluded regions. The method proposed in this work shares the main steps of the standard pipeline (Fig. 1), but instead of a physiologically valid model fit, we adopt a non-rigid iterative closest point (ICP) registration between the model and captured point clouds. Moreover, we do not take anthropometric measurements directly from the fitted model surface, but extract a set of physiologically meaningful surface features (body circumferences) and use them to train a regressor that provides estimates of the physical anthropometric measurements. Our main contributions are:

  • A full processing pipeline from 3D body scans to anthropometric measurements.

  • The body model registration step using a non-rigid ICP to fit a pre-defined model to captured body scans.

  • A nonlinear regression-based anthropometric measurement estimation step that uses circumference-based intermediate features.

  • A public benchmark dataset—NOMO3D—with anthropometric measurement ground truth.

Our pipeline is evaluated with the NOMO3D dataset of real male and female subjects (194 plus 181), for which we report the average error and the percentage of subjects whose error is below the thresholds in [9].

2 Related work

Anthropometric measurement datasets There have been several campaigns to collect 3D body scans and anthropometry ground truth for them. For example, the UMTRI dataset was collected to find the safest sitting posture of young children in cars [12]. The ANSUR 88 (1988) and ANSUR 2012 datasets contain 3D body scans and tape-measured anthropometric measurements of US Army soldiers. ANSUR 2012 contains 4082 male and 1986 female subjects of varying age and 93 ground truth anthropometric measurements for each of them. Unfortunately, the UMTRI and ANSUR datasets are not publicly available. The CAESAR dataset [21] is a commercial counterpart of ANSUR and contains 3D scans of 2400 US and Canadian and 2000 European civilians with tape-measured ground truth. CAESAR (http://store.sae.org/caesar/) has been used in various scientific works but has not been widely adopted in benchmarking due to its price. The main usage of the UMTRI, ANSUR and CAESAR datasets is to make “virtual tape measurements” on the point cloud surface. In the follow-up work to CAESAR, Robinette and Daanen [20] compared virtual tape measurements over two different scanners and scanning teams and showed that the measurements are highly reproducible within the error limits defined by the US Army (cf. ANSUR experiments). The reproducibility error in their experiments was less than \(\pm 5\) mm for most measurements. However, these were relative accuracies over repeated tests. Simmons and Istook [23] noted that there is substantial variation in how the available software packages take anthropometric measurements from 3D data. Paquette et al. [17] demonstrated much larger errors for 3D measurements as compared to manual tape measurements. They reported systematic errors of up to 30–40 mm despite the fact that standard measurement procedures (ISO-8559 and US Army) were implemented in the software.

3D Human body models The early works following the data campaigns above were based on “virtual tape measurements” where the anthropometric measurements were made manually with the help of 3D measurement software. If this step needs to be automated, then the 3D scan data need to be aligned with a model for which the measurement paths can be pre-defined using 3D model vertex ids. However, first a good 3D human body model needs to be devised. The model should have an intuitive parameterization for shape and pose and provide realistic body shapes. There are several options for scientific work. The most popular parametric body model is MakeHuman, an open source project (http://www.makehuman.org/) that is based on an artistic body model and aims at high-quality rendering for games and movies.

However, better models are based on statistics of real human data. These require a single artist-made initial point model, which is iteratively matched to scanned point clouds in a normalized pose. Principal component analysis (PCA) over the matched model points provides a natural parameterization for the shape. The pose can be intuitively defined by a skeleton joint model, but the final quality depends on how well the model can represent pose-specific shape deformations. One of the first attempts to create a 3D human body from PCA shape and skeleton pose is the SCAPE body model by Anguelov et al. [2]. Hirshberg et al. [11] proposed a better parametric body model for SCAPE and introduced the BlendSCAPE model. Other attempts are by Baek and Lee [3] and more recently the SMPL model by Loper et al. [15]. SMPL provides high-quality models where the shape is divided into pose-invariant and pose-dependent deformations and the model parameters are optimized using a combination of their own dataset of 1786 scans and 3800 scans from CAESAR. For this work, we adopt the SMPL model due to its good overall quality.

Computerized anthropometric measurements There have been several attempts to infer 3D body models from 2D RGB images. For example, Guan et al. [10] proposed a method and compared their measurements to the ground truth. However, for many industrial and commercial applications, the accuracy of 2D measurements is insufficient. For better accuracy, 3D scans are needed.

Weiss et al. [26] propose a Kinect-based 3D body scan method that uses the SCAPE body model. The method requires manual pose initialization and then optimizes the model mesh using a standard ICP. Tsoli et al. [25] propose a pipeline that is similar to ours. They use the BlendSCAPE model to register a 3D scan and then compute various local and global features which are used in regression. A different approach was proposed by Zuffi et al. [32] in their “stitched puppet” model, where the body model is divided into local templates on which “local PCA” matching is performed, and the local parts are then globally aligned in the next optimization step. Wuhrer et al. [27] address the inverse of our problem, where a 3D body model is estimated from the given 1D anthropometric measurements.

The above works particularly address the problem of unknown pose. However, we believe that a fixed pose can be assumed for many applications since customers can be assumed to be cooperative. With a fixed pose, the process can be drastically simplified while still providing accurate results.

Fig. 2 A 3D body scan (point cloud) captured by TC2 body scanner. The scanner covers most of the body surface and missing parts occur only in the head and feet regions

3 3D body scanning

Recently, novel single depth sensor-based body scanning approaches have been proposed, for example, BodyFusion [28] and DoubleFusion [29], but since 3D scanning is out of the scope of this work, a commercial 3D body scanner was used. Our dataset was collected using a commercial TC2 body scanner (https://www.tc2.com) that uses off-the-shelf depth sensors (Intel RealSense R200). Inside the scanner, subjects were instructed to step on the rotating platform and take a standing pose with the feet approximately shoulder width apart and the arms slightly raised to create a gap between the arms and torso. The platform then rotates once, during which three depth sensors produce a raw 3D scan of the subject; the process takes a few seconds (Fig. 2). The test subjects wore tight-fitting, underwear-like sports clothing. The scanner outputs a triangulated mesh in the regular OBJ file format. Each triangulated mesh contains on average 57,000 vertices and around 113,000 faces. For our experimental studies, we scanned 194 men and 181 women.

4 Model registration

4.1 SMPL body model

The popular 3D human body models MakeHuman, SCAPE [2], BlendSCAPE [11] and SMPL [15] (see Sect. 2 for details) share a similar model parameterization \(\left\{ \mathcal {T},\mathcal {S},\varvec{\theta }\right\} \) where \(\mathcal {T}\) is the initial model in a “canonical shape” and “canonical pose,” \(\mathcal {S}\) defines the shape deformation and \(\varvec{\theta }\) defines the pose. Pose parameterization is intuitive and typically based on a skeleton rig of K skeleton joints. A pose is encoded as the 3D rotation angles of the K joints in \(\varvec{\theta }\). Each vertex location in \(\mathcal {T}\) is defined relative to a specific skeleton part or parts, and therefore the whole point cloud deforms with the pose. The shape is more difficult to parameterize since the parameters need to capture the shape statistics of the human population. The standard approach is to use principal component analysis (PCA), where the principal components represent the most important axes of variation in the population. In the PCA space, any shape can be reconstructed by linearly adding \(|\varvec{\beta }|\) principal directions to a mean shape \(\mathcal {T}\) (the zero shape):

$$\begin{aligned} \mathcal {T}+B(\varvec{\beta }) = \mathcal {T}+\sum _{n=1}^{|\varvec{\beta }|}\beta _n\mathcal {S}_n . \end{aligned}$$
(1)

Often as few as \(|\varvec{\beta }|=10\) principal component vectors provide sufficient accuracy for applications where subtle details are not important. For our work, we selected the skinned multi-person linear model (SMPL) by Loper et al. [15] since it provides very competitive accuracy and the original implementation is publicly available.
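
The linear shape model of Eq. 1 can be illustrated with a few lines of NumPy. This is only a minimal sketch under stated assumptions: the function name `reconstruct_shape`, the array layout and the random toy data are hypothetical and are not taken from the SMPL implementation.

```python
import numpy as np

def reconstruct_shape(template, shape_dirs, betas):
    """Evaluate Eq. 1: template + sum_n betas[n] * shape_dirs[n].

    template   : (N, 3) mean-shape vertex positions (the zero shape)
    shape_dirs : (B, N, 3) principal shape directions S_n (assumed layout)
    betas      : (B,) shape coefficients beta_n
    """
    template = np.asarray(template, dtype=float)
    shape_dirs = np.asarray(shape_dirs, dtype=float)
    betas = np.asarray(betas, dtype=float)
    # Contract the coefficient axis: sum_n beta_n * S_n has shape (N, 3).
    offset = np.tensordot(betas, shape_dirs, axes=1)
    return template + offset

# Toy data (random numbers, not a real body model).
rng = np.random.default_rng(0)
T = rng.normal(size=(5, 3))
S = rng.normal(size=(2, 5, 3))
beta = np.array([0.5, -1.0])
V = reconstruct_shape(T, S, beta)
```

Setting all coefficients to zero returns the mean shape unchanged, which is why \(\mathcal {T}\) is called the zero shape.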

Fig. 3 We adopted the skinned multi-person linear model (SMPL) [15] for our framework since it provides intuitive model parameterization and high-quality models. An SMPL model in its canonical (zero-pose) position (left) and the model in the initial position that corresponds to the instructed pose in our body scans (right)

The SMPL mesh model contains \(N = 6890\) vertices (13,776 faces) and \(K = 23\) skeleton joints. The mesh has the same topology for men and women, spatially varying resolution, a clean quad structure, segmentation into parts, initial blend weights and a skeletal rig. A particular detail that makes SMPL registration more accurate than its competitors is that it divides the shape deformation into a pose-independent deformation \(B_\mathrm{S}(\varvec{\beta })\) and a pose-specific deformation \(B_\mathrm{P}(\varvec{\theta })\), which are summed to define the final shape. Notably, the shape deformation parameters are also used to predict the locations of the \(K=23\) skeleton joints \(J(\varvec{\beta }): \mathbb {R}^{|\varvec{\beta }|} \rightarrow \mathbb {R}^{3K}\). We re-defined the SMPL zero-pose to correspond to the pose subjects were instructed to take (Fig. 3).

4.2 Non-rigid ICP registration

Fig. 4 A scanned point cloud contains holes and measurement noise, but registration of the 3D body model (red) is robust to these distortions and achieves an accurate (“skin level”) registration which is essential for accurate anthropometric measurements in the next stage (color figure online)

The goal of the body model registration to the scanned point cloud is to provide “skin-level registration” where the two surfaces, the model and the scan, overlay almost perfectly (Fig. 4). This is a challenging task since (a) points contain measurement noise, (b) large point regions may be missing and (c) the model points do not exactly match the scan point locations. To make the final anthropometric measurements accurate in the next processing stage, we need a registration method that is accurate and robust to the aforementioned non-idealities.

A core component in constructing the SCAPE, BlendSCAPE and SMPL models is an artist-generated point model and an algorithm to register the model to real human scans. However, these algorithms perform complex optimization and must be manually initialized. Therefore, the artistic models and special algorithms have not been used outside body model generation. However, the final body models, SCAPE, BlendSCAPE and SMPL, provide intuitive parameterization as discussed in Sect. 4.1, and registration can be defined as an optimization problem where a few pose and shape parameters \(\left\{ \mathcal {S},\varvec{\theta }\right\} \) are optimized to minimize a registration error. Skin-level registration requires a large number of PCA components for the shape, and therefore, we take an alternative approach from the generic point cloud matching literature.

Several comparisons of generic registration methods exist. For example, Bogo et al. [4] introduced the FAUST dataset for comparing non-rigid registration methods. In their experiments, several popular methods, e.g., generalized multi-dimensional scaling (GMDS) [5], Möbius voting [14] and blended intrinsic maps (BIM) [13], did not perform well since these methods assume that both inputs are watertight and have the same topology. However, the baseline point cloud matching method, iterative closest point (ICP), does not require such assumptions.

There are two extensions of the baseline ICP that are suitable for human body point clouds: Amberg et al. [1] and Schneider et al. [22]. Since the 3D scans often contain holes (Fig. 2), we adopted the Amberg et al. approach that explicitly handles missing points. The challenge is twofold: we want to retain the global convergence properties of ICP while still allowing local deformations to the skin level. Local deformations make this ICP non-rigid.

The starting point of our algorithm is a pre-aligned model defined by \(\left\{ \mathcal {T},\beta _i,\theta _k\right\} _{i=1,\ldots ,|\beta |,k=1,\ldots ,K}\) that brings the SMPL template to approximate correspondence with the obtained scan point cloud \(\mathcal {T}_\mathrm{scan}\). A simple procedure for pre-alignment is described in Sect. 4.3. If we define the pre-aligned model as \(\mathcal {V}\), then the problem is to find optimal values for the alignment parameters \(\mathcal {X}\) so that \(\mathcal {V}(\mathcal {X})\) registers the template points to the surface points \(\mathcal {T}_\mathrm{scan}\).

To solve for the optimal parameters \(\mathcal {X}\), an energy function of three terms is defined [1]:

$$\begin{aligned} E(\mathcal {X}) = E_d(\mathcal {X}) + \alpha E_s(\mathcal {X}) + \beta E_l(\mathcal {X}) . \end{aligned}$$
(2)

\(E_d\) is the standard ICP distance term between the model and scan points

$$\begin{aligned} E_d(\mathcal {X}) = \sum _{v_i \in \mathcal {V}}w_i\hbox {dist}^2\left( \mathcal {T}_\mathrm{scan},\mathcal {X}_iv_i\right) \end{aligned}$$
(3)

where \(\mathcal {X}_i\) is a linear mapping of a single model vertex \(v_i\) to correspondence in \(\mathcal {T}_\mathrm{scan}\). \(w_i\) defines whether a model point has a correspondence in scan (\(w_i = 1\)) or not (\(w_i = 0\)). \(E_s\) is a local stiffness term

$$\begin{aligned} E_s(\mathcal {X}) = \sum _{i \in \mathcal {N}_j} \Vert (\mathcal {X}_i - \mathcal {X}_j) \hbox {diag}(1,1,1,\gamma )\Vert ^2_{F} \end{aligned}$$
(4)

where \(\Vert \cdot \Vert ^2_F\) is the squared matrix Frobenius norm. The stiffness term enforces similar transformations between the neighbor vertices \(\mathcal {N}_j\) of the model vertex \(v_j\). \(\gamma \) weights differences in the rotational and skew part of the deformation against the translational part (\(\gamma =1\) in the experiments). The third energy term is a landmark term

$$\begin{aligned} E_l(\mathcal {X}) = \sum _{\varvec{v}_i,\varvec{l} \in \mathcal {L}} \Vert \mathcal {X}_i\varvec{v}_i - \varvec{l}\Vert ^2 . \end{aligned}$$
(5)

The landmarks \(\mathcal {L}\) are pre-defined and important positions in the model and this term enforces them to be registered accurately. The landmark term improves registration significantly, but requires manual labeling of selected keypoints and is therefore omitted in our experiments.

The algorithm in [1] uses locally affine regularization which assigns an affine transformation to each vertex and minimizes the difference in the transformation of neighboring vertices. The deformation parameters \(\mathcal {X}\), which would be applied on source vertices to generate the target surface deformation, are obtained by minimizing the cost function in Eq. 6 directly and exactly.

$$\begin{aligned} \begin{aligned} \bar{E}(\mathcal {X})&= \left\Vert \begin{bmatrix} \alpha \varvec{M}\otimes \varvec{G}\\ \varvec{WD}\\ \beta \varvec{D}_{L} \end{bmatrix} \mathcal {X} - \begin{bmatrix} \varvec{0}\\ \varvec{WU}\\ \varvec{U}_{L} \end{bmatrix} \right\Vert ^{2}_{F} \\&= \Vert \varvec{A}\mathcal {X} - \varvec{B} \Vert ^{2}_{F} \end{aligned} . \end{aligned}$$
(6)

The cost function \(\bar{E}(\mathcal {X})\) takes its minimum at \(\mathcal {X} = (\varvec{A}^\mathrm{T}\varvec{A})^{-1}\varvec{A}^\mathrm{T}\varvec{B}\). In the above equation, \(\varvec{M}\) is the node-arc incidence matrix of the template mesh topology and \(\varvec{G} := \mathrm{diag}(1,1,1,\gamma )\) is a weighting matrix. \(\varvec{W} := \mathrm{diag}(w_1, \dots , w_n)\) is the weighting matrix in which \(w_i = 0\) if the template vertex \(v_i\) corresponds to missing data in the target mesh, and n is the number of template vertices. \(\varvec{D}\) is the sparse matrix of template vertices that maps the \(4n\times 3\) deformation parameters \(\mathcal {X}\) to the deformed vertex positions, \(\varvec{U}\) is the matrix of the correspondence points on the target mesh, and \(\varvec{D}_L\) and \(\varvec{U}_L\) are the pre-defined landmarks on the template mesh and their correspondence points on the target mesh, respectively. The Kronecker product is denoted by \(\otimes \). \(\alpha \) and \(\beta \) are the weights that balance the stiffness and landmark terms against the standard ICP term \(E_d\).
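
The closed-form minimizer of \(\Vert \varvec{A}\mathcal {X} - \varvec{B} \Vert ^{2}_{F}\) can be checked numerically. The sketch below uses small dense random matrices purely for illustration; the function name `solve_deformation` is hypothetical, and for real meshes \(\varvec{A}\) is large and sparse, so a sparse solver would presumably be used instead.

```python
import numpy as np

def solve_deformation(A, B):
    """Return the X that minimizes ||A X - B||_F^2 (linear least squares)."""
    X, *_ = np.linalg.lstsq(A, B, rcond=None)
    return X

# Toy system: 12 stacked equations, 4 unknown rows, 3 coordinate columns.
rng = np.random.default_rng(0)
A = rng.normal(size=(12, 4))
B = rng.normal(size=(12, 3))

X = solve_deformation(A, B)
# Same minimizer via the normal equations X = (A^T A)^{-1} A^T B.
X_normal = np.linalg.solve(A.T @ A, A.T @ B)
```

Both routes give the same solution when \(\varvec{A}\) has full column rank; the least-squares routine is generally the numerically safer choice because it avoids forming \(\varvec{A}^\mathrm{T}\varvec{A}\) explicitly.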

The whole registration process consists of two loops. In the outer loop, a series of deformations of the template is performed, one for each stiffness \(\alpha ^i \in \{\alpha ^1, \dots , \alpha ^n\}\), where \(\alpha ^i > \alpha ^{i+1}\). The decreasing \(\alpha \) values make the registration proceed from global deformations to more localized ones. In our experiments, \(\alpha \) decreases from 100 to 1 with a step size of 1. In the inner loop, a deformation \(\mathcal {X}\) is found for a fixed stiffness \(\alpha ^i\) and preliminary correspondences, which are obtained by a nearest point search. The inner loop is repeated until \(||\mathcal {X}^j - \mathcal {X}^{j-1}|| < \epsilon \), where \(\epsilon \) is a convergence threshold.
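
The two-loop structure can be sketched as follows. This shows the control flow only: `annealed_registration`, `toy_step`, `eps` and `max_inner` are hypothetical names and values, and the toy update merely stands in for the real inner step, which would recompute nearest-point correspondences and solve Eq. 6 for the current stiffness.

```python
import numpy as np

def annealed_registration(fit_step, x0, alphas, eps=1e-4, max_inner=50):
    """Run an inner fixed-stiffness loop for each stiffness in `alphas`.

    fit_step(x, alpha) must return the updated parameters for one iteration.
    """
    x = np.asarray(x0, dtype=float)
    for alpha in alphas:                      # outer loop: decreasing stiffness
        for _ in range(max_inner):            # inner loop: fixed stiffness
            x_new = fit_step(x, alpha)
            converged = np.linalg.norm(x_new - x) < eps
            x = x_new
            if converged:
                break
    return x

# Stiffness schedule from the text: 100, 99, ..., 1.
alphas = np.arange(100, 0, -1)

# Toy stand-in for the real solve: the minimizer of
# ||y - target||^2 + alpha * ||y - x||^2, so a high alpha allows small steps.
target = np.array([1.0, -2.0, 0.5])

def toy_step(x, alpha):
    return (target + alpha * x) / (1.0 + alpha)

result = annealed_registration(toy_step, np.zeros(3), alphas)
```

In the toy problem a large stiffness permits only small, cautious updates, while lower values let the parameters move freely toward the target, which mirrors the global-to-local behavior described above.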

4.3 Pre-alignment and initialization procedures

A simple pre-alignment procedure is performed before non-rigid ICP registration. Misalignment in registration is partly caused by mismatched scales, facing directions and center points of the subjects. To mitigate this, we first scale all scans to the same unit of measurement (meter) as the SMPL model meshes; we then rotate all scans so that they face the same direction. Previous works adopt the mean coordinate of the vertices as the center point and align all meshes to the same center point; we additionally align all samples to the same lowest point (Z-axis). The mean-based center point shifts considerably when parts of a scan are missing, which has a negative effect on registration. A standard point (x, y, 0) is employed as the lowest point for all meshes. After the pre-alignment procedure, all scans and the SMPL models stand on the XY plane and face the Y-axis direction at the same scale.
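
A minimal sketch of the scaling and translation parts of this pre-alignment is given below. It assumes that the horizontal position is centered with the mean of the x and y coordinates and that the lowest vertex is moved to z = 0; the function name `pre_align` and the millimeter input unit are illustrative assumptions, and the rotation to a common facing direction is omitted.

```python
import numpy as np

def pre_align(vertices, scale=1.0):
    """Scale a scan and place it on the XY plane.

    vertices : (N, 3) scan points
    scale    : factor converting the scan unit to meters (assumed known)
    """
    v = np.asarray(vertices, dtype=float) * scale   # new array, input untouched
    v[:, :2] -= v[:, :2].mean(axis=0)               # center horizontally (assumption)
    v[:, 2] -= v[:, 2].min()                        # lowest point at z = 0
    return v

# Toy scan given in millimeters (made-up coordinates).
scan_mm = np.array([[1000.0, 2000.0, 500.0],
                    [1200.0, 2200.0, 2300.0],
                    [800.0, 1800.0, 1400.0]])
aligned = pre_align(scan_mm, scale=0.001)
```

Anchoring the vertical position to the lowest point rather than to the mean makes it independent of how many points are missing higher up on the body.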

The height of the SMPL model is controlled by the first shape parameter \(\beta _1\). To obtain a suitable initial value for \(\beta _1\), we utilize a simple linear function over the heights of the training set scans to estimate the parameter \(\hat{\beta _1} \approx \beta _1\). To initialize the pose parameters, we start from the pose \(\varvec{\theta }\) (on the right in Fig. 3) and iteratively test a number of arm angle shifts to match with the target scan. These initialization procedures aid convergence and improve accuracy, but their effect is not significant.
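
The height-based initialization of \(\beta _1\) can be sketched as a one-dimensional linear fit. The sketch assumes that pairs of subject height and \(\beta _1\) are available from the training set; the helper names and the toy numbers are hypothetical, and the actual relation between height and \(\beta _1\) depends on the body model.

```python
import numpy as np

def fit_beta1_from_height(heights, beta1_values):
    """Fit beta_1 ~ slope * height + intercept by least squares."""
    slope, intercept = np.polyfit(heights, beta1_values, deg=1)
    return slope, intercept

def predict_beta1(height, slope, intercept):
    """Initial estimate of beta_1 for a new scan of the given height."""
    return slope * height + intercept

# Toy training pairs (made-up values lying exactly on a line).
heights = np.array([1.55, 1.65, 1.75, 1.85])      # meters
beta1 = 10.0 * (heights - 1.70)
slope, intercept = fit_beta1_from_height(heights, beta1)
beta1_hat = predict_beta1(1.80, slope, intercept)
```

After pre-alignment the height of a scan could be read off as its largest z coordinate, which is one plausible way to obtain the input to `predict_beta1`.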

5 Anthropometric measurements

The proposed pipeline outputs estimates of the target physical anthropometric measurements from a fitted model (Sect. 4) by first calculating circumference paths through the model points (Sect. 5.1) and then estimating the physical measurements from the path distances by nonlinear regression (Sect. 5.2).

5.1 Surface measurements

Fig. 5 Distances of circumference paths through mesh vertices of a registered SMPL body model are used as features for regression. Multiple paths (dotted red, green and blue lines) are used to estimate a single anthropometric measurement. Example circumference paths: Path 1: NaturalWAIST; Path 2: Hip; Path 3: Thigh; Path 4: Knee (color figure online)

The registration process brings two main benefits: (a) it produces a hole-free mesh without missing body parts and reduces the point cloud noise; and (b) the registered meshes of all subjects share the same topology, which facilitates finding the corresponding vertices of the pre-defined circumference paths.

For each anthropometric measurement \(t_i\), we define a set of surface circumference paths. The path lengths \(t_i^{(1)}\), ..., \(t_i^{(C)}\) are used as features for regression. The paths are defined as sets of vertices in the model \(\mathcal {P}^{c}_i = \left\{ \varvec{v}^{c}_1, \varvec{v}^{c}_2,\ldots , \varvec{v}^{c}_m\right\} \). The length of a circumference path is the sum of the edge lengths along the defined path (Fig. 5). The selected circumference paths were not optimized, but manually set near the true anthropometric measurement locations. It was assumed that multiple paths provide extra robustness to shape deformations (see the ablation study in the experimental part of our work).
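
The path length computation can be sketched as below, assuming that a path is an ordered list of vertex indices forming a closed loop and that consecutive vertices are joined by straight segments. The function name `circumference_length` and the toy vertices are illustrative.

```python
import numpy as np

def circumference_length(vertices, path):
    """Sum of segment lengths along a closed path of vertex indices."""
    pts = np.asarray(vertices, dtype=float)[list(path)]
    # Pair each vertex with its successor; the roll closes the loop.
    edges = np.roll(pts, -1, axis=0) - pts
    return float(np.linalg.norm(edges, axis=1).sum())

# Toy mesh vertices: a unit square plus one vertex that is not on the path.
verts = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [5.0, 5.0, 5.0]])
length = circumference_length(verts, [0, 1, 2, 3])
```

Because all registered meshes share the template topology, the same index list can be reused for every subject, and only the vertex coordinates change.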

5.2 Nonlinear regression

The purpose of a suitable regressor is to find a mapping \(f(\cdot )\) such that

$$\begin{aligned} f(\mathcal {P}_i): \left( t_i^{(1)}, \ldots , t_i^{(C)}\right) \mapsto \hat{t}_i \end{aligned}$$
(7)

where \(\hat{t}_i\) is the estimate of the true anthropometric clothing measurement \(t_i\). The most straightforward solution is the ordinary least squares (linear regression) which finds a solution \(\varvec{\omega } = (\omega _0,\omega _1,\ldots ,\omega _C)^\mathrm{T}\) that minimizes the squared loss over training subjects i

$$\begin{aligned} \sum _i \left( t_i - \varvec{\omega }^\mathrm{T}\varvec{t}_i\right) ^2 \end{aligned}$$
(8)

where \(t_i\) is the ground truth value of training subject i and \(\varvec{t}_i = (1, t_i^{(1)}, \ldots , t_i^{(C)})^\mathrm{T}\) contains the computed circumference path distances for this specific anthropometric measurement, prepended with a constant 1 so that \(\omega _0\) acts as the bias. Linear regression with regularization (ridge regression) minimizes the squared loss together with a penalty on the weights that is scaled by \(\lambda \)

$$\begin{aligned} \sum _i \left( t_i - \varvec{\omega }^\mathrm{T}\varvec{t}_i\right) ^2+\lambda ||\varvec{\omega }||^2 . \end{aligned}$$
(9)

There are also more advanced extensions of linear regression, such as elastic net regression [31], and other learning-based regressors such as support vector regression (SVR) [24]. We compare several popular regression methods in our ablation studies.
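
For the linear baselines, the ridge objective has a closed-form solution, sketched below under two assumptions that should be read as illustrative: the penalty is the squared L2 norm of the weights, and the bias \(\omega _0\) is left unpenalized (a common convention). The function names and the toy data are hypothetical.

```python
import numpy as np

def fit_ridge(T, t, lam):
    """Closed-form ridge regression with an unpenalized bias term.

    T   : (n, C) circumference path lengths, one row per training subject
    t   : (n,) ground truth measurement
    lam : regularization weight lambda (lam = 0 gives ordinary least squares)
    """
    T = np.asarray(T, dtype=float)
    t = np.asarray(t, dtype=float)
    X = np.hstack([np.ones((T.shape[0], 1)), T])   # prepend the bias column
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0                            # assumption: bias not penalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ t)

def predict(T, omega):
    """Estimate measurements from path lengths with fitted weights omega."""
    return omega[0] + np.asarray(T, dtype=float) @ omega[1:]

# Toy data in meters (made-up values generated from a known linear rule).
T_train = np.array([[0.80, 0.90], [0.90, 1.00], [1.00, 1.15],
                    [1.10, 1.20], [1.20, 1.35]])
t_train = 0.05 + 0.6 * T_train[:, 0] + 0.4 * T_train[:, 1]

omega_ols = fit_ridge(T_train, t_train, lam=0.0)
omega_ridge = fit_ridge(T_train, t_train, lam=1.0)
```

Increasing `lam` shrinks the weights on the path lengths, which can help when several paths are strongly correlated, as neighboring circumferences tend to be.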

6 Experiments

6.1 Dataset and settings

We collected a set of 3D scans using the commercial scanner (Sect. 3). The dataset, NOMO3D, consists of 194 male and 181 female scans. For each subject, a clothing expert (tailor) took the actual anthropometric measurements (15 for male and 19 for female subjects). All results are the average performance over fivefold cross-validation.

Method evaluation We employ the mean absolute error (MAE) as the error metric between the ground truth and estimated anthropometric measurements. For each measurement i, the mean absolute error \(\epsilon _i\) over all subjects j was obtained as

$$\begin{aligned} \epsilon _{i} = \frac{1}{|j|} \sum _{j=1}^{|j|} \left| t^{(j)}_{i} - \hat{t}^{(j)}_{i}\right| . \end{aligned}$$
(10)

In addition to the measurement-specific MAEs, we also computed the average MAE over all measurements. All numbers are reported in millimeters (mm). Moreover, for each measurement, we also report the proportion of the test samples for which the error was below the error limits defined in [9] as the success rate.
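
Both metrics can be computed in a few lines, as sketched below. The numbers are made up for illustration, the function name is hypothetical, and counting an error exactly equal to the limit as a success is an assumption.

```python
import numpy as np

def mae_and_success_rate(t_true, t_pred, limit_mm):
    """Return (MAE in mm, fraction of samples with error within limit_mm)."""
    err = np.abs(np.asarray(t_true, dtype=float) - np.asarray(t_pred, dtype=float))
    return float(err.mean()), float((err <= limit_mm).mean())

# Toy neck circumferences in mm (made-up values) and a 6 mm error limit.
truth = [380.0, 402.0, 395.0, 410.0]
pred = [383.0, 400.0, 405.0, 409.0]
mae, success = mae_and_success_rate(truth, pred, limit_mm=6.0)
```

Here the absolute errors are 3, 2, 10 and 1 mm, so the MAE is 4 mm and three of the four samples fall within the limit.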

Computational complexity The most time-consuming part is the non-rigid ICP registration. The Matlab code was adapted from [1] and takes approximately 2 min per scan. The pre-alignment and initialization procedures take less than a second, and the regression is also computationally fast.

Table 1 Average fivefold (\(80\%\) for training and \(20\%\) for testing) performance (mean absolute error) and success rate (a proportion of the test samples within the error limits in [9]) of anthropometric measurements

6.2 Results

The average fivefold errors for each anthropometric measurement and their accuracy thresholds and success rates are shown in Table 1. In all cases, the number of surface measurements was optimized for each anthropometric measurement and the best performing regressor (nonlinear SVR) was used. For both male and female subjects, the best performing measurement was neck circumference with \(93\%\) of test cases below the threshold (6 mm) for men and \(81\%\) for women. The worst performing measurement was ankle circumference, for which success rates of only \(28\%\) for male and \(24\%\) for female subjects were achieved. The error distributions for the male and female neck and ankle circumferences and male chest and female natural waist circumferences are shown in Fig. 6. The distributions reveal that there is a small number of test samples with a large error. It turned out that the main source of large estimation errors is the body scanner, which often misses certain body parts. For example, the feet regions often lack points in the point cloud, which makes the registration fail in these regions (Fig. 7).

Fig. 6 Error distributions illustrating low, moderate and well performing estimates. Top (male): ankle circumference (low), chest circumference (moderate), neck circumference (high); bottom (female): ankle circumference (low), natural waist circumference (moderate) and neck circumference (high). The red vertical lines denote the acceptance thresholds in [9] (color figure online)

6.3 Ablation study

Number of circumference paths In the first ablation study, we investigated the effect of adding multiple surface measurements (circumference paths) to the anthropometric regression. The results for three well and three poorly performing measurements for both male and female subjects are shown in Fig. 8. The results are for the nonlinear SVR regressor with fivefold cross-validation. The most important findings are that additional paths always improve the accuracy and that, depending on the measurement, the results saturate at 3–9 surface circumference paths. In particular, paths close to the physical anthropometric measurement location strongly contribute to the estimation accuracy. The best single paths (\(C=1\)) were also selected using cross-validation, and the results with and without SVR regression are shown in Table 1. These results indicate that (i) multi-path regression is superior to single-path regression and (ii) SVR significantly improves the estimation performance.

Non-rigid ICP To validate the importance of non-rigid ICP, we conducted an experiment where the SMPL model was directly fitted to the point clouds. SMPL parameter optimization was done using the popular L-BFGS-B optimizer [30]. Similar to the non-rigid ICP, the distance term \(E_d\) with the normal direction constraints was used as the target function. The stop criterion was set to \(10^{-6}\) to keep the computation times reasonable, and the same pre-alignment procedure was adopted. The results are shown in Table 1 and are clearly inferior to the proposed non-rigid ICP registration.

Regression methods We compared a number of publicly available regression methods for the regression step. The standard linear regressors were linear regression, stepwise linear regression and ridge regression, and the more recent regression methods were elastic net linear regression, Gaussian process regression (GPR), binary regression decision tree (BRDT), linear support vector regression (SVR) and nonlinear SVR. The mean accuracy and success rates for these methods are shown in Table 2. The results show that even the basic linear regressors (linear regression, ridge regression and stepwise linear regression) perform well, indicating that the proposed registration step performs well. Nonlinear SVR and Gaussian process regression also perform well. They are all safe choices for regressing anthropometric measurements from surface measurements, but we selected the nonlinear SVR due to its best overall performance.

Table 2 Average MAEs and success rates of several regression methods

7 Conclusions

Fig. 7 Two examples of registration failures due to missing points in the scanned point clouds: scanned point cloud (left), model emphasizing the ankle circumference location (middle) and output of the registration process (right) that illustrates the failure cases (color figure online)

Fig. 8 Test set errors (MAEs) as functions of the number of surface measurements (circumference paths) for three well performing (success \(>50\%\)) (top) and three poorly performing (\(< 50\%\)) anthropometric measurements (bottom)

This work introduced a full processing pipeline for estimating physical anthropometric measurements from 3D body scans. The pipeline consisted of a commercial 3D scanner, a deformable SMPL body model, non-rigid ICP-based model registration, computation of circumference path features and nonlinear regression for anthropometric measurement estimation. Depending on the measurement, our pipeline provided success rates from \(28\%\) to \(93\%\) for male and from \(24\%\) to \(82\%\) for female subjects. The proposed pipeline works in practice and shows that an affordable scanning system can be built for the clothing industry.

In future work, we will further investigate and refine each step of the pipeline. Examples include the selection of better surface features in addition to the circumference paths, fast-to-compute alternatives for the slow ICP algorithm (e.g. Chen et al. [6]), and better scanners and scanning procedures.