Introduction

Additive manufacturing (AM) is a relatively new technology, and its most common use is for prototyping purposes. However, during the last decades, AM has found increasing use for the fabrication of functional parts, which has attracted attention to this technology from both researchers and various industries. One of the most attractive features of AM is design flexibility: complex shapes are manufactured directly from a 3D CAD model.

There are several examples of the benefits of AM: creating lighter structures, combining several components into one, and the possibility to customize products for personal use. One area where AM has found much use is the dental industry, where its potential to make dental prostheses adapted to each patient’s anatomy is exploited. This is enabled by the flexibility in design, a lighter product, the ability to add more functionality to the product, and the possibility to customize each product in small production batches.

Nowadays, additive manufacturing is already used to produce end-user products in the electronics, automotive, medical, and aerospace industries (Wohlers 2016; Stoyanov and Bailey 2017). Management and control of variations, however, remain among the challenges of today’s AM processes. “First-time-right” builds and consistency of part properties are major issues that researchers attempt to address in order to make additive manufacturing more attractive for end-user part production.

Dimensional accuracy has already been identified as an important issue in several studies (Baturynska et al. 2018; Caulfield et al. 2007; De Ciurana et al. 2013; Paras et al. 2016; Wohlers 2016; Zhu et al. 2018). Zhu et al. (2018) presented three main mechanisms of error generation in AM processes: the mathematical geometry approximation error (conversion from CAD to STL file), errors due to machine and process parameters, and material-related errors (thermal shrinkage and material distortion). Typically, research focuses on how machine and process parameters influence the shrinkage effect, while the effect of build layout design on dimensional accuracy has not been investigated in the literature.

For example, Singh et al. (2012) reported that the shrinkage effect is connected to such process-related parameters as scan spacing, laser power optimization, bed temperature, and hatch length for a polymer powder bed fusion process (3D Systems). Delgado et al. (2012) also evaluated the significance of process parameters’ effects on dimensional error, surface roughness, and mechanical properties for metal powder bed fusion systems. The authors also reported that research on dimensional accuracy for two metal materials is very limited compared with surface roughness and mechanical properties. In another study, Dingal et al. (2008) statistically investigated the quality of parts fabricated with selective laser sintering (SLS) with respect to various machine parameters, but dimensional accuracy was not addressed.

Other researchers have focused on how part placement (Yang et al. 2002; Zhang et al. 2017) and different build strategies (Senthilkumaran et al. 2009a) may affect dimensional accuracy. Yang et al. (2002) reported that applying Taguchi methods and analysis of variance allowed them to optimize the shrinkage ratio for three orientation groups, which they call the X, Y, and Z orientations.

The investigation of different build strategies performed by Senthilkumaran et al. (2009a) pointed to the importance of contouring and hatching, beam compensation, the inertia of the scanning mirror, scan direction, and compensation of positioning errors with regard to the shrinkage effect. Moreover, the authors highlighted the impact of part orientation on deviations per unit length.

Later, Senthilkumaran et al. (2009b) introduced a new model for shrinkage compensation based on the results and knowledge gained from the previous study. This model compensates for shrinkage “at every layer and at every hatch length, unlike a uniform compensation scheme applied to entire part” (Senthilkumaran et al. 2009b). The results were compared with the compensation suggested by the machine manufacturer, and improvements in dimensional accuracy of approximately 55–62% were observed for the newly developed compensation scheme.

To date, the role of STL model properties (the number of mesh triangles, the number of mesh points, and the surface and volume of the CAD model) with respect to the shrinkage effect is not known. Moreover, compensation of the shrinkage effect is usually performed by applying a single scaling ratio to the whole build layout. In a previous study (Baturynska 2018), the authors already attempted to predict the scaling ratio for each part separately and investigated these parameters in combination with part placement and part orientation. The results of a Pearson correlation test showed that the above-listed parameters are significant with respect to dimensional features. In addition to the central part placement coordinates, the maximal and minimal coordinates were also included in the analysis (see Fig. 11). However, the prediction of dimensional features (thickness, width, and length of the part) required improvement through more advanced techniques.

Therefore, in this paper, the authors describe a preliminary study of four machine learning techniques used to predict dimensional accuracy based on the collected data. The results of a multi-layer perceptron neural network, a decision tree regressor, a gradient boosting regressor, and a support vector regressor are compared with the findings of the previous report, where solely linear regression models were used for prediction (Baturynska 2018).

In order to compare the results of the machine learning techniques with the linear regression models, the data analyzed in this study are the same as in Baturynska (2018). The data were gathered from an EOS P395 polymer powder bed fusion system; more details on the practical experiment and data gathering are given in the “Experimental work and data gathering” section.

This work addresses the following aims:

  • Investigate the effect of part placement, part orientation, and STL model properties on dimensional accuracy by applying more advanced methods, and compare the results with those of the previous study.

  • Develop non-linear models (resulting from the machine learning techniques) for the prediction of thickness, width, and length for each part separately.

  • Compare the performance of non-linear models and linear regression models based on the prediction accuracy and define which one(s) could be used in the future.

  • Discuss how the predicted dimensional features can be used to compensate geometry deviations for every part separately instead of using a scaling ratio in the x, y and z axes for the whole build.

  • Provide recommendations on how proposed models could be used in the future within the manufacturing industry.

The results of this study are also considered a first step towards the development of an intelligent system for quality assurance in additive manufacturing. This system will be used as a decision support tool for designers and operators in manufacturing. The proposed models will be incorporated as separate modules, which would be executed in different orders based on the requirements.

Due to the high importance of mechanical properties in end-user products, improvements in dimensional features may also contribute to improvements in mechanical properties. How this may be achieved is discussed in the “Recommendations” section.

Experimental work and data gathering

An EOS P395 polymer powder bed fusion system was used in the experiment performed to collect the data. Two identical runs were executed in order to evaluate the repeatability of the results for the build layout presented in Fig. 1. By identical runs it is meant that the build layout, material, and process parameters were the same for both runs (for details see Table 1). Polyamide 2200, also known as PA12, was used in both runs with a virgin/aged powder ratio of 50/50%. In order to control the material properties and keep them constant in both runs, the polymer powder was self-aged, with more details presented in the previous study (Baturynska 2018).

Fig. 1

Build layout in Magics 20.0

Table 1 Material and process parameters used in experiment

The results presented by Rüsenberg et al. (2014) were used as a reference. Although the placement and orientation of the specimens were chosen differently, the authors assumed that the build layout should be designed to resemble real manufacturing conditions. Based on this assumption, the maximum number of parts was chosen as the main criterion for the design of the build layout. This means the parts were placed as close to each other as possible, with the minimum distance between specimens set to 5 mm based on recommendations from the Magics 20.0 software. Additional attention was paid to specimens placed in the same orientation for verification and validation of the results. In other words, more than five specimens in the same orientation were placed as close to each other as possible for better control of potential coordinate variations.

In total, 358 specimens were produced in one run (716 specimens for the two runs combined). However, in this paper, data were analyzed from 217 specimens (434 in total) of type ISO 527-2 1BA for mechanical testing. Since the first attempt to predict dimensional features with the help of statistical methods was performed on the data from Baturynska (2018), the same data are used in this work so that the results of linear regression modeling can be compared with the machine learning techniques.

The schematic representation of the software and hardware components is depicted in Fig. 2. This representation shows the main steps the authors performed to develop the predictive models.

Fig. 2

The schematic representation of the process from build layout design to predicted models

Description of specimens’ orientation

All investigated parts were placed in four different orientations (see Fig. 3), and the names of the orientations were defined according to the ISO/ASTM 52921:2013(E) (2013) standard:

  • Group 1 XYZ (XY in Fig. 3)-oriented parts

  • Group 2 XZY (XZ in Fig. 3)-oriented parts

  • Group 3 ZYX (Z in Fig. 3)-oriented parts

  • Group 4 Angle-oriented parts

By Angle-oriented parts, the authors mean parts oriented at \(45^{\circ }\) between the X and Z axes.

Since the design of the experiment required fitting as many specimens as possible, the number of specimens in each orientation differs. Thus, 65 parts (the word “parts” is used as a synonym for “specimens”) are placed in the XY orientation, 24 parts in the XZ orientation, 84 parts in the Z orientation, and 44 in the Angle orientation.

To identify the parts and be able to connect the results of testing and measurements to part placement, every part has its own label, which is placed on two sides of the part. This leads to variations in the number of mesh triangles and in the surface and volume values for each part within the build layout (see Table 2). Therefore, it is critical to evaluate with more advanced methods whether these variations can influence the quality of the parts. In addition, it is important to mention that there is no variation in STL model properties between Run 1 and Run 2, since the same build layout was used.

Table 2 Comparison of STL model data in XZY orientation and all specimens together including values of standard deviation (\(\sigma \))

Data gathering

The data were collected from the two identical runs and are used to evaluate the dimensional accuracy of the produced specimens. Length was measured using a Digital ABS Caliper CoolantProof IP67 with an accuracy of \(\pm \,0.02\) mm. Width and thickness were measured using a Digital Micrometer QuantuMike IP65 with an accuracy of \(\pm \, 1 \upmu \mathrm{{m}}\).

In addition, to minimize measurement error, the final value of each dimensional feature (see Fig. 3) was calculated as the mean of three repeated measurements.

Distributions of the measured thickness, width, and length are shown in Figs. 4, 5 and 6, respectively, with the desired (nominal) values shown as a straight line. Kernel density estimation was used to estimate the probability density function for the illustrated dimensional features. In addition, measurements from Run 1 and Run 2 are presented separately to show the variations between runs.

Machine learning techniques: theoretical background

MLP using backpropagation

A feed-forward multi-layer perceptron (MLP) using backpropagation is one of the established machine learning techniques. This method can be applied to model complex tasks where more conventional mathematical modeling is difficult or unsuitable. The behavior of an MLP neural network can be described in terms of its operational unit, the perceptron. The perceptron takes a set of features as an input vector, typically represented as \(\mathbf {x} \in \mathbb {R}^n\), where n is the number of features. A set of features describing the event that the MLP algorithm is learning to approximate should be collected beforehand, and the corresponding output \(y \in \mathbb {R}\) should also be provided. The algorithm then maps the input values to the output as a function \(f: \mathbb {R}^n \rightarrow \mathbb {R}\). The function f is evaluated based on the sum of the weighted inputs and a bias term, \(\sum _{i=1}^{n} x_i w_i +b\).

The most common MLP is a three-layer neural network that processes information sequentially through an input layer, a hidden layer, and an output layer, schematically represented in Fig. 7. Each hidden unit maps the input layer towards the output layer using an activation function:

$$\begin{aligned} h_j=f\left( \sum _{i=1}^{n} x_i w_{ji}+b_j\right) \end{aligned}$$
(1)

where \(h_j\) is the output of the jth hidden unit, n is the number of inputs, \(w_{ji}\) is the weight (connection link) from the ith input, and \(b_j\) is a bias.

The approximated output is calculated by using the output from Eq. 1 as an input:

$$\begin{aligned} \hat{y_k}=f\left( \sum _{j=1}^{N} h_j w_{kj}+b_k\right) \end{aligned}$$
(2)

where \(\hat{y_k}\) is the approximated value of the kth output unit, N is the number of neurons in the hidden layer, and \(b_k\) is a bias.

Fig. 3

Schematic visualization of the parts’ orientation and dimensional features (where t is thickness, w is width and L is length)

Fig. 4

Distribution of measured thickness for run 1 and run 2 based on kernel density estimation (The nominal value is 2 mm)

Fig. 5

Distribution of measured width for run 1 and run 2 based on kernel density estimation (The nominal value is 10 mm)

Fig. 6

Distribution of measured length for run 1 and run 2 based on kernel density estimation (The nominal value is 75 mm)

Fig. 7

Schematic representation of the three-layer feed-forward backpropagation MLP

The weights are optimized until the difference (e) between the observed and approximated outcomes is minimized:

$$\begin{aligned} e = \underset{w_{ji},b_j, w_{kj}, b_k}{\arg \min } \left( \frac{1}{2}\sum _{k=1}^{m}(y_k-\hat{y_k})^2\right) \end{aligned}$$
(3)

where \(y_k\) refers to the observed outcome, \(\hat{y_k}\) is the approximated outcome, and m is the number of outputs.
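
To make Eqs. 1–3 concrete, the following minimal numpy sketch performs a single forward pass and evaluates the resulting error; the layer sizes match those reported later in this work, while the random weights, the input vector, and the identity output activation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n, N = 13, 11                     # number of inputs and hidden units (sizes used later in this work)
x = rng.normal(size=n)            # one input vector x in R^n
y = 1.0                           # observed outcome y_k (illustrative)

W1 = rng.normal(size=(N, n))      # hidden-layer weights w_ji
b1 = np.zeros(N)                  # hidden-layer biases b_j
w2 = rng.normal(size=N)           # output weights w_kj
b2 = 0.0                          # output bias b_k

f = lambda z: np.maximum(z, 0.0)  # 'relu' activation function

h = f(W1 @ x + b1)                # Eq. 1: h_j = f(sum_i x_i w_ji + b_j)
y_hat = w2 @ h + b2               # Eq. 2 (identity output activation, typical for regression)
e = 0.5 * (y - y_hat) ** 2        # squared error minimized over weights and biases (Eq. 3)
```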

The collected dataset is randomly divided into training and testing sets at a ratio of 70/30%. The MLP neural network is trained on the training dataset, while testing is performed on the testing dataset.

Decision tree regressor

A decision tree is another of the machine learning techniques used in this work. Typically, this method is applied to classification tasks, but it can also be applied to regression tasks. In contrast to an artificial neural network, which acts as a black box, a decision tree is an open and easily interpretable method.

For a given training vector \(\mathbf {x} \in \mathbb {R}^n\) (where n is the number of features) and training labels \(\mathbf {y} \in \mathbb {R}^l\) (where \(i=1,2,\ldots, l\) indexes the labels), the regression tree algorithm recursively partitions the feature domain into smaller regions (separate classes). It is important to choose correct metrics for the best data split and for determining when a tree node should become terminal.

Since the decision tree algorithm is used in this work for a regression task, the target is a continuous value. Thus, for node m, which represents a region \(R_m\) with \(N_m\) observations, the mean squared error (MSE) or the mean absolute error (MAE) are possible regression criteria for minimizing the impurity function H() and for determining the locations of future data splits. For MSE, minimization of the error can be done by using the mean values of the terminal nodes (Smola and Schölkopf 2004):

$$\begin{aligned} H(X_m)= \frac{1}{N_m}\sum _{i=1}^{N_m} (y_i -\tilde{y}_m)^2 \end{aligned}$$
(4)

and for MAE:

$$\begin{aligned} H(X_m)=\frac{1}{N_m}\sum _{i=1}^{N_m} |y_i -\tilde{y}_m| \end{aligned}$$
(5)

where \(X_m\) is the training data in node m.

However, when it comes to the analysis of large amounts of data, this method has issues with scalability, stability, and robustness (Aluja-Banet and Nafria 2003; Kotsiantis 2013). Another issue that should be addressed is the increase in complexity when large data samples are used. The total number of nodes, the total number of leaves, the tree depth, and the number of attributes are metrics that can be controlled in order to minimize the complexity of a decision tree (Kotsiantis 2013). Since these issues cannot always be addressed, ensembles of decision trees, which are more robust, are often used instead.
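
As a minimal illustration of the regression tree described above, the sketch below fits scikit-learn’s DecisionTreeRegressor with the MSE criterion of Eq. 4 on toy data; the data and the complexity limits are illustrative assumptions (older scikit-learn releases name the criteria ‘mse’ and ‘mae’ instead of ‘squared_error’ and ‘absolute_error’).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 13))                  # 13 input parameters, as in Table 3
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=200)    # continuous target (illustrative)

# criterion="squared_error" minimizes the MSE impurity of Eq. 4;
# "absolute_error" would minimize the MAE impurity of Eq. 5 instead.
# max_depth and min_samples_leaf bound the complexity metrics discussed above.
tree = DecisionTreeRegressor(criterion="squared_error",
                             max_depth=5, min_samples_leaf=3)
tree.fit(X, y)
print(tree.tree_.node_count, tree.get_depth())   # total nodes and tree depth
```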

Gradient boosting regressor

The gradient boosting regression machine learning method can be described as an ensemble of decision trees (see Fig. 8). Instead of building one tree, this method predicts the desired outcome using an additive regression model with decision trees as weak learners (Ye et al. 2009). At each iteration, a parameterized function (base learner) is sequentially fitted to the current “pseudo”-residuals by optimizing a regression loss (e.g., least squares or absolute error) (Friedman 2002). Friedman (2002) defines the “pseudo”-residuals as the negative gradient of the loss function with respect to the values of the regression model at each training data point for the current step.

Fig. 8

Schematic representation of gradient boosting regression with regard to algorithm iterations

Introducing randomization into the selection of the training data improves accuracy and reduces the possibility of overfitting. This way of compiling the ensemble minimizes the error at each successive step, and therefore the gradient boosting regressor is considered a more reliable and robust method compared to the classic decision tree regressor.
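
A minimal sketch of such an ensemble with scikit-learn follows; the toy data are reused from the decision tree sketch, and subsample < 1.0 stands in for the randomized training-set selection described above (the squared-error loss is named ‘ls’ in older scikit-learn releases).

```python
from sklearn.ensemble import GradientBoostingRegressor

# Each of the n_estimators trees is fitted to the current "pseudo"-residuals,
# and its contribution is shrunk by learning_rate; subsample < 1.0 introduces
# the randomization in training-set selection mentioned above.
gbr = GradientBoostingRegressor(n_estimators=100,
                                learning_rate=0.05,
                                loss="squared_error",
                                subsample=0.8)
gbr.fit(X, y)  # X, y as defined in the decision tree sketch above
```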

Support vector regression

Support vector regression (SVR) is a type of support vector machine technique that tackles regression tasks. This machine learning method is less sensitive to the dimensionality of the input and has a greater ability to achieve a low generalization error of the regression model (Drucker et al. 1997; Gunn et al. 1998). Gunn et al. (1998) explain that the better generalization is due to minimization of an upper bound on the expected risk, the so-called structural risk minimization (SRM) principle, whereas minimizing the error on the training data (the empirical risk minimization principle) is otherwise typical. The former principle is employed by support vector machines, while neural network algorithms apply the latter.

Assuming the input data are presented as a space of input patterns, the main goal of SVR is to find a function \(f(x) = \langle w, x \rangle + b\) that deviates from the observed outcomes \(y_i\) by at most a given margin for all training data and, at the same time, is as flat as possible (Smola and Schölkopf 2004). Flatness here means finding the smallest weight w (Smola and Schölkopf 2004):

$$\begin{aligned} \text {minimize} \quad \frac{1}{2} \Vert w\Vert ^2 \end{aligned}$$
(6)

where \(w \in \chi \) and \(b \in \mathbb {R}\).

Different kernel functions (e.g., linear, polynomial, radial basis function, or sigmoid) are used to map the inputs into high-dimensional feature spaces. In addition, the tuning of the algorithm parameters is a critical task for SVR performance; one should therefore pay attention to the choice of the training data subset and to the parameters defined in the kernel functions.

Model evaluation with 5-fold cross-validation

Typically, k-fold cross-validation (CV) is a process in which all data are randomly split into k folds (in our case \(k=5\)); the model is then trained on \(k-1\) folds, while the remaining fold is left out to test the model (an example is illustrated in Fig. 9). This procedure is repeated k times. In this work, however, all data are first split into training and testing datasets, and only the training dataset is used for cross-validation. The repeated cross-validation technique is used to estimate the models’ accuracy and 95% confidence intervals (CI) (Vanwinckelen and Blockeel 2012).

The 5-fold cross-validation is repeated 50 times for each model. The average over all repetitions is used as the CV accuracy, and the 95% confidence intervals are calculated from the results of the repeated cross-validation. The final evaluation of model performance is conducted by checking whether the testing accuracy lies within the 95% CI. If it does, the model is considered acceptable; if the testing accuracy is outside this range and the difference is significant, underfitting or overfitting is considered to be present. The testing data form an isolated dataset that is not used in the cross-validation procedure.
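
The sketch below shows one way to implement this procedure with scikit-learn; the percentile-based confidence interval and the placeholder names (model, X_train, y_train, X_test, y_test) are assumptions, not the exact implementation used in this work.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold, cross_val_score

# 5-fold CV repeated 50 times on the training data only; the isolated
# testing dataset is never used during cross-validation.
rkf = RepeatedKFold(n_splits=5, n_repeats=50, random_state=0)
scores = cross_val_score(model, X_train, y_train, cv=rkf, scoring="r2")

cv_accuracy = scores.mean()                            # average over all repetitions
ci_low, ci_high = np.percentile(scores, [2.5, 97.5])   # empirical 95% CI
test_accuracy = model.fit(X_train, y_train).score(X_test, y_test)

# The model is considered acceptable if the testing accuracy lies in the CI.
acceptable = ci_low <= test_accuracy <= ci_high
```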

Fig. 9

Description of 5-fold cross-validation

Architecture of used machine learning techniques

The development of the machine learning models is conducted in several steps. First, data preprocessing is performed in two stages: (i) cleaning of the collected data and (ii) normalization of the investigated features, which are described in the “Description of investigated parameters” section. More details about data preprocessing are given in the following section, while the data analysis pipeline is illustrated in Fig. 10. The preprocessed data are split into training and testing datasets to evaluate the generalization ability of the model and to detect overfitting. If the prediction accuracy (the determination coefficient \(R^{2}\)) obtained during model training is significantly larger than the testing prediction accuracy, overfitting is present and the model should be retrained with new hyperparameters. In addition, k-fold cross-validation is typically used to detect overfitting. The resulting model architectures are described in the “MLP using backpropagation”, “Decision tree regressor”, “Gradient boosting regressor”, and “Support vector regression” sections.

Fig. 10

Representation of the data analysis pipeline

Data preprocessing

Data analysis always requires clean and normalized data. This step is especially important when the parameters’ values span different ranges. The application of machine learning requires normalization of the features in the training data. In this study, the impact of 13 different parameters on three dimensional features (thickness, width, and length) is investigated (see Table 3). For example, thickness has values of ca. 1.8–2.5 mm, while the number of mesh triangles starts at ca. 1200 and increases up to ca. 7000. Parameter values spanning such different ranges have to be scaled to zero mean and unit variance.

Table 3 Investigated parameters

The work underlying this paper is based on SPSS Statistics (Pearson correlation test) and Scikit-learn (Pedregosa et al. 2011). The authors performed the data preprocessing for the Scikit-learn applications. When all orientations are analyzed as one dataset, the original data are split into training (347 samples) and testing (87 samples) sets using train_test_split.

However, each orientation group has a different number of data points for training and testing. For the XYZ orientation group, the training set consists of 104 samples and the testing set of 26 samples. For the XZY orientation, there are 38 samples in the training set and ten samples in the testing set. For the ZYX orientation, there are 134 and 34 samples in the training and testing sets, respectively, while for the Angle orientation these numbers are lower: the training set consists of 70 samples and the testing set of 18 samples. Before training the models, the training data are scaled to zero mean and unit variance using StandardScaler.
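
A minimal sketch of this preprocessing with scikit-learn follows; test_size=0.2 reproduces the reported 347/87 split of the 434 samples in the joint dataset, while random_state and the variable names are illustrative assumptions.

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split the joint dataset (434 samples) into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Fit the scaler on the training data only, then scale both sets to
# zero mean and unit variance.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
```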

The training and testing of the models were performed with Scikit-learn in a conda environment on macOS.

Description of investigated parameters

In order to evaluate the performance of the proposed prediction models, the parameters used to predict thickness, width, and length need to be described. All parameters listed in Table 3 can be clustered into two groups. The first group corresponds to the STL model properties (number of mesh triangles, number of mesh points, surface, and volume), while the second group describes the part placement in the build chamber with respect to the build chamber’s global coordinate system (world coordinate system, WCS).

Since the Magics 20.0 software provides information about part placement in terms of central, minimal, and maximal coordinates (see Fig. 11), it is important to define what these coordinates mean. The minimal coordinate corresponds to the point on the part that is closest to the origin of the WCS of the build chamber, while the maximal coordinate describes the position on the part that is farthest from the origin of the WCS.

Fig. 11

Example of part placement description in the build chamber through maximal, central and minimal coordinates for Angle and XZY orientation using Magics 20.0

Figure 11 illustrates how the minimal and maximal coordinates are defined for a part placed in the build chamber in two different orientations. For the part in the XZY orientation, the difference between the maximal and minimal coordinates corresponds to its dimensional features. However, the distance between the maximal and minimal coordinates for parts in the Angle orientation does not correspond to the value of a dimensional feature. In other words, the part placement coordinates describe the part placement in the build, not the dimensional features of the parts.

MLP using backpropagation

The multi-layer perceptron algorithm using backpropagation was applied both to the full dataset and to each orientation group separately. Model parameter optimization was done by evaluating different combinations of the model’s hyperparameters. The number of nodes in the hidden layer was varied from 2 to 27, the ‘relu’ and logistic activation functions were considered, and ‘lbfgs’ (a weight optimizer from the family of quasi-Newton methods) was used as the solver for weight optimization.

In this work, the architectures of the MLP models that outperformed the other methods are the following: for thickness, the model consists of one hidden layer with 11 nodes and the relu activation function; for length, the model consists of 17 nodes and the logistic activation function. The 13 parameters are used as input and one dimensional feature as output (see Table 3).
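
In scikit-learn terms, these two best-performing configurations correspond to the sketch below; leaving the remaining hyperparameters at their defaults is an assumption, and X_train, y_train are the scaled training data from the preprocessing step.

```python
from sklearn.neural_network import MLPRegressor

# 13 (scaled) input parameters, one dimensional feature as output.
mlp_thickness = MLPRegressor(hidden_layer_sizes=(11,),
                             activation="relu", solver="lbfgs")
mlp_length = MLPRegressor(hidden_layer_sizes=(17,),
                          activation="logistic", solver="lbfgs")

mlp_thickness.fit(X_train, y_train)  # here y_train holds the thickness measurements
```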

Decision tree regressor

The architecture of a decision tree is complex and differs with the size of the input. For example, for the XYZ orientation group the number of nodes is 111, while the decision tree for the XZY orientation has 43 nodes; these numbers were defined automatically by the algorithm. When the authors set a limit, the performance of the algorithm was unsatisfactory, with low prediction accuracy (close to 0); allowing the algorithm to define the number of nodes and leaves based on the provided data therefore led to improvements in algorithm performance.

It is also observed that for larger inputs the decision tree consists of a larger number of nodes. Since the algorithm was tuned manually by the authors in this work, one of the future tasks will be to optimize the decision trees by defining hyperparameters for robust and satisfactory performance. This task is complicated and therefore out of the scope of the current article.

Gradient boosting regressor

The gradient boosting regressor (GBR) has an even more complicated architecture than the decision tree regressor, and therefore the numbers of nodes and leaves do not provide enough information about the algorithm. However, the initial parameters are critical for algorithm performance. Optimization of the hyperparameters for the GBR models was conducted by considering different sets of hyperparameters. The number of estimators was chosen from the list [5, 8, 9, 10, 11, 14, 20, 25, 30, 50, 80, 100, 150, 200, 250, 300], the learning rate was chosen from the list [0.01, 0.001, 0.05, 0.1], and three loss functions were considered, namely ‘ls’ (least squares), ‘lad’ (least absolute deviation), and ‘huber’ (a combination of ‘ls’ and ‘lad’).

In this work, the number of iterations was set to 100/150/100 (for thickness in XYZ, thickness in Angle, and length in ZYX orientations, respectively). The maximum depth was chosen to be 3, and the minimum samples per split was set to 2. The learning rate was chosen as 0.05. The loss function was set to ‘ls’ for the thickness model in the XYZ orientation, while the ‘lad’ loss function was chosen for the thickness model in the Angle orientation and the length model in the ZYX orientation.
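
One way to search over the hyperparameter lists given above is an exhaustive grid search, sketched below; the use of GridSearchCV with \(R^{2}\) scoring is an assumption (the paper does not name the search procedure), and the loss names follow the scikit-learn version used in this work.

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [5, 8, 9, 10, 11, 14, 20, 25, 30, 50,
                     80, 100, 150, 200, 250, 300],
    "learning_rate": [0.01, 0.001, 0.05, 0.1],
    "loss": ["ls", "lad", "huber"],   # renamed in recent scikit-learn releases
}
search = GridSearchCV(
    GradientBoostingRegressor(max_depth=3, min_samples_split=2),
    param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)
print(search.best_params_)
```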

Support vector regression

The main parameters used to define the support vector regression algorithm are the type of kernel (the radial basis function kernel is used in this work) and the kernel cache size (the default value of 200 MB was used due to RAM limitations). In addition, two kernel-related coefficients should typically be optimized while training a support vector machine. The parameter C is responsible for trading off misclassifications and is set to 1.0 in this work. The parameter gamma controls how much influence each training sample has, and it is defined automatically by the algorithm.
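
These settings translate into the following scikit-learn sketch; gamma="auto" mirrors the automatic definition described above and was the default in older scikit-learn releases, while the training data names are assumptions carried over from the earlier sketches.

```python
from sklearn.svm import SVR

svr = SVR(kernel="rbf",      # radial basis function kernel
          cache_size=200,    # kernel cache size in MB
          C=1.0,             # misclassification trade-off parameter
          gamma="auto")      # influence of each training sample, defined from the data
svr.fit(X_train, y_train)
```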

Significance of parameters with regards to dimensional features

Two techniques are used in this work to estimate the significance of the parameters. The first is the Pearson correlation test, which is used for the linear models. The second is the feature importance property incorporated into the machine learning algorithms described earlier. Since the gradient boosting regressor is compared with the Pearson correlation test results, it is important to describe in more detail how feature importance in the gradient boosting regressor works.

As described earlier, the GBR is an ensemble of decision trees. Feature importance is derived from evaluating how much each parameter contributes to predicting the response, and it is calculated separately for each parameter. The evaluation consists of two main stages: (1) evaluating which parameters provide the maximal prediction improvement at each node, and (2) averaging each parameter’s contributions over the nodes where it contributed the most, across all trees in the machine learning model (Hastie et al. 2009). The parameter with the highest score is assigned 100% relative importance, and the relative importance of the other parameters is scaled accordingly.
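
As a sketch of how these relative importances can be reproduced, the snippet below rescales scikit-learn’s feature_importances_ (which are normalized to sum to one) so that the top parameter reads 100%; it assumes a fitted gradient boosting model such as gbr from the earlier sketch.

```python
import numpy as np

importances = gbr.feature_importances_           # normalized to sum to 1
relative = 100.0 * importances / importances.max()  # top parameter scaled to 100 %
ranking = np.argsort(relative)[::-1]             # parameters ranked by relative importance
```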

Orientation-based significance of parameters

Distribution of specimens in the build chamber

Since it is already well known from previous studies (Caulfield et al. 2007; Dingal et al. 2008; Hur et al. 2001; Lee et al. 2014; Senthilkumaran et al. 2009a) that the orientation of specimens has a significant impact on part quality, a more detailed description of how the specimens are distributed in the build chamber is an important step in the analysis of the results. Four orientation groups were introduced in the “Experimental work and data gathering” section: the XYZ-orientation group shown in Fig. 12; the XZY-orientation group illustrated in Fig. 13; the ZYX-orientation group shown in Fig. 14; and the Angle-orientation group presented in Fig. 15.

The concentration of specimens in the build chamber differs for each orientation group. As can be seen from Figs. 12, 13, 14 and 15, specimens from the XYZ- and ZYX-orientation groups are positioned mainly in the center of the build chamber, with some on the sides, whereas specimens from the XZY- and Angle-orientation groups are mostly placed close to the sides of the build chamber.

The main benefit of positioning the specimens in this way is the possibility of fitting in a larger number of specimens, thus reducing their cost and gaining more information about different regions of the build chamber. However, this uneven distribution of the specimens can influence the dimensional accuracy of each specimen and of the whole build in general. In regions where samples are more concentrated, the cooling time will be longer than in areas where specimens are spread at a larger distance from each other. Thus, different temperature distributions and cooling times for different groups of specimens can lead to different shrinkage and expansion effects (more details are presented in Baturynska 2018).

In order to compare which parameters are significant for minimizing geometry variations for each specimen within one build layout, the collected data are analyzed both per orientation group and without division into orientation groups (when all orientation groups form a combined dataset).

In the following analysis of the results, the authors use the abbreviations XYZ, XZY, ZYX, and Angle to describe the orientation groups. The axes of the coordinate system are denoted as the x, y, and z axes. When the minimal, maximal, and central coordinates have the same meaning, the term “all x, y and z coordinates” is used as a synonym. At the same time, min_coordinate, max_coordinate, and cent_coordinate are shorter names for the minimal, maximal, and central coordinates, respectively.

The orientation-based description of the significance of the investigated parameters is graphically illustrated only for those models that achieved a prediction accuracy higher than 50%.

Fig. 12

Representation of specimens’ placement in the build chamber in XYZ orientation for one run (Magics 20.0)

Fig. 13

Schematic representation of specimens’ placement in the build chamber in XZY orientation for one run (Magics 20.0)

Fig. 14

Representation of specimens’ placement in the build chamber in ZYX orientation for one run (Magics 20.0)

Fig. 15

Representation of specimens’ placement in the build chamber in Angle orientation for one run (Magics 20.0)

Width

Dimensional features depend on both orientation and position in the build chamber. Moreover, in the previous study, the results of a Pearson correlation test showed that in addition to part positioning, the STL model properties are also significant parameters. However, the linear regression models developed from the Pearson correlation test results achieved low prediction accuracy, so the correlation between the investigated parameters and the dimensional features is not entirely reliable. In order to better understand how to control and minimize geometry variation through build layout design, more advanced methods are applied in this work.

Since the prediction accuracy for the width dimensional feature in the ZYX and Angle orientations is below 50% for both the linear and the machine learning methods, the significance of parameters for these orientations with regard to width is not presented. The low prediction accuracy means that the models are not reliable, and thus the obtained feature importances for those models are not reliable either.

However, the prediction of width in the XYZ and XZY orientations is more reliable due to the higher prediction accuracy. For example, the Pearson correlation test results (see Fig. 16a) indicate that all coordinates in the y axis are significant to the same level and are the most significant parameters. The gradient boosting regressor also highlighted the importance of the y coordinates (see Fig. 16b), but in a different way: the minimal y coordinate is shown as the most significant, followed by the central and maximal y coordinates. The minimal y coordinate corresponds to the left side of the specimen, which is positioned closer to the sides of the build chamber, where the cooling process is faster than for other specimens. This could be why the machine learning method ranked this coordinate above the central and maximal y coordinates. However, to evaluate this assumption, additional experimental work, where specimens are positioned differently, is required.

The reason why all y coordinates in the XYZ orientation were chosen as the most important can be explained by the similarity between the coordinates and the dimensional features. In other words, for the XYZ orientation, the distance between the y coordinates can also be seen as a description of the width dimensional feature (see Fig. 3). At the same time, the distance between the maximal and minimal coordinates defines the region where energy is applied to each specimen. The concentration of energy on different sides of the specimen will influence the cooling process and can thus lead to geometry variations.

Similarly, for the XZY-orientation group the width can be described as the difference between the z coordinates. The significance of the parameters based on the Pearson correlation test, depicted in Fig. 17a, is easier to understand and describe, especially by following the same principle as for the XYZ-orientation group: the z coordinates define the width dimensional feature as the difference between the minimal and maximal z coordinates (see Fig. 3). However, the gradient boosting regressor could lead to new knowledge about the process. For example, the importance of the maximal y coordinate and the minimal x coordinate could be explained by the hatching line distribution. To elaborate on such an assumption, more experimental work with a focus on hatch line distribution needs to be done in the future.

In addition to all z coordinates in the XZY orientation, the number of mesh triangles was selected as a significant parameter. For example, Hu (2017) and Adnan et al. (2018) reported that STL model properties such as surface, volume, and the number of mesh triangles are important parameters for the slicing process; they thus influence the way energy is applied to the material and how the object is solidified. The authors assume that the slicing process and energy distribution have an impact on the geometry deviations, and that the STL model properties, in combination with the coordinates, provide an additional description of these phenomena without prior knowledge of the scanning strategy used in the AM process.

Fig. 16

Significance of parameters for width in XYZ orientation: a Pearson correlation test results, b Gradient boosting regression results

Fig. 17

Significance of parameters for width in XZY orientation: a Pearson correlation test results, b Gradient boosting regression results

Length

Similarly to the width dimensional feature, the significance of parameters for length is presented only for the orientations with a prediction accuracy higher than 50%. Therefore, the ZYX- and Angle-orientation groups are analyzed for the length dimensional feature.

As can be seen from Fig. 18, both the Pearson correlation test and the gradient boosting regressor point to the z coordinates as the most significant parameters. While the minimal, maximal, and central z coordinates are equally important according to the Pearson correlation test, the gradient boosting regressor highlights the central z coordinate as the most important. The maximal and minimal z coordinates are still highly significant, with a small difference (ca. 5–10%) between them that can be neglected. Moreover, the prediction accuracy is 91.8% for the linear regression model (based on the Pearson correlation test results) and 92.25% for the gradient boosting regressor, which means that more than one model can be used to predict the length dimensional feature. The order of significance of the other parameters can likewise be neglected.

All z coordinates could be considered as parameters that define the layer distribution within a specimen and may therefore influence the length dimensional feature the most. All other parameters are also connected to the slicing process and energy distribution, following the principles described above for width.

However, the results of the Pearson correlation test and the decision tree regression for the Angle-orientation group differ, as illustrated in Fig. 19a, b, respectively. While all z coordinates are the most important parameters according to the Pearson correlation test, the central x coordinate is the most significant according to the decision tree regressor. This difference could be caused by different principles related to the x and z axes, namely the distribution of the hatch lines (along the x axis) and the slicing of the specimens into layers (along the z axis). Since the number of specimens is low and they are concentrated in one area of the build chamber, more experimental work needs to be performed in the future for a better understanding of which parameters should be considered during build layout design. The observations regarding the STL model properties are similar and also require more experimental work.

Fig. 18

Significance of parameters for length in ZYX orientation: a Pearson correlation test results, b Gradient boosting regression results

Fig. 19

Significance of parameters for length in Angle orientation: a Pearson correlation test results, b Decision Tree regression results

Thickness

The orientation-based description of the importance of the investigated parameters is similar for each dimensional feature. The main reason lies in the physical principles of the AM process. Similarly to width, the thickness dimensional feature in the XYZ- and XZY-orientation groups is related to the z coordinates (XYZ orientation, Fig. 20a) and the y coordinates (XZY orientation, Fig. 21a), respectively. Each of these can define the dimensional feature as the difference between the minimal and maximal coordinates in the corresponding axis.

The importance of the y coordinates for thickness in the ZYX orientation (see Fig. 22a) has the same explanation as for the XYZ- and XZY-orientation groups. However, the gradient boosting regressor ranks the number of mesh triangles (mesh points are another way of describing mesh triangles) and the volume STL model property above the other parameters. This can be related to the importance of contours, and is thus also associated with the definition of the energy deposition boundaries for each specimen. It is also important to pay attention to the prediction accuracy of the models: the prediction accuracy for the linear regression model is lower than 0, while the prediction accuracy for the gradient boosting regressor is 55.82%, which is still a relatively small value. Therefore, more experimental work is required to achieve more reliable results.

Explaining the importance of the parameters for thickness in the Angle-orientation group is a more complicated task than for the other orientations. This is because the dimensional features cannot be explained as the difference between the minimal and maximal coordinates. Moreover, the energy deposition area at each layer is much smaller for this orientation than for the others. Therefore, both the x and y coordinates are important at almost the same level according to the Pearson correlation test (Fig. 23a) and the gradient boosting regressor (Fig. 23b).

Even though the STL model parameters are less important compared with the previously described results, the authors assume that for the Angle orientation the positioning of the sample in terms of the x and y coordinates is the most sensitive to the hatch line distribution, followed in importance by the layer distribution (z coordinates).

Fig. 20

Significance of parameters for thickness in XYZ orientation: a Pearson correlation test, b Gradient boosting regression

Fig. 21

Significance of parameters for thickness in XZY orientation: a Pearson correlation test results, b Decision Tree regression results

Fig. 22

Significance of parameters for thickness in ZYX orientation: a Pearson correlation test results, b Gradient boosting regression results

Fig. 23

Significance of parameters for thickness in Angle orientation: a Pearson correlation test results, b Gradient boosting regression results

Significance of parameters without division for orientations

When the significance of the parameters is evaluated per orientation group, the performance of the proposed models decreases due to the limited amount of data. Joining all orientations into one dataset thus makes it possible to improve model performance. Moreover, machine learning algorithms can find the correlation between the dimensions and the orientation groups by themselves. The prediction accuracies of the different ML methods are listed in Table 4; for all dimensions, the gradient boosting regressor outperformed both the linear regression and the decision tree regressor. Therefore, the significance levels of the parameters proposed by this method can be used in the future by designers and additive manufacturing machine operators.

Width

Figure 24b shows that for the width dimensional feature all parameters are significant to some extent, and this result is also observed in the Pearson correlation test (see Fig. 24a). However, based on the prediction accuracies of the regression model and the gradient boosting regressor, the predictions of the latter are considered more accurate. According to the gradient boosting regressor, the maximal x coordinate is the most important parameter for the prediction of width. Next in the list are the volume, the minimal x and y coordinates, and the surface parameter, followed by the central and maximal y coordinates. The next most important parameters are the number of mesh triangles, the maximal z coordinate, and the central x and z coordinates.

Fig. 24

Predicted significance of the parameters on width: a Pearson correlation test results, b Gradient boosting regressor

Length

As for the width dimensional feature, the gradient boosting regressor outperformed the regression model based on the Pearson correlation test results (see Fig. 25a) in terms of prediction accuracy.

According to the gradient boosting regressor, the parameter significances for length and thickness have a different character compared with the results for width. While all parameters are significant for width, only a subset of parameters is reported as significant for length and thickness. Thus, length depends on the maximal z coordinate, the minimal x and z coordinates, and the volume and surface STL model properties. The significance level is defined by the listed order: the first parameter in the list has a relative importance of 100%, and the last one a relative importance of 20%. The significance levels of the other parameters for length are shown in Fig. 25b.

Fig. 25

Predicted significance of the parameters on length: a Pearson correlation test results, b Gradient boosting regressor

Thickness

Even though the significance of the parameters for thickness is similar for the regression model and the gradient boosting regressor (see Fig. 26), the latter has a higher prediction accuracy. Therefore, the predictions of this method are considered more accurate.

The gradient boosting regressor shows that, for the chosen model, the minimal x coordinate is 100% important, followed by the minimal z coordinate and the maximal x coordinate, with relative importances of ca. 62% and 53%, respectively. The surface and volume parameters should be considered next, followed by the maximal and central z coordinates. All other parameters have a relative importance of less than 20%, as can be seen in Fig. 26b.

These results suggest that, in this build layout, the placement along the x and z axes should be defined first, and STL model properties such as surface and volume should be optimized at the design stage. These changes may help to control variations. Still, the authors assume that using machine learning techniques to predict dimensional features before a build will help to decrease variation, since it will be possible to optimize part placement and part orientation.

Fig. 26

Predicted significance of the parameters on thickness: a Pearson correlation test results, b Gradient boosting regressor

Comparison of the predictive models for each dimensional feature separately

A comparison of the predictive models for width, thickness, and length is conducted at two levels: (i) orientation-based analysis and (ii) joint data analysis (without separation into orientations). The testing accuracies, in terms of the determination coefficient calculated for the testing dataset, are presented for all proposed models in Table 4.

Table 4 Evaluation of prediction (testing) accuracy of the methods: MLP (BP) stands for multi-layer perceptron using backpropagation, DTR for decision tree regressor, GBR for gradient boosting regressor, and SVR for support vector regressor

As can be seen from Table 4, the performance of the predictive models in the orientation-based modeling depends on both the dimensional feature and the orientation group. While it is common knowledge that machine learning techniques require a large number of data points, the correlation between the chosen features and the predicted outcome is also an important factor. For instance, the prediction of thickness in the XZY orientation, despite the small number of data points, has a high prediction accuracy, while the predictive models for width and length in the XZY orientation are not satisfactory for any of the methods used. On the one hand, the small number of data points could be one reason for this observation; on the other hand, the thickness prediction achieves a high determination coefficient even with this number of data points. Since a small number of data points carries a higher risk of overfitting, a more detailed overview of the results is needed.

Therefore, the cross-validation prediction accuracies are compared with the testing prediction accuracies for the models with a testing prediction accuracy higher than 0.6 (60%). In addition, the root mean square error is used as a supporting metric for the comparison of model performance. The cross-validation accuracies are used to avoid overfitting issues and to evaluate the generalization ability of a model.

A comparison of the linear regression models with machine learning models is described for width, thickness, and length in the corresponding sections below.

Prediction of width dimensional feature

The performance of the predictive models for width is unsatisfactory for most of the orientation groups for all methods. Possible reasons are the presence of noise in the data and the small number of data points in each orientation group. Even when a large dataset is available, identifying a clear signal (pattern) in the data is typically a hard task for any model, so the small datasets available for modeling affect the models’ performance. In addition, dividing the collected data into orientation groups eliminates the information about orientation as an important factor (Caulfield et al. 2007; Senthilkumaran et al. 2009a), which leads to unsatisfactory model performance except in cases where other parameters have a strong correlation with the width. While all methods show almost random performance, only the GBR model has a relatively high testing prediction accuracy (0.6746 in Table 4). However, when comparing the cross-validation results with the testing prediction accuracy for the GBR in the XZY orientation, one can see from Table 5 that the proposed model has a testing accuracy outside the confidence interval. Even though the testing accuracy is relatively high, this model should not be used in the future due to insufficient generalization ability.

When all orientation groups are joined into one dataset, the performance of the predictive models for width changes drastically. This change could be connected to the earlier statement about the importance of orientation.

As can be seen from Table 4, the MLP, GBR, and linear regression models have the best performance among all investigated methods. Due to the nature of the chosen ML methods, one machine learning method works better than the others depending on the task. For instance, the decision tree method typically works well for classification tasks, while an ensemble of decision trees, such as GBR, works better for regression tasks. Therefore, a comparison of the predictive models is conducted only for the methods with the best performance and is shown in Table 5.

Table 5 Comparison of the predictive models for width

The developed linear regression model is compared to the MLP and SVR models. The testing accuracy is outside the confidence range for each model and is lower than the average CV accuracy, meaning that overfitting is present. Tuning the hyperparameters did not help to avoid this issue. However, the difference between the CV and testing accuracies for the regression model is small, and thus this model can be considered acceptable.

In the future, all models need to be improved, since the random sampling of the training and testing data affects the models’ performance. Even though 347 data points are used for training the models, this is still a relatively small dataset. Therefore, the predictive models for width should be further improved by taking into account more information about the material and the AM process. More experimental work is required to increase the size of the dataset.

Prediction of thickness dimensional feature

The predictive models for thickness in the different orientation groups perform better compared with the results obtained for width. Table 4 shows that both the linear and the machine learning models achieve relatively good performance even for the XZY orientation, which has the smallest dataset among all orientation groups. These results show that there is a strong linear correlation between the investigated parameters and thickness, which is in line with the results presented in the “Thickness” section. However, the models’ performance for the ZYX orientation differs from the other orientations: while the GBR and MLP models have prediction accuracies of 55.82% and 61.08%, respectively, the linear regression has an accuracy lower than 0. Thus, one can assume that the investigated parameters have a non-linear correlation with thickness.

For thickness in the XYZ orientation group, the GBR model has a significantly larger testing accuracy, and therefore a more detailed analysis of this model is shown in Table 6. The testing accuracy is smaller than the average cross-validation accuracy and is outside the 95% confidence interval. A similar observation was made for width; the small dataset used for training the model and the random sampling of the testing dataset could have affected the model’s performance. However, the difference is small and can be neglected, meaning that the GBR model can be considered acceptable.

Table 6 Comparison of the predictive models for thickness

For thickness in the XZY orientation, the linear regression and MLP models have the highest prediction accuracy. However, a more detailed analysis shows that the linear regression suffers from overfitting, while the MLP model can be considered acceptable, since its testing accuracy lies within the 95% CI for this model.

For thickness in the ZYX orientation, the MLP and GBR models are compared by looking at the cross-validation accuracy, the 95% CI, and the testing accuracy. As can be seen from Table 6, the testing accuracy differs significantly from the other metrics for both models. Tuning the hyperparameters did not help to avoid this issue; therefore, these models require further improvement and a larger dataset.

For thickness in the Angle orientation, the linear regression and GBR models are compared, and the results are shown in Table 6. While the LR model performs significantly differently on the training and testing datasets, the GBR model shows a high prediction accuracy of 85.2% that is similar for both the training and testing datasets. Therefore, the latter model can be considered acceptable, and strong correlations have been determined between the investigated features and the thickness dimensional feature.

When it comes to predicting thickness using a joint dataset (all orientation groups combined into one dataset), one can see from Table 4 that almost all methods have a high prediction accuracy on the testing dataset. LR, MLP, and GBR are evaluated further due to their high testing accuracies, and their comparison is presented in Table 6. As a result, only the MLP model has a prediction accuracy within the 95% CI range, and therefore it is considered an acceptable model. However, the GBR model has a very small difference between the CV and testing accuracies and thus can also be regarded as acceptable.

Since there are strong correlations between the investigated parameters and thickness, merging the data from the different orientation groups has revealed new patterns in the data, which has resulted in higher prediction accuracies.
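One plausible way to build such a joint dataset is sketched below; the column names, the one-hot encoding of the orientation group, and the dummy values are assumptions for illustration, since the authors' exact encoding is not reproduced here.

    import numpy as np
    import pandas as pd

    # Dummy per-orientation tables; in practice each holds the measured parts.
    rng = np.random.default_rng(3)
    frames = []
    for group in ["XYZ", "XZY", "ZYX", "Angle"]:
        df = pd.DataFrame(rng.random((50, 3)), columns=["x", "y", "z"])
        df["orientation"] = group
        frames.append(df)

    # Joint dataset: all orientation groups stacked, with the orientation
    # one-hot encoded so a single model can still distinguish the groups.
    joint = pd.concat(frames, ignore_index=True)
    joint = pd.get_dummies(joint, columns=["orientation"])
    print(joint.shape)  # (200, 7): x, y, z plus four orientation indicators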

Prediction of length dimensional feature

In contrast to the predictive models for thickness, the predictive models for length in the ZYX orientation have the highest prediction accuracies among all orientation groups (see Table 4). In the section on the significance of the investigated parameters, it was highlighted that length in the ZYX orientation has strong linear correlations with the coordinate z; the prediction accuracy obtained for the predictive models in the ZYX orientation supports this observation. Similar observations on the relationship between length and the investigated parameters were made for the Angle orientation group, which also results in relatively high prediction accuracies on the testing dataset. However, since the machine learning techniques perform better than linear regression, one can assume the presence of non-linear correlations as well. The predictive models for length in the XYZ and XZY orientations show poor performance, which means that the correlation between part position, STL model parameters, and length is weaker than for the other two orientation groups.

Table 7 Comparison of the predictive models for length

A more detailed analysis of the models’ performance is shown in Table 7. All models in the ZYX orientation have high prediction accuracies, but all of them lie outside the 95% CI range. However, the MLP model has a small difference between the prediction accuracies for the training and testing datasets, which can be neglected. The linear regression model, along with the models in the Angle orientation, has a higher prediction accuracy for the testing dataset than for the training dataset. The reason is similar to that reported earlier: with a small dataset, the random sampling of the training and testing datasets affects the results.

A similar observation can be made for the predictive models when all orientations are analyzed as one dataset. Even though the MLP and SVR models perform better than the linear regression model, Table 7 shows that both machine learning models have testing accuracies outside their confidence intervals. However, for the MLP model this difference is smaller, and its root mean square error (RMSE) is also the smallest among the evaluated models. Therefore, this difference can be neglected, and the MLP model can be considered acceptable.
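As a reminder of the metric, RMSE can be computed as follows; the measured and predicted lengths below are made-up values for illustration only.

    import numpy as np
    from sklearn.metrics import mean_squared_error

    # Made-up measured and predicted lengths (mm).
    y_true = np.array([39.95, 40.02, 39.88, 40.10])
    y_pred = np.array([39.90, 40.05, 39.95, 40.00])

    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    print(f"RMSE = {rmse:.3f} mm")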

Recommendations

The primary assumption made by the authors at the beginning of this work was that a single method would be best for predicting the dimensional features, both per orientation and when the orientations are combined into one dataset. However, the analysis has shown that different methods should be used for different tasks, and the choice should be made based on the user’s needs and requirements.

Even though this is a preliminary study and the proposed models need to be optimized and improved on a larger dataset, future perspectives on how the final models could be used are outlined below.

Case 1 Dimensional accuracy is the main requirement, and mechanical properties can be neglected - for use in real manufacturing. The results presented in this work can already be used in manufacturing for applications where dimensional accuracy is the main requirement. By predicting the dimensional features beforehand, applying the calculated scaling ratio at the design stage will help to adjust the dimensions of each separate part with respect to its placement in the build chamber, its orientation, and its STL model properties.
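A minimal sketch of how the scaling ratio could be applied at the design stage is given below, assuming a trained dimensional model; the toy model, the nominal dimension of 40 mm, and the feature values are all hypothetical.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Toy stand-in for a trained dimensional model (features: placement x, y, z).
    rng = np.random.default_rng(4)
    X = rng.random((100, 3))
    y = 40.0 + 0.2 * X[:, 2] + rng.normal(0.0, 0.02, 100)  # as-built width, mm
    model = LinearRegression().fit(X, y)

    nominal = 40.0                                   # nominal CAD dimension, mm
    predicted = model.predict([[0.5, 0.5, 0.5]])[0]  # expected as-built dimension

    # Compensate at the design stage: scale the CAD dimension so that the
    # as-built result lands on the nominal value.
    scaling_ratio = nominal / predicted
    print(f"predicted {predicted:.3f} mm -> scale the CAD dimension by {scaling_ratio:.5f}")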

However, there are two major limitations that need to be addressed in future work. The first is design complexity: the proposed models can only be used for parts whose design is simple and similar to the one used to train the models. The second limitation is material choice: since the developed models do not include information about the material used, they can currently be applied only to Polyamide 2200 with a 50/50 ratio of virgin to used powder.

Case 2 Prediction of mechanical properties based on the predicted dimensional features’ values for PA12. Typically, in addition to dimensions, mechanical properties are of interest in manufacturing. The material used in this study is a plastic with anisotropic properties, which makes it challenging to develop mathematical material models. In addition, the lack of information needed to build a material model prevents finite element modeling (FEM), one of the most widely used methods for simulating mechanical properties beforehand.

In order to address this issue, the authors propose to apply machine learning techniques to predict the mechanical properties of a part before it is fabricated. This prediction will be based on data collected in advance, including mechanical testing results. The predicted values of the dimensional features, together with the part’s placement and orientation in the build chamber, will be used to develop new models; material properties should also be included, since they are a critical component of the future models. Additionally, more sophisticated and complex part designs need to be covered in the future.
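A sketch of how such a two-stage setup could look is given below; all feature names, encodings, and values are illustrative assumptions about the proposed future models, not results of this study.

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(5)
    n = 200
    predicted_dims = rng.random((n, 3))       # stage-1 outputs: thickness, width, length
    placement = rng.random((n, 3))            # x, y, z in the build chamber
    orientation = rng.integers(0, 4, (n, 1))  # encoded orientation group
    material = np.full((n, 1), 0.5)           # e.g., ratio of virgin to used powder

    # Stage 2: predict a mechanical property (e.g., tensile strength) from the
    # predicted dimensions plus placement, orientation, and material descriptors.
    X = np.hstack([predicted_dims, placement, orientation, material])
    y = rng.random(n)                         # stand-in for measured tensile strength
    stage2 = GradientBoostingRegressor(random_state=5).fit(X, y)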

Case 3 Development of an intelligent system for quality assurance in additive manufacturing. Even though the proposed models can already be used as they are now, the user needs additional knowledge of the Python programming language to be able to make requests. Therefore, the authors consider the results presented in this article as the first step towards the development of an intelligent system for quality assurance in additive manufacturing.

This system will consist of different modules, which will be executed in different orders to provide a solution to the user. A calculation core will operate behind a user-friendly interface.
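Purely as an illustration of the envisaged modular structure, a sketch is given below; the module names, interfaces, and returned values are assumptions, not the authors' actual design.

    def preprocess_request(request):
        """Parse the user's part description (placement, orientation, STL properties)."""
        return request

    def predict_dimensions(features):
        """Calculation core: query the trained dimensional-feature models."""
        return {"thickness": 1.02, "width": 40.1, "length": 39.9}  # dummy values

    def format_report(predictions):
        """User-friendly summary instead of raw model output."""
        return ", ".join(f"{k}: {v} mm" for k, v in predictions.items())

    def handle(request):
        # Modules executed in the order appropriate to this type of request.
        return format_report(predict_dimensions(preprocess_request(request)))

    print(handle({"orientation": "XYZ", "position": (100, 150, 200)}))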

However, more details of the system design will be presented in future publications. Moreover, more experimental work should be done in the future to improve the existing modules and create new ones. A particular focus should be placed on the variation of different process and material parameters, including the analysis of samples with different levels of design complexity.

Conclusions

A comparison of regression models with machine learning techniques has been described in this paper. The main aim was to develop models that can predict the dimensional features of additively manufactured parts, namely thickness, width, and length, with high accuracy. In addition, for a better understanding of the process, the significance of the parameters was investigated based on the Pearson correlation test, a decision tree regressor, and a gradient boost regressor. Thirteen parameters were analyzed for the data subsets partitioned by orientation, as well as for the combined dataset.

For the models with the highest prediction accuracy, all investigated parameters were found to be significant. Moreover, the STL model properties were of major interest, as they have not been studied in the same context by other researchers except in the authors’ previous work (Baturynska 2018). Most of the applied methods identified either the surface area or the number of mesh triangles as among the most important parameters with respect to the dimensional features. This information can be used during the design stage in the future.

In addition, predictive models for width, thickness, and length have been developed. The linear regression model for width was determined to be the best among all investigated methods for the case when the data is not separated into orientation groups. Three predictive models are acceptable for thickness, namely the gradient boost regressor models in the XYZ and Angle orientations and the MLP model when the data is not divided into orientation groups. For length, the MLP model also outperformed the other methods, and the gradient boost regressor was found to be the best for length in the ZYX orientation group. All other models show unsatisfactory performance and require further development.

In order to continue improving the developed models and to create new ones, more experimental work should be done in the future. A particular focus should be placed on varying the part design, as well as the process and material parameters. The authors have also proposed recommendations on how the improved models could be used in the future.