
1 Introduction

Automatic objective quality assessment of 3D printed surfaces is currently one of the most dynamically developing areas of image analysis for emerging applications. Given the rapid growth in popularity of 3D printing (additive manufacturing) and the availability of affordable high quality cameras, a natural research direction is the application of image analysis methods for smart monitoring of the 3D printing process. The goal of such methods is not only to track the progress of the 3D printing procedure but also to prevent some minor errors, or even to abort the manufacturing process when the quality of the obtained objects is poor.

Vision based assessment of 3D printed surfaces is a natural extension of research activities related to in situ monitoring of the manufacturing process and non-destructive evaluation (NDE), which have been reported recently, e.g. with the use of Optical Coherence Tomography (OCT) for selective laser sintering [12], spatially resolved acoustic spectroscopy [13] or using top view stereo cameras to obtain the cloud of 3D points further compared with the model [14].

An interesting approach to video based detection of defects during the 3D printing process has been presented by Straub [30], where five cameras with Raspberry Pi units connected using Ethernet cables have been used to capture the images of the manufactured object. Although two major types of issues - lack of filament causing the “dry printing” and premature job termination - have been detected properly, the system’s high sensitivity to changes of environmental conditions and camera motions has been a major problem in this approach.

Some other approaches to imaging in quality assessment of 3D prints utilize “process signatures” used for fused deposition of ceramic materials [7, 8], as well as the analysis of “road paths” for identification of under- and over-filling comparing them to the predefined models [5].

Another interesting approach is non-destructive evaluation based on ultrasonic imaging and X-rays [34] as well as electromagnetic methods [2], although these are applicable mainly to off-line quality assessment of previously manufactured objects. Some of the recently presented methods require a comparison with the model of the printed object [18], whereas other attempts rely on time-consuming prior training [3, 4, 31] or additional filtering [27]. Nevertheless, most of the proposed approaches utilizing machine vision are used for process monitoring and fault detection rather than quality assessment of the manufactured objects [6]. One of the recent examples [36] is the use of a fringe projector with the analysis of small subregions by means of local point features. An interesting application for multi-material 3D printing has also been presented in the MultiFab project [28], together with an automated positioning system utilizing the data obtained from a precisely calibrated OCT 3D scanner.

Nevertheless, the main goal of our research is not only the monitoring of the progress of production and the state of the printing device, but primarily an automatic smart quality assessment of the manufactured object during the printing process, which is usually relatively long. Such possibility would be useful for saving the filament, energy and time in case of detection of too low quality, making it possible to stop the manufacturing process and warn the user immediately.

Since some minor issues may be corrected after the manufacturing process, or in some devices even during printing, the classification of the printed surfaces into two classes representing high and low quality 3D prints may be insufficient. Despite the encouraging classification results obtained in recent years, quality assessment on a continuous scale is a more demanding task. Following the methodology typically used in general purpose image quality assessment (IQA), where objective quality metrics should be highly correlated with subjective opinions, typically expressed as Mean Opinion Score (MOS) or Differential MOS values, a unique dedicated database of 3D printed samples together with subjective scores has been prepared, making it possible to verify existing methods. Additionally, a novel approach to surface quality assessment of 3D prints, based on the combination of different methods optimized towards high correlation with subjective scores, has been proposed and verified in the paper, leading to satisfactory results.

2 Methods of Surface Quality Assessment Based on Classification

Automatic quality assessment of 3D printed surfaces based on the analysis of images captured by side view cameras can be conducted using different approaches, including texture analysis, adaptation of general purpose IQA methods, image entropy, detection of patterns based on the Hough transform or the use of descriptors for gradient analysis based on the Histogram of Oriented Gradients (HOG). Nevertheless, some of the above mentioned approaches can be applied only for an assumed colour of the filament and must be additionally tuned for each colour, hence their practical usefulness may be limited. The main purpose of all these methods is the classification of the observed surfaces into two major groups representing high and low quality samples, although in some experiments additional “moderately low” and “moderately high” quality samples have been distinguished.

The application of texture analysis is based on the assumption that the statistical distribution of colours of the neighbouring pixels should be similar for the whole image. Hence, analysing the chosen Haralick features, calculated using the Grey-Level Co-occurrence Matrix (GLCM), smaller homogeneity may be observed for lower quality 3D prints [10, 19]. Nevertheless, the proposed methods require time-consuming computations of several GLCMs for various offsets, used during further analysis of changes of Haralick features. In view of the necessary computational efforts and an average accuracy, this approach has not been further investigated. This decision results also from the experiments conducted for the whole developed database, leading to worse results in comparison to the other methods.

On the other hand, it can be assumed that, due to the regularity of the patterns representing the consecutive printed layers, observed by a side located camera, the image entropy should be significantly lower for high quality 3D prints. Dividing the image into \(N \times N\) fragments, the local entropy values should also be small and similar to each other if there are no visible artifacts, caused e.g. by the lack of filament or being the result of overfilling. Hence, the variance of the local entropy should also be low for high quality surfaces. Since the image entropy is also strongly dependent on the filament’s colour, the combination of the local entropy and its variance calculated for HSV and RGB colour spaces has been proposed in the paper [23], leading to colour independent method of quality assessment. Further improvement of classification results, verified for a larger database of the flat 3D printed samples, has been obtained due to the use of entropy based method applied for the depth maps obtained by a 3D scanner [9].
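The block-wise entropy idea described above can be illustrated by a minimal sketch. It assumes a single-channel image stored as a 2D list of integer grey levels; the function names `entropy` and `local_entropy_stats` and the toy images are hypothetical and are not taken from [23] or [9], where the colour spaces and depth maps are handled in more detail.

```python
import math
from collections import Counter

def entropy(pixels):
    """Shannon entropy (in bits) of a flat sequence of integer grey levels."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def local_entropy_stats(image, n):
    """Split a 2D list into n x n blocks and return the mean and the
    (population) variance of the block entropies."""
    rows, cols = len(image), len(image[0])
    bh, bw = rows // n, cols // n  # block size; remainder pixels are ignored
    values = []
    for bi in range(n):
        for bj in range(n):
            block = [image[r][c]
                     for r in range(bi * bh, (bi + 1) * bh)
                     for c in range(bj * bw, (bj + 1) * bw)]
            values.append(entropy(block))
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

# Toy data (assumption): regular horizontal stripes imitate clean layers.
regular = [[(r % 2) * 255 for _ in range(8)] for r in range(8)]

# The same pattern with one 4 x 4 block replaced by 16 distinct grey levels,
# imitating a local artifact.
damaged = [row[:] for row in regular]
for r in range(4):
    for c in range(4):
        damaged[r][c] = (r * 4 + c) * 16

print(local_entropy_stats(regular, 2))  # equal block entropies, zero variance
print(local_entropy_stats(damaged, 2))  # higher mean entropy and variance
```

In this toy case every clean block contains two equally frequent grey levels (1 bit of entropy), whereas the distorted block reaches 4 bits, so both the mean local entropy and its variance increase.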

Another possibility of quality evaluation of the 3D printed surfaces is related to the use of some of the general purpose IQA metrics. As the most universal and widely used metrics, such as Structural Similarity (SSIM) [32] or Feature Similarity (FSIM) [35], belong to the group of full-reference methods, which require the knowledge of the original undistorted image, their direct application would require the comparison with the model of the printed surface. To overcome this issue, the division of the image into blocks has been proposed, making it possible to calculate the mutual similarities between the image fragments [24]. In the presence of geometrical artifacts, the mutual similarities for the image fragment containing the distortions decrease noticeably. A similar approach may be considered for the calculations of correlation, also with the use of the Monte Carlo method [25] to decrease the amount of computations.
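The mutual similarity of image fragments may be sketched as follows. As a loudly stated assumption, the Pearson correlation between flattened blocks is used here as a simple stand-in for SSIM or FSIM, which are considerably more complex; the helper names and the toy images are hypothetical and do not reproduce the implementation from [24] or [25].

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # flat block: correlation undefined, treated as no similarity
    return sxy / math.sqrt(sxx * syy)

def split_blocks(image, n):
    """Return the n * n blocks of a 2D list, each flattened row by row."""
    bh, bw = len(image) // n, len(image[0]) // n
    return [[image[r][c]
             for r in range(bi * bh, (bi + 1) * bh)
             for c in range(bj * bw, (bj + 1) * bw)]
            for bi in range(n) for bj in range(n)]

def mean_mutual_similarity(image, n):
    """Average similarity over all pairs of blocks (correlation stand-in)."""
    pairs = list(combinations(split_blocks(image, n), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

# Toy data (assumption): clean stripes and the same image with one distorted block.
regular = [[(r % 2) * 255 for _ in range(8)] for r in range(8)]
damaged = [row[:] for row in regular]
for r in range(4):
    for c in range(4):
        damaged[r][c] = (r * 4 + c) * 16

print(mean_mutual_similarity(regular, 2))
print(mean_mutual_similarity(damaged, 2))
```

For the clean stripes all blocks are identical, so every pairwise similarity equals 1; the pairs involving the distorted block score lower and pull the average down. A Monte Carlo variant could evaluate only a random subset of the pairs.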

Analysing the structure of the 3D printed flat surface with well visible layers, being the result of placing the melted filament over the already hardened polymer, one may expect a high number of straight lines, which should be easily extracted using the Hough transform with appropriate parameters [11]. Nevertheless, as verified experimentally, its direct application may be troublesome, especially for some brighter filaments. In spite of this, due to the additional application of histogram equalization using the well-known CLAHE method, as well as the random choice of the analysed image regions, a relatively high classification accuracy (about 0.8) may be achieved [11].

Another investigated approach is the application of the HOG features [17] calculated locally for various orientations. Since for high quality surface the luminance changes should be well predictable and the horizontal changes should be much smaller than the dominating vertical ones, assuming that the sample is not rotated, the analysis of directional gradients may be a useful tool for the assumed quality evaluation. A high accuracy of classification, independently of the colour of the filament, may be achieved using the signed orientations and 4 bins for the calculation of the HOG features, assuming the final classification using the standard deviation of the HOG features [17].
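A strongly simplified sketch of the gradient analysis is given below. It assumes a single global histogram of signed orientations with 4 bins, weighted by gradient magnitude, instead of the locally computed HOG cells of [17]; the function name `orientation_histogram`, the bin centring and the toy image are illustrative assumptions only.

```python
import math
import statistics

def orientation_histogram(image, bins=4):
    """Magnitude-weighted, normalized histogram of signed gradient
    orientations (0..360 degrees) computed with central differences."""
    rows, cols = len(image), len(image[0])
    hist = [0.0] * bins
    width = 360.0 / bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # horizontal change
            gy = image[r + 1][c] - image[r - 1][c]  # vertical change
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            # For bins=4 the bins are centred on 0, 90, 180 and 270 degrees.
            idx = int(((angle + width / 2) % 360.0) // width)
            hist[idx] += mag
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy data (assumption): horizontal layers two pixels thick, no rotation.
layers = [[0 if (r // 2) % 2 == 0 else 255 for _ in range(8)] for r in range(8)]

hist = orientation_histogram(layers)
print(hist)                     # only the two "vertical" bins are non-zero
print(statistics.pstdev(hist))  # spread of the histogram values
```

For the unrotated toy layers all gradient energy falls into the bins centred on 90 and 270 degrees, which reflects the dominance of vertical luminance changes expected for a regular surface.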

Although all the previously proposed approaches presented above have been developed for the classification purposes, it has been assumed that they may be additionally verified by means of the database containing the subjective quality scores collected in perceptual experiments. Such results, obtained after the analysis of the opinions provided by human observers, may be useful for the optimization purposes to ensure high correlation of the developed metrics with subjective evaluation, similarly as in general purpose IQA.

3 The 3D Prints Database

The database containing 107 images of the 3D printed flat surfaces together with their depth maps and subjective evaluation results, expressed as Mean Opinion Score (MOS) values, has been prepared with the use of three 3D printers: Prusa i3, RepRap Pro Ormerod 3 and da Vinci 1.0 Pro 3-in-1. All the samples have been prepared using the most popular Fused Deposition Modelling (FDM) technology from ABS (Acrylonitrile Butadiene Styrene) filaments in 9 different colours. In comparison with another popular thermoplastic polymer, namely Polylactic Acid (PLA), this material is more abrasion resistant but requires a higher working temperature, as its melting point is about 200 \(^\circ \)C. It is also lightweight and has good mechanical properties; however, the fumes emitted during the printing process may be toxic [1, 29].

Since the quality of the manufactured objects is dependent on many conditions, including the quality of materials used for the construction of the 3D printer and the quality of the filament, regardless of some independent factors, the presence of some typical distortions has been forced by changes of temperature, filament’s delivery speed or configuration parameters of the stepper motors. All the obtained samples, containing various amounts of distortions caused mainly by over- and under-filling, including the presence of cracks, have been independently assessed by 92 human observers using the typical scale from 1 (very poor) to 5 (very good). Additionally, the obtained MOS values have been compared with the previously utilized expert opinions to confirm the correctness of the obtained results. Some sample images, together with their MOS values, are presented in Fig. 1.

Fig. 1. Sample representative images of the 3D printed flat surfaces with their average subjective quality scores.

The images have been acquired using a Sony DSC-HX100V camera with an automatic white balance, 5 mm focal length and the exposure time 1/125 s without flash, preserving a fixed distance. A distributed illumination from three lamps has been used to prevent strong reflections. The depth maps have been obtained as \(1928 \times 1928\) pixels 16-bit greyscale images, being the result of the normalization of the STL files representing the 3D models obtained from the 3D point clouds. They have been acquired in the 3D scanning process using the ATOS 3D scanner manufactured by the GOM company, with the fringe pattern perpendicular to the visible layers on the printed surface [9].

The assumption of an automatic quality evaluation of 3D printed surfaces discussed in this paper is its high accordance with subjective opinions, similarly as in general purpose IQA methods, and therefore the proposed approach should be considered as useful mainly for aesthetic purposes rather than e.g. evaluation of mechanical properties. Such extension would require the analysis of the 3D structure of the manufactured object, acquired e.g. using terahertz methods, and is planned as a part of further research. Another possible extension of the database, planned in future work, may be an addition of images of the non-planar objects, where the entropy based methods may be the most suitable.

4 Idea of the Combined Metric

One of the main goals of general purpose IQA is to obtain the highest possible correlation between the objective and subjective quality scores. Unfortunately, single metrics, such as SSIM [32] or the much better FSIM [35], usually require the additional non-linear mapping recommended by the Visual Quality Experts Group (VQEG) due to some specific properties of the Human Visual System (HVS). Since various IQA databases are used for the verification and optimization of newly proposed metrics, the parameters of the logistic function typically used for such mapping may vary for different datasets.

As different general purpose metrics utilize various kinds of image information, the idea of combined (hybrid) metrics has been proposed as the combination of three different metrics using their weighted product [20], leading to a significant increase of the Pearson’s Linear Correlation Coefficient (PLCC) for raw quality scores without the necessity of non-linear mapping. This idea has been extended by the replacement of some metrics by newer ones [21, 26], by its application to multiply distorted images [22] and recently by the use of no-reference metrics [15].

The general form of the combined metric with exponent weights analysed in the paper can be expressed as

$$\begin{aligned} Q_{\hbox {combined}} = \prod _{i=1}^K \hbox {Metric}_i^{\hbox {weight}_i} \; , \end{aligned}$$
(1)

where K is the number of weighted metrics (originally \(K=3\) [20, 21]).
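Formula (1) can be illustrated by a short sketch. The function name `combined_metric` and the numerical metric values are hypothetical; only the exponents 1.6 and -1.2 correspond to weights reported later in the paper for the two-metric variant.

```python
import math

def combined_metric(metrics, weights):
    """Weighted product of K metric values: prod(metric_i ** weight_i).

    Metric values are assumed to be strictly positive, since negative
    bases with fractional exponents are not real-valued.
    """
    if len(metrics) != len(weights):
        raise ValueError("metrics and weights must have the same length")
    if any(m <= 0 for m in metrics):
        raise ValueError("metric values must be strictly positive")
    return math.prod(m ** w for m, w in zip(metrics, weights))

# Hypothetical metric values for a single sample (assumption):
fsim_like = 0.9      # a similarity-type score
entropy_like = 2.0   # an entropy-type score

q = combined_metric([fsim_like, entropy_like], [1.6, -1.2])
print(q)
```

A negative exponent lets a metric that grows with the amount of distortions (such as entropy) decrease the combined score, while a positive exponent preserves the direction of a similarity-type metric.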

As the component metrics, further subjected to optimization of their weights, all the previously examined methods of quality evaluation of the 3D printed surfaces have been used, particularly those described in Sect. 2. For the additional verification of the proposed approach, two rank-order correlation coefficients have been calculated, as is typical in general purpose IQA. In image quality assessment both these coefficients, namely the Spearman Rank Order Correlation Coefficient (SROCC) and the Kendall Rank Order Correlation Coefficient (KROCC), are considered as measures of the prediction monotonicity, whereas PLCC measures the prediction accuracy. Spearman’s \(\rho \) is defined as:

$$\begin{aligned} \rho = 1 - \frac{6 \cdot \sum {d_i^2}}{n \cdot (n^2-1)} \; , \end{aligned}$$
(2)

where n is the number of images and \(d_i\) is the difference between the position of the i-th image in two sequences ordered according to subjective and objective scores, respectively.
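A direct implementation of formula (2) may look as follows. The sketch assumes that there are no tied scores, since the simplified formula holds only in that case; the sample MOS and metric values are hypothetical.

```python
def ranks(values):
    """Positions (1..n) of the values in ascending order; ties are not handled."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman_rho(subjective, objective):
    """Spearman's rho according to formula (2), assuming no ties."""
    n = len(subjective)
    rs, ro = ranks(subjective), ranks(objective)
    d2 = sum((a - b) ** 2 for a, b in zip(rs, ro))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical scores for five samples (assumption):
mos = [1.2, 2.5, 3.1, 4.0, 4.8]
metric = [0.10, 0.35, 0.30, 0.70, 0.90]

print(spearman_rho(mos, metric))  # two neighbouring samples swapped: about 0.9
```

Here only the second and third samples are ordered differently by the metric, which gives \(\sum d_i^2 = 2\) and \(\rho = 1 - 12/120 = 0.9\).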

Kendall’s \(\tau \) coefficient is defined as:

$$\begin{aligned} \tau = \frac{n_c - n_d}{0.5 \cdot n \cdot (n-1)} \; , \end{aligned}$$
(3)

where \(n_c\) and \(n_d\) are the numbers of concordant and discordant pairs, respectively, i.e. pairs of images whose relative order is the same (concordant) or opposite (discordant) in the two sequences sorted according to the subjective and objective quality scores.
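Formula (3) can be sketched by counting the pairs explicitly. The function name and the sample scores are hypothetical; tied pairs are counted neither as concordant nor as discordant, which is an assumption of this sketch.

```python
from itertools import combinations

def kendall_tau(subjective, objective):
    """Kendall's tau according to formula (3), by direct pair counting."""
    n = len(subjective)
    n_c = n_d = 0
    for i, j in combinations(range(n), 2):
        sign = (subjective[i] - subjective[j]) * (objective[i] - objective[j])
        if sign > 0:
            n_c += 1  # both sequences order the pair in the same way
        elif sign < 0:
            n_d += 1  # the sequences order the pair in opposite ways
    return (n_c - n_d) / (0.5 * n * (n - 1))

# Hypothetical scores for five samples (assumption):
mos = [1.2, 2.5, 3.1, 4.0, 4.8]
metric = [0.10, 0.35, 0.30, 0.70, 0.90]

print(kendall_tau(mos, metric))  # 9 concordant, 1 discordant pair: about 0.8
```

The pair counting has quadratic complexity, which is not an issue for a database of about one hundred samples.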

Both rank-order coefficients are independent of the magnitudes of the differences between the perceived and measured quality, since only the order of the sorted images is considered, regardless of the “quality distances” between them. Therefore they do not require any non-linear mapping function, as a monotonic mapping would not change the order of the quality scores.

5 Analysis of Experimental Verification

To verify the possible increase of the correlation of the objective metrics with subjective evaluations due to the application of the combined metrics, all correlation coefficients have been calculated firstly for the single methods proposed in previous papers. Analysing the obtained results, presented in Table 1, the best results may be achieved using the methods based on the entropy of the depth map as well as the mutual Feature Similarity calculations. An interesting observation is that for the HOG based metrics much better PLCC values may be obtained for the kurtosis of HOG values but rank-order correlations are higher for standard deviation of HOG originally proposed in [17]. Nevertheless, there is no single method with the PLCC and SROCC exceeding 0.7 and Kendall’s \(\tau \) is slightly higher than 0.5 only for FSIM based metrics.

Table 1. Correlation coefficients between the single objective metrics and subjective quality scores obtained for the developed database.
Table 2. Correlation coefficients between the optimized combined metrics and subjective quality scores obtained for the developed database.
Fig. 2. Scatter plots obtained using selected single metrics and MOS values for 107 samples from the developed database.

Fig. 3. Scatter plots obtained using proposed combined metrics and MOS values for 107 samples from the developed database.

Considering the results of verification presented in Table 1, additionally illustrated by the scatter plots presented in Fig. 2, all further experiments have started with the optimization of the combined metric based on FSIM and local entropy of depth map. Assuming the combined metric based on formula (1) expressed as

$$\begin{aligned} Q_{\hbox {comb2}} = \hbox {FSIM}_4^\alpha \cdot E_{\hbox {localdepth}}^\beta \; , \end{aligned}$$
(4)

where \(\hbox {FSIM}_4\) is the average mutual Feature Similarity calculated for the division of the image into 4 blocks and \(E_{\hbox {localdepth}}\) is the average local entropy of the depth map assuming its division into 16 blocks, as proposed in [9], the weighting coefficients \(\alpha \) and \(\beta \) have been subjected to optimization, leading to an increase of the PLCC value to 0.7575 (for \(\alpha = 1.6\) and \(\beta = -1.2\)), as presented in Table 2.

During further experiments some other metrics presented above have been included in the general formula of the combined metric (1) with optimized exponential weights, leading to the results presented in Table 2. As can be observed, the best results have been achieved for the combination of four metrics with the following respective weighting coefficients: \(\alpha = 2.9\), \(\beta = -1.6\), \(\gamma = -14\) and \(\delta = -1.8\), where the two latter weighting coefficients should be applied for the metric proposed in [11] and the kurtosis of HOG features, respectively. Replacing the metric based on the entropy of depth maps by the product of the average local image entropy calculated for the hue component in HSV colour space and its variance, assuming the division of the image into 256 regions [23], makes it possible to increase the SROCC and KROCC values with a slightly worse Pearson’s correlation. The optimized coefficients have been obtained by unconstrained non-linear optimization using the MATLAB fminsearch function, based on the simplex search method, additionally verified using some gradient-based methods.
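The optimization of the exponents towards the highest PLCC may be sketched as below. Note the assumptions: an exhaustive grid search is used here in place of the simplex search performed by fminsearch, only two metrics are combined, and all scores are synthetic values invented for the illustration, not data from the developed database.

```python
import math

def plcc(x, y):
    """Pearson's Linear Correlation Coefficient of two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant scores: correlation undefined
    return sxy / math.sqrt(sxx * syy)

def combine(a, b, alpha, beta):
    """Two-metric weighted product, cf. formula (4)."""
    return [(u ** alpha) * (v ** beta) for u, v in zip(a, b)]

def grid_search(mos, a, b, grid):
    """Return (best PLCC, alpha, beta) over all pairs of exponents in grid."""
    best = (-1.0, None, None)
    for alpha in grid:
        for beta in grid:
            r = plcc(combine(a, b, alpha, beta), mos)
            if r > best[0]:
                best = (r, alpha, beta)
    return best

# Synthetic scores for six samples (assumption):
mos = [1.5, 2.0, 2.8, 3.5, 4.2, 4.7]
similarity_like = [0.60, 0.70, 0.72, 0.85, 0.88, 0.95]  # grows with quality
entropy_like = [5.0, 4.1, 4.4, 3.0, 2.9, 2.2]           # falls with quality

grid = [k / 10 for k in range(-30, 31)]  # exponents from -3.0 to 3.0, step 0.1
best_r, best_alpha, best_beta = grid_search(mos, similarity_like, entropy_like, grid)
print(best_r, best_alpha, best_beta)
```

Since the grid contains the pair (1, 0), the result can never be worse than the PLCC of the first metric alone; a simplex or gradient-based optimizer would refine the exponents beyond the grid resolution.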

To illustrate the advantages of the proposed approach, the scatter plots illustrating the relationships between the subjective and objective metrics for 107 samples included in the developed database are presented in Figs. 2 and 3. Observing these plots, higher linearity of the relation between the MOS and proposed combined metrics can be easily noticed.

Since the calculations of all metrics for a single \(1600 \times 1600\) pixels image take less than 2 seconds in the MATLAB environment, installed on a PC with an Intel i7 processor clocked at 2.8 GHz and 16 GB of RAM, the proposed approach should be fast enough also for in situ quality monitoring of the 3D prints during a relatively slow typical manufacturing process, even with the use of hardware solutions with lower computational efficiency. Due to the independence of the computations performed for each of the individual metrics, their parallelization may be considered as well.

Although the verification of the proposed methods has been conducted off-line, the only limitation of the presented approach for on-line applications is related to the necessity of acquisition of depth maps in addition to images captured by side located cameras. When the element based on the entropy of the depth map is removed and the optimized weights of the three other metrics are kept, the PLCC decreases to 0.7877 with SROCC = 0.7854 and KROCC = 0.5898. Nevertheless, similar solutions based on the projection of fringe patterns have also been considered by some other researchers [36]. An alternative solution is the use of the entropy based method analysed in [23], calculated in HSV colour space for images captured by a camera, leading to even better rank-order correlations, as shown in Table 2.

6 Conclusions and Future Work

Application of the proposed combined metrics makes it possible to increase the correlation with subjective evaluation of 3D printed surfaces significantly, from below 0.7 obtained for the best single metric to over 0.83 achieved for the best combination of four methods with optimized weighting coefficients. In comparison to the use of combined metrics for general purpose IQA [15, 20, 21, 26], the increase of the correlation coefficients is much larger, partially due to a high diversity of the combined metrics, which utilize different methods, such as Feature Similarity, entropy, Hough transform and HOG descriptors.

The development of the database of 3D prints containing the results of subjective evaluation opens some new possibilities for the development of even better metrics, optimized in view of correlation with aesthetic evaluations. Nevertheless, an interesting direction of our future research may also be the extension of the database by the results of some other non-destructive evaluation methods, e.g. using terahertz technology, to obtain full information related to the 3D structure of the manufactured objects, also for off-line quality inspection in view of mechanical properties. Another issue, which is worth investigating, is the extension of the dataset towards further development of methods useful for evaluation of non-planar surfaces, e.g. based on entropy and mutual similarity of image regions.

An interesting overview of various approaches to quality control in seven different technologies of 3D printing can be found in [16], whereas some other open challenges are specified in [33], where it has been stated that “the development of 3D printing technologies is still underway, meaning that there are multiple alternatives without an absolute rule for choosing among them”. In view of these needs, the proposed approach to combination of multiple methods of surface quality assessment can be considered as one of the potentially useful solutions for emerging applications related to video based quality control in 3D printing. Such methodology may also be further adapted for some other 3D printing methods and materials than the most popular Fused Deposition Modelling with the use of thermoplastic polymer filaments, such as PLA or ABS.