1 Introduction

The market segment of desktop 3D printers has gained considerable importance in recent years, with an average sales growth of 88.6% from 2012 to 2015. Desktop, low-cost and personal are the terms generally used to describe additive manufacturing (AM) systems sold for less than $5000, which are mostly based on the fused filament fabrication (FFF) technique [1, 2].

In short, those machines selectively deposit molten polymer through a nozzle while the print head, or the building platform, moves in the xy plane, generating complex three-dimensional objects in a layer-wise approach [2, 3]. Most personal 3D printers adapt design configurations from Stratasys’ Fused Deposition Modeling (FDM®) technology [4] and from the reference RepRap project [5].

Usual printing materials include filaments made of polylactide (PLA) and acrylonitrile butadiene styrene (ABS), but alternative polymers have emerged, contributing to the versatility of the process [1, 6]. The quality and reliability of desktop 3D printers have also increased over the years, which has encouraged multinational companies to adopt these printers in the first stages of conceptual design [1].

Although few papers report the benefits of desktop 3D printers as auxiliary teaching and research tools, these machines favor the implementation of hands-on learning approaches, facilitating new developments in design and allowing preliminary feasibility studies in the engineering field [7,8,9]. This may explain why educational institutions appear as the second biggest market for low-cost 3D printers [1].

Although they are hard to track, sales of desktop 3D printers increased exponentially in the last 5 years, leading to the emergence of several small companies producing and selling machines [1]. In that context, choosing among the many low-cost 3D printers available on the market has turned into a complex task, since factors such as technical performance, price and software capabilities must be considered according to the intended application [8, 10].

The technical performance of low-priced 3D printers has been assessed based on their capacity to produce a standard part. Several benchmarking models have been proposed over the years, which focus on different aspects such as accuracy, surface finish, mechanical properties, sustainability and cost [2, 8, 11,12,13]. Although they provide guidelines to choose among the many 3D printers available on the market, the results obtained with standard parts do not always reflect a real demand or consider a specific application.

Different methods have also been proposed to help in the selection of AM equipment, either using consolidated decision-making approaches such as analytic hierarchy process (AHP) and simple pair analysis (SPA) or introducing new ranking procedures [8, 10, 14,15,16]. These methods have two main limitations: most of them are based on a qualitative analysis, which implies poor objectivity, or consider a general scenario, which results in a plain selection method whose criteria all have the same importance.

The purpose of this paper is to demonstrate by means of a case study how a relatively simple decision-making method such as AHP can be applied to the selection of desktop 3D printers, benefitting from quantitative data obtained from the evaluation of a real component. The evaluation model, highlighted in Fig. 1, is part of an adhesive distribution system designed to be integrated into a laminated object manufacturing (LOM) machine, which uses discarded paper as its main building material. The system is part of an innovation project developed at the University of São Paulo, which yielded a patent application under the number BR 10.2016.008220-0 [17].

Fig. 1 Design proposed for the LOM machine showing in detail the evaluation model used in the case study

2 Background

2.1 Benchmarking models for additive technologies

With the continuous development of additive technologies over the last 30 years, several researchers proposed test artifacts, often called benchmarking models, in order to evaluate AM systems and processes. Kruth [18] was one of the first to compare different AM processes, including material extrusion, using a test geometry. Besides geometric features, the first benchmarking models were also used to assess mechanical properties [19] and economic factors [20].

Different test parts with focus on geometric, mechanical and process aspects were proposed to assess the performance of additive technologies [11, 21, 22]. A hybrid category was also mentioned by Johnson et al. [13], considering the potential overlaps across the three basic test types.

Richter and Jacobs [23] suggested that the ideal benchmarking model should occupy most of the available build area, so that the printing quality can be tested in the central region of the platform as well as near the edges. The test artifact should also present a substantial number of small, medium and large features, including holes, bosses and many features of a real application component. Moreover, the part should not take too long to build, should consume a small amount of material and should be easy to measure. Additional suggestions that were incorporated into the test artifacts can be found elsewhere [24,25,26,27].

With the expiration of Stratasys’ FDM® patent in 2009, the number of low-cost 3D printers available on the market increased quickly [1]. Despite sharing the same principle of deposition (i.e., material extrusion) and general configuration, the competing platforms may differ greatly in size, positioning system, material and software capabilities. In that scenario, more benchmarking models focused on personal 3D printers started to appear.

The first test part designed to evaluate both dimensional accuracy and build time from the FDM® user’s point of view was presented by Grimm [28]. Johnson et al. [13] proposed a different benchmarking part to evaluate the capabilities of a CupCake® CNC MakerBot in relation to geometric accuracy, followed by Decker and Yee [29], who also presented a new benchmarking design intended to facilitate the evaluation of dimensional accuracy.

More recently, sustainability aspects started to be considered when evaluating personal 3D printers [2]. Nevertheless, as can be seen from the papers cited, there is still no consensus about a standard geometry to assess either industrial AM machines or personal 3D printers, and few works are devoted to real application components.

2.2 Selection of AM systems and processes

Aside from comparing benchmarking parts, there is a need to establish methods that support the selection of additive processes and machines. Xu et al. [30] proposed a series of equations to calculate the building time and cost, which could be used by selection software together with experimental data regarding dimensional accuracy and surface roughness.

Decision-making methods have also been explored as a valid approach to tackle such a huge variety of printing processes and platforms considering multiple criteria, including technical, economic and environmental aspects.

Borille et al. [14] compared the performance of AHP, multiplicative analytic hierarchy process (MAHP), SPA and two VDI (Verein Deutscher Ingenieure) guidelines in the selection of the most adequate technology to produce prototypes for the aeronautical industry. The authors qualitatively assessed the alternatives in terms of precision, roughness, elongation, cost of the part and production time using linguistic terms.

Roberson et al. [8] proposed their own selection system for AM machines, based on the capacity of five 3D printers of different technologies to produce a standard part. The evaluation was performed in terms of build time, unit cost, material cost and dimensional precision, but the model proposed did not consider scenarios of importance according to the field of application.

Mançanares et al. [15] proposed a method to select AM processes based on AHP, which considers criteria such as material variety, surface quality, post-finishing, precision, impact and flexural resistance, cost of the part and post-cure. The method comprises three scenarios of application: a real engineering part, a machine prototype and an architectural model. The processes were scored qualitatively, according to a database about each technology, by means of linguistic terms for their performance.

Cruz and Borille [16] applied AHP, SPA and VDI guidelines to assess the performance of three different manufacturing processes relative to the fabrication of final components for the aeronautic industry. The processes were evaluated qualitatively in terms of savings, weight, development time and cost considering two distinct scenarios.

Finally, Çetinkaya et al. [10] proposed the application of hybrid fuzzy AHP and PROMETHEE to select the most adequate low-cost 3D printer for a Turkish company. The selection criteria were prioritized using fuzzy AHP, and the 3D printers were ranked using qualitative terms for their general performance.

2.3 Analytic hierarchy process

A decision-making method comprises a set of rules used to obtain and analyze information, which can be applied to solve a decision problem [31]. When it comes to a selection problem, simple and logical methods must be used to assess the attributes of the alternatives in relation to certain criteria in order to achieve a specific goal [16].

The AHP is a method that can be used to solve complex decision problems considering multiple criteria. Developed by Saaty in the 1970s, AHP helps to make decisions individually or collectively, offering a structured approach to rank the most suitable alternative regarding the specified objective [31].

Despite its simplicity, AHP can handle ill-structured problems in a hierarchical manner, is capable of considering tangible and intangible attributes, and helps to monitor the consistency of the judgments performed by the decision makers, which makes it stand out from other decision-making tools [32].

The first step of AHP consists of building a hierarchical structure, listing the major objective and the criteria, which correspond to the metrics that will be used to assess the different alternatives. That way, an uncertain demand can be broken down into specific and controlled levels, helping to clarify the decision process [15, 32, 33].

Next, the preference of the subcriteria regarding a parent criterion located one level above is determined by pairwise comparisons, using numerical values attributed according to the fundamental scale shown in Table 1 [32].

Table 1 Relative scale of criterion importance [33]

The resulting matrix is reciprocal and shows how much more important the ith element is than the jth element, for a given level of the hierarchy [33]. The consistency of the judgments must then be evaluated, because people’s feelings and preferences are often subjective and biased [16]. The consistency index (CI) of the comparisons is calculated according to Eq. 1, where λmax is the maximum eigenvalue and n is the order of the matrix.

$$\text{CI} = \frac{\lambda_{\max} - n}{n - 1} \tag{1}$$

The ratio of the consistency index to the random index (RI), which is given in Table 2, represents the consistency ratio (CR) of the judgments made by the decision maker, i.e., CR = CI/RI. If CR is smaller than 10%, the judgments are considered valid and the principal eigenvector of the matrix can be calculated [32].

Table 2 Random index [33]

The normalized elements of the principal eigenvector show the priority rank of the entities under comparison. Saaty [34] explains why the eigenvector should be used and how to calculate it. The procedure to determine the local importance of every criterion of the hierarchy structure follows the same logic [14].
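The consistency check and the priority vector described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the principal eigenvector is approximated by power iteration, the random index values are the ones commonly quoted from Saaty (the paper's Table 2 is not reproduced here), and the 3 × 3 judgments in the example are invented.

```python
# Sketch of the AHP consistency check and priority vector (pure Python).
# RANDOM_INDEX holds commonly cited Saaty values; this is an assumption,
# since the paper's Table 2 is not reproduced in the text.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def reciprocal_matrix(n, upper):
    """Build an n x n reciprocal matrix from upper-triangle judgments.

    `upper` maps (i, j) with i < j to the judgment a_ij; a_ji = 1 / a_ij.
    """
    a = [[1.0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        a[i][j] = float(value)
        a[j][i] = 1.0 / value
    return a


def principal_eigen(a, iterations=200):
    """Approximate the maximum eigenvalue and the normalized principal
    eigenvector of a positive matrix by power iteration."""
    n = len(a)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        aw = [sum(a[i][k] * w[k] for k in range(n)) for i in range(n)]
        lam = sum(aw)              # valid because w sums to 1
        w = [x / lam for x in aw]  # renormalize so the weights sum to 1
    return lam, w


def ahp_priorities(a):
    """Return (weights, CI, CR) for a pairwise comparison matrix."""
    n = len(a)
    lam, w = principal_eigen(a)
    ci = (lam - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return w, ci, cr


# Hypothetical judgments for three criteria (not the paper's Table 4).
matrix = reciprocal_matrix(3, {(0, 1): 3, (0, 2): 5, (1, 2): 2})
weights, ci, cr = ahp_priorities(matrix)
print(weights, ci, cr)
```

For matrices of order 2 the comparison is always consistent, so the sketch returns CR = 0 in that case instead of dividing by a zero random index.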

In a similar way, the alternatives may also be ranked by means of pairwise comparisons considering each criterion of the defined hierarchy. However, even when the consistency ratio is checked, this approach is more subjective, as the scoring will be based on the experience of the decision maker [16].

Alternatively, tangible data resulting from experiments such as mechanical tests, monetary information and other measurements related to each subcriterion may be used to score the possible solutions. This approach eliminates the need for comparisons, and no consistency checking is required. Rather, a column vector representing the rating of the alternatives for each subcriterion is produced.

For a given set of subcriteria, the alternative column vectors are then assembled into a single matrix, which must be multiplied by the normalized eigenvector that contains the local weight of each subcriterion from the same set. This will generate a column vector representing the rating of the alternatives for each set of subcriteria [32].

The individual alternative vectors from a given level are always combined into a matrix and multiplied by a vector that contains the local weight of each criterion. The final ranking is achieved when the weighted scores of the alternatives have been calculated for every level of the hierarchy.
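In matrix notation, the aggregation described above can be written as follows. The symbols are introduced here only for illustration and do not appear in the original: m is the number of alternatives, k the number of subcriteria in a set, r_ij the rating of alternative i for subcriterion j, and w_j the local weight of subcriterion j.

$$\mathbf{s} = R\,\mathbf{w}, \qquad s_i = \sum_{j=1}^{k} r_{ij}\, w_j, \quad i = 1, \dots, m$$

Applying the same product one level up, with the vectors s obtained for each parent criterion as the columns of a new matrix and the first-level weights as the vector, yields the final score of each alternative.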

3 Research methodology

The research methodology was implemented as a case study consisting of three main stages, as shown in Fig. 2. In the first stage, the AHP structure was elaborated, and the criteria were compared to determine their priority. In the second stage, the evaluation models were produced using three different 3D printers. Next, the models were analyzed in relation to the criteria specified in the decision hierarchy. In contrast with previous works, data from experimental analysis were used in the weighted assessment stage, and the alternatives were finally ranked.

Fig. 2 Adopted research methodology

The three machines tested in this study were the 3D Cloner DH® (Indústria Schumacher Ltd., Marechal Cândido Rondon, Paraná, Brazil), UP! Plus 2® (Beijing Tiertime Technology Co., Beijing, China) and MakerBot Replicator 2® (MakerBot Industries, Brooklyn, NY, USA), which were calibrated prior to use as recommended by the developer. The mentioned printers are Cartesian, with axes individually driven by stepper motors. Both the 3D Cloner DH® and MakerBot Replicator 2® move the building platform along the z-axis and the print head in the x–y plane, while the UP! Plus 2® moves the building platform along the z- and y-axes and the print head along the x-axis. Table 3 presents the information of interest about the mentioned 3D printers.

Table 3 Cost and technical data for the 3D printers tested in this study

All models were printed with green Esun® PLA, purchased at 22.00 USD/kg from Shenzhen Esun Industrial Co. Ltd. (Nanshan District, Shenzhen, China). The three systems assessed were adjusted to print only one model using the same printing parameters, e.g., controlled room temperature of 25 °C, layer thickness of 0.25 mm, printing temperature of 220 °C, rectilinear raster for the infill pattern and nozzle diameter of 0.4 mm. Although the printing temperature may be considered high for PLA, it is within the temperature range recommended by the distributor and has been used in previous experiments performed by the authors with good results. The build orientation XY [35] was also kept the same for the three machines.

Due to the inherent disparities among the printers, some conditions could not be equated. For example, each system had its own slicer program: ClonerGen3D®, UP Studio® and MakerWare®; therefore, it was not possible to standardize the printing routines. Also, a thin layer of glue stick was applied on the building platform of the 3D Cloner DH® and Replicator 2® to fix the parts, while the perforated platform of the UP! Plus 2® was heated to 40 °C, which is the standard value used by the slicer program. In addition, both the 3D Cloner DH® and Replicator 2® were set to print with 90% infill density, while the UP! Plus 2® was set to print with the maximum infill percentage available, since its slicer does not allow numerical control of the print density.

The surface roughness of the printed models was evaluated with a contact profilometer, model Form Taly Surf 50 Intra (Ametek Inc., Berwyn, Pennsylvania, USA). Dimensional and geometric accuracy were inspected with a conventional Mitutoyo caliper (Mitutoyo America Corp., Aurora, Illinois, USA) and using a coordinate measuring machine, model CROMA 686 (Hexagon Metrology, São Paulo, Brazil) with resolution of 0.039 µm and equipped with a 4-mm probe tip, resulting in measurements with resolution of 1 µm. The mass of the models was measured using a precision scale, model FA2104 (Raylabel, China), and the build time was measured with a stopwatch. The actual cost of material used was calculated based on the measured weight of the parts and the price per kilogram of the plastic filament. Finally, the estimated build time and material use were calculated by each slicer program and reported prior to the start of the printing process.

4 Case study

Since the selection method proposed aims to identify the best low-cost 3D printer to be used in the fabrication of a physical prototype at the university, this paper is organized as a case study. Besides considering multiple criteria with different priorities, the evaluation of a real component allows experimental data to be collected about the performance of each 3D printer in a realistic situation. From the perspective of the research coordinator, knowing the most appropriate system among those already existing in the university is important to direct future acquisitions for similar applications.

4.1 AHP structure

The decision hierarchy elaborated to select the most suitable 3D printer for the mentioned application is shown in Fig. 3. As can be noted, eight quantitative criteria were proposed, which can be grouped into three main categories: technical performance, software capabilities (i.e., the slicer program’s ability to inform the user about the process requirements in terms of time and material) and economic aspects. The structure comprises different aspects related to the usage of low-cost 3D printers, simply organized into two levels. The three possible solutions are represented at the bottom of the hierarchy.

Fig. 3 Decision hierarchy elaborated to select the most suitable 3D printer of the case study

Tables 4, 5, 6 and 7 show the decision matrices with the pairwise comparisons performed in agreement with the AHP methodology [33]. The calculated consistency indexes are, respectively, 0.03, 0.05, 0 and 0; therefore, the judgments can be considered valid.

Table 4 Decision matrix for the criteria of the first level
Table 5 Decision matrix for the first set of subcriteria, second level
Table 6 Decision matrix of the AHP for the second set of subcriteria of the second level
Table 7 Decision matrix of the AHP for the third set of subcriteria of the second level

Table 8 summarizes the normalized importance of each criterion obtained after determination of the principal eigenvector associated with each decision matrix, as described in Sect. 2.3. It is worth noting that, due to the application context of the evaluation model, the technical performance criterion was considered the most important, followed by the economic aspects and software capabilities.

Table 8 Relative importance of each criterion obtained with the AHP method

In addition, since the 3D printed geometry is part of a mechanical assembly, surface roughness, dimensional and geometric accuracy were considered more important than build time. Having in mind the academic environment where the research was developed, the price of the 3D printers was considered more important than the cost of the material consumed. In relation to the software capabilities, it was considered more interesting to know in advance the time required to build the models than the amount of material consumed.

4.2 Model evaluation

The evaluation model is represented in Fig. 4, with annotations indicating its main dimensions and identifying the features that were analyzed. The geometric tolerance symbols (cylindricity, perpendicularity and position) also indicate which type of evaluation was performed, e.g., perpendicularity was checked for the cylindrical bosses (CB1, CB2 and CB3) in relation to the flat plane; position error was checked for CB1 and CB2 in relation to one another. Surface roughness (Ra) was measured along 15-mm tracks, in five different locations as denoted by the dashed lines.

Fig. 4 Simplified representation of the evaluation model, showing the main dimensions and features analyzed

The functionalities of the component involve the dispensing, flow and distribution of a water-based adhesive that is used to bond sheets of paper according to the additive LOM process. As part of an innovation project developed by students at USP, 3D printing is being used to generate the physical prototypes necessary to test the invention. Therefore, in order to assemble the whole system and validate the mentioned functions, it is crucial to produce a prototype with low surface roughness and good dimensional and geometric accuracy. Despite the relatively simple geometry, the part presents different features such as cylindrical bosses, inclined and flat planes, and overall reduced dimensions, which facilitate the printing operations and the evaluation process.

4.2.1 Build time, material consumption and cost comparison

Table 9 summarizes the actual and estimated building time, actual and estimated material consumption and cost of material used to print the models using the mentioned 3D printers.

Table 9 Build time, material consumption and cost comparison for the 3D printers tested in this study

Due to the smaller building platform of the UP! Plus 2®, the model had to be sectioned into three pieces so that it could be printed. In order to guide and simplify the assembly, the cut followed the base of the truncated cone, shown as a circular line in the perspective view in Fig. 4. No fits were added, and after completion, the pieces were glued together with a cyanoacrylate adhesive. The 3D Cloner DH® presented the shortest build time, while the UP! Plus 2® took the longest. The longest time was expected for the UP! Plus 2® due to the need to section the evaluation model into three pieces.

Since the part was designed to avoid supporting structures, there was no discarded material. The 3D Cloner DH® consumed the most material, which was reflected in the material cost, followed by the UP! Plus 2® and Replicator 2®. Considering the theoretical weight of a fully dense model to be 109.9 g, the resulting infill produced by each machine was 95.9% (3D Cloner DH®), 88.1% (UP! Plus 2®) and 85.5% (Replicator 2®). Although inaccurate control of the filament feed and temperature fluctuations may contribute to fill variations, the most significant contribution can be attributed to the different G-code generated by each slicer program and the associated tool paths. Although UP Studio® did not allow a numerical value to be set for the percent infill, it generated a model with the closest density to the desired value (90% infill).
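The infill and material cost figures discussed above can be reproduced from the measured mass with simple arithmetic. The sketch below assumes that the resulting infill is the ratio of the measured mass to the theoretical mass of a fully dense model, which is how the text appears to define it; the function names and the 100 g example mass are hypothetical, while the 109.9 g and 22.00 USD/kg constants come from the text.

```python
# Constants taken from the text.
FULL_DENSITY_MASS_G = 109.9        # theoretical mass of a fully dense model
FILAMENT_PRICE_USD_PER_KG = 22.00  # price of the PLA filament


def resulting_infill_percent(mass_g):
    """Measured mass as a percentage of the fully dense mass (assumed definition)."""
    return 100.0 * mass_g / FULL_DENSITY_MASS_G


def material_cost_usd(mass_g):
    """Cost of the filament actually consumed by one model."""
    return mass_g / 1000.0 * FILAMENT_PRICE_USD_PER_KG


# Hypothetical measured mass, not a value from Table 9.
example_mass_g = 100.0
print(round(resulting_infill_percent(example_mass_g), 1),
      round(material_cost_usd(example_mass_g), 2))
```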

In relation to the material consumption estimation, the UP! Plus 2® presented the highest accuracy (0.5% error), followed by the 3D Cloner DH® (6.4% error) and Replicator 2® (8.9% error). In all cases, the slicer program underestimated the building time, with the most accurate estimate given by the 3D Cloner DH® (2% error) and the least given by the Replicator 2® (24.5% error).
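The estimation errors quoted above can be expressed as a relative error with respect to the measured value. The sketch below assumes that definition (the text does not state the formula explicitly); the function names and the numbers in the example are hypothetical.

```python
def signed_error_percent(estimated, actual):
    """Signed relative error of an estimate, in percent of the actual value.

    A negative result means the slicer underestimated the quantity.
    """
    return 100.0 * (estimated - actual) / actual


def absolute_error_percent(estimated, actual):
    """Magnitude of the relative error, in percent."""
    return abs(signed_error_percent(estimated, actual))


# Hypothetical example: the slicer predicts 98 min, the stopwatch reads 100 min.
example_signed = signed_error_percent(98.0, 100.0)
example_abs = absolute_error_percent(98.0, 100.0)
print(example_signed, example_abs)
```

Keeping the signed value alongside the magnitude makes it easy to see whether a slicer systematically underestimates, as reported for the build time.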

4.2.2 Surface roughness

Surface roughness measures the microscopic irregularities or superficial texture of a part. The box plot in Fig. 5 shows the minimum, first quartile (Q1), average, third quartile (Q3) and maximum values for the roughness average (Ra), collected from five locations, including the inclined plane, for each printed model. As discussed by Roberson et al. [8], surface roughness measurements on an inclined plane reveal information about the layer height and the stair-step effect.

Fig. 5 Average surface roughness (Ra) measurements for five locations on each printed model

In general, the UP! Plus 2® printed with the lowest average roughness (0.1762 µm) but with the largest variation in comparison with the other machines. Considering that the material and printing temperature were kept the same, the lower surface roughness produced by the UP! Plus 2® can be attributed to the heated platform. As the evaluation model is relatively thin, the presence of an additional heat source slows down the solidification process, which favors material flow and accommodation, thus reducing the average Ra value, as reported in [36, 37].

The 3D Cloner DH® produced surfaces with a slightly higher average roughness (0.2027 µm), but with more constant values. The MakerBot Replicator 2® showed the worst average value for surface finish (0.3242 µm). Although repeatability is an important condition for a 3D printer, for simplicity, only the average value for the surface roughness was used in the rating stage.

4.2.3 Dimensional accuracy

Table 10 lists the dimensional deviations from selected features that were analyzed in the printed parts. It is worth noting that the 3D Cloner DH® and Replicator 2® tend to build features with smaller dimensions than the digital model. On the other hand, the UP! Plus 2® produced larger features, with a pronounced positive deviation in CB1.

Table 10 Overall dimensional deviations for selected features of the evaluation model

The negative deviations in some of the linear dimensions (BW, BD and BH) are frequently attributed to the thermal contraction of the PLA. Also, high thermal gradients and inefficient fixation usually cause the extremities of the parts to warp, which reduces the final height of the model, as observed with the printers without a heated platform (3D Cloner DH® and Replicator 2®). In contrast, positive dimensional deviations may be mainly attributed to systematic positioning errors caused by the drive axes, which were more critical with the UP! Plus 2®. Contrary to expectations, the model printed in three pieces by the UP! Plus 2® did not show a pronounced positive deviation in the base depth (BD).

Since it was not possible to effectively separate the contribution of the material (e.g., crystallization and thermal contraction) from the influence of the equipment (e.g., poor positioning system, lack of heated platform) in this study, the absolute values of the deviations were summed in order to obtain a number representing the general dimensional accuracy of each 3D printer. Table 11 reveals that the 3D Cloner DH® produced parts with the best dimensional accuracy, followed by the Replicator 2® and UP! Plus 2®.

Table 11 Sum of the absolute values of dimensional deviations
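The single accuracy figure described above is simply the sum of the magnitudes of the individual deviations, so that positive and negative errors do not cancel out. A minimal sketch follows; the function name is hypothetical, the feature labels are the ones used in the text, and the deviation values and the millimeter unit are assumptions made for illustration, not data from Table 10.

```python
def total_absolute_deviation(deviations):
    """Sum of |deviation| over all measured features of one printed model."""
    return sum(abs(d) for d in deviations.values())


# Hypothetical deviations in mm (measured minus nominal); NOT Table 10 values.
example_deviations = {"BW": -0.10, "BD": 0.05, "BH": -0.20, "CB1": 0.15}
example_total = total_absolute_deviation(example_deviations)
print(example_total)
```

A plain sum of the signed deviations of this example would give -0.10 mm and hide most of the error, which is why absolute values are used.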

4.2.4 Geometric accuracy

Figure 6 shows the measurements performed on the printed parts. The results for the cylindricity, perpendicularity and position error measurements in relation to the digital file are presented in Table 12.

Fig. 6 Measuring three-dimensional tolerances using the CMM CROMA 686

Table 12 Geometric tolerance results

As in the case of the dimensional accuracy, the sum of the values reveals that the 3D Cloner DH® produced parts with the best geometric accuracy, followed by the MakerBot Replicator 2® and UP! Plus 2®. Grimm [28] mentions that the geometric accuracy of parts produced by FFF printers is impaired by a round-off error caused by the slicer software. The geometric errors may also be caused by positioning errors of the drive axes, which reveal an appreciable quality difference between the mechanical components used in each 3D printer, reflected in the price of each machine.

Other sources may also contribute to the measured geometric errors. A recurring problem with low-cost 3D printers based on material extrusion, generally called oozing, was observed more often in the parts produced by the UP! Plus 2® and MakerBot Replicator 2®. Oozing occurs when the polymer keeps flowing while the print head is moving, causing the filament to be deposited on the external surface of a feature. In addition, due to the precision of the probe tip, part of the tolerances can be affected by the surface roughness of the model.

4.3 Assessment

In order to rate the 3D printers according to the decision hierarchy, the numerical values of the average surface roughness, the sum of the absolute values for the dimensional deviations, total geometric error, build time, percent variations for the estimated build time and material consumed, unit price and cost of material consumed were used.

It is important to note that the values were inverted before they were normalized, since lower values are desirable for all of these metrics, as in [14]. Table 13 summarizes the absolute rates attributed to the 3D printers in relation to each subcriterion.

Table 13 Absolute rates attributed for the 3D printers in relation to each subcriterion
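The inversion and normalization step can be illustrated with the average Ra values reported in Sect. 4.2.2. The sketch below assumes that inversion means taking the reciprocal and that normalization divides by the sum so that the rates add up to 1; the text does not spell out either choice, so the numbers produced here are not claimed to match Table 13. The function name is hypothetical.

```python
def invert_and_normalize(values):
    """Turn 'lower is better' measurements into rates that sum to 1.

    Assumed scheme: reciprocal of each value, divided by the sum of reciprocals.
    """
    inverted = [1.0 / v for v in values]
    total = sum(inverted)
    return [x / total for x in inverted]


# Average Ra values quoted in Sect. 4.2.2 (µm), in the order:
# 3D Cloner DH, UP! Plus 2, MakerBot Replicator 2.
ra_values = [0.2027, 0.1762, 0.3242]
ra_rates = invert_and_normalize(ra_values)
print([round(r, 3) for r in ra_rates])
```

With this scheme the printer with the lowest roughness receives the highest rate. A measurement equal to zero (for example, a perfect estimate) would need separate handling, because its reciprocal is undefined.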

As discussed in Sect. 2.3, the alternative vectors from the first set of criteria of the second level were assembled into a matrix and multiplied by the corresponding vector with the local weights of each criterion. The same was done with the alternative vectors from the second and third sets of criteria. Then, the three resulting vectors were assembled once again into a matrix and multiplied by the vector containing the local weight of each parent criterion.
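The two-level aggregation described above can be sketched as two rounds of weighted sums. Everything numeric in the example is hypothetical (the rates and weights are not those of Tables 8 and 13), the alternatives are labeled A, B and C on purpose, and the function name is an illustrative choice.

```python
def weighted_scores(columns, weights):
    """Multiply a matrix, given as a list of column vectors, by a weight vector.

    columns[j][i] is the rate of alternative i for (sub)criterion j;
    the result lists one weighted score per alternative.
    """
    n_alternatives = len(columns[0])
    return [sum(w * col[i] for col, w in zip(columns, weights))
            for i in range(n_alternatives)]


# Hypothetical rates for alternatives A, B, C (each column sums to 1).
technical_cols = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2],
                  [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
software_cols = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
economic_cols = [[0.3, 0.4, 0.3], [0.3, 0.3, 0.4]]

# Hypothetical local weights of the subcriteria (each set sums to 1).
technical = weighted_scores(technical_cols, [0.4, 0.3, 0.2, 0.1])
software = weighted_scores(software_cols, [0.75, 0.25])
economic = weighted_scores(economic_cols, [0.8, 0.2])

# Second round: combine the three vectors with hypothetical first-level weights
# (technical, software, economic).
final_scores = weighted_scores([technical, software, economic], [0.6, 0.1, 0.3])
ranking = sorted(zip("ABC", final_scores), key=lambda item: item[1], reverse=True)
print(ranking)
```

Because every column and every weight vector sums to 1, the final scores also sum to 1, which gives a quick sanity check on the arithmetic.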

Table 14 summarizes the final score and shows the resulting rank of the 3D printers. The 3D Cloner DH® presented the highest score, followed by the UP! Plus 2® and MakerBot Replicator 2®. This indicates that, according to the procedure discussed, the 3D Cloner DH® is the most suitable machine for the intended application.

Table 14 Final score and ranking of the assessed 3D printers

5 Conclusion

With this paper the authors intend to illustrate, by means of a case study, how 3D printing equipment can be objectively and quantitatively assessed with the aid of the AHP method. The process conditions were adjusted to be as similar as possible, except for the inherent differences of the equipment. The assessment was performed with three low-cost 3D printers with respect to their performance in the manufacture of a real application part. According to the resulting rank, the 3D Cloner DH® can be considered the best low-cost 3D printer to be used in the physical prototyping activities developed at the university.

By using the AHP method, it was possible to identify and weight the evaluation criteria which determined the assessment metrics. The evaluation model, even though it was not designed for comparison purposes, presents some of the basic features required for a benchmarking part and the quantitative evaluation performed contributed to mitigate the subjectivity of possible judgments based on a linguistic or qualitative mind frame.

For the specific application considered, the technical performance criteria (surface roughness, dimensional and geometric accuracy and build time) were considered more important than the economic aspects (unit price and cost of the material consumed) and the software capabilities (build time and material consumption estimation).

Although the inevitable process differences can lead to discrepancies in the performance of the equipment, the adopted conditions reflect the real limitations associated with low-cost, plug ‘n’ play 3D printers. Such limitations might reflect on the final outcome of the assessment; however, the authors would like to emphasize that the major objective of the research did not focus on the numerical results representing the performance of each printer, but on the method itself.

Finally, although the case study performed focuses on a very specific application segment, the method presented can be adapted for other situations, considering different additive techniques that involve a multifactorial choice. The method may also be adjusted to compose a user-friendly computational routine or mobile application that helps the user to choose and prioritize decision criteria and rate the alternatives with qualitative or quantitative grades, facilitating the selection process of low-cost 3D printers.