Evaluation of segmentation accuracy and its impact on patient-specific CFD analysis

Medical image segmentation, especially for biological soft tissues, is an issue of great interest. The aim of this study is to evaluate the performance of a commercial and an open-source software package in segmenting the aortic root and coronary arteries. Stereolithography 3D printing was used to generate ground truth models, which were then re-acquired by means of a micro-CT scanner. Measurements taken on the printed models and on the models reconstructed with both software packages were compared, in order to evaluate the level of agreement. In the second phase of the study, Computational Fluid Dynamics (CFD) simulations were conducted to compare the outputs obtained from the models segmented with the two software packages. The goal was to understand how differences in the segmentation process propagate to CFD results. Results showed that both software packages guarantee satisfactory segmentation performance, with average geometrical differences between reconstructed and physical models in the order of a few percentage points. However, when thin details are considered, such as a sharp stenotic region, the commercial validated software seems to be more accurate in replicating the real anatomy. We also found that apparently negligible geometrical differences between the two software packages can turn into very large variations of hemodynamic parameters, such as velocity and wall shear stress, which highlights the delicate role of the segmentation process. This evidence is crucial in the biomedical field and especially in the study of coronary arteries, where CFD simulations can be exploited as a starting point for surgical considerations.


Introduction
Medical images play a key role in medicine, allowing detailed representations of the interior of the human body to be obtained in a fast and simple way. They are an important non-invasive support tool for clinical analysis and surgical intervention, as well as a visual representation of the function of internal organs or tissues [1]. In this framework, technological evolution has generated a multiplicity of imaging technologies, each one characterized by its peculiarities, strengths and fields of application.
Diagnosis and therapy of cardiac diseases are fundamental issues in medicine nowadays. Preoperative imaging data in cardiovascular surgery and interventional cardiology mainly include CT (Computed Tomography), MRI (Magnetic Resonance Imaging) and echocardiography [2]. Although these imaging modalities are essential, they fall short in providing a 3D spatial representation of the anatomical domain. Indeed, black and white 2D representations do not allow an immediate and natural visualization of a 3D anatomical structure, whereas digital 3D representations solve this problem; for this reason, and thanks to the increasing computational power of workstations and their reduced costs, they have emerged in the field. In this light, virtual 3D reconstruction methods are fundamental, opening the way to a broad range of innovative applications.
There is potentially a very broad range of applications for these digital models, from computational simulations to additive manufacturing and augmented/virtual reality approaches. Segmentation is the key process to obtain the so-called "digital twin" models. It is defined as "the process of partitioning an image into several parts, where each of these parts is a collection of pixels (or voxels) corresponding to a particular structure" [3]. Segmented models have to represent a reconstruction of the actual anatomy, with a level of faithfulness adequate to the specific application [4]. In this framework, the segmentation process becomes crucial, having to guarantee the generation of an accurate, manageable, "defects-free" result.
A wide variety of segmentation techniques has been proposed in the literature [5]. However, no standard segmentation technique exists that can produce satisfactory results for all imaging applications. Moreover, different assumptions about the nature of the images lead to the use of different algorithms [6]. Traditional segmentation algorithms use thresholding, region growing, edge detection and clustering, while more recently new and more complex approaches based on deformable models, statistical, fuzzy and neural network techniques have gradually been introduced [6]. Manual segmentation is still the most widespread approach, even if semi-automatic and automatic tools have gradually been implemented (see e.g. [7,8]). Manual segmentation is a time-consuming and tedious activity, subject to intra- and inter-observer variability, and requires dedicated expert operators [9]. Therefore, the implementation of automated segmentation approaches, requiring as little user interaction as possible, is perceived as a fundamental development in the field [7].
Currently, segmentation is routinely performed with commercial software. Materialise Mimics (Leuven, Belgium) can be considered the gold standard software [10] in this field and is the most used by professionals worldwide [11]. However, the use of open-source software as an alternative to commercial software for 3D reconstruction is steadily growing, because open-source packages guarantee good performance, are versatile and can be readily extended and redistributed [10], even if they are often less user-friendly and offer limited functionalities. Among the various available open-source codes, 3D Slicer (http://www.slicer.org/) is one of the best known and appreciated, characterized by a wide variety of segmentation tools, as well as the possibility of active interaction with the developer community [12].
Few papers in the literature compare commercial and open-source segmentation software. They generally compare manual and semi-automatic or automatic segmentation algorithms, to evaluate differences in terms of performance. The focus is on hard tissues, namely bones, which are easier to segment than soft tissues.
Lo Giudice et al. [10] compared 3D reconstructions of the craniomaxillofacial region, the mandible in particular, from Cone-Beam Computed Tomography (CBCT). They digitally compared manual segmentations obtained with Mimics with semi-automatic reconstructions from different open-source alternatives. They found that the semi-automatic segmentation of the mandible showed high reliability, as well as a high correlation with the ground truth model (manual segmentation), even if some underestimations may occur.
Wallner et al. [13] tested the validity and veracity of the segmentation procedure: their study aimed at creating a complete data set of mandible models, to be the ground truth for further comparisons. They also evaluated the accuracy and congruence of the segmented volumes, by calculating parameters such as the Dice Score Coefficient (DSC) and the Hausdorff Distance (HD).
Argüello et al. [12] focused on the segmentation of a human vertebra, to be employed for structural analysis using the finite element method. They compared three open-source segmentation packages, on the basis of parameters such as the ease of the workflow, the time for completion and the robustness of the tool, together with the accuracy of the segmented result with respect to a predefined reference. They concluded that the best option for the segmentation process is the 3D Slicer software.
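The two overlap metrics mentioned above can be illustrated with a minimal sketch. It assumes binary masks stored as NumPy arrays and surfaces sampled as point sets; the function names are illustrative and are not taken from [13].

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two boolean masks of equal shape (1 = identical)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total

def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two point sets (N x D and M x D)."""
    p = np.asarray(points_a, dtype=float)
    q = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (N, M); fine for small point sets
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    # Largest nearest-neighbour distance, taken in both directions
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

The brute-force distance matrix keeps the sketch short; for full-resolution surfaces a spatial index would be a more practical choice.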
Other studies were performed to check the validity of 3D reconstructions from medical images by introducing 3D Printing (3DP) technologies in the workflow. Szymor et al. [14], for example, compared the 3D reconstruction of the right orbit of the skull performed in 3D Slicer with 3D printed models, obtained with a low-cost selective deposition lamination technology. They compared the reference distances acquired from the 3D virtual model with the ones from the corresponding 3D printed model, acquired by means of an optical scanner. Thanks to the number of available measurements, they statistically validated the reliability of 3D printed models.
Furthermore, the development of Computational Fluid Dynamics (CFD) in recent years has enabled the use of 3D numerical simulations to investigate patient-specific hemodynamics, both in physiological and pathological conditions. In this view, Colombo et al. [15] developed a framework for the investigation of both near-wall and bulk flow hemodynamics of patient-specific stented femoral artery models. The developed reconstruction method was validated using 3D printed rigid phantoms, by comparing the reconstructed geometries with the reference CAD ones employed for 3DP. The mean reconstruction error found by the authors was about 6%.
The present paper merges the contributions from both segmentation and 3DP studies and CFD studies, to perform an organic assessment of segmentation accuracy, focusing on soft tissues, namely aortic root and coronary arteries, followed by an evaluation of its impact on patient-specific CFD analysis.
The consideration on which this work is based is that the segmentation process, used for patient-specific numerical analysis, may play a key role not only in the patient-specific geometry reconstruction, but also in the hemodynamic evaluation. For this reason, the objective of this study is twofold:
1. To quantitatively compare a commercial and an open-source segmentation software, focusing on soft tissue. The workflow is depicted in Fig. 1: 3DP technology is used to generate a ground truth model, which is then exploited to compare two different segmentation software packages, Mimics and 3D Slicer. For this purpose, measurements from the printed and the reconstructed models are compared.
2. To investigate how the segmentation process affects the subsequent CFD analysis: CFD simulations are carried out on the segmented models, both from Mimics and 3D Slicer, and primary and derived variables are compared.
It is important to underline that the aim of the paper is to offer the reader a methodological perspective, a strategy to tackle a complex and multiscale problem, rather than to focus strictly on numerical results and answer a contingent biomedical question. This is also why the number of patients considered is not statistically significant and why the CFD analysis was carried out assuming rigid vessel walls.

Materials and methods
Anonymized CT scans of 3 patients were considered. They were all acquired by means of the same machine, with a resolution of 512 × 512 pixels, a pixel spacing of 0.334 × 0.334 mm and a slice thickness of 0.6 mm. The patients were deliberately chosen to present different clinical conditions: the first one, arbitrarily called H (Healthy), is a healthy subject; the second one, MS (Mild Stenosis), has a mild stenosis (narrowing of the vessel lumen) in the right coronary artery; and the third one, SS (Severe Stenosis), has a severe stenosis in the same segment.

Ground truth models generation
In order to have patient-specific samples 3D printed, a segmentation process to extract the vessels lumen from patients' CTs was needed. We could have printed already available idealized geometries from CAD models, but we chose to have patient-specific geometries, requiring preliminary segmentation.
The reconstruction procedure described here was repeated for each patient's CT. Preliminary segmentation was performed in Mimics, considered as our benchmark. Once each stack of 2D images was imported into the software, it was necessary to isolate the region of interest, namely the complex of the aortic root and coronary arteries. To perform the segmentation, a thresholding algorithm was used, followed by the definition of the region of interest. Other algorithms had been preliminarily taken into account for segmentation, but thresholding was chosen because it is the most straightforward and easily reproducible. The volume filled by the blood (the internal lumen) was obtained. This is typical when segmenting the cardiovascular system: only the so-called "blood pool" can be obtained, since the external walls are too thin to be detected by the software in the segmentation process. The segmentation was achieved with a threshold range of 293-1620 Hounsfield Units (HU). These values allow the blood pool in each part of the volume of interest to be highlighted.
Once the segmentation was completed, some post-processing steps were needed to obtain a more regular and representative volume, usable for subsequent applications. The surface was first smoothed, with global and local algorithms. For global smoothing, the algorithm was applied to the whole model, for 10 iterations, with a 0.7 scale factor and shrinkage compensation. A systematic comparison varying the number of iterations and the scale factor had been performed beforehand, and this setting was (qualitatively) considered to be a good compromise to recreate the real vessels' surface without losing important details. By contrast, local smoothing was applied just to specific portions, with the same scale factor. After other minor interventions, the models were ready to be exported as STL files. Fig. 2 shows the same model after the segmentation (left) and after the smoothing procedure (right).
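The thresholding step can be sketched in a few lines. This is only an illustration of the principle, not the Mimics implementation: it assumes the CT volume is already available as a NumPy array in HU, that the range bounds are inclusive, and that the region of interest is a simple box (the helper names are hypothetical).

```python
import numpy as np

HU_MIN, HU_MAX = 293, 1620  # threshold range reported in the text

def threshold_mask(volume_hu, lower=HU_MIN, upper=HU_MAX):
    """Boolean mask of voxels whose intensity lies inside [lower, upper] HU."""
    volume_hu = np.asarray(volume_hu)
    return (volume_hu >= lower) & (volume_hu <= upper)

def crop_to_roi(mask, roi):
    """Keep only the mask voxels inside a box-shaped ROI (tuple of slices).

    The box is an assumption made for this sketch; in practice the region
    of interest is defined interactively by the operator.
    """
    out = np.zeros_like(mask)
    out[roi] = mask[roi]
    return out
```

The resulting mask corresponds to the blood pool candidate voxels; isolating the aortic root and coronaries still requires the region-of-interest step described above.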
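To make the roles of the iteration count and the scale factor concrete, the following sketch applies a plain uniform-weight Laplacian smoothing to a triangle mesh. It is not the Mimics algorithm: the neighbour averaging is the textbook "umbrella" operator, and the shrinkage compensation is replaced here by a crude global rescale about the centroid, which is purely an assumption made for illustration.

```python
import numpy as np

def vertex_neighbours(n_vertices, faces):
    """List of neighbouring vertex indices for each vertex of a triangle mesh."""
    nbrs = [set() for _ in range(n_vertices)]
    for i, j, k in faces:
        nbrs[i].update((j, k))
        nbrs[j].update((i, k))
        nbrs[k].update((i, j))
    return [sorted(s) for s in nbrs]

def laplacian_smooth(vertices, faces, iterations=10, factor=0.7, compensate=True):
    """Move each vertex towards the mean of its neighbours by `factor`."""
    v = np.asarray(vertices, dtype=float).copy()
    nbrs = vertex_neighbours(len(v), faces)
    centroid0 = v.mean(axis=0)
    size0 = np.linalg.norm(v - centroid0, axis=1).mean()
    for _ in range(iterations):
        means = np.array([v[n].mean(axis=0) if n else v[i]
                          for i, n in enumerate(nbrs)])
        v += factor * (means - v)
        if compensate:
            # Stand-in for shrinkage compensation (assumption): restore the
            # original mean vertex distance from the centroid.
            c = v.mean(axis=0)
            size = np.linalg.norm(v - c, axis=1).mean()
            if size > 0:
                v = centroid0 + (v - c) * (size0 / size)
    return v
```

Without compensation, repeated averaging makes a closed surface contract, which is why a compensation step matters when vessel diameters are the quantity of interest.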
Patients H and MS were selected to perform the comparison between the segmentation software packages. Patient SS was not included in this part of the study, because we found that the small diameter (< 1.0 mm) in the stenotic region made 3D printing impossible with the adopted technology, due to the systematic breaking of this model during supports removal.
A stereolithography (SLA) Form 3 printer (Formlabs, USA) was used to generate the models, with a White rigid resin V4. For this kind of printer, the producer declares an average deviation from the ideal of (− 0.01 ± 0.03) mm, for a feature size of 4 mm [16]. After the proper orientation of the models and the creation of the support structures in PreForm 3.6 software (Formlabs, USA), a layer thickness of 0.1 mm was set. When the prints ended (Fig. 3, left panel), according to the producer's specifications, the models were washed in isopropyl alcohol for 20 min and then post-cured in the oven at 60°C for 20 min. Supports were then manually removed (Fig. 3, right panel).

Segmentation accuracy evaluation
The two printed models, namely H and MS, were then reacquired by means of a micro-CT scanner (X25 by North Star, USA), because it offers a higher resolution than a medical one (see Table 1).
At the same time, however, its scannable volume is of the order of cubic centimetres. For this reason, the left coronary artery was sawed off from the printed models, to have models that could be entirely scanned with the instrument; in fact, the region of main interest for the purpose of the work is the right coronary, because of the presence of the stenosis. In this step, the error introduced by the scanner was neglected, assuming the micro-CT as "ideal", in order not to further complicate the workflow.
Fig. 3 On the left, 3D printed model as extracted from the printer. On the right, the same model after supports removal, wash and post-curing
The obtained radial slices of the models were then converted into axial ones, to be segmented and post-processed by the two software packages. As regards Mimics, the previously described procedure was followed again for each segmentation, while for 3D Slicer it was slightly different. In this case, after the CTs were imported, the same threshold range of 293-1620 HU was selected. Differently from the proprietary software, it was necessary to create a mask for the region of interest and another one for the external region. In this way, the program was able to segment the boundary between the two zones. The segmented volume could then be exported. Due to the absence of a dedicated post-processing environment in 3D Slicer, this phase was conducted with the widespread free software Autodesk Meshmixer 3.5 (http://www.meshmixer.com/). The idea was to carefully replicate here the same fixing and smoothing steps performed on the Mimics models, to guarantee equivalence. Then, to compare the printed models with the respective reconstructions from the two software packages, five corresponding significant sections were chosen in each model (see e.g. Fig. 4), and the respective diameters were measured.
Fig. 4 Indicative five reference planes on patient A model. Here measurements were taken physically through a calliper on the printed model and digitally on the reconstructed ones
In the case of printed models, a digital calliper (resolution 0.01 mm) was used, while in the case of reconstructed models, the "best fit diameter" measurement command available in Mimics was exploited. With this tool, the diameter of the best fit circle to the contour of the 3D object at a selected point is calculated, by means of a least-squares method. Even if this is not a strictly rigorous procedure, in our view it is a necessary compromise with the practical feasibility of the measuring procedure. The idea was to rely on a single measuring environment, both for models created in Mimics and in 3D Slicer. Mimics was chosen because it offers measuring tools that are easier to use and more varied than those of 3D Slicer. The 3D Slicer STL models therefore had to be imported into Mimics. In this way, a uniform and easy measurement procedure was adopted. Each measurement was sequentially repeated three times and performed by two different trained operators.
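The idea of a least-squares best-fit diameter can be illustrated with a short sketch. It uses the algebraic (Kåsa) circle fit on 2D contour points, which is one common least-squares formulation; whether Mimics uses this or a geometric fit is not stated in the text, so the choice here is an assumption.

```python
import numpy as np

def best_fit_circle(points_xy):
    """Algebraic least-squares circle fit; returns (centre, diameter).

    Solves x^2 + y^2 = 2*a*x + 2*b*y + c for (a, b, c), where (a, b) is
    the centre and c = r^2 - a^2 - b^2.
    """
    p = np.asarray(points_xy, dtype=float)
    x, y = p[:, 0], p[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    rhs = x**2 + y**2
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    radius = np.sqrt(c + a**2 + b**2)
    return (a, b), 2 * radius
```

Applied to the points of a vessel cross-section contour, the returned diameter plays the role of the digital counterpart of the calliper reading.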

CFD analysis
The right coronary vessel of each model was isolated and CFD simulations were then conducted. Ansys Workbench/Fluent (Ansys, USA), version 19.3, was adopted. STL files from Mimics and 3D Slicer were imported into the software. In this part of the study patient SS could be included, too. A polyhedral mesh with 5 boundary layers was generated, with a cell size ranging from 0.05 to 0.3 mm, after a proper mesh sensitivity analysis [15]. Blood was modelled as an incompressible, non-Newtonian fluid, with a density of 1050 kg/m³ and a dynamic viscosity based on the Carreau model [17]. A typical physiological coronary waveform was applied as an inlet boundary condition, since patient-specific data were not available.
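For readers unfamiliar with the Carreau model, the sketch below shows how it makes viscosity depend on the shear rate. The parameter values are ones commonly quoted for blood in the literature and are an assumption here; the text does not report the values actually used in the simulations.

```python
import numpy as np

# Assumed, commonly quoted Carreau parameters for blood (not from this study)
MU_0 = 0.056       # zero-shear viscosity, Pa·s
MU_INF = 0.00345   # infinite-shear viscosity, Pa·s
LAMBDA = 3.313     # relaxation time, s
N_INDEX = 0.3568   # power-law index, dimensionless

def carreau_viscosity(shear_rate, mu_0=MU_0, mu_inf=MU_INF,
                      lam=LAMBDA, n=N_INDEX):
    """Dynamic viscosity (Pa·s) as a function of shear rate (1/s)."""
    g = np.asarray(shear_rate, dtype=float)
    return mu_inf + (mu_0 - mu_inf) * (1.0 + (lam * g) ** 2) ** ((n - 1.0) / 2.0)
```

With these parameters the viscosity decreases monotonically from `MU_0` at rest towards `MU_INF` at high shear rates, which is the shear-thinning behaviour the model is meant to capture.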
The volumetric flow rate (q) at the inlet was calculated on the basis of the fitting equation [18]
$$ q\,[\mathrm{m^3/s}] = 1.43 \cdot d^{2.55} \qquad (1) $$
where d is the inlet vessel diameter. Zero-pressure and no-slip conditions were applied at the outlet and at the wall, respectively. Laminar flow was assumed, because the maximum Reynolds number at peak flow rate was about 1650 (the Reynolds number was calculated in the worst condition, at the smallest cross-sectional area in patient SS).
Regarding the finite volume method used, a second-order scheme was set for the pressure calculation, while a second-order upwind scheme and a least-squares cell-based scheme were set for the momentum and for the gradient, respectively. To check convergence, the residuals were fixed at 10⁻⁵ both for the continuity and for the momentum in the three directions.
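The fitting relation q = 1.43 · d^2.55 and the Reynolds number check can be sketched as follows. The sketch assumes that d is expressed in metres and q in m³/s, and it uses a constant reference viscosity of 0.0035 Pa·s for the Reynolds number; both are assumptions made for illustration, since the simulations themselves use a shear-dependent viscosity.

```python
import math

def inlet_flow_rate(d_m):
    """Flow rate (m^3/s) from the inlet diameter (m), q = 1.43 * d^2.55."""
    return 1.43 * d_m ** 2.55

def mean_velocity(q_m3s, d_m):
    """Mean velocity (m/s) through a circular section of diameter d (m)."""
    return q_m3s / (math.pi * d_m ** 2 / 4.0)

def reynolds_number(velocity, d_m, density=1050.0, viscosity=0.0035):
    """Re = rho * v * d / mu, with an assumed constant viscosity (Pa·s)."""
    return density * velocity * d_m / viscosity
```

The same three functions can be evaluated at the narrowest section and at peak flow to reproduce the kind of worst-case check described in the text.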
Once calculations were completed, for each model mean velocity, maximum velocity and Wall Shear Stress (WSS) field were evaluated.

Results
Segmentation accuracy evaluation
Tables 2, 3, 4, 5, 6 and 7 report the average dimensional results with standard deviation, divided by operator and model, for each reference plane depicted in Fig. 4.
Then, starting from the measured values, the percentage differences between Mimics and calliper measurements and between 3D Slicer and calliper measurements were computed. Tables 8 and 9 report these percentage differences for each model.
The obtained values show that, in the case of models from patient H, segmentations performed with the two software packages differ on average by only a few percentage points from the printed ones. A general tendency to slightly overestimate the segmented volume can be observed. In the case of models from patient MS the behaviour is quite similar, except for plane C (corresponding to the stenotic region), where the average error is somewhat higher than for the other reference planes.
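A plausible form of the percentage difference is sketched below. It assumes a signed difference normalised by the calliper measurement, so that positive values indicate an overestimation by the software; the exact definition used by the authors is not recoverable from the text, and the diameters in the example are hypothetical.

```python
from statistics import mean

def percentage_difference(software_mm, calliper_mm):
    """Signed percentage difference, taking the calliper value as reference."""
    return 100.0 * (software_mm - calliper_mm) / calliper_mm

# Hypothetical repeated readings (mm) at one reference plane
calliper_repeats = [3.00, 3.02, 2.98]
software_repeats = [3.14, 3.16, 3.15]
diff = percentage_difference(mean(software_repeats), mean(calliper_repeats))
```

Averaging the repeated readings before taking the difference mirrors the way the tables report mean values per operator and model.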

CFD analysis
Five reference planes were selected for each model (Fig. 5), to analyse the quantities of interest. Particular attention was paid to the stenotic regions. Figures 6, 7 and 8 report the average and maximum velocities for the three patients' models, at each reference plane. Figure 9 reports a comparison of coloured maps for velocities, for models H and SS. Significant differences between software can be easily observed for model SS, specifically in the stenotic region.
WSS was also analysed. The maximum values are located in the stenotic regions. For patient H, values do not exceed 5 Pa for both reconstructions, showing good agreement between them (Fig. 10, left panel). On the contrary, for patient SS the maximum WSS reaches 120 Pa in the Mimics reconstruction, but only 23.4 Pa in the 3D Slicer one (Fig. 10, right panel). The percentage difference between the maximum WSS values thus exceeds 80% for patient SS, while it is equal to 28% for patient MS and about 20% for patient H.
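As a quick check of the patient SS figure, and assuming that the difference is expressed relative to the larger (Mimics) value, which the text does not state explicitly:

```latex
\[
\frac{\mathrm{WSS}_{\max}^{\text{Mimics}} - \mathrm{WSS}_{\max}^{\text{3D Slicer}}}
     {\mathrm{WSS}_{\max}^{\text{Mimics}}} \times 100
= \frac{120 - 23.4}{120} \times 100 \approx 80.5\%
\]
```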

Discussion and conclusions
The segmentation of vascular structures from clinical images is nowadays important for diagnosis assistance, treatment and surgical planning [19]. It is indeed a fundamental step to perform hemodynamic studies, by means of CFD simulations.
Vascular segmentation is a very specific and challenging task, which can be performed by different software packages, based on specific algorithms. Open-source alternatives are usually more popular inside the scientific community, making the research process more democratic. However, to be competitive, they have to guarantee performance comparable with that of their more renowned commercial counterparts, in terms of accuracy, versatility and ease of use.
The aim of this study was to compare the performance of two different segmentation software, the proprietary Mimics by Materialise and the open-source 3D Slicer, focusing on the aortic root with the coronary arteries. We also estimated how differences in segmentation accuracy could affect results from CFD simulations, so how small reconstruction inaccuracies propagate to fluid dynamics simulations.
We concluded that, even if the number of patients available to us was limited, both software packages guarantee good segmentation performance overall, with average differences between reconstructed and physical models in the order of a few percentage points. Indeed, looking at the measurements performed on the software reconstructions and on the 3D prints, their average differences are almost always below 5%. These low differences, also affected by the difficulties in the manual measurement process, allow both software packages to be considered as accurate segmentation tools that guarantee a realistic reproduction of the anatomy of interest. The same conclusion was reached for example in [10]. In that case, the authors dealt with skull models and the employed open-source software was InVesalius (Renato Archer Information Technology Centre, Brazil). We extended these considerations to soft tissue anatomies. However, when we considered thin details, such as the stenotic region, the segmentation performance seems to become globally worse. This might be true for patient MS, but since the stenosis is not so severe, the significance of this finding is quite limited.
The analysis of the stenosis of patient SS would have been much more interesting. Unfortunately, because of the adopted printing technology, we could not perform it. A qualitative comparison can nevertheless be made by looking at Fig. 11, where the stenotic region of patient SS is segmented from the original CTs with Mimics, on the left, and with 3D Slicer, on the right.
In this case, the stenosis is more pronounced. The portion segmented with the open-source software appears to be significantly rougher, while Mimics captures anatomical variations better. The difference is particularly large at the smallest cross-sectional area of the coronary, as measured on the obtained reconstructions (0.84 mm² at minimum for the Mimics one and 1.96 mm² for the 3D Slicer one). It is important to highlight that this result is due only to differences in the segmentation algorithms of the two software packages, and is not induced by the smoothing procedure. Toepker et al. [20] reached the same conclusions. They developed six phantoms to represent coronary arteries with stenosis and demonstrated that smaller areas and diameters had greater degrees of error compared to larger ones. In particular, a 0.20 mm² area with a 0.5 mm diameter reached an error of 664% with respect to its true size.
These considerations are reflected in the CFD simulation results. Indeed, for patient H the results are quite similar, but when we consider patients MS and, above all, SS, the results show an important discordance: differences reach up to 48% for the maximum velocity next to the stenosis and even 80.5% for the maximum wall shear stress. Given the same simulation settings, these differences are to be attributed to differences in the computational domains obtained from segmentation.
This work presents some limitations that could be overcome with future studies and developments. Firstly, the low number of patients analysed in this study did not allow us to obtain statistically significant measurements; Abdullah, for example, managed to conduct a statistically significant analysis in their comparative study [21]. The number of available models was further reduced by the breaking of the patient SS printed model, which could not be employed. The use of a different 3D printing technology, without the need for manual supports removal, might have avoided this limitation.
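A back-of-the-envelope estimate helps to see why a difference in minimum area is amplified in the hemodynamic quantities. The sketch below assumes an idealised steady Poiseuille flow in a circular section at equal flow rate, for which mean velocity scales with 1/A and wall shear stress with 1/d³; this is a simplifying assumption and does not reproduce the simulated values.

```python
import math

def equivalent_diameter(area_mm2):
    """Diameter (mm) of the circle having the given area (mm^2)."""
    return 2.0 * math.sqrt(area_mm2 / math.pi)

def poiseuille_ratios(area_a_mm2, area_b_mm2):
    """Ratios (section a relative to section b) at equal flow rate.

    Returns (mean velocity ratio, wall shear stress ratio) under the
    Poiseuille assumption: v ~ 1/A and WSS ~ 1/d^3.
    """
    velocity_ratio = area_b_mm2 / area_a_mm2
    d_a = equivalent_diameter(area_a_mm2)
    d_b = equivalent_diameter(area_b_mm2)
    wss_ratio = (d_b / d_a) ** 3
    return velocity_ratio, wss_ratio

# Minimum areas measured on the Mimics and 3D Slicer reconstructions
v_ratio, wss_ratio = poiseuille_ratios(0.84, 1.96)
```

Even in this idealised setting the smaller Mimics section yields more than twice the mean velocity and more than three times the wall shear stress of the 3D Slicer section, which is consistent in direction with the discordance observed in the simulations.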
Studies about repeatability and reproducibility of obtained models, as intended in [22], could represent an interesting and powerful complement of the work, too.
Moreover, the measurement of the 3D printed models performed by means of a calliper cannot be very accurate, mainly because of the operator's difficulty in correctly identifying the specific position and orientation at which to read the values. The availability, for example, of a laser scanner instead of the calliper could remove this operator-dependent step, and measurement accuracy would become significantly better. Similar difficulties were found in the digital measurement procedures, for which more refined tools could be devised. As regards operator independence, in order to make a comparison through a standardized procedure, an attempt was made to limit the dependence on the operator in the model creation and modification steps. Some steps, such as smoothing and geometry fixing, however, cannot be totally user-independent. Even if Laplacian smoothing at 10 iterations was seen to be a good compromise, the need for further local smoothing at specific sites still requires the manual intervention of the user. The challenge for future developments is to generalize even these steps and reach completely standardized and, hopefully, automated reconstruction procedures.
This study demonstrates the crucial importance of performing an accurate reconstruction, especially if simulations are used to support clinical decisions [23]. Even apparently negligible geometrical differences, coming from different segmentation software, can turn into very large variations of hemodynamic parameters, which highlights the crucial and delicate role of the segmentation process. This evidence always has to be taken into account in the biomedical field and especially in the study of arteries, because CFD simulations are usually the starting point for surgical considerations about stent implantations and for further stent development studies.
Funding This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. GL and FM received funding from the MIUR FISR-FISR2019_03221 CECOMES.

Conflict of interest All authors declare no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.