Soft tissue coverage on the segmentation accuracy of the 3D surface-rendered model from cone-beam CT

The aim of this study is to investigate the effect of soft tissue presence on the segmentation accuracy of the 3D hard tissue models from cone-beam computed tomography (CBCT). Seven pairs of CBCT Digital Imaging and Communications in Medicine (DICOM) datasets, containing data of human cadaver heads and their respective dry skulls, were used. The effect of the soft tissue presence on the accuracy of the segmented models was evaluated by performing linear and angular measurements and by superimposition and color mapping of the surface discrepancies after splitting the mandible and maxillo-facial complex in the midsagittal plane. The linear and angular measurements showed significant differences for the more posterior transversal measurements on the mandible (p < 0.01). By splitting and superimposing the maxillo-facial complex, the mean root-mean-square error (RMSE) as a measurement of inaccuracy decreased, though not significantly, from 0.936 to 0.922 mm (p > 0.05). The RMSE value for the mandible, however, significantly decreased from 1.240 to 0.981 mm after splitting (p < 0.01). The soft tissue presence seems to affect the accuracy of the 3D hard tissue model obtained from a cone-beam CT, although the effect remains below a generally accepted level of clinical significance of 1 mm. However, this level of accuracy may not meet the requirement for applications where high precision is paramount. Accuracy of CBCT-based 3D surface-rendered models, especially of the hard tissues, is crucial in several dental and medical applications, such as implant planning and virtual surgical planning on patients undergoing orthognathic and navigational surgeries. When these models are used in applications where high precision is paramount, the effect of soft tissue presence should be taken into consideration during the segmentation process.


Introduction
Low-dose cone-beam computed tomography (CBCT) for three-dimensional (3D) imaging of the maxillo-facial structures has been increasingly used in recent years in the medical and dental fields, and it extends to a wide range of applications [1][2][3]. Accurate visualization of the face, the jaws, and their components is important for clinical diagnostics and decision-making [4]. Nowadays, the application of 3D cephalometric analysis plays an important role in cases of complex maxillo-facial abnormalities [5,6] and for the evaluation of growth or treatment outcomes [7][8][9]. Accurate 3D surface-rendered models, especially of the hard tissues, are crucial in applications such as virtual surgical planning on patients undergoing orthognathic surgery, during dental implant and prosthetic procedures, and when simulating treatment outcomes [10][11][12][13][14][15].
Surface-rendered 3D models of the hard tissues are derived from the CBCT Digital Imaging and Communications in Medicine (DICOM) data, which are originally composed of voxels, each with its own gray value based on the radiation absorbed by the tissues during the scan. By means of specific software applications, reconstructions are possible in sagittal, coronal, and axial planes, allowing the operator to scroll through the tissue structures in any direction of interest. Most applications also include reconstruction techniques that extract images from the DICOM data similar to those from conventional radiographic techniques, such as orthopantomograms and lateral cephalometric radiographs [16]. In order to construct a 3D digital model of specific tissues out of the original voxel-based data, a specific process called segmentation has to be followed. Briefly, the operator provides the software with an upper and a lower threshold delimiting the gray-level range of the voxels that represent the tissues of interest. The software then discards all data outside these limits and depicts only the voxels within the set threshold values.
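The thresholding step described above can be sketched in a few lines. This is a minimal illustration only, not the procedure of any particular segmentation package: the function name `threshold_segment`, the gray values, and the threshold limits are all hypothetical, and a real DICOM volume would be a 3D array rather than a short list.

```python
def threshold_segment(gray_values, lower, upper):
    """Return a binary mask: 1 for voxels inside [lower, upper], 0 otherwise."""
    return [1 if lower <= value <= upper else 0 for value in gray_values]


# Hypothetical gray values for one short row of voxels (not measured data).
row = [-980, 40, 310, 1250, 1900, 75]

# Hypothetical thresholds chosen so that only the denser voxels are kept.
mask = threshold_segment(row, lower=300, upper=2000)
print(mask)  # [0, 0, 1, 1, 1, 0]
```

Voxels whose gray value falls outside the two limits are discarded (mask value 0), and only those inside the range remain as candidates for the surface model.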
Factors influencing the quality and accuracy of the models provided by the segmentation process can be divided into three main categories [17]. The first comprises CBCT system factors, such as scanner type, field of view settings, and voxel size settings [18][19][20][21]. The second comprises patient-related factors, such as positioning of the patient in the scanner, metal artifacts, and the soft tissue covering [22][23][24][25]. The third comprises operator-related factors, such as the segmentation process itself, the employed software, and the operator performing the segmentation [26][27][28][29]. Most studies published on this particular subject were based on research using dry skulls.
The use of the dry skull without any soft tissue substitute may present a drawback, as it does not simulate the human anatomy in real life. The soft tissues are sources of scattering radiation during a radiographic examination, resulting in deterioration of the signal-to-noise ratio and, subsequently, of the gray-level value differentiation in voxels between tissues of different density, presumably resulting in lower bone image quality [30]. As a result, measurements obtained from images of a human dry skull may deviate from the "true" values that would be obtained from the same subjects with soft tissue coverage. Interpretation of these results may be misleading for decision-making in clinical practice. To overcome this error, some studies introduced latex balloons filled with water as a soft tissue equivalent [31,32]. One of the drawbacks of this method is that it does not reflect the soft tissue properties or distributions in real life. Alternatively, human cadavers may better represent the human anatomy; they are, however, difficult to obtain. The effect of soft tissue presence on the segmentation process and the resulting 3D surface-rendered volumetric models is not yet well documented, specifically when all the above-mentioned affecting parameters are standardized.
Therefore, the aim of this study is to investigate whether the soft tissue presence affects the segmentation accuracy of the 3D surface-rendered volumetric models from cone-beam CT in a setting in which other affecting parameters were controlled.

CBCT Data acquisition
A sample of seven DICOM datasets was used for this study, containing the scans of seven anonymous cadaver heads comprising both edentulous and partially edentulous jaws that were initially scanned with the KaVo 3D exam scanner (KaVo Dental GmbH, Bismarckring, Germany) using a standardized scanning protocol [28]. Subsequently, the cadavers were meticulously macerated according to an established protocol by the Department of Anatomy, University Medical Centre Groningen, Groningen, The Netherlands [33]. This procedure resulted in the respective seven dry skulls that were scanned with the same scanner and following the original protocol as used during the cadaver scans. The dry skulls were repositioned in the scanner using the laser reference lines of the unit [34]. Yaw, pitch, roll, and height settings were obtained from the first series of cadaver scans by means of the manufacturer's software in order to achieve similar positioning. For both sets of scans, a 0.3-mm voxel size with a 17-cm field of view was used. Therewith the complete sample used in this study consists of DICOM datasets of two groups, scans of the cadaver heads with soft tissue, and scans of the respective dry skulls without soft tissue.

The segmentation protocol
The DICOM datasets were exported from the unit's dedicated software and imported into ITK-SNAP, a specialized segmentation software package for medical imaging [35]. The procedure starts with determining the region of interest and threshold levels, followed by setting the seeding points in the tissues of interest, which results in a very close approximation of the 3D structures with neighboring intensities. The final 3D surface-rendered model of the bone surface is formed by segmenting using the level-set method to drive the active contour evolution as coded in the ITK-SNAP software [35]. It is based on region competition causing the active contour to reach equilibrium at the boundary of the regions, i.e., the borders of the hard tissues, subsequently discarding voxels representing the surrounding soft tissues. The segmentation procedure for the DICOM datasets was performed in a random order following a segmentation protocol fitting the software package. This procedure is partly operator dependent, since each cadaver and dry skull requires an individual set of segmenting thresholds. All segmentations were performed by one operator and repeated three times at 1-week intervals to determine intraobserver reliability. This segmentation procedure resulted in digital 3D surface-rendered hard tissue models for both groups, with and without the presence of soft tissues.

Measuring procedure
The effect of soft tissue on the 3D surface-rendered hard tissue models was analyzed by comparing the two groups of models. The models representing the dry skulls were set as the reference 3D models, against which the models derived from the cadavers with the soft tissues were tested. Two different methods were employed for comparisons. The first method was based on linear and angular measurements using the Simplant O&O software (Materialise Dental, Leuven, Belgium). A series of landmarks was selected and identified on the 3D digital models, and subsequently, the defined linear and angular measurements were performed by the software [32,33,36]. These measurements are illustrated in Figs. 1 and 2 for the maxillo-facial complex and the mandible, respectively, and were assessed three-dimensionally on the surface-rendered models. The landmark identifications were performed blindly and in random order by one operator and repeated three times at 1-week intervals to test for intraobserver reliability. The second method was based on volume superimposition and color mapping of the surface differences, which are described in detail in the following section [37,38].
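As an illustration of how a linear and an angular measurement follow from three identified landmarks, the sketch below computes a point-to-point distance and the angle at a vertex landmark. It is not the implementation of the measurement software used in the study; the function names and the landmark coordinates are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def distance(p, q):
    """Euclidean distance between two 3D landmarks (same unit as the input)."""
    return math.dist(p, q)


def angle_deg(a, vertex, b):
    """Angle in degrees at `vertex` between the rays vertex->a and vertex->b."""
    u = [ai - vi for ai, vi in zip(a, vertex)]
    w = [bi - vi for bi, vi in zip(b, vertex)]
    dot = sum(ui * wi for ui, wi in zip(u, w))
    cos_theta = dot / (math.hypot(*u) * math.hypot(*w))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical landmark coordinates in mm (illustrative, not measured data).
zygion_r = (-60.0, 0.0, 0.0)
zygion_l = (60.0, 0.0, 0.0)
nasion = (0.0, 60.0, 0.0)

zygion_width = distance(zygion_r, zygion_l)          # 120.0 mm
divergence = angle_deg(zygion_r, nasion, zygion_l)   # about 90 degrees
print(zygion_width, round(divergence, 3))
```

A width such as the zygion width is a plain distance between a right and a left landmark, while a divergence angle of the kind described for nasion is the angle from the right landmark through the vertex to the left landmark.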

Superimposition and color mapping
This procedure was performed using Geomagic Studio software package (Geomagic Solutions, USA). First, the two regions of interest were defined, namely the maxillo-facial complex and the mandible. In this study, the maxillo-facial complex was defined as the area including all external facial surfaces below a horizontal plane, passing through the highest point of the orbital rims, parallel to the Frankfurt horizontal, and ventral to a vertical coronal plane which crosses both temporo-zygomatic sutures, excluding the mandible. The second region of interest was the mandible which was compared completely, including all external lingual and facial surfaces.
Before comparing, the models had to be superimposed (Fig. 3). For both regions of interest, an automatic operator-independent matching was followed; the software performs the best-fit matching based on the least-squares differences between the models for optimal superimposition. The maxillo-facial complexes were matched on their complete external facial surfaces. The mandibles, however, given the known potential transversal dimensional deformation due to the maceration and drying techniques [39], were matched on their frontal area defined as all external surfaces of the mandible ventral to a vertical coronal plane passing through both mental foramina (Fig. 3c). In addition, both the maxillo-facial complex and the mandible models were split in their midsagittal planes in order to further minimize any interfering effect of such dimensional deformation and isolate the soft tissue effects. The resulting left and right parts were matched and compared separately (Fig. 3e, f).
The differences between the models were displayed by means of color mapping of surface deviations and by sampling the models to approximately one million polygons each. In addition, the software provides information for each comparison indicating the magnitude of deviation between the surfaces of the two registered volumes. The signed average of the surface differences represents the overall mean of the surface differences, obtained by simply adding up the positive and/or negative distances for all surface polygons. By definition, differences are marked positive when the tested surface lies outside the reference surface and vice versa. The root-mean-square error (RMSE) was used as an absolute measure of model surface deviations, in order to account for positive and negative differences which otherwise could cancel each other out [38,40]. In addition, the mean values of the left and right sides were calculated.
Fig. 1 Measurements on the maxillo-facial complex: 1, 2 orbita width L (left) and R (right), distance between points medio-orbitale and zygomaticofrontal medial suture; 3 zygion width, distance between points zygion R and L; 4 frontozygomatic width, distance between points zygomaticofrontal medial suture R and L; 5 nasal canal width, distance between points lateral piriform aperture R and L; 6 nasal width, distance between points medio-orbitale R and L; 7, 8 orbita height, distance between point orbitale and the line crossing both supraorbital points; 9 nasion-anterior nasal spine height, distance between points nasion and ANS; 10 facial divergence angle nasion, angle from zygion R point to nasion point to zygion L point; 11 facial divergence angle ANS, angle from zygion R point to ANS point to zygion L point
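The difference between the signed average and the RMSE can be made concrete with a small sketch. It assumes that the per-polygon signed distances between the registered surfaces are already available as a list; the function names and the deviation values are hypothetical and do not come from the study data.

```python
import math


def signed_average(deviations):
    """Mean of signed surface distances; positive and negative values can cancel."""
    return sum(deviations) / len(deviations)


def rmse(deviations):
    """Root-mean-square error; squaring removes the sign before averaging."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


# Hypothetical signed distances in mm (positive: tested surface lies outside
# the reference surface; negative: inside). Not measured data.
deviations = [0.5, -0.5, 1.0, -1.0]

print(signed_average(deviations))    # 0.0
print(round(rmse(deviations), 4))    # 0.7906
```

In this constructed case the signed average is exactly zero although no polygon coincides with the reference surface, which is why the RMSE is the more informative measure of overall deviation.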

Statistical analysis
The reliability of all linear and angular measurements was expressed by intraclass correlation coefficients (ICC) for absolute agreement based on a two-way random effects analysis of variance (ANOVA) between the three repeated measurements. All variables were tested for normality using the Shapiro-Wilk test and were found to be normally distributed, with all p values being >0.05. The significance of differences between the measurements performed on both groups and the differences between RMSE values was calculated using the paired-samples t test. The level of statistical significance was set at 0.05. The RMSEs and signed averages of differences were calculated for the selected surfaces on each maxillo-facial complex and mandible model. Means and standard deviations were calculated. All statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS).
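For readers who want to see the two statistics spelled out, the sketch below computes a single-measure ICC for absolute agreement from the two-way ANOVA mean squares, and the paired t statistic. It is written under assumptions: the text does not state whether single or average measures were reported, so the single-measure form, often written ICC(2,1), is assumed here; the function names and all numbers are hypothetical; and the p value, which requires the t distribution from a statistics package, is not computed.

```python
import math
import statistics


def icc_absolute_agreement(table):
    """Single-measure ICC, two-way random effects, absolute agreement.

    `table` holds one row per subject and one column per repeated session.
    """
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def paired_t_statistic(reference, test):
    """t statistic of the paired differences (test minus reference)."""
    diffs = [t - r for r, t in zip(reference, test)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Hypothetical repeated measurements in mm: 3 subjects x 3 sessions.
sessions = [[41.2, 41.3, 41.1], [38.9, 39.0, 39.0], [44.5, 44.4, 44.6]]
print(round(icc_absolute_agreement(sessions), 3))

# Hypothetical paired values in mm (reference group vs. test group).
print(round(paired_t_statistic([1.0, 0.9, 1.1, 1.0], [1.2, 1.0, 1.5, 1.1]), 3))
```

The t statistic would then be compared against the t distribution with n - 1 degrees of freedom to obtain the p value reported in the tables.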

Measurement reliability
No differences between the repeated segmentations were found exceeding the level of 0.3 mm, which was the voxel size (Fig. 4). These values confirm the reliability of the used segmentation protocol as being excellent. The ICCs of the reliability tests on all measurements obtained from both reference and test groups varied within 0.94-1.00 and were accordingly classified as being excellent.

Linear and angular measurements on the maxillo-facial complex
The results from the linear and angular measurements performed on the maxillo-facial complex are shown in Table 1. The mean differences for all linear measurements on the maxillo-facial complex ranged between −0.41 and −0.78 mm (Table 1). The differences between the reference and test groups for the angular measurements were between −0.41° and −0.12°. Both were found statistically non-significant (p > 0.05). The results from the linear and angular measurements performed on the mandible are shown in Table 2. Significant differences were found for the mandibular width at the coronoid process (1.95 mm, p < 0.01), at condylion laterale (1.42 mm, p < 0.01), and at condylion medial (3.60 mm, p < 0.001). The angular measurements showed significant differences for the coronoid divergence angle (1.67°, p < 0.001), the condyle lateral divergence angle (1.22°, p < 0.01), and the condyle medial divergence angle (1.85°, p < 0.001).

Volume comparisons of the maxillo-facial complex
The results of the volume superimposition and surface comparisons of the maxillo-facial complex between the reference and test models are shown in Table 3. The mean RMSE was 0.936 mm with a standard deviation of 0.188 mm when comparing the complete maxillo-facial complexes. The mean signed average of difference between surfaces was −0.092 mm with all values being negative. Dividing the maxillo-facial complex into a left and right side for separate registration and comparison decreased the mean RMSE to 0.922 mm with a standard deviation of 0.195 mm; this decrease was not significant (p > 0.05).
Fig. 4 An illustration of the segmentation reliability based on color mapping. The segmentation reliability was assessed by superimposing and color mapping two surface models of the same maxillo-facial complex segmented at two different sessions. Green color represents differences within 0.3 mm, which equals the voxel size

Volume comparisons of the mandible
The results of the mandible comparisons are shown in Table 4. Analyzing the complete mandibles registered on the frontal area resulted in a mean RMSE of 1.240 mm with a standard deviation of 0.375 mm. The mean signed average of difference between surfaces was −0.189 mm with all values being negative. Dividing the mandible into a left and right side significantly decreased the mean RMSE to 0.981 mm (p < 0.01).
Table legend: Mean values. Standard deviations (SD). Average differences (Δ) with confidence intervals (CI) at 95 % and level of significance of the differences (P). *p < 0.05, **p < 0.01, ***p < 0.001

Discussion
The aim of this study was to investigate whether the soft tissue covering affects the segmentation accuracy of the 3D surface-rendered volumetric models derived from cone-beam CT, which to our knowledge has not been studied previously. Well-defined linear and angular measurements, combined with volume comparisons that have been demonstrated to be an accurate tool for measuring differences between 3D digital models, were used to illustrate the soft tissue effect [41][42][43]. Our results indicate that the soft tissue presence seems to affect the accuracy of the 3D hard tissue model of both the maxillo-facial complex and the mandible obtained from a cone-beam CT, although the effect remains below a generally accepted level of clinical significance of 1 mm [44]. It must nevertheless be noted that this level of accuracy may not meet the requirements of all applications, especially where higher precision is paramount [45,46]. All segmentations, linear and angular measurements, and volume comparison procedures were performed repeatedly by one single operator. The repeatability of the segmentation process was found to be excellent, with the differences between repeated segmentations below the set voxel size of 0.3 mm. The ICCs of the linear and angular measurements were classified as "excellent", which is in agreement with previous studies reporting excellent reliability of this method within and among different observers [43,47]. The differences of the linear and angular measurements for the maxillo-facial complex were below 1 mm. In the mandible, larger differences were observed, which could be taken to demonstrate a considerable effect of soft tissue on the inaccuracy of the models. However, when these results are analyzed carefully and the maceration and drying process involved in this study is taken into consideration, such a conclusion might be misleading. Previous studies showed an effect of drying on bone morphology, particularly concerning the mandible.
The posterior transversal dimension of pig mandibles was affected by up to 2.7 % [48]. This is in agreement with the present study, in which both the linear and angular measurements and the superimpositions showed differences for the mandibular transversal measurements. Although comparing the maxillo-facial complexes did not reveal such dimensional distortion as seen in the mandible, we could not presume that this region was not affected. In order to minimize any effect of the possible transversal deformation due to the maceration and drying process involved, the mandible and the maxillo-facial complex models were split in their midsagittal planes and the left and right parts were registered and compared separately. The characteristic measure of the inaccuracy of a tested model compared to the reference is the root-mean-square error (RMSE), which serves as a measure of how far the average error, i.e., the distance difference between the surfaces of the two models, is from 0. Whereas the mean RMSE values of the maxillo-facial complex barely differ before and after the splitting, that of the mandible significantly decreased to less than 1 mm, which is generally considered a clinically acceptable threshold [44]. Further, all mean differences between the surfaces were negative values. This shows a trend in which the dry skull models lie slightly within the cadaver models when the two models were optimally superimposed, indicating that the dry skull models are smaller in general, which could be caused by a change in humidity [39]. Although this is a consistent finding, the differences never exceeded the limits of the voxel size of 0.3 mm used in this study. These findings confirm the results from previous studies demonstrating that research data in which dry skulls were used, especially concerning the mandible, should be interpreted with the knowledge that significant changes may have taken place in the craniofacial dimensions [48].
Special consideration was taken to control the possible affecting parameters, in order to focus on the effect of soft tissue presence on the accuracy of the 3D model, although this would partly limit the external validity of this study. All dry skulls were scanned using the same cone-beam CT unit, with identical scan settings and similar positioning of the skull in the field of view. Given the fact that the voxel size is another affecting parameter, adjusting the setting to a higher scan resolution could compensate for the effect of soft tissue presence. Concerning this parameter, it has been indicated that the accuracy of the models appears to be connected to the voxel size [49], although the differences were small. However, others did not find increased accuracy of linear measurements on segmented surface models when decreasing the voxel size from 0.4 to 0.25 mm [34]. In our study, the voxel size was fixed to 0.3 mm.
Head positioning is also considered a possible affecting parameter. A study of the accuracy of linear measurements using wires glued to the skull reported no clinically relevant effect [50]. Head positioning has been shown to be an affecting variable in the accuracy of the 3D cephalometric measurements based on 3D CBCT surface images and the reliability of linear measurements in some studies [50][51][52] and to be an insignificant factor for measurement accuracy in others [19]. Since the range of positioning deviations for error-free measurements is not yet known, we specifically aimed at achieving similar positioning using the laser reference lines of the unit to control the positioning as a parameter affecting the accuracy.
Two other parameters assumed to have a possible effect on the segmentation accuracy are the operator performing the segmentation process and the software utilized. The software package employed requires the operator to define individual hard tissue threshold values for each skull and cadaver. A previous study showed that a commercial software company produced more accurate surface models compared to an experienced 3D clinician [33]. This higher accuracy could be the result of a more experienced operator performing the segmentation or ascribed to the use of different tools and methods provided by different software packages. Both these affecting factors were well controlled in this study by utilizing one operator with a high intraobserver reliability performing segmentations on both the cadaver and dry skull and the use of one professional software package. This study used a limited but unique sample of paired CBCT DICOM datasets from seven human cadaver heads and their respective dry skulls, contributing to a method in which the surface models with and without a natural soft tissue coverage could be compared. The effect of the scattering radiation due to the presence of soft tissues, resulting in deterioration of the signal-to-noise ratio and subsequently of the gray-level value differentiation in voxels, was well simulated. The method has a clear advantage over the use of artificial media, such as water balloons, to mimic the coverage of soft tissues [32,53]. However, it has to be acknowledged that this method still contains a number of limitations. The preservation and storage of the cadavers was done in formaldehyde embalming fluid which could have increased the soft tissue thickness, altered the bone properties of the cadaver heads, and subsequently influenced the scattering radiation during the radiographic examination [54]. Therefore, in this study, the cadaver heads were not meant for representing the exact human anatomy of a living being. 
For the purpose of the present study, they were merely used as ex vivo equivalent of human heads to investigate the effect of soft tissue coverage.
Further research should focus on other affecting parameters resulting in a 3D surface-rendered model including the hardware, software, and operator-dependent parameters. By adjusting and improving these parameters, the effect of soft tissue presence should be minimized resulting in more accurate 3D surface-rendered models suitable for high-accuracy demanding applications.

Conclusion
Considering the inherent limitations of any method involving preparation of dry skulls, it seems that the soft tissue presence does affect the segmentation accuracy of the 3D hard tissue model of both the maxillo-facial complex and the mandible obtained from a cone-beam CT, however, below a generally accepted level of clinical significance of 1 mm. As this level of accuracy may not meet the requirements of applications where high precision is paramount, further studies should investigate how to optimize the setting parameters to overcome the potential inaccuracy of CBCT-derived surface models related to the presence of soft tissue.

Compliance with ethical standards
Conflict of interest The authors declare that they have no competing interests.
Funding None.
Ethical approval The manuscript does not contain clinical studies or patient data.

Informed consent Not required.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.