Reliability of HR-pQCT Derived Cortical Bone Structural Parameters When Using Uncorrected Instead of Corrected Automatically Generated Endocortical Contours in a Cross-Sectional Study: The Maastricht Study

Most HR-pQCT studies examining cortical bone use an automatically generated endocortical contour (AUTO), which is manually corrected if it visually deviates from the apparent endocortical margin (semi-automatic method, S-AUTO). This technique may be prone to operator-related variability and is time consuming. We examined whether the AUTO instead of the S-AUTO method can be used for cortical bone analysis. Fifty scans of the distal radius and tibia from participants of The Maastricht Study were evaluated with AUTO, and subsequently with S-AUTO by three independent operators. AUTO cortical bone parameters were compared to the average parameters obtained by the three operators (S-AUTOmean). All differences in mean cortical bone parameters between AUTO and S-AUTOmean were < 5%, except for lower AUTO cortical porosity of the radius (−16%) and tibia (−6%), and cortical pore volume (Ct.Po.V) of the radius (−7%). The ICC of S-AUTOmean and AUTO was > 0.90 for all parameters, except for cortical pore diameter of the radius (0.79) and tibia (0.74) and Ct.Po.V of the tibia (0.89), without systematic errors on the Bland–Altman plots. The precision errors (RMS-CV%) of the radius parameters between S-AUTOmean and AUTO were comparable to those between the individual operators, whereas the tibia RMS-CV% between S-AUTOmean and AUTO were higher than those of the individual operators. Comparison of the three operators revealed clear inter-operator variability. This study suggests that the AUTO method can be used for cortical bone analysis in a cross-sectional study, but that the absolute values, particularly of the porosity-related parameters, will be lower.
Electronic supplementary material: The online version of this article (10.1007/s00223-018-0416-2) contains supplementary material, which is available to authorized users.


Introduction
The geometry and density of the cortex of long bones are important determinants of bone strength. Since most of the bone mass lost with age is cortical, fractures at advanced age occur often at sites that mainly consist of cortical bone [1][2][3]. Additionally, it has been shown that cortical porosity is a good predictor of bone strength [4][5][6]. High resolution peripheral quantitative computed tomography (HR-pQCT) is a non-invasive three-dimensional imaging modality that has the ability to measure volumetric bone mineral density (vBMD) and microarchitecture of the cortical and trabecular region [7]. Furthermore, HR-pQCT images can be used in micro-finite element analyses (µFEA) to calculate bone strength indices [8].
Identification of the cortical region on HR-pQCT images is challenging; the transition from endocortical to trabecular bone is gradual, so no single voxel identifies the end of the cortex and the beginning of the medullary canal with its trabecular content [9]. Currently, studies examining HR-pQCT-derived cortical bone parameters mainly use a semiautomatic method (S-AUTO) provided by the manufacturer to distinguish the cortical region from the trabecular region. With this method, an endocortical contour is generated automatically first [10,11], and the operator then manually modifies the generated contour when it visually deviates from the apparent endocortical margin. However, this method may be prone to operator-related variability and is time consuming (approximately 1 h per scan) [12]. This is particularly problematic in large cohort studies, where image analysis will be done by several operators and data from many participants need to be analyzed.
In an in vivo scan-rescan study, Kawalilak et al. showed that use of the uncorrected endocortical contour instead of the S-AUTO contour resulted in the same repeatability of cortical bone parameters [12]. However, it is currently unknown whether the uncorrected contour can reliably be used for the assessment of cortical bone parameters in cross-sectional studies. Additionally, the magnitude of differences in cortical parameters due to inter-operator variability in modification of the contour is unknown.
In this study, we examined whether the uncorrected automatically generated contour (AUTO) instead of the S-AUTO contour can be used for cortical bone analysis in an in vivo cross-sectional study. Therefore, cortical bone parameters were first obtained with the AUTO method, and then with endocortical contours that were corrected by three independent operators. The cortical bone parameters obtained with the AUTO method were compared to the average of the cortical bone parameters obtained with the S-AUTO method of the three independent operators (S-AUTOmean). Additionally, the cortical bone parameters obtained by the three independent operators were compared to each other. We hypothesized that the AUTO method can reliably be used for cortical bone analysis.

Study Population and Design
Data from The Maastricht Study, an observational prospective population-based cohort study, were used. The rationale and methodology of this study have been described previously [13]. In brief, the study focuses on the etiology, pathophysiology, complications, and comorbidities of type 2 diabetes (T2DM) and is characterized by an extensive phenotyping approach. Eligible for participation were all home-dwelling individuals aged between 40 and 75 years and living in the southern part of the Netherlands. Participants were recruited through mass media campaigns and from the municipal registries and the regional Diabetes Patient Registry via mailings. Recruitment was stratified according to known T2DM status, with an oversampling of individuals with T2DM for reasons of efficiency.
The present study includes cross-sectional data from a subset of 63 consecutive participants with normal glucose metabolism, who completed the baseline survey between September 2010 and June 2013 and returned to the research center between September 2015 and January 2016 for the HR-pQCT scan of the distal radius and tibia. Participants with a radius and/or tibia scan with severe or extreme motion artifacts (i.e., quality grade 4 or 5 [14], n = 11), participants with a scan with an inadequate position of the reference line (reference line not on the plateau of the distal radius or distal tibia, n = 1), and participants with extreme outliers on almost all cortical parameters (> 2 SD from mean, n = 1) were excluded, resulting in a final study population of 50 participants. The study has been approved by the institutional medical ethical committee (NL31329.068.10) and the Minister of Health, Welfare and Sports of the Netherlands (Permit 131088-105234-PG). All participants gave written informed consent.

HR-pQCT Imaging
The non-dominant radius and ipsilateral tibia were scanned on a HR-pQCT scanner (XtremeCT; Scanco Medical AG, Brüttisellen, Switzerland) using the standard in vivo protocol as described in the literature [7,15]. In case of a history of a fracture of the distal radius or tibia at that site, the contralateral site was scanned. The forearm and leg were immobilized in a carbon fiber cast. An anteroposterior scout projection of the scan site was acquired for positioning of the tomographic acquisition. A reference line was placed on the plateau of the distal radius or distal tibia. The scan started 9.5 and 22.5 mm, for the radius and tibia respectively, from the reference line in the proximal direction and spanned 9.02 mm in length. Images were reconstructed using an isotropic voxel size of 82 µm, thus resulting in 110 consecutive slices. Total scan time was 2.8 min, with each acquisition resulting in an effective dose of approximately 3 µSv. All scans were graded once (operator 1) with regard to subject motion, and scans with quality grade 4 (severe motion artifacts) or 5 (extreme motion artifacts) were repeated once [14]. Only scans with quality grade 1 to 3 (no, minor or moderate motion artifacts) were used for subsequent image analysis [16].
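The reported scan geometry can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the constant and function names are illustrative, and it assumes that the reported offset marks the edge of the scanned region nearest the reference line, with the region extending proximally from there.

```python
# Scan geometry as reported in the text for the standard in vivo protocol.
VOXEL_SIZE_MM = 0.082   # isotropic voxel size (82 µm)
N_SLICES = 110          # consecutive slices per scan

# Distance from the reference line to the start of the scan (proximal direction).
OFFSET_MM = {"radius": 9.5, "tibia": 22.5}


def scan_length_mm(n_slices=N_SLICES, voxel_mm=VOXEL_SIZE_MM):
    """Axial length covered by the stack of slices."""
    return n_slices * voxel_mm


def scan_region_mm(site):
    """Return (start, end) of the scanned region, measured from the reference line.

    Assumption: the region starts at the reported offset and extends proximally.
    """
    start = OFFSET_MM[site]
    return start, start + scan_length_mm()


if __name__ == "__main__":
    print(f"scan length: {scan_length_mm():.2f} mm")
    for site in ("radius", "tibia"):
        start, end = scan_region_mm(site)
        print(f"{site}: {start:.2f} to {end:.2f} mm from the reference line")
```

With 110 slices of 82 µm each, the stack covers the 9.02 mm stated above.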

HR-pQCT Image Analysis
All scans were evaluated using the standard patient evaluation protocol that was provided by the manufacturer and has been described previously in detail [17][18][19]. First, the periosteal contour was automatically derived and manually modified by operator 1 when contours visually deviated from the periosteal boundary [20]. The endocortical contour was automatically created using a series of automatic morphological operations to separate the trabecular and cortical volumes of interest [10], resulting in the uncorrected contour (AUTO method). Then, according to Burghardt et al. [10], when the contour visually deviated from the apparent endocortical margin, it was manually corrected (S-AUTO method). Correction of the AUTO contour was performed three times by three independent operators (operator 1 (OP1), operator 2 (OP2), and operator 3 (OP3)). All images were then analyzed using the advanced cortical evaluation protocol provided by the manufacturer [10,11]. All three operators underwent the same training for modification of the endocortical contour.

Statistics
All statistical analyses were performed using the Statistical Package for Social Sciences (version 22.0; IBM, Chicago, Illinois, USA). Mean and standard deviations for all cortical bone parameters were calculated using the S-AUTO contours of OP1, OP2, OP3, and (three times) the AUTO contour. All single evaluations with the AUTO method gave the same result; the average of the three AUTO evaluations (AUTOmean) is thus equal to AUTO. The average of the results obtained using the S-AUTO contours of OP1, OP2, and OP3 is referred to as S-AUTOmean. The mean difference in cortical bone parameters was calculated between S-AUTOmean and AUTO, and between all individual operators. A paired samples t test was used to test for significant differences in mean cortical bone parameters between these pairs. Non-normally distributed variables were log transformed using the natural logarithm. The Pearson correlation coefficient (r) and the intraclass correlation coefficient (ICC) were calculated to measure linear dependence and the level of agreement of the cortical bone parameters between S-AUTOmean and AUTO and between all individual operators. The precision error for all cortical bone parameters was calculated as the root mean square coefficient of variation (RMS-CV%) of the three operators (OP1, OP2, and OP3), the pairs of operators (OP1 and OP2, OP1 and OP3, and OP2 and OP3), and the average of the semi-automatic method and the automatic method (S-AUTOmean and AUTO) [21]. Bland-Altman plots were provided to visualize agreement between S-AUTOmean and AUTO, and between the independent operators. The limits of agreement were calculated as the mean difference ± 1.96 × SD of the differences. A p value < 0.05 was considered statistically significant.
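As an illustration of the precision error, the sketch below computes an RMS-CV% in Python. The text cites [21] for the method but does not spell out the formula, so this assumes the commonly used definition: for each scan, the coefficient of variation (sample SD divided by the mean of the repeated evaluations) is squared, the squares are averaged over all scans, and the square root is expressed as a percentage. The function name and the example values are hypothetical; the analyses in this study were run in SPSS.

```python
import math
from statistics import mean, stdev


def rms_cv_percent(measurements):
    """Root mean square coefficient of variation, in percent.

    measurements: one sequence per scan, holding the repeated values of a
    single parameter (for example one value per operator, or the pair
    S-AUTOmean and AUTO).
    Assumed definition: sqrt(mean over scans of (SD / mean)^2) * 100.
    """
    squared_cvs = []
    for values in measurements:
        cv = stdev(values) / mean(values)  # sample SD (n - 1) over the mean
        squared_cvs.append(cv ** 2)
    return 100.0 * math.sqrt(mean(squared_cvs))


if __name__ == "__main__":
    # Hypothetical cortical porosity values (%) for three scans and three operators.
    ct_po = [(2.0, 2.2, 2.1), (3.0, 3.3, 3.0), (1.5, 1.4, 1.6)]
    print(f"RMS-CV% = {rms_cv_percent(ct_po):.2f}")
```

Passing two values per scan gives the pairwise precision errors (OP1 and OP2, or S-AUTOmean and AUTO); passing three values per scan gives the error across all three operators.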

Results
Data from 50 participants with normal glucose metabolism and a HR-pQCT scan of the distal radius and tibia with quality grade 1-3 were used for analysis. The mean age of the participants was 57.3 ± 8.7 years and 60% were women. Five (10.0%) scans of the distal radius and 25 (50.0%) scans of the distal tibia were graded as quality 1, 29 (58.0%) scans of the distal radius and 18 (36.0%) scans of the distal tibia were graded as quality 2, and 16 (32.0%) scans of the distal radius and 7 (14.0%) scans of the distal tibia were graded as quality 3.

Mean Cortical Bone Parameters
The mean cortical bone parameters obtained by the individual operators and by AUTO and the differences in mean cortical parameters between S-AUTOmean and AUTO are shown in Table 1 (radius) and Table 3 (tibia). The differences in mean cortical parameters of the pairs of operators (OP1-OP2, OP1-OP3, and OP2-OP3) are shown in Table 2 (radius) and Table 4 (tibia).
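The comparison behind these tables can be sketched as follows. This is a minimal illustration in Python, not the analysis code used in the study: it assumes that S-AUTOmean is the per-scan average of the three operators' values and that the percent difference is expressed relative to S-AUTOmean, so that a negative value means AUTO is lower. Function names and example values are hypothetical.

```python
from statistics import mean


def s_auto_mean(op1, op2, op3):
    """Per-scan average of the three operators' values for one parameter."""
    return [mean(triple) for triple in zip(op1, op2, op3)]


def percent_difference_of_means(auto, s_auto):
    """Difference in group means, AUTO relative to S-AUTOmean, in percent.

    Assumption: S-AUTOmean is the reference, so negative means AUTO is lower.
    """
    return 100.0 * (mean(auto) - mean(s_auto)) / mean(s_auto)


if __name__ == "__main__":
    # Hypothetical cortical porosity values (%) for two scans.
    op1, op2, op3 = [2.0, 4.0], [3.0, 5.0], [4.0, 6.0]
    auto = [2.7, 4.5]
    reference = s_auto_mean(op1, op2, op3)
    print(f"difference: {percent_difference_of_means(auto, reference):.1f}%")
```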

Correlation Coefficients
Pearson's r and the ICC of all bone parameters of the three operators (OP1-OP2-OP3) and of the two methods (S-AUTOmean-AUTO) are shown in Table 1 (radius) and Table 3 (tibia). Pearson's r and the ICC of all bone parameters of the pairs of operators (OP1-OP2, OP1-OP3, and OP2-OP3) are shown in Table 2 (radius) and Table 4 (tibia).
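To illustrate how an ICC differs from Pearson's r, the sketch below computes one common form of the ICC in Python. The text does not state which ICC model was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption made only for this example, and the function name is hypothetical.

```python
def icc_2_1(table):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    table: list of rows, one row per scan and one column per rater or method.
    The choice of this ICC form is an assumption; the source does not name one.
    """
    n = len(table)       # number of scans
    k = len(table[0])    # number of raters or methods
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical values: the second method reads exactly 1 unit higher.
    shifted = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
    print(f"ICC(2,1) with a constant offset: {icc_2_1(shifted):.3f}")
```

In the example, the two columns are perfectly correlated (Pearson's r of 1), yet the absolute-agreement ICC is below 1 because of the constant offset. An ICC of this type is therefore sensitive to a systematic shift between two methods, whereas Pearson's r is not.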
The correlation coefficients for S-AUTOmean-AUTO and the pairs of operators were > 0.9 for almost all cortical bone parameters of both the distal radius and tibia (except for log Ct.Po.Dm of the radius: S-AUTOmean-AUTO 0.89, OP1-OP3 0.88, OP2-OP3 0.87, and log Ct.Po.Dm of the tibia: S-AUTOmean-AUTO 0.80). The ICC of S-AUTOmean and AUTO was high (> 0.81) for almost all

Precision Error
The RMS-CV% for all cortical bone parameters of both the distal radius and distal tibia are shown in Table 5 (OP1-OP2-OP3 and S-AUTOmean-AUTO) and

Bland-Altman Plots
The Bland-Altman plots showed no systematic error in any of the plots of S-AUTOmean-AUTO (Fig. 1 (cortical porosity of the radius and tibia); Supplemental Figs. 1 (radius) and 2 (tibia)) and of the individual operators (Fig. 1 (cortical porosity of the radius and tibia); Supplemental Figs. 3 (radius) and 4 (tibia)). The 95% limits of agreement were the widest in the plots of S-AUTOmean-AUTO, displaying more variability in error between S-AUTOmean and AUTO than between OP1 and OP2, OP1 and OP3, and OP2 and OP3. Additionally, use of the AUTO method instead of the S-AUTO method resulted in lower absolute values of the porosity-related parameters, while clear variability in cortical porosity between the individual operators was also observed (Fig. 1). One outlier (> 2 SD from mean) was observed in the Bland-Altman plots of the distal radius of S-AUTOmean-AUTO and OP1-OP2 for the parameters Ct.TV, Ct.BV, Ct.Th, Ct.vBMD, and Ct.Ar. Slice 75 of this specific scan is shown in Fig. 2. Clear differences in the location of the endocortical contours of AUTO, OP2, and OP3 are visible when compared to the contour of OP1.
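The quantities shown in a Bland-Altman plot can be sketched as follows. This minimal Python example assumes the standard construction (bias as the mean of the paired differences, limits of agreement as bias ± 1.96 times the sample SD of the differences); the function name and the values are hypothetical, and the study's own plots were produced with other software.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b.

    bias is the mean of the paired differences a - b; the limits of agreement
    are bias -/+ 1.96 * SD of the differences (sample SD, n - 1).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical cortical porosity values (%) for four scans.
    s_auto = [2.4, 3.1, 1.9, 4.0]
    auto = [2.0, 2.8, 1.7, 3.3]
    bias, lower, upper = bland_altman(s_auto, auto)
    print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
```

A bias that differs from zero reflects a shift in absolute values between the two methods, while wider limits indicate more scan-to-scan variability in the difference.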

Discussion
In this study, we evaluated whether the AUTO contour instead of the S-AUTO contour could be used for cortical bone analysis in a cross-sectional study. Therefore, the (average) results obtained with the AUTO method were compared to the average results of the S-AUTO method. Additionally, variability in results as obtained with the different S-AUTO contours was examined. The smallest differences in mean cortical bone parameters, the highest ICCs and lowest precision error were found between independent operators. Additionally, the precision error of the three operators was generally lower than the precision error of S-AUTOmean and AUTO. This indicates that correction of the contours, even when the scans are analyzed by several independent operators, introduces a smaller error than the use of the uncorrected, automatically generated contours. However, the mean differences in cortical bone parameters between S-AUTOmean and AUTO were highly comparable to the differences between OP1 and OP3. The absolute level of agreement between S-AUTOmean and AUTO was high and the Bland-Altman plots of the differences between S-AUTOmean and AUTO in cortical parameters showed no systematic error.
The precision errors of the cortical bone parameters of the distal radius of OP1-OP3 and of OP2-OP3 were generally comparable to those of S-AUTOmean-AUTO. In contrast, the precision errors of the cortical bone parameters of the distal tibia of S-AUTOmean and AUTO were higher than the precision errors of the individual operators, particularly for the porosity-related parameters. Compared to precision errors for cortical bone parameters obtained by a short-term repositioning study that used S-AUTO [10], the precision errors of S-AUTOmean and AUTO were higher, particularly those of Ct.Po.V and Ct.Po of the distal tibia. In contrast, for Ct.BMD and Ct.Th, but not for Ct.Po, the error added to the precision error by not correcting the endocortical contour was comparable to the error introduced by variability in positioning of the reference line [22]. Additionally, the precision error of S-AUTOmean and AUTO for Ct.Po and Ct.Po.Dm of the distal radius was comparable to the precision error of a donor specimen that was scanned and evaluated at 9 different sites [23]. Although it is known that motion of the subject has a large influence on the precision error of densitometry and trabecular microarchitectural parameters [14], the influence of motion of the subject on cortical bone parameters has not been examined. Thus, the AUTO method seems to add about the same amount to the precision error as variability in the position of the reference line and as multicenter scanning.
Table 6 Root mean square coefficient of variation of the cortical bone parameters of the distal radius and tibia for the independent operators. Ct.TV cortical total volume in mm³, Ct.BV cortical bone volume in mm³, Ct.Th cortical thickness in mm, Ct.vBMD cortical vBMD in mgHA/cm³, Ct.Po.V cortical pore volume in mm³, Ct.Po cortical porosity in %, Ct.Po.Dm cortical pore diameter in mm, Ct.Ar cortical area in mm², OP1 operator 1 semi-automatic contouring method, OP2 operator 2 semi-automatic contouring method, OP3 operator 3 semi-automatic contouring method
Although all operators underwent the same training, clear differences between the endocortical contours of the three operators were visible (Fig. 2). The explanation for this difference is probably the gradual transition from cortical to trabecular bone, which makes identification of the endocortical border challenging [1,9]. As a result of the presence of this transitional zone, each operator 'sees' his or her own truth and modifies the AUTO contour in a slightly different way, thereby including variable parts of the transitional zone. Currently, there are no published studies that compare different endocortical contours with histology (the gold standard), and it is thus unknown which contour marks the endocortical border correctly. Since the porosity of the trabecular compartment is much higher than the porosity of the cortical compartment, variability in inclusion of the transitional zone will influence the observed cortical bone parameters and will lead to an increased RMS-CV%, particularly of the porosity-related parameters. As shown in Fig. 2, the endocortical contour created by AUTO was closer to the periosteal border than the contour created by the independent operators, and thus included a smaller part of the transitional zone. As a result, use of AUTO results in lower absolute values of cortical porosity when compared to using the S-AUTO method (Fig. 1).
This is in agreement with the study by Kawalilak et al., who also showed that the S-AUTO method resulted in a larger trabecularized cortex compared to the AUTO method [12]. Studies using AUTO contours instead of corrected contours should therefore take into account that this will lead to lower absolute values of all parameters, except for the cortical vBMD, which will be higher. Limitations of our study include the generalizability; the mean age of our study population was 58 years, and therefore most of the included women will be postmenopausal. It may be expected that the AUTO method will also be reliable in premenopausal women because the endocortical border is often better recognizable in younger women [1]. Additionally, a study by Kawalilak et al. showed that the reproducibility of cortical bone parameters when the AUTO contour is used is better in premenopausal than in postmenopausal women [12]. In contrast, recognition of the endocortical border is more difficult in older individuals, which may lead to problems with both the AUTO and S-AUTO method. Future studies are warranted to examine the validity of the use of the AUTO method in both younger and older study populations. A second limitation of our study is the lack of scan-rescan data. We were therefore not able to determine the short- and/or long-term reproducibility of the uncorrected and corrected contours. However, a previous study showed a high reproducibility of both the AUTO method and the corrected contours [12]. Third, the study was not designed for comparing the two methods for a clinical outcome such as fracture prediction or treatment of osteoporosis. Future studies are warranted to examine whether the AUTO contour can also be used in studies with clinical outcomes. Fourth, for the examination of the cortical compartment, we used the method provided by the manufacturer.
Therefore, all results in this study are valid only for this method and cannot be extrapolated to other algorithms such as the commercially available StrAx© software [24]. Finally, the reference line for the HR-pQCT scans was placed at a fixed reference point, which resulted in the scanning of the same region in every participant. However, bone morphology at that region differs between individuals, with a higher amount of cortical bone present in participants with relatively short extremities. Therefore, recent studies suggest scanning at a percentage distance of the total length of the bone [25,26].
In conclusion, the S-AUTO method resulted in better reliability of the cortical bone parameters than the AUTO method in a cross-sectional study. However, correction of the AUTO contour by different operators resulted in clear variability in cortical parameters. Additionally, the percent differences in cortical parameters, the ICCs, and the precision errors of the radius between the uncorrected automatically generated contour and the corrected contour were highly comparable to the errors observed between independent operators, and no systematic error was observed in the Bland-Altman plots. Therefore, we believe that the AUTO contour can be used for cortical bone analysis in a cross-sectional study, although it should be taken into account that use of the AUTO instead of the S-AUTO method will result in lower absolute values, particularly of the porosity-related parameters. The lower cortical parameters with AUTO and the variability in parameters between the individual operators can be explained by variable inclusion of the transitional zone.
Author contributions EACdW, PPMMG, AK, BvR, and JPWvdB designed the study and prepared the first draft of the paper. EACdW is guarantor. EACdW, CS, and AP contributed to the experimental work. EACdW, JJAdJ, PPPMG, AK, BvR, and JPWvdB were responsible for statistical analysis of the data. All authors revised the paper critically for intellectual content and approved the final version. All authors agree to be accountable for the work and to ensure that any questions relating to the accuracy and integrity of the paper are investigated and properly resolved.