Introduction

Dual-energy X-ray absorptiometry (DXA) is commonly used in clinical practice to measure areal BMD (grams per square centimeter) at the proximal femur for the diagnosis of osteoporosis and has been shown in prospective studies to predict hip fractures [1]. DXA is a 2D projectional measurement of a 3D object, which limits the geometric and structural information that can be derived from a DXA exam. However, more information can be obtained from a DXA image than simply BMD [2, 3]. Hip structure analysis (HSA) is a method to obtain certain structural parameters from DXA images and has been widely employed in research studies [4–11].

Quantitative computed tomography (QCT) is considered the gold standard for obtaining 3D structural measurements of the proximal femur, particularly when it employs relatively high-resolution protocols with voxel sizes below 1 mm³. To date, there has been uncertainty as to whether DXA-based HSA can truly represent the geometry and structure of the hip in vivo as determined by QCT [12]. Several issues complicate the comparison of HSA and QCT measurements in vivo. Because the femur is positioned differently for the QCT and DXA examinations, the accurate matching of the 2D region of interest (ROI) analyzed in HSA to a corresponding 3D ROI in the QCT dataset requires a 2D–3D registration of the projectional DXA image to the QCT dataset. Also, there are important differences between the DXA and QCT measurement techniques related to how they handle bone marrow fat and partial volume effects, which may influence correlations between these measurements.

Volumetric DXA (VXA) is a newly developed technique that utilizes the rotating C-arm of a DXA device to obtain four DXA images from various angles. Using these images and with the help of a QCT-based statistical atlas, a volumetric DXA dataset can be derived [13]. The VXA process required a 2D–3D registration. Thus, in this study, we used the algorithms developed [13] for 2D–3D registration of the four DXA images to QCT to undertake a careful comparison of HSA and QCT measurements on the same individuals. Taking advantage of the methods and data available from this earlier work, in this paper, we report on an in vivo comparison of HSA on a DXA image to high-resolution QCT in a population cohort of older women. This cohort represents the most difficult clinical population to evaluate because of the presence of low bone mass and hip osteoarthritis.

Methods

Patients

Forty-eight women (mean age, 82.8 ± 2.5 years; height, 157.4 ± 6.1 cm; weight, 64.2 ± 10.7 kg; and BMI, 25.9 ± 3.9 kg/m²) were randomly recruited from the CARE Study. The CARE Study is a population-based study of ambulant elderly women, excluding only those with focal bone disease or osteomalacia [14, 15]. Informed consent was obtained from each patient, and the study was approved by the Human Research Ethics Committee of the University of Western Australia. Seven subjects were excluded: in four, the proximal femur was not scanned appropriately (it was missing from the DXA images or the QCT scan); in one, the image file was corrupted during data transfer; and in two, the femur was not successfully segmented from the QCT dataset. This left 41 subjects with complete data for this analysis. All patients whose results from both the DXA and CT could be obtained are included in the results presented.

Measurements

QCT of the right hip was measured using a Brilliance 64 CT (Philips Inc.) with a calibration phantom (Mindways, Inc.) placed below the patient. The QCT technique factors were 120 kV, 170 mAs, pitch of 1, 1 mm slice thickness, reconstruction kernel B, and 15 cm reconstruction FoV, resulting in a 0.29 mm in-plane voxel size.

DXA images of the right hip were taken on the same day as the QCT with a Discovery A DXA scanner (Hologic, Inc.), which has a rotating C-arm. After the standard PA DXA hip image was acquired, additional DXA images were acquired at angles of −21°, 20°, and 30° relative to the PA view by rotating the C-arm without patient repositioning. HSA measurements at the narrow neck (NN) and trochanteric (IT, in HSA terminology) regions [2] were made on the standard PA DXA hip image using APEX 3.0 software (Hologic, Inc.). The additional DXA images acquired at the various angles were not used in the HSA calculation; they were used only for co-registering (i.e., aligning both translationally and rotationally) the subject’s QCT dataset with the subject’s PA DXA image to produce anatomically equivalent ROI placement (Fig. 1).

Fig. 1

Four DXA views are used to constrain the location of the QCT dataset. The mid-plane slice of the HSA ROIs (NN shown) is mapped onto the QCT dataset, and parameters are calculated for this slice. Shown are the center of mass (COM), the width parameter along the PA view, and the PA perpendicular vector direction

The Hologic implementations of the HSA algorithms were licensed from the Johns Hopkins University and were implemented under the guidance of Prof. Beck. The Hologic version of HSA and the HSA software provided by Prof. Beck for various research studies have been shown to be highly correlated by Khoo et al. [16] in an independent study utilizing a fan beam Hologic densitometer equivalent to the one used in this study.

Co-registration

Periosteal and endosteal bone surfaces of the QCT datasets were segmented using the Medical Image Analysis Framework software package developed at the University of Erlangen [17]. A tetrahedral mesh model with third-order Bernstein polynomial density functions was then calculated from the segmented QCT volume [18, 19]. The meshed QCT volume was co-registered to the four DXA images using a general purpose 2D–3D deformable body registration algorithm [20–23]. A rigid registration allowing rotations and translations but not deformations was used. The 2D–3D registration algorithm used a fast GPU-based algorithm [24] to produce digitally reconstructed fan beam radiographic projections (DRRs) of the meshed volume at each angle at which a DXA image was obtained. Each of the four DRRs was compared to the corresponding DXA image using mutual information. The sum of the mutual information of these image pairs served as a cost function. An optimization routine using simulated annealing (a robust method that avoids being trapped in local minima [25]) was used to determine the correct transform for the three translational and three rotational parameters of the QCT meshed volume to co-register it with the DXA images. The inverse of this transform was used to place a 1 mm plane at the center of the HSA NN and IT ROIs (which were defined on the standard hip PA DXA image) onto the QCT dataset. This plane is the 2D slice on which the QCT parameters are calculated. The procedure of co-registration ensured that anatomically equivalent regions were measured by HSA and QCT. Because many of the QCT scans did not extend far enough below the lesser trochanter into the femoral shaft to allow a comparison to the HSA shaft ROI, the comparison at the shaft ROI was not attempted.
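The similarity measure described above can be illustrated with a minimal sketch. It assumes each image is a flat list of gray values in the range 0 to 255 and estimates mutual information from a joint histogram; the function names, the bin count, and the histogram estimator are illustrative assumptions, not the implementation used in the study.

```python
import math
from collections import Counter


def mutual_information(img_a, img_b, bins=8, max_value=256):
    """Mutual information (in nats) of two equally sized grayscale images.

    Images are flat lists of intensities in [0, max_value). The joint
    histogram estimator and the bin count are illustrative choices.
    """
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    bin_width = max_value / bins
    # Quantize intensities into histogram bins.
    qa = [min(int(a / bin_width), bins - 1) for a in img_a]
    qb = [min(int(b / bin_width), bins - 1) for b in img_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    marg_a = Counter(qa)
    marg_b = Counter(qb)
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        # p_ij / (p_i * p_j) written in terms of raw counts.
        mi += p_ij * math.log(count * n / (marg_a[i] * marg_b[j]))
    return mi


def registration_score(drrs, dxa_images):
    """Sum of mutual information over the DRR/DXA pairs (higher is better)."""
    return sum(mutual_information(d, x) for d, x in zip(drrs, dxa_images))
```

In a registration loop, an optimizer would re-render the DRRs for each candidate pose and keep the pose with the highest `registration_score`, or equivalently the lowest negated score when the optimizer minimizes a cost.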

Calculation of parameters on the QCT dataset

Cross-sectional area (CSA) in square centimeters was defined in accordance with the traditional HSA definition as the area of the slice filled with bone. In this definition, the area of each pixel is weighted by the amount of bone in the pixel.

Cross-sectional moment of inertia (CSMI) in centimeters to the fourth power (cm⁴) is defined around a given axis. In DXA HSA, CSMI is calculated and averaged over line profiles along the u direction in Fig. 1. The center line profile of HSA is a projection of the 2D slice in the PA image. \( CSMI_{HSA} \) can therefore only be calculated around an axis perpendicular to the PA image (v in Fig. 1). However, QCT is not restricted by the directionality of the PA image, and one is free to choose the axis around which CSMI is calculated.

Let (u, v, w) define an orthonormal coordinate system centered at the center of mass (COM) of the 2D slice, \( \rho(u, v) \) be the volumetric bone density in milligrams per cubic centimeter per voxel in the slice, and \( \rho_{NIST} = 1{,}850 \) mg/cm³. If \( (u_{CM}, v_{CM}) \) is the location of the COM, we define the center of mass coordinate system as:

$$ \begin{gathered} \tilde{u} = u - u_{CM} \\ \tilde{v} = v - v_{CM} \end{gathered} $$

Then:

$$ CSMI_{\mathbf{u}} = \iint \tilde{v}^2 \left( \frac{\rho(u,v)}{\rho_{NIST}} \right) du\,dv $$
$$ CSMI_{\mathbf{v}} = \iint \tilde{u}^2 \left( \frac{\rho(u,v)}{\rho_{NIST}} \right) du\,dv $$
$$ CSMI_{\mathbf{w}} = \iint \left( \tilde{u}^2 + \tilde{v}^2 \right) \left( \frac{\rho(u,v)}{\rho_{NIST}} \right) du\,dv = CSMI_{\mathbf{u}} + CSMI_{\mathbf{v}} $$

The \( \left( {\rho \left( {u,v} \right)/{\rho_{{NIST}}}} \right) \) term defines the bone fraction within a pixel and accounts for partial volume effects of the finite voxel size. This definition of the moment of inertia is consistent with that defined by Martin et al. [26] and other published literature.

In the above equations, \( CSMI_{u} \) and \( CSMI_{v} \) depend on the particular choice of the Cartesian coordinate system (u, v axes) of the 2D slice, which is in turn patient position dependent. \( CSMI_{w} \), although calculated as a sum of the latter two moment terms, is independent of patient position. This can be seen by noting that the distance term (\( {\tilde{u}^2} + {\tilde{v}^2} \)) is the square of the distance to the normal axis (w) and is not affected by the choice of the 2D coordinate system within the slice. Thus, \( CSMI_{w} \), also called the polar CSMI, is the natural choice for a 2D slice. Therefore, for the primary comparison to \( CSMI_{HSA} \), we have chosen \( CSMI_{QCT} \) to be equal to \( CSMI_{w} \).
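A discrete version of these integrals can be sketched as follows. The sketch assumes the slice is a nested list of densities in mg/cm³ with square pixels, rows running along v and columns along u, and treats each pixel as a point at its center; the function name and data layout are illustrative assumptions rather than the authors' implementation.

```python
RHO_NIST = 1850.0  # mg/cm^3, reference density quoted in the text


def slice_moments(rho, pixel_cm):
    """Return (CSA, CSMI_u, CSMI_v, CSMI_w) for a 2D density slice.

    rho[row][col] holds density in mg/cm^3 (rows along v, columns along u);
    pixel_cm is the pixel edge length in cm. Each pixel is treated as a
    point at its center, so the pixel's own moment about its center is
    ignored (an approximation for illustration).
    """
    pixel_area = pixel_cm ** 2
    # (u, v, bone fraction) for every pixel in the slice.
    cells = [(col * pixel_cm, row * pixel_cm, value / RHO_NIST)
             for row, line in enumerate(rho)
             for col, value in enumerate(line)]
    total = sum(frac for _, _, frac in cells)
    if total <= 0:
        raise ValueError("slice contains no bone")
    # Center of mass weighted by bone fraction.
    u_cm = sum(u * frac for u, _, frac in cells) / total
    v_cm = sum(v * frac for _, v, frac in cells) / total
    csa = total * pixel_area
    csmi_u = sum((v - v_cm) ** 2 * frac for _, v, frac in cells) * pixel_area
    csmi_v = sum((u - u_cm) ** 2 * frac for u, _, frac in cells) * pixel_area
    return csa, csmi_u, csmi_v, csmi_u + csmi_v
```

Swapping the u and v axes of the same slice exchanges the first two moments but leaves their sum unchanged, which is the position independence of the polar moment argued above.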

Section modulus (Z) in cubic centimeters is CSMI divided by the distance of the furthest contributing bone pixel from the axis around which CSMI is calculated.
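As a hedged illustration of this definition, the sketch below computes Z about the v axis for a slice stored as a nested list of densities. The function name, the rule that a pixel contributes only when its density is positive, and the use of pixel centers for the distance are assumptions made for the example.

```python
def section_modulus_v(rho, pixel_cm, rho_ref=1850.0):
    """Section modulus about the v axis, in cm^3.

    rho[row][col] is density in mg/cm^3 with columns running along u.
    Pixels with density <= 0 are treated as containing no bone, and
    distances are measured to pixel centers (illustrative assumptions).
    """
    pixel_area = pixel_cm ** 2
    # (u position, bone fraction) of every contributing pixel.
    cells = [(col * pixel_cm, value / rho_ref)
             for line in rho
             for col, value in enumerate(line)
             if value > 0]
    if not cells:
        raise ValueError("slice contains no bone")
    total = sum(frac for _, frac in cells)
    u_cm = sum(u * frac for u, frac in cells) / total
    csmi_v = sum((u - u_cm) ** 2 * frac for u, frac in cells) * pixel_area
    # Distance from the axis to the furthest contributing bone pixel.
    d_max = max(abs(u - u_cm) for u, _ in cells)
    if d_max == 0:
        raise ValueError("all bone lies on the axis")
    return csmi_v / d_max
```

Because the denominator depends only on the furthest contributing pixel, the result is sensitive to how "contributing" is thresholded, which is one reason partial volume effects matter for this parameter.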

Width represents the outer diameter of the bone at the ROI (Fig. 1). For HSA, this is termed the “sub-periosteal width” and is the distance calculated between the blur-corrected edges of the BMC profile [27]. Blur correction adjusts the DXA image for the apparent increase in size due to the partial volume effect. For the QCT slice, it is the distance between the edges of the bone in the QCT slice at the angle of the DXA PA view. This slice has been extracted from the QCT volume after segmentation, which added minor partial volume artifacts due to an additional interpolation step. As shown in Fig. 1, width is calculated along u to ensure co-registration with the DXA PA view.

Femoral neck axis length (FNAL) assessment did not use co-registration between the DXA image and QCT dataset because minor rotational positioning errors of the femur during PA DXA image acquisition caused errors in the placement of the FNAL when propagated to the QCT dataset. Instead, a plane perpendicular to the narrowest part of the femoral neck was automatically found on the QCT dataset. This was achieved by first defining a plane using spherical coordinates (l, θ, φ), where l is the distance of the plane from the origin, and θ and φ represent the normal vector to that plane in terms of its inclination and azimuth angles, respectively. Optimization on these three coordinates was performed using a downhill simplex algorithm in order to minimize the area of femoral neck that intersected this plane. This automated algorithm used the NN region defined above as the initial starting location of the plane. Since the algorithm started with the NN region as the initial guess, and this region is between the femoral head and greater trochanter, convergence to the plane with the narrowest area was rapid. FNAL was measured perpendicular to this plane through its center of mass from the edge of the femoral head to where the axis exited the femur distally. To reduce the effects of osteophytes, which were prevalent and visible in the QCT dataset, the measurement was repeated eight times along line segments parallel to the neck axis. The eight measurements were concentrically spaced around the neck axis. The final FNAL value was defined as the median of these eight parallel segments and the central measurement.

Statistics

Parameters calculated from the QCT dataset were considered the gold standard, and the parameters calculated by HSA were compared to QCT by linear regression analysis using GraphPad Prism V 5.03. If the offset (i.e., intercept) was not statistically different from zero at the 0.05 significance level, the analysis was repeated with the intercept restricted to zero.
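The two regression variants can be sketched in a few lines. This is an illustrative ordinary least squares implementation, not the GraphPad Prism procedure; the function names are hypothetical, and which modality is placed on the x axis is left to the caller.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares y = a + b*x.

    Returns (intercept, slope, r, see), where see is the standard error
    of the estimate with n - 2 degrees of freedom.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    see = math.sqrt(sse / (n - 2))
    return intercept, slope, r, see


def fit_through_origin(x, y):
    """Least squares slope for y = b*x with the intercept fixed at zero."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
```

A slope near unity from `fit_through_origin` indicates agreement in absolute terms, whereas a high r from `linear_fit` alone indicates only that the two measurements track each other.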

To test the sensitivity of our results to the placement of the NN ROI, the plane through the narrowest part of the femoral neck of the QCT dataset was also used as the basis for an alternate definition of the QCT NN ROI, and the resulting parameters were compared to those of the HSA NN ROI.

Results

High linear correlations (r = 0.89–0.95) were found between HSA and QCT for CSA, CSMI, and Z at the NN and IT regions (Figs. 2 and 3). The intercepts of the linear regressions were not statistically different from zero at the IT region but were statistically different from zero (p < 0.05) at the NN region (Table 1). The slopes of these parameters were all different from unity.

Fig. 2

The correlation of HSA with QCT for the narrow neck region

Fig. 3

The correlation of HSA with QCT for the trochanter region

Table 1 Results of the linear correlation of HSA vs. QCT at the NN and IT regions

The correlation of the width of the bone was r = 0.95, the slope was 0.98 for both the NN and IT regions, and the standard error of the regression line was 1 and 0.8 mm, respectively. There was no statistically significant offset. To examine whether the difference of the slopes from unity was possibly caused by the small partial volume artifact added during the extraction of the slice used for the width calculation, we set a bone threshold of 50 mg/cm³ for this slice. With this threshold, the slopes were 0.994 and 0.984 for the NN and IT ROIs, respectively. This suggests that the difference from unity can at least in part be explained by image processing of datasets with finite voxel sizes, i.e., is a consequence of the limited spatial resolution.

For FNAL, the correlation was found to be r = 0.90, and the standard error of the regression line was 2.2 mm. The offset of the linear regression was not statistically different from zero; thus, the line was fitted with the intercept restricted to zero; under these circumstances, the slope was 1.003 ± 0.004. The Bland–Altman plot showed excellent agreement of the two techniques across the range of FNALs encountered in the study with 95% confidence intervals of −0.39 to 0.45 cm (Fig. 4).
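For readers unfamiliar with the Bland–Altman approach, the following sketch computes the mean difference and an interval of mean ± 1.96 standard deviations of the paired differences. It assumes the interval quoted above corresponds to these conventional limits; the text does not state the exact formula used, and the function name is hypothetical.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower, upper) for paired measurements.

    bias is the mean of the paired differences (a - b); lower and upper
    are bias -/+ 1.96 sample standard deviations of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

Unlike a correlation coefficient, this interval is expressed in the units of the measurement itself, so it shows directly how far an individual HSA value may lie from its QCT counterpart.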

Fig. 4

Comparison of FNAL between HSA and QCT. The Bland–Altman plot is shown with 95% confidence intervals

To examine whether the high correlations seen in this study were strongly dependent on the co-registered ROI placement, we measured the correlation to the HSA NN ROI when the QCT ROI was placed in the narrowest area of the femoral neck using the automated narrow neck algorithm described in the methods section of the FNAL calculation. Correlations between HSA at the NN and the parameters calculated with this automated ROI placement on QCT were 0.92, 0.90, and 0.87 for CSA, CSMI, and Z, respectively. The difference in correlation between the parameters calculated using the two different methods of ROI placement at the NN on the QCT dataset did not reach statistical significance.

Additionally, to examine whether these high correlations could be improved by more exact correspondence between QCT and HSA, we also compared DXA \( CSMI_{HSA} \) and \( Z_{HSA} \) with the corresponding QCT calculations around the same axis v, i.e., \( CSMI_{v} \) and \( Z_{v} \). In all cases, these parameters had a marginally better correlation than \( CSMI_{w} \) and \( Z_{w} \), with r increasing by approximately 0.01. The exception was CSMI at the NN ROI, where the increase was slightly greater and reached statistical significance: the correlation coefficient for \( CSMI_{HSA} \) of the NN improved from 0.936 when it was compared to \( CSMI_{w} \) to 0.975 when it was compared to \( CSMI_{v} \) (p = 0.04 for the difference between r-values).
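The text does not state which test was used to compare the two r-values. One common choice, shown here purely as an assumption, is the Fisher z-transformation test for two correlation coefficients. It treats the two correlations as independent, which is only an approximation here because both involve the same HSA measurements on the same subjects.

```python
import math


def compare_correlations(r1, r2, n1, n2):
    """Fisher z test for the difference between two correlation coefficients.

    Returns (z, p) with a two-sided p-value. Assumes independent samples,
    which is an approximation when both correlations share a variable.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z2 - z1) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Values from the text: r = 0.936 vs. 0.975, with 41 subjects assumed for both.
z_stat, p_value = compare_correlations(0.936, 0.975, 41, 41)
```

Under these assumptions the p-value falls just below 0.05, in line with the reported significance, although a test designed for dependent correlations could give a different result.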

Discussion

The high correlations of the 2D HSA measurements of CSA, CSMI, and Z with the 3D QCT gold standard measurements support the interpretation that the HSA parameters are highly correlated with the physical quantities they are intended to represent. This is an important point as the HSA algorithm and DXA manufacturer equipment used in this study have already been utilized in many published clinical studies.

Because the calibration standards for bone mass differ between the two modalities and because they handle bone marrow fat and partial volume effects differently, it is not surprising that the slopes for CSA, essentially a measurement of the BMC in an ROI, differed from unity. This mass measurement difference also affected CSMI and Z. However, as noted in the Methods section, there is a further difference for CSMI and Z because the DXA HSA measurements are limited to calculating these values in the DXA planar projection (\( CSMI_{HSA} \) and \( Z_{HSA} \), which are around the v axis in Fig. 1), whereas the QCT measurements utilize the 3D data and were calculated around the w (polar) axis. These differences limit the comparison to correlations; thus, individual measurements cannot be substituted one for the other without adjustments, which may be population or technician dependent.

It is important to note that both the width and FNAL results indicated a high degree of agreement in absolute terms between DXA and QCT despite the use of a fan beam DXA device. Geometrical measurements on fan beam DXA devices are impaired by magnification effects if the bone being measured is not at the height above the table estimated by the scanner software. Based on in vitro studies, some have speculated that fan beam DXA may cause significant errors in geometrical measurements [28–30]. These concerns are not supported by the data in this study of elderly women with BMI 25.9 ± 3.9 kg/m², where there was no evidence for magnification in the population as a whole, as demonstrated by slopes that were nearly unity. Nor did fan beam magnification have an appreciable effect on individual subject results, as the SEEs ranged from only 0.7 to 2.2 mm. While this study does not rule out the possibility that there is a measurable magnification effect in vivo in men or severely obese women, it sets limits on the size of the magnification effect in a typical clinical population.

Another possible source of error contributing to the standard error of the estimate (SEE) of FNAL was patient positioning. The FNAL results were calculated independently on the DXA image and QCT dataset without co-registration; thus, if the femur neck during the DXA exam was not positioned parallel to the table in some subjects, it would appear shorter by varying amounts and would cause an increase in the SEE of the correlation. However, since the length of the FNAL is shortened only by the cosine of the angle the femur is mispositioned by, for small angles, this effect is negligible. Additionally, the DXA technicians in this study were highly trained and accustomed to the careful attention to detail required in research studies.
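The size of this cosine effect is easy to check numerically. The short sketch below computes the percentage by which a projected length is shortened for a given tilt of the neck axis out of the image plane; the function name and the example angles are illustrative and are not measurements from this study.

```python
import math


def foreshortening_percent(angle_deg):
    """Percent reduction in projected length for a tilt of angle_deg degrees."""
    return (1.0 - math.cos(math.radians(angle_deg))) * 100.0


# Illustrative tilt angles only; not values measured in the study.
for angle in (5, 10, 15):
    print(f"{angle:2d} deg tilt shortens the projection by "
          f"{foreshortening_percent(angle):.2f}%")
```

Because the cosine changes slowly near zero, the shortening stays well under 2% for tilts up to 10°, which is consistent with the argument that small positioning errors have little effect on FNAL.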

This expertise in patient positioning may also partially explain the important result that exactly matching the ROIs in 3D space with co-registration was not required for high correlations between DXA HSA and QCT for the NN region. We did not foresee this result: one might intuitively expect considerable variation from oblique planes caused by improper positioning, and from limiting the determination of the narrowest point to a single 2D projection of a complex 3D object. The fact that the high correlations were seen, albeit with careful positioning, encourages the use of the HSA NN region in clinical studies where co-registration is not possible as a reasonable surrogate for measuring the “true narrow neck” with QCT. This result may also be due to the femoral neck region not having a well-defined weakest location. Physiological remodeling may cause the femoral neck to have a relatively large region which has approximately the same resistance to bending and compression, which would make the exact placement of the NN region less critical.

Previously, Prevrhal et al. [12] undertook a DXA QCT study comparing narrow neck region CSMI and reported an r² of 0.5, much less than the r² of 0.81 with non-registered ROIs and r² of 0.88 for co-registered ROIs reported here. The lower correlation found in the Prevrhal study may have been due to a combination of the different hardware and algorithms used. Prevrhal et al. used a Prodigy (GE/Lunar), and the QCT was performed on a GE9800-Q (GE Healthcare, Inc.) with an Image Analysis QCT phantom and with lower spatial resolution (1 mm × 1 mm × 3 mm voxel). A global threshold was used for the segmentation of the CT data. The algorithms utilized by Prevrhal et al. were those contained in the GE/Lunar AHA® software for the DXA and, for the QCT, those developed by Lang et al. [31, 32]. Also, careful co-registration was not used. Importantly, the high correlations reported in this study cannot be generalized to other structural measurement software and hardware implementations without further validation.

In this study, we chose to only calculate on the QCT dataset that subset of HSA parameters for which highly accurate QCT results can be obtained. Even the relatively high-resolution QCT used in this in vivo study cannot measure cortical thickness below 1–1.5 mm accurately [33]. Thus, we did not calculate on the QCT dataset parameters such as cortical thickness and buckling ratio where partial volume artifacts, in particular in elderly patients with decreased cortical thicknesses, would have had large effects. As the true cortical thickness and the true cortical BMC are not known, it is also extremely difficult to correct these artifacts in a theoretically rigorous manner. The comparison in vivo of cortical parameters between QCT and HSA lacks a true “gold standard” because one is comparing two methodologies, both of which have limited accuracy. Trying to disentangle truth from assumptions for these parameters was beyond the scope of this paper.

Neither did we calculate the neck shaft angle on the QCT dataset. Neck shaft angle is not defined in three dimensions as the femoral neck axis and the line through the middle of the femoral shaft usually do not intersect in three dimensions. Additionally, as noted in the Methods section, a number of the QCTs in the study started at the distal edge of the lesser trochanter which prevented the accurate determination of the femoral shaft axis for those subjects.

In conclusion, there is high correlation between HSA and high-resolution QCT for CSA, CSMI, and Z in a cohort of elderly Caucasian women. Additionally, good absolute agreement between HSA and QCT was seen for FNAL and also width at the NN and IT ROIs. Assuming that the structural analyses in the plane of the DXA image relate to the overall structural strength of the hip, the ability of HSA to calculate these structural parameters from DXA images potentially allows the study of many interesting research questions, as well as patient assessments, without the inconvenience and much higher X-ray doses associated with QCT.