1 Introduction

The calcaneus bone has important function within the human musculoskeletal system of bearing body weight during standing and motion. The large portion of this anatomical part consists of trabecular bone; age- and osteoporosis-related changes in calcaneal bone reflects bone strength at other skeletal sites, and the heel has been widely used to predict future fracture risk predominantly using measurements by quantitative ultrasound (McCloskey et al. 2015). The in vivo assessment of bone microstructure contributes to a better understanding of the pathophysiology of diseases such as osteoporosis as well as improving our understanding of the efficacy of therapeutic interventions (Farr and Khosla 2015; Fuller et al. 2015; Liebl et al. 2014; Paggiosi et al. 2014; Nishiyama and Shane 2013; Pialat et al. 2012). High-resolution peripheral quantitative computed tomography (HR-pQCT) is a non-invasive, low-radiation technique designed to measure bone microstructural properties at peripheral sites in vivo (Putman et al. 2014; Stein et al. 2014; Boutroy et al. 2008). Although the second-generation HR-pQCT system (XtremeCT II) has provided improved image resolution (e.g. 61 μm Manske et al. 2015 and 41 μm Enns-Bray et al. 2016), the first-generation XtremeCT I system (82 μm voxel size) is still widely used for bone mechanical analysis purposes. Many recent HR-pQCT-based studies have followed a similar microFE modelling framework, which originated from the pioneering work of van Rietbergen and colleagues (van Rietbergen and Ito 2015; Van Rietbergen et al. 1995) who developed a methodology based on microCT scans of trabecular bone. The methodology was based on three main steps: (1) low-pass image filtration, (2) global single level threshold and (3) conversion of binarised image voxels into linear hexahedron elements. Since then, this modelling protocol has been revised and improved. For example, Laib and Ruegsegger (1999) applied a 3D Laplace–Hamming high-pass filter to effectively remove trabecular-edge noises from microCT images. In another study, Ulrich et al. (1998) studied the effect of mesh smoothing on microCT-based trabecular structure by using tetrahedral elements to improve the preservation of trabecular connectivity in comparison with hexahedral mesh. In another study, a volumetric topological analysis (Saha et al. 2010) was developed to classify the trabecular microstructure into plate/rod-like components, and the hexahedral elements were replaced by beam and shell elements according to their topological classification (Wang et al. 2015; Vanderoost et al. 2011).

However, it is still debated whether accurate finite element (FE) predictions can be obtained from HR-pQCT-imaged trabecular bone when the same procedures defined for microCT-based FE analysis are used. Bevill and Keaveny (2009) used a simple mesh resampling technique on human cadaveric trabecular microstructures (femoral neck, greater trochanter and vertebral body), obtained from microCT scan with 20 μm voxel size. They concluded that scans with voxel size below 80 μm can accurately predict bone strength. However, they did not account for the intrinsic reduction of signal-to-noise ratio of in vivo HR-pQCT images compared to a resampled microCT image. Depalle et al. (2013) also performed a similar investigation by considering L2 human cadaveric vertebrae, and they found that voxel size and degree of element formulation are important factors which directly impact on the accuracy of stress distribution within the trabecular microstructure in the FE model. In a more recent study, Christen et al. (2016) investigated the voxel size dependency of cadaveric distal radius compressive load by using microCT and generating microstructures using multiple downscaled images (from 50 to 150 μm) where they found small voxel size dependency at 61 μm (average error in three orthogonal loading cases of 8.2% compared to 25-μm gold-standard microCT-based model) and increased dependency at larger voxel sizes (average error of 16.7% for 82 μm). MacNeil and Boyd (2008) and Varga et al. (2011) investigated the accuracy of HR-pQCT-based microFE strength predictions (Pistoia et al. 2002) by conducting cadaveric distal radius uniaxial compressive test. In these studies, good agreement was observed between FE results and experimental data derived to calculate apparent bone strength. It should be noted that in distal radius sections the presence of dense cortical shell around the trabecular region acts as mechanical stabiliser in particular when the load is applied longitudinally. As a result, the cortical part of such a specimen bears more load than the trabecular bone. On the other hand, the cortical shell is more accurately converted to microFE compared to the trabecular region which has a wider range of greyscale distribution, if a single level threshold-based method is applied. Therefore, it is hard to understand whether the local predictions of HR-pQCT-based microFE models of distal radius or tibia are accurate in their trabecular core.

To investigate the HR-pQCT-based (82 μm voxel size) FE prediction for human distal tibia by considering whole (cortex and trabecular part) and trabecular segment as well as cubic subvolumes of trabecular region, Liu et al. (2010) pioneered a direct morphological and FE mechanical comparative study where microCT-based models for each cadaveric specimen were considered as gold standard. In this study, despite the differences between HR-pQCT and microCT measurements, high correlations between the two modelling approaches were observed. To evaluate the trabecular bone behaviour more explicitly and using microCT-based models as a gold standard, Tjong et al. (2012) applied HR-pQCT imaging on cadaveric distal radius samples by considering multiple voxel sizes (41, 82 and 123 μm) and found a strong correlation between HR-pQCT model voxel sizes and the accuracy of trabecular region morphometric parameters. In the latest study, Zhou et al. (2016) had examined the accuracy of HR-pQCT (82 μm voxel size)-based microFE predictions for human cadaveric distal radius and tibia by calculating and comparing mechanical properties (bone stiffness and yield strength) and morphological parameters against gold-standard microCT-based microFE and mechanical testing results. They reported a strong correlation between HR-pQCT- and microCT-based mechanical and morphological properties, where corrective adjustment was required to achieve the accurate values for HR-pQCT-based models in the light of their microCT counterparts. To the best of the authors’ knowledge, there has not been any established systematic study on the morphological and mechanical properties of microFE models derived from calcaneum HR-pQCT scans. Therefore, by adopting the comparative analysis approach from Liu et al. (2010) study, the aim of this study was to determine the influence of global thresholding technique of HR-pQCT on the morphological and biomechanical analyses of calcaneus trabecular bone when compared with microCT-based reconstructions and models.

2 Materials and methods

2.1 Specimen preparation

Five embalmed human cadaveric left feet were obtained from the University of Sheffield medical teaching unit where the study was approved by the ‘Medical School Ethics and Research Committee’. The age of donors ranged from 85 to 101, all female. HR-pQCT scanning (XtremeCT I, SCANCO-V1.1c, Scanco Medical) of the specimens was performed in situ with two different integration times of 100 and 200 ms. It should be noted that the scanning protocol with 100 ms integration time is the typical method used for distal tibia and distal radius. The other integration times could be chosen to decrease the scanning time and, therefore, decrease the probability of artefacts. All the other parameters were the same: 82 μm isotropic voxel size, 60 kVp voltage, 900 μA current. Afterwards, a parallelepiped portion of the calcaneus (≈ 18 × 18 × 33 mm3) was extracted by means of a diamond bandsaw under constant water irrigation (300CP, Exakt, Germany) as shown in Fig. 1.

Fig. 1
figure 1

An example of trabecular region (in orange) extracted from the dissected calcaneus (in white) for microCT imaging

The extracted calcaneus specimens (fully dissected with removed soft tissues) were then scanned with a microCT system (SkyScan 1172, Bruker, Belgium) with the following parameters: 17.41 μm isotropic voxel size, 2950 ms integration time, 100 kV voltage, 100 μA current, 180° total rotation, 0.7° rotation step, 2 × averaging. The microCT images of parallelepiped portion of calcaneus and HR-pQCT images of full feet (acquired with 100 ms integration time or 200 ms integration time, independent registrations) were rigidly registered using AMIRA 6.0.1 in order to select the same region of interest between the low- and high-resolution images.

2.2 Thresholding

To construct microCT-based trabecular models, 301 × 301 × 301 voxel cubic subvolumes were considered from the central region of the registered microCT data, where also the equivalent volumes were extracted from their corresponding HR-pQCT sets (65 × 65 × 65 voxels, one extra voxel size was considered on each side to fully capture the boundary geometry). Initially, the uint16 microCT image slices were converted into uint8 format and then filtered by applying low-pass Gaussian filter (0.5,5). The gold standards for both the morphological parameters and the mechanical properties in this study were derived from the segmentations of the microCT images which were used to generate the microCT-based models. For a high-resolution microCT images, adopting a single level threshold method is an adequate technique to separate the pixels which represent bone from surrounding marrow/air (Ding et al. 1999). To select the right threshold, the BV/TV values were initially computed at different threshold ranges. Then, by differentiating the BV/TV and threshold curve, the threshold value corresponding to the steepest curve gradient was chosen as the right input that was equal to 90 for all the specimens (Scanco Medical AG 1997). To study the effect of resampling method (similar to Bevill and Keaveny 2009), additional microstructures were also generated by resampling the original binarised microCT-based model using MATLAB-written resize function (k-wave 2015; That is, the model is downsampled by a factor of 17.41/82 ≈ 0.2) and a single value thresholding was applied to achieve BV/TV similar to the gold-standard microCT model (referred to as microCTRes).

Due to lower resolution of the HR-pQCT images compared to the microCT ones, larger partial volume effect is expected and the calculation of the BV/TV is more affected by small changes in the single level threshold values. In particular, increasing the threshold value can lead to a significant trabecular network disconnection. To study this problem, two threshold values were considered to evaluate the HR-pQCT-based structural and mechanical properties. For each specimen, the first threshold (TH1) was chosen to obtain BV/TV values similar to those obtained from the gold-standard microCT images. However, with TH1 the microFE models generated from microCT or HR-pQCT were geometrically different (Fig. 2). Therefore, a second threshold value (TH2) was used, resulting in a more similar geometry to the microCT model. To achieve this, the fractal theory was employed (Alberich-Bayarri et al. 2010), which indicates how the binarised trabecular structure is arranged at different dimensional scales and provides a systematic way to compare different microstructures. The fractal dimension curves were generated for each HR-pQCT image for different threshold ranges (Moisy 2008) and compared against the values obtained from the analyses of the corresponding microCT images. A threshold value TH2 equal to (TH1 0.04) minimised the differences in the microstructure between HR-pQCT and microCT images. Nevertheless, TH2 leads to higher BV/TV for HR-pQCT images compared to those computed in microCT images (143 ± 79% increase; see Fig. 2). The HR-pQCT-based models generated using TH1 and TH2 are represented as HR-pQCTTH1 and HR-pQCTTH2.

Fig. 2
figure 2

Illustrative representation of HR-pQCT-based modelling, using different threshold values. TH1 and TH2 represent the first and second threshold values, respectively. It should be noted that in general, there is not a linear trend between the threshold variation and the geometrical similarity between HR-pQCT- and microCT-based models (quantified using fractal dimension method)

2.3 FE computation

The binarised voxels were converted to ABAQUS/Standard 6.13-1 cubic hexahedral elements (C3D8) where implicit solver was used for all simulations with linear elastic material property, E = 10 GPa, ν = 0.33 assigned to each element (Pistoia et al. 2002). It should be noted that the Pistoia rule was defined and validated for other human anatomical sites such as wrist where in this study it is assumed that this hypothesis behaves similarly for the calcaneus part. By considering three main loading directions, 0.007 mm uniaxial compression (0.13% compressive strain) was applied on one end of all microstructures, while the opposite end was fully constrained similar to Boutroy et al. (2008) (see Fig. 3).

Fig. 3
figure 3

Illustrative representation of three compressive loading scenarios: a loading case 1, b loading case 2 and c loading case 3. The loading-side elemental nodes (indicated by the black arrows) were assigned to compressive uniaxial displacement and suppressed in the other directions, where the elemental nodes on the opposite side were fully suppressed in all directions (indicated by the red surfaces). d The position of the selected microstructures (light blue) with the calcaneus bone

2.4 Output processing

von Mises stress (Stressvm), total applied force (Ftot) and estimated failure load (Ffrct) were computed from each FE simulation for each loading case (Vilayphiou et al. 2010, 2011; Wegrzyn et al. 2010). To predict Ffrct, the Pistoia et al. (2002) criteria were used by assuming that bone failure would happen when at least 2% of the total volume of elements are loaded beyond 7000 μ strain. At the end of each FE simulation, the elemental Straineqv values and the total applied reaction force, Ftot, were extracted and imported to the MATLAB-based program to evaluate the Ffrct.

2.5 Microstructural classification

In order to study the effect of geometrical anisotropy on the mechanical response, the degree of anisotropy DA and the main trabecular direction using MIL (mean interception length) technique were calculated (BoneJ plugin in ImageJ, NIH Image). Plateness/rodness, local thickness and heterogeneous arrangement of individual trabeculae within the microstructure are important morphological features which affect the mechanical response (Liu et al. 2008). Vasilic et al. (2009) established object classification criteria which provide a good indication of local trabeculae shape by computing eigenvalues of local structure tensor and then applying Eq. 1 to calculate the plate/rod ratio:

$$ C_{\text{r}} = 2\left[ {\frac{{\lambda_{2} - \lambda_{1} }}{{\lambda_{3} - \lambda_{1} }}} \right] - 1,\quad \lambda_{1} < \lambda_{2} < \lambda_{3} $$
(1)

where λ1, λ2 and λ3 represent the eigenvalues and Cr ∈ (− 1,1) where shifting towards − 1 and 1 corresponds to more rod-like and plate-like geometries, respectively. For effective computation of local structure tensor, it is required to obtain the medial surface of trabecular microstructure by using so-called digital object skeletisation (or thinning) methodology. To achieve this, a 3D thinning algorithm was generated using MATLAB script based on original work of Manzanera et al. (1999) and its improved version by Stauber and Muller (2006). This technique provides fully connected single voxel thickness medial surface of binarised 3D image data which make it possible to construct a representative voxel cloud about each medial point to evaluate structure tensor. In addition, by using distance transform technique, Euclidean distance of skeleton voxels was calculated from nearest zero voxel (mineralised phase) of full 3D image data to evaluate the local structural thickness, Sthick.

3 Results

3.1 Morphological analysis

The physical size of binarised voxels has a significant impact on geometrical pattern in microstructural local scale, in particular when single level thresholding scheme is used to construct trabecular HR-pQCT-based models. An example of such a considerable geometrical variation is shown in Fig. 4:

Fig. 4
figure 4

Effect of voxel size on trabecular geometrical pattern: a microCT-based model, b microCTres-based model and c, d HR-pQCT-based models (integration time = 100 ms) with minimum thresholds TH2 and TH1, respectively

The BV/TV values for the five cadaveric specimens are relatively scattered (8.70 ± 4.90%, 3.80–14.31%) where similar observation is also reported by Ulrich et al. (1995) for the calcaneus site. Specimens 1 and 5 have relatively lower BV/TV values compared to the other specimens (Fig. 5). The effect of integration time was almost negligible between 100 and 200 ms, except for HR-pQCTTH2. BV/TV is similar for microCT, microCTRes and HR-PQCTTH1, whereas the BV/TV of HR-pQCTTH2 is 2–3 times higher than the other values.

Fig. 5
figure 5

BV/TV (%) values for all specimens under microCT, microCTRes, HP-pQCTTH2 and HR-pQCTTH1 for the integration time of 100 and 200 ms

On average, the specimens 1–4 had DA = 0.66 ± 0.03, whereas for specimen 5 (highly osteoporotic) DA was equal to 0.99. Also, the averaged angular difference between the main trabecular direction and the x-axis (see Fig. 3) was 13.57 ± 3.80° for specimens 1–4 and 72.24° for specimen 5.

By investigating the plateness/rodness (Cr) distribution curves of MicroCT models, it can be noted that the local geometry of rod/plate-like members is well estimated since the area under the curves for Cr < 0 is significantly larger for the less-dense specimens 1 and 5 (see specimen 1 in Fig. 6 and specimens 2–5 in “Appendix 1”) due to higher proportion of rod-like trabeculae compared to specimens 2, 3 and 4. However, unlike the microCT models, microCTRes and HR-pQCTTH1 microstructures show significantly overestimated distributions for rod-like members for all the cases. Considering HR-pQCTTH2, the reduction of threshold value has an adverse effect on the local morphologies for specimens with lower BV/TV (i.e. 1 and 5) where more members are identified as plate-like components through Cr distribution curves. Also, for all five specimens including the data from scans performed with the two integration times, the Cr distributions of HR-pQCTTH1 models, in general, are more similar to their equivalent microCTRes.

Fig. 6
figure 6

Sample 1, plateness/rodness distribution for HR-pQCT models with integration times a 100 ms and b 200 ms and their counterparts registered microCT and microCTRes models

For all five specimens, the microCTRes and HR-pQCTTH1/TH2 models have significantly larger Sthick compared to their corresponding gold-standard microCT models considering both integration times (Fig. 7).

Fig. 7
figure 7

Averaged structural thickness values for microCT, microCTRes, HR-pQCTTH1 and HR-pQCTTH2 models for scanning integration times of 100 and 200 ms

3.2 Mechanical analysis

The average von Mises stress (mean of microFE elements von Mises stress values) was similar between microCT and HR-pQCTTH2 models (loading condition 1, Fig. 8 and loading conditions 2 and 3, “Appendix 2”). MicroCTRes and HR-pQCTTH1 models have 2–3 times lower stress, and both are similar.

Fig. 8
figure 8

Averaged and standard deviation von Mises stress for microCT, microCTRes, HR-pQCTTH1 and HR-pQCTTH2 models for scanning integration times of 100 and 200 ms and for loading condition 1

The total reaction force (equivalent to apparent stiffness) was overestimated by the HR-pQCTTH2 model (except for sample 4), whereas the microCTRes and HR-pQCTTH1 models significantly underestimated it (loading condition 1, Fig. 9 and loading conditions 2 and 3, “Appendix 2”). A large variability was also found from one specimen to another.

Fig. 9
figure 9

Total reaction force for microCT, microCTRes, HR-pQCTTH1 and HR-pQCTTH2 models for scanning integration times of 100 and 200 ms and for the loading condition 1

By considering these five sample cases, the HR-pQCTTH2 model results in significantly overestimated Ffrct values, especially for low-BV/TV samples (#1 and #5); see Fig. 10 for the loading condition 1 and “Appendix 2” for the loading condition 2 and 3. The HR-pQCTTH1 model underestimated the failure load and was similar to microCTRes model.

Fig. 10
figure 10

Averaged estimated failure load for microCT, microCTRes, HR-pQCTTH1 and HR-pQCTTH2 models for scanning integration times of 100 and 200 ms and for the loading condition 1

4 Discussion

Ex vivo mechanical finite element and microstructural analyses were conducted on HR-pQCT scans of calcaneum trabecular subvolumes with microCT-based models used as gold standard. For these five sample cases, the local structural thickness, Sthick, of generated microCTRes and HR-pQCTTH1/TH2 models is significantly different from the gold-standard microCT model, whereas this difference is lower for microCTRes models. The inaccurate estimation of trabecular thickness by using XtremeCT I system is consistent with the results from Manske et al. (2015) for human cadaveric radii samples. It was found that increasing the integration time for HR-pQCT scans does not noticeably improve the geometrical quality of generated microstructures; as a result, their mechanical properties were also less improved. Also, Cr curves similarities between HR-pQCTTH1 and microCTRes models highlight the dominant role of voxel sizes on the accuracy of morphological properties, since the microCTRes is generated purely through downsampling method. A similar trend was observed between HR-pQCTTH1 and microCTRes FE results, which again shows a principal influence of voxel size on the quality of image-based microFE models.

The calculated stiffness of the selected regions of interest in calcaneus parts has an anisotropic property with highest and lowest stiffness in loading directions 3 and 2, respectively (see Fig. 3). It was also found that the averaged angular difference between the main trabecular direction and the x-axis is 13.57 ± 3.80° which is fairly alongside the third loading direction, resulting in maximum stiffness compared to the other loading cases. It is noteworthy that this anisotropic mechanical response is also observed in HR-pQCT and resampled-based microstructures as well. Therefore, it can be stated that the HR-pQCTTH1 models are potentially able to catch anisotropic mechanical response, according to the trabecular orientations.

One of the main objectives of this work was to perform comparative study on failure load prediction, Ffrct, in the light of the imaging/modelling technique. From both geometrical analysis and FE results, it was found that the single value threshold technique does not provide accurate image-based trabecular microFE model with voxel size of 82 μm. Using HR-pQCT-based microstructure with overestimated BV/TV value (i.e. HR-pQCTTH2), the Stressvm was similar to the microCT model. However, it overestimated the apparent stiffness (measured as Ftot). It should be pointed out however, it was assumed that the bone elastic modulus is identical for all the samples and independent from the structural and physiological properties.

The microCT models have different Sthick to HR-pQCT. This means, for microFE models which are generated from 82 μm voxel size images, there is a wide image greyscale distribution. In contrast, for higher-resolution microCT images, the smaller voxel size will result in narrower greyscale distribution. Therefore, using single value threshold technique is more efficient for images with narrow intensity distribution such as cases with small voxel size or homogenous geometry (e.g. cortical bone). It is important to note that the (Pistoia et al. 2002) criteria have been established based on axial loading on the distal radius where cortical bone encloses the trabecular region. From several FE analyses on distal radius fracture (e.g. HR-pQCT with 82 μm resolution, Vilayphiou et al. 2010), it is reported that the cortical part of the distal radius bears more load (i.e. stress) than the trabecular region and therefore acts as mechanical stabiliser by transmitting force into applied loading direction. Also, due to almost homogenous geometry of cortical bone, this region has narrow image greyscale distribution where using single value threshold technique results in accurate cortical microFE. Therefore, potential inaccurate mechanical predictions resulting from the impact of large voxel size on trabecular microFE were masked by the cortical dominance. In our study, the use of purely trabecular calcaneus microFE models confirms that HR-pQCTTH1 models do not sufficiently represent the accurate microstructure geometry, especially for specimens with low BV/TV (e.g. sample 5). In particular for such low-BV/TV samples, there were weak trabecular connectivities due to their insufficient voxel resolution; therefore, performing the single level threshold method resulted in the elimination of large amounts of microstructure that were visually realised (for example, see Fig. 4).

Although only five calcaneum cadaveric samples were used in this study, high variability in terms of architecture was found from one sample to another and yet the mechanical analysis was fairly consistent for all samples providing confidence that the main outcomes of this study are valid for any calcaneum trabecular bones.

5 Conclusions

Overall it was found that the morphology of XtremeCT I HR-pQCTTH1 models for calcaneus trabecular bone is significantly different to their corresponding microCT models. Due to wide greyscale distribution of HR-pQCT images, the final HR-pQCT-based microFE is very sensitive to single level threshold technique where the whole morphology and BV/TV can be affected by the chosen threshold. By analysing microCTRes and HR-pQCTTH1 models and both morphological and mechanical similarities between the two, it was found that voxel size has a dominant effect on the microFE quality when the single value threshold method is used for generating voxel-based microstructure. The mechanical simulation results showed significant difference between the predictions of gold-standard microCT and XtremeCT I HR-pQCTTH1 models that in average are 70% lower for von Mises stress, 64% lower for apparent stiffness and 33% lower for failure load prediction.

In conclusion, a single threshold value cannot predict both morphological structures (BV/TV) and biomechanical analyses (stress, apparent stiffness and failure load prediction) correctly. Since the resolution of XtremeCT I is not sufficient to resolve trabecular thickness, caution should be made to use it to monitor the effect of the progression of the disease or the effect of a bone drug if the target is, for example, bone remodelling since accurate trabecular thickness will not be achieved. Alternatively, employing more advanced segmentation strategies (e.g. local adaptive threshold method Burghardt et al. 2007) may improve the accuracy of HR-pQCT-based microstructures; nevertheless, a comprehensive geometrical and mechanical evaluation study is still required, in particular for the trabecular regions with small BV/TV values.