Introduction

Radiation exposure is directly associated with cancer risk [1–3]. The earlier the radiation exposure, the higher the risk of radiation-induced cancer [4, 5]. Children have a higher mitotic rate and therefore increased susceptibility to radiation, and a longer lifespan in which to accumulate dose and manifest radiation-induced cancer [4, 6]. Repeated spine radiographs in adolescent scoliosis [7] and fluoroscopy in tuberculosis [8] are associated with increased risk of breast cancer. There is no minimum dose threshold below which radiation carries no cancer risk; the dose response is linear for solid cancers and linear-quadratic for leukaemia [4, 5]. The Committee on Biological Effects of Ionizing Radiation VII lifetime risk model suggests that an increase of 100 mSv above background radiation could cause 1 cancer per 100 people [9]. The typical effective dose (ED) of one chest radiograph in a 10-year-old child is 0.006 mSv [5]. A study on cumulative radiation doses in children with spinal dysraphism calculated mean childhood cumulative ED of 23 mSv with an additional cancer risk of 0.37 % (1 in 270) based on a risk of 16 % per Sv [10]. Therefore, the lowest dose investigation that meets clinical need should be used, particularly in patients where repeated exposures are required.
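The risk figure quoted for the spinal dysraphism cohort can be checked with a few lines of arithmetic. The sketch below assumes a simple linear model (risk = dose × risk coefficient), which is what the quoted figures imply; the function name is illustrative and does not come from the cited study.

```python
def additional_cancer_risk(dose_msv: float, risk_percent_per_sv: float) -> float:
    """Additional lifetime cancer risk (in %) under an assumed linear model."""
    dose_sv = dose_msv / 1000.0           # convert mSv to Sv
    return dose_sv * risk_percent_per_sv  # % risk = Sv x (% per Sv)


# Figures quoted in the text: 23 mSv cumulative ED, 16 % per Sv
risk_percent = additional_cancer_risk(23, 16)
one_in_n = 100.0 / risk_percent           # express the risk as "1 in N"

print(f"{risk_percent:.2f} % (about 1 in {one_in_n:.0f})")
```

This gives a risk of about 0.37 %, in line with the quoted value of roughly 1 in 270.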

Densitometric vertebral fracture assessment (VFA) was first described by Genant in 2000 [11, 12]. There is a range of favourable VFA literature in adults [13–16], demonstrating sensitivity and specificity ranging from 62 to 97 % and 94 to 99 % respectively [14, 15, 17–22]. VFA is recommended as a complement to densitometry for improved clinical evaluation of asymptomatic VF in adults [23–25]. Although the importance of VF in the definition of osteoporosis in children is well established [26] and despite VFA being associated with lower radiation doses of 3–20 μSv [23, 27, 28] compared to 600–3000 μSv for radiographs [23, 27, 28], there are no recommendations for VFA in children. Generally, children with suspected reduced bone mineral density (BMD) have dual energy x-ray absorptiometry (DXA) to assess BMD and radiographs to identify vertebral fractures (VF), leading to significant lifetime cumulative radiation dose.

The aim of this study was to determine whether DXA, specifically iDXA (GE Healthcare Lunar iDXA, Buckinghamshire, UK), can replace radiographs for diagnosis of VF in children with suspected reduced BMD either with primary osteoporosis such as osteogenesis imperfecta or with secondary osteoporosis such as those treated with steroids or who have leukaemia.

Methods

The study was funded by the National Institute for Health Research “Research for Patient Benefit Programme” (Reference PB-PG-0110-21240). Local ethics committee and Research and Development approval (Reference 11/YH/0292) and patient/guardian assent/consent were obtained.

Two hundred and fifty patients aged 5 years to 15 years (inclusive) were recruited between November 2011 and February 2014 from two tertiary paediatric centres; 200 with suspected reduced BMD attending the metabolic bone clinic for iDXA and lateral spine radiographs and 50 attending spine clinic requiring lateral spine radiographs as part of routine care who were consented for an additional lateral iDXA. Participants were only recruited into the study once (Fig. 1).

Fig. 1 Flow chart demonstrating patient recruitment process from metabolic bone and spine clinics

One hundred and fifty-one patients were recruited prospectively and 99 retrospectively (33 from our centre, 66 from Birmingham Children’s Hospital, BCH).

Assuming (1) a true VF rate of 30 % and (2) 80 % sensitivity/specificity for the tests, recruiting 250 patients (75 with VF) would allow sensitivity/specificity to be estimated to within ±9 % for DXA and ±6 % for radiography, with 95 % confidence.
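Precision figures of this size can be reproduced with the normal-approximation (Wald) interval for a proportion. This is only a sketch: the source does not state which formula was used, so the choice of the Wald interval and the helper name `ci_half_width` are assumptions.

```python
import math


def ci_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the Wald 95 % confidence interval for a proportion p
    estimated from n subjects (assumed formula, not stated in the source)."""
    return z * math.sqrt(p * (1.0 - p) / n)


n_total = 250
n_vf = round(n_total * 0.30)   # patients expected to have a VF
n_no_vf = n_total - n_vf       # patients expected to have no VF

hw_vf = ci_half_width(0.80, n_vf)        # precision with the VF subgroup
hw_no_vf = ci_half_width(0.80, n_no_vf)  # precision with the non-VF subgroup

print(f"n = {n_vf}: +/- {hw_vf * 100:.1f} %")
print(f"n = {n_no_vf}: +/- {hw_no_vf * 100:.1f} %")
```

Under these assumptions, an 80 % proportion estimated from the 75 patients with VF has a half-width of about 9 %, and one estimated from the remaining 175 patients has a half-width of about 6 %.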

iDXA was performed according to published recommendations [29]. Radiographs were obtained on one of two local machines (TH3 Digital or TH Bucky Diagnost, Philips, Guildford UK) or one of two machines at BCH (Luminus DRF, Siemens, Camberley UK, or CPI Wolverson Acroma unit, Willenhall UK) adhering to the European guidelines for spine radiographs in children [30]. Depending on patient size, single thoracolumbar or separate thoracic and lumbar exposures were taken. Radiographs were obtained in the lateral decubitus position for patients with suspected reduced BMD and in a standing lateral position for spine clinic patients. Average exposures were 73 kV, 82 kV and 103 kV for thoracic, lumbar and thoracolumbar radiographs respectively. Detector focus distance was 100 cm for decubitus and 210 cm for standing spine radiographs.

iDXA and radiographs for each patient were acquired on the same day.

Blinded to clinical information and corresponding results of the other modality, three consultant paediatric musculoskeletal radiologists (PB, IL, ACO), each with minimum 10 years’ experience, independently scored anonymised images in random order, for (1) presence of fractures and (2) image quality according to modified European criteria [30]. A hundred randomly selected pairs of images were read a second time. A final consensus read of all 250 radiographs acted as reference standard. Quantitative measurements using workstation measurement tools only took place at the reader’s discretion. The vertebrae were graded for fracture from 0 to 4 according to the simplified algorithm-based qualitative score (which is a modification of the Ferrar et al. algorithm-based qualitative vertebral fracture assessment technique [18]):

  0) Normal
  1) Fracture with 24 % or less height loss
  2) Fracture with 25 % or more height loss
  3) Non-osteoporotic deformity
  4) Uncertain or unable to determine due to quality [31].

Because only lateral images were assessed and for consistency of vertebral level assignment between observers, the first vertebral body not associated with ribs was always designated L1 and the lowermost vertebral body associated with ribs was designated T12. If T12 and L1 could not be identified (e.g. excessive coning), all vertebrae were scored unreadable.

A questionnaire (non-validated) was randomly administered to assess patient and carer experience.

Radiation dose was calculated using dose area product (DAP) for radiographs and recorded exposure factors, scan areas and entrance surface dose (ESD) for iDXA. Average DAP was calculated and used to estimate average ED using PCXMC 2.0 software for different age groups to estimate the relative risk of each modality. Average lifetime additional cancer risk was calculated using the Health Protection Agency’s proposed total lifetime cancer risk per unit of ED (percentage per Sievert) as a function of age at exposure and sex.

Statistical analysis was performed using R Software Version 3.0.2 for PC. Using the consensus radiographic read as reference standard, we calculated and compared the prevalence of VF (percentage patients identified with one or more VF and percentage VF from the total of 3250 vertebrae) and iDXA/radiograph sensitivity/specificity. Previously surveyed clinicians initiate treatment once there is vertebral body height loss of 25 % or more plus pain [31]; therefore, patients were classified into two groups: no treatment (no VF or VF with a height loss of less than 25 %, VF0/VF-25) and treatment (one or more VF with a height loss of equal to/more than 25 %, VF+25) groups. Unreadable vertebrae within these groups were included in statistical analyses. Kappa statistics were used to assess inter/intraobserver and intermodality agreement. Fleiss’ kappa was used to assess agreement between all three observers simultaneously. Paired samples Student’s t test was used to compare radiation doses of the two modalities.
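The patient-level grouping can be sketched in code. The example below is a minimal illustration, not the authors' analysis script (the study used R): the names are hypothetical, and two rules are assumptions made here. A patient with unreadable vertebrae and no fracture of 25 % or more height loss is labelled indeterminate, and a non-osteoporotic deformity is not counted as a fracture.

```python
from enum import IntEnum
from typing import Sequence


class Grade(IntEnum):
    """Vertebral grades 0 to 4 as listed in the scoring scheme."""
    NORMAL = 0
    FRACTURE_LT_25 = 1      # fracture, 24 % or less height loss
    FRACTURE_GE_25 = 2      # fracture, 25 % or more height loss
    NON_OSTEOPOROTIC = 3    # deformity, not counted as a fracture (assumption)
    UNREADABLE = 4


def classify_patient(grades: Sequence[int]) -> str:
    """Assign a patient to a group from per-vertebra grades (T4 to L4)."""
    # Any fracture with >= 25 % height loss puts the patient in the treatment
    # group (VF+25), whether or not other vertebrae are unreadable.
    if any(g == Grade.FRACTURE_GE_25 for g in grades):
        return "treatment"
    # Assumption: an unreadable vertebra could hide a significant fracture,
    # so no definitive category can be given.
    if any(g == Grade.UNREADABLE for g in grades):
        return "indeterminate"
    # No VF, or only VF with < 25 % height loss (VF0/VF-25).
    return "no treatment"


print(classify_patient([0] * 12 + [1]))     # no treatment
print(classify_patient([0, 2, 4] + [0] * 10))  # treatment
print(classify_patient([0, 4] + [0] * 11))  # indeterminate
```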

Results

Demographics

Mean patient age was 11.5 years; 104 (42 %) were male; 142 (57 %) self-classified as Caucasian; 109 (44 %) had osteogenesis imperfecta (OI). The other 90 children with suspected reduction in BMD had various diagnoses including inflammatory bowel disease, rheumatological conditions, coeliac disease, cystic fibrosis and unexplained fractures. Thirty-seven (74 %) of the 50 spine clinic patients attended for scoliosis.

Fracture characteristics/image analysis

Vertebral level

Of the 3250 vertebrae assessed, 364 (11 %) were fractured, with T7 being the most frequently fractured level (47/250, 19 %). Table 1 summarises fracture characteristics for the consensus and individual iDXA/radiograph reads.

Table 1 Summary of fracture characteristics for the 250 individual and consensus reads

Figure 2 compares (a) iDXA to (b) radiography in a patient with OI; vertebrae T5 to T11 were independently identified by all observers on both iDXA and radiography as fractures with a height loss equal to or more than 25 %.

Fig. 2 Lateral iDXA (a) and thoracic spine radiograph* (b) of patient 185, an 11-year-old female with osteogenesis imperfecta. Vertebrae T5 to T11 were independently identified by all observers on both iDXA and radiographic images as 2c fractures which translates to a height loss of more than or equal to 25 % (2), affecting both endplates (c). *The lumbar spine was included in the original radiographic examination, but for the illustrative purposes of this article, it has been omitted

Figure 3 compares (a) iDXA to (b, c) radiography in a patient with severe OI.

Fig. 3 Lateral iDXA (a), thoracic spine radiograph (b) and lumbar spine radiograph (c) of patient 131, a 9-year-old female with osteogenesis imperfecta. The patient had severe multilevel fractures secondary to severe disease with resultant kyphoscoliosis degrading image quality on both iDXA and radiographs. On the consensus radiographic read T4 to T10 were graded as unreadable because of poor image quality

Image quality

A total of 460 (14 %) vertebrae were unreadable. Reasons included excessive coning either obscuring T12/L1 so that reliable vertebral levels could not be assigned or obscuring other vertebrae, poor image quality and patient positioning.

Of the 3250 vertebrae, the number unreadable on iDXA was 262 (8 %), 337 (10 %) and 232 (7 %) for radiologists 1, 2 and 3 respectively. The number for radiographs was 300 (9 %), 411 (13 %) and 504 (16 %). The percentage of unreadable images varied by vertebral level, image modality and observer. Overall, the level with the highest number of unreadable vertebrae was T4 (27.6 %); this was true for all three observers and both modalities. Similarly, overall, the levels with the lowest number of unreadable vertebrae were L1 to L3 (4.8 %) and this was generally true for all three observers and both modalities. Results for each level, observer and modality are summarised in Table 2.

Table 2 Percentage of unreadable vertebral bodies for each vertebral level, image modality and observer

Twenty-four patients had spinal rods in situ for scoliosis correction. In these patients there were on average fewer unreadable vertebrae on iDXA (4, 7 and 4 for radiologists 1, 2 and 3 respectively) than on radiographs (6, 8 and 7). The difference was statistically significant for radiologists 1 and 3 (p values 0.041 and 0.005 respectively).

Figure 4 compares (a) iDXA to (b) radiography in a postoperative scoliosis patient with spinal fixation; image quality with spinal rods in situ was degraded on radiographs from T4 to T6 but maintained on iDXA.

Fig. 4 Lateral iDXA (a) and thoracic spine radiograph (b) of patient 80, a 14-year-old female with adolescent idiopathic scoliosis and previous spinal fixation. All observers independently scored vertebrae T4 to T6 as not fractured on iDXA. All observers were independently unable to score T4 to T6 on radiography because of poor image quality

Patient level

Overall, 90 (36 %) patients had one or more VF (vertebral height loss 10 % or more). A total of 181 (72 %) patients had valid consensus radiograph data allowing definitive categorisation into no treatment (VF0/VF-25) or treatment (VF+25) groups. The remaining 69 (28 %) had a combination of unreadable vertebrae and VF0 and were excluded from diagnostic accuracy calculations as a result of the inability to give a definitive diagnosis (some or all of the unreadable vertebrae may have had significant loss of height).

Table 3 summarises diagnostic accuracy. On a patient level, for the diagnosis of any grade VF, iDXA had average sensitivity and specificity across the three radiologists of 78 % (95 % confidence interval (CI) 57–99 %) and 72 % (95 % CI 46–99 %) respectively and radiographs 84 % (95 % CI 70–99 %) and 72 % (95 % CI 47–97 %). For the diagnosis of VF+25, iDXA had average sensitivity and specificity across the three radiologists of 70 % (95 % CI 58–82 %) and 97 % (95 % CI 94–100 %) respectively and radiographs 74 % (95 % CI 55–93 %) and 96 % (95 % CI 95–98 %).

Table 3 Contingency table showing diagnostic accuracy of iDXA compared to reference standard
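For readers less familiar with the measures reported above, the sketch below shows how sensitivity and specificity follow from a 2×2 contingency table against the reference standard. The counts are made up purely for illustration and are not the study's data; the function name is also illustrative.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Sensitivity and specificity from 2x2 counts against a reference standard."""
    sensitivity = tp / (tp + fn)  # proportion of reference-positive patients detected
    specificity = tn / (tn + fp)  # proportion of reference-negative patients cleared
    return sensitivity, specificity


# Hypothetical counts, NOT taken from the study
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=90, fp=10)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}")
```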

Table 4 summarises the inter- and intraobserver agreement for the three observers for DXA and radiographs.

Table 4 Summary of observer agreements
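As a reminder of what the kappa statistic measures, the following is a minimal sketch of Cohen's kappa for two readers: observed agreement corrected for the agreement expected by chance. It is an illustration only. The study's analysis was run in R, the grades below are invented, and Fleiss' kappa for three readers is not shown.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters scoring the same items."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(a)
    # Proportion of items on which the two raters agree
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Invented grades for four vertebrae, for illustration only
reader_1 = [0, 0, 1, 1]
reader_2 = [0, 0, 1, 0]
kappa = cohens_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

Here the readers agree on three of four vertebrae (75 %), chance agreement is 50 %, and kappa is therefore 0.5.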

Table 5 summarises intermodality agreement between the three observers for iDXA versus radiographs, iDXA versus consensus/reference standard radiograph and radiograph versus consensus/reference standard radiograph.

Table 5 Summary of intermodality agreements

Radiation dose

A total of 144 patients had valid radiation dose data; 95 (66 %) were male, mean age was 11.8 years (5–15 years) and mean weight was 41.1 kg (14.3–87.5 kg). The mean DAP for iDXA was 18.0 μGy·m2 (SD 3.4) compared to 64.4 μGy·m2 (SD 76.7) for radiographs, a difference of 46.4 μGy·m2 (95 % CI 33.7–59.1), p < 0.001. Average age-adjusted ED for iDXA was 41.9 μSv compared to 232.7 μSv for radiographs.

The average lifetime additional cancer risk per lateral iDXA was calculated to be 0.001 % and 0.000 % for patients aged 5–10 and 10–15 years respectively for both sexes. Per lateral spine radiograph the additional lifetime cancer risk was 0.003 % for boys and 0.002 % for girls aged 5–15 years.

Patient experience

Eighty-five sets (85 %) of patient/carer questionnaires were returned. Of these, 77 (91 %) were completed by patient and carer, five (6 %) by the carer only and three (3 %) by the child only. Of the 82 carers that completed a questionnaire, 11 (13 %) thought their child had difficulty staying still whilst the radiographs were obtained compared to 8 (10 %) for iDXA (p = 0.549). Two (2 %) carers thought their children (aged 10.3 and 15.8 years) found the noise of the iDXA upsetting or frightening.

Eighty children (32 aged 5–11 years and 48 aged 12–15 years) completed questionnaires. Thirty-nine (49 %) preferred iDXA while 27 (34 %) had no preference. Sixty-nine (86 %) did not find moving about the hospital for the different tests unacceptable.

There were no adverse effects of either iDXA or radiographs.

Discussion

This is the largest study to date assessing whether VFA can replace spine radiographs in children. Overall we found iDXA had similar sensitivity and specificity to radiography and good intraobserver agreement, on average higher than the intraobserver agreement of radiography. A similar study of VFA in children concluded that its utility was limited by compromised visibility and poor diagnostic accuracy [27]. However, those results were based on older DXA technology (Hologic Densitometer), a relatively small sample size (n = 65) and acquisition of DXA and radiographic images not on the same day but within 6 months of each other. A more recent comparative study using newer DXA technology (Hologic Discovery A Densitometer) reported sensitivity (96 %) and specificity (100 %) on a patient level (some vertebrae were excluded from analysis because of poor visibility) [32]. Another recent study of VFA in 165 children and adolescents compared 20 of the subjects’ VFA with lateral spine radiographs (obtained within 2 months of each other), reporting sensitivity of 83 % and specificity of 100 % for VFA [33]. This study did not assess T4 or T5 and again excluded unreadable vertebrae from statistical analyses [33]. Diagnostic accuracy of both studies [32, 33] was higher than ours for both DXA and radiographs; inclusion of poorly visualised vertebrae in our statistical analyses may be seen either as a weakness or as a strength. Whilst diagnostic accuracy would have been improved had we excluded all poor quality images, the data as presented demonstrate the worst-case scenario.

Our results indicate that iDXA had a lower unreadable rate than radiographs (up to 16 % for both), although the difference was not statistically significant. These rates are similar to previous studies performed on adult (DXA and radiographs) [13, 14] and paediatric (radiographs) [34] populations. However, iDXA had statistically significantly better image quality than radiographs when spinal rods were in situ.

DAP was chosen to estimate radiation dose because accurate ESD measurements using thermoluminescent dosimeters are challenging at low doses and more labour intensive. The radiographic systems had DAP meters installed and the iDXA system recorded scan area, offering simple methods for estimating doses in a large number of patients by only requiring the periodic measurement of ESD to ensure stability. Commonly published DXA doses relate to post-menopausal women over the age of 60 and reference dose data from 2006 [2]; the lifetime risk of fatal cancer in children is approximately four to five times [5] higher than this adult group. Published differences in radiation dose for radiographs and VFA (200:1) are higher than the differences shown by our study (5.5:1) [23, 27, 28]; however, published data commonly relates to standard DXA spine scans (ca. 10 cm × 20 cm) with a scan area of ca. 200 cm2, whereas the scans performed in this study had an average area of ca. 700 cm2, replicating conventional film coverage. This accounts for an estimated 3.5-fold increase in estimated ED. Our average ESD measurement of 235 μGy2 is similar to the published values of up to 352 μGy2 for a different manufacturer’s scanner (Hologic QDR 4500-A) [2]. The remainder of the difference is likely due to newer digital radiographic technology with significantly lower doses compared to previous non-digital technologies. Even though dose reduction was lower than expected (demonstrating the benefit of optimised exposures delivered by dedicated paediatric radiology departments), an average annual ED reduction of 232.7 μSv per patient amounts to a considerable childhood/lifetime cumulative dose reduction, particularly given the comparable diagnostic accuracy and patient/carer acceptability of VFA. 
Based on average dose calculations from our cohort of patients, for a female, estimated cumulative ED of at least 2097 μSv from an annual spine radiograph between the ages of 5 and 15 years would give an additional lifetime cancer risk of 0.022 % (1 in 4545). For a male, estimated cumulative ED would be 2930 μSv with an additional lifetime cancer risk of at least 0.033 % (1 in 3030). Although the overall risk per patient is low, total numbers of patients are relatively high and it is an avoidable risk without compromising diagnostic information.
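The ratios and "1 in N" figures in this discussion can be reproduced from the numbers quoted in the text. The sketch below is a plain arithmetic check; the variable and function names are illustrative only.

```python
# Average age-adjusted effective doses quoted in the Results (in microsieverts)
ED_IDXA_USV = 41.9
ED_RADIOGRAPH_USV = 232.7

# Ratio of radiograph dose to iDXA dose
dose_ratio = ED_RADIOGRAPH_USV / ED_IDXA_USV

# Scan area in this study (ca. 700 cm2) relative to a standard DXA spine scan (ca. 200 cm2)
area_factor = 700 / 200


def one_in_n(risk_percent: float) -> float:
    """Convert a risk given in percent into the N of a '1 in N' statement."""
    return 100.0 / risk_percent


female_one_in = one_in_n(0.022)  # quoted additional lifetime risk, female
male_one_in = one_in_n(0.033)    # quoted additional lifetime risk, male

print(f"dose ratio {dose_ratio:.2f}:1, area factor {area_factor:.1f}")
print(f"female 1 in {int(female_one_in)}, male 1 in {int(male_one_in)}")
```

The dose ratio comes out at about 5.5:1 and the area factor at 3.5, matching the values in the text, and the quoted risks correspond to about 1 in 4545 and 1 in 3030.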

If conventional radiography is required as a baseline to assess spinal deformity, such as scoliosis or kyphosis in this select group of patients with suspected reduction in BMD, then the use of EOS® for full standing radiographs of the spine is an alternative method of reducing cumulative radiation dose [35]. The limiting factor for the use of this alternative low dose technique is its availability. EOS systems are more expensive than conventional radiographic equipment and estimates of patient throughput at national level suggest that EOS is not cost-effective [36]. Therefore, the National Institute for Health and Care Excellence (NICE) does not currently recommend the routine use of EOS in the National Health Service (NHS) [37]. Although EOS produces images of equal or better quality than radiographs at doses comparable to DXA, it does not mitigate the need for BMD assessment and therefore a test that can simultaneously assess both in those children who do not have scoliosis/kyphosis is preferable.

The major limitations of this study (and others of diagnostic accuracy) relate to the lack of an objective gold standard. Firstly, because there is no agreed standardised objective method for the diagnosis of VF, we cannot be certain which prevalent fractures were truly fractures. We used the consensus radiographic read of three experienced observers as reference standard. Radiographic cone beam technology has the disadvantage of producing divergent x-ray beams causing parallax and distorting the shape of the vertebrae at the extremities of the radiograph. Conversely, the fan beam technology in DXA is perpendicular to each vertebral body as the source travels down the spinal column [27]. The parallax effect seen in radiographs may affect diagnostic accuracy, particularly for subtle fractures or normal physiological change in vertebral body shape and height. It is possible that mild fractures were over-called on radiographs rather than missed on iDXA. We accept that our selected reference standard may be imperfect, but it is at least as reliable as standards used in daily practice and is expected to be reliable for those vertebral fractures that would merit treatment (height loss greater than 25 %).

Secondly, the higher intermodality agreement of individual radiograph compared to individual iDXA reads is in part to be expected, because for individual and consensus radiographs we were scoring not only the same modality but also the same images. Despite this advantage, radiographs did not significantly outperform iDXA.

Thirdly, disadvantages of consensus scoring in general are well documented [38] and applicable to this study; however, inter- and intraobserver agreement for individual reads was similar for both iDXA and radiographs. Therefore, for any individual radiologist, clinical opinion and hence patient management would be the same irrespective of whether diagnosis of VF was made from DXA or from radiographs.

Finally, the use of conventional statistical methods for studies of diagnostic accuracy for which there is no gold standard has been questioned and more appropriate methodology suggested [39]. An interesting future study would be to apply some of these methodologies (e.g. latent class analysis) to our raw data.

In conclusion, the diagnostic accuracy of iDXA and radiographs for the detection of VF in children is comparable; parents had no strong preference for either modality, whilst the majority of children either preferred iDXA or had no preference. Incidentally, we demonstrated improved image quality of iDXA for scoliosis patients with in situ spinal rods. A single iDXA scan provides an average annual effective dose reduction of at least 232.7 μSv per patient. Given the large numbers of children at risk of VF (skeletal dysplasias, steroid therapy, anticancer treatment etc.), this amounts to considerable childhood and population lifetime cumulative dose reductions. In accordance with the principles of “as low as reasonably achievable” [40] and “image gently” [41], we believe that in children with suspected reduced BMD, either with primary osteoporosis such as osteogenesis imperfecta or with secondary osteoporosis such as those treated with steroids or who have leukaemia, DXA (using modern scanners) should replace conventional radiographs for the diagnosis of VF.