Applicability of two commonly used bone age assessment methods to twenty-first century UK children
To assess the effect of secular change on skeletal maturation and thus on the applicability of the Greulich and Pyle (G&P) and Tanner and Whitehouse (TW3) methods.
BoneXpert was used to assess bone age from 392 hand trauma radiographs (206 males, 257 left). The paired sample t test was performed to assess the difference between mean bone age (BA) and mean chronological age (CA). ANOVA was used to assess the differences between groups based on socioeconomic status (taken from the Index of Multiple Deprivation).
CA ranged from 2 to 15 years for females and 2.5 to 15 years for males. Numbers of children living in low, average and high socioeconomic areas were 216 (55%), 74 (19%) and 102 (26%) respectively. We found no statistically significant difference between BA and CA when using G&P. However, using TW3, CA was underestimated in females beyond the age of 3 years, with significant differences between BA and CA (−0.43 years, SD 1.05, p < 0.001) but not in males (0.01 years, SD 0.97, p = 0.76). Of the difference in females, 17.8% was accounted for by socioeconomic status.
No significant difference exists between BoneXpert-derived BA and CA when using the G&P atlas in our study population. There was a statistically significant underestimation of BoneXpert-derived BA compared with CA in females when using TW3, particularly in those from low and average socioeconomic backgrounds. Secular change has not led to significant advancement in skeletal maturation within our study population.
• The Greulich and Pyle method can be applied to the present-day United Kingdom (UK) population.
• The Tanner and Whitehouse (TW3) method consistently underestimates the age of twenty-first century UK females by an average of 5 months.
• Secular change has not advanced skeletal maturity of present-day UK children compared with those of the mid-twentieth century.
Keywords: Age determination by skeleton; Forensic medicine; X-rays; Hand; Wrist
Abbreviations: G&P, Greulich and Pyle; IMD, Index of Multiple Deprivation; TW, Tanner and Whitehouse
Bone age assessment plays an important role in clinical practice, permitting investigation of whether bone maturity is occurring at a rate consistent with chronological age (CA). In this context, bone age (BA) assessment is useful for managing children with skeletal dysplasias and endocrine disorders, as well as planning for orthopaedic procedures. Approximately 160,000 unaccompanied children entered European countries during 2015 and 2016. Although there is no precise figure, numbers are significant and authorities have faced challenges in estimating some of their ages. In these situations, CA has occasionally been deduced by comparing BA of the individual in question with the existing BA standards. This practice is particularly common at geographical borders where conflicts or crises are occurring. Whether to aid clinical management of paediatric patients or to determine chronological age when this is unknown, it is crucial to have a reliable and appropriate method of determining bone age. However, the European Society of Paediatric Radiology musculoskeletal task force has recently advised against the practice of estimating chronological age based on an assessment of bone age.
Numerous approaches have been developed to determine BA. Among these, two methods are widely utilised based on left hand and wrist radiographs, namely the Greulich and Pyle (G&P) and Tanner and Whitehouse (TW) methods [7, 8]. The G&P method is based on matching the child’s hand radiograph to standard plates provided by the G&P atlas; thus, this method compares the hand’s general maturational status. The population providing the G&P standard atlas were originally North American Caucasians of “good” socioeconomic status in 1938. The “good” socioeconomic status was designated because recruited children were above average both economically and educationally (they were also free of physical, mental, nutritional and environmental factors detrimental to growth). In contrast to the G&P atlas, the TW method undertakes an assessment and scoring of skeletal maturity for each individual hand and wrist bone. Data provided by the Harpenden Longitudinal Growth Study enabled the TW method’s development. In 2001, the TW3 method replaced the TW1 and TW2 methods as a result of documented secular change (as stated by the authors). The data that formed the TW3 method was collected from European and American Caucasian children of average socioeconomic status during the 1980s and 1990s. Following the introduction of G&P and TW3 standards, numerous investigations have been undertaken internationally, in order to identify the extent to which these standards are relevant to various populations. This issue is significant, especially in light of the growing volume of studies concluding that certain methods are inappropriate for particular ethnic groups and as a result of improvements in socioeconomic status [11, 12, 13, 14].
BoneXpert software was developed in 2009, enabling automatic calculation of bone age, according to the G&P and TW3 standards. The software provides standard deviation scores for each hand radiograph, thus assisting the comparison of a child’s bone age with healthy children of the same sex and age. There are several advantages in utilising this software tool, including eliminating observer variability and saving rating times.
This study aims to use BoneXpert to test the applicability of the G&P and TW3 methods to United Kingdom (UK) children born in the twenty-first century, whose standard of living (across all socioeconomic categories) is likely to be higher than that of the children used to develop the G&P and TW3 methods. Our hypothesis was that improved living standards and therefore improved nutrition would render their bone age advanced when compared with their chronological age.
Hand radiographs performed between 2010 and 2016 on children aged between 2 and 15 years presenting to the Emergency Department of Sheffield Children’s Hospital, United Kingdom, following upper limb trauma, were retrospectively identified from the Picture Archiving and Communication System.
Radiographs that contained recent untreated fractures were used. However, radiographs in children with a history of previous fracture were excluded, as were those with a specific request for BA estimation. When both the left and right hands were imaged in the same child, only the left hand radiograph was included in the analysis. Demographic data including sex, ethnicity (self-reported) and CA at the time of the radiograph were recorded.
Socioeconomic status of recruited children was documented using the Index of Multiple Deprivation (IMD). The postcode of each child was retrieved from the patient address data and then the corresponding values provided by the IMD for each postcode were recorded. The IMD measures deprivation based on income, employment, education, health and disability, crime, barriers to housing and services and living environment for each small area. These small areas consist on average of 650 households and approximately 1500 residents. The English IMD 2015 data are ranked for each small area within England from 1 to 32,844. Areas with IMD ranks below 10,894 are deemed to be of low socioeconomic status, those ranked between 10,895 and 21,788 average, and those ranked above 21,789 high. BoneXpert software (Visiana) was utilised to analyse the hand radiographs. All radiographs were acquired via a computed radiography system and were in DICOM format. The default ethnicity for analysing the radiographs was Caucasian, because the software does not include ethnicity-specific standard deviation scores (SDS).
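The banding of IMD ranks described above can be sketched in Python as follows. This is a minimal illustration rather than the procedure used in the study: the function name `imd_band` is hypothetical, and because the text leaves the exact boundary ranks (10,894 and 21,789) unassigned, the sketch assumes that 10,894 falls in the low band and 21,789 in the high band.

```python
TOTAL_AREAS = 32844  # number of ranked small areas in the English IMD 2015


def imd_band(rank: int) -> str:
    """Map an IMD 2015 rank (1 = most deprived area) to a socioeconomic band.

    Assumption: the boundary ranks 10,894 and 21,789, which the text does not
    assign explicitly, are placed in the low and high bands respectively.
    """
    if not 1 <= rank <= TOTAL_AREAS:
        raise ValueError(f"IMD rank must lie between 1 and {TOTAL_AREAS}")
    if rank <= 10894:
        return "low"
    if rank <= 21788:
        return "average"
    return "high"


if __name__ == "__main__":
    # Hypothetical ranks, one from each band
    for example_rank in (500, 15000, 30000):
        print(example_rank, imd_band(example_rank))
```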
Statistical analysis was undertaken via SPSS version 24 for PC (IBM). The difference between BA and CA was determined for each child by subtracting CA from BA (BA − CA). Therefore, a positive value indicates advanced BA, whereas a negative value indicates delayed BA, compared with CA. The significance of the differences was calculated using a paired sample t test.
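The sign convention and paired comparison described above can be sketched in Python using only the standard library. The study itself used SPSS; the function name and the sample ages below are hypothetical and serve only to illustrate the calculation. The sketch returns the t statistic and degrees of freedom and leaves the p value to a statistics package.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_difference_summary(
    bone_ages: Sequence[float], chronological_ages: Sequence[float]
) -> Tuple[float, float, float, int]:
    """Return (mean of BA - CA, SD of differences, paired t statistic, df)."""
    if len(bone_ages) != len(chronological_ages):
        raise ValueError("each child needs both a BA and a CA value")
    # Negative difference = delayed bone age; positive = advanced bone age
    diffs = [ba - ca for ba, ca in zip(bone_ages, chronological_ages)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two children are required")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, sd_diff, t_stat, n - 1


if __name__ == "__main__":
    # Hypothetical ages in years, not data from the study
    ba = [5.1, 7.8, 10.2, 12.0]
    ca = [5.5, 8.0, 10.9, 12.1]
    mean_diff, sd_diff, t_stat, df = paired_difference_summary(ba, ca)
    print(f"mean BA - CA = {mean_diff:.2f} years, SD = {sd_diff:.2f}, "
          f"t = {t_stat:.2f}, df = {df}")
```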
Statistical analysis was undertaken separately for both sexes, in relation to each method (G&P and TW3) and the standard error of the estimate (SEE) was calculated for each sex and method (all ethnicities). Analysis was repeated for both sexes for Caucasians only, to investigate the effect of ethnicity on the results. Analysis was also performed to determine the effect of readings from left and right hands. The effect of socioeconomic status was evaluated using the one-way ANOVA test. Results were considered statistically significant when the p value was < 0.05 (two-sided).
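The one-way ANOVA comparison across socioeconomic groups can be sketched as below. This is an illustrative standard-library implementation, not the SPSS procedure used in the study. It also reports eta squared (between-group sum of squares divided by total sum of squares) as one common way of expressing the proportion of variation explained by group membership; whether the study's reported percentage was derived this way is an assumption. The function name and the sample values are hypothetical.

```python
import statistics
from typing import Dict, Sequence, Tuple


def one_way_anova(groups: Dict[str, Sequence[float]]) -> Tuple[float, float]:
    """Return (F statistic, eta squared) for BA - CA differences by group."""
    if len(groups) < 2:
        raise ValueError("at least two groups are required")
    all_values = [v for values in groups.values() for v in values]
    grand_mean = statistics.mean(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(
        len(values) * (statistics.mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of individual values around their own group mean
    ss_within = sum(
        (v - statistics.mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    if df_within <= 0 or ss_within == 0:
        raise ValueError("not enough within-group variation to compute F")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


if __name__ == "__main__":
    # Hypothetical BA - CA differences in years, not data from the study
    example = {
        "low": [-0.6, -0.4, -0.5],
        "average": [-0.7, -0.5, -0.6],
        "high": [-0.3, -0.1, -0.2],
    }
    f_stat, eta_squared = one_way_anova(example)
    print(f"F = {f_stat:.2f}, eta squared = {eta_squared:.3f}")
```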
Approval was obtained from the Health Research Authority at Yorkshire and Humber. The need for full Research Ethics Committee approval was waived for this retrospective study of hand radiographs.
[Table: Mean difference (SD) in years, between BA and CA in females and males. Columns: mean CA (SD), mean BA (SD), mean difference BA − CA. Rows: G&P BA vs CA, TW3 BA vs CA.]
[Mean difference (SD) in years, between G&P BA and CA (all ethnicities)]
[Mean difference (SD) in years, between TW3 BA and CA (all ethnicities)]
[Table: Mean difference (SD) in years, between G&P, TW3 and CA in three socioeconomic groups]
[Table: Mean difference between BA and CA in years, according to body side (all ethnicities). Group sizes: females, left n = 118 and right n = 68; males, left n = 139 and right n = 67.]
The G&P and TW3 methods showed comparable accuracy in females with the standard error of the estimate (SEE) of ± 1.05 and ± 1.06 years, respectively. Similar accuracy for the two methods was also observed in males with SEE of ± 1.10 and ± 1.00 years for G&P and TW3 respectively.
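For readers unfamiliar with the SEE, one common definition is the root of the residual sum of squares from a simple linear regression divided by n − 2. The sketch below implements that definition with the standard library. It assumes BA is regressed on CA, which the text does not state explicitly, and the function name and sample values are hypothetical.

```python
import math
import statistics
from typing import Sequence


def standard_error_of_estimate(x: Sequence[float], y: Sequence[float]) -> float:
    """SEE of the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))


if __name__ == "__main__":
    # Hypothetical ages in years: CA as predictor, BA as outcome
    ca = [4.0, 6.5, 9.0, 11.5, 14.0]
    ba = [3.6, 6.8, 8.4, 11.9, 13.5]
    print(f"SEE = {standard_error_of_estimate(ca, ba):.2f} years")
```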
Several variables may affect the applicability of BA methods. One is socioeconomic status, which refers to a combination of environmental factors such as nutritional status, state of health and economic and social class of an individual. Being of “high” socioeconomic status implies improved access to healthcare, sufficient food, exercise and housing, allowing full growth potential to be achieved. Studies have shown that high socioeconomic status is more likely to accelerate skeletal maturation rate. This might be related to nutritional factors with over-nutrition leading to overweight/obesity, which in children has been linked to BA advancement [21, 22]. In contrast, individuals from low socioeconomic groups are more likely to have poor diets and lower weight and are more likely to experience growth retardation. Bearing in mind that the TW2 method was updated because of perceived effects of secular change, whereas G&P has never been updated, we questioned the reliability of bone age assessment methods. We sought to analyse the reliability of the G&P and TW3 methods within the modern-day UK context.
Breaking the cohort into yearly intervals showed statistical significance for varying age groups in females and males, when using the G&P atlas. These differences (overestimation at age of 6 and underestimation at age of 12, in females) were still significant when only data from Caucasian children were analysed. In spite of these sub-group differences, there was no statistical difference between overall mean BA and overall mean CA in either males or females. To convey a comprehensive picture, we contrasted our findings, especially the mean difference between BA and CA, with previous studies that focused on the Caucasian population (Supplementary Table 1). Some of these studies have concluded that Caucasian children mature skeletally at approximately the same rate as the G&P standard in males across all age groups [14, 24, 25, 26, 27, 28]. However, other authors recommend that the G&P atlas be used with reservation due to mean BA being retarded in some age groups compared to the reference population [29, 30, 31, 32]. Common findings among these studies of the G&P atlas include underestimation of males aged below 13 years and overestimation during adolescence [30, 31, 32, 33, 34, 35, 36]. G&P was applicable to females during adolescence while overestimation was reported before the age of 12 years [31, 32]. Others have recommended that a new standard altogether is required for precise bone age assessment, given the significant advancement of BA due to secular changes in skeletal maturation, which is thought to be due to improved standard of living [28, 30, 35, 36]. For example, Calfee et al reported that G&P overestimated males and females between 12 and 15 years old, for whom BA exceeded CA by at least 2 years. All of these studies used the subjective assessment of experienced raters; our results using an objective software program indicate that overall, G&P currently remains applicable.
A large number of children included in this study (55%) were of low socioeconomic status according to IMD and socioeconomic status explained 17.8% of the difference between bone age (TW3 method) and chronological age. Although there have been improvements in standard of living over the past decade (expected to advance bone age), our results show delayed BA in girls when using the TW3 method. In line with our results, other studies have shown delayed BA compared with CA in females after the age of 10 years [14, 29, 37]. These results potentially support recent views of some researchers, who argue that the improved secular trend has eased or stopped [38, 39]. As a result of an improving secular trend in standard of living, the TW3 method was established in 2001 such that the TW3 BA is about a year ahead of the previous (TW2) method, especially after the age of 10 or 11 years. Our results suggest that a return to TW2 may be necessary.
Several authors argue that socioeconomic status is the predominant reason behind the difference in skeletal maturational rates among populations [12, 14, 31]. Schmeling et al found that bone age was retarded among 27 studies that reported the socioeconomic status of their participants. This retardation was due to the high socioeconomic status of the children recruited to develop the G&P atlas compared with the children within these studies, such that even the secular trend of increasing standard of living was not sufficient to eliminate any differences in socioeconomic status of the various cohorts.
In spite of the likely effects of socioeconomic status, the impact of ethnicity cannot be neglected. Studies on two different ethnic groups residing in the same region have shown that bone age assessment methods may reveal different results [24, 34]. Ontell et al showed that the G&P atlas is applicable to Caucasian girls at all ages but not to boys before the age of 13, while in Asians in the same region, the G&P atlas is applicable to girls at all ages but only to boys between 7 and 13.3 years. Zhang et al concluded that Asian children mature sooner than do Caucasian children, especially between 10 and 13 years of age in girls and between 11 and 15 years of age in boys. In a recent meta-analysis, bone age was significantly delayed in African females, while advanced in Asian males when compared with the G&P standard. Furthermore, it has been shown that young Asian adults reach the end of maturity prior to the age observed through the TW3 method (25–27). Research focusing on South African individuals found that TW3 underestimated CA for boys but not for girls. These variations within populations must be considered when assessing bone age. In this current study, we demonstrated no significant difference between all ethnic groups compared with Caucasians alone; it should be noted that Asians and Africans made up only 20% and 5% of the study population respectively.
Measuring BA according to a subjective technique has a greater likelihood of introducing rating variations across analysts, due to varying degrees of expertise. However, this disadvantage was overcome in the current study through the use of BoneXpert, which is an automated bone age analysis software tool that eliminates observer variability and has the advantage of saving significant time. Our observed 5-month persistent discrepancy between chronological age and TW3 bone age as determined by BoneXpert in females appears to be a disadvantage not of the software per se, but of the reference standard (TW3) on which the software depends. Despite this, the software showed acceptable accuracy when using the G&P and TW3 methods for both sexes with the SEE being approximately ± 1 year.
The limitations of this study include the following:
The fact that we did not review hospital notes to ascertain full health in the children (although radiology and ED notes were scrutinised);
The exclusion of certain age groups, namely those under 2 years old in females, those under 2.5 years in males and individuals of both sexes aged 15 years or older. In order to save time and eliminate subjectivity, this pragmatic study was performed using BoneXpert; however, this software tool is unable to read images from younger age groups due to limited ossification or non-ossification of epiphyses, while its dependability is questionable when used on older age groups;
Height, weight and pubertal stage of recruited children were not recorded; it is said that body mass index affects the rate of skeletal maturation [19, 20]; the prevalence of overweight and obese children is well documented to be rising and should be considered in prospective studies of bone age assessment;
We do not know the precise socioeconomic status of the reference children, although those recruited for G&P were said to have “good” socioeconomic status;
We used self-reported ethnicity; non-Caucasians were a minority in the current study, yet some researchers have shown that ethnicity is more accurately self-reported in groups other than Caucasian [45, 46, 47]; and finally,
This study did not set out to be and should not be regarded as a validation study of BoneXpert, since the mean absolute and root mean squared errors were not calculated. Rather, we aimed to correlate G&P and TW3 against known CA of a healthy modern population and found that G&P remains reliable (consistent with the results of a recent systematic review). The question of accuracy of BoneXpert has already been answered in primary research studies [49, 50, 51], whereas as far as we are aware, the assessment of the applicability of the standards themselves has not been previously performed using objective software and only a few studies have considered socioeconomic status [12, 14, 52, 53, 54]. Contrary to our results, these studies have shown delayed bone age in children of low socioeconomic status; it is possible that the degree of deprivation in the children from these studies was greater than in ours.
Progress in medicine, education, industry and economic growth have all contributed to higher socioeconomic status, which in turn is expected to have had a positive impact on children’s skeletal maturation [8, 24]. The retardation of BA shown by our results appears counterintuitive, but may not be if the socioeconomic status of the TW3 reference children was on average higher than that of the children we recruited. Our results also suggest that perhaps we should revert to the TW2 method.
Our results indicate that (1) secular change does not appear to have advanced skeletal maturity of UK children; (2) no significant difference exists between BoneXpert-derived BA and CA when using the G&P atlas; therefore, this method can be utilised for the modern UK population; and (3) BoneXpert-derived TW3 BA in current UK children is consistently below the CA of females by an average of 5 months; the clinical significance of this will have to be determined by the requesting clinician and will be greater in younger children who have a lower standard deviation. Developers of BoneXpert may wish to consider this in future upgrades of the software.
This study has received funding from Najran University.
Compliance with ethical standards
The scientific guarantor of this publication is Dr. Amaka C. Offiah.
Conflict of interest
The authors of this manuscript declare no relationships with any companies, whose products or services may be related to the subject matter of the article.
Statistics and biometry
No complex statistical methods were necessary for this paper.
Written informed consent was not required for this study because the study was conducted retrospectively.
Institutional Review Board approval was obtained.
• cross-sectional study/diagnostic or prognostic study
• performed at one institution.
- 2. Eurostat (2017) Asylum applicants considered to be unaccompanied minors. Luxembourg. Available via https://ec.europa.eu/eurostat/en/web/products-datasets/-/TPS00194Last. Accessed 12 May 2019
- 3. Aynsley-Green A, Cole TJ, Crawley H, Lessof N, Boag LR, Wallace RM (2012) Medical, statistical, ethical and human rights considerations in the assessment of age in children and young people subject to immigration control. Br Med Bull 102:17–42. https://doi.org/10.1093/bmb/lds014
- 8. Tanner J, Healy M, Goldstein H, Cameron N (2001) Assessment of skeletal maturity and prediction of adult height (TW3 method). WB Saunders, London
- 17. Department of Communities and Local Government (2015) English indices of deprivation 2015. 1:1–11
- 18. Department of Communities and Local Government (2015) The English Indices of Deprivation 2015 – Frequently Asked Questions (FAQs). 1–19
- 30. Suri S, Prasad C, Tompson B, Lou W (2013) Longitudinal comparison of skeletal age determined by the Greulich and Pyle method and chronologic age in normally growing children, and clinical interpretations for orthodontics. Am J Orthod Dentofacial Orthop 143:50–60. https://doi.org/10.1016/j.ajodo.2012.08.027
- 33. Loder RT, Estle DT, Morrison K et al (1993) Applicability of the Greulich and Pyle skeletal age standards to black and white children of today. Am J Dis Child 147:1329–1333. https://doi.org/10.1001/archpedi.1993.02160360071022
- 44. Ng M, Fleming T, Robinson M et al (2014) Global, regional, and national prevalence of overweight and obesity in children and adults during 1980-2013: a systematic analysis for the Global Burden of Disease Study 2013. Lancet 384:766–781. https://doi.org/10.1016/S0140-6736(14)60460-8
- 53. Griffith JF, Cheng JCY, Wong E (2007) Are Western skeletal age standards applicable to Hong Kong Chinese? – a comparison of Greulich and Pyle and TW3 methods. Hong Kong Med J 13:S28–S32
- 54. Chiang K, Chou A, Yen P, Ling C (2005) The reliability of using Greulich-Pyle method to determine children’s bone age in Taiwan. Ci Ji Yi Xue Za Zhi 17:417–420+453
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.