Well-founded practice or personal preference: a comparison of established techniques for measuring ulnar variance in healthy children and adolescents

Objectives Ulnar variance is a clinical measure used to determine the relative difference in length between the radius and ulna. We aimed to examine consistency in ulnar variance measurements and normative data in children and adolescents using the perpendicular and the Hafner methods. Methods Two raters measured ulnar variance on hand radiographs of 350 healthy children. Participants’ mean calendar and skeletal ages were 12.3 ± 3.6 and 12.0 ± 3.7 years, 53% were female. Raters used the perpendicular method, an adapted version of the perpendicular method (in which the distal radial articular surface is defined as a sclerotic rim) and the Hafner method, being the distance between the most proximal points of the ulnar and radial metaphyses (PRPR) and the distance between the most distal points of both (DIDI). Intraclass correlation coefficients (ICCs) for intermethod consistency and inter- and intrarater agreement were calculated using a two-way ANOVA model. Variability and limits of agreement were determined using the Bland-Altman method. Results The interrater ICC was 0.75 (95% CI, 0.61–0.84) for the adapted perpendicular method, 0.88 (95% CI, 0.80–0.93) for PRPR, and 0.94 (95% CI, 0.90–0.97) for DIDI. The intermethod consistency ICC was 0.60 (95% CI, 0.48–0.70) for perpendicular versus PRPR and 0.60 (95% CI, 0.49–0.70) for perpendicular versus DIDI. The intrarater ICC was 0.88 (95% CI, 0.70–0.95) for perpendicular, 0.90 (95% CI, 0.83–0.94) for PRPR, and 0.81 (95% CI, 0.69–0.89) for DIDI. The perpendicular method was not useable in 38 cases (skeletal age ≤ 9 years) and the Hafner method in 79 cases (skeletal age ≥ 12 years). Conclusions The perpendicular and Hafner methods show moderate intermethod consistency. The Hafner method is preferred for children with skeletal ages < 14 years, with good to excellent inter- and intrarater agreement. The adapted perpendicular method is recommended for patients with skeletal ages ≥ 14 years.
Key Points
• The perpendicular method for measuring ulnar variance requires extended instructions to ensure good interrater agreement in pediatric and adolescent patients.
• The Hafner method is recommended for ulnar variance measurement in children with unfused growth plates and up to a skeletal age of 13 years, and the perpendicular method is recommended for children with fused growth plates and from skeletal age 14 and older.
• The mean ulnar variance measured in this study for each skeletal age group (range, 5–18 years) is provided, to serve as a reference for future ulnar variance measurements using both methods in clinical practice.
Electronic supplementary material: The online version of this article (10.1007/s00330-019-06354-x) contains supplementary material, which is available to authorized users.


Introduction
Ulnar variance is a clinical measure that can be applied to hand radiographs to determine the relative difference in length between the radius and ulna. When the ulna's relative length differs from that of the radius by less than 1 mm, this is termed neutral ulnar variance or "ulna zero" [1]. A deviation from this neutral position with the ulna exceeding the radius is termed positive ulnar variance, or "ulna plus" [2]. Conversely, a deviation in the opposite direction is termed negative ulnar variance, or "ulna minus." However, exact values of ulnar variance and their interpretation depend highly on the method used to measure the ulnar variance. Population averages of ulnar variance vary around neutral and increase with grip [2-4]. Ulnar variance can be used to determine injury prognosis of distal forearm fractures [5] and in diagnosis of conditions like ulnar impaction syndrome and triangular fibrocartilage complex (TFCC) degeneration [6]. In young gymnasts with possible stress injury of the distal radius, ulnar variance is suggested to be on average more positive [7].
Ulnar variance measurement methods include the "line technique" [8], the "concentric circle technique" [9], and the "method of perpendiculars" [10]. In the line technique, a line is drawn from the ulnar side of the articular surface of the radius to the ulna, and ulnar variance is defined as the distance between this line and the carpal surface of the ulna. The concentric circle technique uses a template of concentric circles placed with the center on the distal sclerotic line of the radius, and ulnar variance is measured by the distance between the line approximating the distal radius and the ulnar cortical rim. The perpendicular method measures the difference between two lines touching the distal ulnar aspect of the radius and the distal cortical rim of the ulna, both drawn perpendicular to the longitudinal axis of the radius. The latter was found to have the highest interrater and intrarater reliability of the three and is most often used in adults (Fig. 1a) [11]. However, in children, this method can be difficult to apply, because the distal radial and ulnar surfaces may not be (clearly) visible when the epiphysis is not fully ossified. To overcome this problem, a measurement method specifically for skeletally immature patients was developed by Hafner et al (Fig. 1b) [12]. This technique and the population data provided by this initial study have since been used in studies reporting ulnar variance in adolescent populations such as young gymnasts [7,13].
The Hafner method has in turn been criticized for being unfamiliar to many clinicians, difficult to apply, and incomparable to values acquired in adult populations using other measurement techniques, while the perpendicular method showed good interrater reliability and was considered easy to apply in adolescents [14]. However, the perpendicular and Hafner methods have not been compared directly in pediatric or adolescent populations, nor have normative data been acquired from larger populations. For clinical use as well as for research on the possible relationship of positive ulnar variance and distal radial physeal stress injury, a reliable measurement method is essential.
As the only reference standard for ulnar variance is in vivo measurement of true ulnar and radial length, which is not favored or even possible in most cases, relative measurement suffices in daily clinical practice [9]. Such a measurement needs to be easily applicable and reliable, and to allow comparison with other populations measured with the same technique. This study aims to determine consistency of the perpendicular method for measuring ulnar variance and the Hafner method in a Western European pediatric and adolescent population and to provide normative population data for the distribution of measurements in children and adolescents for both methods.

Design
This retrospective study included a random sample from a population of healthy children and adolescents of a previous study in which normal values for phalangeal radiographic absorptiometry were determined [15]. This study population consisted of children from the Erasmus Gymnasium in Rotterdam and children of employees (and their relations) at the Erasmus Medical Center Rotterdam. Inclusion criteria were inclusion in the original study population by Van Rijn et al [15], and age 18 years or younger. Exclusion criteria were any disease or use of medication known to affect bone growth and/or metabolism (in accordance with the study by Van Rijn et al [15]), radiographically visible growth deformity of the wrist and/or hand, and radiographically visible upper extremity fracture. Ethical approval was obtained for the initial study and for subsequent use of the data. In keeping with national guidelines on clinical studies in children, informed consent was given by parents or guardians alone for children younger than 12 years, and by parents or guardians as well as the child for children aged 12 years or older. The included sample consisted of 185 girls (53%) and 165 boys (Table 1).
The primary outcome measure was intermethod consistency between the perpendicular and Hafner methods. Secondary outcome measures were interrater and intrarater agreement of both methods, and normative population data for the distribution of ulnar variance values in children and adolescents for both methods.

Ulnar variance measurement
Digitalized posteroanterior radiographs of the left hand were previously obtained in all participants [18] and retrospectively used in this study. Radiographs were generated with the shoulder in 90° abduction, the elbow in 90° flexion, and the forearm in neutral rotation, in accordance with recommendations in the literature [19]. Images were standardized to 300 dpi with 12 bits per pixel to facilitate accurate measurements, using a Vidar Diagnostic Pro Advantage scanner with TWAIN v5.2. Images were blinded and skeletal age was determined using automated software (BoneXpert, Visiana) [18].
A musculoskeletal radiologist with 2 years of experience (rater 2) and a fourth-year radiology resident specializing in musculoskeletal radiology (rater 1) measured ulnar variance using step-by-step instructions for both measurements including example images, based on the methods' descriptions in the literature [11,12]. To ensure raters' familiarity with both methods, both raters practiced the use of the Hafner method on 10 images and the perpendicular method on 10 different images that were excluded from further analysis.

Fig. 1 a The method of perpendiculars [11]. A line is drawn perpendicular to the longitudinal axis of the radius and through the most distal ulnar part of the radius. The position of the adjacent distal cortical rim of the ulna relative to this line is measured as positive, neutral, or negative ulnar variance. b The method of ulnar variance measurement as described by Hafner et al [12]. First, a line is drawn perpendicular to the longitudinal axis of the ulna and touching the most proximal point of the ulnar metaphysis. Similarly, a second line is drawn in the radius, perpendicular to its longitudinal axis and touching the most proximal point of the radial metaphysis. Ulnar variance is then defined as the distance between these two lines, in the literature often referred to as "PRPR" ("PRoximal-PRoximal," distance "A"). Alternatively, the distance between the most distal point of the ulnar metaphysis and the most distal point of the radial metaphysis, often referred to as "DIDI," can be measured in a similar way ("DIstal-DIstal," distance "B").

In the perpendicular method, the distance from the most distal part of the radius to the adjacent distal cortical rim of the ulna represents ulnar variance (Fig. 1a).
The Hafner method [12] consists of two measurements: the distance from the most proximal point of the ulnar metaphysis to the most proximal point of the radial metaphysis ("PRoximal-PRoximal," or "PRPR") and the distance from the most distal point of the ulnar metaphysis to the most distal point of the radial metaphysis ("DIstal-DIstal," or "DIDI") (Fig. 1b).
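The geometry of these measurements can be illustrated with a short sketch. It is not the raters' software: the function name, the landmark coordinates, and the convention that the axis is given from proximal to distal are assumptions made for this example, and only the 300 dpi resolution is taken from the text. The sketch relies on the fact that two lines perpendicular to the same axis are separated by the difference between the projections of their landmarks onto that axis.

```python
import math

MM_PER_PIXEL = 25.4 / 300  # images were standardized to 300 dpi (from the text)


def ulnar_variance_mm(axis_proximal, axis_distal, radius_point, ulna_point):
    """Signed distance in mm between two lines perpendicular to one axis.

    axis_proximal, axis_distal: two pixel points on the longitudinal axis,
    ordered from proximal to distal (hypothetical landmark inputs).
    radius_point, ulna_point: pixel landmarks touched by the two lines.
    A positive result means the ulnar landmark lies more distally.
    """
    dx = axis_distal[0] - axis_proximal[0]
    dy = axis_distal[1] - axis_proximal[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("axis points must differ")
    ux, uy = dx / length, dy / length  # unit vector pointing distally

    def along_axis(point):
        # Scalar projection of the landmark onto the axis direction
        return (point[0] - axis_proximal[0]) * ux + (point[1] - axis_proximal[1]) * uy

    return (along_axis(ulna_point) - along_axis(radius_point)) * MM_PER_PIXEL


# Hypothetical landmarks: vertical axis, distal towards the top of the image
variance = ulnar_variance_mm((100, 900), (100, 100), (140, 300), (200, 330))
print(round(variance, 2))  # ulna lies 30 pixels more proximally, so about -2.54 mm
```

The same projection could be applied to metaphyseal landmarks for PRPR or DIDI, with the caveat that the Hafner method as described draws each line perpendicular to the axis of its own bone, whereas this sketch assumes a single shared axis.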

Inter/intrarater agreement and intermethod consistency
Raters independently measured ulnar variance in 60 participants, first using the Hafner method, and then using the perpendicular method on the same images after a 1-week interval, to determine interrater agreement for both methods. In case of a systematic difference between raters for one or both methods, possible causes for discrepancies were discussed during a consensus meeting and measurement instructions were adapted accordingly. Subsequently, raters used the method in question in 60 other participants, and interrater agreement was again determined. This process was set up to be repeated until good interrater agreement, defined by an intraclass correlation coefficient of at least 0.75, was achieved for both methods. For the Hafner method, a single round of 60 measurements was sufficient to reach this level of agreement, and for the perpendicular method, one consensus meeting and a second round of measurements with adapted instructions were carried out.
To achieve optimal external validity relating to daily clinical practice, intermethod consistency and intrarater agreement were assessed by the more junior expert (rater 1). For intermethod consistency, rater 1 used both methods to measure ulnar variance in 220 participants in two separate sessions. To determine intrarater agreement for both methods, rater 1 remeasured 60 images in random order and in two sessions (Fig. 2).

Reference data
In order to also provide reliable population reference data for ulnar variance per skeletal age group, the sample of participants for reliability analysis was augmented until a total of 350 participants was randomly selected from the study population by Van Rijn et al [15]. Rater 1 used the perpendicular method and rater 2 performed the Hafner measurements on all of these images. Reference values for each skeletal age group were calculated for both methods separately.

Statistical analysis
For assessment of intermethod consistency between the perpendicular and Hafner methods, the intraclass correlation coefficient (ICC) for consistency in rater 1 was calculated using a two-way mixed analysis of variance (ANOVA) model (ICC(3,1)). The average of measurements by the two methods was calculated for each image, as well as the difference in ulnar variance between the two methods. Variability was determined using the method described by Bland and Altman, with the upper and lower limits of agreement calculated as the mean difference between the two methods ± 1.96 standard deviations (SD) of the differences [20].
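The limits of agreement can be sketched in a few lines. This is a minimal illustration of the standard Bland-Altman calculation, not the study's analysis script; the function name and the paired values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (bias, lower, upper) in mm for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)  # mean difference between the two methods
    sd = stdev(diffs)   # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired values in mm, not study data
perpendicular = [-3.0, -1.5, -2.0, 0.5, -1.0]
prpr = [-1.0, -1.5, -2.5, 1.0, -0.5]
bias, lower, upper = limits_of_agreement(perpendicular, prpr)
print(f"bias {bias:.2f} mm, limits of agreement {lower:.2f} to {upper:.2f} mm")
```

A nonzero bias indicates a systematic difference between methods, while the width of the interval reflects how widely individual differences scatter around that bias.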
Interrater agreement was assessed in a similar manner: for each method, the ICC for absolute agreement [16,17] between the raters was calculated using a two-way random ANOVA model (case 2, ICC(2,1)). The means and SD of the measurements were calculated for both raters within each method. The mean difference with its SD between measurements by both raters was calculated, as well as the limits of agreement. From the set of 60 double measurements by rater 1, intrarater agreement for both methods was determined by calculating the ICC for absolute agreement using a two-way random ANOVA model (case 2, ICC(2,2)). The levels of agreement measured by the ICC were defined as ICC < 0.5 = poor, ICC 0.5-0.75 = moderate, ICC 0.75-0.9 = good, and ICC > 0.9 = excellent. A sample size calculation was done based on an ICC ≥ 0.8 and a preferred 95% confidence interval (CI) of 0.75-0.85, leading to a preferred sample size of at least 201 images to be rated by each rater [21,22].
Table 2 shows ulnar variance measurements for the complete cohort and Table 3 for both sexes per skeletal age group.
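For readers who want to see how the single-measurement coefficients differ, the sketch below computes ICC(2,1) (absolute agreement) and ICC(3,1) (consistency) from the mean squares of a two-way ANOVA, using the usual textbook formulas. It is an illustration rather than the software used in this study: the function names and the small ratings table are assumptions, and the handling of the shared boundaries (0.75 counted as good, 0.9 counted as good) is one possible reading of the categories above.

```python
def mean_squares(table):
    """Two-way ANOVA mean squares for an n-subjects by k-raters table."""
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between subjects
    msc = ss_cols / (k - 1)               # between raters
    mse = ss_error / ((n - 1) * (k - 1))  # residual
    return msr, msc, mse


def icc_2_1(table):
    """Absolute agreement, two-way random effects, single measurement."""
    n, k = len(table), len(table[0])
    msr, msc, mse = mean_squares(table)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def icc_3_1(table):
    """Consistency, two-way mixed effects, single measurement."""
    k = len(table[0])
    msr, _, mse = mean_squares(table)
    return (msr - mse) / (msr + (k - 1) * mse)


def agreement_level(icc):
    """Map an ICC to the categories used in the text (boundaries assumed)."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Illustrative ratings (6 subjects, 4 raters), not study data
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(ratings), 2), round(icc_3_1(ratings), 2))
```

In this illustrative table the raters rank subjects similarly but differ in their overall level, so the consistency coefficient is much higher than the absolute agreement coefficient. This is the distinction behind using ICC(3,1) for intermethod consistency and ICC(2,1) for interrater agreement.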
During a consensus meeting, raters concluded that they had interpreted the radial surface differently when using the literature-based instructions [11]. The perpendicular method's instructions were therefore adapted into a more detailed description (Fig. 4, Appendix 2). For the second series of 60 images, measured with the adapted instructions, the mean systematic difference was 0.2 mm (SD, 0.8 mm), and the ICC for absolute agreement of the adapted perpendicular method after one iteration was 0.75 (95% CI, 0.61-0.84), defined as good (Fig. 3b). The complete cohort's mean ulnar variance was −1.4 mm (SD, 1.3 mm; range, −7.0 to 3.5 mm).

Intermethod consistency
The ICC for intermethod consistency was 0.60 (95% CI, 0.48-0.70) for the perpendicular method compared with PRPR, defined as moderate. For the perpendicular method compared with DIDI, the ICC for intermethod consistency was moderate as well, with a value of 0.60 (95% CI, 0.49-0.70). Table 2 shows the ICCs for intermethod consistency per skeletal age group. The mean difference between PRPR and the perpendicular measurement was 0 mm, whereas it was −1 mm between DIDI and the perpendicular measurement (Fig. 5).
In 38 cases (11%; 7 girls, 31 boys), all with skeletal ages of 9 years or younger, the perpendicular method could not be used because of absence of the ulnar epiphysis or of both epiphyses (Fig. 6). The Hafner method could not be used in 79 cases (23%; 59 girls and 20 boys), all with a skeletal age of 12 years or older, because one or both growth plates were not visible (Fig. 6).

Discussion
This study is the first to show moderate intermethod consistency of the perpendicular and Hafner methods for ulnar variance measurement in a population of healthy children and adolescents (ICC 0.60), with reference values for both methods. The interrater agreement was good to excellent for the Hafner method (ICC 0.88-0.94), and good for an adapted version of the perpendicular method with detailed measurement instructions (ICC 0.75) after a consensus meeting.

Intermethod consistency
In line with previous statements [14], the perpendicular method was moderately consistent with the Hafner method, albeit with dispersed absolute differences between measurements. Figure 5 shows that in negative Hafner measurements, the perpendicular measurement is often more negative, whereas in positive Hafner measurements, the perpendicular measurement is often less positive, and that differences with the perpendicular method are scattered within limits of agreement of −3 and +3 mm (PRPR) and −4 and 2 mm (DIDI). This proportional bias of the perpendicular method compared with the Hafner method likely originates from the different anatomical distances used in these two methods. While PRPR was originally labeled the preferred measurement and is therefore more widely used than DIDI [12], our findings suggest that concomitant use of PRPR and DIDI is valuable for reliable intermethod comparison, but that raters need to take into account the systematic difference of −1 mm between the perpendicular method and DIDI.

Reliability
The Hafner method's interrater and intrarater agreement were not reported in the original publication, but one study in young gymnasts illustrated its intrarater reliability with Pearson correlation coefficients of 0.97 to 0.99 [7]. For the perpendicular method, we report an interrater agreement ICC slightly lower than the ICCs of 0.92 (for boys) and 0.89 (for girls) reported earlier [14], which can be (partly) explained by methodological differences. In our study, raters drew all relevant lines while measuring, as opposed to using a template with horizontal lines representing each millimeter of ulnar variance as was done previously [14]. The large discrepancy between inter- and intrarater agreement after the first 60 measurements suggests that even when using the literature-derived instructions, variation between raters can be large. A template might overcome this problem, but may not be available in all PACS systems, warranting clear and unambiguous instructions for those who do not have access to a template. We therefore provide the adapted perpendicular method for ulnar variance measurement in adults and children with (partly) fused physes and have included a standardized instruction sheet (Appendix 2).

Reference data
We report pediatric ulnar variance values comparable to the commonly used reference values of −2.1 to −2.3 mm (PRPR) and −2.3 to −2.8 mm (DIDI) reported by Hafner et al, who found 95% confidence interval widths varying from 4 to 9 mm, increasing with age [12]. For the adapted perpendicular method, our results show a more negative ulnar variance than earlier measurements with a slightly lower ICC [14], warranting cautious interpretation. This difference may in part be caused by the adaptation of measurement instructions.
In healthy pediatric populations, mean ulnar variance is reportedly negative: −2.3 to 0.9 mm (Fig. 7). These studies' sample sizes and heterogeneity likely have contributed to the large reported confidence intervals compared with the clinically relevant difference of only a few millimeters between negative and positive ulnar variance [7,12-14,23-26]. In addition, forearm rotation reportedly affects ulnar variance [19], and although these differences can be small and will therefore not always be clinically relevant [4,27], slight variations in hand positioning on radiographs may have contributed to the heterogeneity of the population data in the literature. Finally, ulnar variance can increase with age [1,12,28], becoming less negative or even positive in adulthood [4]. Our population may have been on average older (chronologically or skeletally) than other pediatric study populations, rendering ulnar variance less negative and closer to adult measurements.

Strengths and weaknesses
We used radiographs with standardized hand positioning from a large population of healthy children and adolescents without wrist pathology to ensure reliable results and to provide reference data. Although more children aged 12 years and older were included, at least 13 cases per skeletal age group over 6 years were available. Two musculoskeletal radiology specialists first measured several practice cases to prevent bias caused by a learning effect. The methodology included one iteration of adaptation and extension of written instructions for the perpendicular method because of large systematic interrater differences. This resulted in the adapted perpendicular method with improved reliability, which can now be further externally validated in other observers such as hand surgeons or orthopedic surgeons. The reference data represent Western European children and adolescents, and population data need to be established for populations with different ethnicities.

Clinical impact
Childhood gymnastics performance and distal radial growth plate stress injury are thought to cause increased incidence of positive ulnar variance and long-term consequences like TFCC injury [26,29]. However, negative, neutral, and positive ulnar variance have all been described in young gymnasts [30], and accurate measurement is therefore essential for future investigations of this relationship. For the diagnosis or therapeutic decision-making process of other conditions related to abnormal ulnar variance, like radial Salter-Harris fractures, Kienböck's disease, and juvenile idiopathic arthritis [1,31,32], reliable measurement of ulnar variance is equally valuable. The results from this study can aid radiologists, hand surgeons, and other clinicians in choosing the appropriate measurement method and in comparing measurements with reference data, provided from healthy children from Hafner's age group and from healthy adolescents older than 15 years. For children with skeletal ages of 8 years or younger, the PRPR and DIDI are recommended, and for 14 years or older, the adapted perpendicular method is the measurement of choice. For children with skeletal ages of 9 to 13 years, both methods can be used and measurements can be compared while keeping in mind the −1 mm systematic difference between the perpendicular and DIDI methods and the higher interrater reliability of the Hafner method. The reference data are organized by skeletal age determined on the same hand radiograph, facilitating maturity-related comparisons.
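The age-based recommendation can be written as a small lookup. This is only a sketch of the rule stated above: the function name is hypothetical, and treating skeletal age as a whole-year age group (rounding down) is an assumption, since the text gives the thresholds in whole years.

```python
import math


def recommended_methods(skeletal_age_years):
    """Return the measurement methods suggested for a skeletal age."""
    if skeletal_age_years < 0:
        raise ValueError("skeletal age must be non-negative")
    age_group = math.floor(skeletal_age_years)  # assumed whole-year grouping
    if age_group <= 8:
        return ["PRPR", "DIDI"]  # Hafner measurements only
    if age_group <= 13:
        # Both usable; DIDI differs from the perpendicular method by about -1 mm
        return ["PRPR", "DIDI", "adapted perpendicular"]
    return ["adapted perpendicular"]


for age in (7, 11.5, 15):
    print(age, recommended_methods(age))
```

In practice the choice also depends on what is visible on the radiograph, because each method was unusable in some cases when the epiphyses or growth plates could not be identified.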

Future recommendations
As pediatric mean ulnar variance values vary largely between previous studies, but small changes are suggested to be of influence in various conditions such as wrist pain in gymnasts, future research on the clinically relevant differences in ulnar variance in this population is warranted. Depending on the study population's age range, the Hafner or perpendicular method should be used to provide accurate measurements. Regardless of the measurement method, standardized wrist positioning should be applied with the forearm in neutral rotation, and caution should be taken that a template may not be useable in all PACS systems and that manual application of measurement lines may be subject to large interrater differences. Use of a standardized instruction sheet (Appendices 2 and 3) can help reduce this variation.

Part 1: drawing the measurement lines
Draw a line parallel to the longitudinal axis of the radius. This is now referred to as the reference line (RL).
Perpendicular to RL, draw a line that touches the most distal articular surface of the ulnar side of the radius, located at the ulnar corner of the sigmoid notch of the radius (line 1). This surface is usually seen as a sclerotic rim of the radius.
Perpendicular to RL, draw a line that touches the most distal articular surface of the ulna (line 2).

Part 2: measuring ulnar variance
Measure the distance between line 1 and line 2. This is the patient's ulnar variance.
If the ulna is relatively longer than the radius, the ulnar variance is positive (+). If the ulna is relatively shorter than the radius, the ulnar variance is negative (−).

Example
The cortical rim of the ulna is considered to represent the most distal articular surface of the ulna. The cortical rim of the most distal part of the articular surface of the radius can be difficult to locate due to overprojection, especially in case of incorrect positioning of the hand. When locating the ulnar corner of the sigmoid notch of the radius on incorrectly aligned images, draw the line directly in between the sclerotic rim of the distal radial articular surface and the overprojecting (more distal) part of the radial surface.

Fig. 5 a Adapted Bland-Altman plot for Hafner's PRPR measurement compared with the difference of the PRPR measurement and the perpendicular measurement. In persons in whom both methods can be used, the difference between these measurements can be assessed using these study results. For example: if in a 12-year-old child the PRPR is −1 mm and the perpendicular method results in an ulnar variance of −3 mm, the difference between the perpendicular and PRPR measurements is −2 mm, which lies within the limits of agreement of the differences found in this study.
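The worked example in the caption can be expressed as a simple check. The sketch below uses the rounded limits of agreement reported in the Discussion (−3 to +3 mm for PRPR, −4 to 2 mm for DIDI); the function name, the dictionary, and the convention that the difference is perpendicular minus Hafner are assumptions based on the caption's example.

```python
# Rounded limits of agreement (mm) for perpendicular minus Hafner, from the text
LIMITS_MM = {"PRPR": (-3.0, 3.0), "DIDI": (-4.0, 2.0)}


def difference_within_limits(perpendicular_mm, hafner_mm, hafner_measure):
    """Return the difference and whether it lies within the reported limits."""
    lower, upper = LIMITS_MM[hafner_measure]
    difference = perpendicular_mm - hafner_mm
    return difference, lower <= difference <= upper


# The caption's example: PRPR of -1 mm and perpendicular measurement of -3 mm
print(difference_within_limits(-3.0, -1.0, "PRPR"))  # (-2.0, True)
```

A difference outside these limits would not by itself indicate pathology; it would mainly be a reason to recheck the landmarks used for both measurements.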