The reliability and validity of three non-radiological measures of thoracic kyphosis and their relations to the standing radiological Cobb angle
DOI: 10.1007/s00198-010-1422-z
- Cite this article as:
- Greendale, G.A., Nili, N.S., Huang, MH. et al. Osteoporos Int (2011) 22: 1897. doi:10.1007/s00198-010-1422-z
Abstract
Summary
Hyperkyphosis is implicated in a mounting list of negative outcomes, including higher mortality. Hyperkyphosis research is hindered by difficulties inherent in its measurement. By showing that three clinical measures of kyphosis are suitable for use in large-scale, longitudinal hyperkyphosis studies, we will facilitate much-needed research in this field.
Introduction
The objective of this study is to describe the reliability of three non-radiological kyphosis measures (Debrunner kyphosis angle, flexicurve kyphosis index, and flexicurve kyphosis angle) and their validity compared to the Cobb angle and to approximate a Cobb angle from non-radiological kyphosis measures.
Methods
We analyzed data from 113 participants aged ≥60 years with kyphosis angle ≥40°. Cobb angle was measured on a standing lateral thoracolumbar radiograph using bounds at T4 and T12. Non-radiological measures of kyphosis were made three times by a single rater and a 4th time by a blinded second rater.
Results
Intra- and inter-rater reliabilities for non-radiological assessments were high (intra-class correlations of 0.96 to 0.98) and did not differ from each other. Pearson correlations, estimating validity, ranged from 0.62 to 0.69 and did not differ. The Debrunner angle was close to the Cobb angle, with a scaling factor of 1.067 and an offset of −5°. The Flexicurve kyphosis angle had to be scaled by 1.53 to obtain the equivalent Cobb angle. The scaling factor for the Flexicurve kyphosis index to Cobb angle was 315, with an offset of 5°. Compared to the measured Cobb angle, Cobb angles predicted using the non-radiological measures had errors of similar magnitude (standard deviations of the differences ranging between 10.24 and 11.26).
Conclusions
Each non-radiological measurement had similar reliability and validity. Low cost, ease of use, and robustness to variations in spine contour argue for the Flexicurve in longitudinal kyphosis assessments. The approximate conversion factors provided will permit translation of non-radiological measures to Cobb angles.
Keywords
Cobb angle, Kyphosis, Reliability, Validity
Introduction
Adverse consequences of hyperkyphosis (excessive thoracic kyphosis) include physical functional limitations [1–4], injurious falls [5], back pain [6], respiratory compromise [7], restricted spinal motion [8], fractures [9, 10], and mortality [11–13]. However, a recent randomized, controlled trial found that hyperkyphosis was remediable, encouraging further study of its prevention and treatment [14].
Impediments to large-scale hyperkyphosis research are the difficulties inherent in obtaining the criterion standard measurement, the modified Cobb angle [15–19], including expense, limited portability of X-ray equipment, X-ray exposure, and the time necessary to procure and read the radiographic image.
Although the non-radiological kyphosis measures minimize cost and obviate radiation, they have enjoyed limited adoption. One explanation may be that they are not calibrated to the Cobb angle, which limits their clinical interpretation. A metric that translates a non-radiological kyphosis result into an approximate Cobb angle would allow estimation of clinical severity from non-Cobb measures. Demonstrations of the reliability and validity of the non-radiological measures, especially in older persons, have been minimal, a possible second reason for limited use [13, 20, 22–24].
Therefore, we designed this study to describe: (1) the intra-rater and inter-rater reliability of three non-radiological kyphosis measures, the Debrunner kyphosis angle, the flexicurve kyphosis index, and the flexicurve kyphosis angle; (2) the validity of each non-radiological measure using the modified Cobb angle as the criterion standard; and (3) a translational formula that provides an approximate Cobb angle based on results of the non-radiological measures. We used baseline data from the Yoga for Kyphosis trial, during which we performed standing lateral radiographs to assess modified Cobb angle as well as multiple, same-day, intra-rater and inter-rater measures of the non-radiological assessments.
Methods
Participants
The analysis sample came from the Yoga for Kyphosis Trial, a single-masked, randomized, controlled trial (RCT) of Yoga intended to improve thoracic hyperkyphosis [14]. The trial enrolled 118 participants aged ≥60 years with Debrunner kyphometer-assessed kyphosis angle ≥40°. Major RCT exclusions were: serious comorbidity; use of an assistive device; or inability to pass a movement-safety screen. Of the 118 persons enrolled in the RCT, 113 had a standing radiological Cobb angle and at least one non-radiological assessment of kyphosis at RCT baseline, making them eligible for this analysis.
Kyphosis measurement
All kyphosis measures were made on the same day, within a 4-h window. The modified Cobb angle, based on the technique originally described by Cobb to quantify scoliosis, was measured on standing lateral thoracolumbar radiographs [17–19], specifying the limit vertebrae at T4 and T12 [18]. Because some radiographs did not permit use of the specified limit vertebrae (e.g., due to overlying structures), Cobb angles from 20 films were based on eight vertebrae (T4–T11 or T5–T12) and Cobb angles from six films were based on seven vertebrae (T5–T11). Non-radiological measures of kyphosis included the Debrunner kyphometer angle, the Flexicurve kyphosis index, and the Flexicurve kyphosis angle. The upper arm of the Debrunner kyphometer was placed on C7 and the lower arm on T12. The circumscribed kyphosis angle was read from the protractor [6, 20]. Debrunner measurements were flagged as problematic in eight cases because it was difficult to get the base of the arms flush on the landmarks. The Flexicurve kyphosis index was measured using a Flexicurve [21, 25]. The cephalic end of the Flexicurve was placed on C7, and the ruler was molded to the spine in the caudal direction. The shape was traced onto paper, and the apex kyphosis height was estimated relative to the length of the entire thoracic spine; this ratio is the Flexicurve kyphosis index (Fig. 1). Using geometric formulae, the Flexicurve kyphosis angle was also calculated from the Flexicurve tracing. By definition, this inscribed angle is systematically less than the circumscribed angle (Fig. 1).
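As a rough illustration of the two Flexicurve quantities described above, the sketch below computes a kyphosis index (apex height divided by thoracic length) and an inscribed angle from measurements taken off a tracing. The text does not state its geometric formulae, so the angle here is an assumption: it is taken as the sum of the two base angles of the triangle joining the ends of the thoracic curve to its apex. The function name, parameter names, and example measurements are hypothetical.

```python
import math


def flexicurve_measures(apex_height_cm, upper_length_cm, lower_length_cm):
    """Return (kyphosis_index, inscribed_angle_deg) from a Flexicurve tracing.

    apex_height_cm  : perpendicular height of the curve's apex above the
                      straight line joining the two ends of the thoracic curve
    upper_length_cm : distance along that line from the upper end to the apex
    lower_length_cm : distance along that line from the apex to the lower end

    ASSUMPTION: the inscribed angle is the sum of the two base angles of the
    end-apex-end triangle; the article only refers to "geometric formulae".
    """
    if apex_height_cm < 0 or upper_length_cm <= 0 or lower_length_cm <= 0:
        raise ValueError("height must be non-negative and lengths positive")

    total_length_cm = upper_length_cm + lower_length_cm
    kyphosis_index = apex_height_cm / total_length_cm

    # Each base angle is the arctangent of height over its adjacent segment.
    angle_rad = (math.atan(apex_height_cm / upper_length_cm)
                 + math.atan(apex_height_cm / lower_length_cm))
    return kyphosis_index, math.degrees(angle_rad)


# Hypothetical tracing: 5 cm apex height over a 30 cm symmetric thoracic curve.
index, angle_deg = flexicurve_measures(5.0, 15.0, 15.0)
```

Under this assumed geometry, a taller apex over the same length raises both the index and the angle, which is why the two Flexicurve measures behave almost identically in the reliability and validity tables.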
Training and time required for non-radiological kyphosis measures
Research staff had baccalaureate degrees, but none had formal training in anatomy. Staff training consisted of an initial didactic session and demonstration (with the aid of volunteer subjects) by the Principal Investigator (GAG). It included: review of basic spine anatomy using illustrations; instruction in how to find landmarks by palpation; demonstration of the placement of the kyphometer and how to read the angle from the instrument’s protractor; and demonstration of how to apply the flexible ruler and how to make measurements from it. Each staff member then practiced identifying landmarks and conducting the measures. In aggregate, the didactics and staff practice took approximately 40 min. During the conduct of the study, each Debrunner measurement took between 1 and 2 min to make and record, depending on the degree of difficulty ascertaining landmarks. Each flexible ruler measure took 30 s to make; subsequent tracing of the shape on paper and taking the measurements to calculate the angle and index took 2.5 min.
Intra-rater and inter-rater reliability
Each clinical kyphosis assessment was made three times for each participant (with repositioning) by the same staff person; the average was the primary value. These three measures also permitted evaluation of intra-rater reliability. For inter-rater reliability, immediately following the first set of measures, one other masked research associate made a 4th assessment, with repositioning, in 54 participants. (Inter-rater sample size ranged from 51 to 54 due to missing values.)
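The repeated measurements described above lend themselves to an intra-class correlation, defined in this article as between-person variance divided by total variance. The sketch below is one way to estimate it, assuming a one-way random-effects ANOVA decomposition; the article does not say which ICC form was used, and the function name and example angles are hypothetical.

```python
from statistics import mean


def icc_oneway(ratings):
    """Estimate ICC = between-person variance / total variance.

    ratings: list of per-participant lists, each holding the same number k
             of repeated measurements.

    ASSUMPTION: one-way random-effects ANOVA estimator; the article does not
    specify its ICC form.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(r) != k for r in ratings):
        raise ValueError("need >= 2 participants with equal counts (>= 2) of ratings")

    grand_mean = mean([x for r in ratings for x in r])
    person_means = [mean(r) for r in ratings]

    # Mean squares between participants and within participants.
    ms_between = k * sum((m - grand_mean) ** 2 for m in person_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(ratings, person_means) for x in r
    ) / (n * (k - 1))

    var_between = (ms_between - ms_within) / k
    return var_between / (var_between + ms_within)


# Hypothetical Debrunner angles (degrees): 4 participants, 3 repeats each.
example = [[50, 52, 51], [60, 61, 59], [45, 44, 46], [70, 71, 69]]
icc = icc_oneway(example)
```

When repeat measurements on the same person differ by only a degree or two while participants differ from each other by tens of degrees, as in the hypothetical data, the estimate sits close to 1.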
Statistical analyses
We examined the within-rater intra-class correlation coefficients (ICC = between-person variance divided by total variance) for each of the non-radiological kyphosis measures using the three measurements made on each participant by the primary rater. In the 54 participants in the inter-rater subset, who had paired ratings made by a single first and a single second rater, we compared the average of the three measures from the primary rater with the single measure from the secondary rater, calculating inter-rater ICCs. Both intra-rater and inter-rater ICCs were also examined after stratification by kyphosis severity, defined by Cobb angle median split: moderate if <53°, severe if ≥53°.
To compare the non-radiological kyphosis measures with the Cobb angle criterion standard, we examined Pearson correlations between each non-radiological measure and Cobb angle. These analyses were repeated after first excluding 26 participants whose Cobb angles did not span T4–T12 and then excluding seven individuals whose Debrunner measurements were flagged as problematic. In each of these samples, correlations were also examined after stratification by kyphosis severity. We created mathematical formulae to convert the non-radiological results to equivalent Cobb angles. Formulae were created by simple linear regression of the Cobb angle on each of the non-radiological measures in the sample that excluded participants whose Cobb angles did not span T4–T12 and those whose Debrunner measurements were flagged as problematic. To test if Cobb angles measured using alternate landmarks had systematic error, in the 20 participants whose Cobb angle measurements spanned either T5–T12 or T4–T11, we compared the measured Cobb angle with the Cobb angle predicted by the clinical measures, using the paired t test.
Finally, in the sample in which we derived the Cobb angle prediction equations (Table 5), we conducted Bland–Altman analyses. Bland–Altman analysis consists of the examination of two graphs. The first graph is an identity plot, a scatter plot of the two measurements along with the line y = x. If the measurements agree closely, then the scatter plot points will lie near the line y = x. The identity plot was produced only for the measured Cobb angle and the measured Debrunner kyphosis angle, because they measure the same thing (circumscribed kyphosis angle) and use the same metric (degrees). The second graph is a Bland–Altman plot, a scatter plot with the means of the two variables on the horizontal axis and their differences on the vertical axis; it includes approximate 95% confidence bands (the confidence bands assume normality of differences). The Bland–Altman plot illustrates the amount of disagreement between the measures being compared. Bland–Altman plots were created for the measured Cobb angle and each of the following: measured Debrunner kyphosis angle; Debrunner-predicted Cobb angle; Flexicurve kyphosis index-predicted Cobb angle; and Flexicurve kyphosis angle-predicted Cobb angle. The scientific importance of these differences is judged qualitatively; however, we also computed the standard deviation of the differences between the Cobb angle and each comparator to gauge the magnitude of the error [26].
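The quantities behind a Bland–Altman plot can be sketched in a few lines. The example below computes the pairwise means (horizontal axis), the pairwise differences (vertical axis), the mean difference, and the standard deviation of the differences. The bands are assumed to be the mean difference ± 1.96 standard deviations, the conventional limits of agreement under normality; the text does not spell out its own formula, and the function name and data are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(measured, comparator):
    """Return the summary values needed to draw a Bland-Altman plot.

    ASSUMPTION: the approximate 95% bands are mean difference +/- 1.96 SD.
    """
    if len(measured) != len(comparator) or len(measured) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")

    pair_means = [(a + b) / 2 for a, b in zip(measured, comparator)]
    differences = [a - b for a, b in zip(measured, comparator)]

    bias = mean(differences)          # systematic offset between methods
    sd_diff = stdev(differences)      # spread of the disagreement
    return {
        "means": pair_means,
        "differences": differences,
        "bias": bias,
        "sd": sd_diff,
        "lower": bias - 1.96 * sd_diff,
        "upper": bias + 1.96 * sd_diff,
    }


# Hypothetical measured vs. predicted Cobb angles (degrees).
measured_cobb = [50.0, 62.0, 45.0, 70.0]
predicted_cobb = [54.0, 58.0, 47.0, 66.0]
result = bland_altman(measured_cobb, predicted_cobb)
```

Plotting `result["means"]` against `result["differences"]`, with horizontal lines at `bias`, `lower`, and `upper`, gives the second graph described above; the `sd` value corresponds to the standard deviation used to gauge the magnitude of the error.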
Results
Table 1 Baseline demographic, behavioral and medical characteristics of study participants
Characteristic | Full sample (N = 113) | Inter-rater reliability sample ^{a} (N = 54) |
---|---|---|
Age (years) | 75.3 ± 7.5 | 75.5 ± 7.7 |
Height (cm) | 160.7 ± 8.9 | 161.1 ± 9.0 |
Weight (kg) | 68.8 ± 15.1 | 68.3 ± 14.3 |
Body mass index (kg/m^{2}) | 26.5 ± 4.5 | 26.1 ± 4.3 |
Female gender: % (N) | 80.5 (91) | 81.8 (45) |
Usual physical activity | 2.3 ± 0.5 | 2.3 ± 0.6 |
Chronic conditions (#) | 5.6 ± 3.8 | 5.4 ± 2.9 |
Vertebral fractures ^{b,c} | ||
None % (N) | 75.2 (85) | 74.6 (41) |
Thoracic % (N) | 19.5 (22) | 20.0 (11) |
Lumbar % (N) | 7.1 (8) | 9.1 (5) |
Table 2 Average values and distributions of standing Cobb angle and non-radiological kyphosis measurements
Kyphosis measurement | Sample size | Mean | Standard deviation | Median |
---|---|---|---|---|
Cobb angle, entire sample^{a} (degrees) | 113 | 53.76 | 14.76 | 53.10 |
Cobb angle, subset in which T4–T12 landmarks were used (degrees) | 87 | 55.43 | 13.62 | 53.1 |
Debrunner kyphosis angle (degrees) | 113 | 57.68 | 9.60 | 58.00 |
Flexicurve kyphosis index | 113 | 0.162 | 0.033 | 0.161 |
Flexicurve kyphosis angle ^{b} (degrees) | 113 | 36.50 | 6.82 | 36.48 |
Table 3 Intra- and inter-rater reliabilities of three non-radiological kyphosis assessments
| Intra-rater reliability (N = 113) | Inter-rater reliability^{a} (N = 51–54) |
---|---|---|
Full sample | ||
Debrunner kyphosis angle | 0.98 | 0.98 |
Flexicurve kyphosis index | 0.96 | 0.96 |
Flexicurve kyphosis angle | 0.96 | 0.96 |
| ||
Moderate Kyphosis ^{b} | ||
Debrunner kyphosis angle | 0.97 | 0.98 |
Flexicurve kyphosis index | 0.94 | 0.93 |
Flexicurve kyphosis angle | 0.94 | 0.94 |
| ||
Severe Kyphosis | ||
Debrunner kyphosis angle | 0.97 | 0.98 |
Flexicurve kyphosis index | 0.94 | 0.97 |
Flexicurve kyphosis angle | 0.94 | 0.95 |
Table 4 Validity of three non-radiological measurements of kyphosis compared to the Cobb angle criterion standard
Non-radiological kyphosis measurement and kyphosis severity | Full sample | Cobb-restricted sample^{a} | Cobb and Debrunner-restricted samples^{b} |
---|---|---|---|
Full range of Kyphosis | (N = 113; Std error = 0.094) | (N = 87; Std error = 0.107) | (N = 80; Std error = 0.112) |
Debrunner kyphosis angle | 0.622 | 0.715 | 0.762 |
Flexicurve kyphosis index | 0.686 | 0.725 | 0.756 |
Flexicurve kyphosis angle | 0.686 | 0.721 | 0.758 |
Moderate Kyphosis^{c} | (N = 55; Std error = 0.135) | (N = 41; Std error = 0.156) | (N = 37; Std error = 0.164) |
Debrunner kyphosis angle | 0.275 | 0.354 | 0.405 |
Flexicurve kyphosis index | 0.335 | 0.426 | 0.428 |
Flexicurve kyphosis angle | 0.328 | 0.397 | 0.406 |
Severe Kyphosis | (N = 58; Std error = 0.131) | (N = 46; Std error = 0.149) | (N = 43; Std error = 0.152) |
Debrunner kyphosis angle | 0.447 | 0.602 | 0.641 |
Flexicurve kyphosis index | 0.517 | 0.600 | 0.597 |
Flexicurve kyphosis angle | 0.532 | 0.626 | 0.627 |
Table 5 Calibration of non-radiological kyphosis measurements to the T4–T12 Cobb angle (n = 80)
Non-radiological kyphosis measurements | β coefficient | Intercept | R^{2} |
---|---|---|---|
Debrunner kyphosis angle | 1.067 | −5.40 | 0.58 |
Flexicurve kyphosis index | 314.61 | 5.11 | 0.57 |
Flexicurve kyphosis angle | 1.53 | 0.30 | 0.57 |
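The calibration table above amounts to three straight-line conversions of the form Cobb ≈ β × measure + intercept. A minimal sketch of applying them follows; the coefficients are copied from the table, while the dictionary keys and function name are hypothetical labels chosen for this example.

```python
# (slope, intercept) pairs copied from the calibration table (n = 80).
CALIBRATION = {
    "debrunner_angle": (1.067, -5.40),    # input in degrees
    "flexicurve_index": (314.61, 5.11),   # input is a unitless ratio
    "flexicurve_angle": (1.53, 0.30),     # input in degrees
}


def approximate_cobb(measure, value):
    """Convert a non-radiological kyphosis result to an approximate Cobb angle."""
    if measure not in CALIBRATION:
        raise KeyError(f"unknown measure: {measure!r}")
    slope, intercept = CALIBRATION[measure]
    return slope * value + intercept


# Sample means reported for the full sample, used here only as example inputs.
from_debrunner = approximate_cobb("debrunner_angle", 57.68)
from_index = approximate_cobb("flexicurve_index", 0.162)
from_flex_angle = approximate_cobb("flexicurve_angle", 36.50)
```

Feeding in the sample means gives approximate Cobb angles of about 56° from each measure, in line with the mean Cobb angle of the T4–T12 subset. As the authors caution, these conversions are approximations meant for interpreting research results, not for translating an individual patient's measurement in clinical practice.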
In the 20 individuals with Cobb angle measurements that spanned one fewer vertebral body (i.e., T4–T11 or T5–T12), mean Cobb angle was smaller than the Cobb angle predicted by the clinical kyphosis measures by about 8° in each case (data not shown), indicating that when the Cobb angle measure spans fewer vertebral bodies, the Cobb angle is systematically underestimated.
Discussion
The overarching goals of this study were to calculate the reliability and validity of the Debrunner kyphometer angle, flexicurve kyphosis index, and flexicurve kyphosis angle and to calibrate each to the Cobb angle. Intra- and inter-rater reliabilities for the three non-radiological kyphosis assessments were uniformly high (0.96 to 0.98) and did not differ statistically from each other. Comparing the non-radiological kyphosis measurements to the Cobb angle also yielded validity estimates that were not distinguishable; all correlations were moderate (0.62 to 0.69). Our derived regression equations that scaled the non-radiological kyphosis estimates to the Cobb angle had robust R^{2} values, between 0.57 and 0.58.
This study’s high inter-rater and intra-rater reliabilities of the Debrunner kyphometer and the Flexicurve kyphosis index, based on ICC values, mirrored reliabilities developed in a sample of 26 postmenopausal women with osteoporosis (but whose age range and degree of kyphosis were not specified); in that sample, inter-rater and intra-rater ICCs between 0.89 and 0.99 were found for each test [20]. The present analysis expands upon prior work by including a greater sample size, older subjects (in whom measurements may be more challenging), and a broad range of kyphosis over which reliabilities were assessed. The two studies agree, however: inter- and intra-rater reliabilities approach perfect and do not differ between the Debrunner kyphometer and the Flexicurve kyphosis index [27]. Although Ohlen examined reliability of the Debrunner kyphometer in 31 young volunteers and Ettinger tested reliability of the Flexicurve kyphosis index in 75 women aged 65–91 years, these two studies used different statistical methods to quantify reliability than those used in the present study, precluding direct comparison of their reliability estimates to ours [22, 24].
To our knowledge, published work has not reported the validity of the Debrunner kyphometer or the Flexicurve kyphosis index compared to the standing Cobb angle. Based on a sub-sample of 120 women from the Fracture Intervention Trial, Kado et al. calculated an ICC of 0.68 for the kyphosis index compared to a supine Cobb angle; however, the supine position would be expected to lessen the angle of kyphosis and lower the validity estimate [28].
Creating a mathematical formula that approximates Cobb angle based on a non-radiological kyphosis measure is not a novel idea, and its value in avoiding radiation and facilitating longitudinal measurement has been recognized [23]. However, cross-calibration has been done only for the Debrunner instrument in an adolescent sample [23]. The present study offers metrics that allow researchers and clinicians to scale the Debrunner angle, Flexicurve kyphosis index, and the newly developed Flexicurve kyphosis angle to a standing radiological Cobb angle in adults with hyperkyphosis. For example, the Flexicurve kyphosis index–Cobb translations could enhance the interpretation of an important finding from the Study of Osteoporotic Fractures (SOF): that greater Flexicurve kyphosis indices predicted higher mortality independently of vertebral fracture [13]. It is now possible to approximate the Cobb angles that these indices represented: using the current study’s metric, the SOF sample’s mean predicted Cobb angle would be 43.8° (standard deviation, 10.7). Thus, the relative mortality hazard per kyphosis index standard deviation developed in SOF can be roughly translated to a 15% increase in mortality per 10.7° increment in Cobb angle.
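The arithmetic behind this kind of translation can be sketched as follows. Because the conversion is linear, a sample mean is both scaled by the slope and shifted by the intercept, whereas a standard deviation is scaled by the slope alone. The slope and intercept are the Flexicurve kyphosis index coefficients reported in this study; the index mean and standard deviation passed in below are hypothetical inputs chosen for illustration, not figures reported by SOF.

```python
# Flexicurve kyphosis index coefficients from this study's calibration (n = 80).
INDEX_SLOPE = 314.61
INDEX_INTERCEPT = 5.11


def index_summary_to_cobb(mean_index, sd_index):
    """Map a sample mean and SD of the kyphosis index onto the Cobb scale."""
    mean_cobb = INDEX_SLOPE * mean_index + INDEX_INTERCEPT  # scale and shift
    sd_cobb = abs(INDEX_SLOPE) * sd_index                   # scale only
    return mean_cobb, sd_cobb


# HYPOTHETICAL index summary, for illustration only.
mean_cobb, sd_cobb = index_summary_to_cobb(0.123, 0.034)
```

With these illustrative inputs the result is a mean of roughly 43.8° and a standard deviation of roughly 10.7°, which shows how a hazard expressed per standard deviation of the index can be restated per increment in approximate Cobb degrees.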
This study intended to inform deliberations about which of the three non-radiological tests used in the Yoga for Kyphosis project might be best suited to large observational or interventional kyphosis studies, in which sizable numbers of participants would be evaluated at multiple times. Because these types of studies necessitate multiple raters, the first consideration is the inter- and intra-rater reliabilities. On this basis, all three assessments performed nearly perfectly and equally. A second basis for ranking the three tests is validity, but this also did not discriminate among them. Finally, when compared to the criterion standard measured Cobb angle, Cobb angles predicted using each of the non-radiological measures had similar magnitude errors according to the Bland–Altman plots. Therefore, factors such as simplicity of use and sensitivity to anatomical variability may suggest the most favorable approach. The flexicurve may be easier for research staff without medical training, as it does not require identification of caudal landmarks. The flexicurve traces the contour of the entire spine; the inflection points between the cervical lordosis, thoracic kyphosis, and lumbar lordosis define the spinal curves. In contrast, the Debrunner kyphometer must be placed on palpated landmarks [6]. Despite careful protocols, the inferior landmark can be particularly difficult to discern, especially when lumbar lordosis has reversed [21]. The Cobb and Debrunner angles base their measurements entirely on the two ends of the spinal curve. If there are no problems at these locations (such as endplate tilt of limit vertebrae or difficult Debrunner placement), dependence on the terminal portions of the curve will not be strongly influential [29]. However, when anatomical abnormalities are present, then an instrument such as the Flexicurve, which uses the entire spinal contour, will be more robust because deformities in part of the spine will not introduce large errors. 
In this regard, the Flexicurve is akin to the centroid angle, which computes kyphosis using the midpoints of all vertebral bodies from T1–T12 [29]. Indicative of the error introduced by difficult landmark determination was the trend toward a higher correlation between the Debrunner and Cobb angles when eight individuals with difficult Debrunner measures were omitted from the validity computation (Table 4).
Use of the T4–T12 constrained Cobb angle had merits and limitations. In favor of the constrained Cobb is that the uppermost thoracic vertebrae are often poorly visualized due to overlying tissue density. Another attribute of the constrained technique is that the identification of the most inclined vertebral body, which marks the transition from the thoracic to the lumbar curves, can be difficult, leading to low intra-rater reliability for determination of limit vertebrae, a problem circumvented by using the constrained Cobb technique [30, 31]. It must be acknowledged that the constrained method will misestimate the true kyphosis angle when the transition vertebra is not at the same level as the specified level. In aggregate, the potential measurement errors in the Cobb angle degrade the accuracy of the criterion standard, conservatively biasing this study’s validity estimates.
The reliability and validity estimates of the non-radiological measures of kyphosis calculated from this sample cannot be assumed to apply to all instances in which these measuring devices are used; they are not immutable characteristics of the tests themselves [32]. Deterioration of reliability and validity may occur due to subject characteristics (e.g., obesity hampers landmark location) or to operator characteristics (e.g., staff capability). Because the research associates who performed the measures in the current study had no formal training in anatomy and were likely comparable to other entry-level research or clinical staff, we believe that operator characteristics are unlikely to be influential in other settings.
The metrics developed in this study to scale the non-radiological tests to the standing Cobb angle must be viewed as approximations, intended to give investigators and clinicians a “feel” for what the values of the non-radiological tests mean in Cobb angle terms. They are not intended to translate an individual patient’s non-radiological measures to Cobb angle values in clinical practice. Rather, these approximate conversion formulae are meant to inform the general clinical translation of research results.
In summary, in our study sample, we found that the Debrunner kyphometer, the Flexicurve kyphosis angle, and the Flexicurve kyphosis index had strong and similar validity and reliability. The Flexicurve’s low cost, ease of use by entry-level research staff, short measurement time, and relative robustness to variations in spine contour and deformity argue for its use in longitudinal assessments of kyphosis. This study also provides approximate conversion factors that permit translation of results from three non-radiological kyphosis measures to an approximate Cobb angle value, which will assist researchers in interpreting the clinical meaning of the non-radiological tests.
Conflicts of interest
None.
Source of funding
Funding for conduct of the Yoga for Kyphosis Trial and this analysis was provided by NIH/NICHD (5 R01 HD045834). Dr. Karlamangla was also supported by funding from the UCLA-Claude D. Pepper Older Americans Independence Center (1P30 AG028748).
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.