To report on the development of the AOSpine CROST (Clinician Reported Outcome Spine Trauma) and the results of an initial reliability study.
The AOSpine CROST was developed using an iterative approach of multiple cycles of development, review, and revision including an expert clinician panel. Subsequently, a reliability study was performed among an expert panel who were provided with 20 spine trauma cases, administered twice with a 4-week interval. The results of the developmental process were analyzed using descriptive statistics, the reliability per parameter using Kappa statistics, inter-rater agreement using the intraclass correlation coefficient (ICC), and internal consistency using Cronbach’s α.
The AOSpine CROST was developed and consisted of 10 parameters, 2 of which are only applicable to surgically treated patients (‘Wound healing’ and ‘Implants’). A dichotomous scoring system (‘yes’ or ‘no’ response) was incorporated to express expected problems for the short term and long term. In the reliability study, 16 of the 19 invited experts (84.2%) participated in the first round and 14 (73.7%) in the second. Intra-rater reliability was fair to good for both the short term and the long term (κ = 0.40–0.80 and κ = 0.31–0.67, respectively). Results of inter-rater reliability were lower (κ = 0.18–0.60 and κ = 0.16–0.46, respectively). Inter-rater agreement for total scores showed moderate results (ICC = 0.52–0.60), and the internal consistency was acceptable (α = 0.76–0.82).
The AOSpine CROST, an outcome tool for surgeons, was developed using an iterative process. An initial reliability analysis showed fair to moderate results and acceptable internal consistency. Further clinical studies will be performed to validate the tool.
Based on a ground-up and evidence-based approach, the AOSpine Knowledge Forum (KF) Trauma has undertaken initiatives to develop a novel disease-specific outcome instrument for spine trauma patients. In addition to outcome measurement from the patients’ perspective, there is also a need for a tool that incorporates the most relevant clinical and radiological parameters from spine surgeons’ perspective as a corollary predictive outcomes tool. In daily clinical practice, treating surgeons routinely use a number of clinical and radiological parameters to evaluate treatment results after traumatic spine injuries, either conservative or surgical. In order to predict the outcome and determine the potential need for additional treatment, it is common that spine surgeons make estimates of expected problems with respect to a number of short-term and long-term outcomes. It is likely that surgeons’ perspectives may differ substantially from the patients’ perspective [1,2,3,4,5,6,7].
It would be valuable to standardize the surgeons’ ‘gut feeling’ and make it measurable. Therefore, we sought to assess the potential utility of a new concept of a Clinician Reported Outcome Spine Trauma (AOSpine CROST) as a supplement to a patient reported outcomes tool. Such a tool would be administered by the treating surgeons at various time points during the follow-up period, after patients’ initial treatment. We hypothesized that treating surgeons, with their content expertise, would be able to estimate and predict clinical and functional outcomes of spine trauma patients using this tool. The quality of spine care could be improved by standardizing the evaluation of patients’ course after treatment. The objective of this paper is to report on the development of the AOSpine CROST as well as the results of an initial reliability study.
Materials and methods
Developmental process AOSpine CROST
In the developmental process of the tool, two separate surveys were conducted among international spine trauma experts in order to identify relevant clinical and radiological parameters for the thoracic and lumbar spine [8], and for the cervical spine [9]. Subsequently, integrating evidence from the preparatory studies with expert opinion, a working draft version of the AOSpine CROST was developed. This process consisted of an iterative approach of multiple cycles of development, review, and revision including an expert clinician panel consisting of AOSpine KF Trauma and its associate members. Attention was paid to the definition of the parameters and additional descriptions in order to specify those parameters. Various response scales were also investigated. After the development of a draft version of the tool, a pilot test was performed during an expert committee meeting. The tool was evaluated by rating it for various cases from daily clinical practice. After completing this phase, a definitive version was developed to be further validated.
For the validation phase, a study to evaluate intra- and inter-rater reliability was performed among an expert panel. An invitation was sent by the data manager of AOSpine International to the Steering Committee members of the KF Trauma and Spinal Cord Injury as well as to their associate members. The participants were provided, through an online system, with 20 selected spine trauma cases representing the wide range of cases typically seen in daily clinical practice. The cases were selected by the first author (SS) and senior author (FCO). The web-based system presented background data for each case scenario and recorded the participants’ AOSpine CROST evaluation, together with any comments entered in an additional blank field. For retest reliability, the cases were assessed on two occasions with a 4-week interval.
In line with the aim of the AOSpine CROST to evaluate the provided treatment at the first follow-up time-point after trauma, the case scenarios mimicked a first outpatient visit after the initial trauma. The cases were selected from a large database of the University Medical Center Utrecht (Utrecht, the Netherlands) and Schön Klinik (Fürth, Germany), and included 14 surgically and 6 conservatively treated patients. Each case consisted of: (1) patient characteristics, (2) background life-style, (3) trauma-related characteristics together with the CT and/or MRI scan slices from the trauma-setting, (4) the further course at the hospital, and (5) the outpatient clinic follow-up together with the AP and lateral CR or any other modality that was performed. Two examples of cases which showed various AOSpine CROST results are shown in Figs. 1 and 2.
The results of the developmental process of the AOSpine CROST were analyzed using descriptive statistics. The inter- and intra-rater reliability per parameter was analyzed using Kappa statistics, with values < 0 indicating poor agreement, 0–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect agreement [10]. The inter-rater agreement for the total score was calculated using the intraclass correlation coefficient (ICC) [11]. The internal consistency was analyzed using Cronbach’s α, with α ≥ 0.7 considered acceptable and α ≥ 0.9 excellent.
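As an illustration of the per-parameter analysis, the following minimal Python sketch computes Cohen's kappa for one rater's dichotomous answers in two rounds and maps the result onto the agreement bands listed above. The kappa variant used in the study is not stated, so the choice of Cohen's kappa, the function names, and the example ratings are assumptions made for illustration only.

```python
from typing import Sequence


def cohen_kappa(round_1: Sequence[int], round_2: Sequence[int]) -> float:
    """Cohen's kappa for two sets of dichotomous ratings (1 = 'yes', 0 = 'no')."""
    if len(round_1) != len(round_2) or len(round_1) == 0:
        raise ValueError("both rounds must rate the same non-empty set of cases")
    n = len(round_1)
    # Observed proportion of cases with identical answers in both rounds.
    observed = sum(a == b for a, b in zip(round_1, round_2)) / n
    # Agreement expected by chance, from the marginal 'yes' proportions.
    p1_yes = sum(round_1) / n
    p2_yes = sum(round_2) / n
    expected = p1_yes * p2_yes + (1 - p1_yes) * (1 - p2_yes)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one category is used
    return (observed - expected) / (1 - expected)


def agreement_band(kappa: float) -> str:
    """Agreement label for a kappa value, using the bands given in the text."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical answers of one rater for 20 cases in two rounds.
first_round = [1] * 6 + [1] * 2 + [0] * 2 + [0] * 10
second_round = [1] * 6 + [0] * 2 + [1] * 2 + [0] * 10
kappa = cohen_kappa(first_round, second_round)
print(round(kappa, 2), agreement_band(kappa))  # 0.58 moderate
```

For agreement among more than two raters, a multi-rater statistic such as Fleiss' kappa would be the usual choice; the band labels apply in the same way.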
AOSpine CROST tool
After pilot testing and multiple evaluations during expert committee meetings, a final version of the AOSpine CROST was developed consisting of 10 parameters (Table 1). Eight parameters were rated for both surgically and non-surgically treated patients, while 2 parameters were only applicable to surgically treated patients (‘Wound healing’ and ‘Implants’). The tool was pilot tested among several examples of spine trauma cases from the daily clinical practice. In line with the approach in the preparatory studies, each parameter was rated both for the short-term and long-term perspectives, indicated as ‘within 12 months’ and ‘from 12 months onwards’, respectively.
It was decided not to further classify response levels or develop specific cutoff points. After review of a number of scoring methodologies during the initial testing process, a dichotomous scoring system (‘yes’ or ‘no’ response) was selected to express expected problems or adverse events for the parameters. Each ‘yes’-answer provided 1 point. The total recorded score was the sum of the ‘yes’-answers, with a maximum achievable score of 8 points for non-surgically and 10 points for surgically treated patients. A higher total score would indicate worse expected clinical outcomes. The score is seen as an indication of the surgeon’s anticipation of a change in the treatment plan. The definitive version of the AOSpine CROST used in this study is shown in the Appendix.
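The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration for a single time horizon (short term or long term), not the official instrument: the parameter labels are taken from the names used in this article and may differ from the exact wording of Table 1, and the function and variable names are hypothetical.

```python
from typing import Mapping

# Parameter labels as they appear in the article text (assumed wording).
COMMON_PARAMETERS = (
    "Neurological status",
    "Radiographic sagittal alignment",
    "General bone quality",
    "Stability of the injured spine level",
    "Spinal column mobility",
    "General physical condition",
    "General psychological condition",
    "Functional recovery",
)
SURGICAL_ONLY_PARAMETERS = ("Wound healing", "Implants")


def crost_total_score(answers: Mapping[str, bool], surgically_treated: bool) -> int:
    """Sum of 'yes' answers (True = problem expected) over applicable parameters."""
    applicable = COMMON_PARAMETERS
    if surgically_treated:
        applicable = applicable + SURGICAL_ONLY_PARAMETERS
    missing = [name for name in applicable if name not in answers]
    if missing:
        raise ValueError(f"missing answers for: {missing}")
    # Each 'yes' contributes 1 point; the maximum is 8 or 10 points.
    return sum(1 for name in applicable if answers[name])


# Hypothetical surgically treated case with two expected problems.
example = {name: False for name in COMMON_PARAMETERS + SURGICAL_ONLY_PARAMETERS}
example["General bone quality"] = True
example["Implants"] = True
print(crost_total_score(example, surgically_treated=True))  # 2
```

Keeping the two surgery-specific parameters in a separate tuple makes the different maximum scores (8 versus 10) follow directly from the applicable parameter list.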
In total, out of 19 invited international spine trauma experts, 16 (84.2%) participated in the first round and 14 (73.7%) in the second round. Ten were related to AOSpine KF Trauma and 6 to AOSpine KF Spinal Cord Injury. Different world regions were represented, with 9 (56.3%) experts from North America, 5 (31.3%) from Europe, 1 (6.3%) from Asia, and 1 (6.3%) from South America.
The intra-rater reliability analysis per parameter showed fair to good results, both for the short term and long term (Table 2). For the short term, Kappa values ranged from 0.40 (‘General bone quality’) to 0.80 (‘Radiographic sagittal alignment’). For the long-term predictive outcomes, ‘Radiographic sagittal alignment’ (κ = 0.67) again showed the highest agreement. Compared to the short-term reliability, ‘Wound healing’ (κ = 0.31 vs 0.68), ‘Stability of the injured spine level’ (κ = 0.57 vs 0.79), and ‘Implants’ (κ = 0.44 vs 0.67) showed lower agreement for the long term.
With slight to moderate agreement, the results of the inter-rater reliability analysis per parameter were lower than for intra-rater results, both for the short term and long term (Table 3). For the short term, ‘Spinal column mobility’ showed the lowest agreement (κ = 0.18), while the highest agreement was reached for ‘Radiographic sagittal alignment’ (κ = 0.60). The lowest inter-rater reliability for the long term was for ‘Spinal column mobility’ (κ = 0.16), and the highest for ‘Radiographic sagittal alignment’ (κ = 0.46).
Analyses of the inter-rater agreement results for the total scores of the AOSpine CROST showed moderate results for both surgically and non-surgically treated cases, as well as for the short and long term. As shown in Table 4, the intraclass correlation coefficient (ICC) ranged from 0.52 to 0.60.
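For readers unfamiliar with the ICC, the sketch below computes one common form, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a cases-by-raters table of total scores. The article does not state which ICC form was used, so the choice of ICC(2,1), the function name, and the example scores are assumptions for illustration.

```python
from typing import Sequence


def icc_2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) for a table where scores[i][j] is the score of case i by rater j."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two cases are required")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every case needs scores from the same two or more raters")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    case_means = [sum(row) / k for row in scores]
    rater_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: cases, raters, and residual.
    ss_cases = k * sum((m - grand_mean) ** 2 for m in case_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_cases - ss_raters

    ms_cases = ss_cases / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_cases + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_cases - ms_error) / denominator


# Hypothetical total scores of 4 cases (rows) given by 3 raters (columns).
totals = [
    [2, 3, 2],
    [6, 5, 7],
    [1, 1, 2],
    [4, 6, 5],
]
print(round(icc_2_1(totals), 2))
```

This form penalizes systematic differences between raters as well as random disagreement, which is why it is often chosen when absolute agreement on a score matters.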
Acceptable results were observed for the internal consistency of the total AOSpine CROST scores. The Cronbach’s alpha ranged from 0.76 to 0.82 (Table 5).
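Cronbach's α can be computed directly from the dichotomous parameter scores. The following minimal sketch uses the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores); the function name and the example data are hypothetical and are not taken from the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(item_scores: Sequence[Sequence[int]]) -> float:
    """Cronbach's alpha; item_scores[i][j] is the score on parameter j in assessment i."""
    if len(item_scores) < 2:
        raise ValueError("at least two assessments are required")
    n_items = len(item_scores[0])
    if n_items < 2 or any(len(row) != n_items for row in item_scores):
        raise ValueError("every assessment needs the same two or more parameters")

    # Sample variance of each parameter across assessments.
    item_variances = [
        variance([row[j] for row in item_scores]) for j in range(n_items)
    ]
    # Sample variance of the total score per assessment.
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 'yes' (1) / 'no' (0) answers: 5 assessments of 3 parameters.
answers = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
]
alpha = cronbach_alpha(answers)
print(round(alpha, 2), "acceptable" if alpha >= 0.7 else "below 0.7")  # 0.71 acceptable
```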
Although several comments were provided by the participants concerning the cases provided, no specific comments were directed at the AOSpine CROST tool.
Based on the results of two preparatory studies combined with findings from expert committee meetings, the AOSpine CROST has been developed. An initial reliability study conducted among senior spine trauma experts showed fair results for inter-rater reliability; however, moderate results for intra-rater reliability as well as acceptable results for internal consistency were found. We believe this is the first scoring tool for spine trauma care that reflects the spine surgeon’s expectations on predicted patient outcomes applicable to a routine clinical setting.
The tool was developed in an iterative fashion in several development cycles with sequential reviews and revisions in multiphase processes conducted in expert meetings. First, on the basis of two preparatory studies [8, 9], a number of parameters were selected in this process and then refined further. In the developmental process of the AOSpine CROST, multiple versions of the tool including those parameters were pilot tested among several examples of spine trauma cases from daily clinical practice. In this process, parameters such as ‘Neurological status’, ‘Radiographic sagittal alignment’, ‘General bone quality’, and ‘Stability of the injured spine level’ were more precisely defined. However, after extensive efforts through sequential reviews, revisions, and pilot tests, the expert committee decided not to further define or formulate specific cutoff points; rather, one question was added for each parameter to make the tool more easily usable and improve interpretability. Further, a duration-based differentiation was made for short-term (‘in the next 12 months’) and long-term outcomes (‘from 12 months onwards’).
There were multiple reasons for these decisions in the current phase. For example, in the ‘Neurological status’ parameter, addition of a dedicated neurological classification system was contemplated. As there are a variety of neurological classification systems in use, such as ASIA, Frankel Scales, and AOSpine Injury Classification systems [12,13,14,15], it was decided not to further specify this domain. Moreover, the correlation of various types of potential neurological deterioration relative to outcomes remained controversial and therefore was felt to be too unpredictable for classification at this time [16]. Also, ‘Radiographic sagittal alignment’ was not specified in terms of specific kyphosis angles as to their impact on outcomes, as various threshold definitions have been proposed in previous literature [17, 18]. Moreover, worldwide variation in measurement techniques among surgeons has made creation of specific numeric levels undesirable [19]. The parameter of ‘Stability of the injured spine level’ was further refined by addition of the term ‘mechanical instability’. The same was the case for ‘Spinal column mobility’, in which maintenance of overall spinal column mobility was described. ‘General bone quality’ was felt to be another key parameter, reflecting the surgeon’s assessment of the patient’s bone quality. It was felt important as it may play a role in surgical decision making and also affect supplemental interventional treatments [20]. Also, a higher risk of implant failure and the possibility for gradual neurological deterioration can be correlated with impaired bone quality [21]. Furthermore, domains for patient ‘General physical condition’ and ‘General psychological condition’ were felt to be important factors for treatment selection as well as expected outcomes [22, 23].
‘Implant’-related concerns were selected as a separate domain as osteoporosis and type of implant selection may impact anticipated outcomes, for instance, in case of short-segment fixation in patients with poor bone stock [24]. ‘Wound healing’ in surgically treated patients might impact patient care, e.g., in the form of revision surgeries or ongoing antibiotic treatment. ‘Functional recovery’ was added to the clinician’s perspective-based AOSpine CROST tool although it had not been evaluated in the preparatory studies [8, 9]. We felt that this parameter would add a valuable contribution to the overall tool, and provide a direct connection to the patient’s reported outcome as expressed by AOSpine PROST (Patient Reported Outcome Spine Trauma). This AOSpine PROST was developed and validated on the basis of different foundational studies and following an international consensus conference [25].
In general, fair to moderate results were observed for the inter-rater reliability of the tool, while the intra-rater reliability showed moderate to good results. Also, acceptable internal consistency was seen for the parameters of the AOSpine CROST. Thus, the tool is able to adequately measure the underlying construct, and evaluate the treatment progress using clinical and radiological parameters. These results indicate that individual surgeons are highly consistent in their judgments, but there is disagreement among different surgeons. This difference in the evaluation of crucial clinical parameters among surgeons with substantial experience and interest in spine trauma might also explain some of the ongoing controversies on the care of spine trauma patients. It may reflect the regional differences in treatment of trauma patients (or lack of worldwide accepted guidelines) and is considered as a possible expected finding of this study. This view is supported by the better results for intra-rater reliability and internal consistency. In the next phase, while testing the AOSpine CROST in a clinical setting including surgeons from the same departments and regions, better inter-rater reliability within one region or department is expected. From this perspective, this tool may also be useful in understanding the reasons for the observed variations in practice. The reliability results may also be related to the current study design, whereby case scenarios were provided in an online environment. We would hope that direct assessments in front of actual patients and in a realistic clinical setting would allow for a more consistent assessment of patients by different practitioners, especially where parameters such as ‘General physical condition’ and ‘General psychological condition’ are concerned, which scored only ‘fair’ in the current validation study.
This study has several limitations. First, we relied on a relatively limited number of participants in the reliability study. Nevertheless, as each participant rated the AOSpine CROST for 20 cases, a total of 280–320 data points were retrieved, which is comparable to, or even considerably more than, many other inter-rater reliability studies in which 30–50 cases are rated by 3–5 participants. Secondly, our study design did not include longer term patient follow-up results to investigate the prospective value of the tool. We do plan to perform such outcome-based studies in the future with patients in actual clinical settings. Finally, the patients were presented as online cases only with descriptive scenarios. We felt that for the initial validation phase of our clinician outcomes tool this would provide the most expedient way to test the initial reliability of the AOSpine CROST.
In conclusion, the AOSpine CROST (Clinician Reported Outcome Spine Trauma) was developed on the basis of two preparatory studies combined with the results of expert committee meetings. An initial reliability analysis showed fair to moderate results and acceptable internal consistency. In the next phase, further prospective validation studies will be performed to investigate the construct validity, reliability, and predictive value of the tool. We believe that this tool has the potential to be used in the clinical setting, which can provide a holistic view of patients’ health when used together with the AOSpine PROST (Patient Reported Outcome Spine Trauma) and may help resolve some of the ongoing controversies.
Jensen MC, Brant-Zawadzki MN, Obuchowski N, Modic MT, Malkasian D, Ross JS (1994) Magnetic resonance imaging of the lumbar spine in people without back pain. N Engl J Med 331:69–73. https://doi.org/10.1056/NEJM199407143310201
Remes VM, Lamberg TS, Tervahartiala PO, Helenius IJ, Osterman K, Schlenzka D, Yrjonen T, Seitsalo S, Poussa MS (2005) No correlation between patient outcome and abnormal lumbar MRI findings 21 years after posterior or posterolateral fusion for isthmic spondylolisthesis in children and adolescents. Eur Spine J 14:833–842. https://doi.org/10.1007/s00586-005-0950-2
Witt I, Vestergaard A, Rosenklint A (1984) A comparative analysis of X-ray findings of the lumbar spine in patients with and without lumbar pain. Spine (Phila Pa 1976) 9:298–300
Nygaard OP, Kloster R, Dullerud R, Jacobsen EA, Mellgren SI (1997) No association between peridural scar and outcome after lumbar microdiscectomy. Acta Neurochir (Wien) 139:1095–1100
Kwoh CK, O’Connor GT, Regan-Smith MG, Olmstead EM, Brown LA, Burnett JB, Hochman RF, King K, Morgan GJ (1992) Concordance between clinician and patient assessment of physical and mental health status. J Rheumatol 19:1031–1037
Rothwell PM, McDowell Z, Wong CK, Dorman PJ (1997) Doctors and patients don’t agree: cross sectional study of patients’ and doctors’ perceptions and assessments of disability in multiple sclerosis. BMJ 314:1580–1583
Wilson KA, Dowling AJ, Abdolell M, Tannock IF (2000) Perception of quality of life by patients, partners and treating physicians. Quality Life Res Int J Quality Life Asp Treat Care Rehabil 9:1041–1052
Sadiqi S, Verlaan JJ, Mechteld Lehr A, Dvorak MF, Kandziora F, Rajasekaran S, Schnake KJ, Vaccaro AR, Oner FC (2017) Universal disease-specific outcome instruments for spine trauma: a global perspective on relevant parameters to evaluate clinical and functional outcomes of thoracic and lumbar spine trauma patients. Eur Spine J 26:1541–1549. https://doi.org/10.1007/s00586-016-4596-z
Sadiqi S, Verlaan JJ, Lehr AM, Dvorak MF, Kandziora F, Rajasekaran S, Schnake KJ, Vaccaro AR, Oner FC (2016) Surgeon reported outcome measure for spine trauma: an international expert survey identifying parameters relevant for the outcome of subaxial cervical spine injuries. Spine (Phila Pa 1976) 41:E1453–E1459. https://doi.org/10.1097/brs.0000000000001683
Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174
Shrout PE, Fleiss JL (1979) Intraclass correlations: uses in assessing rater reliability. Psychol Bull 86:420–428
Frankel HL, Hancock DO, Hyslop G, Melzak J, Michaelis LS, Ungar GH, Vernon JD, Walsh JJ (1969) The value of postural reduction in the initial management of closed injuries of the spine with paraplegia and tetraplegia. I. Paraplegia 7:179–192. https://doi.org/10.1038/sc.1969.30
Maynard FM Jr, Bracken MB, Creasey G, Ditunno JF Jr, Donovan WH, Ducker TB, Garber SL, Marino RJ, Stover SL, Tator CH, Waters RL, Wilberger JE, Young W (1997) International standards for neurological and functional classification of spinal cord injury. American Spinal Injury Association. Spinal Cord 35:266–274
Vaccaro AR, Koerner JD, Radcliff KE, Oner FC, Reinhold M, Schnake KJ, Kandziora F, Fehlings MG, Dvorak MF, Aarabi B, Rajasekaran S, Schroeder GD, Kepler CK, Vialle LR (2016) AOSpine subaxial cervical spine injury classification system. Eur Spine J 25:2173–2184. https://doi.org/10.1007/s00586-015-3831-3
Vaccaro AR, Oner C, Kepler CK, Dvorak M, Schnake K, Bellabarba C, Reinhold M, Aarabi B, Kandziora F, Chapman J, Shanmuganathan R, Fehlings M, Vialle L (2013) AOSpine thoracolumbar spine injury classification system: fracture description, neurological status, and key modifiers. Spine (Phila Pa 1976) 38:2028–2037. https://doi.org/10.1097/brs.0b013e3182a8a381
Ter Wengel PV, Martin E, De Witt Hamer PC, Feller RE, van Oortmerssen JAE, van der Gaag NA, Oner FC, Vandertop WP (2019) Impact of early (< 24 h) surgical decompression on neurological recovery in thoracic spinal cord injury: a meta-analysis. J Neurotrauma. https://doi.org/10.1089/neu.2018.6277
Kuklo TR, Polly DW, Owens BD, Zeidman SM, Chang AS, Klemme WR (2001) Measurement of thoracic and lumbar fracture kyphosis: evaluation of intraobserver, interobserver, and technique variability. Spine (Phila Pa 1976) 26:61–65 (discussion 66)
Schoenfeld AJ, Wood KB, Fisher CF, Fehlings M, Oner FC, Bouchard K, Arnold P, Vaccaro AR, Sekhorn L, Harris MB, Bono CM (2010) Posttraumatic kyphosis: current state of diagnosis and treatment: results of a multinational survey of spine trauma surgeons. J Spinal Disord Tech 23:e1–e8. https://doi.org/10.1097/BSD.0b013e3181c03517
Sadiqi S, Verlaan JJ, Lehr AM, Chapman JR, Dvorak MF, Kandziora F, Rajasekaran S, Schnake KJ, Vaccaro AR, Oner FC (2017) Measurement of kyphosis and vertebral body height loss in traumatic spine fractures: an international study. Eur Spine J 26:1483–1491. https://doi.org/10.1007/s00586-016-4716-9
Westerveld LA, Verlaan JJ, Oner FC (2009) Spinal fractures in patients with ankylosing spinal disorders: a systematic review of the literature on treatment, neurological status and complications. Eur Spine J 18:145–156. https://doi.org/10.1007/s00586-008-0764-0
Halvorson TL, Kelley LA, Thomas KA, Whitecloud TS 3rd, Cook SD (1994) Effects of bone mineral density on pedicle screw fixation. Spine (Phila Pa 1976) 19:2415–2420
Wiseman T, Foster K, Curtis K (2013) Mental health following traumatic physical injury: an integrative literature review. Injury 44:1383–1390. https://doi.org/10.1016/j.injury.2012.02.015
van Delft-Schreurs CC, van Bergen JJ, de Jongh MA, van de Sande P, Verhofstad MH, de Vries J (2014) Quality of life in severely injured patients depends on psychosocial factors rather than on severity or type of injury. Injury 45:320–326. https://doi.org/10.1016/j.injury.2013.02.025
Stromsoe K (2004) Fracture fixation problems in osteoporosis. Injury 35:107–113
Sadiqi S, Lehr AM, Post MW, Dvorak MF, Kandziora F, Rajasekaran S, Schnake KJ, Vaccaro AR, Oner FC (2017) Development of the AOSpine Patient Reported Outcome Spine Trauma (AOSpine PROST): a universal disease-specific outcome instrument for individuals with traumatic spinal column injury. Eur Spine J 26:1550–1557. https://doi.org/10.1007/s00586-017-5032-8
The authors thank AOSpine International for their support. Also thanks to Vicky Kalampoki and Kathrin Espinoza-Rebmann, from AOCID, for their statistical analysis support.
This study was organized and funded by AOSpine through the AOSpine Knowledge Forum Trauma, a focused group of international Trauma experts. AOSpine is a clinical division of the AO Foundation which is an independent medically-guided not-for-profit organization. Study support was provided directly through the AOSpine Research Department.
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix: The AOSpine CROST (Clinician Reported Outcome Spine Trauma)
Cite this article
Sadiqi, S., Muijs, S.P.J., Renkens, J.J.M. et al. Development and reliability of the AOSpine CROST (Clinician Reported Outcome Spine Trauma): a tool to evaluate and predict outcomes from clinician’s perspective. Eur Spine J 29, 2550–2559 (2020). https://doi.org/10.1007/s00586-020-06518-0
- Outcome instrument
- AOSpine CROST
- Spine trauma
- Clinical parameters
- Radiological parameters
- Clinician perspective