Introduction

Manual therapists commonly use manual spinal stiffness assessment (MSSA) to guide diagnosis and treatment decisions for patients with non-specific spinal pain [1, 2]. In MSSA, therapists apply a force to the spine by hand in a posterior-to-anterior direction, then subjectively interpret the resultant resistance to displacement [3]. Although MSSA is an accessible option for clinical practice, the reliability and criterion validity of this method are poor [3,4,5,6,7,8,9,10]. Mechanical devices were created as an objective alternative to MSSA [11, 12]. Fig. 1 illustrates the evolution of spinal stiffness assessment devices. These devices typically assess spinal stiffness at one segmental level per indentation (single-level mechanical assessment) [12,13,14,15,16,17,18,19,20,21], which can be time-consuming and requires large amounts of data transcription.

Fig. 1

The evolution of spinal stiffness assessment

VerteTrack is a new device that provides bulk measurement of spinal stiffness and was developed to address the limitations of existing mechanical devices [22]. A novel feature of the VerteTrack is a pair of rolling indentation wheels that enables stiffness assessment of an entire spinal region, thereby minimising assessment time [22]. The VerteTrack has recently demonstrated excellent within-session test-retest reliability (intraclass correlation coefficient ICC3,k 0.95–1.00) and good to excellent between-day reliability (ICC3,k 0.82–0.93) in the clinical setting [23]; however, its accuracy is unknown. Bench-top performance of a test instrument can be evaluated through measurement of precision (random error) and bias (systematic error) of the system under test (Fig. 2) [24, 25]. Therefore, this study aimed to measure the bench-top performance of the VerteTrack under both single-level and multiple-level continuous test conditions.

Fig. 2

Components of measurement accuracy

Methods

Study design, setting and equipment

This was a laboratory-based accuracy study conducted in February 2018 [24]. Experiments were designed and conducted following the International Organization for Standardization standard ISO 5725-1 on the accuracy of measurement methods and results [24].

VerteTrack overview

The VerteTrack frame (width 1080 mm × height 1090 mm × length 1510 mm) suspends an aluminium gantry that supports movement of a rolling indenter head (RIH) in three axes: X-axis (longitudinal, superior-inferior), Y-axis (transverse, left-right) and Z-axis (vertical, posterior-anterior) via stepper motors (resolution = 0.007 mm, www.stepperonline.com, China) (Fig. 3). A string potentiometer (resolution = 0.020 mm, accuracy ±0.010 mm, TE Connectivity, USA) is used to record Z-axis displacement. A vertically-oriented laser (GLX Laser Site, Barska) assists the operator in aligning the RIH with pre-determined anatomical landmarks. During spinal stiffness assessment, the VerteTrack applies discrete loads via the addition of weighted plates (“plates”) with a nominal mass of 1 kg each (RIH + k plates; k = 0 to 6). These loads were selected because they have previously been used in VerteTrack studies [22, 23, 26] and are comparable to loads applied in other mechanical indentation studies [14, 20]. Plates were numbered and always added in the same order for each indentation cycle. For more detail about the VerteTrack, see Brown et al. 2017 [22].

Fig. 3

A labelled image of the VerteTrack during indentation. Legend: RIH – Rolling indenter head

Methods of indentation

The VerteTrack can perform two modes of indentation testing: single-level and multiple-level continuous indentation. Single-level indentation assesses a single spinal level and requires the operator to position the RIH directly above the target tissue. Loads are then applied incrementally to the spine in a posterior to anterior direction with the resulting deformation of the spinal tissues recorded (Z-axis displacement). Multiple-level continuous indentation requires the operator to first identify the spinal trajectory that the RIH will travel within the horizontal (X-Y) plane. This is achieved by manually aligning each spinous process (determined by palpation or ultrasonography) with the RIH using the embedded laser pointer. The laser points are memorised by the device and then replayed to move the RIH continuously along the same pre-defined trajectory for each successive load. The resolution of the RIH is identical to the resolution of stepper motors (0.007 mm).

Load and displacement precision

Load precision (random error) of the VerteTrack was estimated by the coefficient of variation (CV = SD / mean load) over 10 repetitions for each load. The load applied through the RIH was measured using recently calibrated digital scales (OHAUS, model TS4KD: resolution 0.1 g, accuracy ±0.07 g) (Fig. 4, panel a). Plates were then added to the RIH one at a time, and the measurement was repeated after each addition up to a total of 5 plates. Loads were converted to Newtons (N) by multiplying mass (kg) by gravitational acceleration (9.81 m/s²). Displacement precision (Z-axis, depth) of the VerteTrack was also estimated using the coefficient of variation over 10 repetitions at each of 6 discrete levels of the RIH on a custom-engineered wooden wedge that simulated tracking of a spinal sagittal curve (Fig. 4, panel b).
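The precision calculation described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the scale readings are hypothetical, the function names are invented for this example, and the use of the sample standard deviation (n − 1 denominator) is an assumption, as the text does not state which form of SD was used.

```python
import statistics

G = 9.81  # gravitational acceleration in m/s^2, as stated in the text


def mass_to_newtons(mass_kg):
    """Convert a mass reading in kilograms to a load in Newtons."""
    return mass_kg * G


def coefficient_of_variation(values):
    """Return the CV as a percentage: SD divided by the mean, times 100.

    Assumption: the sample SD (n - 1 denominator) is used.
    """
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical scale readings (kg) for ten repetitions of one load condition.
readings_kg = [2.830, 2.829, 2.831, 2.830, 2.828,
               2.832, 2.830, 2.829, 2.831, 2.830]

loads_n = [mass_to_newtons(m) for m in readings_kg]
cv_percent = coefficient_of_variation(loads_n)
print(f"mean load = {statistics.mean(loads_n):.3f} N, CV = {cv_percent:.3f}%")
```

Because the CV is a ratio, it is unaffected by the conversion from kilograms to Newtons, so the same function applies unchanged to the displacement readings in millimetres.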

Fig. 4

a Experiment one methodology: load applied by the VerteTrack (LoadVerteTrack) vs. reference load (Loadref). b Experiment two methodology: displacement measured by the VerteTrack (DisplacementVerteTrack) vs. reference displacement (Displacementref). c Experiment three methodology: stiffness measurements using multiple-level continuous indentation (Stiffnessmultiple) vs. single-level indentation (Stiffnesssingle). Legend: RIH – Rolling indenter head, d0 – displacement 0, d1 – displacement 1, d2 – displacement 2, d3 – displacement 3, d4 – displacement 4

Load and displacement bias

Load bias (systematic error) was estimated by comparing each load delivered through the VerteTrack against the same load measured externally. Mean load bias was estimated by calculating the differences between reference loads and loads measured through the VerteTrack, together with the 95% confidence interval of the difference [25]. Reference loads were calculated by adding the load of successive plates placed directly upon the digital scale (i.e. not through the VerteTrack RIH) to the load measured through the RIH alone. Each reference load (k plates; k = 1 to 5) was measured ten times. Displacement bias was estimated using the same method. Mean displacement bias was determined over 10 repetitions at each of 6 discrete levels by comparing the displacement reported by the VerteTrack with that measured by an external digital calliper (Wixey, WR200: resolution = 0.05 mm, accuracy ±0.025 mm) (Fig. 4, panel b).
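One way to compute a mean bias with a 95% confidence interval for paired measurements is sketched below. Several details are assumptions made for illustration and are not stated in the text: differences are taken as device minus reference, the interval is a t-based confidence interval of the mean difference, and the critical value is hard-coded for ten pairs (9 degrees of freedom). The data and function name are hypothetical.

```python
import math
import statistics

# Two-sided 95% t critical value for 9 degrees of freedom (n = 10 pairs).
T_CRIT_DF9 = 2.262


def mean_bias_with_ci(device, reference, t_crit=T_CRIT_DF9):
    """Return (mean difference, CI lower bound, CI upper bound).

    Assumed sign convention: difference = device - reference.
    """
    if len(device) != len(reference):
        raise ValueError("device and reference readings must be paired")
    diffs = [d - r for d, r in zip(device, reference)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff - t_crit * std_err, mean_diff + t_crit * std_err


# Hypothetical paired load readings (N) for one load condition.
device_n = [27.63, 27.66, 27.61, 27.65, 27.64, 27.62, 27.67, 27.63, 27.65, 27.64]
reference_n = [27.75, 27.77, 27.74, 27.76, 27.78, 27.75, 27.76, 27.74, 27.77, 27.76]

bias, lower, upper = mean_bias_with_ci(device_n, reference_n)
print(f"bias = {bias:.3f} N (95% CI: {lower:.3f} to {upper:.3f} N)")
```

For a different number of repetitions, the appropriate critical value would need to be passed through the `t_crit` parameter.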

Comparison of single-level and multiple-level continuous operation

A method-comparison experiment was conducted to evaluate the performance of VerteTrack for measurement of stiffness during multiple-level continuous and single-level (reference) modes of operation. Terminal stiffness values (i.e. the ratio of the maximum load to the maximum displacement) [26] were used in our analysis. The stiffness of a deformable foam test medium (AIREX® balance beam, Switzerland) was measured during both single-level and multiple-level continuous modes of operation to simulate measurement at a single vertebral level and across multiple vertebral levels respectively. The test medium was chosen to emulate the physiological stiffness encountered for the in vivo adult lumbar spine (range: 2–10 N/mm) [12, 20, 26]. Five equidistant locations (5 cm apart) were marked on the foam medium along a straight line (RIH landing, L1, L2, L3 and RIH lift-off) for stiffness assessment (Fig. 4, panel c).
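The definition of terminal stiffness given above (maximum load divided by maximum displacement) can be illustrated with a short sketch. The load and displacement values are hypothetical and the function name is invented for this example; it is not the VerteTrack software.

```python
def terminal_stiffness(loads_n, displacements_mm):
    """Return terminal stiffness in N/mm: maximum load / maximum displacement."""
    max_displacement = max(displacements_mm)
    if max_displacement <= 0:
        raise ValueError("maximum displacement must be positive")
    return max(loads_n) / max_displacement


# Hypothetical load-displacement samples recorded during one indentation.
loads_n = [16.6, 27.8, 38.7, 49.6, 60.7, 71.5]
displacements_mm = [2.1, 3.9, 5.8, 7.9, 9.9, 11.8]

stiffness = terminal_stiffness(loads_n, displacements_mm)
print(f"terminal stiffness = {stiffness:.2f} N/mm")
```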

Precision during single-level and multiple-level continuous indentation

Precision of the VerteTrack during measurement of stiffness on the test medium was estimated by the coefficient of variation (CV = SD / mean stiffness) over 300 trials for both single-level and multiple-level continuous indentation. Stiffness was measured during multiple-level continuous indentation (Stiffnessmultiple) and single-level indentation (Stiffnesssingle) at three discrete locations (L1, L2, L3) on the medium. Incremental loads (plates) were added to the RIH in a predefined sequence (RIH + k; k = 1 to 6). Between each trial, 90 s elapsed to allow any residual deformation to resolve. Between each cycle (six trials of increasing load), an additional 5 min elapsed to allow any residual deformation to resolve after the maximum load was applied to the medium. A total of ten cycles were performed.

Single-level versus multiple-level continuous indentation

Each trial for Stiffnessmultiple was compared to Stiffnesssingle. To quantify bias between indentation methodologies, we calculated the stiffness differences and the 95% confidence intervals of the difference. The bias was calculated and the raw stiffness data were plotted to assist interpretation. In addition, Lin’s Concordance Correlation Coefficient (LinCCC, Rc) was reported for load and displacement. LinCCC tests both agreement and linearity [27]. The strength of agreement was graded as “almost perfect” (Rc > 0.99), “substantial” (Rc > 0.95–0.99), “moderate” (Rc > 0.90–0.95), or “poor” (Rc < 0.90) [28]. Alpha was set at 0.05 for all statistical significance tests of agreement.
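As an illustration of the agreement statistic, the sketch below computes Lin's Concordance Correlation Coefficient using its standard definition, Rc = 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²), with variances and covariance taken over n, and then applies the grading thresholds listed above. The function names and data are hypothetical, and grading a value of exactly 0.90 as “poor” is an assumption, since the stated thresholds leave that boundary undefined.

```python
import statistics


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    # Variances and covariance use the 1/n form.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


def grade_agreement(rc):
    """Map Rc to the descriptive grades used in the text."""
    if rc > 0.99:
        return "almost perfect"
    if rc > 0.95:
        return "substantial"
    if rc > 0.90:
        return "moderate"
    return "poor"  # assumption: exactly 0.90 is graded as poor


# Hypothetical device and reference loads (N).
device_n = [27.6, 38.5, 49.5, 60.6, 71.3]
reference_n = [27.8, 38.7, 49.6, 60.7, 71.5]

rc = lins_ccc(device_n, reference_n)
print(f"Rc = {rc:.4f} ({grade_agreement(rc)})")
```

Unlike Pearson's correlation, the (mean_x − mean_y)² term in the denominator penalises a constant offset between the two methods, which is why the coefficient reflects agreement as well as linearity.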

Results

Load and displacement precision

Six measures of RIH loading (LoadVerteTrack) were obtained, beginning with the RIH only (no plates added; 16.557 N, 95%CI: 16.470 to 16.645 N) and then increasing by the addition of single plates. The coefficient of variation (CV) ranged from 0.03 to 0.27% depending upon the applied load (Table 1).

Table 1 Precision of VerteTrack applied load

Six discrete RIH displacements were then measured (DisplacementVerteTrack), beginning at a baseline value of 60.03 mm (95%CI: 60.01 to 60.05 mm, highest level) and decreasing to 12.08 mm (95%CI: 12.00 to 12.16 mm, lowest level). The CV ranged from 0 to 0.32% depending upon the level of the wedge (Table 2).

Table 2 Precision of the VerteTrack RIH displacement

Load and displacement bias

The calculated reference loads (Loadref) ranged from 27.757 N (95%CI: 27.701 to 27.814 N, RIH + 1 plate) to 71.461 N (95%CI: 71.420 to 71.502 N, RIH + 5 plates). There was a statistically significant systematic mean bias for the VerteTrack load (LoadVerteTrack) compared to the reference load (Loadref) of −0.123 N (95%CI: −0.182 to 0.428 N, p < .001) (supplementary figure 1, panel a). Lin’s Concordance Correlation Coefficient showed almost perfect agreement (Rc = 1.0, 95%CI: 1.0 to 1.0) between LoadVerteTrack and Loadref (supplementary figure 2, panel a).

The reference displacement (Displacementref) as measured by the digital calliper ranged from 12.03 mm (95%CI: 11.98 to 12.08 mm) to 60.08 mm (95%CI: 60.02 to 60.13 mm). There was no statistically significant (p = .001) systematic bias for the VerteTrack displacement (DisplacementVerteTrack) compared to the reference displacement (Displacementref) (mean difference = 0.02 mm, 95%CI: −0.09 to 0.14 mm, p < .001) (supplementary figure 1, panel b). Lin’s Concordance Correlation Coefficient demonstrated almost perfect agreement (Rc = 1.0, 95%CI: 1.0 to 1.0) between DisplacementVerteTrack and Displacementref (supplementary figure 2, panel b).

Stiffness measurements

Performance at different loads

For the purposes of this study, we classified low load conditions as 27.757 N and 38.662 N (1 and 2 plates), and moderate to high load conditions as 49.583 N, 60.687 N and 71.461 N (3–5 plates). Terminal stiffness values ranged from 6.09 to 8.81 N/mm under low load conditions and from 5.70 to 6.38 N/mm under moderate to high load conditions. The terminal stiffness grand mean was 7.43 N/mm under low load conditions and 6.03 N/mm under moderate to high load conditions. Fig. 5 shows a graphical representation of the effects of low versus moderate to high loading on stiffness values.

Fig. 5

Multiple-level continuous and single-level stiffness data across three locations (L1, L2 and L3) on the AIREX balance beam. L1: trial 1–50, L2: trial 51–100, L3: trial 101–150. Histogram for each location

Performance of single-level versus multiple-level continuous indentation modes

To determine the precision of stiffness measured by the VerteTrack (Stiffnessmultiple and Stiffnesssingle) at three locations on the AIREX balance beam, CV was calculated for each load (Table 3). The CV at each load for Stiffnesssingle ranged from 2.0 to 2.3% and Stiffnessmultiple ranged from 1.4 to 3.2%.

Table 3 Coefficient of variation for Stiffnessmultiple and Stiffnesssingle

Multiple-level continuous indentation (Stiffnessmultiple) and single-level indentation (Stiffnesssingle) were compared at three discrete locations (L1, L2 and L3) on the AIREX balance beam. There was a negative systematic bias for Stiffnessmultiple compared to Stiffnesssingle of −0.25 N/mm (95%CI: −0.67 to 0.17, p < 0.001) (supplementary figure 1, panel c).

Discussion

This is the first mechanical spinal stiffness device to be evaluated for bench-top performance (accuracy), which is essential to establish the internal validity of the VerteTrack. Both single-level and multiple-level continuous indentation modes demonstrated high levels of precision and agreement, despite a small negative systematic bias for multiple-level continuous compared to single-level indentation (−0.25 N/mm, equivalent to 4% lower stiffness). It is unclear whether this difference is clinically relevant, as there are currently no published data to support a minimal clinically important difference (MCID) for the assessment of spinal stiffness, nor standards for different indentation modes [3, 11]. More broadly, mechanical devices must first be used to collect baseline spinal stiffness data in a human population in order to determine an MCID; at the same time, an MCID cannot be calculated without understanding the performance of the measurement, as described here. Such baseline data will allow for more robust conclusions regarding differences between single-level and multiple-level continuous indentation.

The plot of raw single-level and multiple-level continuous stiffness values demonstrated that higher stiffness values were obtained under low loads than under moderate-high loads, where stiffness remained at approximately 6 N/mm (Fig. 5). This is likely attributable to the properties of the viscoelastic foam medium. In vivo testing observed the opposite pattern, that is, a positive relationship between load and stiffness [23]. It would be advantageous to identify, in a human population, a specific load that yields the most useful spinal stiffness information, as reducing the number of loads would further reduce assessment time. In addition, results from in vivo testing suggest that the device provided reliable stiffness values irrespective of load [23].

Limitations

This study was performed on a viscoelastic foam medium, without the presence of physiological properties known to influence spinal stiffness (such as breathing, spinal extensor muscle contraction and abdominal muscle contraction) [3, 11]. It is also unclear to what extent the observed phenomena can be attributed to the medium and whether similar findings would be observed in a human population. To quantify bias, multiple-level continuous stiffness measurements were compared to a reference standard. Single-level indentation was used as a proxy reference standard, given that it is the more established method of indentation reported in the literature. Unfortunately, there is no ‘gold standard’ to ascertain spinal stiffness in human participants.

Clinical utility of mechanical measurement of spinal stiffness

It is unclear whether the use of mechanical spinal stiffness measurement devices in a clinical setting would aid in the patient’s diagnosis, prognosis, treatment, or clinical outcomes. There are inconsistencies in the literature regarding the relationship between pain, disability and spinal stiffness; however, emerging research into the sub-grouping of patients into responders and non-responders to spinal manipulative therapy has yielded promising results [16]. Mechanical spinal stiffness devices will have limited utility in clinical practice until further research can identify specific populations that may benefit from the assessment procedure.

Conclusion

The VerteTrack demonstrated good bench-top performance through high precision, linearity, and low systematic bias compared to reference values. When combined with recently published clinical reliability data, the VerteTrack demonstrates high levels of internal validity. The novel multiple-level continuous indentation mode offers potential for increased time efficiency in future clinical trials without compromise in stiffness measurement or patient comfort.