Introduction

The assessment of spinal curvatures is helpful for the understanding of low back pain syndromes [1]. A reliable imaging procedure of spinal alignment may offer classification models of spinal form variations leading to different therapy options [2, 3], or might also be used for therapy monitoring [4–6].

X-ray imaging still serves as the ‘gold standard’ for the assessment of spinal form, spinal deformities or structural vertebral disorders [7]. However, non-radiating devices have been established in past decades for the non-invasive assessment of posture and spinal alignment: lateral photometric imaging [3, 8], electro-mechanical inclinometers for back surface reconstruction, e.g. Spinal Mouse [9], and three-dimensional raster stereography back shape reconstruction devices. In the latter, the examiner’s influence is minimised by the optical, non-contact character of the measurement, which needs no markers or detectors on the skin surface [10].

For the clinical environment or research applications, validity and reliability of those biomechanical assessment systems have not yet been sufficiently demonstrated. For the examination of spinal mobility and back surface reconstruction by means of inclinometers, satisfactory reliability studies exist [11–14]. For raster stereography, however, there is still a lack of reliability studies covering all parameters offered for a three-dimensional spinal form analysis.

So far, intra- and inter-examiner reliability of raster stereographic sagittal plane spine shape parameters has been evaluated and published internationally [15, 16]. But those studies were limited: frontal and coronal plane parameters were not included, and data acquisition for test and retest took place on the same day. One study, published in German, included frontal plane parameters, or axial vertebral deviations, but was limited due to the sample characteristics [17].

As the role of examiner influences appeared not to be crucial (no markers or detectors are set on the skin surface or conducted by an examiner), and with respect to the known relation between intra- and inter-examiner reliability [16], the present investigation focussed on the intra-examiner reliability, the intra-individual variability, and the group mean stability of raster stereography parameters in all three dimensions (sagittal, frontal, coronal plane), as recommended earlier [16], in four repeated measures within 1 week.

Methods

Subjects

A total of 20 persons (age 25.4 ± 5.5 years; BMI 22.8 ± 2.7 kg/m²), females (n = 9) and males (n = 11), were recruited as volunteers having been explicitly informed about the investigation and the non-radiating character of physical examinations (Table 1). Data were anonymised after the examination and analysed for the purpose of a reliability analysis based on four repeated measures: between-instants within 5 min on the same day, between-day at the same time the following day, and between-week at the same time the following week.

Table 1 Anthropometrics for the whole sample and separated for males and females (mean ± standard deviation)

The participants—all of them associated with our institution—were included, if there was no diagnosis dealing with back pain complaints, no serious back pain history for 2 years, and no back pain at all in the last 6 months. Indeed, there was no actual back pain (CR10 pain scale: 0.8 ± 1.1 pts.) [18] nor were there any functional deficits due to a back pain history (Oswestry Disability Index: 4.0 ± 3.3 %) [19] in the whole sample. Therefore, accompanying confounding effects on spinal shape could be excluded.

Equipment

Spine shape parameters were calculated by means of video raster stereography (Formetric®-System, Diers International, Schlangenbad, Germany), a non-invasive device for an indirect and high resolution back shape reconstruction (reconstruction error 0.2–0.5 mm; resolution 10 pts./cm²) [10] (Table 2).

Table 2 Spine shape parameters, shortcuts, and a description of anatomy and corresponding geometry

Specific back surface landmarks were recognised automatically to build up a Cartesian coordinate system (Fig. 1): the vertebra prominens (VP), the beginning of the rima ani representing the sacrum point (SP), and the right and left lumbar dimples (DR and DL, respectively) representing the position of the spina iliaca posterior superior (SIPS) of the pelvis bones. This coordinate system served as calibration reference frame for a three-dimensional surface reconstruction using triangulation equations that ensured a valid correlation between back shape reconstruction and radiographic assessments of the anatomy of spine and pelvis [20, 21]. For a better understanding of geometry and corresponding anatomical landmarks, spine shape parameters serving as dependent variables were illustrated (Fig. 2).

Fig. 1

Data assessment in free bipedal standing, raster projection lines with animated landmarks (yellow dots) and vertebral bodies (C7 red, T1–T12 blue, L1–L4 green) on back surface, back surface reconstruction with red areas (convex curvature), blue areas (concave curvature), and yellow dots (axis for coordinate system: VP–SP and DL–DR), and frontal plane spine shape parameters: Trunk imbalance, Pelvis imbalance, and Scoliosis angle (Table 2). VP vertebra prominens, SP sacrum point, DM midpoint between dimples, DL left and DR right dimple

Fig. 2

Illustration of spinal alignment curves and back shape reconstruction parameters (Table 2)

Test protocol

For the static assessment of spinal alignment, the participants were given only a few instructions: they had to stand on a platform with their backs to the camera, their heels placed at the end of the platform, staying immobile while looking straight ahead (Fig. 1). Back shape was recorded over a time period of 5 s (10 frames/s), and spine shape parameters were calculated as an average of these 50 frames, nearly in real time.

For the dynamic examination of lumbar mobility, the participants started as they did for the static assessment. Then they clasped their head above the neck to present their vertebra prominens to the camera and began a backward bending movement up to the maximum extension. Evasion manoeuvres in the hip or knee joint could be self-controlled with the help of a contact bar at the back of their thighs. This bar served as a tactile feedback instrument for movement quality control, because knee bending and hip evasion movements would have led to an upper body backward inclination, but not to the intended maximum segmental hyperextension of the lumbar spine.

Backward bending was recorded over a time period of 10 s (10 frames/s). The mobility was calculated as the difference between the unforced starting and the maximum hyperextension position at the end of the 10 s (Fig. 3). The examiner was well experienced and used standard instructions. If the examiner decided that the execution of the movements had not been totally correct, the test was repeated.
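The two calculations of the test protocol (averaging over the frames of the static recording, and taking the difference of lordosis angles for the mobility test) can be sketched as follows. This is a minimal illustration and not the device software: the function names, the use of the first and last frame as starting and end position, and the sign convention (end minus start) are assumptions made here.

```python
from statistics import mean

def static_parameter(frames):
    """Average one spine shape parameter over all frames of the static recording
    (5 s at 10 frames/s gives 50 frames in the protocol described above)."""
    return mean(frames)

def lumbar_mobility(lordosis_angles):
    """Difference (degrees) between the lordosis angle at the end of the task
    and at the unforced starting position.
    Assumption: first frame = start, last frame = maximum hyperextension."""
    return lordosis_angles[-1] - lordosis_angles[0]

# Hypothetical data: 50 static frames and 100 backward bending frames (10 s).
static_frames = [40.0 + 0.1 * (i % 3) for i in range(50)]
bending_frames = [40.0 + 15.0 * i / 99 for i in range(100)]

print(round(static_parameter(static_frames), 2))
print(round(lumbar_mobility(bending_frames), 2))
```

Averaging over several frames at the start and the end of the recording instead of single frames would be an equally plausible choice; the source does not specify this detail.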

Fig. 3

Spinal mobility—backward bending in the sagittal plane with surface reconstructions at the upright standing starting point and at the end of the task (left) and lumbar flexibility angle (°) illustrated as the difference of the lumbar lordosis angles between upright standing and maximally hyper-extended position (right)

Statistics

Data were described as mean ± standard deviation (SD). Normal distribution was verified (Kolmogorov–Smirnov test). Group mean differences were tested (one-way ANOVA for repeated measures). Significance was accepted at a level of P ≤ 0.05. Intra-individual variability was expressed as standard error of the measurement (SEM) and as coefficient of variation (CV %) based on four repeated measures, being displayed as group means for the whole sample. The Intra-Class-Correlation coefficient with corresponding confidence intervals (ICC ± 95 % CI) was calculated pairwise and for the total of all measures. Coefficients of more than 0.90 indicated a high, 0.80–0.89 a good, 0.70–0.79 a fair, and less than 0.70 a poor reliability.
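The variability and reliability measures named above can be sketched in a few lines. The source does not state which SEM definition or which ICC model was used, so the following choices are assumptions: SEM is taken as the square root of the mean within-subject variance, CV % as the group mean of each subject's SD divided by the subject's mean, and the ICC as a two-way, absolute agreement, single measure coefficient, often written ICC(2,1). All function names and the sample data are hypothetical; confidence intervals are not computed here.

```python
from math import sqrt
from statistics import mean, stdev, variance

def cv_percent(data):
    """Group mean of the per-subject coefficient of variation (SD / mean * 100).
    Not defined for a subject whose mean is zero."""
    return mean(stdev(row) / abs(mean(row)) * 100 for row in data)

def sem_within(data):
    """Assumed SEM: square root of the mean within-subject variance."""
    return sqrt(mean(variance(row) for row in data))

def icc_2_1(data):
    """Assumed ICC model: two-way, absolute agreement, single measure."""
    n, k = len(data), len(data[0])          # subjects, repeated measures
    grand = mean(v for row in data for v in row)
    ss_rows = k * sum((mean(row) - grand) ** 2 for row in data)
    ss_cols = n * sum((mean(row[j] for row in data) - grand) ** 2 for j in range(k))
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    msr = ss_rows / (n - 1)                                      # between subjects
    msc = ss_cols / (k - 1)                                      # between sessions
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))   # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def classify(icc):
    """Reliability bands as stated in the text, reading 'fair' as 0.70 to 0.79."""
    if icc >= 0.90:
        return "high"
    if icc >= 0.80:
        return "good"
    if icc >= 0.70:
        return "fair"
    return "poor"

# Hypothetical kyphosis angles (degrees): 5 subjects x 4 repeated measures.
angles = [
    [48.0, 47.5, 49.0, 48.5],
    [55.0, 54.0, 56.5, 55.5],
    [41.0, 42.0, 40.5, 41.5],
    [60.0, 61.0, 59.0, 60.5],
    [52.0, 51.0, 52.5, 53.0],
]
icc = icc_2_1(angles)
print(round(cv_percent(angles), 1), round(sem_within(angles), 2),
      round(icc, 3), classify(icc))
```

Each row holds one subject and each column one measurement occasion, so a pairwise coefficient is obtained by passing only two columns.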

Results

Stability

There were no significant changes within group means from the first to the last time of spinal form assessment in any parameter (P > 0.050, η² ranging between 0.001 and 0.180) (Table 3).
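As an illustration of how such a stability check can be computed, the sketch below derives the F value and an effect size for a one-way repeated measures design. It assumes that the reported η² is the partial η² (effect sum of squares divided by effect plus error sum of squares), which the source does not state; the function name and the data are hypothetical.

```python
from statistics import mean

def rm_anova_effect(data):
    """One-way repeated measures ANOVA on a subjects x occasions table.
    Returns (F value, partial eta squared) for the occasion effect."""
    n, k = len(data), len(data[0])
    grand = mean(v for row in data for v in row)
    ss_subjects = k * sum((mean(row) - grand) ** 2 for row in data)
    ss_time = n * sum((mean(row[j] for row in data) - grand) ** 2 for j in range(k))
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    ss_error = ss_total - ss_subjects - ss_time
    f_value = (ss_time / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))
    partial_eta_sq = ss_time / (ss_time + ss_error)
    return f_value, partial_eta_sq

# Hypothetical parameter values: 3 subjects x 4 repeated measures.
table = [
    [10.0, 11.0, 10.0, 12.0],
    [20.0, 19.0, 21.0, 20.0],
    [15.0, 15.0, 16.0, 14.0],
]
f_value, eta_sq = rm_anova_effect(table)
print(round(f_value, 3), round(eta_sq, 3))
```

The P value follows from the F distribution with k − 1 and (n − 1)(k − 1) degrees of freedom; it is left out here to keep the sketch free of external libraries.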

Table 3 Descriptives (mean ± SD), intra-individual variability expressed as relative (CV %) and absolute (SEM) values, and group mean differences (one-way ANOVA) in four repeated measures

Variability

Intra-individual variability of four repeated measures within 1 week revealed little absolute variations (SEM) and more discriminating relative values (CV), ranging from 4.4 % (thoracic kyphosis) to 98.2 % (trunk inclination) (Table 3).

  • Parameters describing the sagittal curvature (thoracic, lumbar, sacral sway) varied least (CV 4.4–7.9 %, SEM 0.6°–0.9°).

  • Parameters describing scoliosis determinants, i.e. vertebral rotation (coronal plane) or side deviation (frontal plane), varied between 14.1 and 20.5 % (SEM 0.3°–0.9°, and 0.4–0.7 mm, respectively).

  • Parameters describing the frontal and sagittal plane upper body global position as well as frontal plane pelvis position varied most widely (CV 35.8–98.2 %, SEM 0.7–3.0 mm).

  • Lumbar mobility test results varied more widely (CV = 12.9 %, SEM = 1.5°) than lumbar angles assessed under static conditions (CV = 4.9 %, SEM = 0.8°).

Reliability

Short-term reliability assessed on the same day was higher than the between-day reliability, except for the pelvis torsion. Overall correlation coefficients were affected by the short-term coefficients, and therefore showed higher reliability values than the between-day analyses (Table 4).

Table 4 Reliability coefficients (ICC ± CI 95 %) for pairwise correlations and the total of four tests

Between-day and between-week reliability was comparable, except for the pelvis torsion (between-day was higher), for frontal plane scoliosis parameters (scoliosis angle, vertebral side deviation amplitude), and for the lumbar mobility (between-week was higher).

With respect to geometrical dimensions, we found specifically differing reliability coefficients (Table 4):

  • Reliability of sagittal plane parameters (trunk inclination, thoracic kyphosis, pelvis tilt, lumbar lordosis) was high (ICC 0.938–0.994), irrespective of the analysed interval.

  • Reliability of scoliosis associated parameters (vertebral rotation and side deviation, and scoliosis angle) was good or high (ICC 0.857–0.946) for the short-term interval, but lower for the between-day and between-week intervals (ICC 0.658–0.877).

  • Trunk imbalance assessment revealed poor or fair reliability coefficients (ICC 0.678–0.786).

  • Reliability of pelvis imbalance was fair; it was good only for the short-term interval (ICC 0.743–0.825).

  • Pelvis torsion assessment was highly reliable between-days (ICC = 0.909), but almost fair for other intervals (ICC 0.721–0.775), while the overall coefficient appeared to be good (ICC = 0.890).

  • Reliability of lumbar mobility testing was good or even high (ICC 0.862–0.969), but lower than for the assessment of lordosis angles under static conditions (ICC 0.972–0.990).

Discussion

Stability

In line with earlier studies, we analysed group mean stability and did not find significant changes in any spine shape parameter. This indicates parameter stability across four repeated measures within 1 week, as established previously for global sagittal spinal form parameters assessed using a surface inclinometer [14]. Therefore, longitudinal monitoring can be assumed not to be affected by systematic processes like learning or familiarisation.

Variability

By comparing variations within repeated measures of the thoracic kyphosis angle in a ‘back phantom’ and a human being, Goh et al. [15] could identify behavioural stance positioning effects, rather than technical reasons, as the major confounder for reproducibility (‘phantom’ 0.4–1.3 % vs. ‘volunteer’ 2.4–3.0 %). In the present study, intra-individual variability expressed as absolute values (SEM) revealed only little parameter variations, ranging from less than one degree or millimetre, respectively, to maximally 1.5° (lumbar mobility) or 3 mm (trunk inclination) (Table 3). Earlier reliability studies dealing with raster stereography either did not focus on the intra-individual spread of spine shape parameters in repeated measures [16] or calculated the relative variation (CV %) only [15]. Mannion and collaborators [14] analysed between-day intra-individual variations as standard error (SEM) for a skin surface detecting inclinometer (Spinal Mouse®). They found considerably higher standard errors for their global thoracic and lumbar angles (SEM: 4.2° and 2.5°, respectively) than we did (SEM < 1°). These variations were, inter alia, due to positioning. But contrary to the non-touch raster stereography, the practical application of the Spinal Mouse® inclinometer undoubtedly suffered from inherent inter-examiner influences: spine shape assessment results were determined not only by the individuals’ back shape and by varying posture but also, not least, by the degree (i.e. lack) of the examiner’s experience.

Expressed as relative variation (CV %), pelvis and global upper body position varied more (36–98 %) than scoliosis determinants (14–21 %), while sagittal spine shape parameters varied least (4–8 %) in the present study. As there were no comparable studies covering all three dimensions of spinal alignment and calculating coefficients of variability, our results could be discussed only for the kyphosis angle, where Goh et al. [15] found a relative variation of about 3 %. Bearing in mind that those findings were based on repeated measures on the same day, our result of 4.4 % kyphosis angle variation assessed within 1 week was considered comparably good.

Reliability

In the present study, sagittal spine shape parameters showed the highest reliability coefficients (ICC 0.938–0.990). Functional testing of lumbar mobility was also almost highly reliable (ICC 0.862–0.969), while reliability for scoliosis determinants, frontal plane imbalance of pelvis and upper body was ranging from poor to high coefficients (ICC 0.658–0.946). In general, short-term reliability was higher than for the between-day or between-week interval.

Our results were in line with earlier studies examining sagittal spinal alignment within repeated measures on the same day (ICC or Cronbach’s α: 0.92–0.99) [15, 16]. Looking at the between-day reliability, comparisons were possible only with the Spinal Mouse®, where global sagittal sway parameters showed coefficients slightly lower (ICC: 0.73–0.92) than observed in the present study [14]. Obviously, the above-mentioned examiner’s influence (manually conducted skin surface detection) was a confounding variable that affected reproducibility of the inclinometer remarkably more than that of the non-touch raster stereography, in static posture as well as in dynamic mobility testing.

Reliability of scoliosis determinants has not yet been investigated as between-day or between-week reliability anywhere else. So far, the short-term reliability has only been examined within a sample of scoliosis patients: vertebral rotation and side deviation could be established as being highly reliable (r > 0.94 and r > 0.96, respectively) [17]. Those reliability coefficients were higher than in the present study (ICC: 0.86–0.95), which investigated volunteers without any back deformities, probably for statistical reasons. First, the wider spread of parameter values among the scoliosis patients favours higher correlation coefficients than in the more homogeneous non-scoliotic individuals of the present study. Second, ICC values generally tend to be lower than Pearson correlation coefficients, because the ICC takes into account the absolute differences between an individual’s values, which are ignored by the Pearson correlation coefficients used in the earlier study [17].
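The statistical point about Pearson correlation versus ICC can be made concrete with a small, purely hypothetical example: if a retest reproduces every value with a constant offset, the Pearson coefficient stays at 1, whereas an absolute agreement ICC drops. The ICC model used here, a two-way, absolute agreement, single measure coefficient, is an assumption, as are the function names and the data.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def icc_absolute(x, y):
    """Two-way, absolute agreement, single measure ICC for test and retest."""
    data = list(zip(x, y))
    n, k = len(data), 2
    grand = mean(v for row in data for v in row)
    ss_rows = k * sum((mean(row) - grand) ** 2 for row in data)
    ss_cols = n * ((mean(x) - grand) ** 2 + (mean(y) - grand) ** 2)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical angles: the retest is shifted by a constant 3 degrees.
test = [10.0, 12.0, 14.0, 16.0, 18.0]
retest = [v + 3.0 for v in test]

print(round(pearson_r(test, retest), 3))     # offset is ignored
print(round(icc_absolute(test, retest), 3))  # offset lowers agreement
```

The same reasoning applies to the spread of the sample: with a narrower range of true values, the between-subject variance shrinks relative to the error variance, and both coefficients decrease.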

Reliability of frontal plane parameters, trunk and pelvis imbalance, for the short-term analysis was almost fair (ICC = 0.79) or even good (ICC = 0.83), respectively, but it should be kept in mind that between-day and between-week coefficients were remarkably lower for scoliosis determinants as well as for frontal plane parameters, marking a poor to good reliability (Table 4), which should be taken into account especially for spine shape monitoring investigations.

To the authors’ knowledge, there have been no earlier internationally published investigations discussing frontal plane spine shape parameter reliability. Technical background reasons might possibly be helpful for the discussion of pelvis imbalance, pelvis torsion, and trunk imbalance, which all depend more than other spine shape parameters on the correct automatic recognition of the lumbar dimple anatomy.

Lumbar dimple position represented the SIPS as bony pelvis structures necessary to build up the Cartesian coordinate system serving as calibration frame for back shape reconstruction [10, 22]. Confounding soft tissue influences should be considered especially for the lumbar dimple area. Tissue properties vary between individuals and are affected by subcutaneous matter more strongly here than in other back surface regions. This might lead to differing errors within repeated measures, affecting reliability results and the error of measurements [21], although Body Mass Index could not be identified as a relevant confounder so far [15, 16].

Limitations

With respect to the knowledge of earlier investigations, showing inter-examiner reliability coefficients very similar to or even higher than the intra-examiner reliability coefficients for sagittal plane parameters [16], and taking into account the automatic and non-touch character of raster stereography data assessment with an assumed only minor examiner influence, we did not test inter-examiner reliability. This was considered to be reasonable for the static data assessment of spinal alignment, but might be limiting for the dynamic assessment of lumbar mobility, where examiner instructions and decisions play a more influential, potentially confounding part.

Conclusions

Reproducibility of the non-invasive spine shape reconstruction in normal non-scoliotic individuals by means of video raster stereography is supposed to be helpful for clinical applications in screening and monitoring, although confidence intervals of reliability coefficients were indicating a lack of certainty for a high reliability in spine shape parameters apart from the sagittal plane. Effect analyses should take into account the degree of reliability differing in several spine shape parameters, whenever intervention effects are discussed. Furthermore, there is a need for additional research dealing with scoliosis patients of different degrees of spinal deformities.