Background

Low back pain (LBP) is the most burdensome musculoskeletal condition globally, affecting ~7.5% of the world’s population (~577 million people) [1]. For up to 90% of people presenting with LBP, the specific cause of their pain cannot be clearly identified, resulting in a label of non-specific LBP [2]. The current treatment of LBP mainly focuses on pain management, while the causes of pain are rarely addressed. Quantitative assessments of the spine and of patient complaints related to LBP may help identify causes, improve the management of this condition, and reduce health care system costs.

Advances in science and technology over the past few decades have made several devices available to objectively assess clinical characteristics of patients including spinal stiffness. Stiffness is considered an important spinal biomechanical measure and has long been recognized by both patients and clinicians as one of the characteristic features of the back [3]. Therefore, stiffness has been widely used in the management of patients with back pain for diagnosis, prognosis, clinical decision-making, and the evaluation of manipulative techniques [4].

An increase or decrease in spinal stiffness has been found to be related to LBP. Specifically, previous studies have demonstrated that some patients with LBP have abnormal levels of spinal stiffness [5] and that these patients experience an immediate decrease in spinal stiffness following spinal manipulative therapy that is sustained for 1 week [6, 7]. Moreover, researchers reported an increase in posteroanterior (PA) stiffness in participants with LBP compared to when participants had little or no pain, while asymptomatic controls showed insignificant changes in PA stiffness over time [8]. A reduction in stiffness has also been shown to be associated with self-reported measures of disability [6, 9]. These findings suggest that restoration of normal spinal stiffness and mobility plays an important role in some patients with LBP by improving spinal function and reducing pain, although a causal relation between stiffness and these outcomes has not been confirmed. Therefore, further exploration of spinal stiffness assessment is warranted. While various spinal stiffness-testing devices are available to objectively evaluate spinal complaints [4, 5], there is no standard operating protocol for spinal stiffness measurement.

Having a standard data collection protocol for spinal stiffness assessment would facilitate comparison of devices and data between studies. Our research team developed a novel device, the VerteTrack, to improve on single-site spinal indentation by employing a loaded rolling wheel system. Several identical devices have been manufactured and have been in use in multiple research centers over the past 6 years. In this Delphi study, our goal was to develop a best-practice protocol for evaluating spinal stiffness in human participants using VerteTrack, a spinal stiffness measurement device shown to be safe [10], reliable [11], and accurate [12].

Methods

This study used a standard Delphi methodology to achieve consensus. The Delphi method is a reliable and structured method of obtaining a consensus of opinion from a group of experts or knowledgeable participants [13] in areas where existing research is limited. The Delphi method is particularly recommended for areas where controversy, debate, or a lack of clarity exist [14].

Selection of participants

As our lab manufactured the device in question, we know of all the research centers that possess the device and all the staff who were trained on it. We contacted these centers and asked them to provide us with an updated contact list of those who had been trained on or had used the device since their initial training session. Thus, all individuals trained in VerteTrack methods and/or having previous experience using the VerteTrack device were invited to participate in the Delphi process (n = 25 individuals from 9 different institutions in 7 different countries). Potential participants were eligible if they were willing to participate, had access to the internet over the course of the study, and were able to commit time to complete the surveys. After being informed about the project, all participants provided written consent through a consent question added to the start of Round 1.

Delphi-survey procedure

The Delphi survey involved three sequential rounds of deidentified online questionnaires provided over 4 months (Sep-Dec 2020). Study data were collected and managed using REDCap [15] electronic data capture tools provided by the Women & Children’s Health Research Institute at the University of Alberta. As described above, the research centers equipped with the device provided the e-mail addresses of those who had been trained on or had collected data using the device, and these addresses were entered into REDCap. All potential participants were sent an invitation e-mail to participate in the Delphi process containing a link to the online survey. Participants were requested to complete each questionnaire within 2 weeks. Two automated e-mail reminders per round were sent to non-responders, at 1 week and on the day before the due date. Participants who were not able to complete a questionnaire within the 2 weeks were provided with additional reminders and extra time to respond. Each survey took 20–30 min to complete. Participants were allowed to save their answers and return to complete the questionnaire over several sessions.

Prior to the commencement of this study, consensus was defined as at least 70% of the participants in Rounds 2 and 3 either agreeing (strongly agreed or agreed) or disagreeing (strongly disagreed or disagreed) with including a statement in the final protocol. This level of agreement has been considered appropriate in previous Delphi studies [13, 16, 17, 18, 19]. Figure 1 summarizes the stages of the Delphi method in this study.
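As an illustration of the consensus rule described above, the following minimal sketch classifies a single statement from its Likert responses. The function name `classify_statement`, the output labels, and the example responses are hypothetical and are not taken from the study materials; only the 70% threshold and the five response options come from the text.

```python
from collections import Counter

# Response options as worded in the survey description.
AGREE = ("strongly agree", "agree")
DISAGREE = ("strongly disagree", "disagree")

def classify_statement(responses, threshold_pct=70):
    """Classify one statement given its list of Likert responses.

    Returns a hypothetical label: "consensus_agree", "consensus_disagree",
    or "no_consensus". Integer arithmetic avoids floating-point edge cases
    when the count sits exactly on the threshold.
    """
    counts = Counter(r.strip().lower() for r in responses)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses supplied")
    n_agree = sum(counts[c] for c in AGREE)
    n_disagree = sum(counts[c] for c in DISAGREE)
    if n_agree * 100 >= threshold_pct * total:
        return "consensus_agree"
    if n_disagree * 100 >= threshold_pct * total:
        return "consensus_disagree"
    return "no_consensus"

# Illustrative data only: 14 of 20 panelists agreeing sits exactly at 70%.
example = ["agree"] * 9 + ["strongly agree"] * 5 + ["neither agree nor disagree"] * 6
print(classify_statement(example))
```

With a panel of 20, this rule means a statement needs at least 14 responses on the same side of the scale to reach consensus.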

Fig. 1 Stages of the Delphi technique to standardize spinal stiffness measurement using VerteTrack

MH and GNK designed Round 1 of the survey. To improve the structure and readability of the questions, the Round 1 questionnaire was first piloted with three colleagues. Based on their feedback, Round 1 questions were revised and finalized. This round included questions regarding basic demographic information and 21 open-ended questions inquiring about participant recruitment for VerteTrack testing, device safety, instructions given to research participants, and technical issues. This round aimed to review the comprehensiveness and relevance of the items and provide suggestions for the eventual protocol. Items for Round 2 of the survey were generated from Round 1 comments that suggested removing, aggregating, or retaining items.

Only those who completed Round 1 were invited to participate in Round 2. In this round, each participant received a survey comprising 171 statements. The goal of this round was to reach consensus on a standard protocol. In Round 2, participants were asked to indicate their opinion anonymously by rating statements on a five-point Likert scale for agreement (“strongly agree”, “agree”, “neither agree nor disagree”, “disagree”, “strongly disagree”). Additionally, a free-text comment section for each question was available for participants to express any further thoughts or opinions. Round 2 also included four new open-ended questions derived from Round 1. Participants were required to rate every item in order to proceed through the questionnaire.

Round 3 of the study comprised the same list and grading scale as Round 2 with an additional graphical description of findings from the previous round. The graphic information identified the percentage of total respondents that selected each possible score for the given item in Round 2. The respondents, therefore, were given an opportunity to modify or confirm their answers after viewing the scoring results using the same Likert scale from the previous round. The revised and new statements proposed by participants were added in Round 3 yielding a total of 183 statements. Using the consensus results obtained from Round 3, the authors created a written protocol for use of the VerteTrack device in collecting spinal stiffness measures.

Analysis

Deidentified data were analyzed by encoding participants with their survey ID numbers. Data from the REDCap tool were downloaded into Microsoft Excel (version 16.45) after each round. Descriptive statistics were used to describe the participants’ demographic characteristics. Responses to open-ended questions in Round 1 and participants’ comments in Round 2 were thematically analyzed, with MH and GNK discussing the qualitative responses. MH, GNK and SF met to discuss the items for the consensus statements in Rounds 2 and 3. The quantitative responses from the participants’ ratings in Rounds 2 and 3 were analyzed descriptively using medians, ranges, and percentages.
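The descriptive summary of ratings described above (median, range, and percentage per response option) can be sketched as follows. This is only an illustration: the study used Microsoft Excel, and the numeric coding of the Likert scale (1 = strongly disagree to 5 = strongly agree), the function name `summarize_ratings`, and the example scores are assumptions made here, not details reported in the text.

```python
from collections import Counter
from statistics import median

# Assumed coding (not stated in the source): 1 = strongly disagree ... 5 = strongly agree.
LIKERT_CODES = (1, 2, 3, 4, 5)

def summarize_ratings(scores):
    """Return the median, range, and percentage per Likert code for one statement."""
    if not scores:
        raise ValueError("no ratings supplied")
    counts = Counter(scores)
    n = len(scores)
    return {
        "median": median(scores),
        "range": (min(scores), max(scores)),
        "percent": {code: 100 * counts.get(code, 0) / n for code in LIKERT_CODES},
    }

# Illustrative ratings from a hypothetical panel of 20.
ratings = [5] * 8 + [4] * 7 + [3] * 3 + [2] * 2
summary = summarize_ratings(ratings)
print(summary["median"], summary["range"], summary["percent"])
```

The per-option percentages are the same quantities shown to participants in the Round 3 feedback graphics.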

Results

Of the 25 individuals invited to participate in this Delphi study, 20 participants completed Round 1 (80% response rate), 20/20 completed Round 2 (100.0% response rate), and 20/20 completed Round 3 (100.0% response rate). The reasons for 5/25 participants not responding to the initial invitation email were not identified. Table 1 presents the demographic characteristics of participants at baseline. Participants had different experiences working with the device that ranged from receiving training to performing measurements of spinal stiffness in a population of 180 patients with back pain.

Table 1 Baseline characteristics of Delphi participants (n = 20)

In total, the pre-defined consensus threshold was reached for 67.2% (123/183) of statements after three rounds of surveys. Results from Round 3 are presented in Table 2. The number of consensus statements under each category is listed in Table 3. Items with 70% or more consensus from Round 3 were used to create the best practice protocol for the VerteTrack device (Additional file 1).

Table 2 Median value of Likert scale data and agreement level for all statements from Round 3
Table 3 The number of consensus statements under each category

Discussion

In this Delphi study, 20 panelists reached consensus on the majority of items relating to VerteTrack spinal stiffness measurements, covering a wide range of domains including recruitment criteria, familiarization procedure, instructions for participants/operators, technical issues, and safety. This is the first time, to our knowledge, that a consensus process has been used to obtain a common protocol for instrumented spinal stiffness measurements.

It is important to stress that the key feature of the approach used in this study is the consensus of individuals in the field of spinal manipulative therapy and low back pain research who had experienced working with VerteTrack. Therefore, the intent was not to find “the best” protocol for measuring spinal stiffness or to present an instrument as “the only” mechanical method for measuring spinal stiffness. Our goal was to develop a standard protocol for measuring spinal stiffness using a loaded rolling wheel device that could be used as a common resource in future studies.

The surveys identified some previously known considerations when measuring stiffness, including the participant’s testing position, trunk muscle contraction, intra-abdominal pressure, respiratory cycle, and relocation of target spinal landmarks [4, 5]. This supports the quality and validity of our participants’ answers, as these items have been developed over years in this field and in the literature. For instance, one of our participants’ recommendations was to ask the patient to relax their back muscles during the assessment, which is in line with an early study showing that spinal extensor muscle activity could induce changes in the mechanical responses to posteroanterior stiffness testing [20]. Furthermore, the surveys identified other factors not described previously in the literature, including optimizing participant safety, a definition of a good/bad trial, procedures to ensure a good trial, placing the device over the test area, instructions for reaching the same position in case of multiple assessments, and fixing software program crashes. This emphasizes the importance of group opinion over that of individuals for bringing into focus new topics that can be validated and studied in future work.

Interestingly, there was one specific area where no agreement was reached: the exclusion of pregnant participants from spinal stiffness measurements. One explanation for this lack of agreement is that different respondents may have different experiences in this area through diverse research designs that would, or would not, allow participants to be enrolled at different stages of pregnancy. This speculation is supported by studies to date that have employed VerteTrack. Of six studies using VerteTrack in human participants to date, three excluded pregnant participants [11, 21, 22], one excluded pregnant participants in the second or third trimester of pregnancy [10] and the remaining studies did not mention pregnancy at all [23, 24].

All items for which consensus was reached were consolidated into a final best practice protocol (Additional file 1) for using the VerteTrack. The resulting standard protocol is expected to improve the accuracy and efficiency of spinal stiffness measurements using the VerteTrack, facilitate the training of new operators, increase consistency of these measurements in multicenter studies, and finally provide the synergy and potential for data comparison between spine studies internationally. Our final protocol provides directions for researchers and clinicians who use the VerteTrack to measure spinal stiffness. However, caution should be used if between-patient comparisons are made (for many reasons including differences in plinth rigidity as well as between-person variations). The final protocol could be useful for other technologies that assess stiffness and even for manual assessment of spinal stiffness. We encourage researchers in this area to review this protocol and consider adopting it for their own purposes. Although the technical part of the protocol explaining how to operate the device may not be useful for manual assessments or for devices that test participants in a sitting position, some general information on spinal stiffness measurement has been provided and may be of benefit.

Strengths and limitations

The strengths of this study include the development of a consensus-based protocol based on responses from 80% of the global population of persons with VerteTrack training and experience in Round 1, with 100% follow-up responses in Rounds 2 and 3. The relative heterogeneity in our participants may enhance the generalizability of the protocol and may have ensured that a greater spectrum of opinions was considered. The initial pilot survey improved the structure and readability of the questions before the full-scale project was executed. In addition, Round 1 of our Delphi study allowed open responses and gave the participants the freedom to elaborate on the research topic, which may increase the richness of the data collected. Although author bias cannot be completely eliminated from this type of research, it was minimized by implementing a Delphi consensus process using anonymous participant ratings and comments. The deidentified, anonymous nature of participants’ answers also encouraged more open and honest feedback and limited response bias.

It is acknowledged that the Delphi method itself has inherent limitations, including its Level V ranking in the hierarchy of evidence-based medicine and its reliance on a small sample size. Although the final protocol was developed based on Delphi participants’ responses to 3 rounds of questions, it was not distributed to them for approval at the end of the study. Further, the lack of interaction between participants in the Delphi process (e.g., face-to-face meetings) may deprive panelists of the opportunity to exchange important information, such as clarification of reasons for disagreements.

Conclusions

Using a Delphi approach, a consensus-based protocol for measuring spinal stiffness using the VerteTrack was developed. This standard protocol was designed to i) improve the accuracy, efficiency, and safety of spinal stiffness measurements using the VerteTrack, ii) facilitate the training of new operators, iii) increase consistency of these measurements in multicenter studies, and iv) provide the synergy and potential for data comparison between spine studies internationally.