A clinical decision support system for back pain helps to find the diagnosis: a prospective correlation study

The aim of this study was to assess the concordance between an app-based decision support system and the diagnoses given by spinal surgeons in cases of back pain. 86 patients took part within 2 months. They were seen by spine surgeons in the daily routine and then independently completed an app-based questionnaire that also produced a diagnosis. The results showed a Cramér's V of .711 (p < .001), indicating a strong relation between the tool and the diagnosis of the medical doctor. In addition, the diagnoses were concordant in 67.4% of the cases. Overestimation of the severity of the diagnosis occurred more often than underestimation (15.1% vs. 7%). The app-based tool can thus safely support healthcare professionals in diagnosing back pain.


Introduction
Back pain is one of the leading reasons for seeking health care services [1]. Thus, back pain is a serious condition of increasing socio-economic importance [2][3][4][5]. Globally, low back pain in particular is the main cause of years lived with disability [3]. Every age group is affected by back pain. The incidence of low back pain is 60-90%, and low back pain is the main cause of working disability in most countries [6][7][8][9][10]. Neck pain is also an increasing problem, with incidences between 10.4% and 71.5% and an annual prevalence between 30% and 50% [7,[11][12][13][14][15][16]. Because of the enormous costs for the health care system, more and more attempts are being made to support health care providers with computerized decision support systems (DSS). A DSS can be considered a specific form of symptom checker [10]. Our study group has already developed and published a first computerized algorithm based on selecting the next best question to identify the most probable diagnosis [10]. Other institutions use such systems as well. As early as 2015, Semigran et al. examined 23 symptom checkers for self-diagnosis and triage using 45 standardized patients. Overall, the correct diagnosis was listed first in 34% of cases and appeared among the top 20 diagnoses in 58% [17]. A study in 2018 showed a significant medium-sized correlation between the DSS and the medical recommendation; in 49.6% of cases the diagnoses were concordant [10].
The aim of the present study was to redesign the questions and the algorithm in order to improve the concordance between the medical diagnosis given by a spinal surgeon and that of the algorithm-based tool.

Methods
This non-randomized, unblinded correlational study included male and female patients with back pain who visited the Department of Orthopaedics of the University Medical Centre Regensburg between September and November 2020. Inclusion criteria were consultation because of back pain and sufficient knowledge of German to understand the questions. Exclusion criteria were missing consent or inability to take part for other medical reasons. The study was approved by the Ethics Commission of the University of Regensburg (21.08.18, 18-1007-121, DRKS DRKS00012467) and carried out in accordance with the approved guidelines of the Helsinki Declaration of 1975. Written informed consent was obtained from all study participants. Participation in this study was voluntary.

Patients
After fulfilling the inclusion criteria, 89 patients were included, of whom 86 completed the study. Three patients had to be excluded due to technical problems that prevented them from completing the questionnaire. Of the remaining 86 patients, 40 were female and 46 male (Table 1).

App-based questionnaire for back pain
The algorithm of the app-based tool was developed in cooperation with members of Digital Health, medical doctors, and psychologists. The idea is an algorithm that reflects a spinal surgeon's way of thinking while taking the anamnesis. First, the algorithm was written down as a large decision tree. Then an existing app-based questionnaire framework was used to implement the questions, which facilitated the evaluation. Thus, the algorithm, and not the app framework itself, was the subject of investigation.
For safety reasons, all of the typical red-flag questions must be answered with "no"; otherwise, the patient is sent directly to an emergency room. Then the patient chooses the area of pain (cervical, thoracic, lumbar, S-I joint).
After that, two to five dichotomous questions have to be answered with yes or no. These "decision" questions lead to the most likely diagnosis. This diagnosis is then verified by a block of questions that are sensitive for the supposed diagnosis. If more than 65% of the answers are "yes", the diagnosis is confirmed. If not, the second most likely diagnosis is evaluated by another block of sensitive questions.
If these are answered with "yes" in more than 65% of cases, this diagnosis is taken; if not, the patient is advised to see a doctor because no diagnosis can be established with certainty ("no diagnosis"). Table 2 gives an overview of all possible diagnoses.
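The decision flow described above can be sketched in a few lines of code. This is a minimal illustration only: the question sets, answer patterns, and diagnosis names used here are placeholders, as the actual question content and candidate ordering belong to the tool's algorithm and are not published in this paper.

```python
def evaluate(red_flag_answers, candidates):
    """Sketch of the tool's decision flow.

    red_flag_answers: list of bool answers to the red-flag questions.
    candidates: list of (diagnosis, sensitive_answers) pairs, ordered from
        most likely to second-most likely, where sensitive_answers are the
        bool answers to that diagnosis's block of sensitive questions.
    """
    # Any red flag answered "yes" sends the patient straight to the emergency room.
    if any(red_flag_answers):
        return "emergency room"
    # A candidate diagnosis is confirmed if more than 65% of its
    # sensitive questions are answered with "yes".
    for diagnosis, answers in candidates[:2]:
        if sum(answers) / len(answers) > 0.65:
            return diagnosis
    # Neither of the two most likely diagnoses was confirmed.
    return "no diagnosis"

# Hypothetical patient: no red flags, 3 of 4 "yes" answers (75%) in the
# top candidate's block, so that diagnosis is confirmed.
print(evaluate([False, False, False],
               [("spinal stenosis", [True, True, False, True]),
                ("herniated disc", [True, False, False, False])]))
# → spinal stenosis
```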

Procedure
The study was conducted in the Department of Orthopaedics of a University Medical Centre. All patients were seen by one of two spinal surgeons. Each examination took between 10 and 20 min. The app-based questionnaire was completed by the patients themselves, taking a maximum of 5 min, either before the appointment or directly before the consultation while waiting. The order was chosen at random to allow smooth integration into the daily clinical routine. The two experienced consultant spine surgeons had comparable levels of experience and had worked together for 10 years, so their reasoning and decisions are comparable. They were blinded in the sense that they had no knowledge of the tool's result; thus, both procedures (tool and medical examination) were conducted independently. After a first pilot study of 20 patients, the technical setting was improved using a better tablet, some mistakes in the linking of questions within the algorithm were fixed, and the present study started.

Statistical analysis
To calculate the required sample size, the effect size was taken from our former study [10]: with an effect size of phi = 0.717, an error probability of 5%, and a power of 95%, the required number of cases is 86 [18].
The diagnoses were recorded manually by the spinal surgeons and stored automatically by the tool. First, the correlation between the diagnoses of the tool and the medical diagnoses was calculated. Because the data were nominally scaled and both variables have more than two categories, Cramér's V was used. All diagnoses are described in the results section. Furthermore, the combinations of the medical and the computerized diagnoses are presented for every single diagnosis. Possible differences in frequency were statistically investigated with the chi-square test.
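Cramér's V for two nominal variables can be computed from the contingency table of tool versus surgeon diagnoses, for example with SciPy. The 3×3 table below is a made-up example for illustration, not the study data.

```python
import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(table):
    """Cramér's V for a contingency table of two nominal variables:
    V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    table = np.asarray(table)
    chi2, p, dof, expected = chi2_contingency(table)
    n = table.sum()
    k = min(table.shape) - 1
    return np.sqrt(chi2 / (n * k)), p

# Hypothetical 3x3 table: rows = tool diagnosis, columns = surgeon diagnosis.
# A mostly diagonal table (high agreement) yields a large V.
example = [[20, 2, 1],
           [3, 25, 2],
           [1, 2, 30]]
v, p = cramers_v(example)
print(f"Cramér's V = {v:.3f}, p = {p:.2g}")
```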
"Overestimation" means that the tool's diagnosis is more severe than the spinal surgeon's diagnosis, e.g., the tool indicates a herniated disc while the spinal surgeon diagnoses unspecific low back pain. The diagnoses facet joint arthritis, low back pain, and iliosacral joint block were regarded as equivalent, as they share nearly the same first-line conservative treatment.
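The classification of each tool-surgeon pair into concordant, overestimated, or underestimated can be sketched as follows. The severity ranks are illustrative assumptions: the paper only defines the direction of the comparison and the three diagnoses treated as equivalent, not a published severity ordering.

```python
# Diagnoses treated as equivalent (nearly the same first-line conservative treatment).
EQUIVALENT = {"facet joint arthritis", "low back pain", "iliosacral joint block"}

# Hypothetical severity ranks; the actual ordering is a clinical judgment
# by the authors and is not given in the paper.
SEVERITY = {"low back pain": 1, "facet joint arthritis": 1,
            "iliosacral joint block": 1, "spondylolisthesis": 2,
            "herniated disc": 3}

def classify(tool_dx, surgeon_dx):
    """Compare the tool's diagnosis against the surgeon's diagnosis."""
    if tool_dx == surgeon_dx or (tool_dx in EQUIVALENT and surgeon_dx in EQUIVALENT):
        return "concordant"
    if SEVERITY[tool_dx] > SEVERITY[surgeon_dx]:
        return "overestimation"   # tool's diagnosis is more severe
    return "underestimation"      # tool's diagnosis is less severe

print(classify("herniated disc", "low back pain"))                  # → overestimation
print(classify("facet joint arthritis", "iliosacral joint block"))  # → concordant
```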

Results
Relation between the tool and the diagnosis of the spinal surgeon: Table 3 shows descriptively the relation between the single diagnoses given by the tool and those given by the medical doctor.
Statistically, a significant relation between the tool and the diagnosis of the medical doctor was found, Cramér's V = 0.711, p < 0.001, which can be regarded as a strong relationship. This relation holds true when the data are analysed separately for women, Cramér's V = 0.720, p < 0.001, and men, Cramér's V = 0.780, p < 0.001.
In 67.4% of the cases, the diagnoses of the medical doctors and the DSS were concordant; in 32.6% they were discordant. The difference in frequencies was statistically significant, χ²(1, N = 86) = 10.47, p < 0.001.
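The reported test statistic can be reproduced from the rounded counts (67.4% of 86 ≈ 58 concordant, 32.6% ≈ 28 discordant) with a chi-square goodness-of-fit test against equal frequencies, e.g. with SciPy:

```python
from scipy.stats import chisquare

# 58 concordant vs. 28 discordant cases out of N = 86; the null
# hypothesis is an equal split (43 each).
stat, p = chisquare([58, 28])
print(f"chi2(1, N=86) = {stat:.2f}")  # ≈ 10.47, matching the reported value
```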
Analysing over- and underestimation (fusing the diagnoses facet joint arthritis, low back pain, and iliosacral joint block), the results showed that in 77.9% of the cases the diagnoses were concordant, in 15.1% they were overestimated, and in 7% they were underestimated by the DSS. The differences between the frequencies of the categories were statistically significant, χ²(2, N = 86) = 77.74, p < 0.001; however, the difference between the categories "overestimation" and "underestimation" was not significant, χ²(1, N = 19) = 2.58, p = 0.108.

Table 3 Combinations of single diagnoses from the tool and the spinal surgeons. Each number represents one diagnosis and shows how often the tool and how often the spine surgeons gave it; e.g. the diagnosis spinal stenosis (2) was made ten times by the spine surgeons, whereas the tool gave this result eight times, once facet joint arthritis (1), and once "no diagnosis" (12). 1 = facet joint arthritis, 2 = spinal stenosis, 3 = herniated disc, 4 = low back pain, 5 = osteoporotic vertebral fracture, 6 = iliosacral joint block, 7 = thoracic block, 8 = cervical myogelosis, 9 = spondylodiscitis, 10 = spondylolisthesis, 11 = fibromyalgia, 12 = no diagnosis, 13 = cervicocephalic syndrome

Discussion
The hypothesis that there is a correlation between the diagnoses of the spinal surgeons and the clinical decision support tool holds true. In comparison to former studies, the correlational effects are strong.
Symptom checkers have long been part of scientific work, with the aim of unifying and simplifying clinical diagnosis or of supporting patients in identifying their problems themselves [19,20]. In addition, the selfBACK consortium introduced a protocol to be used by patients themselves to promote self-management of low back pain (LBP) [21,22].
Despite all these efforts, only a few studies have been published in the field of back pain in recent years. In the present study, 67.4% of the diagnoses of the medical doctors and the DSS were concordant; when over- and underestimation were analysed, 77.9% of the diagnoses were even concordant.
That means that, e.g., pain due to facet joint arthritis and low back pain were counted as equivalent. Only 7% of the diagnoses were underestimated by the tool, and none of these was an urgent case; e.g., the tool indicated low back pain where the spinal surgeons diagnosed a spondylolisthesis without surgical indication.
In comparison to other studies, the concordance is very strong. In 2015, Semigran et al. evaluated 23 symptom checkers for self-diagnosis and triage [17]. For this evaluation, 45 standardized patient vignettes were created, 15 of which were assigned to each of three groups: group 1 needed immediate help, group 2 needed no immediate help, and group 3 only required exercises on their own. Across the 23 symptom checkers, the correct diagnosis was given in only 34% of cases [17]. In another study by Bison et al., patients completed the symptom checker and then identified the cause of their knee pain themselves from a list of 2-15 diagnoses. This succeeded in only 58% of the cases [23].
The aim of this tool is not to replace the medical doctor, but to give patients an initial assessment of their symptoms and to support decision making in outpatient centres or on patient telephone hotlines. In addition, it could help with triage in emergency rooms. The assessment takes only about 5 min, so the tool can be used in waiting rooms on a standalone tablet or on telephone hotlines by medical staff.
Due to the different treatment methods and individual decisions, treatment recommendations were deliberately avoided when developing this tool. However, diagnosis-specific physiotherapy exercises are recommended to patients.
In our former study, it was noticeable that there was only a small number of psychiatric diagnoses, a result not in line with the literature, because depression, anxiety, and avoidance behaviour are related to back pain [10,17,24]. In the algorithm of this study, only chronic pain syndrome and fibromyalgia are possible diagnoses.
Despite the growing digitalization in all fields, this study is the first in two years to evaluate a specific kind of symptom checker for back pain.

Limitations
There are some limitations to this study. Only spinal surgeons working at the same hospital were included for the clinical examination, which means that most patients had already received some treatment before. Next, outpatient centres should be included, as these are usually the first point of contact with the patient and the doctors there are general practitioners rather than spine surgeons. Another limitation is that psychiatric diagnoses were not evaluated.

Conclusion
This study showed a strong correlation between the diagnoses of spinal surgeons and the clinical decision support system in back pain patients. In a next step, it should be evaluated whether the tool can help to reduce the time that passes before a successful treatment is achieved for these patients.