Introduction

Adolescent idiopathic scoliosis (AIS) is a complex, three-dimensional deformity of the spine and chest wall and is defined as a curvature of the spine ≥ 10 degrees in the coronal plane [1, 2]. AIS can cause a wide range of symptoms, e.g., impaired mobility, reduced strength, function loss and back pain [3, 4]. Depending on the location and severity of the curvature, pulmonary symptoms such as shortness of breath and reduced exercise tolerance may occur [5]. As pulmonary functioning and symptoms change over time with aging, this potentially has negative consequences for quality of life and exercise tolerance for adult patients with an adolescent onset idiopathic scoliosis.

Despite the above, pulmonary functioning and symptoms are not routinely monitored and in scientific publications this domain is rarely reported and extremely underexposed [6]. The need for routine assessment of pulmonary function in patients with spinal deformities has been identified in a recent patient survey [6]. Furthermore, international consensus studies revealed that “pulmonary fatigue” should be included as an outcome domain in a standard outcome set for adolescents and young adults (AYA) [7] as well as for adults [8] with a spinal deformity undergoing surgical treatment. Although determined as a standard outcome, the authors concluded in both studies that as yet an adequate patient-relevant measure is not available [7, 8].

In AIS, clinical measurement instruments (e.g., spirometry) have frequently been used to assess pulmonary functioning and symptoms. Of the many available clinical tests, spirometry is the most frequently used modality to assess key parameters of pulmonary function, in particular forced vital capacity (FVC) and forced expiratory volume in 1 s (FEV1). The classic patterns derived from spirometry are the so-called obstructive and restrictive patterns [9]. Obstruction, i.e., a decreased FEV1/FVC ratio, is seen in lung diseases such as chronic obstructive pulmonary disease (COPD), asthma or cystic fibrosis [9]. Restrictive patterns, i.e., a normal FEV1/FVC ratio but a decreased FVC, are seen in various parenchymal fibrotic lung diseases as well as in patients with thoracic deformities [9]. Importantly, spirometry has proven to be a reproducible diagnostic tool that allows for long-term repeated measurements as well as for evaluating the effect of interventions that may have altered pulmonary function [9]. AIS can cause a restrictive spirometric pattern due to decreased chest wall compliance that prevents normal inflation of the lungs [10, 11]. A negative association between the severity of the thoracic curvature and FVC and FEV1 has been reported [12].
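The obstructive and restrictive patterns described above amount to a simple decision rule, which can be sketched in Python. This is an illustration only: the function name and the cut-off values (a fixed FEV1/FVC ratio of 0.70 and an FVC of 80% of the predicted value) are assumptions that do not come from this review. In practice, interpretation usually relies on reference equations and lower limits of normal, and a restrictive defect is confirmed by measuring TLC.

```python
def classify_spirometry(fev1_l, fvc_l, fvc_pct_predicted,
                        ratio_cutoff=0.70, fvc_cutoff_pct=80.0):
    """Label a spirometry result with a coarse pattern.

    fev1_l, fvc_l      -- measured FEV1 and FVC in litres
    fvc_pct_predicted  -- FVC as a percentage of the predicted value
    ratio_cutoff       -- ASSUMED threshold for a decreased FEV1/FVC ratio
    fvc_cutoff_pct     -- ASSUMED threshold for a decreased FVC
    """
    if fvc_l <= 0:
        raise ValueError("FVC must be positive")

    ratio = fev1_l / fvc_l

    # Obstruction: decreased FEV1/FVC ratio.
    if ratio < ratio_cutoff:
        return "obstructive pattern"

    # Restriction: normal FEV1/FVC ratio but decreased FVC.
    if fvc_pct_predicted < fvc_cutoff_pct:
        return "restrictive pattern"

    return "no obstructive or restrictive pattern"


if __name__ == "__main__":
    # Hypothetical values, not patient data from the included studies.
    print(classify_spirometry(fev1_l=2.0, fvc_l=2.4, fvc_pct_predicted=65))
```

The sketch deliberately ignores mixed patterns, in which a decreased ratio and a decreased FVC occur together.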

Although clinical measurement instruments such as spirometry are frequently used, these instruments seem to lack clinical relevance: they do not represent the patient’s experience (i.e., they yield objective values that may not correlate with the patient’s symptoms), they are time-consuming, and they are expensive to obtain in routine daily clinical practice [6]. Additionally, the focus of routine outcome measurement has broadened toward including outcome assessment from the patient’s perspective, which can easily be monitored over time (e.g., with patient-reported outcome measures (PROMs)) [13, 14].

The objective of this systematic review is to identify currently available patient-reported and clinical measurement instruments that assess pulmonary functioning and symptoms in AIS.

Methods

A systematic review of the literature was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [15]. The protocol of this review was registered in PROSPERO (ID 129174), an international prospective registry of systematic reviews [16].

Search strategy and eligibility criteria

A literature search was conducted on 18.06.2020 in the databases of PubMed, EMBASE and The Cochrane Library (Appendix 1). Keywords used to identify relevant papers were “spinal curvatures” (MeSH), “patient reported outcome measures” (MeSH), “respiratory function test” (MeSH), “scoliosis,” “pulmonary function,” “respiratory function,” “lung function,” “cardiopulmonary,” “lung volume,” “respiratory test” and “ventilation test” in the title or abstract. Duplicates were removed. An online web application (Rayyan [17]) was used for the title, abstract and full text screening. The titles and abstracts were independently screened by two authors (NtH and SSAF). Full texts were retrieved for all publications that passed the title and abstract screening and were independently reviewed for inclusion by the same two authors according to the following inclusion criteria: (1) diagnosis of AIS, (2) use of at least one measurement instrument for pulmonary function (patient-reported or clinician-based), (3) all forms of treatment (surgical and non-operative) and (4) retrospective and prospective cohort studies, randomized controlled trials, case–control studies and case reports. Exclusion criteria were (1) reviews, (2) non-English language, (3) publication prior to 01.01.2000 and (4) a diagnosis other than AIS (i.e., infantile or juvenile idiopathic scoliosis, early onset scoliosis, congenital scoliosis or neuromuscular scoliosis). In case of disagreement regarding the inclusion or exclusion of studies, the authors arranged a consensus meeting. When disagreement persisted, a third independent reviewer (MP) made the final decision.
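The eligibility criteria above can be read as a single predicate over a candidate publication. The Python sketch below illustrates that logic only; it is not the screening procedure used in this review, which was carried out manually by two reviewers in Rayyan. The `Record` type, its field names and the string labels for study designs are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Study designs named in the inclusion criteria (labels are hypothetical).
INCLUDED_DESIGNS = {
    "retrospective cohort",
    "prospective cohort",
    "randomized controlled trial",
    "case-control",
    "case report",
}

# Publications before this date are excluded.
EARLIEST_PUBLICATION = date(2000, 1, 1)


@dataclass
class Record:
    """Hypothetical summary of one screened publication."""
    diagnosis: str                 # e.g. "AIS", "congenital scoliosis"
    n_pulmonary_instruments: int   # patient-reported or clinician-based
    design: str                    # e.g. "prospective cohort", "review"
    language: str                  # e.g. "English"
    published: date


def is_eligible(record: Record) -> bool:
    """Apply the inclusion and exclusion criteria to one record."""
    return (
        record.diagnosis == "AIS"
        and record.n_pulmonary_instruments >= 1
        and record.design in INCLUDED_DESIGNS   # reviews are not listed
        and record.language == "English"
        and record.published >= EARLIEST_PUBLICATION
    )
```

Treatment type is not checked because all forms of treatment were eligible.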

Data extraction

The following data were extracted from the included studies: first author, region of origin, year of publication, number of AIS patients, mean age of patients, given treatment, mean follow-up, used measurement instruments and measured parameters to evaluate pulmonary symptoms.

Measurement properties

To determine the adequacy of the identified measurement instruments, in terms of measurement properties, an additional literature search was performed in PubMed, EMBASE and The Cochrane Library. The additional search was performed to find relevant studies that assessed the measurement properties of PROMs and the four most-used clinical measurement instruments identified in the systematic review (Appendix 2). Additionally, studies were searched through backwards citation (Appendix 3).

Assessment of the measurement properties of identified PROMs was based on the quality criteria as developed by Terwee et al. [18]. The following measurement properties were evaluated: (1) content validity, (2) internal consistency, (3) criterion validity, (4) construct validity, (5) reproducibility (5a agreement and 5b reliability), (6) responsiveness, (7) floor and ceiling effects and (8) interpretability.

Quality assessment

The COSMIN Risk of Bias (RoB) checklist [19,20,21] was used to assess the quality of the included publications in which the measurement properties were determined.

Results

Search results

The systematic search generated a total of 3146 papers (Fig. 1). After removing duplicates, 2388 studies remained, which were screened on title and abstract. Of these 2388 studies, 335 were potentially eligible and the full texts were retrieved and screened. A total of 213 papers were excluded after full text screening, and 122 were eligible for inclusion.

Fig. 1 PRISMA flow diagram of identification, screening and inclusion of papers
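The counts in the flow diagram can be checked with a few lines of arithmetic. In the Python sketch below, the four input numbers are those reported in the search results; the number of duplicates and the number of papers excluded at title and abstract screening are derived by subtraction and are not stated explicitly in the text. The variable names are chosen for illustration.

```python
# Counts reported in the search results.
identified = 3146            # papers found by the systematic search
after_deduplication = 2388   # papers screened on title and abstract
full_text_screened = 335     # potentially eligible, full text retrieved
excluded_full_text = 213     # excluded after full text screening

# Derived counts (not stated explicitly in the text).
duplicates_removed = identified - after_deduplication
excluded_title_abstract = after_deduplication - full_text_screened
included = full_text_screened - excluded_full_text

print(f"duplicates removed:          {duplicates_removed}")
print(f"excluded on title/abstract:  {excluded_title_abstract}")
print(f"included in the review:      {included}")
```

The final value agrees with the 122 included studies reported above.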

Characteristics of included studies

Table 1 provides the main characteristics of all the included studies. Of the included studies, most were conducted in Asia (46/122; 38%) or North America (41/122; 34%). The vast majority of the studies (82/122; 67%) described a population between 10 and 15 years of age and the greater proportion had surgical treatment (83/122; 68%). In 39/122 (32%) studies no follow-up was reported, and in 24/122 (20%) treatment modality was not described.

Table 1 Characteristics of included studies

Extracted data

Clinical measurement instruments

Fifty pulmonary parameters were identified, measured by seven different clinical measurement instruments (spirometry, plethysmography, 3D-reconstruction (by MRI, CT or radiographs), manometer, gas analyzer, arterial blood analysis and physical examination). Parameters that were used in 10% or more of the studies (≥ 12/122) are presented in Table 2, categorized by used method or instrument. In Appendix 4, the parameters used in less than 10% of the studies are presented.

Table 2 Identified clinician based measurement instruments

Forced vital capacity (FVC), forced expiratory volume in one second (FEV1), the Tiffeneau index (FEV1/FVC) and total lung capacity (TLC) measured by spirometry or plethysmography were most frequently used (99/122; 81%, 97/122; 80%, 31/122; 25% and 29/122; 24%, respectively). Three-dimensional reconstruction (of the lungs or chest wall) by MRI, CT or radiographs was also commonly used (total 15/122; 12%). Exercise capacity (VO2max) and maximal ventilation (VEmax) were used in 11% of the included studies.

Patient-reported outcome measures (PROMs)

Nine percent of the studies (11/122) used a PROM (Table 3). A total of five different PROMs were identified, being the Borg dyspnea scale, the Borg ratings of perceived exertion (RPE) scale, the MRC breathlessness scale, the University of California at San Diego Shortness of Breath questionnaire (UCSD SOBQ) and a breathing effort scale. The most commonly used PROM is the Borg dyspnea scale (6/11; 55%). The breathing effort scale was a scale improvised by the authors of the paper, not officially developed in original research.

Table 3 Identified patient reported outcome measures

The Borg RPE scale [22] was originally defined by Borg in 1970. It rates perceived exertion during physical exercise on a scale from 6 to 20, with 7 being “very, very light” and 19 being “very, very hard.” Perceived exertion is affected not only by pulmonary function, but also by cardiac function, vascular function and overall fitness. The Borg dyspnea scale [23] was created in 1982 as a modification of the original RPE scale and is therefore also known as the modified Borg scale (mBorg scale). It rates breathlessness during exercise from 0 (nothing) to 10 (maximal). It therefore only represents the dyspnea experienced by a patient at a certain point in time during exercise and does not take into account fatigue or weariness during the day, which is deemed important in AIS.

The MRC breathlessness scale quantifies the disability associated with breathlessness with a single question that identifies when breathlessness and restrictions occur during daily activities [24]. It was defined by Fletcher et al. [25] in 1959. This single question originally comprised a 5-point scale (1 to 5), with 1 being “I only get breathless with strenuous exercise” and 5 being “I am too breathless to leave the house” or “I am breathless when dressing.” This was later modified to a 0–4 scale with the same items.

The UCSD SOBQ was originally developed in 1987 [26]. It is a 24-item questionnaire that assesses shortness of breath while performing a variety of activities of daily living, thereby evaluating functional limitations. The questionnaire consists of 21 activity-based items and three questions about limitations due to shortness of breath, fear of harm from overexertion and fear of shortness of breath. The activity-based items are scored on a 6-point scale, with 0 indicating no shortness of breath and 5 indicating “worst” or “unable to do due to shortness of breath” [27, 28].
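To make the structure of the questionnaire concrete, a total score can be sketched in Python as follows. The text above only states the 0 to 5 scoring for the 21 activity-based items; the sketch assumes that the three additional questions use the same 0 to 5 range and that all 24 item scores are summed, giving a total between 0 and 120. The function name is hypothetical, and the handling of skipped or not-applicable items is not covered.

```python
N_ITEMS = 24          # 21 activity-based items + 3 additional questions
MIN_ITEM_SCORE = 0    # no shortness of breath
MAX_ITEM_SCORE = 5    # "worst" or "unable to do due to shortness of breath"


def sobq_total(responses):
    """Sum 24 item scores into a total score.

    ASSUMPTION: every item, including the three additional questions,
    is scored 0 to 5 and the total is the plain sum (range 0 to 120).
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")

    for index, score in enumerate(responses, start=1):
        if not isinstance(score, int) or not MIN_ITEM_SCORE <= score <= MAX_ITEM_SCORE:
            raise ValueError(f"item {index}: score must be an integer from 0 to 5")

    return sum(responses)


if __name__ == "__main__":
    # Hypothetical answers, not data from the included studies.
    example = [1] * 21 + [2, 0, 3]
    print(sobq_total(example))
```

A higher total indicates more shortness of breath during daily activities.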

Measurement properties

No studies were identified assessing the measurement properties of the Borg dyspnea scale, Borg RPE scale, MRC breathlessness scale, UCSD SOBQ, FVC, FEV1, FEV1/FVC or TLC in an AIS population (Appendix 2).

Quality assessment—risk of bias

The Risk of Bias could not be determined, as no studies were identified that evaluate the measurement properties of these measurement instruments in AIS.

Discussion

AIS can significantly affect pulmonary function [5, 10,11,12], which is recognized by both clinicians [7] and patients [6]. Currently, no consensus exists on how to measure pulmonary functioning and symptoms in this group of patients, and a plethora of measurement instruments are being used [6]. This systematic review identified a total of seven clinical measurement instruments and five patient-reported outcome measures (PROMs) that have been used in studies of AIS patients from 2000 to 2020. No studies were identified that assessed the measurement properties of these instruments in AIS, so their adequacy could not be determined. As such, floor and ceiling effects, validity, reliability, responsiveness and interpretability of the identified PROMs could not be evaluated in an AIS population. This study has not been able to identify any currently available adequate patient-centric instrument to measure pulmonary outcomes following treatment for AIS in routine daily practice.

Clinical measurement instruments

A total of seven clinical measurement instruments were identified, measuring 50 pulmonary parameters such as FVC, FEV1, FEV1/FVC and TLC (Table 2 and Appendix 4). Spirometry and plethysmography are the most frequently used clinician-based measurement instruments and generally provide an adequate assessment of the volume and flow functions of the lungs [29]. Although both are reliable measurement instruments for the diagnosis of restrictive lung defects [9, 30, 31], as yet, no evidence exists for the adequacy of these instruments as outcome measurement instruments in patients with AIS.

Even though many clinical measurement instruments are frequently used in the literature, they are not suited for routine outcome measurement for patient-centered care reporting in an AIS population. They lack clinical relevance as they do not cover the patients’ perspective, are time-consuming and are expensive to obtain in routine clinical daily practice [6].

Patient-reported outcome measurements (PROMs)

Five PROMs were identified that have been used to assess pulmonary symptoms in AIS (Table 3). As yet, the quality of these PROMs, in terms of measurement properties as described by Terwee et al. [18], has not been evaluated in the AIS population. The Borg dyspnea scale, Borg RPE scale, MRC breathlessness scale and the breathing effort scale evaluate the amount of breathlessness/dyspnea at a single point in time, scoring it from 6–20 (Borg RPE), 0–10 (Borg dyspnea), 1–5 (MRC) or 1–9 (breathing effort). The 24-item UCSD SOBQ is the only scale that includes the limitations experienced during daily activities. It consists of 21 items covering the amount of breathlessness (0–5) during daily activities and 3 items concerning limitations due to shortness of breath, fear of harm from overexertion and fear of shortness of breath.

No evidence was found regarding the measurement properties of any of the identified PROMs in an AIS population. This does not mean that such evidence is absent altogether. To evaluate which PROM might be eligible for future use, a post hoc literature search was performed to find studies that assessed the measurement properties of the identified PROMs in populations other than AIS (Appendix 5). Twenty studies were found, and the quality of these studies was good [“very good” (13/20; 65%), “adequate” (6/20; 30%), “doubtful” (1/20; 5%) (Appendix 6 and 7)]. For substantiation of the COSMIN checklist, see Appendix 8. None of these studies evaluated all measurement properties as described by Terwee et al. [18]. Overall, the UCSD SOBQ seems adequate: 6/9 measurement properties (content validity, internal consistency, criterion validity, construct validity, agreement and responsiveness) were evaluated and were scored “positive” (Appendix 7). The UCSD SOBQ has been studied in populations with lung disease: obstructive lung disease (OLD) [27], chronic obstructive pulmonary disease (COPD) [32, 33], or asthma and idiopathic pulmonary fibrosis (IPF) [28, 34]. Although the UCSD SOBQ has good measurement properties and seems promising, patients with lung diseases cannot be directly compared to patients with AIS, as patients with AIS have restricted but overall healthy lungs. Research regarding the measurement properties in an AIS population is needed to demonstrate the adequacy and the clinical usefulness of this PROM.

Besides the UCSD SOBQ, the only other PROM with promising measurement properties was the Borg RPE scale (3/9 properties evaluated), and it appeared to have a good criterion validity, meaning that it relates to the gold standard [18]. The populations studied have been more variable, including patients without primary pulmonary disease, ranging from children to healthy adults and Parkinson patients to patients recovering from a stroke.

Overall, the UCSD SOBQ seems promising as it has good measurement properties and includes the limitations of daily activities, which are important in an AIS population [6]. However, these measurement properties have not been assessed in an AIS population and the questionnaire seems too comprehensive for routine outcome assessment when, for example, it is compared to the frequently used SRS-22 questionnaire. The SRS-22 comprises 22 questions for five different outcome domains, versus 24 questions for one outcome domain in the UCSD SOBQ. Where the UCSD SOBQ seems too comprehensive for routine use, the Borg RPE scale seems too concise for assessing the pulmonary problems in AIS, even though it has good criterion validity (also not evaluated in an AIS population). The Borg RPE scale only assesses the amount of breathlessness at a single point in time and is mostly used during or directly after exercise. It does not include any other information on pulmonary symptoms such as increased fatigue, which is a regularly reported symptom in AIS patients [6].

Future perspective

Patients experience a large variety of pulmonary signs and symptoms, such as shortness of breath, reduced exercise tolerance and respiratory fatigue, and perceive limited daily functioning due to these pulmonary symptoms [6]. However, as yet, the underlying cause and theoretical construct are not understood. Are the experienced limitations based on the dysfunction of the lungs themselves (e.g., limited inflation of the lungs) or is the fatigue due to increased energy consumption by the musculoskeletal system the main problem? Could the symptoms have a cardiovascular origin? A recent study showed that a proportion of AIS patients seem to have impaired right cardiac function, with pulmonary hypertension. This dysfunction normalized after scoliosis surgery, indicating the benefits of spine surgery on cardiac function, possibly due to re-alignment of the spine, cardiovascular structures or rib cage [35]. Insight into the pulmonary symptoms that are experienced by patients with AIS will support the development of an adequate PROM that might be implemented for outcome measurement in routine daily use. This will also aid the evaluation of the effects of aging, progression of the curve or different treatment strategies on pulmonary function and symptoms in patients with scoliosis. We therefore recommend exploring further with patients what limitations they experience, and with which questions/items these might be measured. This will help create a theoretical construct for the pulmonary problems experienced by patients with AIS, ultimately leading to the identification and validation of an existing PROM, such as the UCSD SOBQ, or the development of a new disease-specific PROM.

Limitations

Several limitations of this study should be mentioned. First, selection bias might have occurred, as articles published prior to 01.01.2000 and non-English articles were excluded to acquire the most relevant literature. Second, measurement properties of only the four most used clinical measurement parameters (FVC, FEV1, FEV1/FVC and TLC) were assessed. However unlikely, it is possible that measurement properties of other clinical measurement instruments have been missed. Third, a common language is lacking. A clear definition of the pulmonary problems in AIS does not (yet) exist, which makes it challenging to perform research on this subject. Pulmonologists and physicians treating scoliosis patients have different perspectives on the matter. Pulmonary fatigue, for instance, is a term unheard of in the pulmonary department but has been included in a core outcome set for adolescents and young adults with spinal deformity [7].

Conclusion

Both clinicians and patients recognize pulmonary symptoms and patient-experienced limitations in routine daily practice as an important outcome domain in the treatment of patients with idiopathic scoliosis. Despite not being routinely reported or studied, we speculate that this domain is clinically very relevant for patients during adolescence, and also in later adult life, with implications for quality of life. In this study, no currently available adequate patient-centric instruments were identified to measure this domain in patients with idiopathic scoliosis. Although clinical measurement instruments such as spirometry have been reported regularly in research papers, their use in routine practice is not patient-centric and does not seem feasible. Several available PROMs may potentially be used, most notably the UCSD SOBQ and the Borg RPE scale. Their measurement properties in this specific patient population are still unknown, and both have limitations in feasibility for use in routine clinical practice. Furthermore, a major hurdle in identifying the right instrument is that the underlying theoretical construct and common language of pulmonary functioning and symptoms in these patients are still elusive. The development of such a construct and potentially a subsequent PROM to routinely measure pulmonary functioning and patient experience is recommended.