Background

Physical function is a strong measure of biological age and a biomarker for health and quality of life in older people [1,2,3,4]. The assessment of physical function among older adults is of importance, as the early detection of functional decline renders it possible to intervene and reverse or prevent further physical function decline and the possible loss of independence [5]. Furthermore, physical performance assessment as an outcome measure is a vital component in studies comparing groups or evaluating the effect of different interventions on physical function [6, 7].

The Short Physical Performance Battery (SPPB) is a well-established instrument for the measurement of physical performance, commonly used among community-dwelling adults, nursing home residents, and hospitalized patients [1, 5,6,7,8,9,10,11,12]. The SPPB involves a timed 4-m walk at the participant’s normal pace, a timed repeated chair sit-to-stand test, and 10-s balance tests, with feet side-by-side, semi-tandem, and full-tandem. Low SPPB scores have been shown to predict poor outcomes, such as falls, mobility loss, disability, hospitalization, a longer hospital stay, nursing home admission, and death [1, 6,7,8, 11, 13,14,15,16]. Furthermore, previous research suggests that the SPPB can detect the early stages of frailty [17], and that a total score ≤ 9 points can distinguish frail from non-frail individuals [18].

As a performance-based measure of physical function, the SPPB has many advantages. The SPPB only takes a few minutes to complete, requires little training to administer, and uses simple equipment. Additionally, the results can be quantified by scores, and it is reproducible and sensitive to changes in functionality through time [18]. Previous systematic reviews that evaluated the psychometric properties of various physical performance instruments have concluded that the SPPB is a reliable and valid tool for measuring lower limb strength in the elderly community [5, 19, 20]. Therefore, the SPPB is considered a good measure for cross-cultural comparisons of physical performance in elderly individuals [18]. The Norwegian translation of the SPPB [21] has shown high reliability in elderly people with and without dementia, living at home or in nursing homes [22].

To be meaningful, test scores must have an empirical frame of reference. Reference or normative (used as synonyms in this paper) data provide this empirical context and represent the range of performance for a particular test in a particular group of individuals. Normative data provide a numerical description of test performance in a well-defined sample group [23]. This group is considered the ‘gold standard’ against which an individual’s test performance is compared and contrasted [24]. In particular, percentiles, which indicate a person’s relative position in the group for the ability/characteristics tested, provide a useful way of identifying individuals with performance significantly below the level expected for their age and background [23]. One consideration in choosing an appropriate normative dataset is the dataset’s sample size [24]. Furthermore, the use of reference data for a specific population is recommended for a more meaningful interpretation of physical function test results [25]. Thus, the optimal reference values for a physical function test must consider differences in sex and age [26].

Despite the critical importance of having access to normative data to facilitate the clinical interpretation of test findings, there are relatively few large-scale normative reports in the literature [24]. As yet, there are no published reference values for the SPPB (in terms of the total score) based on a large sample of individuals aged 40+ years. Thus, we aimed to establish reference values, stratified by sex and age (as recommended by Steffen et al. [27]), for community-dwelling Norwegian adults aged 40 years or older in terms of (1) the SPPB total score; (2) the scores of the three subtests (balance, walking speed, and repeated chair sit-to-stand); and (3) the walking speed (in m/s) in the walking speed test and the time (in seconds) to complete five chair stands in the chair sit-to-stand test. Additionally, we aimed to explore floor and ceiling effects in these measures.

Methods

Study population

Participants comprised men and women aged 40 years or more who participated in the 7th wave of the Tromsø study [28]. The Tromsø study is a multipurpose population-based health examination study, initiated in 1974, with study waves repeated in 1979, 1986, 1994, 2001, 2008, and 2015. In the current analyses, the sample was restricted to those participating in the wave initiated in 2015. All Tromsø study participants aged 40 years or more were invited to complete phase-one of the Tromsø study (n = 32,591), and a random subset of 40% were invited to complete the phase-two examination, which comprised a more thorough clinical examination and included physical function testing. Some Tromsø study participants were invited to complete all phase-two subtests, while most were invited to only some of the tests. The Regional Committee of Research Ethics approved the study (2016/389), and written informed consent was obtained from all participants in the Tromsø Study.

We included individuals who completed the SPPB with non-missing values for all subtests. Among the 9324 individuals invited to SPPB testing, 7866 participated; of these, 7763 had non-missing data (Table 1). However, we excluded those who performed the tests without shoes (n = 279) and those who required assistance or had short-term leg injuries (n = 10). The use of walking aids was allowed, and walking aids were used by 31 participants (crutches/cane: n = 27; walker: n = 4). Thus, our final study population comprised 7474 participants (53.2% women).

Table 1 Background characteristics and SPPB score distribution according to sex and age, n = 7474

SPPB procedures

From April 20th 2015 to October 26th 2016, experienced clinical evaluators (physiotherapists and trained nurses) assessed the SPPB using standardized methodologies for the instructions, positioning, and scoring. Seven different evaluators rotated during this time period, each spending 1 week at a time in the SPPB station.

The standing balance tests included tandem, semi-tandem and side-by-side standing, and the participants were timed until they moved or 10 s had elapsed. To assess walking speed, the participants were twice asked to walk 4 m at their regular pace. For the repeated chair sit-to-stand test, a pre-test was performed; the participants were asked to fold their arms across their chest (i.e. the armrests were not used) and stand up from the chair. If the pre-test was successful, the participants were asked to perform five chair stands as quickly as possible. They were timed (in seconds) from the initial sitting position to the final standing position at the fifth stand. Each of the three subtests (balance, walking speed and repeated chair sit-to-stand test) of the SPPB was scored from 0 to 4, and the subtest scores were summed for a total score ranging from 0 to 12, with higher scores reflecting better function. In addition, the walking speed (m/s) was calculated as 4 m divided by the fastest time (in seconds) of the two walking speed trials. Four total SPPB score categories (0–3, 4–6, 7–9, 10–12), based on the cut-points provided by Guralnik and colleagues in their original work [6], were used.
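The scoring arithmetic described above can be illustrated with a minimal Python sketch. It assumes that the three subtest scores (0 to 4 each) have already been assigned by the evaluator, because the timing cut-points that map raw times to subtest scores are not given in this section. The names `SppbResult` and `summarize_sppb` are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class SppbResult:
    total: int                # sum of the three subtest scores, 0-12
    category: str             # one of "0-3", "4-6", "7-9", "10-12"
    walking_speed_m_s: float  # distance divided by the fastest walk time


def summarize_sppb(balance, walk, chair, walk_times_s, distance_m=4.0):
    """Combine pre-assigned subtest scores and the two timed walks."""
    for name, score in (("balance", balance), ("walk", walk), ("chair", chair)):
        if score not in range(5):
            raise ValueError(f"{name} score must be an integer from 0 to 4")
    if not walk_times_s or min(walk_times_s) <= 0:
        raise ValueError("walk times must be positive numbers of seconds")

    total = balance + walk + chair

    # Score categories as listed in the text: 0-3, 4-6, 7-9, 10-12.
    if total <= 3:
        category = "0-3"
    elif total <= 6:
        category = "4-6"
    elif total <= 9:
        category = "7-9"
    else:
        category = "10-12"

    # Walking speed uses the faster (shorter) of the two trials.
    speed = distance_m / min(walk_times_s)
    return SppbResult(total, category, speed)
```

For example, `summarize_sppb(4, 4, 3, [3.6, 3.2])` describes a hypothetical participant with a total score of 11 (category 10–12) and a walking speed of about 1.25 m/s.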

Covariates

Age and sex data were obtained from the Tromsø study registry. All participants were asked about their highest completed level of education. The education level was classified into three categories: second level, first stage (elementary and/or primary school); second level, second stage (high school); and third level (college or university). Height and weight were measured in light clothing without shoes. Body mass index (BMI) was calculated as the weight in kilograms divided by the height in meters squared (kg/m²).

Statistical analysis

Crude mean values and standard deviations (SD) stratified by sex and age groups were first determined. Mean values at specific ages were then estimated, along with corresponding 95% confidence intervals, in linear regression analyses. Next, quantile regression was used to estimate age-specific percentiles (5th, 10th, 25th, 50th, 75th, 90th, and 95th percentiles). In both regression settings, age was included as a restricted cubic spline with 4 knots at default knot locations (ages 44, 61, 68, and 79 years). Models were run separately for men and women. The reported SD was estimated from the regression model as the standard error of the forecasted value, which is a measure of the variation in the actual individual values. Additionally, we fitted 95% prediction intervals for the walking speed and repeated chair sit-to-stand test to indicate the distribution of the actual individual values. Sex-specific normative values for the SPPB, chair sit-to-stand test and walking speed at five-year age intervals (40, 45, ..., 85 years) were then predicted post hoc from the fitted regression models. Finally, floor and ceiling effects were considered to be present when more than 20% of the respondents achieved the lowest or highest possible score [29, 30].
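To make the age term concrete, the following Python sketch builds a restricted cubic spline basis for age with the four knots reported above. It uses the truncated-power parameterization that is common in statistical software (linear below the first knot and above the last knot, scaled by the squared distance between the outer knots). Whether the authors' software used exactly this parameterization is an assumption, and the function name `rcs_basis` is hypothetical. The resulting columns would enter the linear or quantile regression as covariates; the model fitting itself is not shown.

```python
KNOTS = (44.0, 61.0, 68.0, 79.0)  # knot ages reported in the text


def rcs_basis(age, knots=KNOTS):
    """Return [age, s_1, ..., s_(k-2)] for a restricted cubic spline in age."""
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    scale = (t[-1] - t[0]) ** 2  # keeps the spline terms on the scale of age
    gap = t[-1] - t[-2]          # distance between the last two knots

    basis = [float(age)]         # the linear term
    for j in range(k - 2):
        term = (
            cube_plus(age - t[j])
            - cube_plus(age - t[-2]) * (t[-1] - t[j]) / gap
            + cube_plus(age - t[-1]) * (t[-2] - t[j]) / gap
        )
        basis.append(term / scale)
    return basis
```

With four knots the basis has three columns (the linear term plus two spline terms), so each sex-specific model estimates an intercept and three age coefficients. Below the first knot the spline terms are zero, and above the last knot they are linear in age, which restrains the fitted curve at the extremes of the age range.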

Results

Demographic (sex, age, and level of education) and anthropometric data (height, weight, and BMI) are summarized in Table 1. The mean age of the total sample (7474 participants; 53.2% women) was 63.2 years (SD, 10.4 years; range, 40–85 years). In general, decreased function with increased age was observed (Figs. 1, 2a and b).

Fig. 1

SPPB total score by age and sex. Percentiles (5th, 10th, 25th, 50th, 75th) and the mean value are shown

Fig. 2

a Walking speed (m/s) by age and sex. b Chair sit-to-stand test (s) by age and sex. Mean values with corresponding 95% confidence intervals (CIs) and 95% prediction intervals (the prediction interval is indicative of the distribution of the actual individual values) are shown

The mean total SPPB score of the total sample was 11.4 points (SD, 1.3; range, 0–12). The mean of the total SPPB score, as well as the distribution of three SPPB classes (total SPPB ≤ 6, 7–9, and > 9 points), are shown according to sex and age group in Table 1. On average, the total SPPB score was 0.28 points greater in men than in women (p < 0.001), with significant sex differences in all five age groups (Table 1). Age-specific percentile reference data for the total SPPB score in men and women are shown in Fig. 1. The mean and median of the total SPPB score were approximately 12 points (at the maximum) until the age of 70 years in men and 65 years in women; thereafter, there was a steep decline with increased age. A ceiling effect, defined as more than 20% of participants achieving the maximum score, was observed in men in all age groups: 91, 78, 64, 47, and 36% had the maximum score in the age groups of 40–49, 50–59, 60–69, 70–79, and 80+ years, respectively. A ceiling effect was also observed in women in all age groups, with corresponding proportions of 88, 65, 44, 31, and 23%, respectively.

The distribution of scores for each of the subtests is shown in Table 2. The mean balance, walking speed, and repeated chair sit-to-stand scores of the total sample were 3.85 (SD, 0.50), 3.90 (SD, 0.36), and 3.63 (SD, 0.78), respectively. The mean walking speed (m/s) with 95% confidence bands, as well as the 5th and 95th percentiles to further illustrate the range, is shown in Fig. 2a. The decline in walking speed with age was similar in men and women until the age of 60–65 years; starting at this age, the decline in women was greater than that in men. However, men had a steep decline at approximately 75 years of age, resulting in similar walking speeds for men and women at 80–85 years of age. Furthermore, performance in the repeated chair sit-to-stand test was similar between men and women until approximately 60 years of age and after 80 years of age, with women performing significantly worse than men from 60 to 80 years of age (Fig. 2b). Additionally, walking speed was significantly greater in men than in women for the age groups of 65–69 and 70–74 years, but not for the other age groups (Table 3). The mean time (in seconds) to complete the repeated chair sit-to-stand test is shown according to sex and age group in Table 4.

Table 2 Distribution of SPPB subtest scores according to sex and age, n = 7474
Table 3 SPPB subtest: Walking speed (m/s) by sex and age
Table 4 SPPB subtest: Repeated chair sit-to-stand test (seconds) by sex and age

Among men, 91, 78, 64, 47, and 36% had a total SPPB score of 12 points in the age groups of 40–49, 50–59, 60–69, 70–79, and 80+ years, respectively. The corresponding proportions for women were 88, 65, 44, 31, and 23%, respectively. No floor effects were observed for the total SPPB score. For the balance, walking speed, and repeated chair sit-to-stand tests, low scores (0–2 points) were observed in 14, 4, and 31%, respectively, of those in the oldest male age group (80–85 years). The corresponding values for women aged 80–85 years were 27, 16 and 60%, respectively. Ceiling effects were observed in the youngest age groups for all three subtests; however, no floor effects were observed.
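The 20% criterion for floor and ceiling effects can be expressed as a short Python sketch. The function name `floor_ceiling` and the example scores are hypothetical and are not study data; the default score range of 0 to 12 corresponds to the SPPB total score, and a range of 0 to 4 would apply to a subtest.

```python
def floor_ceiling(scores, min_score=0, max_score=12, threshold=0.20):
    """Share of scores at the lowest and highest possible value.

    An effect is flagged when the share exceeds the threshold
    (more than 20% of respondents, following the criterion in the text).
    """
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    floor_share = sum(1 for s in scores if s == min_score) / n
    ceiling_share = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical total scores for one age group (illustration only).
example_scores = [12, 12, 12, 11, 10, 9, 12, 8, 12, 11]
result = floor_ceiling(example_scores)
```

In this made-up example, half of the scores are at the maximum of 12 and none are at 0, so a ceiling effect is flagged and a floor effect is not. In practice, the check would be applied separately within each sex and age group.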

Sex- and age-specific percentile reference values for the SPPB total score and for the walking speed and repeated chair sit-to-stand subtests are presented in Tables 5, 6 and 7, respectively.

Table 5 Normative values for total SPPB score
Table 6 Normative values for the SPPB walking speed test (m/s)
Table 7 Normative values for the SPPB repeated chair sit-to-stand test (seconds)

Tables 5, 6, 7: Values for the percentiles were estimated from quantile regression analyses, while the mean (SD) was estimated from a linear regression model. In both regression settings, age was included as a restricted cubic spline with 4 knots at default knot locations (age: 44, 61, 68, and 79 years). Models were run separately for men and women. SD was estimated from the regression model and is the standard error of the forecast. P5, P10, P25, P50, P75, P90, P95; the 5th, 10th, 25th, 50th, 75th, 90th, and 95th percentile, respectively.

Discussion

To the best of our knowledge, the present study is the first to provide sex-specific reference values for the SPPB total score, as well as for the three subtests included in the SPPB, in community-dwelling adults aged at least 40 years. There was considerable variability in the SPPB total score among individuals aged 40+ years living at home. Furthermore, the present study results demonstrate that the main decline in physical function occurs in the mid-sixties, with a slightly earlier decline in women than in men.

An appropriate measuring instrument should have minimal floor and ceiling effects for the intended purpose and population [31]. The present study showed a considerable ceiling effect for the SPPB total and subtest scores, since more than 20% of the respondents achieved the highest possible score [29, 30]. Consistent with the present results, ceiling effects for physical performance measurement instruments in higher-functioning community-dwelling older adults aged ≥ 60 years have been observed by other researchers [31, 32]. Furthermore, the detection of ceiling effects in the youngest age groups for the SPPB, scored in terms of points, is not surprising [7, 33]. However, ceiling effects for physical performance measurement instruments not only hamper the detection of early balance deficits, but also prevent the detection of intervention-related changes over time in higher-functioning older adults [7, 32,33,34]. When a measure is used to capture change, high baseline scores and ceiling effects limit the ability to detect improvement between two assessments, posing a serious concern for type II errors in clinical trials. Even when a type II error does not occur, outcome measures with limited sensitivity to change may understate the overall magnitude of the intervention effect. This suggests that reporting the performance on the subtests of the SPPB as the time to complete a 3-m or 4-m walk and the time to rise from a chair five times in the repeated chair sit-to-stand test might be better for high-functioning adults aged 40–80 years.

The present study demonstrated a significant trend toward age-related functional decline, with some differences between men and women, consistent with previous studies [35]. Furthermore, a previous meta-analysis, which clearly highlighted an effect of age on walking speed [36], reported mean walking speeds stratified by sex and age group (in 10-year intervals) that correspond quite well to the present results. The present data on walking speed in men and women at different ages also correspond well to those in the review of reference values for standardized tests of walking speed by Salbach et al. [37] and the study by Callisaya et al. [38], which randomly selected participants from the Southern Tasmanian electoral roll (n = 223). Additionally, Thaweewannakij et al. [35] described reference values for the comfortable walking speed in elderly people, aged 60–90 years, who were well functioning and dwelling in the community. The speed varied from 0.88 to 1.48 m/s, which corresponds well to the present results, even though the walking distances differed between the studies. However, our participants performed better on the repeated chair sit-to-stand test than did the participants in the study by Thaweewannakij et al. [35], with times ranging from 12.9 s in the age group of 60–69 years to 17.1 s in women aged 80 or more (see Table 4). A walking speed < 0.6 m/s on the 4-m test has been used to identify persons at high risk of being hospitalized with deteriorating health and physical function [12]. All of our participants had a walking speed > 0.6 m/s, which differs from the rate of 8.1% reported in other studies [39]. As Da Câmara et al. [18] reported that 9 points on the SPPB discriminates between frail and non-frail older adults, approximately 20% of men and women aged 75 years or more in our study population could be classified as frail.

Strengths and limitations of the study

One strength of the present study is its use of a performance-based physical function assessment that was previously tested for validity and reliability [5]. Furthermore, before the study was initiated, the testers completed a training programme to ensure high inter-rater test reliability. Additionally, the current study has a high degree of generalizability as it recruited from the general population. However, the study focused only on community-dwelling older people, omitting those living in institutions, and it remains unclear whether the present findings are generalizable beyond Norway. Additionally, legal restrictions hamper detailed comparisons between participants and non-participants [28]. In general, studies of the two first waves (Tromsø 4 and 6) revealed differences in age and marital status between participants and non-participants; non-participants were younger and more likely to be single [28].

Conclusions

The present study is the first to provide comprehensive, up-to-date normative values for SPPB measures in community-dwelling individuals aged at least 40 years and living in Norway. Up-to-date population-specific normative values are essential in enabling clinicians to better evaluate patient performance relative to that of the general population of community-living older adults and to determine the appropriate intervention/management. Because of ceiling effects, the SPPB has limitations in the assessment of physical functioning across the full spectrum of community-dwelling adults aged 40+ years, and these limitations should be considered. Finally, we conclude that performance on the SPPB should be reported in terms of the total score, as well as the time to complete the repeated chair sit-to-stand test and the walking speed test. The present data may be used to interpret the results of studies evaluating and establishing appropriate treatment goals.