SARC-F as a case-finding tool for sarcopenia according to the EWGSOP2. National validation and comparison with other diagnostic standards

Background Sarcopenia is a potentially reversible condition, which requires proper screening and diagnosis. Aims To validate a Polish version of the sarcopenia screening questionnaire (SARC-F) and assess its clinical performance. Methods Cross-sectional validation study in community-dwelling subjects ≥ 65 years of age. Diagnosis of sarcopenia was based on the 2018 2nd European Working Group on Sarcopenia in Older People (EWGSOP2) consensus. Hand grip and 4-m gait speed were measured, and the Polish version of SARC-F was administered. Results The mean (SD) age of 73 participants (21.9% men) was 77.8 (7.3) years. Seventeen participants (23.3%) fulfilled the EWGSOP2 criteria for sarcopenia, and 9 (12.3%) the criteria for severe sarcopenia. Fourteen (19.2%) participants fulfilled the SARC-F criteria for clinical suspicion of sarcopenia. The Cronbach’s alpha coefficient for internal consistency was 0.84. With EWGSOP2 sarcopenia as a gold standard, the sensitivity of SARC-F was 35.3% (95% CI 14.2–61.7, p = 0.33), and the specificity was 85.7% (95% CI 73.8–93.6, p < 0.0001). The corresponding positive and negative predictive values were 42.9% (p = 0.79) and 81.4% (p < 0.0001), respectively. The probability of a false-positive result was 14.3% (95% CI 6.4–26.2, p < 0.0001) and the probability of a false-negative result was 64.7% (95% CI 38.3–85.8, p = 0.33). Overall, the predictive power of SARC-F was low (c-statistic 0.64). Discussion SARC-F is currently recommended for sarcopenia case finding in the general population of older adults. However, its sensitivity is low, despite high specificity. Conclusions At present, SARC-F is better suited to ruling out sarcopenia than to case-finding. Further refinement of screening for sarcopenia with the use of SARC-F seems needed. Supplementary Information The online version contains supplementary material available at 10.1007/s40520-020-01782-y.


Introduction
Sarcopenia is a frequent, age-related muscle wasting that results in impaired skeletal muscle performance. Its prevalence has been estimated at 9–40% among older persons [1]. Sarcopenia has been linked to higher morbidity and mortality [2], an increased need for support in activities of daily living or institutionalization, and diminished quality of life [3,4]. Sarcopenia is also responsible for a high care-related burden, including the burden to the family and society [5,6]. Sarcopenia is amenable to therapy, mainly rehabilitation and proper nutrition [7,8]. However, protocols to assess sarcopenia are required, with sufficient performance in the case-finding and the confirmation stages of the diagnosis [9].
However, the diagnostic performance of SARC-F varies across studies [20]. Its reported sensitivity, for example, ranges from 3.9 to 95.4% [21,22].
Our aim was to standardize the translation and validate the clinical performance of the Polish version of the SARC-F (included in the online appendix) as a case-finding tool against an array of objective diagnostic criteria for sarcopenia, including the dual-energy X-ray absorptiometry (DXA)-based EWGSOP2.

Study
The study was approved by the Jagiellonian University Bioethics Committee (KBET no.: 1072.6120.71.2018). All participants gave written informed consent. The study was performed cross-sectionally between August 2018 and March 2019, in line with the call by the Special Interest Group (SIG) for Sarcopenia of the European Geriatric Medicine Society (EuGMS) for action to improve the screening and the diagnosis of sarcopenia [23].

SARC-F questionnaire and other sarcopenia screening modalities
We performed a two-step cultural and clinical validation of the SARC-F questionnaire in subjects ≥ 65 years of age. The validation protocol was based on the original validation procedure of the English SARC-F [23]. In brief, the questionnaire was translated from English to Polish by two Polish geriatric researchers fluent in general and medical English (KP, AS). The final version of the Polish SARC-F was back-translated to English by a Polish-English bilingual native speaker certified in interpreting between both languages. The back-translation was then sent to Prof. John Morley for formal approval. The threshold for sarcopenia case-finding was set at 4 points [10].
The original SARC-F questionnaire together with the Polish translation is included in Appendix 1 [10]. In addition to the SARC-F, we used the SARC-CalF, which adds calf circumference to the SARC-F components, as described by Barbosa-Silva et al. [24], and a point scoring system designed by Ishii et al. [25] that uses sex, age, grip strength, and calf circumference. We also used mid-arm circumference and calf circumference as separate screening instruments. For the Ishii index, calf circumference, and mid-arm circumference, we used population-specific cut-off values based on Youden's Index. To calculate the SARC-CalF, on a scale from 0 to 20, we used the Youden Index-based population-specific cut-off for calf circumference (< 32 cm) with a weight of 10. The cut-off for SARC-CalF was ≥ 11.
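The SARC-CalF scoring described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the SARC-F total (0 to 10) has already been computed. The function and parameter names are hypothetical, whereas the < 32 cm calf cut-off, the weight of 10, and the ≥ 11 threshold are the values stated in the text.

```python
def sarc_calf_score(sarc_f_total: int, calf_circumference_cm: float,
                    calf_cutoff_cm: float = 32.0, calf_weight: int = 10) -> int:
    """Return the SARC-CalF score on a 0-20 scale.

    The calf item contributes `calf_weight` points when the calf
    circumference is below the population-specific cut-off.
    """
    if not 0 <= sarc_f_total <= 10:
        raise ValueError("SARC-F total must be between 0 and 10")
    calf_points = calf_weight if calf_circumference_cm < calf_cutoff_cm else 0
    return sarc_f_total + calf_points


def sarc_calf_positive(score: int, threshold: int = 11) -> bool:
    """Flag suspected sarcopenia when the SARC-CalF score reaches the threshold."""
    return score >= threshold
```

With these rules, a participant with a SARC-F total of 3 and a calf circumference of 30.5 cm scores 13 and screens positive, while the same SARC-F total with a calf circumference of 34 cm scores 3 and screens negative.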

Study population
Information about the study was sent to the local seniors' organizations (e.g. universities of the third age, senior clubs) and posted in the outpatient and inpatient geriatric clinics of the University Hospital in Kraków. Consecutive community-dwelling people aged 65 years and more were encouraged to participate in the project.

Measures
Body composition was assessed with the Lunar iDXA dual-energy X-ray absorptiometry (DXA) equipment (GE Healthcare, Chicago, IL, USA). Appendicular lean body mass was calculated as the sum of the lean mass of both upper and lower limbs, and expressed and further analyzed as Appendicular Skeletal Muscle Mass (ASM, kg), ASM adjusted for the subject's height (ASM/h², kg/m²) and ASM adjusted for the subject's body mass index (BMI) (ASM/BMI) [6].
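The three muscle-mass indices follow directly from the limb lean masses, height, and weight. The sketch below is illustrative only: the function name and argument layout are hypothetical, and it assumes limb lean masses in kilograms and height in metres.

```python
def asm_indices(left_arm_kg: float, right_arm_kg: float,
                left_leg_kg: float, right_leg_kg: float,
                height_m: float, weight_kg: float) -> dict:
    """Compute ASM (kg), ASM/h^2 (kg/m^2) and ASM/BMI from limb lean masses."""
    if height_m <= 0 or weight_kg <= 0:
        raise ValueError("height and weight must be positive")
    asm = left_arm_kg + right_arm_kg + left_leg_kg + right_leg_kg
    bmi = weight_kg / height_m ** 2
    return {
        "ASM": asm,
        "ASM_h2": asm / height_m ** 2,
        "ASM_BMI": asm / bmi,
    }
```

For example, limb lean masses of 2, 2, 6 and 6 kg in a person 1.60 m tall weighing 64 kg give an ASM of 16 kg, an ASM/h² of 6.25 kg/m², and an ASM/BMI of 0.64.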
Muscle strength (kg) was assessed with a Saehan handheld dynamometer SH 5001 (Saehan Corporation, Masan, South Korea) according to the American Society of Hand Therapists (ASHT) recommendations [26]. Handgrip strength of both hands was assessed three times, and the highest value for either hand was recorded.
Gait speed (m/s) was measured three times, over a distance of 4 m with subjects walking at their usual speed, with walking aids if needed [27]. The first attempt was considered as an instructive example, and the highest value of the second or third trials was used.
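The selection rules for grip strength and gait speed can be written down compactly. The sketch below assumes that each gait trial was recorded as a time in seconds over the 4-m course (the text reports speeds only), and the function names are hypothetical.

```python
def best_grip_strength(left_trials_kg: list, right_trials_kg: list) -> float:
    """Highest handgrip value (kg) across all trials of either hand."""
    return max(max(left_trials_kg), max(right_trials_kg))


def usual_gait_speed(trial_times_s: list, distance_m: float = 4.0) -> float:
    """Gait speed (m/s) from timed walks.

    The first trial is treated as practice and discarded; the faster of
    the remaining trials is returned.
    """
    if len(trial_times_s) < 2:
        raise ValueError("need a practice trial and at least one scored trial")
    scored_speeds = [distance_m / t for t in trial_times_s[1:]]
    return max(scored_speeds)
```

For trial times of 6.0, 5.0 and 4.0 s, the first trial is ignored and the recorded gait speed is 1.0 m/s.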
Additionally, we performed the standard Timed Up and Go (TUG) test and the Short Physical Performance Battery (SPPB) [28,29]. The functional measurements were performed by a study physiotherapist (JC).
The subjects were interviewed by one of the two trained raters (KP, AG). We assessed cognitive performance with the Polish version of the Montreal Cognitive Assessment and depressive symptoms with the 15-item Geriatric Depression Scale [30,31], functional status with the Activities of Daily Living and Instrumental Activities of Daily Living scales, and physical activity with the Seven-Day Physical Activity Recall questionnaire [32,33]. Physical frailty was assessed according to the criteria by Fried et al. and Rockwood et al. [34,35], nutritional status with the Mini-Nutritional Assessment [36], and quality of life with the EuroQol-5D-5L questionnaire [37]. Sociodemographic characteristics and information about medications and comorbidities were collected. Based on the available data, the Charlson Index and total weekly energy expenditure were calculated for each subject [33,38].
Anthropometric measurements were performed in accordance with the Centers for Disease Control and Prevention (CDC) guidelines [39], and included height (cm), weight (kg), waist and hip circumference (cm), calf circumference (CC, cm) and mid-arm circumference (MAC, cm). Body Mass Index (kg/m²) and Waist-Hip Ratio (WHR) were calculated from these measurements.

Assessment of sarcopenia
Sarcopenia was diagnosed according to four definitions: the European Working Group on Sarcopenia in Older People 2 (EWGSOP2, 2018) consensus (all sarcopenia, not limited to severe sarcopenia) [6], the Foundation for the National Institutes of Health (FNIH, 2014) criteria, based on weakness and low muscle mass [40], the International Working Group on Sarcopenia (IWGS, 2011) criteria [41], and the Society of Sarcopenia, Cachexia and Wasting Disorders (SSCWD, 2011) criteria [42]. For the purpose of the presented analysis, we used ASM adjusted for height (ASM/h²) values of ≤ 6.0 kg/m² for women and ≤ 7.0 kg/m² for men, as proposed by the EWGSOP2 consensus paper [6]. The operational criteria used for sarcopenia diagnosis are summarized in Appendix 2.
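As an illustration of how the EWGSOP2 definition can be operationalized, the sketch below applies the ASM/h² cut-offs quoted above and then stages sarcopenia from three binary findings. The staging order (low strength, then low muscle quantity, then low physical performance) is taken from the published EWGSOP2 consensus rather than from Appendix 2, which is not reproduced here. The strength and performance flags are passed in by the caller because their cut-offs are not stated in this section. All names are hypothetical.

```python
# ASM/h^2 cut-offs (kg/m^2) as quoted in the text; values at or below count as low.
ASM_H2_CUTOFFS = {"female": 6.0, "male": 7.0}


def low_muscle_mass(asm_h2: float, sex: str) -> bool:
    """True when ASM/h^2 is at or below the sex-specific cut-off."""
    return asm_h2 <= ASM_H2_CUTOFFS[sex]


def ewgsop2_stage(low_strength: bool, low_mass: bool, low_performance: bool) -> str:
    """Stage sarcopenia from three binary findings (EWGSOP2-style logic)."""
    if not low_strength:
        return "no sarcopenia"
    if not low_mass:
        return "probable sarcopenia"
    return "severe sarcopenia" if low_performance else "sarcopenia"
```

In the analysis reported here, "sarcopenia" and "severe sarcopenia" would both count as EWGSOP2 sarcopenia, since the definition used was not limited to severe cases.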

Statistical analyses
The data management and the statistical analyses were performed with SAS 9.4 (SAS Institute Inc., Cary, NC, USA). Continuous variables were compared with the standard normal Z-test or Wilcoxon's test, for normally and non-normally distributed variables, respectively. Proportions were compared with the chi-square test. To assess the internal consistency of the Polish version of the SARC-F inventory, we calculated Cronbach's alpha coefficient. Further, for each objective gold-standard diagnosis of sarcopenia as a binary outcome, with SARC-F as an ordinal explanatory variable on a scale from zero to ten, we fitted a logistic regression model, from which we obtained the Receiver Operating Characteristic (ROC) curve with the Area Under the Curve (AUC) as a measure of diagnostic performance. Using the approach described by Youden et al. [43], we obtained the population-specific cut-off values for SARC-F. Further, based on the cut-off of 4 proposed by the EWGSOP2 and on the calculated population-specific cut-off, we calculated the sensitivity, specificity, positive predictive value, negative predictive value and accuracy, with exact 95% Confidence Intervals. To put the performance of SARC-F in a wider context, we repeated the procedure for alternative case-finding tests described in the literature.
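When a logistic model contains a single ordinal predictor with a positive coefficient, the fitted probabilities are a monotone function of the score, so the AUC of the model equals the concordance (c-statistic) computed directly on the raw score. The Python sketch below illustrates that calculation. It is not the SAS code used in the study, and the function name and example data are hypothetical.

```python
def c_statistic(case_scores: list, control_scores: list) -> float:
    """Concordance probability for a single score.

    Counts, over all case-control pairs, how often the case has the higher
    score; ties count as one half. This equals the area under the ROC curve.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Toy data: SARC-F totals of three cases and four controls (not study data).
print(c_statistic([4, 6, 2], [1, 2, 3, 0]))
```

A value of 0.5 indicates no discrimination and 1.0 perfect discrimination.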

Results
The mean (SD) age of 73 patients (78.1% women) was 77.8 (7.3) years. Of the entire group, 17 persons had sarcopenia based on the EWGSOP2 criteria, while 14 participants fulfilled the criteria for SARC-F-defined sarcopenia. Table 1 contains the characteristics of the study group. Overall, the included patients were multimorbid and taking multiple medications (median number of OTC preparations 2, median number of prescription preparations 7.5) irrespective of their sarcopenia status. Cognitive impairment was present in 61.1%, malnutrition or the risk of malnutrition in 32%, and frailty based on the Fried criteria in 29.2% of the participants. Patients affected by sarcopenia were older, had lower educational status, and had more diseases. They had lower BMI, including lower muscle and lower fat mass, lower mid-arm circumference and lower calf circumference. Patients with sarcopenia were, in general, characterized by lower self-reported physical activity and worse values for physical performance and muscle strength indices (Table 1).

Translation and cultural validation. Intra- and inter-rater reproducibility
First, in an initial group of 10 persons of wide educational range (age 79.7 (9.3) years, 50% women), we assessed the ability of the participants to comprehend the questions correctly. We did that across different educational strata and across genders. Further, in another group of 20 participants, we assessed the inter-rater and intra-rater agreement for each of the five items of the SARC-F test. Because the answers were classified into three levels, it was not possible to use McNemar's test. Instead, to check the degree of agreement, we used the simple, unadjusted kappa statistic. Overall, we found that the inter-rater agreement was high (kappa statistics ranging between 0.85 and 1.0) and that the intra-rater reproducibility of the questions was good (kappa statistics ranging between 0.65 and 1.0) (Appendix 3).
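The simple, unweighted kappa statistic compares the observed proportion of identical ratings with the proportion expected by chance from each rater's marginal frequencies. The sketch below is a minimal Python illustration for one SARC-F item scored 0, 1 or 2. The function name and the ratings are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a: list, ratings_b: list) -> float:
    """Unweighted Cohen's kappa for two sets of categorical ratings."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of the two raters' marginal proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical answers to one item from two raters for ten participants.
rater_1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 0]
rater_2 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
print(round(cohen_kappa(rater_1, rater_2), 2))
```

In this toy example the raters agree on 9 of 10 answers, chance agreement is 0.33, and kappa is (0.90 − 0.33) / (1 − 0.33), or about 0.85.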
The Cronbach's alpha test for internal consistency of questions (final study-group) was 0.82 for ability to lift and carry 10 lb, 0.83 for past year's history of falls, 0.76 for chair to bed transfer, 0.76 for climbing 10 stairs and 0.74 for walking across a room. Overall, the Cronbach's alpha was 0.82.
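Cronbach's alpha is computed from the item variances and the variance of the total score: alpha = k/(k − 1) × (1 − Σ item variances / variance of the total), where k is the number of items. The sketch below is a hedged Python illustration, not the SAS procedure used in the study. The helper `alpha_if_item_deleted` reflects an assumption that the per-item values reported above are alphas recomputed with the named item removed. The text does not state this explicitly.

```python
from statistics import variance


def cronbach_alpha(rows: list) -> float:
    """Cronbach's alpha; `rows` holds one list of k item scores per respondent."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variance_sum = sum(variance(column) for column in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total score has zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def alpha_if_item_deleted(rows: list) -> list:
    """Alpha recomputed with each item left out in turn (assumed interpretation)."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != i] for row in rows])
        for i in range(k)
    ]
```

Under that assumed reading, an item whose removal raises alpha above the overall value contributes relatively little to the internal consistency of the scale.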

Measures of clinical usefulness
To assess the clinical usefulness of SARC-F in detecting cases of sarcopenia, we used the following definitions of sarcopenia as standards: EWGSOP2, IWGS, FNIH, and SSCWD. Against EWGSOP2, the sensitivity and specificity of SARC-F were 35.3% and 85.7%, respectively. The results for all four definitions are given in Table 2.
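The measures of clinical usefulness all derive from a 2 × 2 table of SARC-F result against the reference diagnosis. The Python sketch below computes them together with an exact (Clopper-Pearson) confidence interval obtained by bisection on the binomial distribution. The counts used in the example (6 true positives, 11 false negatives, 8 false positives, 48 true negatives) are not quoted from Table 2. They are inferred from the reported totals (73 participants, 17 with EWGSOP2 sarcopenia, 14 SARC-F positive) and percentages, and should be read as an assumption. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact two-sided confidence interval for the proportion k/n."""

    def solve(successes: int, target: float) -> float:
        # binom_cdf decreases as p grows, so bisection on [0, 1] finds the root.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(successes, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(k - 1, 1 - alpha / 2)
    upper = 1.0 if k == n else solve(k, alpha / 2)
    return lower, upper


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV, NPV and accuracy from a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Counts inferred from the reported percentages (assumption, not from Table 2).
metrics = diagnostic_metrics(tp=6, fn=11, fp=8, tn=48)
sensitivity_ci = clopper_pearson(6, 17)
```

With these inferred counts the sketch reproduces the reported sensitivity (35.3%), specificity (85.7%), PPV (42.9%) and NPV (81.4%), and the interval for sensitivity is close to the reported 14.2–61.7%.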

Sample-specific cut-off values for SARC-F
Based on the ROC results for each of the sarcopenia definitions used, we calculated sample-specific cut-off values of SARC-F. The Youden method-based cut-off with each sarcopenia standard as comparator was ≥ 5, except for FNIH, where it was ≥ 2. With FNIH as comparator, the sensitivity of the cut-off thus obtained was 60.0%, the specificity was 61.9%, the PPV was 20.0%, and the NPV was 90.7%. With EWGSOP2 as comparator, the sensitivity of this cut-off was 35.3%, the specificity was 89.3%, the PPV was 50.0%, and the NPV was 82.0%.
The corresponding values for the remaining comparator definitions of sarcopenia were not materially altered in comparison to the standard SARC-F cut-off. The details are given in Table 2.
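The Youden method selects the cut-off that maximizes J = sensitivity + specificity − 1. Using the values reported for EWGSOP2, J is 0.353 + 0.857 − 1 = 0.210 for the standard cut-off of ≥ 4 and 0.353 + 0.893 − 1 = 0.246 for ≥ 5, which is consistent with ≥ 5 being selected. The Python sketch below shows the search over candidate cut-offs. The function name and the toy data are hypothetical and are not study data.

```python
def youden_cutoff(scores: list, has_disease: list, candidates=range(1, 11)):
    """Return (cut-off, J) maximizing Youden's J for the rule score >= cut-off.

    `has_disease` holds one boolean per participant. On ties, the lowest
    cut-off reaching the maximum is kept.
    """
    n_cases = sum(has_disease)
    n_controls = len(has_disease) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    best_cutoff, best_j = None, float("-inf")
    for cutoff in candidates:
        tp = sum(s >= cutoff and d for s, d in zip(scores, has_disease))
        tn = sum(s < cutoff and not d for s, d in zip(scores, has_disease))
        j = tp / n_cases + tn / n_controls - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy SARC-F totals and reference diagnoses (not study data).
toy_scores = [0, 1, 2, 3, 5, 6, 7, 4]
toy_status = [False, False, False, False, True, True, True, False]
print(youden_cutoff(toy_scores, toy_status))
```

In the toy data, a cut-off of ≥ 5 separates cases from controls perfectly, so it is returned with J equal to 1.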

Other screening tools for sarcopenia
To put the clinical validity of SARC-F in a broader context, we analyzed the clinical validity of other screening modalities for sarcopenia.
With the exception of mid-arm circumference alone against FNIH as standard, all additionally tested screening criteria demonstrated a numerically better c-statistic and sensitivity compared to SARC-F. The results for specificity, PPV, NPV, and accuracy varied and are presented in Table 2.

Discussion
We performed a two-step cultural and clinical validation of the Polish translation of the SARC-F questionnaire. We did that in community-dwelling older persons, against four commonly used definitions of sarcopenia. We used whole-body DXA scans to assess muscle mass.
The SARC-F questionnaire was reproducible both inter-rater (all kappa ≥ 0.85) and intra-rater (all kappa ≥ 0.65). The sensitivity and specificity of SARC-F against the EWGSOP2 were 35.3% and 85.7%, respectively. We used the Youden method to obtain a population-specific cut-off for SARC-F. This did not materially improve the estimates of clinical validity.
In a recent Polish study of 67 community-dwelling persons ≥ 65 years of age, the sensitivity of SARC-F was 92.9% and the specificity 98.1% [47]. This contrasts with the results of another, similar Polish study, where the sensitivity of SARC-F was 41.2% and the specificity 88.0% [46]. Both studies used bio-impedance weighing-scale measurements; however, the former was performed in a slightly younger population (69.5 ± 4.0 years vs. 74.5 ± 6.9 years), which to some extent may have influenced the results. Based on DXA assessment, we show the sensitivity to be even lower, with comparable specificity. A German validation study performed in 117 older subjects showed that the internal consistency of the German version of SARC-F was acceptable (Cronbach's alpha = 0.67), with intra-rater repeatability of 0.90 and inter-rater repeatability of 0.93. The authors estimated the sensitivity of SARC-F at 63%, the specificity at 47%, and the c-statistic at 0.58 [48]. Our estimates of the inter-rater and intra-rater repeatability for test components were > 0.85 and > 0.65, respectively. The overall Cronbach's alpha was 0.82, the sensitivity 35%, the specificity 86%, and the c-statistic 0.64.
A number of studies have been performed in persons of varied background, including ethnicity, age and pathology, yielding a wide range of estimates of sensitivity, specificity, accuracy, and the ROC for SARC-F [20].
The largest study thus far, performed in 4000 older individuals from three populations of varied cultural background, demonstrated low sensitivity but high specificity of SARC-F (< 10% and > 94%, respectively, in gender-specific analyses). The authors of that report concluded that, despite the low sensitivity and thanks to the high specificity, SARC-F may be used as a screening tool for sarcopenia at the community level [49]. SARC-F has also been employed in a range of pathologic settings. In heart failure patients, the sensitivity and specificity were 52.5% and 96.2%, respectively [44]. In hip fracture patients, the sensitivity was 95% and the specificity 57% [20], and in older orthopedic patients the sensitivity was 47.4% and the specificity 68.4% [50].
Disparate results for the psychometric characteristics of the SARC-F screening questionnaire across sarcopenia studies might be due to the varying diagnostic modalities used. As shown by Kim et al. in their study of 2099 community-dwelling older adults from the nationwide Korean Frailty and Aging Cohort Study (KFACS), sarcopenia prevalence varied from 7.9%, when the chair stand test was used as the measure of muscle strength and ASM/height² as the measure of muscle mass, to 18.4%, when handgrip strength and/or the chair stand test and ASM/height² were used [51].
In our study, we checked the diagnostic validity of the SARC-F against four definitions of sarcopenia (EWGSOP2, FNIH, IWGS, SSCWD expert guidelines). With grip strength as a measure of muscle strength, gait speed as a proxy of physical performance, and appendicular lean body mass measured with whole-body DXA scans, we showed an acceptable accuracy of the SARC-F for finding sarcopenia according to all the definitions tested. Additionally, we calculated Youden's J statistic and constructed the population-specific cut-off values for SARC-F against the EWGSOP2, FNIH, IWGS and SSCWD diagnostic criteria. We showed that the sensitivity of SARC-F ranged from 30 to 50% and the specificity from 83 to 87%, results by and large in line with previously published data [18]. When setting the Polish population-specific cut-off values for the SARC-F questionnaire, we obtained better diagnostic properties by adjusting the threshold for suspected sarcopenia to five points for the EWGSOP2, IWGS and SSCWD definitions, and to two points for the FNIH definition.

The clinical performance of other screening tests
We found the SARC-F tool to be fairly specific but with low sensitivity. To put that in a broader context, we checked the clinical validity of other sarcopenia screening tools with our study population-specific cut-offs. The tools included an index designed by Ishii et al. [25], calf circumference alone, mid-arm circumference alone, SARC-CalF (an index based on SARC-F that incorporates calf circumference) [24], and the SARC-F with the study population-specific cut-off values. We found that for most of those tools the sensitivity was numerically better than for SARC-F. We also noted that very simple measures, such as calf circumference or mid-arm circumference, had the best clinical performance.

Limitations and strengths
Our study needs to be considered in the context of its limitations. Our sample was moderate in size. However, its size was in line with what has been published thus far. We tested the SARC-F against an array of standards, where for the quantification of the muscle mass we used DXA. This may be an advantage over the studies that had used a bio-impedance based assessment of muscles.

Conclusions and implications
We present a validated Polish translation of the SARC-F questionnaire. Although some other simple measures, such as mid-arm circumference or calf circumference, are of at least comparable value, SARC-F is more versatile, as it can be self-administered, assessed during a telephone interview, or used in subjects of varying body build, including body build affected by pathologies such as heart failure, liver failure or hypoalbuminemia. Our results indicate that its performance is better in ruling sarcopenia out than in finding cases.

Conflict of interest None.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.