Background

Shoulder and elbow pain and injuries originating from athletic overuse are well described in sports such as baseball [1], swimming [2], and volleyball [3]. The physical demands placed on these joints in overhead sports may lead to chronic loading that affects athletic performance [4, 5]. The resulting symptoms might be detectable only during sport-specific training or competition, without deficiencies in activities of daily living [6]. Athletes appear to continue to train and compete despite upper arm symptoms [7, 8]. The diversity in how shoulder and elbow symptoms are perceived makes it challenging for both athletes and healthcare professionals to evaluate and monitor upper extremity health and performance.

Subjective evaluation of patients’ experiences is used to complement clinical outcomes or to serve as a primary measure when objective results cannot be obtained [9, 10]. Several patient-rated outcome measures for upper arm health were originally developed for the general population [11]. These tools have commonly been used for athlete assessment [12, 13]. However, questionnaires developed for general use might not be specific enough to detect minor and slowly evolving changes in athletic performance. This might lead to underestimating athletes’ functional deficiencies and therefore make them more vulnerable to subsequent injury [6, 12, 14].

The Kerlan Jobe Orthopaedic Clinic (KJOC) Shoulder and Elbow score is a sport-specific questionnaire developed to assess the upper arm health of overhead athletes. The KJOC score enables sensitive observation of subtle changes in athletes’ shoulder and elbow function and performance [6]. Detection of functional changes may aid in planning sports training, rehabilitation, and return to sports after injury. The KJOC score is a valid and reliable tool in English-speaking overhead athletes [6, 15, 16], and it has recently been validated in several other languages [17,18,19,20,21,22]. However, the validity, reliability, and responsiveness of a Finnish version of the KJOC score have not been previously reported. This study aimed to produce a Finnish version of the KJOC score and evaluate its psychometric properties. We hypothesized that the Finnish KJOC score would be a valid, reliable, and responsive tool for assessing shoulder and elbow function in Finnish-speaking overhead athletes.

Methods

Translation and cross-cultural adaptation

Before conducting the research, the developer of the original KJOC score was contacted to obtain permission to use the English-language questionnaire. Cross-cultural adaptation was performed following published guidelines [23]. Two independent translators, an informed and an uninformed professional, performed the translation from English to Finnish. The translations were merged, and the resulting synthesis was discussed by a research committee formed of specialists from health sciences, orthopaedics, physical therapy, and overhead sports coaching. The back-translation was executed by a native English speaker fluent in Finnish who was not familiar with the original questionnaire. Subsequently, the modified translation was piloted with ten overhead athletes and revised based on the observations gathered. The progression of the cross-cultural adaptation process was documented, and its description can be provided upon request.

Study population and recruitment

The study population was recruited through sports clubs from the Capital region of Finland and the Jyväskylä region in Central Finland. The inclusion criteria were age over 15 years, currently active status in overhead sports, and Finnish as first language. The exclusion criterion was a recent upper limb or other injury that prevented active participation in sports. A total of 118 athletes were recruited, and 114 were found eligible for the study. Four athletes were excluded because of missing values, missing written consent, a neurological condition that could have influenced the data of interest, or being under the age limit (14 years old). Included participants were athletes from five overhead sports (volleyball, swimming, tennis, gymnastics, Finnish baseball) competing at national or international level. The research was undertaken with ethical approval from the Human Sciences Ethics Committee of the University of Jyväskylä (147/13.00.04.00/2020). Participants provided written consent prior to participation, and the rights of the participants were protected throughout the study.

Data collection

Questionnaires were administered on three occasions during a period when all athletes were actively training and competing (Table 1). Pre-season and competition seasons were scheduled somewhat differently between sports during the study. To evaluate the validity of the Finnish KJOC score, baseline measurements were performed during September 2020, after the first wave of the Covid-19 pandemic had stabilised in Finland. Athletes were asked to fill in printed versions of the KJOC, Disabilities of the Arm, Shoulder and Hand (DASH), American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form (ASES), and RAND-36 questionnaires [24,25,26,27] to assess construct validity. Additionally, the FIT (frequency, intensity, time) Index of Kasari and questions related to the Covid-19 pandemic were included. The KJOC score was distributed again after two weeks to assess test–retest reliability, alongside a question about possible changes in physical health between the two time points. Responsiveness of the KJOC score was assessed eight months after baseline. Questionnaires were mainly returned by mail, with the remainder collected during athletes’ training.

Table 1 Data collection with used questionnaires at each timepoint

Questionnaires

The ten-item KJOC score consists of two sections of five questions each that ask about athletes’ perceptions of their shoulder and elbow function and performance. Respondents answer each item by placing a mark on a ten-centimetre Visual Analog Scale (VAS). The left end of the scale stands for zero points and the right end for ten points, with a higher score indicating better function of the upper arm. Item scores are measured with a ruler from the left end of the scale to the mark placed by the participant and recorded in centimetres to one decimal place. The overall score is the sum of all items, resulting in a score between 0 and 100 points, with 100 points standing for perfect upper extremity function [6].
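
The scoring rule described above can be illustrated with a short sketch. It assumes that each item has already been measured in centimetres from the left end of the scale; the function name `kjoc_total` and the example marks are hypothetical and are not study data.

```python
def kjoc_total(item_marks_cm):
    """Return the 0-100 KJOC total from ten VAS marks given in centimetres."""
    if len(item_marks_cm) != 10:
        raise ValueError("expected exactly ten item marks")
    total = 0.0
    for mark in item_marks_cm:
        if not 0.0 <= mark <= 10.0:
            raise ValueError(f"mark {mark} lies outside the 10 cm scale")
        total += round(mark, 1)  # items are recorded to one decimal place
    return round(total, 1)


# Hypothetical marks for one respondent (illustration only, not study data).
example_marks = [9.5, 8.0, 10.0, 7.2, 9.9, 10.0, 6.4, 8.8, 9.1, 10.0]
print(kjoc_total(example_marks))
```

Rejecting incomplete responses, rather than imputing them, is a choice made for this sketch only; the handling of missing items should follow the scoring instructions of the original questionnaire.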

ASES, DASH and RAND-36 were used as reference questionnaires to assess the construct validity of the Finnish KJOC score. ASES is used to assess daily shoulder disability and pain, with the overall score ranging between 0 and 100 and a higher result indicating better functionality [28]. The DASH questionnaire evaluates function of the entire upper arm and is scored between 0 and 100, with a lower score indicating better upper arm health. The optional sport module (DASH-SM), comprising four questions related to free-time activities, addresses upper arm function more comprehensively in physically active individuals [29]. The RAND-36 evaluates different domains of general quality of life with eight subscales: (1) physical functioning, (2) role limitations due to physical problems, (3) role limitations due to emotional problems, (4) energy/fatigue, (5) emotional well-being, (6) social functioning, (7) pain, and (8) general health. The overall score is calculated within 0–100 points, with a higher result describing better quality of life [30]. All reference questionnaires have been previously validated in Finnish.

Descriptive data

Alongside the KJOC score and reference questionnaires, participants were asked to answer questions regarding general physical activity level with the FIT Index of Kasari [31] (scale between 0 and 100, higher score describing higher physical activity). In addition, due to the outbreak of the Covid-19 pandemic during spring 2020, the influence of the pandemic on sports training was assessed with the following questions: How did the pandemic affect the amount of (1) leisure-time physical activity and (2) exercise within your sporting event, and (3) what was the trend of adjustments in training? The Covid-19 questions were asked again at the responsiveness timepoint.

Statistical methods

Construct validity of the KJOC score was tested to evaluate the ability of the instrument to measure the phenomenon it was created to measure [32]. Correlations between the KJOC score and reference questionnaires were determined with Pearson’s correlation coefficients and their corresponding p-values [33]. Moderate to strong correlations were hypothesized between the KJOC and the other upper arm questionnaires. In contrast, the correlation with RAND-36 was expected to be weak to moderate, since the instrument measures a divergent construct. Further, construct validity was assessed by measuring how the KJOC score discriminates respondents according to their self-reported category of upper arm function: (1) playing without any arm trouble, (2) playing, but with arm trouble, or (3) not playing due to arm trouble. An independent samples t-test was used to investigate the statistical significance of mean score differences between subgroups.
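
As an illustration of the correlation analysis, the following minimal sketch computes Pearson's correlation coefficient from paired scores. The analyses in this study were run in SPSS, so the function `pearson_r` and the example values are assumptions made for illustration only. Because DASH is scored in the opposite direction to KJOC, the sign of the coefficient is expected to be negative for that pair.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired scores (not study data): higher KJOC, lower DASH.
kjoc_scores = [95.0, 88.5, 72.0, 60.5, 91.0]
dash_scores = [1.7, 5.0, 15.8, 27.5, 3.3]
print(round(pearson_r(kjoc_scores, dash_scores), 3))
```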

Reliability was determined for single items and the overall KJOC score by the test–retest procedure [32]. The intraclass correlation coefficient (ICC) with corresponding 95% confidence intervals (CI) [34] was computed based on absolute agreement, using a two-way mixed-effects, average-measures model. Measurement error was determined with the standard error of measurement (SEM) using the formula SEM = SD × √(1 − ICC), where SD is the standard deviation [35]. Subsequently, minimal detectable change (MDC) was calculated as MDC = SEM × 1.96 × √2 [36]. A Bland–Altman plot with 95% limits of agreement was used to plot the difference between the test and retest scores against the mean of the two measurements [37]. Internal consistency of the KJOC score was determined to assess the mutual uniformity of the different sections of the score [32] by calculating Cronbach’s alpha coefficient [36].
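
The SEM and MDC formulas and the Bland–Altman limits of agreement can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the function names are hypothetical, the ICC is taken as a given input, and the difference is assumed to be computed as retest minus test.

```python
import math
import statistics


def sem(sd, icc):
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at the 95% level: SEM x 1.96 x sqrt(2)."""
    return sem_value * 1.96 * math.sqrt(2.0)


def bland_altman_limits(test, retest):
    """Mean difference and 95% limits of agreement (retest minus test)."""
    diffs = [b - a for a, b in zip(test, retest)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the paired differences
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


# Hypothetical inputs (not study data).
example_sem = sem(sd=10.0, icc=0.75)
print(example_sem, mdc95(example_sem))
print(bland_altman_limits([90.0, 80.0, 70.0], [92.0, 78.0, 71.0]))
```

Both SEM and MDC are expressed in the same units as the total score, so a change smaller than the MDC cannot be distinguished from measurement error at the 95% level.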

In the responsiveness analysis, it was determined whether the KJOC score detects physiological changes in upper extremity function over time [38]. The score difference was evaluated for those respondents who reported a change in their upper arm health between the baseline and responsiveness measurements. The change in upper arm health was assessed with the Global Rating of Change (GRC) scale. The Wilcoxon signed-rank test was used to test the statistical significance of the difference between the baseline and follow-up KJOC total scores. Furthermore, the standardized response mean (SRM) and effect size (ES) were computed. SRM was calculated as the difference between the baseline and follow-up mean scores divided by the SD of the difference. ES was calculated as the difference between the follow-up and baseline mean scores divided by the SD of the baseline measurement [38, 39].
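
The two responsiveness indices can be sketched as below. The sketch assumes paired baseline and follow-up scores and, as a convention chosen here, computes both indices as follow-up minus baseline, so that a decline yields a negative value. The function names and data are hypothetical.

```python
import statistics


def standardized_response_mean(baseline, follow_up):
    """Mean of the paired changes divided by the SD of those changes."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(changes)


def effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    mean_change = statistics.mean(follow_up) - statistics.mean(baseline)
    return mean_change / statistics.stdev(baseline)


# Hypothetical paired total scores (not study data).
baseline_scores = [90.0, 80.0, 85.0, 95.0]
follow_up_scores = [85.0, 78.0, 80.0, 88.0]
print(round(standardized_response_mean(baseline_scores, follow_up_scores), 3))
print(round(effect_size(baseline_scores, follow_up_scores), 3))
```

The two indices differ only in the denominator: SRM scales the change by its own variability, whereas ES scales it by the spread of the baseline scores.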

Floor and ceiling effects

Floor and ceiling effects were considered present if more than 15% of the study subjects scored either the lowest or the highest possible score on an item. If more than 25% of the score items showed a floor or ceiling effect, the whole score was concluded to present the phenomenon [40].
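
The two thresholds can be expressed as a small sketch. It assumes item scores on the 0 to 10 scale, with 10.0 as the ceiling and 0.0 as the floor; the function names and the example responses are hypothetical.

```python
def item_has_effect(item_scores, extreme, threshold=0.15):
    """True if more than `threshold` of respondents gave the extreme score."""
    hits = sum(1 for score in item_scores if score == extreme)
    return hits / len(item_scores) > threshold


def score_has_effect(items, extreme, item_threshold=0.15, score_threshold=0.25):
    """True if more than `score_threshold` of the items show the effect."""
    flagged = sum(
        1 for scores in items if item_has_effect(scores, extreme, item_threshold)
    )
    return flagged / len(items) > score_threshold


# Hypothetical responses to one item from ten athletes (not study data).
item_five = [10.0, 10.0, 5.0, 6.0, 7.0, 8.0, 9.0, 9.0, 9.0, 9.0]
print(item_has_effect(item_five, extreme=10.0))  # ceiling check
print(item_has_effect(item_five, extreme=0.0))   # floor check
```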

All statistical analyses were performed with IBM SPSS Statistics 24.0 for Windows software. Descriptive statistics were calculated and reported for all relevant measures and presented as means and standard deviations for continuous variables and as counts and percentages for other variables. The normality of variables was evaluated graphically and with the Shapiro–Wilk W test.

Results

Cross-cultural adaptation

Cross-cultural adaptation revealed minor cultural differences. The options describing athletes’ level of competition (professional major league, professional minor league, intercollegiate, high school) were adapted to match Finnish sporting culture classifications, with terms that may be translated as professional, semi-professional and recreational athlete. Further, the words game and playing were translated into Finnish to match the words competition and competing. These translations were chosen since the terms cover both team and individual sports, and they were not considered to alter the primary concept of the items in question. In KJOC item five, the expression traded or waived was removed, since trading or waiving athletes does not occur in Finnish sporting culture. The pilot test population considered the questionnaire straightforward and easy to use. The back-translation of the Finnish KJOC questionnaire is available in an additional file (see Additional file 1).

Study participants

Demographic and clinical characteristics of the participants are presented in Table 2.

Table 2 Athlete characteristics

Validity

The Finnish KJOC score showed a high correlation with DASH and moderate correlations with DASH-SM, ASES, and RAND-36 scores (Table 3). Furthermore, RAND-36 subscale correlations were negligible to low (r = 0.105–0.457), with one subscale correlation being statistically non-significant (role limitations due to emotional problems, p = 0.266). The subgroup analysis indicated that all correlations were higher for symptomatic than for asymptomatic athletes (Table 3).

Table 3 ASES, DASH, and RAND-36 score results, and Pearson’s correlation coefficients with the Finnish KJOC score

In cross-category comparisons, statistically significant differences in KJOC mean scores were detected between all subgroups. The mean score of symptomatic athletes (72.4 ± 19.4; 95% CI 63.1–81.7) was significantly lower than that of asymptomatic athletes (92.6 ± 6.6; 95% CI 91.3–94.0; p < 0.001). Similarly, participants who had lost time in training or competition within the past year because of arm trouble (83.1 ± 18.1 vs 91.2 ± 9.3; p = 0.003) or who had received care for an upper arm injury (84.4 ± 15.7 vs 90.9 ± 9.5; p = 0.01) scored lower than their counterparts.

Reliability

Internal consistency

The internal consistency of the ten items of the Finnish KJOC score was excellent (α = 0.92), indicating good homogeneity within the questionnaire.

Test–retest reliability

The test–retest data were collected after a median time interval of 16 days. The ICC of the total KJOC score was 0.77, with values ranging between 0.38 and 0.77 for single items (p < 0.001) (Table 4). SEM and MDC were 5.5 and 15.1, respectively, for the whole study population. The Bland–Altman plot showed a small mean difference of −0.22 and 95% limits of agreement ranging from −13.55 to 13.10 (Fig. 1).

Table 4 Test–retest reliability of the Finnish KJOC score for the total score and single items

Fig. 1 Bland–Altman plot describing the test–retest reliability

Floor and ceiling effects

Among symptomatic athletes, the Finnish KJOC score showed a ceiling effect in item five (47.4%), which asks about the athletes’ relations with team coaches and management. In contrast, a ceiling effect of the whole score was observed for asymptomatic athletes, with 23.2% to 61.1% giving the highest score for each item. No floor effect was detected in either symptomatic or asymptomatic athletes.

Responsiveness

The responsiveness data were collected after a median time interval of eight months. Twenty-four out of 38 respondents reported on the GRC scale a change in upper arm functional status, either for better or for worse, after the follow-up period. Changes in total KJOC scores were conflicting, since a significant decline in mean scores was detected regardless of the reported direction of change. SRM and ES values also supported the significance and magnitude of the mean score changes. No significant differences were detected in participant characteristics (Table 5).

Table 5 Responsiveness of the Finnish KJOC score

Discussion

This study reports the cross-cultural adaptation and validation of the KJOC score in the Finnish language. The questionnaire was shown to be a valid and reliable measure for evaluating the function and performance of the shoulder and elbow in Finnish-speaking overhead athletes.

The face and content validity of the original KJOC score were ensured throughout the standardized process of cross-cultural adaptation [23]. Only a few cultural and linguistic adjustments were necessary to produce a conceptually equivalent version of the original questionnaire. Athletes perceived the translation as easy to understand and complete.

The detected correlations with DASH, DASH-SM and ASES corresponded to the initial hypothesis of moderate to strong correlations, and the results were in line with previous studies [6, 17,18,19,20,21,22]. In the subgroup analysis, correlations were higher among symptomatic subjects. Asymptomatic athletes might not show symptoms on general scores but may instead report minor changes in upper arm performance on the KJOC, whereas symptomatic subjects score lower on all upper arm questionnaires. Although ASES does not detect elbow impairments, it was included in the study setting because of its previous use in athletes’ shoulder conditions [12, 13]. Hence, the detected correlation between KJOC and ASES relates to the KJOC score’s ability to detect shoulder symptoms. Previously, one study [22] has reported the divergent validity of the KJOC. In line with those results, a moderate correlation between the Finnish KJOC score and RAND-36 was detected, as hypothesized. Sport plays a considerable role in athletes’ daily life, and the moderate correlation may describe a link between perceptions regarding general health and physical performance.

Construct validity was supported by the KJOC score’s ability to stratify athletes by their self-reported upper arm functional status. The KJOC overall score showed a wider mean score difference between symptomatic and asymptomatic athletes than the other upper arm scores. In addition, symptomatic athletes’ total KJOC scores deviated more from the highest possible score than those of asymptomatic athletes. These observations are in line with the original idea behind the KJOC score, to function as a sensitive tool for distinguishing between overhead athletes with and without upper arm insufficiencies [6]. Overall, the correlations and discriminative ability suggest good construct validity of the Finnish KJOC score.

Internal consistency was excellent and in line with previous studies [17,18,19,20,21]. Compared to earlier publications, the ICC was lower (0.77 vs. 0.82–0.99) and, consequently, the measures of absolute reliability were higher (SEM 5.5 vs. 0.81–8.54; MDC 15.1 vs. 2.42–8.5) [6, 17, 19,20,21,22]. The variation in repeatability results might be due to the characteristics of the study subjects. In general, the low mean age (18.1 years), the high proportion of asymptomatic athletes (83.3%), and the athletes’ semi-professional level of competition (79.8%) might have increased variability in the test–retest data. Previously, it has been argued that older and higher-level athletes might give more precise answers related to their athletic performance [20]. Older and higher-level athletes might have a broader range of symptoms and observe themselves more closely, leading to differences in the accuracy of reporting.

The Finnish KJOC score showed an apparent ceiling effect in asymptomatic subjects, consistent with two earlier translations [19, 21]. These findings can be considered expected when the KJOC score is used in primarily healthy athletes. However, a ceiling effect could also indicate a weaker ability of the score to identify mild functional changes. Among the symptomatic subjects, a ceiling effect was observed only in item five (47.4%), which asks about the athletes’ relationship with club personnel. As this item does not measure shoulder or elbow function, symptomatic athletes may generally score high on it.

In addition to the validity and reliability of the novel Finnish KJOC score, we evaluated the responsiveness of the translation. Previously, the KJOC score’s responsiveness has been assessed in two studies evaluating the characteristics of the questionnaire before and after treatment, with average time intervals of 14 and 6 months [6, 17]. In the present study setting, recruitment of injured athletes was not possible, and the change in upper arm health status was determined by subjective rating at the eight-month follow-up time-point. Interestingly, not only athletes who reported their upper arm functional status as worse than at baseline, but also those who rated their function as better, had significantly lower KJOC mean scores at follow-up. According to previous publications [6, 17], the KJOC has been shown to be a responsive measure after injury and a period of treatment, but it remains unclear whether longitudinal assessment of change in function is reliable in actively competing athletes with continuously experienced subtle symptoms. Detection of score change might be more reliable in cases where physical condition has changed substantially between the data collection time-points. Before the Finnish KJOC score can be reliably used in the assessment of athlete recovery and long-term assessment of upper arm function and performance, the responsiveness of the score needs to be further studied.

This study has some limitations. The participants were recruited exclusively through sports club contacts, which led to an unequal distribution between asymptomatic (n = 95) and symptomatic (n = 19) athletes. Apart from swimming and volleyball, the number of athletes from the other overhead sports was also low. Further, since athletes were not asked about the location of upper arm symptoms, it remains unknown how many of the symptomatic subjects presented with shoulder or elbow insufficiency. Due to the recruitment methods, it also cannot be concluded how the Finnish KJOC score would function with severely injured athletes in a health care environment. Overall, the study was executed during the Covid-19 pandemic, and data were collected under changing national restrictions. Athletes reported changes in the quantity and quality of sports training, and it is reasonable to ask whether the collected KJOC results would have differed if measured during another period. Despite the unusual circumstances, the results regarding the psychometric properties of the Finnish KJOC score can still be considered reliable, since they describe the properties of the score rather than athlete function and performance. As a strength of this study, the size of the study population was considered sufficient according to literature recommendations [36] and previously published validation studies of the KJOC score.

Conclusions

The Finnish version of the KJOC score was found to be a valid and reliable questionnaire for measuring the self-reported function of the shoulder and elbow in overhead athletes. The findings of this study indicate that the Finnish KJOC score may be a useful tool in the evaluation of overhead athletes’ upper arm performance to identify possible impairments. The score may be applied to athlete evaluation in training environments. Further studies in different overhead sports with larger sample sizes are required to provide more comprehensive information regarding the validity and feasibility of the Finnish KJOC score. In addition, further studies regarding the responsiveness of the score are warranted.