Introduction

Joint hypermobility, the movement of multiple joints beyond what is considered typical range of motion, is a commonly reported physical trait of childhood [1]. The reported prevalence of joint hypermobility in children varies markedly across studies, a variation hypothesised to be related to age [1], pubertal status [2], sex [3] and ethnicity [4]. The most commonly used screening tool to identify generalised joint hypermobility is the Beighton score, which has acceptable reliability [5]. The Beighton score is quick to perform and is well known internationally. This allows for ease of use as a screening tool in global settings, despite the tool’s known limitations of being upper limb dominant and including combined joint movements [6]. The Beighton score is a continuous scale out of 9; however, it is commonly accepted that a cut-off score serves as a dichotomous indicator of whether a child has generalised joint hypermobility or not.
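For readers unfamiliar with how the score is tallied, the following is a minimal sketch in Python. It assumes the standard nine-point Beighton protocol (bilateral fifth finger, thumb, elbow and knee manoeuvres plus forward trunk flexion), which is not itemised in this article. The class and function names are hypothetical, and the cut-off is deliberately left as a parameter rather than fixed.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class BeightonExam:
    """One child's result on each of the nine manoeuvres (True = positive).

    Item list is an assumption based on the standard Beighton protocol.
    """
    little_finger_left: bool   # passive fifth finger extension beyond 90 degrees
    little_finger_right: bool
    thumb_left: bool           # thumb passively opposed to the forearm
    thumb_right: bool
    elbow_left: bool           # elbow hyperextension beyond 10 degrees
    elbow_right: bool
    knee_left: bool            # knee hyperextension beyond 10 degrees
    knee_right: bool
    trunk_flexion: bool        # palms flat on the floor with knees straight


def beighton_score(exam: BeightonExam) -> int:
    """Count the positive manoeuvres, giving a total between 0 and 9."""
    return sum(astuple(exam))


def meets_cutoff(exam: BeightonExam, cutoff: int) -> bool:
    """Dichotomise the score at a chosen cut-off (for example 4, 5, 6 or 7)."""
    if not 1 <= cutoff <= 9:
        raise ValueError("cutoff must be between 1 and 9")
    return beighton_score(exam) >= cutoff


# Hypothetical child who is positive on every upper limb item only.
upper_limb_only = BeightonExam(True, True, True, True, True, True,
                               False, False, False)
upper_limb_score = beighton_score(upper_limb_only)  # 6 of the 9 points
```

The hypothetical example illustrates the upper limb dominance noted above: six of the nine available points come from upper limb manoeuvres alone.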

The accepted standard within paediatric orthopaedics health care identifies children’s measurable physical traits falling outside of two standard deviations from the population mean, presenting as outside of typical variance [7]. This identifies children in the upper and lower 2.5% of the population as presenting atypically from the remaining 95% of the population. Defining typical variance with different screening or assessment tools allows clinicians to determine atypical traits and guides the most appropriate clinical course of care when required [7].
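The link between a two standard deviation threshold and the upper 2.5% of a population can be checked numerically. The sketch below assumes an idealised normal distribution, which is only an approximation for a bounded score such as the Beighton; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Proportion of an idealised normal population lying more than 2 SD above the mean.
upper_tail_at_2sd = 1 - standard_normal.cdf(2.0)

# Number of standard deviations that cuts off exactly the upper 2.5%.
z_for_upper_2_5_percent = standard_normal.inv_cdf(0.975)

print(f"Upper tail beyond 2 SD: {upper_tail_at_2sd:.4f}")
print(f"z cutting off the upper 2.5%: {z_for_upper_2_5_percent:.2f}")
```

Under this assumption the tail beyond exactly two standard deviations is slightly under 2.5%, and the 2.5% tail corresponds to about 1.96 standard deviations, which is why the two figures are conventionally treated as equivalent.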

The current international consensus on the appropriate Beighton cut-off score for defining whether a child is hypermobile is based upon expert clinical opinion, rather than these accepted standards [8, 9]. To fit within accepted clinical conventions, a cut-off score indicating generalised joint hypermobility should instead be positioned at two standard deviations above the population mean. This provides a robust method of identifying the Beighton score at which a child presents with generalised joint hypermobility [10]. In the absence of comprehensive normative data using this approach, the recent diagnostic framework for paediatric hypermobility disorders recommended a cut-off score indicating generalised joint hypermobility at 6/9 for all children and adolescents up to 18 years of age and prior to skeletal maturity [8].

Therefore, this present study aimed to determine the prevalence of Beighton scores of children worldwide to provide a recommendation for establishing the Beighton score cut-off to identify generalised joint hypermobility in children. Where possible, this present study aimed to consider the impact of age, sex, pubertal status and ethnicity on the prevalence of generalised joint hypermobility in children and adolescents.

Methodology

Data were collated through a systematic review performed and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines checklist (Supplementary File 1) [11]. The review protocol was prospectively registered on PROSPERO (CRD42021248465). The study question was developed using the PICO (Population, Intervention, Comparison and Outcomes) model [12]. The review keywords were targeted to the purpose of this review. The search terms included Beighton Score/scale, hypermobility, Ehlers Danlos syndrome, children and adolescents (Supplementary File 2: Search strategies).

An electronic search was performed in AMED (Allied and Complementary Medicine), OVID Medline, Embase and CINAHL from inception to the 18th of April, 2024, by the primary author. Covidence systematic review software (Veritas Health Innovation) was utilised to screen and manage articles. Titles and abstracts were screened independently by two team members (CMW and JJW) against the inclusion and exclusion criteria, and conflicts were resolved by a third team member (VP). Articles were included only when describing Beighton scores of children up to and including 18 years from the general population. Articles were excluded if the full text was not in English, if children described musculoskeletal pain, if the population was recruited within a health care service or clinic assessing or treating conditions relating to hypermobility, or if the population came from a cohort of children where hypermobility may lead to successful sporting participation (e.g. dancing or gymnastics). Where data were grouped with young people aged 19 or above, we included data where the reported cohort mean age was 18 years or less, or, where data for participants aged 19 years or older were reported separately, we removed those data from the analysis. If we were unable to separate out the data of participants older than 18 years, we excluded all data from that study. Where a study of children with known conditions involving joint hypermobility also reported a comparable community-based comparison or control group, data from only the control group were extracted.

All full-text included articles were independently screened by two reviewers (CMW and JJW), and again, any disagreements were discussed and resolved with a third reviewer (VP or MS). The authors of studies were contacted about unpublished data to support inclusion when the article provided limited information about the cohort, or Beighton score groupings were limited. If these authors did not respond or were unable to provide meaningful data to support inclusion, articles were excluded. We used forward and backward chaining methodology to check references and citations of included articles to ensure as many articles were included as possible.

Data extraction and quality assessment

A custom data extraction template was developed in Microsoft Excel (Microsoft Corporation, 2023). One reviewer independently extracted article data (CMW or JJW), and this was cross-checked by a second reviewer (CMW or JJW). Where there were any differences, these were highlighted and discussed with the additional third reviewer for confirmation (VP). Data extraction from each article included study characteristics such as the country of publication, total number of participants, summary data about the age and sex of participants, Beighton scores and any cut-off score at which authors deemed children as having joint hypermobility, profession of assessors and how many children were rated at the corresponding Beighton score/s. Due to variability in how ethnicity was reported, we used world region as a proxy for ethnicity. We also pre-planned to extract pubertal status according to age from studies; however, this was also inconsistently reported; therefore, no pubertal status data were retained in the final extraction template.

We planned to assess the risk of bias using the ROBINS-I tool. We deviated from the registered protocol and used the JBI Critical Appraisal Checklist for Studies Reporting Prevalence Data [13]. This deviation was due to the checklist being more appropriate to describe the risks of bias in the included study designs. Each of the quality indicators was scored as Yes, No, Unclear or Not applicable. The checklist questions addressed the study’s sample frame, appropriateness of sample, sample size, setting description, data analysis, description of condition (or absence of), measurements, statistical analysis, and response rate.

Data analysis

Microsoft Excel (Microsoft Corporation, 2022) was used to collate data extracted from the included manuscripts in terms of author, year of publication, age (mean/median, standard deviation (SD) or interquartile range (IQR), range), country, population (total, sex-specific totals) and Beighton scores for the population (number of positives/negatives relating to author-determined cut-off points for joint hypermobility, profession of assessor, total at each Beighton score and sex-specific totals at each Beighton score). Where studies only described a pooled score population (e.g. Beighton score ≥ 4), these studies were grouped accordingly within only that cut-off score group. General and sex-specific prevalence were calculated for Beighton scores ≥ 4, ≥ 5, ≥ 6 and ≥ 7 with corresponding standard errors for each world region (i.e. East Europe/Russia, Middle East, North America, Oceania, South America, South/South East Asia and Western Europe). Statistical heterogeneity was quantified using the I² statistic and formed the basis for the application of fixed or random effect models. I² was > 50% in all analyses; hence, only random effect models were constructed. The PRISMA flow diagram was made using PRISMA2020 (R Package and Shiny app). All effects models were constructed on general and sex-specific prevalence, with sex sub-analysis. Differences between world regions and age groups were assessed by Mann-Whitney U testing. All analyses were performed in Python, and the base code was made available on https://github.com/HR-Data-Supported-Healthcare. p values < 0.05 were considered statistically significant.
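The pooling steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes untransformed proportions with a binomial standard error and the DerSimonian-Laird estimator of between-study variance, neither of which is specified in the text, and the function names and study counts are hypothetical.

```python
import math


def prevalence_with_se(positives: int, total: int) -> tuple[float, float]:
    """Prevalence and its binomial standard error for one study."""
    if total <= 0 or not 0 < positives < total:
        # A prevalence of exactly 0 or 1 gives zero variance, which
        # inverse-variance weighting cannot handle without a correction.
        raise ValueError("this sketch needs 0 < positives < total")
    p = positives / total
    return p, math.sqrt(p * (1 - p) / total)


def pool_prevalence(studies: list[tuple[int, int]]) -> dict[str, float]:
    """Pool (positives, total) pairs; report I-squared and a random effect estimate."""
    if len(studies) < 2:
        raise ValueError("at least two studies are needed")
    estimates = [prevalence_with_se(x, n) for x, n in studies]
    props = [p for p, _ in estimates]
    variances = [se ** 2 for _, se in estimates]

    # Fixed effect (inverse-variance) estimate, needed for Cochran's Q.
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * p for w, p in zip(weights, props)) / sum_w

    # Cochran's Q and the I-squared heterogeneity statistic (as a percentage).
    q = sum(w * (p - fixed) ** 2 for w, p in zip(weights, props))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random effect weights add tau2 to each within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * p for w, p in zip(re_weights, props)) / sum(re_weights)
    pooled_se = math.sqrt(1 / sum(re_weights))
    return {"pooled": pooled, "se": pooled_se, "i2": i2, "tau2": tau2}


# Hypothetical counts of children at or above a cut-off in three studies.
example = pool_prevalence([(30, 100), (120, 300), (45, 250)])
print(f"I2 = {example['i2']:.1f}%, pooled prevalence = {example['pooled']:.3f}")
```

When the estimated between-study variance is zero, the random effect weights reduce to the fixed effect weights, so a single routine of this kind can cover both model choices.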

Results

A total of 523 articles were identified from searching, and one article was identified through hand searching (Supplementary File 3). There were 125 articles assessed for eligibility. Of these 125, 28 met the broader inclusion criteria, but we were unable to extract or obtain data in a useable form for meta-analysis and they were therefore excluded. A further 60 were excluded for other reasons outlined in Supplementary File 3. This left 37 articles, reporting on the prevalence or incidence of joint hypermobility at cut-off scores in 28,868 children, for inclusion in the meta-analysis (Table 1). Where data met the assumptions of normality, there were no statistically significant differences between the mean prevalence and any world region for any cut-off score groups (p > 0.05).

Table 1 Included articles and key demographics of participants where there was a Beighton score of 4 or greater provided

Risk of bias in studies

Table 2 provides the risk of bias analysis for all studies included. The major concern, rated unclear or no, was the absence of descriptions of assessors’ qualifications or of the training assessors completed in order to assess and record the Beighton score (Tables 1 and 2). This occurred in 19 studies [4, 14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31], potentially impacting intra- or inter-rater reliability. Additionally, 14 studies reported the response rate [1, 3, 4, 14, 18, 22, 24, 27, 29, 32,33,34,35,36], while the other studies were unclear or had a low response rate without adequately described overall responses. The strength of the vast majority of studies was the validity of the measures researchers employed to collect the data, in addition to their description of the participants, which enabled inclusion as data representative of the community.

Table 2 Risk of Bias

Meta-analysis

Data from studies reporting children having generalised joint hypermobility at a Beighton score of ≥ 4

Prevalence data were extracted from 34 of the 37 studies with data of children rated as having a Beighton score of ≥ 4. The studies were additionally pooled on sex and world region: Eastern Europe and Russia: n = 3 [4, 26, 37], Middle East: n = 4 [19, 25, 32, 36], North America: n = 2 [28, 38], Oceania: n = 3 [1, 18, 21], South America: n = 2 [17, 24], South and South East Asia: n = 5 [14, 15, 23, 29, 30] and Western Europe: n = 15 [2, 3, 5, 16, 20, 27, 31, 33,34,35, 39,40,41,42,43]. In total, 25,060 children were included (reported male n = 11,853 and female n = 11,441), with a grand mean age of 12.3 years (Table 1). When considering the general prevalence worldwide from all studies, prevalence varied between 11.7 and 64.6%, with a grand mean prevalence of 33% (Fig. 1). The prevalence of a Beighton score of ≥ 4 varied between 4.3 and 59.9% in males and between 16.2 and 68.8% in females. There was a statistically significant mean difference in prevalence between males and females of 14.6% (95% CI = 5.1% to 24.1%, p = 0.003).

Fig. 1 Forest plots of random effect models for a Beighton score of ≥ 4, showing a pooled prevalence of 33% for studies reporting general data, 24% for studies reporting male data and 39% for studies reporting female data

Data from studies reporting children having generalised joint hypermobility at a Beighton score of ≥ 5

Prevalence data were extracted from 18 of the 37 studies with data of children rated as having a Beighton score of ≥ 5. The studies were additionally pooled on sex and world region (Eastern Europe and Russia: n = 2 [4, 37], Middle East: n = 2 [25, 36], Oceania: n = 3 [1, 21, 44], South and South East Asia: n = 1 [14] and Western Europe: n = 10 [2, 5, 16, 33,34,35, 39,40,41, 43]). In total, 12,079 children were included (reported male n = 5867 and female n = 5475) with a grand mean age of 11.8 years. When considering the general prevalence worldwide from all studies, prevalence varied between 4.5 and 45.9%, with a grand mean prevalence of 18% (Fig. 2). The prevalence of a Beighton score of ≥ 5 varied between 0 and 18.5% in males and between 3.3 and 29.7% in females. There was a statistically significant mean difference in grand mean prevalence between males and females of 11.5% (95% CI = 5.1% to 17.9%, p < 0.001).
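The relationship between a reported mean difference, its confidence interval and its p value can be checked with a short calculation. The sketch below assumes a symmetric interval and a two-sided test based on the normal approximation; the test actually used for the sex comparison is not stated here, and the function name is hypothetical.

```python
from statistics import NormalDist


def p_value_from_ci(estimate: float, lower: float, upper: float,
                    level: float = 0.95) -> tuple[float, float, float]:
    """Recover the standard error, z statistic and two-sided p value.

    Assumes a symmetric interval of the form estimate +/- z_crit * SE.
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = 2 * (1 - normal.cdf(abs(z)))
    return se, z, p


# Male-female difference reported above for a Beighton score of >= 5.
se, z, p = p_value_from_ci(11.5, 5.1, 17.9)
print(f"SE = {se:.2f}, z = {z:.2f}, p = {p:.5f}")
```

Under these assumptions the interval is centred on the reported difference of 11.5% and the recovered p value falls below 0.001, in line with the reported result.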

Fig. 2 Forest plots of random effect models for a Beighton score of ≥ 5, showing a pooled prevalence of 18% for studies reporting general data, 9% for studies reporting male data and 21% for studies reporting female data

Data from studies reporting children having generalised joint hypermobility at a Beighton score of ≥ 6

Prevalence data were extracted from 21 of the 37 studies with data of children rated as having a Beighton score of ≥ 6. The studies were additionally pooled on sex and world region (Eastern Europe and Russia: n = 1 [4], Middle East: n = 3 [25, 32, 36], North America: n = 2 [28, 38], Oceania: n = 3 [1, 18, 21], South and South East Asia: n = 2 [14, 22] and Western Europe: n = 10 [2, 3, 5, 16, 20, 33, 35, 39,40,41]). In total, 23,200 children were included (reported male n = 11,643 and female n = 10,867) with a grand mean age of 11.5 years. When considering the general prevalence worldwide from all studies, prevalence varied between 2.6 and 44.4%, with a grand mean prevalence of 11% (Fig. 3). The prevalence of a Beighton score of ≥ 6 ranged between 0 and 12.6% in males, and in females, it ranged between 3.4 and 7.0%. There was a statistically significant mean difference in grand mean prevalence between males and females of 7.3% (95% CI = 2.2% to 12.4%, p = 0.005).

Fig. 3 Forest plots of random effect models for a Beighton score of ≥ 6, showing a pooled prevalence of 11% for studies reporting general data, 6% for studies reporting male data and 13% for studies reporting female data

Data from studies reporting children having generalised joint hypermobility at a Beighton score of ≥ 7

Prevalence data were extracted from nine of the 37 studies (Fig. 4; Table 1) reporting prevalence for Beighton scores of ≥ 7. The studies were additionally pooled on sex and world region (Eastern Europe and Russia: n = 2 [4, 26], Middle East: n = 2 [25, 36], Oceania: n = 2 [1, 45] and Western Europe: n = 3 [2, 5, 16]). In total, 6321 children were included (reported male n = 3209 and female n = 2787), aged 3 to 18 years (grand mean age: 11.3 years). When considering the general prevalence worldwide from all studies, prevalence varied between 0.5 and 23.3%, with a grand mean prevalence of 7% (Fig. 4). The prevalence of a Beighton score of ≥ 7 ranged between 0 and 2.2% in males, and in females, it ranged between 2.9 and 8.7%. Due to the small number of studies reporting sex-specific prevalence (n = 3), no sub-analysis of scores between sexes was performed.

Fig. 4 Forest plot of the random effect model for a Beighton score of ≥ 7, showing a pooled prevalence of 7% for studies reporting general data

Discussion

This is the first known study using pooled normative data of Beighton scores from 28,868 children around the world. This analysis supports rejecting a Beighton score cut-off of 4 or 5 for identifying generalised joint hypermobility in paediatric cohorts. We identified that, at a minimum, the working threshold for identifying generalised joint hypermobility in children should be a Beighton score of 6 or more. This is in keeping with the recent Pediatric Joint Hypermobility Diagnostic Framework [8]. Our analysis also suggests a Beighton score of 7 or greater may be appropriate, particularly for females.

Robust screening tools in health care prevent diagnostic and assessment wastage [46]. While the Beighton score has been criticised for its simplicity, it remains an easy-to-learn, reliable and free tool that clinicians can quickly use in both face-to-face and telehealth formats [47]. Our data-driven approach using worldwide prevalence data identified that, even with a Beighton cut-off score of 5 or more, 21% of female children and 9% of male children would be categorised with generalised joint hypermobility. This high prevalence indicates that such a score is a typical trait of childhood lying within normal variance. Utilising a higher cut-off score of 6 or more is strongly recommended in future clinical and research use.

Clinicians screening children with joint-related or multisystemic concerns should only consider additional screening tools or diagnostic tests related to generalised joint hypermobility disorders, after considering this higher threshold Beighton score. Other comprehensive joint assessment tools, such as the Lower Limb Assessment Scale [48] and the Upper Limb Hypermobility Assessment Tool [49], may then be used to further understand the child’s joint profile at that point in time. Maintaining the higher threshold for identification of generalised joint hypermobility of 6 or more at all paediatric ages requires clinicians acknowledge generalised joint hypermobility status can change over time, one of the key recommendations within the new paediatric hypermobility diagnostic framework [8]. Resisting the status quo of a hypermobility disorder diagnosis based on a lower cut-off score may challenge clinicians and families as a child’s presentation varies over time.

We were unable to fully interrogate our data with respect to age variation, ethnicity outside of world regions, or pubertal status. Limited data availability precluded further sub-analysis of these factors even though some studies identified these variables as important factors impacting joint mobility. We were also unable to analyse the prevalence of the higher Beighton score cut-offs due to limited data reporting. Future studies should ensure, at a minimum, that Beighton score data are available based on children’s age, ethnicity and pubertal status. There is also emerging evidence of gender dysphoria in the paediatric hypermobility cohort; future research should consider reporting both sex assigned at birth and gender identity [50]. One element of bias introduced into the review was limited reporting of assessor training, and it is unknown what impact this had on our results. Ongoing research has found the inter- and intra-rater reliability of the Beighton score assessment to be high in trained assessors [51] and shows promising reliability with skilled observation over telehealth [52]. Improving reporting of assessor training and the assessment setting would improve the quality of the evidence in the future.

Conclusion

This data-driven approach provides clinicians with clear guidance on cut-offs for generalised joint hypermobility in children. These results should give clinicians confidence to reassure families about the wide and typical variation in childhood joint mobility. These results should also provide guidance about a higher threshold for clinicians to consider when deciding whether further assessment is required, reducing unnecessary testing, inaccurate diagnosis and overdiagnosis.