Background

Magnetic resonance imaging (MRI) has been used to assess skeletal muscle morphology and composition for over four decades [1,2,3]. Assessment of skeletal muscle with MRI can contribute to improved understanding of normal responses to physical activity and changes associated with healthy ageing, muscle injury, and pathology [1, 4]. Advancing MRI technologies, including a range of faster, higher-resolution techniques, continue to emerge with the aim of improving visualisation and quantification of muscle characteristics [5,6,7].

The use of MRI to evaluate hip muscle morphology and composition in healthy and musculoskeletal pain populations is becoming more common. Interest in hip muscle size and quality is driven by the knowledge that the muscles spanning the hip joint contribute to hip joint forces [8,9,10]. The capacity of a muscle to generate force has been linked to its size, including cross sectional area (CSA) and volume [11, 12]. Hip joint forces have, in turn, been associated with joint health, pain and/or other symptoms [13, 14]. How the size and quality of muscles spanning the hip joint contribute to hip joint forces is an area of particular interest [8,9,10].

The lateral hip muscles, including the gluteus maximus, gluteus medius, gluteus minimus and the tensor fascia latae (TFL), generate forces around the hip joint for both movement and stability, particularly in single leg stance and during gait [15,16,17,18]. In people with musculoskeletal hip pain, several studies have demonstrated muscle atrophy and increased intramuscular fatty infiltration of the lateral hip muscles when compared to age-matched controls and the asymptomatic contralateral limb [19,20,21,22,23,24,25]. As such, muscle size and fatty infiltration present as possible targets for interventions. Preliminary evidence indicates that these muscles can respond to exercises targeting the hip and other regions [26,27,28]. Further work assessing size and adiposity of these muscles will help to establish the most effective type and dose of exercise to use, as well as the relationship to symptom recovery.

Recent systematic reviews have highlighted heterogeneity and inconsistencies in published MRI methods designed to assess muscle size and composition of the lateral hip muscles [7, 17, 29]. Common to all studies is the challenge of accurately differentiating and consistently measuring the borders of individual muscles on conventional MRI, which may lead to difficulties in comparing results. For the lateral hip muscles, the individual gluteal muscle borders are difficult to identify in the region between the upper border of the acetabulum and the superior tip of the greater trochanter [26, 30]. The use of high-resolution E-12 anatomical plastinates alongside MRI may improve the ability to visualise anatomical regions by comparing and identifying key features at specific locations [5, 31, 32]. Currently, there is an urgent need for robust and reproducible MRI methods for identifying, measuring, and interpreting hip muscle images, particularly to enable comparison of results across studies and data pooling.

The primary aim of this review was to define standardised MRI methods for assessing lateral hip muscle size and fatty infiltration. A secondary aim was to provide illustrative anatomical comparisons between MRI and high-resolution E-12 anatomical plastinates using standardised locations as determined from the literature to improve visibility of muscle borders.

Method

This review followed the PRISMA guideline extension for scoping reviews [33, 34] and was prospectively registered on the Open Science Framework platform (https://osf.io/5nyuq/).

Search strategy

Five electronic databases (Medline, CINAHL, Embase, SportDISCUS and AMED) were searched from inception up to 1 November 2021. No language limits were applied. Search terms were mapped to three main concepts: (i) magnetic resonance imaging, (ii) lateral hip muscles (i.e., TFL, gluteus maximus, gluteus medius and gluteus minimus) and (iii) muscle morphology and composition (i.e., muscle size and fatty infiltration). Synonyms within each concept were mapped to subject headings, where possible, or searched under title, abstract and/or keywords. Results within each concept were combined with "OR" and results between concepts were combined with "AND" (Additional file 1).
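The Boolean structure described above can be sketched in a few lines of Python. This is only an illustration: the term lists are shortened examples drawn from the three concepts named in the text, and the helper name `build_query` is hypothetical. The full database-specific strategies are those given in Additional file 1, not this sketch.

```python
# Illustrative only: shortened term lists for the three search concepts.
# The actual synonyms and subject headings are in Additional file 1.
CONCEPTS = [
    ["magnetic resonance imaging", "MRI"],
    ["tensor fascia latae", "gluteus maximus", "gluteus medius", "gluteus minimus"],
    ["muscle size", "fatty infiltration"],
]


def build_query(concepts):
    """Join synonyms within a concept with OR, then join concepts with AND."""
    groups = []
    for terms in concepts:
        quoted = ['"{}"'.format(term) for term in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


query = build_query(CONCEPTS)
print(query)
```

In practice each database needs its own field tags and subject headings, so a generated string like this would serve only as a starting template.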

The search strategy was modified according to the specifications of each database. Manual citation tracking and reference checking of included articles were performed. Ahead-of-print lists of journals included in the study were screened for additional studies. Grey literature, such as internal reports and conference proceedings, was searched for further eligible studies.

Titles and abstracts of studies retrieved from the databases, as well as those identified from reference-checking and citation-tracking, were screened for eligibility by two reviewers (ZP and CS). Any disagreements in the eligibility of a study were discussed and a consensus reached with the aid of a third reviewer (AS). The final yield was exported into Covidence online software (www.covidence.org) for eligibility screening against inclusion and exclusion criteria.

Inclusion/exclusion criteria

Studies with participants of any age from either healthy or musculoskeletal pain populations were included. People with cancer, neuromuscular conditions or neurological conditions were excluded, as were those undergoing cosmetic surgery. All MRI investigations which assessed lateral hip muscle size and/or fatty infiltration were included. Studies were excluded if muscles were assessed as a group rather than reported individually (e.g., gluteals) or if they used other imaging modalities (e.g., ultrasound) without comparison to MRI. In line with previous publications establishing regions of interest in axial images [5, 31, 35], studies using axial MRI slices for size and fatty infiltration measures were included. All published peer-reviewed studies were included; opinion pieces/editorials, systematic reviews, narrative reviews, conference abstracts and single case studies were excluded.

For our secondary aim, axial MRI images were compared to E-12 anatomical plastinate sections at corresponding anatomical levels to illustrate differences and thus identify regional morphology. The E-12 anatomical plastinate sections used in this study are part of the anatomy collection in the WD Trotter Anatomy Museum at the University of Otago. Approval to use images of the E-12 plastinate sections was granted by the Department of Anatomy, University of Otago. Digital photographs were acquired of selected E-12 specimens that were appropriate for the anatomical regions included in this study.

Risk of bias (quality) assessment

The primary aim of this review was to report MRI methods rather than individual study results. As such, and in line with the PRISMA extension for scoping reviews (PRISMA-ScR) checklist [34, 36], a risk of bias assessment was not conducted.

Data extraction

A standardised data extraction form was used to extract data relating to the individual study characteristics (study purpose, design [37], population, sample size). The country and institutional affiliation of the corresponding author were recorded. Details on MRI parameters (e.g., scanner field strength, manufacturer, MRI sequence, slice selection and thickness), specific lateral hip muscles assessed, and details of size (volume and CSA) and fatty infiltration outcomes were collected by two authors (ZP and NF). Any discrepancies were discussed between authors and conflicts resolved by a third author (AS) if required.

The intraclass correlation coefficient (ICC) and the kappa coefficient (k) are frequently used as measures of intra- and inter-rater reliability and were collected to assess consistency of the MRI methodology between included studies [38, 39]. ICC values were interpreted as follows: less than 0.5, poor reliability; 0.5–0.75, moderate reliability; 0.75–0.9, good reliability; and greater than 0.90, excellent reliability [38]. Kappa coefficients were interpreted as follows: ≤ 0.20, none to slight; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, substantial; and 0.81–1.00, almost perfect agreement [40]. Other measures of reliability were not collected.
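The interpretation bands above can be expressed as a simple lookup. The following Python sketch is illustrative and the function names are hypothetical. Because the published ICC bands share their end points (0.75 and 0.9), the code assumes that a value of exactly 0.75 or 0.90 falls in the "good" band; that boundary handling is an assumption, not something stated in the cited guidance.

```python
def interpret_icc(icc):
    """Classify an ICC value using the bands described in the text [38].

    Assumption: values of exactly 0.75 and 0.90 are treated as "good".
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


def interpret_kappa(kappa):
    """Classify a kappa coefficient using the bands described in the text [40]."""
    if kappa <= 0.20:
        return "none to slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


print(interpret_icc(0.70), "|", interpret_kappa(0.23))
```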

Analysis/synthesis

Descriptive statistics were used to summarise findings across studies for MRI parameters and anatomical locations for regions of interest. Data for muscle size were grouped into volume and CSA. Fatty infiltration measures were grouped into qualitative and quantitative methods. Qualitative measures could include the Goutallier classification system [41], which grades muscle according to the relative amount of fatty tissue that is present, progressing from 0 (regular muscular tissue, no intramuscular fat) to 4 (more fat than muscle), and the Quartile classification [42], which also adopts a 5-step grading system (0%, 25%, 50%, 75% or 100%) to define the percentage of fatty tissue that is present. Quantitative measures could include various calculations incorporating fat-value pixels.

Anatomical levels for measuring CSA and fatty infiltration were collected. When a single anatomical level contained multiple anatomical features, the most easily identifiable and distinguishable anatomical feature on the axial MRI slice was extracted. Axial MRI DIXON sequence images and E-12 anatomical plastinate sections were cross-referenced. Anatomical levels were compared on a 3D MRI image.

Results

The initial search identified 2,684 studies, from which 1,614 duplicates were removed with a further 813 removed after title and abstract screening. An additional 176 were removed following full text screening, which resulted in 78 studies from 81 publications that met the inclusion criteria (Additional file 2 and 3).
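As a quick consistency check, the screening counts reported above can be reconciled with simple arithmetic. The short Python sketch below uses only the numbers given in this paragraph; the variable names are illustrative.

```python
# Counts taken directly from the paragraph above.
records_identified = 2684
duplicates_removed = 1614
removed_title_abstract = 813
removed_full_text = 176

records_screened = records_identified - duplicates_removed    # 1070
full_texts_assessed = records_screened - removed_title_abstract  # 257
publications_included = full_texts_assessed - removed_full_text  # 81

print(records_screened, full_texts_assessed, publications_included)
```

The final figure matches the 81 publications reported, which correspond to 78 studies.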

Trends in publication of MRI studies: 1992 to 2020

Frequency of publication of MRI studies has increased steadily since 1992, growing from one study [43] to 12 in 2020 [44,45,46,47,48,49,50,51,52,53,54,55] (Fig. 1A). Across the included studies, 17 countries were represented: Australia (n = 16), Japan (n = 11), Germany (n = 10), USA (n = 10), United Kingdom (n = 7), Switzerland (n = 5), Finland (n = 4), France (n = 3), Netherlands (n = 3), Spain (n = 3), China (n = 2), Turkey (n = 2), Canada (n = 1), New Zealand (n = 1), Norway (n = 1), Poland (n = 1). Twelve institutions each featured in two studies, and four institutions featured in more than two studies (Charité University Medicine, Germany, n = 9 [42, 56,57,58,59,60,61,62]; La Trobe University, Australia, n = 4 [19, 21, 22, 63]; Royal National Orthopaedic Hospital, UK, n = 3; The University of Queensland, Australia, n = 3 [23, 24, 63]). A range of study designs were used, including nine randomised controlled trials, 33 prospective cohort, 10 retrospective cohort, 15 case–control and 10 case series study designs (Fig. 1B).

Fig. 1

Individual study characteristics: A Publication year of individual studies; B Individual study designs; C Populations used across individual studies; D Lateral hip muscles assessed across individual studies. * Incomplete year (January to November’01); RCT Randomised controlled trial, Msk musculoskeletal

Patient and non-patient populations

Twenty-three studies across 25 publications investigated hip related musculoskeletal pain (e.g., hip osteoarthritis, lateral hip pain and intra-articular hip joint pathologies) (Fig. 1C). Three studies examined non-hip related musculoskeletal pain which included low back pain [64, 65] and patellofemoral joint osteoarthritis [66]. Twenty-nine studies, across 33 publications, used healthy comparison groups and 26 studies explored one of three surgical presentations (i.e., total hip arthroplasty, hip arthroscopy and surgical correction for hip dysplasia) (Fig. 1C). Gluteus medius was the most frequently assessed lateral hip muscle (Fig. 1D). Fifty-four studies measured muscle size and 40 studies investigated fatty infiltration (Table 1).

Table 1 MRI parameters for individual studies

Measurement of muscle size and quality

Thirty-six studies reported the profession of the individual(s) interpreting MRIs and calculating size and fatty infiltration measures. The most frequently cited professionals were radiologists (31 studies) with 15 studies reporting radiologists with further training in musculoskeletal presentations. Other health professionals included orthopaedic surgeons and physiotherapists. Ten studies [44, 65, 67,68,69,70,71,72,73,74] reported years of experience for those who interpreted the MRIs, which ranged from 1 to 28 years.

ICC or kappa scores were reported in 33 studies (42%). For size measures, ICC scores reflected moderate to excellent reliability, ranging from 0.75 to 1.00 for intra-rater reliability and 0.70 to 0.99 for inter-rater reliability. Fatty infiltration ICC values also indicated moderate to excellent reliability, ranging from 0.75 to 0.99 for intra-rater reliability and 0.70 to 0.99 for inter-rater reliability. Kappa coefficients were reported only for fatty infiltration and showed a wider spread, spanning fair to almost perfect agreement: scores ranged from 0.72 to 0.93 for intra-rater and 0.23 to 0.94 for inter-rater reliability (Table 2). No study reported scan-to-rescan reliability.

Table 2 Volume measurement outcomes for individual studies

MRI parameters

The MRI parameters of all studies are summarised in Table 1. Two MRI field strengths were reported: 1.5 Tesla and 3 Tesla. A wide range of MRI sequences was used across the studies, with many incorporating several sequence types, both T1- and T2-weighted, with and without fat suppression. Slice thickness ranged from 0.5 mm to 15 mm, with 16 studies (20.3%) not reporting slice thickness. Acquisition time ranged from 1 min 29 s [76] to 2 h 32 min [75].

All studies that reported patient positioning specified a supine position with legs extended and hips in neutral, except three studies [45, 62, 77] that used pillows under the knees for comfort, and two studies [44, 46] placing the hips into internal rotation.

Muscle size measures

Lateral hip muscle volume was measured in 31 studies and CSA was measured in 24 studies (Tables 2 and 3). For volume measures, manual segmentation techniques were used more frequently (77.4%) than automated techniques. For CSA, all studies used manual segmentation techniques.

Table 3 Cross sectional area measurement outcomes for individual studies

Volume measurement outcomes

Whole muscle volume was calculated for 28 studies (90.3%), while two [26, 52] measured partial muscle volume. To calculate volume, all studies incorporated sums of CSA estimates. Seventeen (54.8%) studies also incorporated slice thickness and five (16.1%) normalised calculations to either individual height or mass (Table 2).
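The general approach described above (summing slice-wise CSA, multiplying by slice thickness, and optionally normalising to height or mass) can be sketched as follows. This is a minimal illustration rather than the method of any particular included study: the function names, the example values, and the optional `slice_gap_mm` parameter (set to zero by default, i.e., contiguous slices) are assumptions introduced here.

```python
def muscle_volume_cm3(csa_cm2, slice_thickness_mm, slice_gap_mm=0.0):
    """Estimate muscle volume as the sum of slice CSAs times slice spacing.

    csa_cm2: CSA of the muscle on each consecutive axial slice (cm^2).
    slice_thickness_mm / slice_gap_mm: acquisition geometry (mm);
    the gap defaults to zero, i.e., contiguous slices are assumed.
    """
    spacing_cm = (slice_thickness_mm + slice_gap_mm) / 10.0
    return sum(csa_cm2) * spacing_cm


def normalise(volume_cm3, body_size):
    """Normalise volume to a body-size measure such as height (m) or mass (kg)."""
    return volume_cm3 / body_size


# Hypothetical example: three contiguous 5 mm slices.
volume = muscle_volume_cm3([10.0, 12.0, 11.0], slice_thickness_mm=5.0)
volume_per_metre = normalise(volume, body_size=1.5)  # height in metres
print(volume, volume_per_metre)
```

Including the slice spacing matters when comparing studies, because the same set of CSA values gives a different volume if slice thickness or gap differs.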

Cross-sectional area measurement outcomes and axial anatomical slice location

Five studies calculated CSA from multiple slices, either by using the mean derived from several consecutive slices or by assessing CSA at two predetermined locations (Table 3). All other studies chose a single axial slice at a predetermined anatomical location, except for two studies [78, 79], which measured at the single slice with the greatest CSA for the individual muscle.
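The two multi-slice strategies mentioned above, averaging CSA over consecutive slices and selecting the slice with the greatest CSA, can be illustrated with a short Python sketch. The function names and CSA values are hypothetical and are not taken from any included study.

```python
def mean_csa(csa_by_slice):
    """Mean CSA across a run of consecutive axial slices."""
    return sum(csa_by_slice) / len(csa_by_slice)


def greatest_csa_slice(csa_by_slice):
    """Return (slice index, CSA) for the slice with the greatest CSA."""
    index = max(range(len(csa_by_slice)), key=lambda i: csa_by_slice[i])
    return index, csa_by_slice[index]


# Hypothetical CSA values (cm^2) for one muscle on four consecutive slices.
csa_cm2 = [18.0, 21.0, 24.0, 21.0]
mean_of_three = mean_csa(csa_cm2[1:4])       # slices 1 to 3
peak_index, peak_csa = greatest_csa_slice(csa_cm2)
print(mean_of_three, peak_index, peak_csa)
```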

Seven anatomical levels were identified as locations where CSA can be measured for the lateral hip muscles (Figs. 2, 3 and 4). These include i) anterior superior iliac spine (ASIS) [59, 80, 81]; ii) halfway between the iliac crest and the superior tip of the greater trochanter [67]; iii) anterior inferior iliac spine (AIIS) [59]; iv) upper border of the acetabulum [46, 82, 83]; v) superior tip of the greater trochanter [45, 70, 77, 84,85,86,87]; vi) lower border of the acetabulum [25, 82, 83]; and vii) lesser trochanter [57, 81].

Fig. 2

3-D representation of anatomical levels for CSA measurement; 1- Anterior superior iliac spine; 2- Halfway between iliac crest and greater trochanter; 3- Anterior inferior iliac spine; 4- Upper border of acetabulum; 5- Superior tip of greater trochanter; 6- Lower border of acetabulum; 7- Lesser trochanter; IC- Iliac crest, GT- Greater trochanter

Fig. 3

Axial DIXON sequence MRI and E-12 anatomical plastinate comparison at anatomical levels for cross sectional area measurement above the hip joint. A At the level of the anterior superior iliac spine; B Halfway between the iliac crest and the superior tip of the greater trochanter; C Anterior inferior iliac spine; square dotted box surrounds enlarged morphological region of interest (Fig. 5); 1- gluteus minimus; 2- gluteus medius; 3- gluteus maximus; 4- TFL; 5- ilium; 6- iliacus; 7- psoas major; 8- rectus abdominis

Fig. 4

Axial DIXON sequence MRI and E-12 anatomical plastinate comparison at anatomical levels for cross sectional area measurement. A upper border of the acetabulum B superior tip of the greater trochanter C lower border of the acetabulum D lesser trochanter; 1- gluteus minimus; 2- gluteus medius; 3- gluteus maximus; 4- TFL; 6- iliacus; 9- acetabulum; 10- piriformis; 11- iliopsoas; 12- sartorius; 13-rectus femoris; 14- femoral head; 15- greater trochanter; 16- lesser trochanter; 17- vastus lateralis; 18- pectineus; 19- adductor brevis; 20- adductor magnus; 21- quadratus femoris

When comparing MRI images to E-12 anatomical plastinates (Figs. 3 and 4), the E-12 anatomical plastinates provide better visualisation of muscle borders. At the levels of the AIIS and the upper border of the acetabulum, the borders between gluteus medius and piriformis are better visualised on the E-12 anatomical plastinates, where the direction of individual muscle fibres demarcates the individual muscles (Fig. 5). At the level of the superior tip of the greater trochanter and below, the TFL border is better visualised on the E-12 anatomical plastinates against neighbouring muscle borders, including those of the gluteus medius and rectus femoris.

Fig. 5

Enlarged region of interest at the level of anterior inferior iliac spine. A Axial DIXON sequence MRI; B E-12 anatomical plastinate; C Schematic illustration; circle indicates feature of interest; Red line- gluteus minimus; Green line- gluteus medius; Dashed red line- partition between gluteus medius and piriformis; Dashed grey line- partition between gluteus maximus and both gluteus medius and piriformis; Red circle- highlights angles between partitions to help identify separation between piriformis and gluteus medius; Blue line- scale bar represents relative scale between images

Some slice locations were described in multiple ways because these levels contained multiple identifying features. For example, the slice location at the level of the superior tip of the greater trochanter (level v) is consistent with the level described as the centre of the femoral head [70, 85, 86] and the level where the femoral head has the greatest CSA [45], depending on slice thickness. Other slice locations were at a pre-set distance from an anatomical feature, including 20 mm distal to the proximal aspect of the femoral head [88] for gluteus maximus and 15 mm from the superior margin of the acetabulum [75] for gluteus medius and minimus.

Intramuscular fatty infiltration measurement outcomes and axial anatomical slice location

Forty studies measured intramuscular fatty infiltration (Table 4). Qualitative measures of fatty infiltration were used by 30 studies, with the Goutallier classification being the most frequently used. Quantitative methods, using a ratio of pixel intensity from fat and water images, were used by 10 studies. This technique has been used increasingly in recent years.
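To illustrate what a ratio of pixel intensities from fat and water images can look like, the sketch below computes a mean fat signal fraction over a region of interest using the commonly used form fat / (fat + water). This formulation is an assumption for illustration only; the included studies used various calculations (Table 4), and the function name and pixel values here are hypothetical.

```python
def fat_fraction_percent(fat_pixels, water_pixels):
    """Mean fat fraction (%) over paired fat and water pixel intensities.

    Assumes the fat / (fat + water) formulation; pixels with no signal
    in either image are skipped to avoid division by zero.
    """
    if len(fat_pixels) != len(water_pixels):
        raise ValueError("fat and water pixel lists must be the same length")
    fractions = []
    for fat, water in zip(fat_pixels, water_pixels):
        total = fat + water
        if total > 0:
            fractions.append(fat / total)
    if not fractions:
        raise ValueError("no pixels with signal in the region of interest")
    return 100.0 * sum(fractions) / len(fractions)


# Hypothetical intensities for three pixels inside a traced muscle border.
roi_fat = [20.0, 30.0, 50.0]
roi_water = [80.0, 70.0, 50.0]
mean_fat_fraction = fat_fraction_percent(roi_fat, roi_water)
print(round(mean_fat_fraction, 1))
```

Averaging per-pixel fractions is one design choice; dividing the summed fat signal by the summed total signal is another, and the two can give slightly different values for the same region.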

Table 4 Fatty infiltration measurement outcomes for individual studies

Gluteus medius and/or gluteus minimus were further divided into compartments in 11 studies. Gluteus medius was divided into three equal compartments (anterior, middle and posterior) by nine studies and two equal compartments (anterior, posterior) by one study. Similarly, gluteus minimus was divided into three equal compartments (anterior, middle and posterior) by seven studies and into two equal parts (anterior and posterior) by two studies. The TFL and gluteus maximus were not divided into compartments for intramuscular fatty infiltration measurement.

Six anatomical levels were identified as locations for fatty infiltration measurement of the lateral hip muscles (Fig. 6). Two levels were identified for TFL, and four levels were identified for each of the gluteus maximus, gluteus medius and gluteus minimus muscles. Four studies [53, 89,90,91] described quantitative measures of fatty infiltration for whole muscle.

Fig. 6

3-D representation of anatomical levels for intramuscular fatty infiltration measurement; 1- One third of the distance from iliac crest to greater trochanter; 2- Anterior superior iliac spine; 3- Greater sciatic foramen; 4- Two thirds of the distance from iliac crest to greater trochanter; 5- Greater trochanter; 6- Lesser trochanter; Aqua- TFL; Blue- Gluteus maximus; Green- Gluteus medius; Red- Gluteus minimus; IC- Iliac crest, GT- Greater trochanter

Tensor fascia latae

The two anatomical levels for TFL fatty infiltration assessment included the superior tip of the greater trochanter [85, 87] and the lesser trochanter [75, 80, 81]. The level at the greater trochanter was consistent with other anatomical features including the centre of the femoral head [85] and the fovea capitis [19, 21, 22]. The ischial tuberosity was described in one study [92] and can span multiple slices. The greatest axial CSA was described in one study [93].

Gluteus maximus

The four levels for gluteus maximus fatty infiltration assessment are i) one third of the distance from the iliac crest to the superior tip of the greater trochanter [19]; ii) the greater sciatic foramen (superior most part) [19, 21, 22, 42]; iii) two thirds of the distance from the iliac crest to the superior tip of the greater trochanter [19]; and iv) the superior tip of the greater trochanter [19, 94]. The level where the femoral head has a round configuration [74] and the level where it has the greatest circumference [19] were deemed similar to the level at the greater trochanter.

Gluteus medius and minimus

Gluteus medius and gluteus minimus were frequently assessed individually at the same level within a study. The four levels for gluteus medius and gluteus minimus fatty infiltration assessment are i) one third of the distance from the iliac crest to the superior tip of the greater trochanter [19, 67, 69, 85, 93]; ii) the anterior superior iliac spine [80, 81]; iii) the greater sciatic foramen (superior most part) [19, 21, 22, 42]; and iv) two thirds of the distance from the iliac crest to the superior tip of the greater trochanter [19, 56,57,58, 67, 69, 85, 93, 95].

Other levels were described as pre-determined distances from anatomical features: 15 mm superior to the upper margin of the acetabulum [75], three and six slices proximal to the greater trochanter with slice thickness set at 6 mm [60], and 30 mm proximal to the greater trochanter [61]. Descriptions of levels that could span multiple axial slices included the level of the acetabulum [75, 94] and the ipsilateral sacroiliac joint [96].

Machine learning

Overall, machine learning was incorporated in 16 (20.3%) of the studies. For size measures, eight (25.8%) of the studies reporting volume used either automatic or semi-automatic tracing methods, while no study reporting CSA incorporated machine learning. For fatty infiltration, 10 (25.0%) studies used machine learning to identify and quantify water and fat value pixels within regions of interest.

Discussion

This scoping review aimed to define standardised MRI methods for assessing lateral hip muscle size and fatty infiltration. When measuring size and fatty infiltration, a lack of detail and heterogeneity in reporting MRI parameters highlights the need for a consistent approach to reporting methods in future MRI research. We report seven identifiable anatomical locations for measurement of lateral hip muscle CSA and six identifiable anatomical locations for fatty infiltration at single slice measurement. We also identified new and emerging technology in machine learning for automated muscle segmentation techniques for size and fatty infiltration measures.

MRI acquisition parameters and methodology

MRI parameters determine the quality of images, which can influence the results of a study. The use of heterogeneous MRI parameters, as found in this review, can complicate comparisons and future pooling of data between studies. Global, multi-centre collaborations aiming to provide MRI protocol consensus have been undertaken for other body regions and could be developed around the hip and pelvis with the aim of reducing the large variability in imaging parameters and the time wasted on pilot research [97].

Measurement

Previous studies have examined the influence of raters’ experience in reading and interpreting MRI [98, 99]. In this review, radiologists were the most frequently cited professionals reading and interpreting results, with some studies specifying musculoskeletal radiologists to reflect greater experience in musculoskeletal presentations. Previous research has demonstrated MRI to be reliable for muscle size and fatty infiltration measures [6, 100]. Although the majority of studies reporting ICC or kappa scores stated good to excellent reliability, some studies reported fair to moderate reliability. One study [59] assessing size measures in a total hip arthroplasty population found poor reliability for measuring gluteus minimus size, with analysis limited by prosthesis artifacts and poor visualisation. To overcome this limitation, specific MRI techniques have since been developed for improving imaging around and near metal [101,102,103]. There also remains a large proportion of studies that did not report reliability measures. This may reflect reporting bias, since poor scores would be less likely to be reported, and may inflate our estimate of reliability across the body of literature. It is recommended that future studies continue to measure and report reliability of measurement to help guide and update the development of standardised MRI methods.

Size measures

Seven single level axial slices were identified that provided consistent CSA measurement, including three for both gluteus maximus and TFL, and four for both gluteus medius and minimus. There was no consensus on which axial slice best represents size and/or the location where size changes are most likely to occur. E-12 anatomical plastinates did make visualisation of muscle borders clearer, particularly between neighbouring gluteus medius and piriformis, TFL and gluteus medius, and TFL and rectus femoris. The use of E-12 anatomical plastinates in understanding and defining muscle borders at certain single level slices can help future studies to trace muscle borders correctly and could help develop more accurate automatic, machine learning techniques.

Anatomical slice levels used in some of the included studies were located at the very proximal or distal insertions of the target muscle, which may not be representative of the muscle’s overall size. For example, the level of the anterior superior iliac spine may not best represent gluteus minimus size, as the muscle may not even appear at this level in some individuals. Interestingly, four studies [66, 104,105,106] reported size measurements from maximum CSA for individual muscles. This is supported by a recent study [107] in healthy individuals, which compared greatest CSA and volume and found a positive correlation for gluteus maximus and gluteus medius muscles. However, the location of greatest CSA may differ considerably between individuals, pathologies and studies. It remains unclear at what level CSA should be calculated for the lateral hip muscles.

Compared to CSA, volume has a stronger correlation with muscle strength [12, 108] and power [109], and can better reflect the size of the entire muscle in both healthy and musculoskeletal pain populations [7, 12]. Additionally, by assessing the whole muscle, volume can better identify regions more susceptible to change and can inform the most appropriate levels for CSA [7, 110]. For example, in the thigh, after a bout of strength training in healthy individuals, muscle size changes have been observed in proximal portions of a muscle but not around distal portions [111]. Single CSA measures may therefore miss potential changes, depending on where measurements are taken. However, compared to CSA, volume calculations can be more time consuming when manually derived. As supported by the results of this review, there has been an increase in interest in, and development of, automatic calculations through machine learning. This increase will lead to greater availability of studies for future pooling of data.

Fatty infiltration

For assessment of fatty infiltration, six axial slice locations were identified, including two for TFL and four for each of the gluteal muscles. There was no consensus on which axial slice best represents fatty infiltration and/or the location where changes are most likely to occur. We found that 86% of studies measuring fatty infiltration used qualitative, five-point Likert scales, often at a single slice. The most frequently used Likert scale was the Goutallier classification system [41]. All studies incorporating quantitative methods for fatty infiltration have been published within the last 10 years, reflecting its status as an emerging technique.

We feel it is important to quantify muscle fat across the entire length of the muscle. This will help to identify locations where muscle fat accumulates in symptomatic groups, how it compares to asymptomatic groups, and where interventions like exercise may have the greatest effect. For example, in a study by Koch et al. [26], muscle fat was quantified on every slice from proximal to distal, and normalised to muscle length. They found that exercise had a significant effect on reducing muscle fat of gluteus minimus at the proximal portion of the muscle. If muscle fat had been measured only in the distal portion, the authors may have falsely concluded that exercise had no effect on muscle fat. In other regions of the body, Crawford and colleagues [112] have shown that the fat content at lumbar segment four (L4) best represents fatty infiltration across the entire lumbar region in healthy participants. Further work is needed on the hip muscles to clarify whether specific locations are representative of whole muscle changes.

In addition to the specific anatomical level of location, recent cadaver and electromyography studies have identified different anatomical and functional regions within the lateral hip muscles [30, 113, 114]. These compartments or regions within the individual muscle may be uniquely impacted by specific movements or muscle actions, which has relevance in musculoskeletal pathology. For example, some studies in this review divided the gluteus medius and minimus muscles into either three equal parts (anterior, middle and posterior) or two equal parts (anterior/posterior) while the gluteus maximus was divided into upper and lower portions. Investigation and understanding of muscle size and fatty infiltration within these functional regions and portions has the potential to guide future interventional studies. In spinal studies such divisions can allow for a more specific quantification to map the spatial distribution of fat content, which is increasingly showing clinical relevance as a meaningful parameter [112, 115,116,117,118].

MRI advances

Manual tracing techniques were used for the majority of size studies but can be time consuming, involving several hours per participant. Recent advances in MRI technology include the development of automated tracing techniques through machine learning [52]. Machine learning for muscle tracing, as well as for automatic fatty infiltration calculation, has been shown to be reliable and accurate in other regions [119]. Automated analysis incorporating machine learning is more time efficient than manual tracing, reducing analysis time from hours to seconds while still maintaining near human-level performance. However, with few valid and reliable automated methods available, manual methods for labelling muscles for size and fatty infiltration are currently the gold standard [52, 91]. Nevertheless, machine learning has the potential to make the analysis of larger data sets more feasible, increasing the statistical power of future research and facilitating the translation of these measures to clinical practice. Although in their infancy, automated machine learning methods around the lateral hip muscles have been shown to provide reliable data for size and the ability to quantify fatty infiltration, and will aid future research [26, 48, 52, 61, 91, 120, 121].

Limitations

This scoping review has limitations that should be considered. Firstly, this review focused on people with hip-related musculoskeletal pain and healthy populations; the findings may therefore not be generalisable to other populations such as those with neurological or muscle disease. Secondly, we acknowledge that by focusing on hip-related pain and healthy populations, additional fatty infiltration classification systems described in other populations were not included in this review. Thirdly, in addition to low reporting of reliability results, multiple studies were from single institutions, which may make overall methods seem more homogeneous. Therefore, caution should be taken when generalising our findings.

Lastly, we acknowledge that a quality assessment of individual studies was not conducted. This is optional when undertaking a scoping review [34]. Reporting study quality would have a greater impact on describing the risk of bias of outcomes, rather than informing our understanding of muscle size and fat measures, which was the primary aim of this review.

Conclusion

Whilst no consensus was found on which anatomical locations are most appropriate and clinically meaningful for measuring lateral hip muscle size and fatty infiltration, we report several identifiable anatomical levels for single axial slice measurement of muscle size and fatty infiltration. Further studies into whole muscle measures are required before strong recommendations can be made about the most suitable anatomical locations for standardised MRI single slice muscle measures and about the within-muscle regions susceptible to change. Whilst automated machine learning technology is rapidly emerging with associated improvements in time efficiency, widespread implementation remains a challenge. Accordingly, there remains a need to optimise manual segmentation. Overall, the findings of this scoping review will assist in the future establishment of a standardised method for examination and measurement of the lateral hip musculature using MRI.

Implications

Establishing a standardised method for MRI assessment of lateral hip muscles will contribute to greater understanding of muscle size and fatty infiltration for people with musculoskeletal conditions and the development of standardised MRI protocols. The findings of this scoping review will inform research in other clinical populations such as people with neuromuscular disease.