Background

The Achilles tendon is the most commonly injured tendon in the body [1]. Achilles tendinopathy is common, particularly in its mid-portion (free tendon) form, and can affect any adult, whether sedentary or involved in sport or physical activity [1, 2]. There is a higher prevalence in high-impact tendon loading sports, such as long-distance running and football [2]. The aetiology and mechanism of this disorder remain inconclusive and disputed. However, it is largely agreed that it is not an inflammatory condition but a degenerative one with failure to repair [3, 4]. The presence of neovascularity seems to be the source of the pain and, in the last two decades, it has also been the focus of targeted treatment [5,6,7].

Achilles tendinopathy can be difficult to manage and rehabilitate, taking a prolonged period to obtain positive results in pain reduction and normal functioning [8, 9]. Although various treatment modalities are available with limited evidence on the mode of action, there are no established monitoring instruments and associated clinical protocols. Whilst ultrasonography is used clinically to diagnose and predict the development of symptoms [10], its role in detecting change in follow-up improvement during rehabilitation remains debatable. During the last consensus on the reported outcome measures in tendinopathy clinical trials (ICON 2019) [11], the panel of experts failed to reach an agreement on the sonographic structural changes as an important consideration in tendinopathy. Moreover, the sensitivity of standard sonography is limited, because conventional sonographic signs are missing in a relevant number of symptomatic individuals [12] and full symptomatic recovery does not ensure full recovery of muscle–tendon structure and function [13, 14].

With improvements in ultrasound technology, elastography has emerged as a potential measurement instrument offering opportunities to quantitatively investigate the mechanical and material properties [15] within the tendon. Elastography research has shown rapid growth in the past 10 years and has been used to understand how loading [16], ageing [17] and different treatments [18] affect tendon recovery [19, 20]. Elastography has been claimed to have better sensitivity and specificity than ultrasonography in the diagnosis and monitoring of tendinopathies. Different types and modes of action of elastography exist, but their effectiveness in assessing patients with Achilles tendinopathy has not been evaluated in a systematic review. This manuscript does not aim to provide a detailed explanation of the modes of action of each elastography technique; the reader is referred to other narrative reviews [21, 22] for further understanding.

Knowledge of measurement properties helps clinicians and researchers to choose the most appropriate equipment whilst achieving more accurate, reliable, and valid results. Using measurement instruments with poor measurement properties increases the risk of bias in the results obtained and may fail to detect a true change when assessing different treatment modalities and monitoring rehabilitation [23]. Considering the continuous encouragement of clinicians to use evidence-based practice and perform measurements that can monitor recovery, a better understanding of the technologies and techniques of elastography is needed. A systematic evaluation of the current knowledge of quantitative elastography used on healthy and tendinopathic Achilles tendons during both static and dynamic functioning will help to provide evidence for its use in clinical practice. The aim of the reported work is to identify, evaluate and synthesize the current literature on elastography used on healthy and tendinopathic Achilles tendons in order to provide evidence for its use in clinical practice. Measurement properties of reliability, measurement error, validity, and responsiveness will be considered.

Method

This systematic review was designed and conducted according to guidelines outlined by Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [24].

Information source and search strategy

Electronic databases including EBSCO, CINAHL, PubMed, Cochrane, Scopus, MEDLINE Complete and Academic Search Ultimate were searched by one investigator (TM). A broad search strategy was developed using both free-text terms and MeSH index terms, with the combination of keywords shown in Appendix 1. To identify studies of measurement properties, a validated methodological search filter was used (https://www.cosmin.nl/tools/PubMed-search-filters/). The full search strategy is available in Appendix 2. In addition, reference lists of the included and possibly eligible articles were hand searched and scrutinized to identify any additional studies. No restriction was made on the publication year; however, only articles published in the English language were included. Retrieved references were exported to a reference manager to identify and remove any duplicates.

Eligibility criteria and study selection

Two independent reviewers (TM and AG) screened the title and abstract of available articles to identify studies that used elastography to measure the mechanical properties of the Achilles tendon, according to the inclusion and exclusion criteria listed in Table 1. The full text of the shortlisted papers was then reviewed to obtain a final set of articles. Any disagreement over the eligibility of studies was resolved through discussion between the two reviewers. If an agreement was not reached, a third reviewer was consulted (NP). When abstracts and full texts of potential inclusion articles were not found, the authors of these articles were contacted.

Table 1 List of inclusion and exclusion criteria

Data extraction

Data were extracted by the primary reviewer (TM) for each of the included studies. The secondary reviewer checked the extracted data. Data extracted included: the study design, setting, method of assessment, population characteristics, outcome measures, equipment used and specifications, statistical results, measurement properties focusing on reliability, measurement error, validity, responsiveness, and the limitations of the study. When reliability or measurement errors were investigated, the seven elements that construct the research question were also extracted as per guidelines by the COSMIN Manual 2021 (Appendix 3).

Methodological quality evaluation of the studies

The included articles were assessed for methodological quality by the two independent reviewers, using the COSMIN methodology [25, 26] consisting of three sub-steps. First, the methodological quality of every single study was assessed using the COSMIN Risk of Bias (ROB) checklist to assess reliability, measurement error, validity, and responsiveness for each study, respectively. Each standard was rated on a 4-point scale as very good, adequate, doubtful, or inadequate quality. To determine the overall quality rating of every article, the lowest rating of any standard was taken [26].
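The "lowest rating counts" rule described above can be sketched in a few lines. This is only an illustration of the rule as stated in the text, not part of the COSMIN tooling; the function name `overall_quality` and the example ratings are hypothetical.

```python
# Ratings ordered from best to worst, matching the 4-point scale in the text.
RATING_ORDER = ["very good", "adequate", "doubtful", "inadequate"]


def overall_quality(standard_ratings):
    """Return the lowest (worst) rating among the rated standards of one study."""
    if not standard_ratings:
        raise ValueError("at least one rated standard is required")
    # The worst rating is the one furthest along RATING_ORDER.
    return max(standard_ratings, key=RATING_ORDER.index)


# Hypothetical study with three rated standards, one of them doubtful.
example_ratings = ["very good", "adequate", "doubtful"]
print(overall_quality(example_ratings))  # doubtful
```

A single doubtful standard therefore determines the overall rating of a study, regardless of how many other standards were rated very good.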

Secondly, the statistical results of every study were rated against the criteria for good measurement properties as sufficient (+), insufficient (-), or indeterminate (?) [26]. Reliability was rated as sufficient if the ICC was ≥ 0.70 [26], while measurement error was rated as sufficient if the smallest detectable change or the limits of agreement were smaller than the minimum important change. Criterion and construct validity were rated as sufficient if the results were in accordance with the hypotheses predefined by the review team. The correlation between compared instruments (convergent validity) had to be ≥ 0.70 or show no significant differences between instruments. The comparison between groups that were expected to be different (discriminative validity) had to be significantly different. Responsiveness was rated as sufficient when the results obtained were able to detect a significant important change over time [26].
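As a rough sketch, the reliability and measurement error criteria above translate into simple threshold checks. The function names are hypothetical, and treating a missing value as indeterminate is an assumption made here for illustration; the ICC values in the usage lines are the two inter-session values reported later in this review, while the measurement error inputs are invented.

```python
def rate_reliability(icc):
    """Rate reliability: '+' if ICC >= 0.70, '-' if lower, '?' if not reported."""
    if icc is None:
        return "?"
    return "+" if icc >= 0.70 else "-"


def rate_measurement_error(sdc_or_loa, mic):
    """Rate measurement error by comparing the SDC (or LoA) against the MIC.

    '+' if the SDC/LoA is smaller than the MIC, '-' if it is not,
    '?' if either value is unavailable (assumed here to mean indeterminate).
    """
    if sdc_or_loa is None or mic is None:
        return "?"
    return "+" if sdc_or_loa < mic else "-"


print(rate_reliability(0.71))             # +
print(rate_reliability(0.54))             # -
print(rate_measurement_error(0.8, None))  # ? (no MIC available)
```

The last line shows why measurement error often cannot be rated: without a minimum important change there is nothing to compare the error against.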

Finally, the quality of the evidence was graded (high, moderate, low, or very low evidence) using the modified Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach. This approach takes into consideration the methodological quality of the studies (COSMIN score), inconsistency of the results per measurement property between the different articles, imprecision, including the total sample size of the available studies and indirectness involving evidence from different populations than the population of interest in this review. Multiple studies were only combined when the same measurement property was evaluated for specific types of elastography. Moreover, when results across reported test conditions were consistent, these results were summarized to determine the overall evidence of the measurement properties. Measurement properties from studies that were rated ‘doubtful or inadequate’ on the COSMIN ROB were not eligible to be combined in evidence synthesis. Any discrepancies between reviewers were discussed and resolved via consensus with a third reviewer (NP).
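The downgrading logic can be illustrated with a short sketch. It assumes that grading starts at "high" and drops one level per downgrade step, which is how the modified GRADE approach is commonly applied but is not spelled out in the text above; the factor names and the example values are hypothetical.

```python
# Evidence levels from best to worst, as named in the text.
LEVELS = ["high", "moderate", "low", "very low"]


def grade_evidence(downgrades):
    """Grade evidence from a mapping of factor name to number of levels downgraded.

    Assumption: grading starts at 'high' and cannot fall below 'very low'.
    """
    if any(steps < 0 for steps in downgrades.values()):
        raise ValueError("downgrade steps must be non-negative")
    total_steps = sum(downgrades.values())
    return LEVELS[min(total_steps, len(LEVELS) - 1)]


# Hypothetical measurement property downgraded once for inconsistency.
example = {
    "risk_of_bias": 0,
    "inconsistency": 1,
    "imprecision": 0,
    "indirectness": 0,
}
print(grade_evidence(example))  # moderate
```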

Results

Search strategy

The literature search was conducted on 10th September 2020 and updated on 15th January 2022. It yielded 1644 articles, of which 597 were duplicates and were therefore removed. Of the 1047 articles whose titles and abstracts were screened, 83 were eligible for full-text assessment. Of these 83 articles, 20 were found to be appropriate for inclusion, together with another article identified through citation searching. Thus, analysis was conducted on a total of 21 articles. The full PRISMA flow diagram summarizing the screening process and results is provided in Fig. 1.

Fig. 1 PRISMA flow chart

Study characteristics

A summary of study characteristics, including participants’ demographics, is provided in Table 2. The study populations were mainly healthy, except in five articles [27,28,29,30,31] that investigated patients with midportion Achilles tendinopathy. Healthy asymptomatic participants were younger than participants with Achilles tendinopathy in almost all articles. The sample size of the included studies ranged from three [32] to 326 [27], investigating either one limb or both limbs. Detailed justifications for sample sizes were not provided in all the studies. Half of the articles did not report any information related to the participants’ physical activity levels and the other half had a range of levels from normal daily walking to participation in recreational sports. None of the articles included professional athletes. A cross-sectional research design was implemented in most studies, with only a few prospective longitudinal studies included. In Table 3, the different instrument specifications and probe use are reported, as differences between elastography machines should be considered when collating data for the best evidence available.

Table 2 Demographic Data
Table 3 Elastography equipment specifications used

Quality of review articles—methodological quality

Results were grouped according to the type and mode of action of elastography for better homogeneity and consistency within the results. Of the 21 articles included, seven articles investigated strain elastography where strain ratio was calculated as a semi-quantitative measure of stiffness [27, 32,33,34,35,36,37], eleven investigated shear wave imaging presenting results as either shear wave velocity (SWV) [28,29,30, 38,39,40, 47] or modulus [31, 41,42,43], two investigated continuous shear wave elastography (cSWE) [44, 45], while the last article assessed three-dimensional shear wave elastography (3D SWE) [46]. A summary of the overall quality ratings for the measurement properties of each elastography method and their statistical results are reported in Table 4 and Table 5. Table 4 presents extracted data for the different types of reliability and measurement error (intra-rater, inter-rater and inter-session), while Table 5 presents the validity and responsiveness.

Table 4 Reliability and Measurement error results
Table 5 Validity and responsiveness results

The overall quality ratings for each article assessing these measurement properties were predominantly adequate. Articles rated as doubtful are reported at this stage but will not be included in the following phase to combine results for best-quality evidence. These doubtful articles had non-optimal statistical analyses or a lack of proper reporting regarding the blinding of assessors. Refer to Appendix 4 for a detailed explanation of Tables 4 and 5.

Best evidence synthesis

Given the large variety between methodologies and the identified inconsistencies within each type of elastography, results were combined cautiously. Appendix 5 provides a detailed account of the methodologies applied in the included articles. Some of the major differences found included patient positioning, with the ankle either relaxed or set at a specific dorsiflexed or plantarflexed angle, and the investigated part of the Achilles tendon, including the middle part of the free tendon, the myotendinous junction, the level of the medial malleolus and specific areas such as 5 cm from the enthesis. Other identified differences included the type of pre-set used on the ultrasonographic machine, whether online or offline processing was carried out, the probe placement including longitudinal and transverse planes, and the placement of the region of interest on the tendon to be assessed. These findings will be explored further in the first part of the discussion section, where the evidence for each elastography modality is analysed.

Tables 6, 7 and 8 present the grading of evidence for reliability, validity and measurement error, respectively, for the different elastography methods, together with a summary of the rating of good measurement properties according to the statistical results of the articles which obtained an adequate or very good rating for the ROB. Evidence for some measurement properties is indeterminate, either because no information was available or because it was available from only one study, as was the case for the reliability of shear wave modulus and 3D SWE. Findings of measurement error were mostly indeterminate, which does not suggest that the measurement instrument is of poor quality, but only highlights the need for more high-quality studies that can adequately assess the measurement properties. The best evidence synthesis for each elastography method will be reported separately in the next section.

Table 6 Reliability evidence grading
Table 7 Validity evidence grading
Table 8 Measurement Error evidence grading

Strain elastography

Construct validity of strain ratio received positive ratings for correlation to both VISA-A and ultrasonographic imaging when assessing tendinopathic participants. Since only one study investigated this correlation, grading of evidence was not conducted. The strain ratio correlated to isometric contraction obtained a moderate level of evidence, with the ability to significantly detect a decrease in this ratio (i.e. tendon becomes harder since the reference material remains the same through the measurement) with the increase in loading. The intra-rater reliability of the strain ratio was rated as having a moderate level of evidence, for both longitudinal and transverse probe placements. However, the former obtained positive results, with only one article showing low ICC, while the latter obtained only negative ratings within a combined sample size of only 58 participants. Inter-rater reliability was downgraded to moderate for the longitudinal probe placement due to inconsistencies in results.

Shear wave velocity

Convergent validity of SWV was correlated to the patient-reported outcome measure, the VISA-A, when tendinopathy was being assessed and was graded as high evidence. For discriminative validity, SWV was able to measure significant differences in foot positions and age, both receiving a high level of evidence, while moderate evidence was found when differentiating between pathological and healthy tendons. The quality of evidence for inter-rater reliability was rated as moderate due to inconsistency in results when using a longitudinal probe position and indirectness when using a transverse position. Intra-rater reliability within the same day was not graded, as only one article was found. However, when intra-rater reliability was assessed in different sessions, mixed results were present, with some articles having an ICC of 0.71 while others reported a lower value of 0.54. This conclusion was based on a total sample size of only 36 participants.

Shear wave modulus

No grading of evidence was possible for the criterion validity of SWE as only one article assessed the correlation to MRI, B-mode ultrasound, power Doppler, and ultrasonographic tissue characterisation (UTC). A good correlation between instruments was found for diagnostic accuracy. When convergent validity was assessed, no correlation was found between shear wave modulus and isometric contraction in the two articles. Thus, the grading of evidence was downgraded to low due to the small healthy sample size on which the results are based. Insufficient data is present for all types of reliability assessed when using shear wave modulus and therefore no grading of evidence was conducted. Responsiveness was only investigated in one article and SW modulus showed a significant change when a 6-month follow-up was compared to baseline data. However, only poor monitoring accuracies were found for midportion Achilles tendinopathy so evidence could not be graded.

Continuous shear wave elastography

Convergent validity of cSWE to maximum voluntary isometric contraction (MVIC) obtained inconsistent results within a small sample size of healthy people; thus grading of evidence was downgraded to very low. Only one article studied the correlation to shear wave modulus, so evidence was not graded. A low level of evidence is present for intra-rater same-day cSWE reliability due to the indirectness of the study population while inter-rater was not investigated.

Three-dimensional Shear wave elastography

Evidence was not graded for 3D SWE as only one article was found investigating this method. Moreover, no validity testing in vivo was conducted.

Discussion

This systematic review identified four different modalities of elastography: strain elastography (also known as compression elastography), shear wave elastography, continuous shear wave elastography, and 3D elastography. Each elastography method will first be discussed in light of the inconsistencies (Appendix 5) and the quality of evidence collated on the identified measurement properties (as presented in Table 6 for reliability, Table 7 for validity, and Table 8 for measurement error). General considerations that should be taken into account when evaluating the evidence of each measurement property of different elastography modalities will also be discussed.

Strain elastography

Strain elastography is considered a semi-quantitative measure, represented as a ratio of tissue stiffness in comparison to its surrounding or external reference material. This strain index can be used as a comparative index and should not be considered an absolute strain measurement. The use of different reference materials led to both high and low ICC values, leading to mixed positive and negative results, especially when Kager’s fat pad was the chosen comparator for patients with Achilles tendinopathy. Material properties of the fat pad can change due to the pathology itself [49], thus giving rise to a false ratio. For this reason, it is recommended that when opting for strain elastography as a measurement instrument, an external reference material with known elastic properties is used. This will allow comparisons of strain ratios under different conditions and among subjects.

In this review, the validity and reliability of strain elastography obtained moderate ratings, supporting its use to measure the Achilles tendon material properties. It was found to be a highly operator-dependent procedure with better reliability in more experienced professionals. Repeated manual compression using the ultrasound transducer causes axial strain in the tissue of interest. These compressions produce a displacement within the tissue, which is less pronounced in harder than softer materials.

Shear wave imaging – shear wave velocity and modulus

Shear wave imaging has the advantage of providing quantitative measures within a relatively small region of the tendon; thus tendon pathology, which is known to affect discrete areas, can be accurately identified and measured. It is believed that shear wave modalities are more reliable than strain elastography as the compressions are automatically induced by the radiation force of ultrasound beams [50]. However, the grading of evidence from this review suggests that reliability properties are still low and insufficient to arrive at such conclusions.

Direct comparison of shear wave imaging results was not possible as some articles based their analysis on estimates of Young’s modulus (E) rather than reporting the underlying shear wave velocity. These two variables are directly related but not the same [51, 52]. Converting SWV to E relies on the equation $E = 3\rho v^{2}$, where $v$ is SWV and $\rho$ is tissue density [47, 50]. This equation assumes tissue isotropy and a constant tissue density of 1000 kg/m$^{3}$; however, this may not always be true as tendons are found to be heterogeneous [53], anisotropic [54] and viscoelastic [55], with variation in their structural composition and fluid consistency especially when pathology is present [56]. Thus, it is recommended that shear wave velocity should be reported rather than the shear modulus.
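To make the relationship between the two reported quantities concrete, the conversion E = 3ρv² can be written as a small function. This is a minimal sketch under the same assumptions the equation makes (isotropic tissue, constant density of 1000 kg/m³); the function name and the example velocities are illustrative and are not taken from any included study.

```python
def youngs_modulus_kpa(swv_m_per_s, density_kg_per_m3=1000.0):
    """Estimate Young's modulus (kPa) from shear wave velocity via E = 3 * rho * v^2.

    Assumes isotropic tissue and a constant density, which may not hold for tendon.
    """
    if swv_m_per_s < 0:
        raise ValueError("shear wave velocity must be non-negative")
    modulus_pa = 3.0 * density_kg_per_m3 * swv_m_per_s ** 2
    return modulus_pa / 1000.0  # Pa to kPa


# Illustrative velocities only: doubling the velocity quadruples the modulus.
print(youngs_modulus_kpa(5.0))   # 75.0
print(youngs_modulus_kpa(10.0))  # 300.0
```

Because the modulus scales with the square of the velocity, any error in the velocity measurement or in the assumed density is amplified in the derived modulus, which is one practical argument for reporting the velocity itself.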

Although criterion validity against the gold standard method of dynamometry and ultrasonography was not investigated, several studies found shear wave imaging to be a valid tool to differentiate between healthy and diseased tendons, as tendinopathic tendons are significantly less stiff than healthy ones [29]. Results also correlate strongly with the patient-reported outcome measure, the VISA-A questionnaire, and with clinical symptoms, making shear wave imaging a valid tool to identify tendon damage with high evidence.

One of the major drawbacks of shear wave imaging is that it can result in saturation of the elastogram in ankle positions around 0° of flexion, or when the ankle is dorsiflexed, since the Achilles tendon bears high tensile loads in these positions. Given this limitation, the authors of previous research [38] suggested that the evaluation of the Achilles tendon should be limited to a relaxed or plantarflexed position, and not performed on a stretched tendon or one loaded with additional weight, as this will increase the shear wave velocity [39, 57], leading to saturation and possibly false results.

Inconsistencies were also present when calculating the shear wave velocity and modulus. The midportion of the tendon was investigated at different lengths from 1 cm from the insertion up to the myotendinous junction of the soleus muscle and the gastrocnemius myotendinous junction [39]. Furthermore, no homogeneity exists on how to identify the region of interest (ROI) to measure the shear wave velocity. Some authors based their findings on the whole thickness of the tendon, while others applied multiple circular or box ROIs of 1 to 3 mm and an average was taken within the same frame or different frames of a recorded clip for offline analysis. In a study assessing the size of ROI, transducer pressure, and time of acquisition [58], the authors found significant differences in the maximum value of the elastic modulus of the rectus femoris and patellar tendon when different ROI sizes were used. Unfortunately, no consensus exists on which protocol works best to improve reliability and with the limited available literature present and the high heterogeneity that exists, recommendations cannot be made. Further research on these technical aspects can identify whether the same results are achieved when using different methods especially when diseased tendons are measured. Recent reporting guidelines on the use of shear wave elastography [59] suggest that the region of interest information should be reported in detail, including the position, number, size of the ROIs and whether ROIs were standardised and kept constant across all participants.

Continuous shear wave elastography

cSWE overcomes the issue of saturation by using an external actuator to generate shear waves across a specified range of frequencies. However, this recent innovation is still in its infancy stage and insufficient literature is available to grade data. One limitation is that it requires an extra pair of hands due to the added actuator that is placed near the probe.

Three-dimensional Shear wave elastography

3D SWE has recently been introduced to better capture a three-dimensional volume of the tendon’s stiffness. However, its validity remains questionable because of anisotropy when tendon fibres are not perfectly aligned. Advancements in ultrasound transducers have reduced the time needed for the elastogram to stabilise, thus reducing the effects of movement artefact when obtaining results. However, any conclusions are premature and further research is needed to explore the measurement properties of this method.

Considerations on reliability, validity, measurement error, and responsiveness

Reliability and validity

Most of the above gradings of evidence were based on studies investigating healthy subjects. Although this is critical for establishing reliability and validity, uncertainty remains in the presence of tendon injury. Reliability depends on the homogeneity of the study population being assessed, affecting the generalisability of results. Homogeneity reduces the variance between participants, and the ICC values will be a conservative estimate of the reliability to be expected in a cohort more representative of the general population.
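The effect of sample homogeneity on the ICC can be illustrated with the simplified variance-ratio form of the coefficient, ICC = between-subject variance / (between-subject variance + error variance). This is a sketch under that simplified model only; the included studies may have used other ICC forms, and the standard deviations below are invented for illustration.

```python
def icc_from_variances(between_subject_sd, error_sd):
    """Simplified ICC: between-subject variance over total variance."""
    between_var = between_subject_sd ** 2
    error_var = error_sd ** 2
    if between_var + error_var == 0:
        raise ValueError("total variance must be positive")
    return between_var / (between_var + error_var)


# Same measurement error, different spread between participants (invented values).
print(icc_from_variances(between_subject_sd=2.0, error_sd=1.0))  # 0.8
print(icc_from_variances(between_subject_sd=1.0, error_sd=1.0))  # 0.5
```

With identical measurement error, the more homogeneous sample yields the lower ICC, which is why values obtained in uniform healthy cohorts tend to understate the reliability expected in a more varied population.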

The importance of having reliable instruments available becomes increasingly essential since the translation of data into clinical practice is safer and more accurately reflects the functional condition of the evaluated person. It was evident that a standardised protocol optimises reliability because repeated measurements are similar and the error of measurement arising from variation in measurement protocols is kept to a minimum. This review reported the methodological inconsistencies found that hinder further analysis of results. Standardised ultrasonographic technical settings and positioning of the patient with monitored muscle activity are imperative for better interpretable results.

An important consideration that was missing in some of the assessed reliability articles was the preconditioning of the tendon. Since tendons have elastic [60] and viscoelastic properties, with other time-dependent mechanical properties affected by loading history [48] and hydration state [61], pre-conditioning protocols should be used to ensure that the tendon behaves in a repeatable way.

Measurement error

The overall evidence for measurement error could not be determined because assessing measurement error takes into consideration intra-individual variability between repeated measurements and is often expressed as the coefficient of variation, the smallest detectable change, and the limits of agreement. An instrument with a large error of measurement may fail to detect the true meaningful change in an individual patient.

Understanding whether a truly meaningful change has occurred is of added value. This meaningful change has clinical importance in identifying an improvement in physical functioning that is large enough for the person to perceive a difference [62, 63]. If no minimal important change (MIC) is reported in studies, there is no value upon which to make a comparison, and the measures by which a meaningful change is judged may not reflect the true state. There is still ongoing debate about how the MIC should be assessed [25] and so graded evidence of the measurement error was indeterminate for all elastography modalities. This value also varies according to context, as an MIC derived from study participants who are healthy may have limited value for studies that investigate participants with tendinopathy.
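The link between reliability, measurement error and detectable change can be sketched with two formulas that are widely used in measurement research but are not stated in this review: SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM. The function names and all numeric inputs below are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem):
    """Smallest detectable change at the 95% level: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


def exceeds_measurement_error(observed_change, sdc):
    """True if an observed change is larger than the smallest detectable change."""
    return abs(observed_change) > sdc


# Hypothetical instrument: between-subject SD of 2.0 units and ICC of 0.75.
sem = sem_from_icc(sd=2.0, icc=0.75)
sdc = smallest_detectable_change(sem)
print(f"SEM = {sem:.2f}, SDC = {sdc:.2f}")  # SEM = 1.00, SDC = 2.77
print(exceeds_measurement_error(1.5, sdc))  # False
```

In this invented case a change of 1.5 units in an individual could not be distinguished from measurement error. Whether the SDC is acceptable still depends on an MIC, which is the value missing from the included studies.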

Responsiveness

Evidence on responsiveness cannot be determined as very few articles were found to consider this measurement property. Only one article using axial strain elastography and another using SWE had a longitudinal design able to determine the changes occurring over a period of time. This shows that no measurement instrument has yet been established as a treatment monitoring instrument, which hampers the detection of early treatment effects or possible complications.

Conclusions

This systematic review explored and highlighted the paucity of evidence for the measurement properties of different elastography methods. Our data synthesis focused on the qualitative approach as considerable heterogeneity between studies was present, thus not allowing for a meta-analysis of results. The qualitative approach adopted permitted the best possible synthesis of evidence that accounted for between-study similarities, the quality of each study, and the consistency of measurement properties reported across different studies. There are a limited number of studies exploring quantitative elastography on Achilles tendinopathy as most evidence found in this review was based on a healthy population. Only articles published in the English language were eligible for inclusion. It is, therefore, possible that other potential studies might have been excluded.

Based on the identified evidence on the measurement properties of elastography, none of the different elastography methods showed superiority over the others with gradings ranging from very low to moderate. However, strain elastography and shear wave elastography reported as shear wave velocity have the best potential to be used in the identification of tendinopathy. Further high-quality studies with robust longitudinal designs to investigate responsiveness are needed to aid the monitoring of tendon recovery and detect any differences over time that may be attributed to true physiological changes.