Introduction

The left atrium (LA) is of hemodynamic importance for overall cardiac performance through its reservoir, conduit, and booster pump functions [1,2,3]. The components of LA function are traditionally estimated using Doppler analysis of trans-mitral and pulmonary vein flow. However, the evaluation of LA function by Doppler analysis can be affected by left ventricle (LV) dysfunction, and is therefore limited [4,5,6]. Two-dimensional (2D) quantification of cardiac chamber size can be used to assess LA remodeling and function. The LA anteroposterior (AP) linear dimension measured by M-mode echocardiography captures only a single parameter of the LA [7, 8]. LA volume measured by 2D echocardiography reflects LA chamber size in all directions. However, the value is heavily dependent on geometric assumptions [9]. Furthermore, the lack of a standardized methodology for three-dimensional (3D) echocardiography prevents the widespread use of 3D echocardiography to measure LA function [10, 11].

Because of the aforementioned drawbacks, there is increasing interest in speckle-tracking echocardiography (STE), which provides visualization of all phases of LA function [12, 13].

Initially, there was no specialized 2D-STE software for the assessment of LA deformation, and studies evaluating LA function used software designed for evaluation of the LV [14]. Measuring LA strain without dedicated software is controversial [13, 15]. In addition, a number of different software packages are commercially available [16, 17]. Although consensus has been reached that the relative variation in LV strain measurement among different software packages should not exceed 10% before the technique can be recommended for clinical use [18], no such threshold has been agreed upon for measurement of LA strain. As such, the widespread use of LA strain measurement could be hindered by the uncertainty of measuring strain with different software [19].

Thus, the purpose of this study was to compare the results of measuring LA strain with 3 different software packages.

Methods

Study population

Adult patients receiving echocardiography at the clinics of the Adult Echocardiography Lab of Guangdong Provincial People’s Hospital from December 2016 to September 2017 were recruited for the study. This study was approved by the Research Ethics Committee of Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences. Inclusion criteria were: (1) older than 18 years; (2) sinus rhythm at examination; (3) agreed to participate in the study and provided written informed consent. Exclusion criteria were: (1) onset of atrial fibrillation during the examination; (2) valvular heart disease (moderate or severe heart valve stenosis or valve replacement); (3) implantation of a pacemaker or defibrillator; (4) poor image quality; (5) did not provide informed consent. Clinical data including a history of hypertension, diabetes mellitus, dyslipidemia, and smoking were collected by trained research staff at the time of the first hospital admission.

We identified 99 participants. Of these, 35 subjects were excluded from analysis because they did not provide consent (n = 5), because sinus rhythm converted to atrial fibrillation (n = 7), or because image quality was inadequate owing to an unclear LA endocardium (n = 12) or LA foreshortening (n = 11). The final study population consisted of 64 individuals (male/female, 32/32; mean age = 44.1 ± 16.1 years).

Echocardiographic acquisition

Echocardiographic studies were performed using a GE Vivid 9 (GE Vingmed Ultrasound, Horten, Norway) echocardiograph system. Examinations were performed with subjects in the left lateral recumbent position. Apical 4- and 2-chamber views were obtained using conventional 2D gray scale echocardiography with an M5S probe (2–4 MHz) at a frame rate of 40–71 frames/s, in accordance with current American Society of Echocardiography recommendations [20]. Both apical views were optimized in terms of orientation, depth, and gain to avoid LA foreshortening and to visualize the entire LA throughout the cardiac cycle. Five cardiac cycles of each plane were stored in cine loop format so that the images of best quality could subsequently be selected for off-line speckle-tracking analysis.

Conventional echocardiography analysis

Offline analysis of images was performed using EchoPAC version 201 (GE Vingmed Ultrasound) (VSSLV) software, and Image Arena 2D Cardiac Performance Analysis version 4.6 (TomTec Imaging Systems, Unterschleissheim, Germany) (VISLA and VISLV) software, yielding 3 strain analysis sets for each examination (Fig. 1).

Fig. 1

Bland-Altman scatter diagram of strain among VSSLV, VISLV, and VISLA. Compared with VIS, VSS had a larger bias and a wide 95% consistency range. Within VIS, the bias was small and the consistency range was relatively narrow. Sct: VSSLV - VISLV (A1), VSSLV - VISLA (A2), VISLV - VISLA (A3). Scd: VSSLV - VISLV (B1), VSSLV - VISLA (B2), VISLV - VISLA (B3). Sr: VSSLV - VISLV (C1), VSSLV - VISLA (C2), VISLV - VISLA (C3)

From the parasternal long-axis view, LV end-diastolic diameter and LV end-systolic diameter were obtained at the level of the mitral valve tips by M-mode echocardiography. Left ventricular ejection fraction (LVEF) was calculated automatically by the GE Vivid 9 system. From the apical 4- and 2-chamber positions, maximum LA volume (LAVmax; measured on the 2D frame just before mitral valve opening), LA pre-atrial contraction volume (LAVpreA; measured on the frame just before the onset of atrial emptying), and LA minimum volume (LAVmin; measured on the frame at end-diastole with the smallest LA volume) were computed separately following American Society of Echocardiography guidelines, using the biplane modified Simpson’s method of discs. The indices and formulas calculated from the volumes were as follows. Total LA stroke volume = LAVmax − LAVmin. Active LA stroke volume = LAVpreA − LAVmin. Passive LA stroke volume = LAVmax − LAVpreA. Active LA emptying fraction = (active LA stroke volume / LAVpreA) × 100%. Passive LA emptying fraction = (passive LA stroke volume / LAVmax) × 100%. LA expansion index = (total LA stroke volume / LAVmin) × 100%. Pulsed wave Doppler at the apical position was used to obtain mitral inflow velocity between the tips of the mitral leaflets. E/e′ was calculated as early mitral inflow velocity (E) divided by the average of septal and lateral mitral annular peak early diastolic velocity (e′) obtained by pulsed wave tissue Doppler imaging (TDI).
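The volumetric indices defined above can be written out as a short calculation. The following is a minimal sketch only: the class name `LAVolumes`, the function name `la_indices`, and the example volumes are hypothetical illustrations, not study data or the software used in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LAVolumes:
    """Biplane LA volumes, all in the same unit (assumed here to be mL)."""
    lav_max: float    # just before mitral valve opening
    lav_pre_a: float  # just before atrial contraction
    lav_min: float    # smallest LA volume at end-diastole


def la_indices(v: LAVolumes) -> dict:
    """Apply the stroke volume, emptying fraction, and expansion index formulas."""
    total_sv = v.lav_max - v.lav_min
    active_sv = v.lav_pre_a - v.lav_min
    passive_sv = v.lav_max - v.lav_pre_a
    return {
        "total_sv": total_sv,
        "active_sv": active_sv,
        "passive_sv": passive_sv,
        "active_ef_pct": active_sv / v.lav_pre_a * 100.0,
        "passive_ef_pct": passive_sv / v.lav_max * 100.0,
        "expansion_index_pct": total_sv / v.lav_min * 100.0,
    }


# Illustrative (made-up) volumes, not taken from the study population.
example = la_indices(LAVolumes(lav_max=40.0, lav_pre_a=22.0, lav_min=16.0))
```

With these illustrative inputs, the passive emptying fraction is 18/40 × 100% and the expansion index is 24/16 × 100%.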

Speckle-tracking echocardiography analysis

Speckle-tracking analysis was performed with the 3 software packages: VSSLV, VISLA, and VISLV. VSSLV analyzed images derived from raw data, while VISLV and VISLA analyzed images from compressed data.

According to the consensus document of the EACVI/ASE/Industry Task Force [21], the LA was traced as follows: tracing of the LA endocardial border started at the endocardial border of the mitral annulus and continued up to the opposite side of the mitral annulus, while carefully excluding the pulmonary veins and LA appendage orifices. The apical two-chamber view was also analyzed to obtain a biplane calculation of the LA strain.

The endocardial border was traced manually or was user defined. The user could review the tracking path and manually adjust it after running. The number of frames that could be adjusted was half of the frame rate with VISLA and VISLV, and 1 frame with VSSLV. With VSSLV and VISLV, the tracking location marker was placed at the end of the QRS wave. With VISLA, the marker was placed at the beginning of the P wave. With VISLV and VISLA, the amplitude of the ECG could not be adjusted.

A region of interest (ROI) was selected in the endocardial mode of VSSLV and VISLV, while it included the endocardium by default with VISLA. VSSLV uses a full-thickness ROI including the endocardium and epicardium, the position and size of which can be adjusted. In contrast, VIS uses a single ROI tracing line, which cannot be adjusted.

The software algorithms automatically performed speckle-tracking on a frame-to-frame basis.

Exclusion of the individual studies was done upon visual assessment, when abnormal curves were believed to be artefactual. Strain values were obtained from the apical 4- and apical 2-chamber views.

Intra-observer and inter-observer reproducibility of LA strain values were analyzed with repeated measurements by the same observer at 2 different time points, and by a second independent observer. All observers were blinded to the results of the other software packages and to previous strain results when assessing reproducibility.

Statistical analysis

Statistical analysis was performed using SPSS version 20 (IBM, Armonk, New York), Empower (R) (www.empowerstats.com, X&Y Solutions, Inc., Boston MA), and R (http://www.R-project.org). All measurements were tested for distribution normality with the Kolmogorov–Smirnov test. Continuous variables were expressed as mean ± standard deviation (SD). Categorical variables were reported as percentages. Variability was expressed by the coefficient of variation (CV), defined as CV = S/X × 100%, where S is the standard deviation and X is the mean value. Differences between groups were analyzed for statistical significance with the unpaired t-test or Mann-Whitney U test, as appropriate. Agreement between the 3 speckle-tracking methods was assessed by Bland–Altman analysis. The bias (mean difference) and the 95% limits of agreement (2 SDs around the mean difference) between the measurements derived from each system were calculated. The reliability of inter-software, intra-observer, and inter-observer measurements was evaluated by the intra-class correlation coefficient (ICC) [22]. An ICC of ≥ 0.90 was considered excellent reliability, an ICC of ≥ 0.70 to < 0.90 good reliability, an ICC of ≥ 0.50 to < 0.70 moderate reliability, an ICC of ≥ 0.30 to < 0.50 poor reliability, and an ICC of < 0.30 very poor reliability. Pearson’s correlation coefficient and the point-biserial correlation coefficient were used to assess the correlation between strain values and baseline clinical characteristics. The Z-test (after transformation) was used to test the difference in correlation coefficients among the 3 software packages. All statistical tests were 2-sided, and a p-value < 0.05 was considered statistically significant.
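The variability and agreement statistics described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the SPSS or R procedures actually used: the function names are hypothetical, the sample (n − 1) standard deviation is assumed, the limits of agreement use the 2-SD convention stated in the text, and the lowest reliability band is read as ICC < 0.30.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = S / X * 100%, with S the sample SD and X the mean (assumed n - 1 SD)."""
    return stdev(values) / mean(values) * 100.0


def bland_altman(a, b, k=2.0):
    """Return (bias, lower limit, upper limit) for paired measurements a and b.

    The bias is the mean of the paired differences a - b; the limits of
    agreement are bias -/+ k SDs of those differences (k = 2 as in the text).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - k * sd, bias + k * sd


def grade_icc(icc):
    """Map an ICC value onto the reliability bands described in the text."""
    if icc >= 0.90:
        return "excellent"
    if icc >= 0.70:
        return "good"
    if icc >= 0.50:
        return "moderate"
    if icc >= 0.30:
        return "poor"
    return "very poor"


# Made-up paired readings from two hypothetical software packages.
software_a = [10.0, 12.0, 14.0]
software_b = [9.0, 12.0, 12.0]
bias, lower, upper = bland_altman(software_a, software_b)
```

Passing `k=1.96` instead of the default reproduces the other common convention for 95% limits of agreement.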

Results

All study subjects showed normal systolic function as determined by LVEF. However, the AF group had a significantly larger LA volume index (24.4 ± 5.9 vs. 31.7 ± 11.3, p = 0.002), larger LA volumes (LAVmax: 39.6 ± 10.8 vs. 55.7 ± 20.9, p < 0.001; LAVmin: 15.3 ± 6.1 vs. 25.8 ± 15.5, p < 0.001; LAVpreA: 21.9 ± 7.9 vs. 37.3 ± 18.5, p < 0.001), lower passive LA emptying fraction (%) (44.9 ± 10.2 vs. 34.3 ± 13.2, p < 0.001), lower LV emptying fraction (%) (61.7 ± 8.6 vs. 55.2 ± 12.8, p = 0.020), and lower LA expansion index (%) (176.0 ± 73.4 vs. 140.3 ± 65.8, p = 0.045) (Table 1).

Table 1 General characteristics of the study population: clinical and echocardiographic features

The mean heart rate showed no significant difference between the 2-chamber and 4-chamber views used for strain analysis (67.1 ± 10.7 vs. 67.1 ± 10.7 beats per min [bpm], p = 1.00). As shown in Additional file 1: Tables S1–S3, the correlation coefficients between strain values and patients’ baseline clinical characteristics were calculated and compared among the 3 software packages (VSSLV, VISLV, and VISLA). To assess the comparability between paired software (VSSLV-VISLV, VSSLV-VISLA, and VISLV-VISLA), the overall ratio of non-significance was calculated for each software package and for each pair. For VSSLV, VISLV, and VISLA, the ratios were 92.86, 96.03, and 96.83%, respectively. For the pairs VSSLV-VISLV, VSSLV-VISLA, and VISLV-VISLA, the ratios were 92.05, 93.65, and 100%, respectively.
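The Methods mention a Z-test after transformation for comparing correlation coefficients, without naming the transformation. One common choice is Fisher's r-to-z transformation, sketched below as an assumption rather than as the procedure actually used. The function name `fisher_z_compare` is hypothetical, and the formula shown treats the two correlations as coming from independent samples; correlations estimated on the same subjects are dependent, so this simple form is only an approximation in that setting.

```python
import math


def fisher_z_compare(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two correlation coefficients.

    Uses Fisher's transformation z = atanh(r) and assumes independent samples
    of sizes n1 and n2 (an assumption; not stated in the text).
    Returns (z statistic, two-sided p-value).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Made-up correlation coefficients for illustration only.
z_stat, p_value = fisher_z_compare(0.60, 64, 0.20, 64)
```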

Strain analysis

Strain analysis was obtained in all 64 subjects. Representative examples of strain measured by the 3 systems are shown in Table 2. When using VSSLV, there were significant differences in intra-observer measurement of Sct and Sr, and inter-observer measurement of Sct (p < 0.05). Comparing VISLV and VISLA, there were no significant differences in intra-observer or inter-observer measurements (p > 0.05).

Table 2 Left atrial strain measured by different observers (mean ± SD, n = 64)

Strain measurements between VSSLV and VIS (VISLV and VISLA) demonstrated considerable variability in Sct, Scd, and Sr as assessed by the coefficient of variation (CoV). As shown in Table 3, the CoV for measurements between VSSLV and VIS (VISLA and VISLV) ranged from 0.04 to 31.16% (Sct), −1.94 to 38.21% (Scd), and −1.29 to 36.40% (Sr).

Table 3 The CoV for the values measured by VSS and VIS (n = 64)

Strain measurements showed good agreement within VIS (VISLA and VISLV), with smaller mean differences of 0.02 (Sct), 0.76 (Scd), and −0.50 (Sr), and narrower limits of agreement ranging from −5.42 to 7.05. Between VSSLV and VISLV, mean differences were 3.51 (Sct), 6.41 (Scd), and −10.20 (Sr), with limits of agreement ranging from −21.30 to 16.99. Between VSSLV and VISLA, the mean differences were 3.53 (Sct), 7.17 (Scd), and −10.70 (Sr), with limits of agreement ranging from −23.03 to 17.69. Figure 2 shows the Bland-Altman scatter plots.

Fig. 2

Schematic diagram of the left atrial global longitudinal strain indices measured by VSSLV, VISLV, and VISLA. The apical four-chamber view was obtained using conventional 2D echocardiography. The left atrial strain (a) measured by VSSLV; the left atrial strain (b) measured by VISLA; the left atrial strain (c) measured by VISLV. The white dashed line (a) and white lines (b and c) represent the average strain. r, reservoir phase; cd, conduit phase; ct, contraction phase. The respective strains are Sr, calculated as the difference between onset of filling and end-diastole (positive value); Scd, calculated as the difference between onset of atrial contraction and onset of filling (negative value); Sct, calculated as the difference between end-diastole and onset of atrial contraction (negative value)

In Table 4, the intra-class correlation coefficients revealed excellent reliability for VIS, with the correlation coefficients generally > 0.9.

Table 4 Intra-class correlation coefficients of the different software (n = 64)

Overall, strain values derived from the same images differed less within VIS than between VIS and VSS.

Inter-observer and intra-observer variability and correlation in the determination of LA strain

As shown in Table 5, the inter-observer and intra-observer variability (mean and range) of the measurements derived from VSSLV were 2.35% (1.52–3.30%) and 2.67% (1.95–3.49%), respectively. The inter-observer differences in the measurement of Sct and Sr were statistically significant (p < 0.05), and the intra-observer difference in the measurement of Sct was significant (p < 0.05). The inter-observer and intra-observer variability (mean and range) of VISLV were 1.39% (0.71–2.31%) and 3.87% (3.67–4.10%), respectively. Neither the intra-observer nor the inter-observer differences were significant (p > 0.05).

Table 5 Intra-observer and inter-observer CoVs for strain value measured by the same software (n = 64)

For VISLA, the inter-observer and intra-observer variability (mean and range) were 0.55% (0.05–1.18%) and 1.62% (0.85–2.68%), respectively, and there were no significant differences (p > 0.05).

The results of the analysis of bias and limits of agreement within the same software are shown in Table 6. The bias for measurements by VISLA was smaller than that for VSSLV and VISLV.

Table 6 Bias and limits of agreement for the same software (n = 64)

The inter-observer and intra-observer intra-class correlation coefficients within the same software are shown in Table 7. For inter-observer ICC, the ranking was VISLA > VSSLV = VISLV; for intra-observer ICC, it was VSSLV > VISLA > VISLV.

Table 7 Intra-observer and inter-observer intra-class correlation coefficients for the same software (n = 64)
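The text does not state which form of the ICC was computed. As a hedged illustration, the sketch below implements one widely used form, the two-way random-effects, absolute-agreement, single-measurement ICC(2,1) of Shrout and Fleiss, from the ANOVA mean squares. The function name `icc_2_1` and the example ratings are hypothetical and are not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of rows, one row per subject and one column per
    rater (or software package); all rows must have the same length.
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Made-up strain readings by two hypothetical observers on three subjects.
example_icc = icc_2_1([[30.0, 31.0], [25.0, 24.0], [40.0, 41.0]])
```

Because ICC(2,1) measures absolute agreement, a systematic offset between observers or software packages lowers the coefficient even when the readings are perfectly correlated.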

Discussion

In this study, we sought to investigate the reproducibility of various LA strain analyses obtained using VSS (VSSLV) and VIS (VISLV and VISLA), and to evaluate the agreement between the methods. The main results of this study are as follows. (1) When comparing VSS (VSSLV) and VIS (VISLV and VISLA), the absolute values of the CoV for strain measured by VSS were larger than those measured by VIS. In addition, the ICC between VIS and VSS indicated that the measurement reliability was weak (generally < 0.6). However, when comparing strain measured by VISLA and VISLV, the bias and CoV were smaller, and reliability was good. Therefore, measuring LA strain across different vendors is not recommended. (2) The reproducibility of inter-observer and intra-observer measurement within the same software was good. The CoVs were minor and < 10% [18], and Bland-Altman analysis suggested that the bias of the 3 software packages was small and the ranges of consistency were stable. VSS had no obvious advantages compared with VIS, and measurements derived from VISLA had the smallest bias. (3) Strain values were correlated with patient baseline clinical characteristics. The measurements obtained with the same software were more consistent than those obtained with different software, which might influence the clinical interpretation of the results. These findings are important in view of future clinical applications of 2D-STE, particularly for patient diagnosis and follow-up in centers where a variety of cardiac ultrasound systems are in use.

In a study of similar design focusing on LV and RV strain [23], the authors found good reproducibility for global longitudinal strain, but only moderate reproducibility for circumferential strain and poor reproducibility for radial strain when comparing LV strain, and good reproducibility across different ultrasound platforms and software packages when comparing RV strain [24]. The main contractile direction of the left atrium is longitudinal [25], and it is reasonable to posit that Sct had a smaller bias and CoV across different ultrasound platforms and software packages than Sr and Scd. The strain values measured were generally correlated with baseline clinical characteristics. However, when Scd and Sr were measured with different software, the differences in the correlation coefficients were prominent. Even with improved observer reliability, the intra-observer differences between Scd and Sr measured by VISLV and VISLA were still significant, while there was no significant intra-observer difference in repeated measurements by the same software. A consensus has been reached that the relative variation in strain measurements among different software should not exceed 10% [18]. We believe 10% is a proper reference, and found that the variations between VIS and VSS ranged from 31.16 to 42.76%, while those within VIS did not exceed 10%. Therefore, we suggest that different software packages are not interchangeable when analyzing 2D strain data to assess LA strain in the same subjects.

Compared with measurements derived from VIS, those from VSS had a large bias and a wide 95% consistency range, while within VIS the bias was small and the consistency range was relatively narrow. We believe the differences in CoV and bias between VIS and VSS are due to dissimilarity in the ROI [26, 27]. During tracing of the left atrium, VSS tracks speckles from the mid-myocardium, while VIS determines strain from the endocardium. Therefore, VSS places a full-thickness ROI, whereas with VIS the line is placed slightly within the endocardial wall. Variation in the exact placement of the ROI will cause variation in longitudinal strain values, as longitudinal strain decreases from the endocardium to the epicardium [28, 29]. It is also important to note that the different software packages use different algorithms to calculate deformation and express the results. However, the algorithms are not publicly disclosed.

In contrast to Takigiku et al. [30], who showed that intra- and inter-observer ICCs were always better than inter-vendor agreement, our study showed that VSS had the best inter-observer ICCs, while VISLA had the best intra-observer ICCs. We suppose that the repeatability of the zero-strain reference chosen at the beginning of the P wave could cause the differences [27, 31]. GE has access to the raw data, while TomTec does not have access to the manufacturers’ raw data. Therefore, VIS analyzes strain from compressed data, which could decrease the repeatability of zero-strain selection and increase the error of the strain output. However, with improved operator experience, it is possible to improve the repeatability of the zero-strain setting to a certain extent by correcting the starting point of LA contraction with the 2D image [27]. Furthermore, with VISLA the tracking location marker was placed at the starting frame of the P wave before LA contraction began, while with VSS the marker was placed at the end of the QRS wave, where LA contraction has started. Therefore, the left atrium will be more dilated with VISLA, which makes tracking more accurate by avoiding the entrance of the pulmonary veins and the LA appendage [32]. Even though VSSLV has access to the raw data (before Digital Imaging and Communications in Medicine [DICOM] formatting), while VISLV and VISLA do not have access to the manufacturers’ raw data and can only use the DICOM data, VSS had no obvious advantages compared with VIS, and measurements derived from VISLA had the smallest bias.

Conclusion

Measurement of LA strain is an evolving echocardiographic technique for the assessment of LA function, and has been studied in a variety of clinical settings [33]. The recent European Association of Cardiovascular Imaging and American Society of Echocardiography 2018 task force highlighted that differences between vendor software for strain assessment remain a very important barrier to the widespread use and applicability of LA strain. To our knowledge, there are still no published data on the variability between different ultrasound software packages regarding LA analyses, and this is the first study focusing on the reproducibility of LA strain analyses obtained by different software packages. Our findings reveal that when comparing VSS (VSSLV) and VIS (VISLV and VISLA), the absolute values of the CoV for strain measured by VSS were larger than those measured by VIS. Therefore, we suggest that the same 2D-STE software should be used to analyze left atrial strain during patient follow-up. Because measurements derived from VISLA had the smallest bias, we think that dedicated 2D-STE software should be used to analyze LA strain.

Limitations

The study was not designed to assess the accuracy of LA strain measurements, as there was no comparison to a gold standard. This study focused on determining the reproducibility of LA strain measurements among different vendors.

We used a small number of subjects to assess consistency among 3 software packages. However, there are no published data on the variability between different ultrasound software packages regarding LA analyses, and the use of a larger number of subjects may help clarify this issue.

Although we included patients with AF, they were in sinus rhythm at the time of analysis, and further studies are needed to assess the use of LA strain in patients who are not in sinus rhythm.

We excluded patients with poor image quality. However, because image quality is important for STE analysis, prospective image acquisition with a focus on LA optimization would benefit further studies of LA strain.