Cluster analysis of extracellular matrix biomarkers predicts the development of impaired systolic function within 1 year of acute myocardial infarction

The clinical utility of combining extracellular matrix (ECM) biomarkers to predict the development of impaired systolic function following acute myocardial infarction (AMI) remains largely undetermined. A combination of ELISA and multiplexing assays were performed to measure matrix metalloproteinase (MMP)-2, MMP-3, MMP-8, MMP-9, periostin, N-terminal type I procollagen (PINP) and tissue inhibitor of matrix metalloproteinase-1 (TIMP-1) in plasma samples from 120 AMI patients. All patients had an echocardiogram within 1 year of AMI, and were divided into impaired (n = 37, LVEF < 50%) and preserved (n = 83, LVEF ≥ 50%) systolic function groups. Exploratory factor analysis was performed on log-transformed biomarkers using principle axis analysis with Oblimin rotation. Cluster analysis was performed on log-transformed and normalised biomarkers using Ward’s method of minimum variance and the squared Euclidean distance metric. Upon univariate analysis, current smoking, prescription of ACE inhibitors at discharge, peak hsTnT > 610 ng/L (median), MMP-8 levels, Factor 1 scores and Cluster One assignment were predictive of impaired systolic function. Upon multivariate analysis, Cluster One assignment (odds ratio [95% CI], 2.74 [1.04–7.23], p = 0.04) remained an independent predictor of systolic dysfunction in combination with clinical variables. These observations support the usefulness of combining ECM biomarkers using cluster analysis for predicting the development of impaired systolic function in AMI patients. Supplementary Information The online version contains supplementary material available at 10.1007/s00380-022-02118-8.


Introduction
A major consequence of acute myocardial infarction (AMI) is the subsequent development of adverse left ventricular (LV) remodelling. Described by progressive changes to LV geometry and function, adverse remodelling is closely 1 3 associated with increased risk of poor cardiovascular outcomes [1] and persists following AMI despite optimal medical therapy and clinical management [2,3]. Systolic dysfunction is a clinical manifestation of adverse LV remodelling and is characterised by reduced left ventricular ejection fraction (LVEF) [4]. While commonly presenting as asymptomatic in patients, systolic dysfunction is a significant risk factor for the development of heart failure with reduced ejection fraction (HFrEF) [5].
The cardiac extracellular matrix (ECM) is comprised of multiple molecular factors that provide structural support to the myocardium and facilitate cardiac repair processes following ischaemic injury [6]. Collagen deposition is central for stable scar formation and is critical for the development of cardiac fibrosis [7]. While localised to the myocardium, by-products of collagen maturation can be measured as biomarkers of collagen synthesis, such as N-terminal type I procollagen (PINP). In addition, matrix metalloproteinases (MMPs) and their endogenous regulators, tissue inhibitors of matrix metalloproteinases (TIMPs), are important proteolytic enzymes involved in remodelling the architecture of the myocardium following injury [8]. Matricellular proteins, such as periostin, have a limited role in matrix architecture, and instead facilitate matrix remodelling processes by acting as an interface between the cellular and ECM network [9]. Indeed, a number of candidate ECM biomarkers have been implicated as prognostic tools for the development of systolic dysfunction [10][11][12][13] and heart failure [14,15]. However, current studies are mostly limited to single biomarker analysis and inconsistencies of prognostic effectiveness exist within the literature. Potentially, investigating strategies to combine multiple biomarkers may more optimally capture cardiac remodelling, which is characterised as a complex pathophysiological process.
We have previously demonstrated that combining biomarkers using cluster analysis can separate AMI patients into groups with differential risk based on Global Registry of Acute Coronary Events (GRACE) scores and peak highsensitivity Troponin T (hsTnT) levels [16]. Cluster analysis describes the partitioning of similar objects together and dissimilar objects apart using the expression of input variables, such as biomarker levels [17]. Subsequently, cluster analysis is a powerful tool to investigate patient risk based on the combined expression of multiple biomarkers. In our previous study, we also utilised exploratory factor analysis (EFA) to examine the underlying interrelationships between ECM biomarkers [16]. In addition to this purpose, factor scores can also be generated from EFA as a mechanism to investigate the relationship between combined variables and patient risk [18]. Thus, cluster analysis and EFA are two techniques that can be applied to a clinical population to investigate the utility of combining biomarkers.
The aim of this study was to assess if combining ECM biomarkers using EFA or cluster analysis could more accurately predict the development of impaired systolic function in AMI patients when compared to single biomarker analysis. Seven ECM biomarkers were examined in this study based on previously published findings that suggest MMP-2, MMP-3, MMP-8, MMP-9, periostin, PINP and TIMP-1 were an optimal biomarker panel to assess prognostic outcomes in AMI [19].

Study population
Patients diagnosed with acute coronary syndromes (ACS) and undergoing coronary angiography with or without percutaneous coronary intervention at Wellington Regional Hospital between January 2012 and September 2018 were prospectively recruited into the Wellington ACS Registry. ACS was defined as having symptoms suggestive of myocardial ischaemia for greater than 10 min alongside either troponin elevation or ≥ 1 mm of new ST-segment deviation or T wave inversion as identified using an electrocardiogram in a minimum of two contiguous leads [20]. Patients were excluded from this registry if they had a platelet count less than 100 × 10 9 /L, a known platelet function disorder or were administered a fibrinolytic agent within 24 h or a glycoprotein IIb/IIIa receptor agent within a week of enrolment. From this cohort, a subset of 120 AMI patients were selected who had no previous history of myocardial infarction or atrial fibrillation and who had echocardiogram measurements at a remote time point within 1 year from index admission. This study was approved by the Lower South Regional Ethics Committee (LRS/11/09/035) and the New Zealand Central Health and Disabilities Ethics Committee (16/CEN/68).

Data collection and blood sampling
Clinical characteristics were prospectively collected upon review of medical records. Whole blood was collected into sodium citrate tubes (0.109 M, BD Vacutainer, New Jersey, USA) either from a peripheral vein using a 21-gauge needle before angiography or from the arterial sheath during cardiac catheterization prior to heparin administration. Citrated whole blood was separated into plasma by centrifugation at 1500 × g for 12 min, and aliquots were stored at -80 °C for subsequent analysis. Cardiac specific peak hsTnT measurements were recorded for all AMI patients as part of routine standard of care using the Elecsys Troponin T high-sensitive test (Roche Diagnostics, Basel, Switzerland).

Echocardiogram assessment
A 2-dimensional transthoracic echocardiogram was performed on each patient by a trained sonographer during clinical follow-up within 1 year of AMI onset. A standard transthoracic echocardiographic imaging protocol was used [21], with apical four-chamber and two-chamber views; LV short axis were recorded from the left parasternal region at the following three levels: the mitral valve, the mid-papillary muscle, and the apex. All calculations and interpretations of echocardiogram reports were performed by cardiologists or cardiac sonographers, and missing values identified upon retrospective analysis were completed by an advanced cardiology trainee. Systolic function was categorised into preserved and impaired function based on the updated recommendations by the American Society of Echocardiography and the European Association of Cardiovascular Imaging (ASE/EACVI) [22]. Preserved function was defined as LVEF ≥ 50% and impaired function was defined as LVEF < 50%.

Biomarker measurement
The biomarker panel selected for this study comprised the following: MMP-2, MMP-3, MMP-8, MMP-9, periostin, PINP and TIMP-1. In a previous study, we demonstrated that these ECM biomarkers were of interest to investigate in a prognostic setting following AMI [19]. Specifically, we showed that levels of these biomarkers were significantly altered in AMI patients when compared to healthy volunteers and that these biomarkers were stably expressed in the first 3 days following symptom onset, confirming opportunistic blood sampling during acute hospital admission. Biomarker levels were quantified in plasma samples of ACS patients blinded to systolic function categorisation. The levels of all MMPs and periostin were measured using two Magnetic Luminex panels according to manufacturer instructions. Panel One contained MMP-2, MMP-3 and MMP-9 (Lot L136374, 1:25 sample dilution) and Panel Two comprised MMP-8 and Periostin (Lot L125986, 1:2 sample dilution). The mean fluorescence intensity for each analyte was measured using dual-lasers on the Luminex 200 analyzer (Sigma-Aldrich, Massachusetts, USA). Experimental data were analysed by fitting a 5-PL curve to the standard analyte curves. Intra-assay coefficients of variation ranged between 4.4% and 7.2%, and inter-assay coefficients of variation were between 4.5% and 14.1%. Levels of PINP (Human PINP ELISA, MyBioSource, California, USA) and TIMP-1 (Human Duoset ELISA, R&D Systems, Minnesota, USA) were measured using ELISA according to manufacturer instructions. Optimal density was measured at 450 nm, with a wavelength correction set to 570 nm. 4-PL and 5-PL standard curves were generated to determine sample concentrations of TIMP-1 and PINP, respectively. Intra-assay coefficients of variation ranged between 3.4% and 10.0%, and inter-assay coefficients of variation were between 2.9% and 7.4%.

Statistical analysis
Continuous variables were assessed for normality using the Shapiro-Wilk test. Parametric continuous variables were reported as mean ± standard deviation (SD) and nonparametric continuous variables were reported as median (interquartile range; IQR). Categorical variables were reported as frequencies (percentages). Mann-Whitney U or Kruskal-Wallis H Test was undertaken to compare continuous and categorical variables. Chi Square testing was utilised to compare categorical variables. To construct multivariate regression modelling, all demographics and clinical characteristics with a p-value < 0.05 on univariate analysis were included in binary logistic regression analysis. Statistical significance was determined in this study by p < 0.05. All basic statistical analysis were carried out in either GraphPad Prism Software v.7.04 (GraphPad Software Inc; California, USA) or SPSS v.24 (IBM; New York, USA).
EFA and cluster analysis were performed as previously described [16]. Briefly, EFA was performed on seven logtransformed ECM biomarkers using principle axis analysis with Oblimin rotation using SPSS v.24. Eigenvalues > 1 and parallel analysis were performed to confirm factor number. Parallel analysis was performed using an online engine [23]. All variables with factor loadings > 0.3 were presented in this study. Cluster analysis was performed exclusively on ECM biomarker data and did not include clinical characteristics or patient risk factors. Prior to cluster analysis, biomarker data were log-transformed to normalise distribution and each biomarker was standardised to the same scale (mean = 0, SD = 1) to account for large variance between biomarkers which could influence cluster assignment. Subjects were partitioned using agglomerative hierarchical clustering using Ward's method of minimum variance and the squared Euclidean distance metric in R version 4.0.2 [24]. Identification of optimal cluster number was assessed using two clustering indices measured using the NbClust package [25] and visualised using the Factoextra package [26] in R version 4.0.2 [24]. A cluster size of two was determined as optimal for this study based on Average Silhouette Width ( Supplementary Fig. 1) and the 'Within Sum of Squares' methodology ( Supplementary Fig. 2).

Clinical characteristics
A summary of the clinical characteristics and levels of ECM biomarkers for the patient population is shown in Table 1. There were 83 patients (69.2%) with preserved

EFA analysis of ECM biomarkers
EFA was performed on log-transformed biomarkers using principle axis analysis with Oblimin rotation. Model fit was assessed using Kaiser-Meyer-Olkin (KMO) and was above the required threshold of 0.5 with a value of 0.6. Collinearity of the dataset was assessed using Bartlett's Test for Sphericity and was found to be significant (p < 0.0001). Factor loadings were suppressed below a 0.3 threshold to only include meaningful variables. EFA identified a two-factor solution, and the rotated factor matrix is shown in Fig. 1a. Factor 1 comprised of MMP-8 and MMP-9 with high loading values > 0.6, and MMP-3 and TIMP-1 with lower loading factors. Factor 2 comprised of MMP-3, MMP-2, TIMP-1 and PINP all with moderate loading factors between 0.4 and 0.5. MMP-3 and TIMP-1 were cross-correlated between factors. Both factors had higher loadings on Factor 2, suggesting they contributed to the composition of this factor more than Factor 1. Despite being included in the EFA analysis, periostin did not contribute to either factor. Factor scores were generated for each patient and these scores were compared between preserved and impaired systolic function groups. Factor 1 scores were significantly elevated in patients with impaired systolic function compared to patients with preserved function ( -0.02 [ -0.60 to 0.95] versus -0.46 [ -0.69 to 0.21], p < 0.05; Fig. 1b). No differences in Factor 2 scores were observed between patient groups (Fig. 1c). The rotated factor matrix of log-transformed ECM biomarkers in 120 AMI patients using EFA. The large ovals represent each latent factor with the percentage of variance for each factor described in bold. The small ovals represent the variables included within each latent factor, and the loading factors are displayed in bold below. b Mann-Whitney U testing demonstrated a significant increase in Factor 1 scores in patients with impaired systolic function compared to preserved systolic function. c No differences were observed in Factor 2 scores between systolic function groups. Median and interquartile range are plotted, and graphs were created using GraphPad Prism software, version 7.04 for Windows. Abbreviations: MMP matrix metalloproteinase, TIMP tissue inhibitor of matrix metalloproteinase, PINP N-terminal type I procollagen

Cluster analysis of ECM biomarkers
Factor 1 scores derived from EFA analysis were significantly elevated in patients with impaired systolic function compared to preserved function. Consequently, hierarchical cluster analysis was performed using the biomarkers that encompassed this factor which included the following: MMP-3, MMP-8 and MMP-9, TIMP-1. Cluster analysis identified two patient groups (Fig. 2). Differences in clinical characteristics and ECM biomarker levels between impaired and preserved systolic function patients are shown in Table 2.
Patients in Cluster One (n = 31) had decreased LVEF measurements (p < 0.05) and elevated levels of MMP-3 (p < 0.05), MMP-8 (p < 0.0001) and MMP-9 (p < 0.0001) compared to Cluster Two (n = 89). Of the clinical characteristics assessed in this study, only gender was significantly different between clustered groups with more male patients in Cluster One compared to Cluster Two (p < 0.05).

Multivariate analysis
Binary logistic regression analysis was undertaken to investigate the predictive potential of MMP-8 levels, Factor 1 scores and cluster assignment for identifying systolic dysfunction in AMI patients. The prescription of ACE inhibitors at discharge, current smoking status and peak hsTnT > median were also predictors of systolic function and were, therefore, included in the multivariate model. As Factor 1 scores and cluster analysis assignment both contain MMP-8, three separate models were created. The multivariate analysis outcomes are shown in Table 3. Model 1 included MMP-8 levels and clinical variables, Model 2 contained Factor 1 scores and clinical variables and Model 3 comprised cluster assignment and clinical variables. All clinical variables included in analysis remained significantly associated with systolic function across all three models. MMP-8 levels and Factor 1 scores were not predictors of systolic dysfunction, while assignment into Cluster One remained an independent predictor of systolic dysfunction (OR 2.74 [95% CI 1.04-7.23], p = 0.042).

Discussion
In this study, we examined the utility of combining biomarkers to predict the development of systolic dysfunction in a cohort of 120 AMI patients. On univariate analysis, MMP-8 levels, Factor 1 scores and cluster analysis partitioning were associated with the development of impaired systolic function following AMI alongside ACE inhibitor medication at discharge, peak hsTnT levels and current smoking. When multivariate analysis was performed, we demonstrated that Cluster One assignment alongside clinical variables was an independent predictor of impaired systolic function development within 1 year of AMI.
The only ECM biomarker associated with the development of impaired systolic function upon univariate analysis was MMP-8. A previous study conducted by Fertin et al. [27] demonstrated that acute MMP-8 levels were an independent predictor of LV remodelling, defined as a > 20% increase in LV end-diastolic volume, within 1 year of AMI. While LVEF was not directly assessed, MMP-8 levels were also independently associated with the development of cardiovascular death and hospitalisation for heart failure [14], indicating a link between increased MMP-8 levels and LV dysfunction. However, Nilsson et al. [28] has previously demonstrated no association between acute MMP-8 levels and LVEF at 4 months post-MI. These inconsistencies are common throughout the literature, and when LVEF is assessed as the endpoint for LV dysfunction, discordant results are observed for MMP-2, MMP-9 and TIMP-1 [28,29] [11,13]. Potentially, opportunistic biomarker testing and timing of LV measurement are important factors influencing these inconsistencies. Additionally, there is no standardisation within the literature to assess LV remodelling and individual studies employ different imaging metrics to asses LV function. In this study, LVEF was used as a global index for systolic function. While this is a common metric, we acknowledge that using alternative LV function endpoints may alter the findings of this study.

3
Upon multivariate analysis, MMP-8 levels did not remain significantly associated with the development of impaired systolic function. Potentially, these findings, alongside inconsistencies within the literature, suggest that single biomarker analysis is insufficient for capturing the complex pathophysiological processes that comprise adverse LV remodelling. Instead, investigation into more complex methodologies for combining biomarkers that represent cardiac remodelling processes warrants further investigation.
We have previously investigated the clinical utility of combining biomarkers using EFA and cluster analysis [16]. In this study, we applied these techniques to the patient cohort. EFA analysis created a two-factor solution. Factor 1 was composed of MMP-8, MMP-9, MMP-3 and TIMP-1. MMP-8 had the highest loading on this factor and this suggests most of the variance for Factor 1 is captured by MMP-8 levels. Factor 2 comprised of MMP-3, MMP-2, TIMP-1 and PINP, which all demonstrated mid-range positive factor loadings. Factor scores can be generated as a mechanism to study the relationship between clinical outcome and EFA [18]. Factor 1 scores were slightly elevated in patients with impaired systolic function compared to patients with preserved function on univariate analysis. No differences were observed for Factor 2 scores. These findings suggest that biomarkers that make up Factor 1 may more appropriately capture LV remodelling processes and were taken forward for cluster analysis. Cluster analysis separated patients into a two-cluster solution. Patients in Cluster One were more likely to be males, had higher levels of peak hsTnT and were more likely to have impaired systolic function compared to patients in Cluster Two.
When both combined biomarker strategies were assessed using multivariate analysis, Cluster One assignment, but not Factor 1 scores, was independently associated with the development of impaired systolic function within 1 year of AMI. This is of interest, as both methodologies included the same input data as follows: MMP-3, MMP-8, MMP-9 and TIMP-1. Thus, we demonstrate that our combined biomarker analysis methods were not equal in their capacity to predict the development of systolic dysfunction. Differences in predictive power could be associated with the underlying statistical principles of each test and their practical purpose. Cluster analysis is well established in the literature as a methodology for combining variables and examining the relationship between cluster phenotype and outcome [30,31]. While EFA scores have also been used in this context with success [32,33], the underlying principle of this methodology is to examine the structure and relationship between variables, and potentially this may reduce the power of EFA for outcome analysis. Regardless, findings from this study outline two important points. First, a combined biomarker strategy may more appropriately risk-stratify patients following AMI when compared to single biomarker analysis. Our findings that Cluster One assignment, but not MMP-8 levels, are predictive of systolic dysfunction suggests that important information is captured by the levels of the three additional biomarkers measured in cluster analysis that are not represented by MMP-8 levels alone. Second, care must be taken when deciding on a methodology to combine biomarkers, as some statistical methods may have more power in a clinical setting than others.
In addition to Cluster One assignment, the prescription of ACE inhibitors at discharge, a current smoking status and peak hsTnT levels above the population median remained significantly associated with the development of impaired systolic function upon multivariate assessment. ACE inhibitor prescription is clinically indicated in patients with chronic heart failure or LV systolic dysfunction upon index admission [34] and thus increased uptake in this patient groups is to be expected The findings of a current smoking status being protective from systolic impairment were more surprising. However, these results were likely skewed by a small sample size of 22 patients reporting as current smokers. While hsTnT is a key marker of myocardial necrosis used in the diagnosis of AMI, peak levels of hsTnT are also a surrogate marker of infarct size [35]. As such, hsTnT levels have been linked to risk of development of systolic dysfunction [35,36]. In this study, we demonstrated that patients with peak hsTnT greater than 610 ng/L were approximately three times more likely to develop impaired systolic function when compared to patients with lower levels. Our multivariate models suggest that ECM cluster analysis provided additional information that was independent of hsTnT levels regarding risk of systolic dysfunction. ECM biomarkers and hsTnT were assessed separately in this study in order to examine the clinical utility of ECM biomarkers alone to risk-stratify patients following AMI. However, future studies could investigate the benefit of combining hsTnT and circulating biomarkers for risk prediction. Indeed, recent findings have suggested that multi-marker approaches that combine synergistic pathways may be superior at predicting major adverse cardiovascular outcomes following AMI [37] There are some limitations that should be acknowledged. Timing for echocardiography was not standardised for patients, and this may influence measures of LV remodelling. The median time to echocardiography for preserved and impaired systolic function was 143 (IQR 80-245) days and 163 (IQR 91-296) days, respectively. Despite these numerical differences between patient groups, no statistical differences were observed in timing for echocardiography. Additionally, the median time to echocardiography was approximately 5 months. While this is a common time point recorded in the literature to assess LV remodelling [38,39], it may be a limitation for measuring advanced remodelling processes. It is important to acknowledge that 1 3 these biomarkers are not cardiac specific and can be altered by systemic physiological activity. For example, PINP is a biomarker of collagen type I synthesis that is not specific to myocardial tissue alone. However, we have selected an optimal panel of ECM biomarkers based on previous testing that suggested these biomarkers were altered in AMI patients compared to healthy volunteers [19]. This study comprises a moderate sample size and is explorative in nature. Findings from this study should be assessed in a prospective multicentre cohort of AMI patients to ensure validation of these results, and ideally such a study would have a greater number of patients to ensure adequate statistical precision when using multi-marker approaches. Assessing power in a multi-marker study remains challenging when using cluster analysis, as standard power calculations are not appropriate. The sample population included in this study were clinically indicated to receive echocardiogram measurements at follow-up appointments, potentially biasing the population. However, more patients were shown to have preserved systolic function than impaired systolic function. Finally, there is no single standardised methodology within the literature to validate cluster size. In this study, we utilised the NbClust package in R to validate cluster partitioning, with a specific focus on the following two commonly used indices that assess cluster fit: Average Silhouette Width and the 'Elbow' method.
In conclusion, we demonstrate that combining ECM biomarkers is useful for predicting the development of impaired systolic function within 1 year of AMI and may provide greater prognostic utility compared to single biomarker analysis. Further research is required to establish the best methodology for combining biomarkers in a clinical setting.