Introduction

Severe aortic stenosis (AoS) is the most common valvular heart disease in Europe and North America [1]. The incidence of AoS increases with age, ranging from 0.2% in patients aged 50–59 years to almost 10% in patients aged 80–89 years [2]. AoS has a drastic impact on quality of life due to its debilitating symptoms, such as impaired exercise tolerance, decreased functionality, syncope, and other severe exercise-induced complaints [3]. Once diagnosed, untreated symptomatic severe AoS has a yearly mortality rate of 25% [4]. The aortic valve is often examined for stenosis only after the onset of symptoms; currently, transthoracic echocardiography (TTE) is the gold standard to confirm the diagnosis and assess the severity of AoS. Once severe AoS is symptomatic, early intervention is strongly recommended [5]. The only effective treatment for severe AoS is valve replacement, either by surgical aortic valve replacement (SAVR) or transcatheter aortic valve replacement (TAVR).

Stenosis of the aortic valve diminishes the aortic valve area, increases afterload by adding a valvular resistance, and can result in both progressive hypertrophy of the left ventricle and a reduced systolic coronary flow velocity, compromising subendocardial perfusion [6,7,8]. In addition, because of the added resistance, left ventricular ejection time (LVET) increases [9]. It is possible that treatment of AoS after the manifestation of AoS-related symptoms is suboptimal in some patients, as pathophysiological changes could be irreversible [10]. Early diagnosis of an evolving AoS could prove helpful in preventing potentially irreversible changes such as a decline in left ventricular function [11]. However, in most cases, TTE is only performed when indicated by patients’ symptoms. Furthermore, TTE is known to be time-consuming and dependent on both the operator and the acoustic window [12]. A simpler, low-cost, non-invasive and feasible way to detect AoS would therefore be of added value.

AoS causes a delayed pressure rise in the aorta, and a prolonged systolic ejection period [13, 14]. As this affects the blood pressure waveform, this change in morphology can potentially be measured more distally in the vascular tree. In patients with AoS, a prolongation of left ventricular ejection and upstroke time, and a less steep slope are expected [15,16,17,18]. In this study, we aimed to create a machine-learning derived diagnostic model to detect severe AoS based on non-invasive blood pressure waveform features.

Methods

Data from two prospective, single-centre studies were combined, comprising a population of patients with and without AoS. The first study included patients with AoS who underwent elective TAVR, recruited from March 2017 until February 2019, and was registered at ClinicalTrials.gov (NCT03088787). The second study consisted of patients who underwent elective cardiac surgery, including SAVR. These patients were recruited from October 2019 until May 2022, and the study was registered with the Netherlands Trial Register (NL7810).

We excluded patients with a body weight below 40 kg, an age younger than 65 years, a congenital unicuspid or bicuspid valve, a mechanical aortic valve prosthesis, atrial fibrillation or flutter, mitral valve insufficiency categorized as higher than mild, no TTE available prior to the procedure, or an inability to undergo non-invasive blood pressure measurements.

Transthoracic echocardiography

Severity of AoS was derived from TTE reports extracted from the electronic patient records. The TTE had to be performed within 6 months before either TAVR or surgery, or within 12 months in case moderate or severe AoS was detected. AoS was graded according to the EAE/ASE guideline [19]. This always included, but was not limited to, assessment of the AoS jet velocity, the trans-aortic gradient, and the valve area by the continuity equation. The final severity grading was at the discretion of accredited echocardiographers. Patients with mild or no AoS were classified as patients without AoS, and patients with moderate or severe AoS were classified as patients with AoS.

Non-invasive blood pressure monitoring

Non-invasive blood pressure data were obtained in all patients using a finger cuff with built-in light-emitting and light-receiving plethysmography diodes (ccNexfin, Edwards Lifesciences, Irvine, CA, USA), applied to the middle or index finger. Within the device, the finger blood pressure curve was automatically transformed to the brachial blood pressure waveform with a sample frequency of 200 Hz [20]. Non-invasive measurement of blood pressure has been shown to be accurate in patients with severe aortic stenosis [21, 22].

Blood pressure data were collected from shortly before the procedure until either general anaesthesia was induced (in case of surgery) or local anaesthesia was administered (in case of TAVR). Two researchers (EK and JS) manually selected a segment of 10 min of consecutive data. In case of artefacts in the data, a shorter data segment was selected, with a minimum length of 3 min.

Data analyses

The ccNexfin automatically calculates several parameters, such as the systolic (SAP), mean (MAP), and diastolic (DAP) arterial blood pressure [20]. Furthermore, the interbeat interval (IBI), heart rate (HR), left ventricular ejection time (LVET), stroke volume (SV), stroke volume index (SVI), cardiac output (CO), cardiac index (CI), systemic vascular resistance (SVR), systemic vascular resistance index (SVRI), and an estimated index of left ventricular contractility (dP/dt, the maximum value of the first time-derivative of pressure) are automatically calculated.

From these derived parameters, several extra features were calculated offline. First, the pulse pressure (PP) was calculated by subtracting DAP from SAP, and stroke work (SW) was calculated by multiplying SV by MAP [23]. The instantaneous baroreflex sensitivity (xBRS), a measure of autonomic function, was computed and expressed as the change in IBI in milliseconds (ms) per mmHg change in SAP [24]. Here, the IBI series was shifted in time relative to the SAP series, and the regression line with the highest correlation between the two changes was selected. The slope of this line was defined as the gain, and the corresponding shift in time as the delay [24].
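The derived features described above can be illustrated with a minimal sketch. The function names, the toy series and the `max_shift` parameter are assumptions made for illustration; in particular, `shifted_regression` is a simplified stand-in for the xBRS method of [24] (it omits resampling and windowing) and only shows the idea of shifting one series in time and keeping the best-correlated regression line.

```python
from math import sqrt

def pulse_pressure(sap, dap):
    """PP = SAP - DAP (mmHg)."""
    return sap - dap

def stroke_work(sv, mean_pressure):
    """SW = SV x MAP, as defined in the text."""
    return sv * mean_pressure

def _slope_and_r(x, y):
    """Least-squares slope of y on x and the Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0
    return sxy / sxx, sxy / sqrt(sxx * syy)

def shifted_regression(sap, ibi, max_shift=5):
    """Try delays 0..max_shift (in samples) and keep the best-correlated fit.

    Returns (gain, delay): slope in ms/mmHg and the shift in samples.
    """
    best_gain, best_delay, best_r = 0.0, 0, -2.0
    for shift in range(max_shift + 1):
        x = sap[: len(sap) - shift]   # SAP values
        y = ibi[shift:]               # IBI values 'shift' samples later
        slope, r = _slope_and_r(x, y)
        if r > best_r:
            best_gain, best_delay, best_r = slope, shift, r
    return best_gain, best_delay

# Toy data (assumption): IBI follows SAP with a delay of 2 samples and 10 ms/mmHg.
sap = [120, 122, 125, 121, 118, 119, 123, 126, 124, 120, 117, 121]
ibi = [800.0, 800.0] + [800.0 + 10.0 * (s - 120) for s in sap[:-2]]
gain, delay = shifted_regression(sap, ibi)
print(pulse_pressure(120, 70), stroke_work(70, 90), round(gain, 3), delay)
```

On this toy series the delay and gain that were built into the data are recovered; with real recordings the best correlation will be lower and the choice of window length matters.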

From the raw blood pressure data, several extra features were calculated for individual beats. After smoothing with a Savitzky-Golay filter, the following were calculated: the timing of SAP; the dicrotic notch and its corresponding time; and the area under the curve (AUC) of SAP and of DAP, defined as the AUC of the beat up to and from the dicrotic notch, respectively. The timing of the dicrotic notch was determined by averaging the times of the second maximum of the first and of the second derivative of the raw blood pressure beat [25]. Furthermore, for each beat, the area under the curve but above the dicrotic notch was calculated, as well as the maximum slope of the upstroke and of the downstroke of the systolic part of the beat, based on the maximum and minimum of the first derivative.
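A few of the per-beat features described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's implementation: it skips the Savitzky-Golay smoothing and the dicrotic notch detection, works on the whole beat rather than only its systolic part, and uses a synthetic half-sine beat. The name `beat_features` and the dictionary keys are hypothetical; only the 200 Hz sample frequency comes from the text.

```python
import math

FS = 200  # Hz, sample frequency of the reconstructed brachial waveform

def beat_features(beat, fs=FS):
    """Return simple morphology features for one beat (list of mmHg samples)."""
    dt = 1.0 / fs
    # First derivative by forward differences (mmHg/s).
    deriv = [(b - a) / dt for a, b in zip(beat, beat[1:])]
    # Index of the systolic peak (first occurrence of the maximum).
    i_sap = max(range(len(beat)), key=beat.__getitem__)
    # Trapezoidal area under the whole beat (mmHg*s).
    auc = sum((a + b) / 2.0 * dt for a, b in zip(beat, beat[1:]))
    return {
        "t_sap": i_sap * dt,                            # timing of SAP (s)
        "max_upslope": max(deriv),                      # steepest upstroke
        "max_downslope": min(deriv),                    # steepest downstroke
        "t_max_upslope": deriv.index(max(deriv)) * dt,  # timing of steepest upstroke (s)
        "auc": auc,
    }

# Synthetic beat (assumption): 40 mmHg half sine on a 70 mmHg baseline, 0.8 s long.
n = int(0.8 * FS)
beat = [70.0 + 40.0 * math.sin(math.pi * i / (n - 1)) for i in range(n)]
features = beat_features(beat)
print(features)
```

Splitting the area at the dicrotic notch, as done in the study, would only require summing the same trapezoids over the samples before and after the notch index.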

Statistical methods

In total, 27 features based on non-invasive blood pressure measurement were derived and used to calculate the final features used for the model. From these features, the median, interquartile range (IQR), variance, and the 1st and 9th decile of the change were derived. Next, the features were divided by the patient’s age to adjust for age-dependent differences, and then split into a training and a test set. The training set was used to derive the optimal model, whereas the test set was used to test this model. The training set comprised 75% of the data and was imbalanced (75 patients with AoS and 36 without AoS). To create a more balanced dataset, the set of 36 patients without AoS was oversampled with the Synthetic Minority Over-sampling Technique (SMOTE) to match the 75 patients with AoS [26]. Next, the training dataset was normalized with MinMaxScaler (Scikit-learn 1.1.3) and used as the input to several classifiers: logistic regression, K-nearest neighbours, decision tree, support vector machine, and random forest. Additionally, the hyperparameters of all classifiers were optimized through a grid search with four-fold cross-validation. Training was performed towards the highest possible area under the receiver operating characteristic curve (AUROC).
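The per-patient summary step can be sketched as below. This is an illustration under assumptions: the function name `summarize`, the toy values and the toy age are hypothetical, the summaries are taken over the feature values themselves (the text's "of the change" may refer to beat-to-beat changes instead), and the quantile method is the default of Python's `statistics.quantiles`, which may differ from the one used in the study.

```python
import statistics

def summarize(values, age):
    """Per-patient summary of one beat-to-beat feature, divided by age."""
    quartiles = statistics.quantiles(values, n=4)   # Q1, Q2, Q3
    deciles = statistics.quantiles(values, n=10)    # D1 ... D9
    summary = {
        "median": statistics.median(values),
        "iqr": quartiles[2] - quartiles[0],
        "variance": statistics.variance(values),    # sample variance
        "decile_1": deciles[0],
        "decile_9": deciles[8],
    }
    # Age adjustment as described in the text: divide each value by age.
    return {name: value / age for name, value in summary.items()}

# Toy example (assumption): eleven beat values and an age of 2 for easy arithmetic.
toy_values = [float(v) for v in range(1, 12)]
result = summarize(toy_values, age=2.0)
print(result)
```

Applying this to each feature of each patient yields one row of age-adjusted summary values per patient, which is the form a classifier expects as input.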

Differences between patients with and without AoS were tested statistically with the unpaired t-test, or the Wilcoxon rank sum test in case of non-parametric data, or with Fisher’s exact test for discrete data. For descriptive purposes, significant differences in the features between the two populations were calculated based on the values of the features before correcting for age. Descriptive data are presented as mean (SD) or median (1st–3rd quartile), when applicable. A p-value < 0.05 was considered statistically significant. All data and statistical analyses were performed with Matlab (Version 2020b, The MathWorks Inc., Natick, MA, USA) or Python (Version 3.9, package: Scikit-learn 1.1.3).

Results

In the TAVR sample, 114 patients were included, of whom 57 were eligible for further analyses (Fig. 1). In the cardiac surgery sample, 260 patients were included, of whom 92 were eligible for further analyses.

Fig. 1 Study flow diagram

In the combined sample, 101 patients (68%) were classified as patients with AoS and 48 (32%) as patients without AoS. Characteristics of the study population can be found in Table 1. The majority of the population without AoS was male (41 vs 7, p < 0.001), while the population with AoS was evenly distributed (46 females vs 55 males, p = 0.260). Compared to patients without AoS, patients with AoS were older (78 (73–83) vs 73 (68–77) years), had a lower body weight (76 (70–87) vs 83 (75–92) kg), and were shorter (170 (164–178) vs 176 (172–179) cm). In both populations, the majority had an American Society of Anesthesiologists (ASA) classification of III (80% vs 83%). The median (1st–3rd quartile) duration of the selected blood pressure data was 600 (435–600) s for patients with AoS and 400 (250–600) s for patients without AoS, a significant difference of 200 s (95% CI: 114–283), p < 0.001. In the population with AoS, 9 out of 101 patients were classified as having low-flow, low-gradient AoS.

Table 1 Patient characteristics

Compared to patients without AoS, patients with AoS showed significantly higher values for LVET, AUC SAP, and AUC dicrotic notch, whereas lower values were found for dP/dt, SW, AUC DAP, and the timing of the maximal up- and downstroke. Exact differences, confidence intervals and p-values can be found in Table 2. Table 2 displays the average values of the features before adjusting for age, whereas the values after adjusting for age were used in the machine-learning model.

Table 2 Features before adjusting for age

Machine learning derived detection model

Based on the training set, the best performing (AUROC of 0.93 (SD:0.03)) classifier was logistic regression (parameters listed in Table 3). Applying the model to the test set, an AUROC of 0.79 was found (Fig. 2) with a sensitivity of 0.81 (81% of the patients with AoS are correctly labelled) and a specificity of 0.67 (67% of the patients with no AoS are correctly labelled). The accuracy of the model was 0.76, representing how often the model labels patients correctly (both AoS and no AoS). The positive predictive value was 0.84, so 84% of the patients labelled as AoS actually had AoS, whereas 62% of the patients labelled as no AoS actually had no AoS (negative predictive value: 0.62). In total 8/12 of the cases without AoS were detected by the model, whereas 21/26 of the cases with AoS were detected (Table 4).
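The reported test-set metrics follow directly from the counts given above (21 of 26 patients with AoS and 8 of 12 patients without AoS correctly labelled). The short sketch below reproduces that arithmetic; the function name `diagnostic_metrics` is a hypothetical helper, not part of the study's code.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),            # AoS correctly labelled
        "specificity": tn / (tn + fp),            # no AoS correctly labelled
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),                    # labelled AoS that had AoS
        "npv": tn / (tn + fn),                    # labelled no AoS that had no AoS
    }

# Counts derived from the text: 21/26 with AoS and 8/12 without AoS detected.
metrics = diagnostic_metrics(tp=21, fn=5, tn=8, fp=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, these values match the sensitivity, specificity, accuracy, positive predictive value and negative predictive value reported in the text.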

Table 3 Optimized parameters of the final classifying model

Fig. 2 Performance of the logistic regression classifier for detecting aortic valve stenosis. The area under the curve (AUC) indicates the area under the receiver operating characteristic curve

Table 4 Confusion matrix of the test dataset

Discussion

In this study, a detection model based on non-invasive blood pressure waveforms was developed, and showed good to excellent performance in differentiating between no or mild AoS and moderate to severe AoS. The hemodynamic features implemented into the model were to some extent able to differentiate between patients with and without AoS. As the stenosis of the aortic valve results in a diminished opening of the valve, the systolic phase is prolonged: the timing of (maximal) SAP and maximum upstroke occurred later in patients with AoS. This is translated to the increased LVET, and the increased AUC of SAP and dicrotic notch. As heart rate did not differ between the populations, AoS resulted in a shorter diastolic period of the beat. In addition, SV was decreased in patients with AoS.

AoS can be detected through a variety of methods, including physical examination, imaging tests, and cardiac function tests [27,28,29]. TTE is the gold standard, but it is expensive, time-consuming, and dependent on both the operator and the acoustic window [12]. Auscultation showed a sensitivity of 43% and a specificity of 69% for diagnosing significant heart valve disease [27]. Chest radiography and electrocardiography can identify secondary effects of AoS, such as left ventricular hypertrophy, which usually develops after sustained AoS, making early AoS detection more difficult [28]. A deep learning algorithm based on electrocardiography (ECG) showed an AUROC of 0.88 for detecting significant AoS [29]. This deep learning algorithm is very complex compared to the more straightforward logistic regression of our model. Moreover, only 4% of the patients in the ECG model were diagnosed with AoS, which affects the positive and negative predictive values. Comparing both models, our model showed slightly better performance in distinguishing patients, with an accuracy of 0.76 compared to 0.72 for the ECG model. While the ECG model was based on more than 45,000 ECG signals, the novel detection model constructed in this study was based on the data of only 149 patients. As a result, it is likely that model performance could be further optimized in the external validation studies that should be performed in the future.

When interpreting model performance, it is important to consider its main goal in clinical practice. High sensitivity represents the ability to correctly identify AoS patients, but does not account for false positives. With high specificity, most patients without AoS will be correctly classified, but some patients with AoS could be classified as not having a severe AoS. The goal of this study was to accurately distinguish patients with AoS from patients without AoS, which is best described by the AUROC. The AUROC of the developed detection model was excellent in the training set and very good in the test set, outperformed auscultation [27], and yielded results comparable to an ECG-based detection model [29]. A recent study employing bioinformatics and machine learning identified a novel biomarker of aortic valve calcification. The identified biomarker (fibronectin 1) showed an excellent predictive performance [30]. Both biomarkers and blood pressure waveform-derived diagnostic models might prove to be of even further added value if, besides detecting severe cases, they would allow classification of AoS severity. This might result in earlier treatment when progression is fast, limiting the reduction in quality of life for these patients. In this case, a higher sensitivity could be of interest, especially when the progress of the disease is being monitored.
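Why the AUROC summarises discrimination independently of a single threshold can be seen from its rank-based definition: it is the probability that a randomly chosen patient with AoS receives a higher model score than a randomly chosen patient without AoS. The sketch below is a minimal illustration of that definition with made-up labels and scores; it is not the study's evaluation code.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Made-up example (assumption): 1 = AoS, 0 = no AoS, scores are model outputs.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2]
toy_auc = auroc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

Sensitivity and specificity, by contrast, depend on the threshold applied to the scores, which is why the preferred operating point depends on the clinical goal.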

Limitations

The constructed model was based solely on non-invasive BP waveform-based features. In our study population, gender was unevenly distributed in the population without AoS. This was expected, since patients without AoS were found in the cardiac surgery sample, and the majority of cardiac surgery patients are male, whereas the majority of TAVR patients are female. Consequently, this resulted in a higher weight and height in the population without AoS. In a sample with a more comparable distribution of age and gender, adding patient characteristics might enhance model performance. To assess the model's face validity and performance, future studies should further assess potential performance differences based on sex, weight and varying degrees of arterial stiffness. Analysing and adding these features was beyond the scope of this study. Concerning blood pressures, there were no significant differences in SAP, MAP or DAP values between patients with and without AoS, or between TAVR and SAVR patients. However, some other differences between the two populations were found:

First, there was a difference in the amount of data between the two populations. For patients who received general anaesthesia, there was less time to measure blood pressure than for patients who did not receive general anaesthesia. The 1st–3rd quartile range for smaller datasets is often broader; however, as the smallest dataset was 180 s and most features were calculated every beat, this is unlikely to have had a major impact on the development of our model. The only features based on multiple beats were the xBRS gain and delay (tau). Here, 10 s of consecutive data were used, which was considered short enough not to be affected by the available length of data.

Second, there was a difference in age between the two populations. Patients unfit for SAVR, or at high operative risk, received TAVR treatment. This resulted in a higher average age in the TAVR population, and consequently in the patients with AoS. This age difference is a common problem in AoS detection studies [29, 31]. Ageing affects blood pressure, resulting in an increase in SAP, MAP, and PP [32]. Other age-related hemodynamic alterations are an increase in aortic stiffness and a decrease in the cross section of the peripheral vascular bed. This results in an increase in pulse wave velocity and wave reflection [33, 34]. In our study, SAP, MAP, DAP, and PP were not significantly higher in the population with AoS than in the population without AoS, suggesting that the effect of age was limited.

Third, an unbalanced training set will cause learning algorithms to be biased towards the majority class. Therefore, oversampling was applied with SMOTE. With this form of oversampling, synthetic data are generated based on the actual data instead of copying existing data, and no information is lost. A disadvantage of oversampling is that noise can be introduced into the data, resulting in a decrease in model performance. However, the model still performed well, with an AUROC of 0.79 in the test set.
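The idea of generating synthetic samples "based on the actual data instead of copying" can be illustrated with a simplified sketch of the SMOTE interpolation step. This is not the implementation used in the study; the function name `smote_sketch`, the parameters `k` and `seed`, and the toy feature vectors are assumptions. Each synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself),
        # ranked by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Toy minority class (assumption): four 2-feature samples.
minority = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.4], [0.4, 0.3]]
# In the study, 36 patients without AoS were matched to 75 with AoS: 39 new samples.
new_points = smote_sketch(minority, n_new=75 - 36)
print(len(new_points), new_points[0])
```

Because every synthetic point lies between two real samples, noisy or atypical minority samples are propagated into the synthetic data, which is the source of the noise mentioned above.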

A last limitation of this study was the decision to restrict our sample to patients with isolated AoS, i.e. without other heart (valve) diseases. The goal of this study was to assess whether patients with AoS could be distinguished from those without AoS. However, as a consequence, the generalizability of this model is limited to this specific population, and the model has not yet been externally validated in this population. In future studies, we plan to analyse and optimize model performance in a more heterogeneous sample of the population by incorporating patients with other or mixed heart (valve) diseases. Furthermore, we plan to assess model performance in patients suspected of low-flow, low-gradient AoS, as the model might provide insight into the necessity of performing additional (burdensome) examinations such as a stress test and CT scan.

Conclusion

A machine-learning model using non-invasive finger arterial waveform analysis is able to detect moderate and severe aortic stenosis with high sensitivity and adequate specificity. External, independent validation of our model should be performed to assess whether this non-invasive, easy-to-use model may be implemented in clinical practice to detect severe aortic stenosis.