Introduction

Clinical parameters directly measured in the heart or at the root of the aorta are crucial for detection, diagnosis, prognosis, treatment, and management of cardiovascular diseases1,2,3,4. Aortic hemodynamics, such as aortic systolic blood pressure (aSBP) and cardiac output (CO), are direct and more informative parameters for assessing cardiovascular health than corresponding measurements obtained at the peripheral arteries1,5,6. However, the gold standard techniques for measuring aSBP and CO are catheter-based and expensive7,8. Furthermore, there is a need for noninvasive estimation of cardiac contractility. End-systolic elastance (Ees), i.e., the slope of the end-systolic pressure–volume relation (ESPVR), is a pivotal determinant of left ventricular (LV) systolic performance and a powerful index of the arterio-ventricular interaction4,9,10. Despite its clinical importance, the clinical use of this measure is limited by the need for invasive acquisition of multiple LV pressure–volume loops under varying loading conditions11.

Peripheral blood pressure (BP) measurements acquired by cuff sphygmomanometry have a fundamental role in the everyday clinical setting12. Recognizing the important differences between peripheral and central aortic pressures, significant efforts have been directed towards the noninvasive estimation of aortic hemodynamics, in particular aSBP, based on peripheral pressure measurements13. Commonly used approaches for obtaining aSBP include generalized transfer functions (GTFs)14,15,16, moving average models17,18 and pulse wave analysis-based methods8,19,20. Nevertheless, all of them rely on the acquisition of the entire peripheral pressure waveform, which can be tedious and susceptible to errors21.

Prediction of CO constitutes a more challenging task due to its dependency on the patient-specific arterial dimensions22. Noninvasive CO monitoring has been addressed using single-beat pulse contour analysis23,24,25 which, however, allows for the derivation of only an uncalibrated estimation instead of the absolute CO value. Finally, notable studies have been developed and validated against invasive techniques for estimating Ees for a single cardiac cycle26,27. The first fully noninvasive method was introduced by Chen et al.26. They proposed a simple equation to derive Ees from pressure arm-cuffs, echo-Doppler cardiography and electrocardiograms.

Despite the good precision of previous techniques, no comprehensive study has investigated the possibility of estimating aortic hemodynamics and cardiac contractility using readily available noninvasive measurements on the same population. This is mainly attributed to two inherent limitations, i.e., the lack of invasive data on a large scale and the ethical constraint against performing invasive measurements on a healthy population when no diagnostic reason has been provided.

Cardiovascular models hold a valuable position for addressing the challenge of limited access to in-vivo data28,29. They constitute a faithful representation of the real cardiovasculature and allow the study of pathophysiological mechanisms and diseases30,31. Furthermore, they can provide a complete set of parameters to describe the system, and the simulated signals are noise-free.

The present study aimed to evaluate whether aortic hemodynamics (i.e., aSBP and CO) and cardiac contractility (i.e., Ees) can be accurately predicted from brachial systolic blood pressure (brSBP) and diastolic blood pressure (brDBP), heart rate (HR), carotid-to-femoral pulse wave velocity (cfPWV), and, if necessary, ejection fraction (EF). These quantities were chosen as they are readily available in clinical practice and have been shown to provide information on the cardiovascular state2,3,4,32. To overcome the aforementioned limitations, we performed our experiments using synthetic data (n = 4,018), which were generated using a previously validated one-dimensional (1-D) mathematical model of the cardiovascular system28. Regression analysis was performed to establish the relationship between the noninvasive measurements (brSBP, brDBP, HR, cfPWV, and, where used, EF) and the invasive quantities of interest (aSBP, CO, and Ees). The regression pipeline of the present study is presented in Fig. 1. A ten-fold cross validation (CV) scheme was employed for the training/testing of the proposed approach. We evaluated four models: Random Forest33, Support Vector Regressor (SVR)34, Ridge35, and Gradient Boosting36. In addition, averaging of the predictions of the individual models was performed. Two approaches were investigated: (i) prediction of aSBP, CO, and Ees using brSBP, brDBP, HR, and cfPWV as inputs, and (ii) prediction of Ees using brSBP, brDBP, HR, cfPWV, and EF. The accuracy of our predictions was evaluated by comparing the model-derived values with the reference simulated data. The accuracy of the aSBP model was subsequently validated using a large clinical dataset including in-vivo hemodynamic measurements (n = 783). The lack of in-vivo CO and Ees data impeded the clinical evaluation of the corresponding models.
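As an illustration of the ten-fold CV scheme mentioned above, the following minimal sketch partitions n = 4,018 cases into ten shuffled folds. The function name `k_fold_indices`, the shuffling seed, and the way the remainder is distributed across folds are assumptions made for this example; the study does not report how its folds were constructed.

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one case.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, test_idx


# Each case is used for testing exactly once across the ten folds.
folds = list(k_fold_indices(4018, n_folds=10))
fold_sizes = [len(test_idx) for _, test_idx in folds]
```

Each regression model would then be trained on `train_idx` and evaluated on `test_idx` for every fold, with the reported statistics aggregated over the folds.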

Figure 1

Schematic illustration of the regression pipeline. brSBP Brachial systolic blood pressure, brDBP brachial diastolic blood pressure, HR heart rate, cfPWV carotid-to-femoral pulse wave velocity, and EF ejection fraction were used as features for predicting aortic systolic blood pressure (aSBP), cardiac output (CO), and end-systolic elastance (Ees). Regression models were trained to map the input data to the respective target data of interest. The methodology presented here was followed for each regression process (in terms of set of inputs, model, and output).

Results

Table 1 aggregates the cardiovascular parameters of the in-silico study population. The comparisons between the model-derived predictions and the reference data are presented below for each of the targeted outputs.

Table 1 Distributions of the parameters of the in-silico population (n = 4,018).

Prediction of aSBP, CO, and Ees from brSBP, brDBP, HR, and cfPWV

For the four models, the comparison between the predicted aSBP and the actual aSBP is presented in Table 2. The absolute difference between the model-aSBP and the reference aSBP was less than 5 mmHg in 87% of the total cases for Random Forest, 89% for SVR, 75% for Ridge, and 88% for Gradient Boosting. Accuracy, correlation and agreement of model-CO estimates in comparison to the reference data are summarized in Table 3. The difference between model-CO and reference CO was less than 0.3/0.5 L min−1 in 62/84% of the population for Random Forest, 65/86% for SVR, 50/74% for Ridge, and 63/85% for Gradient Boosting. Finally, the Ees predictions are compared to the reference data in Table 4. High errors were observed for all of the regression models, and the correlation between the estimated and the reference data was poor.
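The two kinds of accuracy figures quoted above (an RMSE and the share of cases whose error stays below a threshold) can be computed as in the sketch below. The function names and the four aSBP values are hypothetical and serve only to make the example runnable; they are not data from the study.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between predictions and reference values."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def fraction_within(predicted, reference, tolerance):
    """Share of cases whose absolute error is below the tolerance."""
    hits = sum(1 for p, r in zip(predicted, reference) if abs(p - r) < tolerance)
    return hits / len(predicted)


# Hypothetical aSBP values in mmHg, for illustration only.
reference_asbp = [110.0, 120.0, 130.0, 140.0]
predicted_asbp = [112.0, 119.0, 136.0, 141.0]

asbp_rmse = rmse(predicted_asbp, reference_asbp)
asbp_within_5 = fraction_within(predicted_asbp, reference_asbp, tolerance=5.0)
```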

Table 2 Regression statistics between model predicted aSBP and reference aSBP.
Table 3 Regression statistics between model predicted CO and reference CO.
Table 4 Regression statistics between model predicted Ees and reference Ees.

Prediction of Ees from brSBP, brDBP, HR, cfPWV, and EF

The statistics of the second regression analysis for Ees, i.e., with EF as an additional input, are presented in Table 5. Differences between the predicted Ees and the actual Ees were less than 0.05/0.20 mmHg mL−1 in 47/78%, 51/81%, 39/70%, and 47/78% of the entire population for Random Forest, SVR, Ridge, and Gradient Boosting, respectively.

Table 5 Regression statistics between model predicted Ees and reference Ees.

The scatterplots and Bland–Altman graphs for the best performing models are provided in Figs. 2 and 3. The plotted data are corrupted with random noise (see “Blending the dataset with random noise” in “Methods”). Table 6 presents the frequency of selection for each hyperparameter value over the tenfold CV for the best performing model. For the aSBP and Ees estimators, the values selected for the C and gamma hyperparameters were consistent across folds. Concretely, C and gamma were set at 100 and 0.001 for aSBP, and 10 and 0.001 for Ees, respectively, in all 10 folds. Such consistency is not evident for the CO estimator, where C was set at 100 in 60% of the folds. Nevertheless, gamma was again consistently selected to be 0.001.

Figure 2

Comparison between predicted and reference data. Scatterplots and Bland–Altman plots between: (A, B) the predicted aSBP and the reference aSBP, and (C, D) the predicted CO and the reference CO. The solid line of the scatterplots represents equality. In Bland–Altman plots, limits of agreement (LoA), within which 95% of errors are expected to lie, are defined by the two horizontal dashed lines.

Figure 3

Comparison between predicted and reference data. Scatterplots and Bland–Altman plots between: (A, B) the predicted Ees and the reference Ees without ejection fraction as regression input, and (C, D) the predicted Ees and the reference Ees with ejection fraction as regression input. The solid line of the scatterplots represents equality. In Bland–Altman plots, limits of agreement (LoA), within which 95% of errors are expected to lie, are defined by the two horizontal dashed lines.
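The limits of agreement shown in the Bland–Altman plots can be sketched as follows. The sketch assumes the conventional definition (bias ± 1.96 times the sample standard deviation of the differences); the text does not state the exact formula that was used, and the numerical values below are hypothetical.

```python
import statistics


def bland_altman(predicted, reference):
    """Return (bias, lower LoA, upper LoA) of the prediction errors."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    # Assumed convention: about 95% of errors fall within bias +/- 1.96 SD.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical predicted and reference values, for illustration only.
predicted = [112.0, 119.0, 136.0, 141.0]
reference = [110.0, 120.0, 130.0, 140.0]
bias, lower_loa, upper_loa = bland_altman(predicted, reference)
```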

Table 6 Percentage of times that each hyperparameter value was selected during hyperparameter tuning with the tenfold cross validation process.

Sensitivity analysis for the training size

The training size, that is, the number of data instances used for training, plays a major role in the accuracy of the predictions. To investigate the sensitivity to the number of training data, the training size was varied from 95 to 15% of the total number of cases (Fig. 4). For all models except Ridge, the RMSEs increased gradually with decreasing training size. For Random Forest, SVR, and Gradient Boosting, the RMSEs of the aSBP predictions were less than 4.20 mmHg. Using Ridge, the RMSE varied to a lesser extent, while it was consistently higher compared to the rest of the models. For the CO predictions, all RMSE values were less than 0.50 L min−1. In particular, the RMSE for SVR did not exceed 0.38 L min−1, even when only 15% of the entire population was used for training. Finally, all RMSEs of the Ees estimations were equal to or below 0.20 mmHg mL−1.
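The training-size experiment can be outlined as below. This is only a sketch: a one-feature least-squares line stands in for the four regression models of the study, the synthetic brSBP/aSBP relation (slope 0.9, offset 5 mmHg, Gaussian noise of 3 mmHg) is invented for illustration, and the list of training fractions is an assumed sampling of the 95 to 15% range.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse_for_training_fraction(xs, ys, fraction, seed=0):
    """Train on a random fraction of the cases and test on the remainder."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    n_train = int(round(fraction * len(xs)))
    train, test = order[:n_train], order[n_train:]
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
    squared = [(slope * xs[i] + intercept - ys[i]) ** 2 for i in test]
    return math.sqrt(sum(squared) / len(squared))


# Invented stand-in data (not the in-silico database of the study).
rng = random.Random(42)
brsbp = [rng.uniform(90.0, 180.0) for _ in range(1000)]
asbp = [0.9 * x + 5.0 + rng.gauss(0.0, 3.0) for x in brsbp]

fractions = [0.95, 0.75, 0.55, 0.35, 0.15]
results = {f: rmse_for_training_fraction(brsbp, asbp, f) for f in fractions}
```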

Figure 4

Sensitivity of RMSE to changes in the training size for aortic systolic blood pressure (aSBP) (A), cardiac output (CO) (B), and end-systolic elastance (Ees) (C). RMSE root mean square error; RF random forest; SVR support vector regressor; GB gradient boosting.

Feature importance evaluation

Figure 5 presents the correlation matrix reporting the inter-feature correlations, and the correlations between the inputs and the target outputs. Table 7 presents the average importances of the input features, sorted in a descending order for predicting aSBP, CO, and Ees, respectively. For estimating aSBP, brSBP was found to be a critical contributor; the importance level (0.98) indicated that brSBP should be sufficient for estimating aSBP. The features of brSBP and cfPWV were the dominant contributors in the estimation of CO. Finally, EF was found to play the most significant role in the Ees prediction, followed by brDBP and HR. To further verify the sensitivity of the model’s performance to the input features, we present the RMSE variation for different subsets of input features (only for the best performing models) (Table 8).
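A correlation matrix such as the one in Fig. 5 can be assembled from pairwise Pearson coefficients, as in this minimal sketch. The use of Pearson's coefficient is an assumption (the text does not name the correlation measure), and the five-subject table is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict with the correlation of every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical values for five subjects, for illustration only.
data = {
    "brSBP": [118.0, 125.0, 140.0, 132.0, 150.0],
    "HR": [72.0, 65.0, 80.0, 90.0, 70.0],
    "aSBP": [108.0, 114.0, 131.0, 121.0, 139.0],
}
matrix = correlation_matrix(data)
```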

Figure 5

Correlation matrix for the in-silico database.

Table 7 Average feature importances for the prediction of aSBP, CO, and Ees.
Table 8 Model performance for the best performing configurations (SVR) using different subsets of the input features.

For aSBP, it was shown again that brSBP is the most pivotal predictor; when brSBP was removed from the input features, the RMSE increased significantly. For CO, a precise prediction requires the use of at least one of the brachial BP values; their exclusion resulted in a deterioration of the model’s performance. Finally, Ees appears to be mainly sensitive to EF, which significantly contributes to the accuracy of the Ees estimation. Results of the hypothesis testing for the ordinary least squares (OLS) regression coefficients are summarized in Table 9. All of the specified coefficients were statistically significantly different from zero.

Table 9 The t-statistics for the OLS regression coefficients.

In-vivo evaluation of the aSBP estimations

After the in-silico validation, the performance of the aSBP estimator was evaluated anew using clinical data. The population included both women (n = 136) and men (n = 647). The descriptive and clinical characteristics of the clinical population are presented in Table 10.

Table 10 Distributions of the parameters of the in-vivo population (n = 783).

The comparisons between the predicted aSBP and the reference aSBP are presented below. First, we assessed the capacity of an SVR model, which was trained using only in-silico data, to make an accurate prediction for the human population (Fig. 6A,B). Then, we compared its performance with that of an SVR model which was trained using in-vivo data (Fig. 6C,D). The regression statistics between the model predictions and the reference data are summarized in Table 11. For the in-vivo data, the results of the hypothesis testing for the OLS regression coefficients are presented in Table 12. Figure 7 provides the correlation matrix for the in-vivo dataset.

Figure 6

Comparison between predicted and reference clinical data. Scatterplots and Bland–Altman plots between: the predicted aSBP and the reference aSBP for SVR trained using in-silico data (A, B), and for SVR trained using in-vivo data (C, D). The solid line of the scatterplots represents equality. In Bland–Altman plots, the limits of agreement (LoA), within which 95% of errors are expected to lie, are defined by the two horizontal dashed lines.

Table 11 Regression statistics between model predicted aSBP and reference aSBP.
Table 12 The t-statistics for the OLS regression coefficients.
Figure 7

Correlation matrix for the in-vivo database.

Discussion

The present study demonstrated that accurate estimations of central hemodynamics (namely, aSBP and CO) and left ventricular Ees from readily available noninvasive clinical measurements can be obtained by using machine learning models. Our basic hypothesis was that brSBP, brDBP (cuff BP), HR, and cfPWV provide sufficient information to predict aSBP, CO, and Ees. However, for the determination of Ees, data from peripheral pressure waves fall short of providing a precise estimate. Our results indicated that additional information, such as the EF, which is directly measured in the heart (rather than the periphery), may improve the noninvasive Ees predictions. To the best of our knowledge, this is the first work to evaluate the use of machine learning models in predicting cardiac contractility.

The best performing prediction model for all three target outputs was SVR, which outperformed the other models and achieved the highest accuracy. The Ees estimation was effectively achieved only with the inclusion of EF in the set of input features. In order to evaluate the robustness of our regression models, sensitivity to the training size was investigated. The RMSE increased gradually with decreasing number of training samples for Random Forest, SVR, and Gradient Boosting. Variations were less distinct for Ridge. Despite the increase in RMSE with changes in the training size, the errors lay within acceptable limits37,38,39,40,41 for Random Forest, SVR, and Gradient Boosting.

Moreover, we tested the performance of an ensemble predictor which used averaging of the single models’ predictions. The ensemble prediction model did not outperform the best performing single prediction model (SVR). However, such an approach may benefit the estimations’ accuracy by reducing the variance of the predictor and thus may improve the model’s generalization ability42. We did not explore other ensemble learning techniques, as an extensive exploration of them would be out of the scope of this study.
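The averaging ensemble described above can be sketched in a few lines. Equal weighting of the four models is an assumption of this example (the text only states that the single models’ predictions were averaged), and the predicted values are hypothetical.

```python
def average_predictions(per_model_predictions):
    """Average, case by case, the predictions of several models."""
    model_outputs = list(per_model_predictions.values())
    n_cases = len(model_outputs[0])
    if any(len(output) != n_cases for output in model_outputs):
        raise ValueError("all models must predict the same number of cases")
    # zip(*...) groups the predictions that belong to the same case.
    return [sum(values) / len(model_outputs) for values in zip(*model_outputs)]


# Hypothetical aSBP predictions (mmHg) of the four models for two cases.
predictions = {
    "Random Forest": [118.0, 131.0],
    "SVR": [120.0, 129.0],
    "Ridge": [116.0, 134.0],
    "Gradient Boosting": [122.0, 130.0],
}
ensemble = average_predictions(predictions)
```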

Following the in-silico validation, in-vivo validation was performed only for the aSBP. The aSBP predictions were found to be precise in both investigated scenarios, i.e., SVR trained with in-silico data, and SVR trained with in-vivo data. The accuracy was slightly higher in the second scenario despite the smaller size of the training dataset. This is expected if we consider that the in-vivo data may contain more physiologically relevant content and thus be more informative compared to the in-silico data in the training of the model. Interestingly, the hyperparameter tuning led to the same selection for the hyperparameters, C = 100 (selected 100% of the times) and gamma = 0.001 (selected 100% of the times), when the SVR model was validated using the in-vivo population. These findings suggest that the in-silico predictive models can be rather informative for the design of clinical studies.

The principal reason that brSBP, brDBP, HR, and cfPWV were selected as the model inputs was the simplicity of their measurement in a clinical setting. Brachial cuff pressure constitutes a readily available and cost-efficient measurement in traditional medicine. At the same time, the use of pressure-based cfPWV is steadily increasing, as a result of numerous studies demonstrating its importance as an independent predictor of cardiovascular disease43,44,45. The convenience and the cost-efficiency of the aforementioned measurements render them suitable for easy, noninvasive, regular medical check-ups.

Based on the assessment of the feature importances, the aSBP prediction was found to be determined mainly by the brSBP. The strong dependency between aSBP and brSBP is to be expected, given that the two values are strongly related to mean BP, which is practically the same in both the aorta and the brachial artery. Moreover, brSBP seemed to be a significant predictor of CO. Resistance, and thus mean BP, is a strong determinant of CO. Given that brSBP is related to mean BP, brSBP is indirectly related to CO. In addition, cfPWV is a measure of arterial compliance, which is also a determinant of stroke volume and thus CO. Finally, EF and Ees have been reported to be positively correlated46 and this further explains the high importance level of EF for predicting Ees. The results using different subsets of the input features further verified each feature’s contribution to the predictions of the target output variables.

Prior work by Xiao et al.47 used an artificial neural network (ANN) to predict aSBP from invasive radial SBP and DBP, and HR. The differences between the predicted aSBP and the measured aSBP were found to be low and equal to −0.30 ± 5.90 mmHg. Although their model provided accurate results, invasive radial BP is not measured on a regular basis, which imposes a substantial limitation on the clinical application of their proposed model. When an ANN with the same configuration as the one reported in the study of Xiao et al. was employed to estimate aSBP in our study, the results indicated a similarly good prediction performance. Concretely, the ANN using only the in-silico data (n = 4,018) achieved an RMSE = 3.79 ± 1.88 mmHg and r = 0.99 (p < 0.001). Training/testing the ANN with only the in-vivo data (n = 783) achieved an RMSE = 3.38 ± 1.09 mmHg and r = 0.97 (p < 0.001). In the case of the in-vivo data, we observed that the accuracy is slightly improved by the use of the ANN compared to the best performing configuration (SVR achieved an RMSE = 3.53 ± 1.27 mmHg, r = 0.97, p < 0.001).

In general, the majority of previous aSBP estimators rely on feature extraction from the pressure waveforms47,48. In our approach, apart from peripheral SBP and DBP, and HR, we incorporated the cfPWV measurement. The idea was that cfPWV, being an index of aortic stiffening, would improve the performance of the model and strengthen the clinical relevance of our results. However, feature importances indicated that brSBP may be sufficient for estimating aSBP. Using only brSBP, brDBP, and HR as inputs would not significantly alter the accuracy of the estimation of aSBP (using the in-silico data); namely, the RMSE would slightly increase from 3.13 to 3.31 mmHg for the best performing model. If only brSBP and brDBP were used as input features, the accuracy would deteriorate to an RMSE of 3.46 mmHg, which could still be acceptable. The use of only brSBP as an input, however, would substantially increase the error to 5.33 mmHg. For the clinical dataset, the corresponding errors were equal to 3.52 mmHg (brSBP, brDBP, HR as inputs) and 4.11 mmHg (brSBP, brDBP as inputs). Finally, using only the brSBP predictor would lead to an RMSE = 4.20 mmHg.

In addition to prediction models for aSBP, estimation of CO from arterial BP characteristics has been a fertile area of research. Dabanloo et al.25 evaluated the performance of neural networks in predicting CO from invasive arterial pressure waves. Upon comparison between the predicted CO and thermodilution-derived CO, their best performing model provided a mean absolute error equal to 0.54 L min−1 and a correlation coefficient of 0.89. Nevertheless, their model made use of the entire pressure waveform, from which input features were extracted, and it provided only an uncalibrated estimation of CO rather than its absolute value.

The results presented in this study are also consistent with the findings of Bikia et al.49, who suggested that brachial BP and cfPWV can be used to predict central SBP and CO (RMSE equal to 2.46 mmHg and 0.36 L min−1, respectively). Following an inverse problem-solving approach, a generalized model of the cardiovascular system was adjusted to quasi-patient-specific standards using measurements of brSBP, brDBP, HR, and cfPWV. Additional geometrical information on the aortic diameter size of each subject was also integrated. The aortic diameter was approximated using previously published age- and BSA-related data50. A similar approximation of the aortic geometry could be embedded in the present study and improve the accuracy of the results. Therefore, employment of machine learning on clinical data could be further reinforced with the inclusion of additional input features such as age, height, and weight. However, given that the errors are already rather low, it is not anticipated that such an improvement would be of particular clinical significance.

Additionally, this study aimed to effectively predict Ees while utilizing a small number of required inputs. Chen et al.26 proposed a method to estimate Ees from cuff pressure, stroke volume, and EF. Their method provided accurate predictions of Ees with differences equal to 0.43 ± 0.50 mmHg mL−1. In contrast to Chen’s approach, we excluded stroke volume from our input vector and instead introduced cfPWV, which constitutes an index of aortic stiffness and thus a powerful index of arterio-ventricular coupling51. When EF was removed from the set of inputs, Ees was found to be poorly predicted. This poor performance may be expected, given that a specific combination of brachial SBP and DBP, and cfPWV might not be unique to a particular Ees value. Importantly, our study emphasized the significance of EF in accurately predicting Ees.

The use of EF is further encouraged by the fact that EF constitutes a noninvasive parameter which can be derived via several cardiac imaging modalities. Simpson’s method52 has been the most commonly used technique; however, it might underestimate EF when compared to magnetic resonance imaging (MRI), which has been shown to be the gold standard noninvasive technique for assessing LV function and thus EF53. Of course, these imaging techniques are not considered as convenient and cost-efficient as a cuff- or tonometry-based pressure measurement. It is likely that the EF-related information may be derived from another measured parameter which is directly or indirectly related to the cardiac contractility, e.g., electrical signals of cardiac events54. Further investigation in this direction will be conducted in future work.

It should be noted that the aim of the current study is not to propose necessarily a tool that could provide simultaneous predictions of aSBP, CO, and Ees. The models developed in this study could be considered as independent predictors for each of the target parameters in different clinical occasions. In particular, aSBP and CO are major hemodynamical indices that are often useful to the clinician and their noninvasive estimation is highly desirable in a routine clinical examination. On the other hand, Ees is less often required. Currently, Ees is measured invasively with the acquisition of the left ventricular pressure–volume loops. The invasive nature of this technique severely limits the use of Ees in clinical practice.

The rapid growth of available data has led to efforts to map one type of information to another using machine learning models. Specifically, in relation to pathophysiology, advances in measuring and imaging techniques have encouraged the employment of machine learning for estimating clinical pathophysiological indices and validating their results. This promising area of research naturally extends to applications in cardiovascular health25,47,55,56. High correlation between peripheral pressure and central aortic pressure indicates the potential to estimate the latter from the former. However, the correlations for a complete set of cardiovascular variables have not been thoroughly investigated. In this work, we performed a first study to elucidate which input parameters (noninvasive measurements) are considered necessary when machine learning is employed for predicting aortic hemodynamics and contractility index (invasive measurements). A major advantage of the present study pertains to the well-balanced dataset that was used for the training/testing scheme. The use of synthetic data allowed for covering a wide range of hemodynamical characteristics, and it provided us with access to cardiovascular quantities which are difficult to obtain noninvasively in the real clinical setting, i.e., aortic BP or Ees.

Cardiovascular models have attracted great interest due to the increasing impact of cardiovascular disease. They have provided a valuable alternative for the assessment of pressure and flow in the entire arterial network, providing additional pathophysiological insights which are difficult to acquire in-vivo. Numerous previous studies have used in-silico data for the estimation of aortic BP, cardiac output, aortic PWV and many more56,57,58,59,60. Importantly, in-silico studies allow for the preliminary evaluation of predictive models across a wide range of cardiovascular parameters61 in a quick and cost-efficient way, while their results can be rather informative for the design of clinical studies62,63.

Several limitations need to be acknowledged. The data used for the training/testing scheme were derived from a simulator instead of a real human population. While synthetic data can mimic numerous properties of the real clinical data, they do not copy the original content in an identical way. Nevertheless, the goal here was to define the minimum necessary input information that is required to estimate aortic hemodynamics and Ees. Thus, although the use of synthetic data might not lead to exactly the same results as clinical data, it should not undermine the reliability of the study’s findings. The latter has been verified by the in-vivo validation of our aSBP estimations. Clinical validation was not possible for the CO and Ees estimators, due to the lack of the respective data. At the initial stage of our research, we found it reasonable to start with an in-silico validation of our predictive models, instead of collecting measurements of CO and Ees in a large cohort. In addition, the cost and the complexity of the respective measurements would make it difficult to incorporate them in the current study. Future work should include the use of real-world data for all parameters in order to verify the application of the proposed method in the clinical setting. Finally, the proposed models have been designed and tested on data coming from a generalized model of the cardiovascular system which was developed according to published data28. Hence, the precision of the predictions might be compromised in the case of pathological conditions, such as atherosclerosis, aneurysm or aortic valve disease. It is of great importance that in-vivo validation of the models be conducted using pathological clinical data as well.

In summary, this study showed that the use of noninvasive arm-cuff pressure and PWV alone potentially allows for the estimation of aSBP and CO with acceptable accuracy. This might not be the case for Ees prediction. Nevertheless, the estimated Ees can be greatly improved when EF is used as an additional input in the prediction model. Following validation on in-vivo invasive data, this approach may hold promise for the prediction of aortic hemodynamics and left ventricular contractility using nonintrusive, readily available standard clinical measurements.

Methods

A regression pipeline was applied for estimating aortic hemodynamics and LV contractility index. The schematic representation of the methodology is presented in Fig. 1. The input data comprised brSBP, brDBP, HR, cfPWV, and EF for every subject. These data were fed to the regression models to estimate aSBP, CO, and Ees. First, brSBP, brDBP, HR, and cfPWV were used as input predictors for all three outputs, i.e., aSBP, CO, and Ees. A second regression analysis was performed using EF as an additional input feature only for the estimation of Ees. The outputs of each testing set were blinded and kept as the ground truth against which our predictions were later compared.

Brief description of the in-silico model of cardiovascular dynamics

In the present study, we used a 1-D in-silico model of the cardiovascular system that has been previously described and validated against in-vivo data28,29. The arterial tree includes the main arteries of the systemic circulation, as well as the cerebral circulation and the coronary circulation. In summary, the governing equations of the model are derived by integrating the longitudinal momentum and continuity equations over the arterial cross section. Pressure and flow are acquired across the arterial tree by solving the governing equations employing an implicit finite-difference scheme. Local arterial compliance is calculated, provided that pulse wave velocity (PWV) is approximated as an inverse power function of the arterial lumen diameter28. Three-element Windkessel models64 are coupled to the distal vessels to account for the peripheral resistance. The contractility of the LV is modeled using a time-varying elastance model4,9. This elastance model considers a linear ESPVR characterized by its slope, the end-systolic elastance (Ees), and its intercept, the dead volume, Vd, as well as a linear end-diastolic pressure–volume relation characterized by its slope, the end-diastolic elastance (Eed).
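The linear ESPVR described above, with slope Ees and volume intercept Vd, can be written as P_es = Ees · (V_es − Vd). The sketch below evaluates this relation and its inverse. The function names and the example values of Ees and end-systolic volume are hypothetical; Vd = 15 mL is the value quoted in the description of the synthetic population.

```python
def espvr_pressure(volume_ml, ees, vd):
    """End-systolic pressure (mmHg) on a linear ESPVR with slope ees
    (mmHg/mL) and volume intercept vd (mL)."""
    return ees * (volume_ml - vd)


def espvr_volume(pressure_mmhg, ees, vd):
    """End-systolic volume (mL) that corresponds to a given pressure."""
    return pressure_mmhg / ees + vd


# Hypothetical operating point: Ees = 2.0 mmHg/mL, end-systolic volume 60 mL.
p_es = espvr_pressure(60.0, ees=2.0, vd=15.0)
v_es = espvr_volume(p_es, ees=2.0, vd=15.0)
```

A steeper slope (higher Ees) yields a higher end-systolic pressure for the same volume, which is why Ees serves as a contractility index.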

Synthetic population generation

A database of 4,018 synthetic hemodynamic cases was created. The 1-D cardiovascular model was run using different combinations of arbitrary input parameters. The distributions of the input parameters were based on physiologically relevant data from the literature. The cardiovascular parameters were chosen to represent healthy individuals. Due to the limited amount of probabilistic information, the sampling was selected to be random Gaussian. The values of Ees and Eed ranged within [1.03, 3.50] mmHg mL−1 and [0.05, 0.20] mmHg mL−1, respectively65,66,67. HR varied between 60 and 100 bpm. The LV filling pressure lay between 7.00 and 23.00 mmHg according to ref. 68. The dead volume (Vd) and the time of maximal elastance (tmax) were kept unchanged. Their selected values were equal to the mean values of Vd = 15.00 mL and tmax = 340.00 ms as reported by previously published works28,69. Arterial geometry was modified to simulate different body types by adapting the length and the diameter of the arterial vessels. The heights covered a range of [150.00, 200.00] cm, while the limits for aortic diameter were set to [1.90, 4.00] cm (refs. 50, 70). Total peripheral resistance varied within 0.50–2.00 mmHg s mL−1 (ref. 71). Total arterial compliance was chosen within the range of [0.10, 3.80] mL mmHg−1 in order to account for a wide range of different values of arterial tree stiffness72,73. It should be noted that evidence of nonuniform aortic stiffening was integrated for the elderly and hypertensive virtual subjects, following the methodology described by Bikia et al.49.
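One way to draw such input parameters is sketched below. The ranges are those listed above, but the rest is an assumption of this example: the mean is placed at the midpoint of each range, the standard deviation is set to one sixth of its width, out-of-range draws are rejected, and the parameters are sampled independently of one another. The dictionary keys and function names are hypothetical.

```python
import random

# Ranges taken from the text; key names are chosen for this example.
PARAMETER_RANGES = {
    "Ees_mmHg_per_mL": (1.03, 3.50),
    "Eed_mmHg_per_mL": (0.05, 0.20),
    "HR_bpm": (60.0, 100.0),
    "LV_filling_pressure_mmHg": (7.0, 23.0),
    "height_cm": (150.0, 200.0),
    "aortic_diameter_cm": (1.90, 4.00),
    "total_peripheral_resistance_mmHg_s_per_mL": (0.50, 2.00),
    "total_arterial_compliance_mL_per_mmHg": (0.10, 3.80),
}


def sample_truncated_gaussian(low, high, rng):
    """Draw a Gaussian value and reject it until it falls inside [low, high]."""
    mean = 0.5 * (low + high)   # assumption: centred on the range
    sd = (high - low) / 6.0     # assumption: range spans about +/- 3 SD
    while True:
        value = rng.gauss(mean, sd)
        if low <= value <= high:
            return value


def sample_virtual_subject(rng):
    """Sample one set of model inputs (parameters drawn independently)."""
    return {
        name: sample_truncated_gaussian(low, high, rng)
        for name, (low, high) in PARAMETER_RANGES.items()
    }


rng = random.Random(0)
population = [sample_virtual_subject(rng) for _ in range(100)]
```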

Virtual Database

The parameters of interest were estimated from the 1-D model-derived pressure and flow waves (simulation’s outputs). Concretely, synthetic brSBP, brDBP as well as HR data were obtained from the pressure wave at the left brachial artery. Similarly, aSBP was derived from the pressure waveform at the aortic root. CfPWV was derived using the tangential method74. The method computed the intersection (foot) of two tangents, i.e., the line passing tangentially through the systolic upstroke and the horizontal line passing through the point of minimum pressure. Subsequently, the pulse transit time was estimated between the foot of the wave at the two sites, namely, between the carotid artery and the femoral artery. The length between the two arterial sites was calculated by summing the lengths of the arterial segments within the transmission path. Finally, the cfPWV was estimated by dividing the arterial length of the path by the pulse transit time. Given that the ESPVR was known, the EF was derived by dividing the blood volume that is ejected within each heartbeat, i.e., the stroke volume (SV), by the end-diastolic volume (EDV). The value of the Ees was defined as the slope of the ESPVR. Then, all simulated information was discarded, except for the “measured” brSBP, brDBP, HR, cfPWV, and EF (inputs) and the aSBP, CO, and Ees data (outputs). The total dataset (organized in pairs of inputs and outputs) was kept for the training/testing process.
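A minimal sketch of the foot detection and of the cfPWV and EF calculations described above is given below. It assumes a single-beat waveform sampled on a common time axis and takes the steepest point of the upstroke as the tangent point; the function names are hypothetical and this is not the code used in the study.

```python
import numpy as np


def foot_time(t, p):
    """Foot of a single-beat pressure wave by the intersecting-tangents rule.

    The tangent through the steepest point of the systolic upstroke is
    intersected with the horizontal line through the minimum pressure
    that precedes the upstroke.
    """
    dpdt = np.gradient(p, t)
    i_up = int(np.argmax(dpdt))            # steepest point of the upstroke
    i_min = int(np.argmin(p[: i_up + 1]))  # minimum pressure before it
    return t[i_up] + (p[i_min] - p[i_up]) / dpdt[i_up]


def cf_pwv(t, p_carotid, p_femoral, path_length_m):
    """cfPWV (m/s) = path length / foot-to-foot pulse transit time."""
    transit_time = foot_time(t, p_femoral) - foot_time(t, p_carotid)
    return path_length_m / transit_time


def ejection_fraction(edv, esv):
    """EF = SV / EDV, with SV = EDV - ESV."""
    return (edv - esv) / edv
```

In the virtual database the path length is the sum of the segment lengths between the two sites, so it is known exactly for every simulated subject.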

Blending the dataset with random noise

The synthetic data were corrupted with random noise in order to represent a more realistic data collection. The introduced noise was equivalent to a random relative error within the range of [− 6.00, 6.00] % with respect to the actual value. This magnitude of error was selected based on published data from previous studies75.
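The noise step could look like the sketch below. The text specifies only the range of the relative error; drawing it from a uniform distribution, independently for every value, is an assumption, and the function name is hypothetical.

```python
import numpy as np


def add_relative_noise(values, max_rel_error=0.06, rng=None):
    """Perturb each value by a random relative error in [-6 %, +6 %].

    ASSUMPTION: the relative error is uniformly distributed and drawn
    independently for every entry.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    rel_error = rng.uniform(-max_rel_error, max_rel_error, size=values.shape)
    return values * (1.0 + rel_error)
```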

Clinical database

For the clinical validation of the aSBP estimations, we used clinical data from 783 subjects who underwent noninvasive cardiovascular assessment for research purposes, at the First University Department of Cardiology (Hippokration General Hospital, Athens, Greece). Anonymized data were analyzed in compliance with the Declaration of Helsinki of the World Medical Association and the National Regulations for clinical research.

The carotid-to-femoral pulse wave velocity (cfPWV) was measured in every subject as previously described76,77,78. In brief, cfPWV measurement was performed using the SphygmoCor apparatus (AtCor Medical Pty Ltd, West Ryde, Australia). First, short-term continuous arterial pressure waveforms were recorded with a hand-held tonometer (Millar, Houston, USA), simultaneously with ECG acquisition (for the synchronization of the continuous pressure waves recorded at the carotid and the femoral artery). Then, the recorded pressure waveforms were processed by proprietary software that automatically computes the pulse transit time from the carotid to the femoral artery using the tangential method74. Finally, cfPWV was calculated as the ratio of the distance between the two recording sites (calculated as the length from the suprasternal notch to the femoral artery minus the length from the carotid artery to the suprasternal notch) to the pulse transit time. CfPWV measurements were performed with the subject in the supine position after a 5-min resting period.

Noninvasive estimation of the aortic pressure waveforms was performed with the SphygmoCor System (AtCor Medical Pty Ltd), as previously described79,80. Radial pressure waves were first recorded by applanation tonometry, and central pressure waves were derived using validated transfer functions81. Multiple recordings were performed in every subject to satisfy the quality control criteria (quality index: > 85%). Calibration of the recorded pulse waves was performed using the brachial systolic and diastolic BPs, which were measured by cuff sphygmomanometry. The accuracy of this apparatus has been previously evaluated by comparing the estimated aortic BPs with intra-aortic catheter-based BP measurements79. Furthermore, the reproducibility of this technique has also been found to be acceptable under several different conditions and populations82.

Regression analysis

Four regression models were trained/tested to estimate the corresponding target outputs. The models that were employed were Random Forest33, SVR34, Ridge35, and Gradient Boosting36. By definition, a regression model comprises the following components: (i) the unknown parameters, β, (ii) the independent variables, Xi, and (iii) the dependent variable, Yi. In this analysis, the objective was to investigate whether the regression model can estimate aSBP, CO, and Ees from single-beat input predictors (brSBP, brDBP, HR, cfPWV, and, where applicable, EF). The training/testing scheme was based on a tenfold CV scheme83 (Fig. 8). Concretely, all cases were randomly divided into ten equal sets. In each fold, one set was held out as the testing group, and the remaining sets were used as the training group to fit the parameters of the models. Hyperparameter tuning was performed internally in each fold using GridSearch with a tenfold CV in order to optimize each fold’s model (Fig. 8). The hyperparameters that were chosen to be optimized are reported in Table 13. The hyperparameters that are not reported in Table 13 were set to their default values.
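The nested scheme (external tenfold CV for evaluation, internal tenfold CV for the grid search) can be sketched with Scikit-learn as below. The function name, the random seeds, the shuffling, and the RMSE scoring criterion are assumptions for illustration; the actual grids are those of Table 13 and are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_predict


def nested_cv_predict(estimator, param_grid, X, y, n_outer=10, n_inner=10, seed=0):
    """Out-of-fold predictions from an external CV with internal grid search.

    For every external fold, GridSearchCV tunes the hyperparameters on the
    training part only (internal CV), refits the best model on that part,
    and predicts the held-out part.
    """
    inner_cv = KFold(n_splits=n_inner, shuffle=True, random_state=seed)
    outer_cv = KFold(n_splits=n_outer, shuffle=True, random_state=seed + 1)
    search = GridSearchCV(
        estimator,
        param_grid,
        cv=inner_cv,
        scoring="neg_root_mean_squared_error",  # ASSUMPTION: tuning criterion
    )
    return cross_val_predict(search, X, y, cv=outer_cv)
```

For example, `nested_cv_predict(SVR(), {"C": [1, 10, 100]}, X, y)` would return one prediction per case, each produced by a model that never saw that case during tuning or fitting; the grid values in this call are placeholders.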

Figure 8

Representation of the experimental design for the evaluation of every regression model. The model evaluation was done using tenfold cross validation (CV) (external CV). In every external fold, we performed hyperparameter tuning with tenfold CV (internal CV).

Table 13 List of the hyperparameters which were chosen to be optimized and their corresponding values.

We investigated two approaches: (i) one to predict aSBP, CO, and Ees using brSBP, brDBP, HR, and cfPWV, and (ii) a second one to predict solely Ees using brSBP, brDBP, HR, cfPWV, and EF. Subsequently, we evaluated the accuracy of each regression model for every target variable on a subject level. Additionally, averaging the predictions of the individual models was tested as an ensemble learning approach. The training/testing pipeline was implemented using the Scikit-learn library84 in a Python programming environment. The pandas and numpy packages were also used85,86.

In-silico validation of the model-derived predictions

We first assessed the performance of each regression model for every target variable on a subject level for the virtual population. Tenfold CV as described above was used to evaluate the accuracy of the trained models. Moreover, we calculated the percentages of the cases whose aSBP errors met the international standards (< 5 ± 8 mmHg) of the European Society of Hypertension International Protocol37. The error thresholds for CO were set to 0.3 and 0.5 L min−1 based on the objective criteria suggested by Critchley and Critchley87. Finally, given that the only clinically acceptable technique for measuring Ees is the invasive end-systolic pressure–volume relationship, there are no meta-analyses using Ees data. In this respect, for Ees values within the range of [1, 4.5] mmHg mL−1, thresholds of 0.05 and 0.20 mmHg mL−1 should be adequate to provide an accurate estimation of Ees.
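The threshold-based evaluation can be sketched as below. How the 5 ± 8 mmHg criterion was applied to individual cases is not detailed in the text, so the sketch simply reports the mean and SD of the error together with the share of cases whose absolute error does not exceed a given threshold; the function name and the returned keys are hypothetical.

```python
import numpy as np


def error_summary(y_true, y_pred, threshold):
    """Mean and SD of the prediction error, and % of cases within a threshold."""
    error = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return {
        "mean_error": float(error.mean()),
        "sd_error": float(error.std(ddof=1)),
        # Share of cases with |error| <= threshold, in percent.
        "percent_within": float(100.0 * np.mean(np.abs(error) <= threshold)),
    }
```

The same function applies to aSBP (threshold in mmHg), CO (0.3 or 0.5 L min−1), and Ees (0.05 or 0.20 mmHg mL−1).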

Sensitivity analysis for the training size

In order to assess the effect of the number of training samples on our models’ accuracy, sensitivity analysis was performed. Concretely, the regression analysis was repeated after decreasing the training size from 95 to 15% of the total number of cases. For each training size, the predictions were evaluated in terms of RMSE between the estimated and reference data. Hyperparameter tuning was implemented for each different training set under consideration.
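A minimal sketch of this sensitivity analysis is shown below. It assumes that, for each training fraction, the remaining cases form the test set and that the split is random; neither detail is stated in the text. The estimator passed in may itself be a grid search, so that tuning is repeated for every training set. The function name is hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import train_test_split


def rmse_vs_training_size(estimator, X, y, fractions, seed=0):
    """RMSE on the held-out cases for each training fraction.

    ASSUMPTION: the cases not used for training are used for testing.
    """
    results = {}
    for fraction in fractions:
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, train_size=fraction, random_state=seed
        )
        model = clone(estimator).fit(X_train, y_train)
        residual = model.predict(X_test) - y_test
        results[fraction] = float(np.sqrt(np.mean(residual ** 2)))
    return results
```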

In-vivo validation of the model-derived aSBP predictions

Moreover, in-vivo validation was performed only for the best performing aSBP estimator, i.e., SVR. The validation was realized in two steps. First, we trained/tested an SVR model using only in-vivo data following the experimental design described in Fig. 8. Second, an SVR model was trained with the totality of the in-silico data (n = 4,018) and then tested on the in-vivo data (n = 783), as depicted in Fig. 9. During training, hyperparameter tuning was performed using GridSearch with tenfold CV.

Figure 9

Representation of the evaluation of the synthetically trained model against in-vivo data.

Feature importance evaluation

We assessed the importance of each input feature using the scores returned by the Random Forest model. The average importance of each feature was then calculated by averaging the scores from every fold k (k = 1, 2, … 10).
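The fold-averaged importance can be sketched as follows, using the impurity-based `feature_importances_` attribute of Scikit-learn's `RandomForestRegressor`. Whether the study used this attribute or another score is not stated; the number of trees, the seed, and the function name are assumptions, and hyperparameter tuning is omitted for brevity.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold


def mean_feature_importance(X, y, n_splits=10, n_estimators=100, seed=0):
    """Average the Random Forest importance scores over the k training folds."""
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, _ in folds.split(X):
        forest = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
        forest.fit(X[train_idx], y[train_idx])
        scores.append(forest.feature_importances_)
    return np.mean(scores, axis=0)
```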

Statistical analysis

The algorithms and the statistical analysis were implemented in Python (Python Software Foundation, Python Language Reference, version 3.6.8, available at https://www.python.org). We performed OLS estimation of the regression coefficients using each of the target parameters, i.e., aSBP, CO, and Ees, as the dependent variable and brSBP, brDBP, cfPWV, HR, and EF (only for Ees) as independent variables (using the statsmodels library88). Hypothesis testing for each regression coefficient was realized using the t-statistic. The agreement, bias, and precision between the method-derived predictions and the real values were evaluated using Pearson’s correlation coefficient (r), the coefficient of determination (R2), the root mean square error (RMSE), and the normalized root mean square error (nRMSE). The nRMSE was computed by normalizing the RMSE by the difference between the maximum and minimum values of the dependent variable. Bias and limits of agreement as described by89 were reported. The level of statistical significance was set at p < 0.05.
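The agreement metrics listed above can be computed as in the sketch below. Three choices are assumptions rather than details taken from the text: R2 is computed as 1 − SSres/SStot, nRMSE is expressed in percent, and the limits of agreement are taken as bias ± 1.96 SD of the differences. The function name is hypothetical.

```python
import numpy as np


def agreement_metrics(y_true, y_pred):
    """r, R2, RMSE, nRMSE (%), bias, and limits of agreement."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = y_pred - y_true

    rmse = np.sqrt(np.mean(diff ** 2))
    # nRMSE: RMSE normalized by the range of the reference values.
    nrmse = 100.0 * rmse / (y_true.max() - y_true.min())
    r = np.corrcoef(y_true, y_pred)[0, 1]
    # ASSUMPTION: R2 = 1 - SS_res / SS_tot.
    r2 = 1.0 - np.sum(diff ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    # ASSUMPTION: limits of agreement = bias +/- 1.96 SD.
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)

    return {"r": r, "R2": r2, "RMSE": rmse, "nRMSE_percent": nrmse,
            "bias": bias, "limits_of_agreement": limits}
```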