Introduction

Heart failure (HF) is a global health problem that has a negative impact on the quality of life (QoL) of patients [1]. Its overall prevalence is estimated at 1–2% and increases with age, and HF is the most frequent cause of death in patients older than 65 years [2,3,4]. Most patients with HF are hospitalized at least once a year [5].

Close and frequent follow-up of these patients by multidisciplinary teams has been shown to reduce mortality and hospitalizations due to acute HF [6,7,8]. However, strict monitoring is difficult to ensure, so alternative strategies such as telemonitoring are gaining ground [9]. This approach makes it possible to obtain and provide information on a patient’s health status through a virtual interface, assist care, reduce the frequency of adverse outcomes, improve QoL, speed up access to healthcare, reduce transportation costs, and reduce face-to-face visits [10, 11].

Telemonitoring strategies have improved medication adherence and re-admission rates [12]. Strategies focusing on treatment optimization and self-care seem to be more successful in reducing mortality and hospitalizations due to heart failure than those that aim at early detection and management of acute events, probably because of false alerts [13]. Home-based telemonitoring has proven to be an efficient method of educating and motivating patients [14]. Smartphone-based apps for telemonitoring in HF are advantageous due to their wide availability, portability, low cost, computing power, and interconnectivity [15, 16]. A growing number of smartphone-based apps of differing complexity are now available [17,18,19,20], with variable feedback strategies, including in some cases 24 h support for emergency event detection and management. However, few studies have evaluated their benefits in clinical outcomes, as shown in previous systematic reviews [16, 21,22,23,24,25,26,27].

In this systematic review of RCTs, we evaluated mobile-based telemonitoring strategies in patients with HF, assessing their impact on mortality, hospitalization, and QoL, when compared to standard care.

Methods

Protocol and registration

This systematic review followed Cochrane methodology [28]. The protocol was approved by the institutional committee (approval code: 005–2022) and registered in the International Prospective Register of Systematic Reviews (PROSPERO), #CRD42018107855. This report is based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [29].

Eligibility criteria

We included randomized controlled trials (RCTs) evaluating adults (> 18 years old) with HF and comparing telemonitoring strategies using mobile applications with usual care, published between 2000 and 2021. Studies had to provide a clear HF definition (the universal definition [30] or an explicit definition from a national or international guideline). We defined a telemonitoring mobile application as a tool that (1) registers at least one relevant clinical variable for follow-up (i.e., symptoms, weight, heart rate, blood pressure); (2) offers an interface using any kind of mobile device; and (3) asks the patient to register clinical variables during follow-up. Studies had to provide a detailed description of the clinical decisions derived from the registered information (i.e., feedback) and measure at least one effectiveness outcome (mortality, hospitalization, or impact on QoL). For QoL, we included studies reporting any of the following: EQ-5D-5L [31], SF-36 [32], KCCQ [33], and MLHFQ [34]. We excluded non-randomized studies, reviews, abstracts, letters to the editor, case reports, case series, before-and-after studies, studies with follow-up of less than a month, studies focusing on multiple diseases, and studies using implantable devices or invasive monitoring.
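The three-part application definition above can be read as a simple screening predicate. The following is a minimal sketch only: the names `AppDescription`, `RELEVANT_VARIABLES`, and `meets_telemonitoring_definition` are hypothetical, and treating the four listed variables as the complete set of relevant variables is an assumption made for illustration, not a rule stated by the review.

```python
from dataclasses import dataclass

# Assumption: the four variables named in the eligibility criteria are
# treated here as the full set of "relevant clinical variables".
RELEVANT_VARIABLES = frozenset(
    {"symptoms", "weight", "heart_rate", "blood_pressure"}
)


@dataclass(frozen=True)
class AppDescription:
    """Hypothetical summary of an application as described in a trial report."""
    recorded_variables: frozenset
    mobile_interface: bool      # criterion (2): interface on any mobile device
    patient_enters_data: bool   # criterion (3): patient registers variables


def meets_telemonitoring_definition(app: AppDescription) -> bool:
    """Return True only if all three criteria of the definition hold."""
    # Criterion (1): at least one relevant clinical variable is registered.
    records_relevant = bool(app.recorded_variables & RELEVANT_VARIABLES)
    return records_relevant and app.mobile_interface and app.patient_enters_data
```

A check of this kind covers only the application definition; the remaining eligibility criteria (HF definition, feedback description, reported outcomes, exclusions) would still need to be assessed separately by the reviewers.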

Search strategy and information sources

A comprehensive literature search was conducted (full search strategy and terms are described in the Supplemental Appendix). We searched the electronic databases PubMed (MEDLINE), EMBASE (Elsevier), BVSalud (LILACS), and Cochrane Reviews from January 1st, 2000, through December 31st, 2021. We included studies in English and Spanish. Terms used were “heart failure”, “Smartphone”, “telemedicine”, “mobile applications”, “mHealth”, plus the filter “randomized controlled trial”, their synonyms, and combinations using Boolean terms. We further searched for useful articles using a “snowball strategy” by reviewing references of included articles and searching grey literature. All duplicates and overlapping results were identified and removed in the title screening phase.

Study selection

Study selection was performed by two independent researchers (MRdT, NHL, or JBC) using the online application Abstrackr [35]. We reviewed full texts of relevant citations and further screened them for eligibility. Disagreements between individual judgments were resolved by consensus or with a third evaluator (OMM), based on recommendations of the Cochrane Handbook for Systematic Reviews [28] and the PRISMA statement checklist [29].

Data collection process

Data were collected in a standardized electronic form including study design, inclusion criteria, participant demographics and baseline characteristics (i.e., age, gender, baseline functional class according to the New York Heart Association classification [36], HF etiology, Left Ventricular Ejection Fraction [LVEF]), HF definition, telemonitoring software type, retrieved variable type, input methodology by patient, output variables for patient and physician, feedback availability, and follow-up time. Outcomes registered were all-cause mortality, mortality due to HF, all-cause or HF hospitalizations, and QoL. We did not adjust units for analysis. Data from included studies were collected by two investigators (MRdT, NHL, or JBC). Disagreements were resolved by consensus or with a third evaluator (OMM).

Assessment of risk of bias in included studies

Two reviewers (MRdT, NHL, or JBC) independently assessed all documents using the RoB2 tool. An experienced third reviewer (OMM, AG, or DF) resolved disagreements between individual judgments. All studies were ranked in five different domains, yielding results of low risk of bias, some concerns of bias, or high risk of bias. Risk of bias was determined by outcome. Mortality and hospitalization were not likely to be influenced by blinding, whereas measurement of QoL, despite being performed using standardized tools, relies on patients’ subjectivity. Evaluation of evidence certainty for each outcome was performed using the GRADE tool [37].
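As an illustration of how five domain-level judgments can be combined into one overall rating per outcome, the sketch below takes the worst domain judgment as the overall result. This is a simplification and an assumption on our part: the names `Risk` and `overall_risk` are hypothetical, and the RoB2 guidance additionally lets reviewers rate a study as high risk when several domains raise some concerns, a judgment call that is not modeled here.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Ordered so that a larger value means a worse judgment."""
    LOW = 0
    SOME_CONCERNS = 1
    HIGH = 2


def overall_risk(domain_judgments: dict) -> Risk:
    """Combine five domain judgments into one overall rating.

    Simplified rule (assumption): the overall rating equals the worst
    domain rating. Reviewer discretion for multiple 'some concerns'
    domains is deliberately left out.
    """
    if len(domain_judgments) != 5:
        raise ValueError("expected exactly five domain judgments")
    return max(domain_judgments.values())
```

Because risk of bias was determined by outcome, a function like this would be applied once per study and outcome (for example, separately for mortality and for QoL).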

Data synthesis and analysis

Data synthesis was performed for each evaluated outcome. We reported quantitative variables as median and interquartile range, and dichotomous variables as proportions. If sufficient information was available, we calculated relative risks for all-cause or HF-specific mortality, hospitalization outcomes, and QoL using a random effects model for meta-analysis. We performed subgroup analyses for follow-up time (< 1 year vs. > 1 year), patient feedback (immediate vs. delayed), and software type. Data analysis was performed using RevMan 5.4. Finally, we generated summary and evaluation tables of the retrieved evidence, including certainty of evidence for each outcome, using the GRADEpro Tool.
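For readers unfamiliar with random effects pooling of relative risks, the following sketch shows one common approach: inverse-variance weighting on the log scale with the DerSimonian-Laird estimate of between-study variance, together with the I² heterogeneity statistic. It is an illustration under stated assumptions, not a reproduction of the RevMan 5.4 analysis: the function name `pooled_relative_risk` is hypothetical, the exact weighting method used in the review is not stated in the text, and the sketch assumes every trial arm has at least one event (no continuity correction is applied).

```python
import math


def pooled_relative_risk(trials):
    """Pool relative risks with a DerSimonian-Laird random effects model.

    trials: iterable of (events_intervention, n_intervention,
                         events_control, n_control).
    Assumes all event counts are > 0 (no continuity correction).
    """
    log_rr, variances = [], []
    for a, n1, c, n2 in trials:
        log_rr.append(math.log((a / n1) / (c / n2)))
        # Large-sample variance of the log relative risk.
        variances.append(1 / a - 1 / n1 + 1 / c - 1 / n2)

    # Fixed effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))
    df = len(log_rr) - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c_term = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c_term) if c_term > 0 else 0.0

    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random effects weights and pooled estimate with a 95% CI.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not data from the review.
    example = [(12, 100, 20, 100), (8, 80, 15, 82), (30, 250, 38, 245)]
    print(pooled_relative_risk(example))
```

When the estimated between-study variance is zero, the random effects result coincides with the fixed effect inverse-variance estimate, which is why low-heterogeneity analyses give similar answers under both models.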

Results

Study selection and characteristics

We found 900 references, of which 66 were reviewed in full text and 19 were finally included in the analysis [22, 25, 38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59]. The selection process is described in Fig. 1. Patient characteristics for each study are presented in Table 1. All included studies were published in English. Most (68%) included fewer than 100 patients per arm. Mean age was between 48 and 80 years, with a higher proportion of men. Twelve (63%) studies reported HF etiology, ischemic being the most frequent. Fourteen (74%) studies reported mean LVEF: 85% of studies included patients with heart failure with reduced ejection fraction. Eleven studies (57%) reported mortality, 13 (68%) hospitalization, and 11 (57%) evaluated QoL. Most studies (63%, n = 12) had patient follow-up of less than a year.

Fig. 1 PRISMA

Table 1 Description of the studies

Application characteristics are presented in Table 2. Regarding telemonitoring software, the most common approach involved preinstalled or web apps used through a smartphone (37%, n = 7), while two studies (10%) included web apps not specifically designed for smartphones. Other studies included wireless tablets (21%, n = 4) or proprietary devices (31%, n = 6).

Table 2 Characteristics of the applications

The most frequently monitored variables were weight (95%, n = 18), symptoms (79%, n = 15), blood pressure (57%, n = 11), and heart rate (42%, n = 8). Regarding data entry method, manual input was most frequent (95%, n = 18), although ten of the studied strategies (53%) reported both manual and automatic input using wirelessly connected external equipment (e.g., scales, blood pressure monitors). Most (n = 18) had a feedback plan; however, only 3 (16%) explicitly stated having immediate (< 2 h) support. Only 4 (21%) declared having 24 h availability.

Risk of bias assessment

RoB2 domain scores for each included study are shown in Supplemental Fig. 1. Only two (10%) RCTs were ranked as low risk of bias [49, 54, 55], whereas twelve (63%) presented at least some concerns of bias with regard to outcomes such as mortality and/or hospitalization.

All-cause and HF-specific mortality

In the global analysis, no differences were found in the risk of all-cause and cardiovascular mortality (Figs. 2 and 3).

Fig. 2 All-cause mortality

Fig. 3 Cardiovascular mortality

All-cause and HF-specific hospitalization rate

Telemonitoring strategies using mobile applications reduced HF hospitalization (RR 0.77 [0.67; 0.89], I² 7%). No differences were found in the risk of all-cause hospitalization (Figs. 4 and 5).

Fig. 4 Heart failure hospitalization

Fig. 5 All-cause hospitalization

Quality of life

Several scores were used to evaluate QoL in the included studies (n = 11) (Table 3). The most frequently used tools were MLHFQ [34] (64%, n = 7), SF-36 [32] (18%, n = 2), KCCQ [33] (9%, n = 1), and EQ-5D [31] (9%, n = 1). Due to heterogeneity in how effect measures were reported, pooled analysis was not possible. No improvement in QoL was observed in studies using MLHFQ [25, 40, 42, 53, 57, 59, 60] or EQ-5D [54], whereas studies applying SF-36 [43, 46] and KCCQ [41] reported statistically significant improvement. Of note, one study was not included as it only reported QoL before the intervention [52]; further, two studies [60, 61] measured QoL using two different tools, but only presented complete data for one tool.

Table 3 General characteristics of studies evaluating Quality of Life

Subgroup analysis

For subgroup analyses (Figs. 6, 7, and 8), we stratified studies by follow-up length (less or more than a year), device type (smartphone application, tablet, or other device), and feedback (by physician or not). With regard to mortality, tablet use was associated with lower all-cause mortality risk (RR 0.72, 95% CI 0.53, 0.97). Smartphone application or another device as monitoring strategy was associated with lower risk of both all-cause hospitalization (RR 0.28, 95% CI 0.13, 0.60 for smartphone application; RR 0.65, 95% CI 0.44, 0.95 for tablet) and cardiovascular hospitalization (RR 0.46, 95% CI 0.31, 0.68 for smartphone application; RR 0.84, 95% CI 0.73, 0.97 for another device). Meanwhile, cardiovascular hospitalization was reduced in the intervention group, regardless of follow-up length (RR 0.78, 95% CI 0.69, 0.89) and feedback type (RR 0.76, 95% CI 0.59, 0.97).

Fig. 6 All-cause mortality subgroup analysis

Fig. 7 All-cause hospitalization subgroup analysis

Fig. 8 Heart failure hospitalization subgroup analysis

GRADE

Supplementary Table 1 describes the summary of findings and evidence certainty evaluation. Certainty of evidence was moderate for both all-cause mortality and cardiovascular hospitalization, whereas it was low for cardiovascular mortality and all-cause hospitalization. Certainty of evidence for QoL differed by tool, with a high certainty level for EQ-5D (only one study), moderate for SF-36, and low for MLHFQ and KCCQ.

Discussion

This systematic review evaluated the impact of telemonitoring strategies using mobile applications for patients with HF. We found that their use reduces HF hospitalization risk (RR 0.77, [0.67; 0.89]) with low heterogeneity. No significant differences were found for all-cause and cardiovascular mortality, or for all-cause hospitalization. Regarding QoL, several scores were used with different reporting strategies, which limited pooled analysis; results diverged between studies. Most studies presented at least some concerns of bias.

Most strategies that reduce hospitalization risk in patients with HF rely on a pharmacologic approach [1, 62]. Nonetheless, adherence to therapy and to guideline recommendations is suboptimal [63, 64]. As illustrated by our results, mobile-based software for telemonitoring patients with HF may positively impact this risk. Previous meta-analyses [65, 66] including studies of home-based monitoring for patients with HF showed that these strategies reduce re-admission events, due to earlier detection of decompensation and therapeutic intervention, and that they promote treatment adherence. In addition, telemonitoring strategies can reduce the frequency of unnecessary hospital visits, which has been of great importance during the COVID-19 pandemic [11].

Smartphone-based apps for telemonitoring in HF are beneficial due to their wide availability, low cost, and computing power [15, 16]. Current evidence suggests a positive impact on treatment adherence and a reduction in HF hospitalization [12, 16, 22,23,24]. We recently published a pilot study in 20 patients followed for 6 months at our institution using a real-time telemonitoring smartphone app (“ControlVit”), in which we found that 91% of patients who used the app did not present any hospitalization event [12].

In 2016, Cajita MI et al. published a systematic literature review exploring the impact of mobile phone-based interventions in patients with HF, which included 9 studies (5 were RCTs) and reported inconclusive findings regarding mortality, readmissions, hospitalization duration, QoL, and self-care [26]. The readmission risk assessment included only three studies and fewer than half of the patients included in the present review, possibly explaining differences with our results. Further, a more recent pooled analysis by Son YJ et al. reported that mobile-based interventions had a significant impact on in-hospital management duration. Nonetheless, the authors did not find differences in all-cause mortality, readmissions, emergency department visits, or QoL [27]. In contrast to our study, the most frequent intervention was voice-call feedback, in which an interface for telemonitoring interaction was lacking; thus, the evaluated interventions were rather different.

Notably, our results did not show a definite impact on mortality. Few interventions have been shown to reduce mortality in this patient group. Out of 19 included studies, we found that only one RCT showed a reduction in mortality. Koehler et al. [25] evaluated telemonitoring using a wirelessly connected tablet, in which variables such as symptoms, vital signs and heart rate were retrieved. We hypothesize that the positive impact was because feedback was available 24 h/7 days. Further, this strategy was based on an algorithm that identified critical values and was able to classify patients into different risk strata [25]. New studies are needed to assess whether the potential benefits of closer feedback and automated algorithms are consistent.

Regarding evidence quality, we found that most RCTs presented at least some concerns of bias. This may be explained by a couple of reasons. As measurement of QoL relies on patients’ subjectivity, it yields a high risk of bias in the evaluation process. This limitation is less important for main outcomes such as mortality and readmission. Most studies (4/5) were considered to be at low risk of bias with regard to those outcomes. The remaining studies mainly had limitations in their randomization, as information concerning allocation concealment was lacking, or due to baseline differences between study arms.

We acknowledge some drawbacks of our study. First, most studies were performed before the widespread use of sacubitril/valsartan and SGLT2 inhibitors, which has been one of the most important advances in HF management, as it reduces mortality and hospitalization risk across the whole heart failure spectrum [1, 62, 67]. We were unable to ascertain pharmacologic treatment and patient adherence. Thus, our results may differ in the foundational therapy era, as several novel agents have become first-line therapy in the HF management armamentarium [1, 62]. Nonetheless, smartphone-based telemonitoring is a low-cost and widely available strategy warranting further exploration in high-quality RCTs. Second, the fact that we included different telemonitoring strategies, using not only smartphone-based apps but also external devices and web-based forms, may be considered a limitation for comparisons. We recognize the heterogeneity among included mHealth interventions. However, our telemonitoring definition captures common basic characteristics, describing a process in which there is (1) patient input, (2) data processing, and (3) output allowing both feedback and decision-making. As smartphone availability is increasing and access to wirelessly connected external devices (e.g., smartwatches, scales) is spreading, the impact of such devices on real-time data input and decision-making should be explored. For instance, data from Apple Watch® have been shown to be useful in arrhythmia detection [68]. Seeking to minimize this possible bias, we performed a subgroup analysis to assess possible heterogeneity secondary to device type, without finding significant differences. Third, interpretation and pooling of QoL data were limited by the use of different tools. As interest in the impact of patient-reported outcomes is increasing, a call is warranted to establish a preferred tool and to standardize reporting of this outcome, which would allow data pooling in meta-analyses. In addition, novel approaches for composite outcomes analysis, such as the win ratio [69], allow inclusion of QoL scores in RCTs. This approach should be considered in the data analysis of RCTs evaluating telemonitoring. Fourth, follow-up times were uneven between studies, thus limiting data interpretation. Future studies on smartphone app telemonitoring should consider a minimum, and ideally longer, follow-up time. Finally, we acknowledge that differences in inclusion criteria and HF definition across studies make it challenging to determine in which HF subpopulations we can expect a positive effect on HF hospitalization. HF definitions have evolved over time, and future RCTs should probably use the recently proposed universal definition [30], allowing a more homogeneous set of patients.
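To make the win ratio idea concrete, the sketch below implements a simplified, unmatched version in which every intervention patient is compared with every control patient along a hierarchy of outcomes: death first, then number of HF hospitalizations, then change in a QoL score. It is an assumption-laden illustration rather than the method of reference [69]: the names `Patient`, `compare`, and `win_ratio` are hypothetical, all patients are assumed to share the same fixed follow-up, time-to-event and censoring are ignored, and the QoL score is assumed to be oriented so that higher values are better (a score where lower is better would need its sign flipped).

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Patient:
    died: bool
    hf_hospitalizations: int
    qol_change: float  # assumption: higher means more improvement


def compare(a: Patient, b: Patient) -> int:
    """Return 1 if a wins, -1 if b wins, 0 if the pair is tied."""
    # Tier 1: survival takes priority over everything else.
    if a.died != b.died:
        return -1 if a.died else 1
    # Tier 2: fewer HF hospitalizations wins.
    if a.hf_hospitalizations != b.hf_hospitalizations:
        return 1 if a.hf_hospitalizations < b.hf_hospitalizations else -1
    # Tier 3: larger QoL improvement wins.
    if a.qol_change != b.qol_change:
        return 1 if a.qol_change > b.qol_change else -1
    return 0


def win_ratio(intervention, control) -> float:
    """Wins divided by losses over all intervention-control pairs."""
    wins = losses = 0
    for t, c in product(intervention, control):
        result = compare(t, c)
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
    if losses == 0:
        raise ValueError("win ratio is undefined when there are no losses")
    return wins / losses
```

A value above 1 favors the intervention. Because QoL only decides pairs that are tied on death and hospitalization, the hierarchy lets a patient-reported outcome contribute without outweighing the clinical events.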

Conclusion

HF is a burdensome entity from an individual and a societal perspective. Despite widespread mobile device availability and frequent use of these devices by patients at risk of HF or with established HF, mobile-based telemonitoring of HF patients is still a growing area of research. To the best of our knowledge, we offer the most comprehensive and updated systematic review on this topic, demonstrating a reduction in HF hospitalization risk in patients using this strategy. The reduction in mortality risk was not statistically significant, warranting further exploration in high-quality RCTs in the foundational therapy era. Future studies on this topic should allow a better assessment of QoL.