Introduction

Wearable technologies, simply termed “wearables”, are defined as any electronic device that can be put on and removed from the body, or incorporated into clothing or accessories [1]. Within the ongoing technological revolution of the twenty-first century, wearables can be optimized to provide individualized patient care that offers the unique advantage of self-management. With the introduction of new wearable technologies, smart gadgets including smartwatches can now continuously identify specific types of arrhythmias, including atrial fibrillation (AF), using automated algorithms [2]. A wide range of smart wearables, including smartwatches, chest patches/straps, and sensors integrated into clothing and footwear, has emerged as a result of groundbreaking technological advancements [3]. These devices enable continuous and real-time recording of heart rate, making it easier to identify cardiac arrhythmias [4].

The accuracy of AF detection varies with the wearable device and the algorithm used [5]. In 2020, Margulescu and Mont reported that the overall sensitivity and specificity of smartwatch algorithms for identifying AF ranged from 70 to 90% [6]. The adoption of wearable devices may have a positive impact on patient care by enhancing patient-physician interaction, personalizing AF therapy, and improving stroke prevention techniques [10]. A high-quality single-lead ECG from a wearable device that suggests AF can hasten a diagnosis that might otherwise be delayed [2]. Photoplethysmography (PPG) is an emerging wearable technology that has gained popularity alongside smartwatches. PPG is a simple non-invasive technique that optically obtains a plethysmograph through blood volume changes in a peripheral vascular bed that are synchronous with the heartbeat. The signal also carries an underlying baseline and low-frequency components related to breathing, the sympathetic nervous system, and thermoregulation [7]. While heart rhythm assessment is one application of PPG, other uses are oximetry, blood pressure measurement, and peripheral vascular disease assessment [8].

The WHO typically defines “elderly” as adults older than 60 years. Biologically, arterial stiffness increases with age and serves as a prognostic indicator for coronary artery disease and stroke, both of which are linked to underlying AF. Patient outcomes can improve if effort is put into awareness and actions to mitigate the underdiagnosis of AF in the elderly through an integrated approach that works towards the prevention of AF complications. We aimed to investigate the potential of wearables for detecting AF among older adults in both home and hospital settings. We sought to evaluate the sensitivity and specificity of different wearable technologies for AF detection among older adults as the primary outcome. The secondary objectives were to examine the incidence of AF across different studies and to synthesize contextual factors, such as home versus hospital monitoring, and adverse events with the use of wearables among older adults.

Methods

Database Search

To locate the latest literature, a systematic search was conducted in three databases, namely, PubMed/Medline, Cochrane Library, and Scopus. The PRISMA 2020 Statement guidelines were adhered to during the literature search process [9]. The search was restricted to records published from 2015 to January 31, 2023, with no language restriction. The titles/abstracts and records were screened independently by three mid-career researchers (H.A., Z.S., and A.S.). Disagreements were resolved by an additional late-career researcher (I.C.-O.). A combination of the following keywords was used across the databases and search engines: wearables, device, hand, geriatrics, elderly, older, atrial, and fibrillation. Only peer-reviewed articles were included to limit the risk of bias.

PICO Framework

The following PICO framework was used to guide the research question:

P:

The population of interest includes older adults who utilize wearables or handheld devices.

I:

The intervention or exposure of interest is screening, diagnosis, or management for atrial fibrillation.

C:

The comparison of interest is a comparison of sensitivity, specificity, adverse events, and type of technology used for detection.

O:

The outcome of interest is an assessment of the effectiveness and appropriateness of the technology utilized for atrial fibrillation detection in older adults using wearables or handheld devices.

Exclusion Criteria

Reviews, observational studies, animal studies, letters to editors, and case series were excluded. Furthermore, if insufficient data was available to conduct the quantitative analysis, the study was discarded, as discussed by all authors.

Inclusion Criteria

Only patient groups comprising older adults were included, broadly using 60 years as the age cut-off. However, studies reporting mean ages close to 60 years were also included to represent a broader aging patient sample. Any original study, namely, randomized and non-randomized clinical trials, and observational studies including cross-sectional, case–control, and cohort studies, that used wearables as a tool for measurement of AF risk in previously non-diagnosed patients or for management of AF-confirmed patients was included. The reference or “gold standard” employed by each study was used as the benchmark for the sensitivity and specificity of wearables. Studies that reported the incidence, sensitivity, specificity, and adverse events for AF detection with wearables were considered. All genders were included. Any disagreements were resolved through active consensus among the reviewers.

Software and Data Extraction

The data for included studies was extracted as follows: year, first author, title, type of study, country, primary objective, type of assessment, inclusion criteria, technology name, technology function, study setting, number of participants included, age, gender, the incidence of AF, AF characteristics, sensitivity (95% CI), specificity (95% CI), positive predictive value (95% CI), negative predictive value (95% CI), and adverse events or challenges.
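As an illustration of how the extracted accuracy measures relate to one another, the following minimal Python sketch derives sensitivity, specificity, positive predictive value, and negative predictive value from a 2 × 2 table of index-test results against the reference standard. The Wilson score interval is one common choice for a 95% CI and is an assumption here; the included studies may have used other interval methods. The function names and the counts are hypothetical and are not taken from any included study.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (about 95% when z = 1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(tp, fp, fn, tn):
    """Return {name: (estimate, (ci_low, ci_high))} for the four measures."""
    pairs = {
        "sensitivity": (tp, tp + fn),  # AF correctly flagged / all AF
        "specificity": (tn, tn + fp),  # non-AF correctly cleared / all non-AF
        "ppv": (tp, tp + fp),          # true AF / all positive results
        "npv": (tn, tn + fn),          # true non-AF / all negative results
    }
    return {name: (k / n, wilson_ci(k, n)) for name, (k, n) in pairs.items()}


# Hypothetical counts for illustration only.
for name, (value, (low, high)) in diagnostic_metrics(tp=90, fp=5, fn=10, tn=95).items():
    print(f"{name}: {value:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Sensitivity and specificity are properties of the test against the reference standard, whereas the predictive values also depend on how common AF is in the sample, which is worth keeping in mind when comparing studies with very different AF incidence.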

The extraction of the studies was co-led by all co-authors. The findings of the included 30 studies are tabulated in Tables 1 and 2. Duplicates and bibliographic libraries were stored in EndNote X9 (Clarivate Analytics). Mendeley (Elsevier, Amsterdam, Netherlands) was used for reference management and citations. Descriptive statistics were used to assess continuous data (i.e., incidence, sensitivity), including percentages, means, and proportions.

Table 1 The study-level characteristics of the included studies
Table 2 Patient-level characteristics of the included studies

Funding Role

No funding was obtained.

Results

In phase I, the identification phase, a total of 242 studies were identified across 3 databases: PubMed, 119; Cochrane Library, 59; and Scopus, 64. Of these, 19 duplicates were removed. In phase II, the screening phase, 223 study titles and abstracts were screened, of which 137 were omitted as they did not meet the inclusion criteria. Consequently, 86 full-text studies were reviewed and assessed for eligibility. Of these, 56 articles were removed because they did not meet the inclusion criteria or the full text could not be located, as shown in Fig. 1. A total of 111,798 participants were included across 30 studies.
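The record flow described above can be checked with simple arithmetic. The short Python sketch below uses only the counts reported in the text; the variable names are illustrative.

```python
# Counts as reported in the text for each phase of record selection.
identified = {"PubMed": 119, "Cochrane Library": 59, "Scopus": 64}
duplicates_removed = 19
excluded_at_screening = 137
excluded_at_full_text = 56

total_identified = sum(identified.values())
screened = total_identified - duplicates_removed
full_text_assessed = screened - excluded_at_screening
included = full_text_assessed - excluded_at_full_text

print(total_identified, screened, full_text_assessed, included)
# prints: 242 223 86 30
```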

Fig. 1 PRISMA flowchart illustrating the study and record selection process

The Kappa score for the inter-reviewer agreement was 0.9275, suggesting excellent agreement between the two reviewers in the study selection process.
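For readers unfamiliar with the statistic, the sketch below shows how an agreement score of this kind can be computed, assuming Cohen's kappa for two raters (the text does not state which kappa variant was used). The screening decisions in the example are hypothetical and do not reproduce the reported value of 0.9275.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of records on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical include/exclude decisions for ten records.
reviewer_1 = ["include"] * 4 + ["exclude"] * 6
reviewer_2 = ["include"] * 3 + ["exclude"] * 7
print(round(cohens_kappa(reviewer_1, reviewer_2), 3))
```

In this toy example the reviewers agree on 9 of 10 records (observed agreement 0.90) while chance agreement is 0.54, so kappa is (0.90 − 0.54) / (1 − 0.54), or about 0.78.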

Study Types

Lubitz et al. [24], Zhang et al. [25], Perez et al. [26], Jacobsen et al. [27], Heo et al. [28], Reverberi et al. [29], Haverkamp et al. [30], and Dörr et al. [31] conducted prospective clinical trials. Wang et al. [32], Proesmans et al. [33], Himmelreich et al. [34], William et al. [35], Tison et al. [36], and McManus et al. [37] conducted retrospective cohorts. Wasserlauf et al. [38], Haverkamp et al. [30], Selder et al. [39], Rozen et al. [40], Bumgarner et al. [41], Chan et al. [42], Wegner et al. [43], and Haberman et al. [44] reported outcomes from prospective cohorts. Poh et al. [45] obtained data both prospectively and retrospectively in their cohort. Santala et al. [46], Jaakkola et al. [47], Lown et al. [48•], and Krivoshei et al. [49] conducted case–control observational studies. Gladstone et al. [50••], Ha et al. [51], and Steinhubl et al. [52] conducted randomized clinical trials. Although studies including Lubitz et al. [24] and Perez et al. [26] enrolled a majority of participants younger than 60 years of age, their subgroup analyses reported outcomes separately for older age groups. An exception to the age criterion of 60 years or older was made for Haverkamp et al. [30], Selder et al. [39], Tison et al. [36], and Haberman et al. [44], where the mean age was slightly younger than 60 years. The study-level characteristics of the included studies are summarized in Table 1.

Study Setting

Home-Based Monitoring

Ten studies monitored participants during routine activities, including screening for previously undiagnosed AF in older participants [24, 26, 39, 42, 52], with present risk factors for AF including hypertension [50••] and chronic kidney disease [28], at-risk post-cardiac surgery [51], and follow-up for AF management including healthcare utilization [32] and management post-ablation [53].

Hospital-Based Monitoring

A total of 20 studies were conducted in hospital settings. Twelve of the studies conducted in inpatient settings were among high-risk participants who were admitted for screening of AF and other cardiology-related diagnoses in cardiac and other wards [25, 27, 30, 38, 43, 46, 47], cardioversion procedures [29, 37, 40, 41], and initiation of anti-arrhythmic drugs [35]. Eight studies were conducted in outpatient settings with data obtained from patients who were visiting clinics [31, 33, 34, 36, 44, 45, 48•, 49].

Sensitivity, Specificity, and Gold Standard

R-R Interval

Reverberi et al. [29] obtained 10-min heart rate (HR) readings with R-R interval data obtained through two integrated electrodes attached across the chest on the RITMIA + smartphone app. There was a sensitivity and specificity (SE/SP) of 97% and 95.2% against the cardiologist-interpreted 12-lead ECG.

Oscillometric Sphygmomanometer

Lown et al. [48•] observed the diagnostic potential of 3 oscillometry devices which obtained pulse intervals in 1 (PH-7, BG2) or 3 (WatchBP) cycles of consecutive BP measurement. Compared to cardiologist-reviewed 12-lead ECG, the SE/SP was 96.3% and 93.5–98.5%.

Gladstone et al. [50••] conducted up to 2 weeks of continuous ECG (cECG) recordings with a miniature, Holter-type device stuck on the chest at all times, and twice-daily automated blood pressure (BP) readings 3 months apart. Using greater than 5-min episodes on cECG or 12-lead ECG as diagnostic standard, the SE/SP was 35.0%/81.0%.

Mechanocardiography

Jaakkola et al. [47] collected 3-min mechanocardiography recordings with a smartphone placed on the sternum with a SE/SP of 93.5% and 96.0%.

Photoplethysmography

Six studies evaluated smartwatches for AF detection. Lubitz et al. [24] analyzed Fitbit-based PPG data using an algorithm that examined irregular heart rhythms. The sensitivity/specificity (SE/SP) was 67.6%/98.4% compared to single-lead ECG patch monitors. Perez et al. [26] screened participants using an Apple smartwatch and found a positive predictive value (PPV) of 78% in older adults. Wang et al. [32] found cost-effective healthcare use among AF-confirmed participants using various smartwatch models. Wasserlauf et al. [38] analyzed PPG data from Apple smartwatches, achieving a SE/SP of 97.5%/83.3%. Tison et al. [36] obtained PPG data from Apple smartwatches and achieved SE/SP of 67.7%/67.6% in ambulatory cohorts and 98.0%/90.2% in cardioversion cohorts. Dörr et al. [31] analyzed PPG recordings from Gearfit2 smartwatches, achieving SE/SP ranging from 93.4%/98.0% to 93.7%/98.2%. Four studies evaluated smartphones. Rozen et al. [40] obtained waveforms using a finger pulse and achieved SE/SP of 93.1%/90.9%. McManus et al. [37] analyzed recordings using the root mean square of successive differences (RMSSD), Poincaré plot analysis (PPA), and Shannon entropy (ShE), achieving SE/SP of 97.0%/93.5%. Krivoshei et al. [49] recorded PPG-based readings on iPhones, achieving an SE of 95.0%. Jacobsen et al. [27] used PPG-based wearables and achieved higher SE/SP with a deep neural network (DNN) compared to normalized RMSSD (nRMSSD). Poh et al. [45] analyzed PPG waveforms using a DNN, achieving SE/SP of 97.6%/96.5%.
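Several of the PPG algorithms above rely on pulse-interval irregularity features such as RMSSD (root mean square of successive differences), its normalized form (nRMSSD), and Shannon entropy (ShE). The sketch below is a minimal Python illustration of how such features can be computed from a series of pulse intervals in milliseconds. It is not the implementation used in any included study; the bin count of 16 and the synthetic interval series are assumptions made for illustration, and no diagnostic threshold is implied.

```python
import math
import random


def rmssd(intervals_ms):
    """Root mean square of successive differences between pulse intervals."""
    if len(intervals_ms) < 2:
        raise ValueError("need at least two intervals")
    diffs = [b - a for a, b in zip(intervals_ms, intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def normalized_rmssd(intervals_ms):
    """RMSSD divided by the mean interval, which reduces heart-rate dependence."""
    return rmssd(intervals_ms) / (sum(intervals_ms) / len(intervals_ms))


def shannon_entropy(intervals_ms, bins=16):
    """Histogram-based Shannon entropy, scaled to [0, 1] by log(bins)."""
    low, high = min(intervals_ms), max(intervals_ms)
    if high == low:
        return 0.0  # all intervals identical: no uncertainty
    counts = [0] * bins
    for value in intervals_ms:
        index = min(int((value - low) / (high - low) * bins), bins - 1)
        counts[index] += 1
    n = len(intervals_ms)
    return -sum((c / n) * math.log(c / n) for c in counts if c) / math.log(bins)


# Synthetic series for illustration: a near-regular rhythm and an irregular one.
regular = [800.0, 810.0] * 30
rng = random.Random(0)
irregular = [rng.uniform(400.0, 1200.0) for _ in range(60)]

for label, series in (("regular", regular), ("irregular", irregular)):
    print(label, round(normalized_rmssd(series), 3), round(shannon_entropy(series), 3))
```

An irregularly irregular rhythm such as AF tends to raise both features relative to a regular rhythm, which is why they are commonly used as inputs to threshold-based or learned classifiers.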

Single-Lead Electrocardiography

Nine studies evaluated the diagnostic accuracy of AliveCor handheld single-lead ECG devices. Lown et al. [48•] found a sensitivity/specificity (SE/SP) of 87.8%/98.8%. Haverkamp et al. [30] obtained SE/SP of 100%/94% against 12-lead ECG using finger-guided 30-s tachograms from smartphone-based single-lead ECGs. Selder et al. [39] obtained SE/SP of 92%/95% using 30-s tachograms from the AliveCor handheld ECG device. Himmelreich et al. [34] achieved SE/SP of 87.0%/97.9% using 30-s tachograms from AliveCor devices compared to 12-lead ECG. William et al. [35] obtained SE/SP of 96.6%/94.1% using 30-s tachograms from AliveCor smartphones. Bumgarner et al. [41] achieved SE/SP of 93.0%/84.0% using tachograms from AliveCor-based smartphone readings. Chan et al. [42] obtained SE/SP of 75.0%/98.2% using 30-s tachograms from AliveCor-based smartphone readings. Haberman et al. [44] obtained SE/SP of 94.4%/99.1% using 30-s single-lead ECGs from smartphone cameras. Tarakji et al. [53] analyzed AliveCor-based 30-s waveforms and obtained SE/SP of 100%/97.0% using a reference trans-telephonic cardiac monitor. Other studies evaluated the diagnostic accuracy of chest adhesives and handheld chest and palm electrodes. Santala et al. [46] obtained SE/SP of 94.7%/100% with palm-based readings and 98.3%/100% with chest-based readings using Movesenseecg. Zhang et al. [25] obtained SE/SP of 93.3%/95.4% using AMAZFIT single-lead ECGs. Wegner et al. [43] achieved SE/SP of 70%/69% with finger-based electrodes and SE/SP of 55%/60% with a parasternal lead compared to 12-lead ECG interpretation by cardiologists.

Photoplethysmography and Single-Lead Electrocardiography

Proesmans et al. [33] analyzed 3-min tachograms from PPG-based smartphone camera data and single-lead ECGs attached to the left side of the chest using the Fibricheck app. The SE/SP for PPG was 95.6%/96.6% and for ECG was 94.7%/96.6%, corroborated by cardiologists’ 12-lead ECG diagnosis.

Incidence of AF

The profiles of included participants were variable, with an incidence rate of 2.3–100% across all 30 studies.

Adverse Events

Six studies reported adverse events including anxiety/stress and allergic reactions on the skin due to adhesive patches. Lubitz et al. [24] found anxiety with the use of wearables (n = 1124, 0.25%) followed by skin irritation from either the Fitbit (n = 977, 0.21%) or ECG patch monitor (n = 615, 0.13%) as adverse events across the entire cohort. Santala et al. [46] did not report any adverse events. Gladstone et al. [50••] discontinued cECG monitoring in 5 (1.2%) of the participants due to skin reactions. Perez et al. [26] reported anxiety/stress related to the use of a smartwatch among 15 participants (0.004%). Jacobsen et al. [27] found adverse skin reactions in only 1 participant. Steinhubl et al. [52] found that 40 of 2,659 participants (1.5%) had adverse skin reactions with an ECG patch.

The participant-level characteristics of the included studies are summarized in Table 2.

Discussion

In this review, we included 30 studies with 111,798 older adults who were being screened for AF in various settings. The two most common wearable technologies were photoplethysmography and single-lead electrocardiography, which had comparable sensitivities and specificities. PPG-based devices included smartwatches, smartphones, and upper-arm bands, with a sensitivity range of 67.6–98.0% and a specificity range of 83.3–98.4%; however, it is important to note that certain studies used deep learning models. Single-lead electrocardiography-based devices, including handheld finger electrodes, handheld chest electrodes, handheld palm electrodes, and adhesive patches, were reported with a combined sensitivity range of 55.0–100% and a specificity range of 60.0–100%. Mechanocardiography was evaluated in one study, with a high sensitivity and specificity of 93.5% and 96.0%. Oscillometric sphygmomanometer-based sensitivity ranged from 35.0 to 96.3% and specificity from 81.0 to 98.5%. R-R interval-based analysis had a sensitivity and specificity of 97.0% and 95.2%. A summary of our synthesis is presented in Fig. 2. Our findings suggest a wide range of sensitivities and specificities with wearables for the detection of AF among older adults.

Fig. 2 Graphical overview of key findings from the systematic review. *Industry 4.0 techniques including deep machine learning have been input as a statistical method that increased detection rates

The health technology industry, also known as digital health, has grown rapidly in the twenty-first century, with more than 600 million wearables in use in 2020 [54]. Based on trends at the time, this number was expected to rise to 928 million in 2021 and 1100 million in 2022 [55]. According to estimates, 67 million Americans were expected to be using wearable technology by 2022, with smartwatches making up more than half of all gadgets. The wearable industry is expanding significantly, and the global market for wearable devices is predicted to be worth $27.49 billion by 2026 [56]. With digital health, healthcare workers (HCWs) conduct remote examinations, assess vital sign data regularly, and offer teleconsultations. In the UK, wearables have been used to remotely monitor patients with chronic diseases or post-COVID-19 symptoms [57]. Wearables allow patients to self-screen in a cost- and time-effective manner, specifically during symptoms such as palpitations. For arrhythmia monitoring, KardiaMobile by AliveCor (Mountain View, CA, USA) combines a cellphone with a portable ECG monitor and received US Food and Drug Administration (FDA) clearance for AF detection in 2014 [58]. Specifically, in the iTransmit [53] and iRead [35] trials included in our review, these wireless devices demonstrated exceptional accuracy, with 100% and 96.6% sensitivity and 97% and 94.1% specificity for AF detection, respectively, despite various ambulatory and immobile patient profiles. Four years later, in December 2018, Apple released the Apple Watch Series 4, which for the first time merged the features of an ECG and a watch and received FDA approval as a class II medical device [59]. This device has been reported to have accuracy similar to typical 12-lead ECG recordings in measuring arrhythmia, atrioventricular block, and QRS duration extension, and the dial was also designed to show a bipolar ECG to keep track of hidden AF [60].

PPG is also a promising optical method to monitor blood volume changes per pulse and may be used to estimate blood pressure using multiple algorithms when incorporated into a wristwatch (e.g., Wavelet) or bracelet (e.g., Biobeat BB-613). Presenting PPG waveforms together with a tachogram and Poincaré plot, alongside algorithm classification, can improve the detection of AF to an accuracy equivalent to single-lead ECG, provided the quality of the recordings is high [61]. The Apple Heart Study by Perez et al. [26] was the largest, with 419,297 participants of all age groups, and the investigators found that the positive predictive value was lower in the elderly (78%) compared to the entire cohort (84%). The study confirmed that the PPG-based algorithm is safe and feasible for large-scale ambulatory screening of populations without known AF, yet there may be certain reasons why PPG performs less well in elderly individuals compared to younger adults. Aging is associated with reduced blood flow to the peripheral tissues, increased arterial stiffness, and natural changes to the skin, which can impact the quality of the PPG signal and reduce its sensitivity in the elderly [62, 63]. Overall, these factors can result in a weaker PPG signal in elderly individuals, making it more difficult to obtain accurate readings using this technology.
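To make the Poincaré plot concrete: it plots each pulse interval against the next one, and its spread is often summarized by two standard descriptors, SD1 (dispersion across the line of identity, reflecting beat-to-beat variability) and SD2 (dispersion along it). The Python sketch below computes both from a series of intervals in milliseconds. It is an illustrative sketch with synthetic data, not the method of the cited study [61], and the function name is hypothetical.

```python
import math
import statistics


def poincare_descriptors(intervals_ms):
    """Return (SD1, SD2) for the Poincaré plot of successive intervals."""
    if len(intervals_ms) < 3:
        raise ValueError("need at least three intervals")
    current = intervals_ms[:-1]   # x-axis: interval n
    following = intervals_ms[1:]  # y-axis: interval n + 1
    differences = [y - x for x, y in zip(current, following)]
    sums = [y + x for x, y in zip(current, following)]
    # Rotating the plot by 45 degrees gives these variance-based definitions.
    sd1 = math.sqrt(statistics.pvariance(differences) / 2)
    sd2 = math.sqrt(statistics.pvariance(sums) / 2)
    return sd1, sd2


# Synthetic example: intervals alternating between 800 ms and 810 ms.
sd1, sd2 = poincare_descriptors([800.0, 810.0] * 30)
print(round(sd1, 2), round(sd2, 2))
```

In a regular rhythm the points cluster tightly around the line of identity, whereas the irregular intervals of AF typically scatter the points widely, which is the visual cue the tachogram and Poincaré presentation is meant to expose.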

Limitations

Wearable sensors face challenges in their effective integration into clinical care, despite the known health benefits of physical activity and their potential in healthcare [64]. The technology is still new, with devices in early development stages, requiring further exploration of their capabilities, limitations, and practical applications [65]. Privacy, security, ethics, and user acceptance are important concerns for wearables, requiring attention in healthcare settings [66]. Additionally, digital literacy may be necessary for the optimal use of wearables, posing a challenge for some users [67]. Rigorous testing and approval processes limit the availability of devices. Variations in trial protocols and patient profiles may explain differences in sensitivities and specificities. The definition of “elderly” based on chronological age can lead to ageism bias, impacting clinical practice. The amount of data collected by wearables can burden healthcare providers, and distinguishing between true AF and other contractions can be challenging.

Recommendations

Wearable devices and machine learning show promise for stroke prevention and risk prediction. The Apple Watch Series 4 and KardiaMobile 6L from AliveCor extract heart rate and enable 6-lead ECG recording, respectively, aiding in the detection of AF. Wearables that monitor ECG provide greater stroke risk prevention benefits. Studies demonstrate that adopting wearables in clinical care outperforms traditional methods, making their implementation essential in healthcare settings. Notably, the Mobilize Center and MD2K organized a workshop to explore wearables’ potential and identified key elements for successful implementation, including problem definition, integration with existing health systems, focus on user experience, reimbursement models, and program evaluation. The symptoms of AF can be less noticeable in elderly adults. Younger people with AF may experience palpitations, shortness of breath, and chest pain, which can make the condition easier to detect. In contrast, older adults with AF may experience fatigue, weakness, or dizziness, which can be mistaken for other conditions. Older adults may be taking medications that can affect the heart rate, such as beta-blockers, which can make it more difficult to detect AF. There are unique challenges that come with diagnosing AF in older adults which must be considered when evaluating patients for the condition such as polypharmacy and other underlying medical conditions. Overall, wearables signal a new era in the provision of healthcare and have the potential to drastically alter several facets of clinical treatment.

Conclusion

In our review, 30 studies included 111,798 older adults who were screened or managed with wearables for atrial fibrillation in hospital- or home-based settings. Both PPG and single-lead electrocardiography have been tested across various settings, with a wide range of sensitivities and specificities. Other less frequently tested wearable technologies for AF included mechanocardiography and oscillometric sphygmomanometry. We found great potential in both PPG-based and single-lead electrocardiography-based screening and management of AF. Wearable technologies will inevitably take on greater significance in the field of healthcare and become more integrated into our everyday lives as a result of advances in science and technology alongside the acceptance of personalized health concepts. Consequently, it is of utmost importance to understand the challenges of wearables for the detection of AF in elderly populations and to incorporate their use in clinical practice as a preventative and monitoring tool.