The rapidly growing and ageing world population with its healthcare needs has placed an increased demand on clinicians, not only in terms of their numbers, but also in terms of interaction time and quality of care. To meet such demand, there has been a notable shift in the opportunities for care outside the clinic, thanks to the evolution of monitoring and diagnostic technologies fueled by their integration into wearable technologies.

Healthcare sensors and remote measurement tools have saved lives thanks to the possibility of continuous monitoring and life-saving intervention tools, and over the years their deployment has expanded into areas beyond healthcare such as fitness and well-being perception [1]. Such expansion has led to the collection of data at an unprecedented rate, opening the floodgates to new possibilities in biomedical research [2, 3].

Biomedical sensors are being ubiquitously placed into daily-use devices, such as smart-watches and smartphones. Since the number of smartphone subscriptions worldwide today surpasses six billion and is forecast to further grow by several hundred million in the next few years (Footnote 1), the age of planet-scale healthcare technology testing and diagnostics is right around the corner. But what does this mean for studies using standard clinical data measurement setups in small, controlled settings when confronted with experiments approaching “N = All humanity” scales of data? Does quantity mean quality? What are the future implications for healthcare? A debate has arisen as to the quality of the data collected and of the signal processing methods, which may deviate significantly from medical standards.

Proxies of clinical gold standard measurements are frequently employed in wearable data acquisition and can lead to greater risks for error. The main issue that arises in the employment of wearable measurement devices, as is the case for any biomedical measurement for clinical use, is uncertainty generated by the presence of noise. The quality of the data and associated level of noise depend on many factors, including the devices employed and the skill level and competence of the person operating the device.

While wearable technologies offer new opportunities for translational biomedical engineering, the many pathways by which physiological data could be collected and the subsequent signal processing techniques raise concerns for data validity. Furthermore, the intricacy of dealing with sensor acquisition, data processing, and the complexity of physiological systems in health and disease raise concerns about data interpretation, data privacy, and data sharing.

The World of Wearables

Wearable sensors and remote measurements have made continuous patient monitoring possible. Wearables and wearable technology have consolidated their presence in health and well-being applications; in fact, 85 million wearable medical sensors and devices were shipped in 2021, and the number of shipments is expected to grow to 160 million in 2024 (Footnote 2).

State-of-the-art wearables can potentially capture data always and everywhere. For example, blood pressure monitoring in a clinical setting usually requires 1-h doctor appointments every 6 months; over the same 6-month period, 4320 hourly blood pressure readings could be obtained through the use of wearables. Clinically validated data acquisition systems found in hospitals are cumbersome to use and require specialized technicians [1,2,3]. The availability of miniaturized cardiovascular wearable diagnostic sensors, usually worn on the chest or wrists for cardiorespiratory monitoring, or typically embedded in smartphones for body-motion detection, has allowed for (i) longer durations of ambulatory monitoring for at-risk patients [4]; (ii) a prolonged use of wearables by asymptomatic, healthy subjects for assessing physical fitness or improving individual perception of well-being [5]; and (iii) the implementation of follow-up protocols for at-risk patients, in the form of alerts sent to patients or caregivers. Textile wearables are widely used as well [6].

Due to its nature, wearable technology employs proxy measures to obtain information that is meant to be equivalent to clinical biomarkers. For example, video-based motion-capture systems are the gold standard for capturing proxy parameters of healthy and pathological gait dynamics, which may also be retrieved using wearable inertial motion capture platforms [1]. These provide proxy measurements of the video-based motion-capture proxy in many rehabilitation scenarios, while allowing for patient monitoring in real-life conditions.

In respiratory medicine, spirometry is the gold standard proxy for the assessment of pulmonary capacity and related disorders [7]. In the wearable domain, a proxy of the proxy spirometric assessment has been developed in the form of abdominal and/or thoracic belts [1]. These are currently employed in pneumology studies, providing relevant diagnostic information, such as on apneic transients [7].

Besides monitoring patients, other advances in biomedical research have been developed using wearable technology, such as the acquisition of digital biomarkers through consumer-generated physiological and behavioural data. These biomarkers offer great potential in their application to medical domains that are less understood or where disease is difficult to diagnose or to quantify, such as in the fields of neurology and psychiatry. For example, emotion monitoring is carried out through camera-based facial expression sensing, capable of providing ubiquitous measurements of states that are otherwise physiologically or neurologically inaccessible, such as pain, which has no clear sensor correlate. Another example regards overcoming the limitations of more standard and expensive invasive vagus nerve stimulation techniques through extra skin/in-ear devices, benefitting patients with pharmacoresistant epilepsy (partial onset seizure disorders) and depression [8, 9]. The use of digital biomarkers could also facilitate clinical trials and reduce their duration, since disease progression may be monitored by wearables with more precision and accuracy.

The propagation of easily accessible and affordable wearable technology contributes to significant improvements in the power of well-being studies as well as the reach of basic healthcare diagnostics, though achieving this is not an easy task. Due to the use of proxy techniques and because of the nature of the devices themselves, data evaluation and validation require improved signal processing analytics that can extract diagnostic information despite the elevated amount of noise present in the acquisition, in order to monitor the performance of on-going interventions [1,2,3] and to predict future clinical events. With regard to problems in cardiovascular healthcare monitoring, class action lawsuits (Footnote 3) filed against major device manufacturers are supported by experimental testing: while wearable optical heart rate trackers have proven effective for assessing healthy cardiac behaviour when subjects are at rest, their predictive power decreases during physical activity or mental stress, unlike ECG-derived measurements [10]. Indeed, some studies claim clinical validity of smartphone-derived cardiovascular metrics [11], while others have raised concerns [12]. In young healthy adults, commercial-grade wearable optical heart rate trackers are off by 5–40 beats per minute during a treadmill protocol of 3 min at rest and at 2–6 mph activity [12]. These trackers have also been associated with a mean absolute percentage error of no less than 1.14% in other sedentary testing, and with larger errors during light (error range 5.60–24.38%), moderate (error range 6.70–24.27%), and vigorous physical activity (error range 3.32–9.88%) [13]. This is consistent with the finding that wearable heart rate monitoring through smart-watches on average underestimates heart rate by 1–9%, depending on the activity (e.g., rest: lying, sitting, standing; exercise: walking, cycling), with respect to a reference ECG [14]. 
In a small cohort of patients in an intensive care unit, wearable monitoring had a sensitivity of 69.5% and specificity of 98.8% for the detection of tachycardia, with lower sensitivity in case of patients not in sinus rhythm or with faster heart rates (> 150 bpm) [15]. Other, although minor, levels of disagreement were found in comparing standard and healthcare wearable metrics in patients with mild/moderate systolic dysfunction (see [16] and references therein).
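The error metrics quoted above (mean absolute percentage error against a reference ECG, and sensitivity and specificity for event detection) can be sketched in a few lines. This is a minimal illustration only: the function names, the heart rate values and the detection counts are hypothetical and are not taken from the cited studies.

```python
def mean_absolute_percentage_error(reference, measured):
    """Return the MAPE (in percent) of measured readings against a reference."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    relative_errors = [abs(m - r) / abs(r) for r, m in zip(reference, measured)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical paired readings in beats per minute (not real study data).
ecg_bpm = [60.0, 80.0, 100.0, 120.0]
watch_bpm = [57.0, 76.0, 95.0, 126.0]
mape = mean_absolute_percentage_error(ecg_bpm, watch_bpm)

# Hypothetical tachycardia detection counts (not real study data).
sens, spec = sensitivity_specificity(true_pos=70, false_neg=30,
                                     true_neg=980, false_pos=20)

print(f"MAPE: {mape:.2f}%  sensitivity: {sens:.2f}  specificity: {spec:.2f}")
```

Note that a MAPE summarises a whole recording in one number, so it can hide the activity-dependent errors described above; reporting it per activity level is one way to keep that information.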

Overfitting the Patient and Confounding Physiological Complexity

How reliable are healthcare wearables? The deployment of wearables outside a clinical setting has raised several questions as to the quality of the data being generated as well as the reliability of the data validation techniques employed.

Uncertainty arises from both noise and ambiguity in our measurements and sets fundamental challenges and limits to inferring the underlying physiological state of a person. Measurements taken by devices are often noisy due to electrical noise in amplifier circuits or the physical nature of the biosignals themselves [17]. Wearables by their very nature promise ubiquitous data collection, at the opportunity cost of using more “indirect” or “noisier” sensors. In the aforementioned example of blood pressure measurements, 4320 readings can be taken over 6 months, and such a large number of readings decreases the level of uncertainty; however, the level of noise in each reading may be much greater than in readings taken by a clinician.
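The trade-off between many noisy readings and few precise ones can be made concrete with a small calculation. The sketch below assumes independent, zero-mean random noise, for which the standard error of an average shrinks as 1/sqrt(N), plus a constant systematic bias that averaging cannot remove. All noise and bias values are hypothetical and chosen only for illustration.

```python
import math

HOURS_PER_DAY = 24
DAYS_IN_SIX_MONTHS = 180  # assumes 30-day months


def standard_error(noise_sd, n_readings):
    """Standard error of the mean of n independent readings."""
    return noise_sd / math.sqrt(n_readings)


def total_error(bias, random_error):
    """Combine a systematic bias and a random error in quadrature."""
    return math.hypot(bias, random_error)


n_wearable = HOURS_PER_DAY * DAYS_IN_SIX_MONTHS  # hourly readings in 6 months

# Hypothetical noise levels in mmHg (assumptions, not measured values).
clinic_se = standard_error(noise_sd=3.0, n_readings=1)
wearable_se = standard_error(noise_sd=10.0, n_readings=n_wearable)

# Averaging reduces random noise but leaves a systematic offset untouched.
clinic_total = total_error(bias=0.0, random_error=clinic_se)
wearable_total = total_error(bias=5.0, random_error=wearable_se)

print(n_wearable, round(clinic_se, 3), round(wearable_se, 3))
print(round(clinic_total, 3), round(wearable_total, 3))
```

Under these assumptions the averaged wearable estimate has a much smaller random error than a single clinic reading, yet a modest uncorrected bias is enough to make its total error larger, which is why calibration against a clinical reference matters as much as the number of readings.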

Different technological platforms and software systems among manufacturers present different levels of noise, signal-to-noise ratios, and noise correction and smoothing algorithms that are often exposed neither to the user nor to the “App” software developer. Common devices embedding biomedical sensors may come with noisy, biased and/or imprecise sensor performance. This may vary considerably across different brands and even within devices of the same make and series. Moreover, software developed across platforms without careful testing or calibration across devices could give rise to problems; this is especially a risk in the two big software platforms for smartphones, which create abstractions of low-level sensor differences and often simulate missing sensors. Developers are provided with generic sensors for devising software for “well-being” applications, which are likely to be the output of teams of programmers, without the involvement of clinicians or biomedical engineers who understand the underlying physiological and sensor mechanisms. It is therefore not surprising that, between the beginning of 2020 and the end of 2021, the number of mHealth apps available to Android users reached over 65,300 (Footnote 4), the majority of which lack scientific evidence of their function, and only four of which have been subjected to clinical trials [5]. This is a concern, as ‘well-being’ applications of wearables carry a high risk of misinformation, misdiagnosis or even mistreatment caused by these Apps. The absence of clinical trials or prospective studies limits reliable self-monitoring and self-management of disease and creates a divide between well-being gadgets and tightly regulated, clinically tested wearable medical devices. 
However, in the domain of medical devices, regulatory tightening by the US Food and Drug Administration and European Medicines Agency required more medical devices to undergo actual clinical trials only this year. Another example of problems with wearable devices lacking clinically driven data regards the dermatological inspection of skin abnormalities through photography-based apps, which was challenged by the US Federal Trade Commission when it took action against two melanoma detection Apps in 2015 because of little evidence of clinical output (Footnote 5). In either case, it is important that even well-being gadgets are clinically validated, as the scalability of this technology paired with the end-user’s illusion of precise technology can lead to medical complications from the misreading of such wearables.

A key area of concern regards specificity issues in human pathophysiology that require careful use of data-driven methods when healthcare monitoring occurs outside a clinically-informed context. While different stages of severity of a specific disease may show a clear correlation with wearable-derived biomarkers, many other conditions, especially healthy ones, may show biomarker variations similar to those seen in patients. For example, in internal medicine, autonomic nervous system functioning can be assessed through Heart Rate Variability (HRV) as a proxy measure. HRV is a widely recognized, non-invasive tool used to investigate neural control of cardiac activity, with a considerable number of applications and a considerable amount of clinical evidence [18, 19]. While gold-standard measures of HRV use electrocardiogram recordings to quantify heartbeat time intervals, most wearable cardiovascular monitoring devices rely on optical measures (photoplethysmography) of the mechanical blood pulse signal. Thus, this is another wearable proxy of a proxy [10]. Autonomic variations following dynamical sympathetic and parasympathetic activation or withdrawal can be observed for different conditions, such as postural changes, physical or mental stress, or different pathological states. This may lead to inconsistencies due to the variety of experimental set-ups and methodological approaches used in validation studies. Let us illustrate this autonomic non-specificity with respect to vagally driven heartbeat dynamics in the resting state: similar variations in HRV series may be observed during simple postural changes in healthy subjects as well as during unstructured activity in congestive heart failure [20]. In fact, these healthy and pathological states are both associated with significant sympathetic-driven dynamics in cardiac control.
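To make the notion of HRV derived from heartbeat time intervals concrete, the following sketch computes two common time-domain indices, SDNN and RMSSD, from a short series of inter-beat intervals. The choice of these two indices is ours for illustration (the text above does not name specific HRV metrics), and the interval values are hypothetical.

```python
import math
import statistics


def sdnn(rr_ms):
    """Sample standard deviation of the inter-beat intervals (ms)."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive interval differences (ms)."""
    diffs = [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical inter-beat intervals in milliseconds (not real recordings).
rr_intervals = [800.0, 810.0, 790.0, 805.0, 795.0]

print(f"SDNN:  {sdnn(rr_intervals):.2f} ms")
print(f"RMSSD: {rmssd(rr_intervals):.2f} ms")
```

Because both indices depend directly on beat-to-beat timing, any jitter introduced by an optical pulse sensor or a low sampling rate propagates into them, and identical values can arise from quite different physiological conditions, which is the non-specificity problem described above.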

For the validation of data from wearable sensors versus a gold-standard proxy, the use of simple correlation analyses, as well as inferential statistics based on first- and second-order moments (e.g., parametric t-tests or F-tests), should be avoided because they refer to group-wise metrics, which are not sample-wise. This is dangerous when estimating complex heartbeat dynamics (e.g., using entropy metrics), which may be significantly affected by the precision with which heartbeat time intervals are estimated and, consequently, by the signal sampling frequency [16]. Instead, a quantitative, sample-wise analysis based on the point-by-point comparison of proxies should be applied, also statistically testing the agreement between the two methods of measurement (e.g., through Bland–Altman analysis).
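A point-by-point agreement analysis of the Bland–Altman type can be sketched as follows: for each paired sample, the difference between the wearable and the reference reading is taken, and the mean difference (bias) and the 95% limits of agreement (bias ± 1.96 standard deviations of the differences) are reported. The function name and the paired values below are hypothetical, and the 1.96 factor assumes approximately normally distributed differences.

```python
import statistics


def bland_altman(reference, wearable):
    """Return (bias, lower limit, upper limit) of agreement for paired samples."""
    if len(reference) != len(wearable) or len(reference) < 2:
        raise ValueError("need at least two paired samples of equal length")
    diffs = [w - r for r, w in zip(reference, wearable)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)  # assumes near-normal differences
    return bias, bias - spread, bias + spread


# Hypothetical paired inter-beat intervals in milliseconds.
ecg_rr = [800.0, 810.0, 790.0, 805.0, 795.0]
ppg_rr = [802.0, 808.0, 793.0, 806.0, 791.0]

bias, lower, upper = bland_altman(ecg_rr, ppg_rr)
print(f"bias: {bias:.2f} ms, limits of agreement: [{lower:.2f}, {upper:.2f}] ms")
```

In this toy example the bias is zero, so a group-wise comparison of means would report perfect agreement, while the limits of agreement still expose sample-wise discrepancies of several milliseconds.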

When considering applications of statistical testing, analysts should account for an increased likelihood of finding spurious correlations, as well as for a redefinition of arbitrary thresholds in significance tests. As datasets increase in size, spurious correlations can begin to wreak havoc, suggesting significant results where there are none [21]. This poses challenges in how to verify the reliability of results gathered from very large-scale experiments, and how to run scientifically rigorous large-scale evaluations through the randomised controlled trials employed to reach the standard of care we are used to [22]. Further confounds are added by users cheating in their wearable use; because healthcare insurers monitor users' physical activities, this has resulted in the development of anti-cheating technology [23].
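One mechanism behind spurious findings is simple counting: the more variables are collected, the more pairwise tests are run, and a fixed significance threshold lets through a proportional number of false positives. The sketch below illustrates this under the assumption that all tested pairs are truly unrelated, and uses the Bonferroni correction as one well-known (and conservative) way of redefining the threshold. The correction is our choice for illustration and is not prescribed by the text above.

```python
def n_pairwise_tests(n_variables):
    """Number of distinct variable pairs that can be correlated."""
    return n_variables * (n_variables - 1) // 2


def expected_false_positives(n_tests, alpha=0.05):
    """Expected count of 'significant' results when no true effect exists."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold keeping the family-wise error rate at alpha."""
    return alpha / n_tests


# Hypothetical study recording 100 wearable-derived variables per user.
n_tests = n_pairwise_tests(100)
print(n_tests)                            # number of pairwise correlations
print(expected_false_positives(n_tests))  # expected spurious hits at 0.05
print(bonferroni_threshold(n_tests))      # corrected per-test threshold
```

With 100 variables there are already thousands of candidate correlations, so hundreds of nominally significant results would be expected from noise alone at the conventional 0.05 level.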

Tailoring Wearables Through Signal Processing—Outlook

Wearable technology and the creative approaches that have sprung from the need to provide cost-effective healthcare to a growing population are expected to build upon what is nowadays called “precision medicine”, that is, the prevention and treatment strategies that take individual variability into account [24]. Users will be equipped with healthcare self-assessment and self-management tools and, over time, will generate knowledge that directly impacts clinical practice.

To this end, the authors believe that some important considerations must be taken into account to ensure that the wake of this technological disruption unfolds responsibly.

The first is the need for standardisation of wearable devices. The variety of device hardware, software, experimental set-ups and methodological approaches used for validation is primarily responsible for important inconsistencies in wearable technology data validation. In order to make validation studies more reliable, it would be advantageous to the research community to encourage manufacturers and developers to create standards including, but not limited to, analogue front-ends, body placement, duration, and task details, as well as information on sensor calibration included in the software development kit.

Wearables give us the opportunity to analyse our complex physiological systems “as a whole”. Network medicine approaches and the subsequent discovery of functional interdependencies among newly developed healthcare variables will lead to alternative ways of understanding the structure of disease, therefore accounting for the complexity of physiological systems: because of the many interactions among many sub-components, the study of the system as a whole will uncover properties that the study of the individual subsystems acting alone cannot [19, 25]. To this end, biomedical engineers and computer scientists must also be prepared for their role in the healthcare chain in converting what is currently considered non-healthcare data, e.g., step counters and GPS locations, into actual healthcare data. End users will also need training or supervision in order to play their part in the data collection process, such as data labeling using “fuzzy labeling” rather than “binary logic” (e.g., diseased vs. healthy).

A further consideration regards the exploitation of large-scale data collection through wearables to overcome the non-specificity of physiological dynamics. Confounding disease elements resulting from comorbidities have already been recognised in pharmaceutical trial validation, yet their relevance to the evaluation of biomedical devices with inputs from autonomic dynamics has yet to gain broader appreciation. To this end, large-scale data acquisition initiatives have already started for cardiovascular healthcare, e.g., the Apple Heart Study (Footnote 6). Understanding how multi-faceted properties of disease are reflected in data gathered from simple wearable sensor biomarkers will make it possible for wearable sensors to discern among various disease stages and healthy states. Further insight into non-specificity will be gleaned through consideration of a combination of biomarkers. We envisage that data-driven approaches including deep/reinforcement learning may resolve several arising issues relating to uncertainty and individuality of treatment strategies, as they can learn real-world tasks. This requires that the algorithm operate in a closed loop, i.e., responding to sensor information appropriately and weighing the consequences of future actions to find the best final outcome. Such approaches have already been shown to surpass human performance [26]. Moreover, this class of algorithms can build rich internal representations, or models of patients, that allow the system to simulate potential future outcomes.

There are considerable ethical challenges present in the pervasive use of wearable healthcare technology, especially regarding individual privacy in health data sharing. Often, privacy concerns arise from the unintended and often unforeseen exploitability of data exhaust from wearables, e.g., users willing to undergo ubiquitous cardiovascular monitoring may expose their psycho-physiological states, for instance through emotion recognition based on the analysis of cardiovascular dynamics [27]. Moreover, storage of large amounts of data in clouds or other repositories also raises issues about the security of personal information. The hope is that these will be tackled by privacy-preserving data sharing and data mining algorithms that prevent the disclosure of sensitive data when mining for useful information, and that are also compliant with changes in research regulation (EU Regulation 2016/679 of the European Parliament and of the Council of 27 April 2016, the General Data Protection Regulation). As of May 26, 2020, any medical device certified in the EU must comply with the requirements of the Medical Device Regulation; by 2024, any device sold, and in 2025 any device put into service, must comply with the regulations. Privacy preservation will be achieved through cryptographic and blockchain technologies that prevent disclosure of sensitive personal information while making the anonymised data usable for analysis and protecting the source from being retraced. Moreover, algorithms will be designed to compute values with partial information only, e.g., averages, so that actual sensitive parameters need not be explicitly stated.
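As an illustration of computing an average from partial information only, the sketch below uses additive secret sharing: each user splits a private reading into random shares that individually reveal nothing, each hypothetical server sums only the shares it receives, and only the combined total is reconstructed. This is a toy example of one possible technique, not a description of any deployed system or a production-grade cryptographic protocol. The modulus, the number of servers and the readings are assumptions, and values must be non-negative integers smaller than the modulus.

```python
import secrets

MODULUS = 2**61 - 1  # arbitrary large modulus, assumed for this sketch


def split_into_shares(value, n_shares):
    """Split an integer into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def secure_average(private_values, n_servers=3):
    """Average private integers; each server only ever sees random shares."""
    per_server_totals = [0] * n_servers
    for value in private_values:
        for i, share in enumerate(split_into_shares(value, n_servers)):
            per_server_totals[i] = (per_server_totals[i] + share) % MODULUS
    total = sum(per_server_totals) % MODULUS  # only the sum is revealed
    return total / len(private_values)


# Hypothetical resting heart rates (bpm) held privately by five users.
resting_hr = [62, 71, 58, 80, 69]
print(secure_average(resting_hr))
```

The design choice here is that no single server can recover an individual reading from its shares, since each share is uniformly random on its own; the guarantee fails if all servers collude, which is one reason real deployments need further safeguards.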

Another message regards society, in that it is our duty as experts to create awareness in the general public and the medical community. It is important that the public be educated and informed, and that they understand both the limitations of embedded signal processing in the presence of noise and the benefits of wearable technology through patient involvement. Healthcare wearables have already placed the end user in a position to be more intimately involved in his or her own healthcare and well-being management. While some consumers are aware of the differences in efficacy between unregulated, untested nutritional supplements promoting well-being and tightly controlled pharmaceuticals that drive healthcare, discussion of the regulation of wearable healthcare has only just started. Clinicians and biomedical engineers should also be present in all aspects of wearable technology development. Machine-learning approaches will place medical doctors in the role of a “network of medicine” rather than confining them to the narrow and specialised fields delineated by medical schools [28].

These healthcare innovations are expected to stimulate academia and industry in developing countries. Rapidly changing social and technological dynamics have driven a price-sensitive market and growth in mobile phone and low-cost, smart-device ownership in critical regions of low and middle-income countries. Because of the lack of formal infrastructure to reach the base of the population, wearable biomedical sensors (especially those embedded into smartphones) constitute particularly promising channels for the provision of health services and information in rural, otherwise inaccessible zones. Considering that 78% of global mortality and 86% of mortality and morbidity from cardiovascular diseases occur in developing countries [29], this could be a major step towards improving global healthcare.

The evolution and propagation of wearable technology have disrupted the medical device industry and the traditional way of delivering healthcare. Access to a substantial degree of previously unmet personalised treatments and interventions is now possible, dramatically reducing the fear of communicating personal information directly to another individual, as patients are willing to disclose more information about themselves to a computer than to their therapist [30]. Medical device technicians and analysts, who once stood at the helm of research and development of biomedical device technology, are now equal partners with medical doctors and the general population, redefining the healthcare chain.

Technologies that were once in the hands of a few are now in the hands of all, resulting in mass quantities of data, thereby presenting new challenges and opportunities to the research community. At the forefront of this debate is the importance of noise reduction and signal processing validation, for if these are not resolved, the potential benefits and opportunities afforded by wearable technology may be lost.