Data collection for HAR is a challenging and time-consuming task that involves general difficulties regarding data acquisition, synchronization, and multi-modal sensor arrangements (Calatroni et al. 2011). Some challenges are common pattern recognition problems associated with the robustness of the system, such as intraclass variability (i.e., people perform the same activity in different ways), interclass similarity (i.e., some activities resemble each other, e.g., drinking and taking a pill), and the NULL class problem (i.e., most of the time a person is not performing any target activity) (Bulling et al. 2014).
The NULL class problem and interclass similarity are addressed by collecting large datasets that include many samples of the target activities. With more training data, the models can discover more complex patterns, which enables them to differentiate effectively between similar classes and NULL class activities (Vargas Toro 2018). To cope with intraclass variability, data from several subjects need to be collected so that the model can learn the variability of individual activities.
Moreover, the ground truth labeling (assigning a ground truth label to each time stamp of the data stream, which is also called annotation for short) is a highly time-consuming task. It is mostly done by first recording the data collection session on video and then labeling the performed activities based on these recordings.
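The per-time-stamp labeling described above can be sketched in code. The following minimal Python example assumes that the video-based annotation is available as a list of (start, end, label) intervals in seconds and that every time stamp outside an annotated interval belongs to the NULL class. The function name, the interval format, and the example values are illustrative assumptions and not part of any specific annotation software.

```python
from bisect import bisect_right

NULL_LABEL = "NULL"  # assumed name for the NULL class


def labels_for_timestamps(timestamps, intervals, null_label=NULL_LABEL):
    """Assign one ground truth label to each sample time stamp.

    intervals: non-overlapping (start, end, label) tuples, treated as
    half-open ranges [start, end). Time stamps outside every interval
    receive the NULL label.
    """
    ordered = sorted(intervals)
    starts = [start for start, _end, _label in ordered]
    labels = []
    for t in timestamps:
        # Index of the last interval that starts at or before t.
        i = bisect_right(starts, t) - 1
        if i >= 0 and t < ordered[i][1]:
            labels.append(ordered[i][2])
        else:
            labels.append(null_label)
    return labels


# Hypothetical annotation derived from a video (seconds since session start).
annotation = [(2.0, 5.0, "drinking"), (8.0, 9.5, "taking a pill")]
# Hypothetical sensor stream sampled at 2 Hz for 10 s.
sample_times = [i * 0.5 for i in range(21)]
sample_labels = labels_for_timestamps(sample_times, annotation)
```

In this toy stream most samples end up in the NULL class, which illustrates why the NULL class problem arises even in short recordings.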
Before starting to collect data in accordance with our own requirements, we conducted a review of various public datasets and selected one, both for a proof of concept and as a guideline for designing our data collection methodology.
Review of Public Datasets
Over the past few years, several research groups have attempted to establish benchmark datasets for the HAR task by simulating real-life scenarios, variabilities, and activities. These datasets serve as a standard point of reference for machine learning algorithms applied in HAR.
Datasets for HAR are typically recorded in controlled environments, where persons wear body sensors and perform a series of scripted activities (Vargas Toro 2018). Some groups also included sensors that are attached to objects, as well as ambient sensors. The following list contains the considered HAR datasets and gives a brief description (see (Vargas Toro 2018) for further information):
The DaLiAc and WARD datasets contain a collection of 13 locomotion-related activities recorded with body-worn sensors ((Leutheuser et al. 2013), (Yang et al. 2009)).
The OPPORTUNITY dataset has a more limited set of locomotion activities but includes hand gestures and high-level activities in a setting with both body-worn and object sensors, simulating a scenario of activities of daily life (Chavarriaga et al. 2013).
The PAMAP dataset focuses on locomotion activities using body-worn sensors and includes a heart rate monitor in the measurements to address the physical intensity of an activity (Reiss and Stricker 2012).
The HAR dataset focuses on recognizing some physical activities by measuring data from only one smartphone (Anguita et al. 2013).
The REALDISP dataset attempts to simulate some of the variability that may occur in the day-to-day usage of sensors by introducing a degree of displacement of the body-worn sensors in some measurements. It focuses on recognizing physical activities (Baños et al. 2012).
The HHAR (Heterogeneity HAR) study analysed various heterogeneities in motion sensor-based sensing (i.e., sensor biases, sampling rate heterogeneity and sampling rate instability) and their impact on HAR by sensing a set of activities with 13 different smartphones (Stisen et al. 2015).
The AReM dataset measures the Received Signal Strength (RSS) between body-worn sensors in an experiment focused on recognizing physical activities (Palumbo et al. 2016).
We have selected the OPPORTUNITY dataset (Chavarriaga et al. 2013) among the publicly available HAR datasets to conduct a proof of concept of a HAR system (Section 4). This public dataset was chosen because of the number and placement of wearable sensors, the variety of types and complexity of the target activities, the relevance of the target activities in a real-life setting, and the number of participants in the data collection sessions (Vargas Toro 2018). Furthermore, both the data collection process and the ground truth labeling were carried out in detail by recording each session on video, which was later used for annotation with dedicated software (Chavarriaga et al. 2013).
However, the OPPORTUNITY dataset was recorded with young, healthy persons only. Thus, we propose a data acquisition methodology for creating a specific HAR dataset containing daily activities of elderly people and patients. This newly obtained database will be the basis for the development of an ARC that aims to support this group in their daily living.
Conception of Data Collection
Unlike other attempts at building such datasets (e.g., as proposed by (Roggen et al. 2010)), in this paper the target test group comprises elderly people and current patients. This approach is challenging, since healthy people move much faster and more safely than elderly or diseased people. Furthermore, the type of disease also affects the movements. This approach will therefore increase the chance of generating data for classifiers that are suitable for patient analysis in the future. Another challenging aspect during data recording is that the endurance of the participants is not as high as that of healthy persons.
The design of the data collection aims at covering realistic scenarios as well as clinically relevant activities. Hence, we have distinguished three main aspects pertaining to the collection of a HAR dataset, namely, (1) Definition of realistic scenarios, (2) Selection of activities of special interest, and (3) Selection of sensors and equipment. The main aspects and solutions are described next.
Definition of Realistic Scenarios
Following (Roggen et al. 2010), we have decomposed human activities into four hierarchical levels:
- (1) High-level activities (e.g. preparing breakfast, relaxing, cleaning up, etc.)
- (2) Mid-level activities (e.g. slicing bread, open drawer, etc.)
- (3) Low-level activities (e.g. moving bread, reach glass, etc.)
- (4) Modes of locomotion (e.g. walking, standing, sitting, etc.)
There are two categories of target activities that we have been interested in: activities of daily life (ADL) and modes of locomotion. Ni et al. (2015) define ADL as the self-care and domestic activities that a person performs in daily living, e.g., feeding oneself, bathing, dressing, grooming, work, homemaking, and leisure. These activities are typically the first ones that require outside support, and it has been found that there is a progressive functional loss in them, with hygiene being an early-loss activity (i.e., this is one of the first ADLs where it is likely that a person needs help from others), toilet use a mid-loss activity, and eating a late-loss activity (Morris et al. 2013). The measurement of ADLs allows conclusions to be drawn about the physical and cognitive status, provides information about frailty, and allows predictions about the risk of falling ((Nourhashemi et al. 2001), (Hellström et al. 2013), (Tinetti et al. 1994)). Frailty is defined as a syndrome of physiological decline in late life, characterized by increased vulnerability to adverse health outcomes and reduced ability to adapt to stressors. Procedural complications, falls, institutionalization, disability, and death are often associated with frailty (Clegg et al. 2013). Monitoring ADLs would make it possible to estimate the independence of older adults in their daily living, as well as to recommend activities that might improve a patient’s health status.
Additionally, modes of locomotion like walking, sitting and lying can be used as a reference for the physical activity level of an individual. Furthermore, they can be useful for detecting hazardous situations such as falling (Ni et al. 2015). Falls are the second most frequent cause of fatal accidents worldwide (an estimated 646,000 fatal falls per year), with elderly people (60+) suffering the most fatal falls. In addition, approximately 37.3 million non-fatal falls that are severe enough to require medical attention occur each year. Such falls are responsible for over 17 million lost DALYs (disability-adjusted life years). Moreover, elderly people with a disability due to a fall are exposed to a considerably increased risk of needing long-term care and institutionalization (WHO 2018).
Hence, we have designed the data acquisition methodology so that it incorporates some typical daily activities, modes of locomotion and object interactions that may happen during daily routines. This includes the usual locomotion in a room, morning activities, hygiene activities, taking medicine, drinking and eating as well as some leisure activities. Two types of sessions have been designed to cover both a realistic daily routine and the need for a large database for each activity. The first one, called ADL session, covers a short version of a realistic daily routine and is divided into the following sequence of phases:
Bed phase: The person starts the session by lying down in bed. This is followed by a short simulation of sleeping (taking on different sleeping positions), before the person becomes active again as is usual during the wake-up phase. The person sits up in bed and interacts with a smartphone (simulating a phone call and the writing of a message). Then the person moves to sit on the edge of the bed, leaves the bed and puts on some pants.
Bath phase: The person leaves the bedroom and enters the bathroom. In the bathroom, the test subject executes a set of hygienic activities. These activities include washing hands, brushing teeth, using a hair comb, and taking a seat on the toilet.
Table phase: The person leaves the bathroom again and moves towards the table in order to take a seat. This requires moving the chair appropriately. The person then sits down and drinks from a glass of water. Afterwards, the person takes a pencil and some paper to write down some notes.
Door phase: While the person is still sitting at the table, someone knocks at the door. The person stands up and opens the door. First, the person receives a tray with food, carries the tray to the table and then returns to the door. There, the person receives a plate with cookies and closes the door. The person carries the plate to a cupboard and returns to the table to take a seat again.
Table phase: Back at the table, the person first eats with cutlery, then takes a pill and finishes by drinking from a glass of water. All of these activities should be carried out in a natural way, with some short breaks during the transitions.
Cupboard phase: The person takes the glass, leaves the table and goes towards the cupboard. Standing next to the cupboard, the person concludes the session by first eating one of the cookies and then drinking from the glass of water.
We call the sequence of these phases and associated activities the “activity protocol”. It can be read to the participants to guide them through the ADL session.
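One way to make such an activity protocol machine-readable, so that the same sequence can be read out to participants and later reused as a checklist during annotation, is sketched below. This is a minimal illustration under stated assumptions: the class name `Phase`, the function `protocol_script`, and the exact wording of each instruction are hypothetical, and the activities are paraphrased from the phase descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    activities: tuple


# Phases and activities paraphrased from the ADL session description;
# the wording of each instruction is an illustrative assumption.
ADL_PROTOCOL = (
    Phase("Bed", (
        "lie down in bed",
        "simulate sleeping in different positions",
        "sit up in bed",
        "simulate a phone call and write a message",
        "sit on the edge of the bed",
        "leave the bed",
        "put on the pants",
    )),
    Phase("Bath", (
        "go to the bathroom",
        "wash your hands",
        "brush your teeth",
        "comb your hair",
        "take a seat on the toilet",
    )),
    Phase("Table", (
        "move the chair and sit down at the table",
        "drink from the glass of water",
        "write down some notes",
    )),
    Phase("Door", (
        "stand up and open the door",
        "carry the tray with food to the table",
        "return to the door and take the plate with cookies",
        "close the door",
        "carry the plate to the cupboard",
        "return to the table and sit down",
    )),
    Phase("Table", (
        "eat with cutlery",
        "take a pill",
        "drink from the glass of water",
    )),
    Phase("Cupboard", (
        "take the glass and walk to the cupboard",
        "eat one of the cookies",
        "drink from the glass of water",
    )),
)


def protocol_script(protocol):
    """Return numbered instructions that can be read to a participant."""
    lines = []
    step = 1
    for phase in protocol:
        for activity in phase.activities:
            lines.append(f"{step}. [{phase.name} phase] Please {activity}.")
            step += 1
    return lines


script = protocol_script(ADL_PROTOCOL)
```

Keeping the phases as ordered data rather than free text makes it straightforward to vary or shorten the protocol for participants with limited endurance.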
Selection of Activities of Special Interest
Complementary to the ADL sessions, which follow a natural pattern, activities of special interest (ASI) were performed in ASI sessions in a repeated pattern with small variations. These ASI sessions aim to generate more training data for a selection of activities within a short time. From a clinical perspective, the following activities were deemed most important; the reasons are explained after the list:
- (1) Changing positions in bed
- (2) Getting out of the bed
- (3) Using the toilet
- (4) Drinking
- (5) Eating
Micro-mobility (movements in bed) and nutrition status are important indicators for estimating the risk profile for decubitus, especially in elderly patients. These patients constitute the single largest group (more than 60%) among all patients with decubitus ulcers (Anders et al. 2010), and their ability to change position in bed is essential to avoid such ulcers (Harris 1996). For timely prevention, it is necessary to monitor mobility and nutrition intake and to detect any change that indicates an increased risk of decubitus. In addition to the discomfort for the patient and the increased risk of developing an infection, the treatment is care-intensive and time-consuming, and therefore an important economic factor (Anders et al. 2010).
Getting out of the bed unsupervised may lead to falls, which are a major problem in hospitals and contribute to a substantial healthcare burden, e.g., significant injury or an increased length of stay ((Oliver et al. 2004), (Oliver et al. 2010)). To prevent falls in patients who need support when getting up, it is necessary to detect the activity before the patient leaves the bed. This is an important part of a fall prevention program (Oliver et al. 2010).
The activity “Using the toilet” is clinically important because incontinence is one of the most common symptoms in neurological rehabilitation and has considerable physical, psychological and social consequences, which can significantly impair the quality of life of affected patients (Irwin et al. 2006). Other impediments to independent toilet use may be motor impairments, apraxia, or dementia. Independent toilet use is often an important therapy goal that is essential for increased privacy of the patients and a self-determined way of life. Moreover, in hospitals a large amount of caregiver time is spent supporting patients in using the toilet.
Sufficient hydration is necessary to maintain intra- and extracellular volume homeostasis and to avoid cognitive impairment, confusion, reduced concentration and irritability (Popkin et al. 2010). Elderly neurological patients in particular often show decreased thirst, which leads to dehydration and exsiccosis (Lauster and Mertl-Rötzer 2014). Therefore, it is important to monitor the drinking habits of persons with an increased risk of dehydration (Shells and Morrell-Scott 2018).
Malnutrition in elderly patients undergoing rehabilitation is a prevalent and often neglected problem associated with a reduced rehabilitation effect and lower physical function (Pirlich et al. 2006). In addition to monitoring the patient’s weight, it is important to recognize changes in dietary habits for timely intervention, i.e., before weight and muscle loss occur. Therefore, nutrition intake was also selected as an activity of special interest.
The ASI session contains the described activities of special interest, which are repeated several times to generate more training data. It is divided into the following sequence of phases:
Bath phase: In the bathroom, the person first simulates the toilet use. After washing and drying hands, the person also drinks from a glass of water.
Table phase: In the bedroom, the person sits down at the table to drink water again and to eat with cutlery. Then, the person takes a pill and drinks again.
Bed phase: The person sits down on the bed and then lies down. While lying down, the person moves from supine position once to the right side and once to the left side and back.
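To illustrate how repetition increases the amount of training data per activity, the following minimal sketch counts the activity instances produced by the ASI session. It rests on several assumptions that are not taken from the protocol itself: the whole sequence of phases is repeated, the number of repetitions (`REPETITIONS = 5`) is a placeholder, each turn in bed is counted as one instance of changing position, and the activity names are paraphrased.

```python
from collections import Counter

# One pass through the ASI session, paraphrased from the phase descriptions.
ASI_SESSION = [
    ("Bath", ["using the toilet", "washing hands", "drying hands", "drinking"]),
    ("Table", ["sitting down", "drinking", "eating", "taking a pill", "drinking"]),
    ("Bed", ["sitting down", "lying down",
             "changing position in bed",    # supine to right side
             "changing position in bed"]),  # to left side and back
]


def instances_per_activity(session, repetitions):
    """Count how often each activity occurs over all repetitions."""
    counts = Counter()
    for _phase, activities in session:
        counts.update(activities)
    return {activity: n * repetitions for activity, n in counts.items()}


REPETITIONS = 5  # assumed placeholder; the text only says "several times"
totals = instances_per_activity(ASI_SESSION, REPETITIONS)
```

Under these assumptions, drinking alone would be recorded 15 times per participant, compared with the three drinking instances in one ADL session.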
Selection of Sensors and Equipment
Two types of sensors find application in the HAR context: (1) wearables and (2) ambient sensors.
Wearables
In order to record the activity data of a person, several wearables were used. Since acceleration information is often the most promising input for human motion detection (Roggen et al. 2010), we selected various devices that include an accelerometer.
The activPAL micro (PAL Technologies Ltd. 2019) is a small and slim activity monitor that includes a 3-axis accelerometer (see Table 1). It is used quite often in clinical trials, in trials with older residents, and in hospital trials (Chan et al. 2017). The activPAL software allows setting the start time for recording while the device is plugged into the docking station. If no stop time is defined, the recording stops when the memory is full, the battery runs out, or the device is plugged into the docking station again. The activPAL monitor is attached by covering it with a finger cot to prevent direct skin contact and then fixing it to the skin with medical patches. The correct orientation, which is specified by the manufacturer, has to be taken into account.
Table 1 Overview of data acquisition devices
Additionally, it was intended to record electromyographic (EMG) data, which makes it possible to investigate the potential use of EMG wearables for this kind of human activity recognition. The Myo armband (see Table 1) includes eight EMG sensors and a 9-axis inertial measurement unit (IMU) that consists of a 3-axis gyroscope, a 3-axis accelerometer and a 3-axis magnetometer. This device is supposed to be placed on the forearm so that the EMG sensors can measure the electrical activity of the forearm muscles to detect hand gestures, whereas the IMU collects data about the arm movement. Before starting to record data with the Myo armband, it has to be calibrated by the user in order to adjust the sensors to the respective muscle constitution. This is done by a simple hand gesture, to which the Myo armband responds with vibration feedback. The Myo armband uses Bluetooth Low Energy technology via a Bluetooth adapter to communicate with other devices. This interface can also be used to store the raw data of a Myo armband using a terminal program on a computer. Since only one Bluetooth connection can be established on a device, two computers are necessary for capturing the data of two Myo armbands.
A prototype of the SmartCardia wearable (see Table 1) was additionally used to record acceleration data of the upper body. This device is primarily designed for continuously measuring physiological and vital parameters, including the electrocardiogram (ECG), heart rate, pulse rate and others (SmartCardia SA, 2019). It is worn like a patch on the chest.
Ambient Sensors
Additionally, two types of ambient sensors were selected: a Body Pressure Measurement System (BPMS) and cameras (only necessary for subsequent data annotation).
The BPMS from Tekscan (Tekscan, Inc. 2016) was used to gather data for the analysis of the transitions from lying to sitting or from sitting to standing. The BPMS provides a pressure distribution image of a person lying in the bed, with a resolution of 34 × 52 sensors per sensor layer, each of which has dimensions of 940 × 640 mm². In order to cover the size of a full bed, three sensor layers are placed next to each other in a cloth cover (see Fig. 3), which provides a total sensor surface of 940 × 1920 mm² with 5304 pressure sensors. With a sensor density of 0.3 sensels/cm², the pressure mattress provides a pressure distribution picture as shown in Fig. 2.
In order to ensure the recording of all activities for the later annotation, we decided to use wall-mounted stationary cameras. The “Akaso Action Cam EK7000” (see Table 1) was used, since this camera provides an ultra-wide-angle lens of 170° as well as high recording resolutions ranging from 1280 × 720 pixels up to 4096 × 2160 pixels. Furthermore, a handheld camera (see Table 1) was used to record a close-up of the test subject.
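The figures quoted for the pressure mattress can be cross-checked with a short calculation. The sketch below only uses the numbers given in the text (sensors per layer, layer dimensions, number of layers); the variable names are illustrative, and it is assumed that the three layers are joined along their 640 mm side, which is what the stated total surface implies.

```python
# Values taken from the description of the BPMS.
SENSORS_PER_LAYER = (34, 52)   # sensor grid of one layer
LAYER_SIZE_MM = (940, 640)     # dimensions of one layer in mm
NUM_LAYERS = 3                 # layers placed next to each other

sensels_per_layer = SENSORS_PER_LAYER[0] * SENSORS_PER_LAYER[1]
total_sensels = sensels_per_layer * NUM_LAYERS

# Assumption: layers are joined along the 640 mm side.
total_size_mm = (LAYER_SIZE_MM[0], LAYER_SIZE_MM[1] * NUM_LAYERS)

# Sensor density in sensels per square centimetre (1 cm = 10 mm).
layer_area_cm2 = (LAYER_SIZE_MM[0] / 10) * (LAYER_SIZE_MM[1] / 10)
density_per_cm2 = sensels_per_layer / layer_area_cm2

print(total_sensels)              # 5304
print(total_size_mm)              # (940, 1920)
print(round(density_per_cm2, 1))  # 0.3
```

The result confirms that 3 × 34 × 52 = 5304 sensors on 940 × 1920 mm² correspond to roughly 0.3 sensels/cm².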