1 Introduction

1.1 Background

Japan is facing a super-aging society. According to research by the Japanese government, the proportion of people aged 65 or over in the total Japanese population was less than 5% in 1950, but it had risen to 28.4% by 2019 [1]. Under these circumstances, there is a chronic shortage of nursing facilities and care workers. To cope with the problem, the Japanese government is shifting its policy from conventional facility-based care to in-home long-term care.

The Ministry of Health, Labour and Welfare in Japan promotes the Community-based Integrated Care System [2], which ensures the provision of health care, nursing care, prevention, housing, and livelihood support. The system relies on four kinds of aid: self-aid, mutual voluntary aid, insurance aid, and public aid. Among them, insurance aid and public aid can no longer be expanded due to the limitation of the social security budget. Hence, the government especially encourages elderly people to practice self-aid and mutual voluntary aid under the system.

However, it is not easy for most elderly people to maintain self-aid and independent living at home. As their physical abilities and cognitive functions decline, external support becomes necessary. With the declining birthrate and the increasing prevalence of nuclear families, support from family members inevitably has its limits. Moreover, when elderly people struggle even with self-aid, it is almost impossible for them to take care of others, which makes mutual voluntary aid quite challenging.

In this situation, technology is a promising means to alleviate various problems of in-home care. The research and development of assistive technologies for elderly people has been thriving worldwide. The book [3] summarizes the practice of assistive technology to support people with dementia. Also, the term gerontechnology has emerged for a multidisciplinary academic and professional field combining gerontology and technology, where many researchers and practitioners from various fields gather and form communities [4].

1.2 Research Goal and Approach

Our research group has been studying service-oriented architecture (SOA) [5] and its application to smart systems (also called cyber-physical systems), including the smart home (e.g., [6,7,8]) and the smart city (e.g., [9,10,11]). In general, every smart system consists of heterogeneous things and software components communicating over a network. Wrapping such heterogeneous components in Web services provides glue between the components, which achieves flexible integration and orchestration. Thus, all the distributed and heterogeneous components are regarded as services, and can be connected or disconnected easily, based on the principle of loose coupling.

At first, our research on the service-oriented smart home was motivated by technical interests. However, we began to think that it was important to apply it as gerontechnology. Although smart devices and information on the Internet are quite promising for helping elderly people, it is still difficult for most elderly people to make full use of them. Therefore, we considered it essential to make these devices and information easy for the elderly to use. For wide acceptance and sustainable use, it is also important that the technologies be affordable for general households and non-intrusive to daily living. To realize such a smart system that is really useful for in-home elderly care, we obtained Grants-in-Aid for Scientific Research (JSPS Kakenhi) in 2016 [12] and 2019 [13]. Using the research budget, we have invited collaborators from various research fields.

Our research goal is to design smart services that support and encourage elderly people at home to practice self-aid and mutual voluntary aid, and to implement those services with devices and systems affordable for general households.

Fig. 1 Conceptual architecture of elderly support system

Figure 1 shows the conceptual architecture of the whole system. In the proposed system, a virtual agent (hereinafter referred to as “VA”) mediates between the “mind” of the elderly person, such as his/her concerns and wishes, and the support services necessary to resolve or realize them, and provides self-aid and mutual-aid support without requiring complex operations from the elderly person. The system consists of three parts.

1.2.1 (S0) In-Home Care Service Platform

It is a platform that monitors the subject’s daily living and provides support services. In addition to general environmental and activity sensing using IoT, the system performs Mind Sensing through interaction with the VA, and records the subject’s mental state (physical condition, mood, anxiety, hopes, problems, etc.), which cannot be observed by sensors, by externalizing it into words. From the sensing data, the system constructs a digital twin, a data object that maps the subject’s observable behavior and mental state into cyberspace.

1.2.2 (S1) Self-Aid Support Service

It is a service implemented by applications that support subjects in solving problems by themselves. The system understands the subject’s physical and environmental conditions in real time through the digital twin. The system also attempts to detect signs of mild cognitive impairment (MCI) and dementia by extracting the number of problems, failed behaviors, and anxious discourse revealed by the externalization of the subject’s mind. The system then supports self-aid from the healthy stage to the MCI stage, by actively connecting the subject to information and services on the Internet, related organizations, and supporters.

1.2.3 (S2) Mutual-Aid Support Service

It is a service implemented by applications that create opportunities for elderly people to connect and help each other. Using information from the digital twin, the system matches elderly people who share the same concerns and interests, and the VAs communicate with them. Once a relationship of mutual trust is established, the matched users contact each other directly via chat or videophone applications, forming a network of mutual assistance. The VAs also share the externalized “mind” information with persons who have opted in, achieving safety confirmation, peer counseling, and voluntary living assistance.

1.3 Scope of Chapter

We have been studying various methods, applications, and services to implement the whole system shown in Fig. 1. In this chapter, however, due to space limitations, we focus on the sensing technologies provided by the (S0) in-home care service platform.

What we consider most challenging in in-home care is the individuality of the household. That is, situations and circumstances are quite different from one household to another. It is therefore important for the system to understand first how the individual elderly person is living, and then to provide appropriate (ideally personalized) care and support for the person.

In the following sections, we introduce our research achievements related to sensing technologies for elderly people at home. These sensing technologies monitor in-home elderly people along two different dimensions. The first dimension is monitoring the living of elderly people. As the first step in addressing individuality, we should observe and understand the physical life environment of each elderly person. In Sect. 2, we introduce non-intrusive environmental sensing using an IoT sensor device called Autonomous Sensor Box. We then present an activity recognition method using the environmental sensing data in Sect. 3.

The second dimension is monitoring the minds of elderly people. Ordinary sensing technologies have the limitation that sensors can detect only externally observable events; they cannot observe the internal state of the elderly person. To cope with this limitation, we proposed the Mind Sensing technology, which externalizes internal states as words through conversation with a virtual agent. In Sect. 4, we first introduce the agent technologies and services used for Mind Sensing. Then, in Sect. 5, we introduce the Mind Monitoring Service, which supports healthy daily living based on daily self-assessment with a LINE chatbot.

2 Monitoring Elderly Living by Environmental Sensors

2.1 Autonomous Sensor Box

To provide appropriate support for individual elderly people, it is important first to observe their living and environment physically, and to understand their current situation. Since it is impossible for family caregivers to manually observe and record the situation 24 hours a day, deploying IoT sensor devices is a promising alternative. IoT has been actively studied in ubiquitous and pervasive computing, and these fields have produced many sophisticated devices and methods (e.g., [14,15,16,17,18]).

In the context of monitoring in-home elderly people, however, the sensing devices must be affordable, and should be intrusive neither to the residents’ daily living nor to the house itself. Therefore, we decided to avoid wearable sensors and expensive indoor positioning systems. Instead, we have developed an inexpensive stationary environmental sensing device, called Autonomous Sensor Box [10].

The Autonomous Sensor Box is an IoT device that consists of a box with seven kinds of environmental sensors and a single-board computer, a Raspberry Pi. Figure 2 shows the actual implementation assembled with seven kinds of Phidgets sensors [19] (light, temperature, humidity, sound volume, gas pressure, motion, and vibration).

A user simply puts the sensor box in a location where it does not interfere with daily life, and connects it to a power source. The sensor box then automatically starts measuring the surrounding environment at 10-s intervals. The data is uploaded via the Internet to a private cloud in our laboratory, where we run services that manage data collection, device settings, and deployment information. Communicating with the cloud services, the software running on the Raspberry Pi automates the entire environmental sensing process. Thus, the only operation required at the elderly person’s house is switching the power on and off.

Fig. 2 Autonomous sensor box

Using the Autonomous Sensor Box, we have implemented a service platform for in-home environment sensing, as will be introduced in the following sections.

2.2 Service Platform for In-Home Environment Sensing

2.2.1 System Architecture

Figure 3 shows the system architecture of the proposed environment sensing service. The system consists of the following five components.

C1: Autonomous Sensor Box:

With power and network connection, this device starts environment sensing autonomously and uploads the data to the cloud.

C2: Sensor Box Management Service:

This service manages the configuration and deployment information of all the sensor boxes deployed in the experimental area.

C3: Log Collection Service:

This service collects the environmental data from the sensor boxes, attaches timestamps, and stores the data as a sensor log in a large-scale database.

C4: State Cache Service:

This service caches the newest data from every sensor box and provides the data as the current state of the sensor box for external applications.

C5: Sensor Box Log Service:

This service provides the stored sensor log to external authorized applications.

Fig. 3 System architecture of proposed service

We integrate the above five components with the principle of Service-Oriented Architecture (SOA). The detailed implementation of each component is described in the following sections.

2.2.2 (C1) Autonomous Sensor Box

Autonomous Sensor Box is an IoT device conducting indoor environment sensing at one location in an elderly house. The hardware of the sensor box consists of environmental sensors and a sensor hub that controls them.

As shown in Fig. 3, the sensor hub is equipped with Sensor Box Framework, which abstracts concrete sensor devices as sensor objects. More specifically, the framework takes a configuration file as input, declaring the name, the owner, and the location of the sensor box, and the type and implementation of each of the sensors in the box. Based on the configuration, the framework dynamically creates sensor objects and binds each object to the sensor implementation class. Thus, the framework allows developers to install various kinds of sensors in the box.

The simplest way to manage the sensor box configuration is to store the configuration locally in each sensor box. However, this approach lacks scalability as the number of boxes grows. Therefore, we manage the configuration and deployment information in a central database on the cloud, so that every sensor box downloads its own configuration at boot time. This is shown in Fig. 3 as the interaction between C1 and C2.

In our implementation, the Sensor Box Framework is further wrapped by the Sensor Box Service, through which external applications can get the data from the sensor objects via a REST API. The logger application in the sensor hub periodically calls the Sensor Box Service by REST to acquire sensor data and uploads it to the Log Collection Service. The default sampling interval is 10 s.
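To make the data flow concrete, the following Python sketch shows what such a logger loop boils down to. It is for illustration only: the actual logger was written in Perl (see Sect. 2.3.1), and the endpoint URLs and payload format here are assumptions.

  import time
  import requests

  SENSOR_BOX_API = "http://localhost:8080/sensorbox/values"  # assumed local REST endpoint
  LOG_COLLECTION_API = "https://cloud.example.org/logs"      # assumed cloud endpoint
  INTERVAL_SEC = 10                                          # default sampling interval

  while True:
      record = requests.get(SENSOR_BOX_API).json()    # read current values from Sensor Box Service
      requests.post(LOG_COLLECTION_API, json=record)  # upload to the Log Collection Service
      time.sleep(INTERVAL_SEC)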

To minimize manual operation at home, we implement the following features, which let the sensor box start the environmental sensing autonomously.

Auto connection to the network:
When switched on, the sensor box automatically connects to the pre-set network and prepares for connection to the cloud services.

Auto configuration of sensor box:
When prepared, the sensor box confirms its own ID and requests its configuration from the Sensor Box Management Service. Based on the given ID, the Sensor Box Management Service retrieves the configuration and deployment information for the box. Upon receiving the configuration, the sensor box creates sensor objects and launches the Sensor Box Service.

Auto launch of logger:
When the service is ready, the sensor logger is launched automatically. The logger uses the REST API of the Sensor Box Service to obtain the current values of the connected sensors, and uploads the acquired values to the Log Collection Service and the State Cache Service. The data sampling and upload are executed at a pre-determined interval (10 s by default).

Alive monitoring of logger:
While the sensor box is running, the system periodically checks whether the logger, the service, and the network are all alive. If any critical error is observed, the system reboots automatically.

With the above autonomous functions, the user only needs to put the sensor box in a location and turn on the power, which automatically starts the environmental sensing. This minimizes the time and effort required to set up and manage the installation.

2.2.3 (C2) Sensor Box Management Service

The Sensor Box Management Service manages the configuration and deployment information of all the sensor boxes deployed for the experiment. Each sensor box is identified by an ID. The configuration information of a sensor box declares its name and the list of sensors installed in the box. Each sensor is defined by a sensor type (e.g., temperature, light, humidity), a device (i.e., the reference to a concrete device class), and binding information (parameters passed to the device class). The deployment information records where the sensor box is deployed, including the location (house, room, position) and the owner.
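For illustration, such a configuration and deployment record might look like the following JSON; all identifiers, device class names, and values are hypothetical, not taken from the actual implementation.

  {
    "boxId": "box-101",
    "name": "living-room-box",
    "sensors": [
      {"type": "temperature", "device": "PhidgetTemperatureDevice", "binding": {"port": 0}},
      {"type": "light", "device": "PhidgetLightDevice", "binding": {"port": 1}}
    ],
    "deployment": {"owner": "yamada", "house": "yamada-home", "room": "living room", "position": "beside TV"}
  }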

When booted, a sensor box accesses this service to retrieve its own configuration and deployment information. The service also manages the network connection information (IP address, etc.) of every sensor box, which the system administrator uses for remote testing and maintenance.

2.2.4 (C3) Log Collection Service

The Log Collection Service receives environmental data from every sensor box and stores the data as a Sensor Log (i.e., time-stamped sensor values). To achieve efficient retrieval and aggregation of sensor data, the service defines the data schema shown in Table 1. In the table, data holds the sensor values measured by the sensor box, and info is metadata describing the sensor values. In order to handle data from various combinations of sensors in a unified manner, the sensor data itself is represented as key-value pairs of attribute names and values, without a strict schema.

Table 1 Data schema of environmental sensor data

On the other hand, the metadata is defined by common attributes that are independent of specific types of sensors. This enables cross-sectional search and aggregation of all sensor data. More specifically, we identified data items that explain when, who, and where the measurements were taken, since these aspects are independent of any specific environmental sensing. The items date, timeOfDay, and time relate to when; boxId and owner relate to who; and location relates to where.

For each measurement, the logger in the sensor box generates a record based on this schema, represents it as JSON-formatted text, and uploads it to the Log Collection Service.
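As an illustration, a single uploaded record might look like the following JSON, with the info attributes described above (all values are hypothetical):

  {
    "info": {"date": "2017-02-19", "timeOfDay": "03:33:02", "time": 1487442782,
             "boxId": "box-101", "owner": "yamada", "location": "living room"},
    "data": {"temperature": 21.5, "humidity": 43.2, "light": 120.0, "sound": 35.1,
             "pressure": 1012.5, "motion": 0.0, "vibration": 0.02}
  }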

2.2.5 (C4) State Cache Service

The State Cache Service caches only the latest data arriving from each sensor box and provides applications with fast access to the box’s current values. The sensor log stored in the Log Collection Service suits applications that use past values. However, for applications that need only the current values, the overhead of retrieving the latest values from the stored data is not negligible.

To realize fast access to the current values of any sensor box, the State Cache Service always keeps the latest measured values in memory, with the sensor box ID as the key and the current measured values as the value. The Autonomous Sensor Box uploads the measured sensor values to both the Log Collection Service and the State Cache Service, thereby serving applications that use past data and those that use current data efficiently.
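The core of the service reduces to a key-value map; the following minimal sketch illustrates the idea (the actual service is implemented in Java, as noted in Sect. 2.3.1):

  state_cache = {}  # sensor box ID -> latest uploaded record

  def on_upload(box_id, record):
      state_cache[box_id] = record    # overwrite: only the newest record is kept

  def get_current_state(box_id):
      return state_cache.get(box_id)  # O(1) access to the current values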

2.2.6 (C5) Sensor Box Log Service

The Sensor Box Log Service provides the stored sensor log to external authorized applications. Through a REST API that accepts the info attributes in Table 1 as query parameters, external applications can retrieve the sensor data in JSON or XML format.

2.3 Implementation

2.3.1 Service Platform

We have implemented the service platform for in-home environmental sensing.

First, the autonomous sensor box has been implemented by assembling commercial sensors manufactured by Phidgets Inc. [19]. More specifically, the following seven sensors were used:

  • Temperature Sensor 1125

  • Humidity Sensor 1125

  • Absolute Pressure Sensor 1141

  • Vibration Sensor 1104

  • Sound Sensor 1133

  • Light Sensor 1127

  • Motion Sensor 1111.

These seven sensors were connected to a Phidget Interface Kit, which exposes the sensor values over a USB interface. For the sensor hub, we used a Raspberry Pi 2 (Model B, Raspbian Jessie) single-board computer. As shown in Fig. 2, the box case contains the seven sensors and the interface board, and a USB cable connects them to the Raspberry Pi.

The sensor logger was implemented in Perl, and the alive monitoring was implemented with shell scripts and cron. The Sensor Box Framework and Service were implemented in Java and deployed as a Web service using Apache Axis2.

The Sensor Box Management Service was implemented with Perl CGI and the HTML::Template library. The Log Collection Service was implemented with the Fluentd log collection framework. For the database, we used MongoDB and HBase. Finally, the State Cache Service and the Sensor Box Log Service were implemented in Java and deployed as RESTful Web services using Jersey.

2.3.2 Applications

As examples of external applications of the proposed system, we introduce two Web applications. Figure 4 shows the first one, called Sensor Box Dashboard. Connecting to the Sensor Box Service on an Autonomous Sensor Box, the application displays the current values of the installed sensors. Using the application, an administrator can check whether the sensor box is working correctly.

Figure 5 shows the second application, called Sensor Box Log Viewer. Connecting to the Sensor Box Log Service, the application displays the daily time-series data of a given sensor box. Using the application, the user can review how the environment changed during the day.

Fig. 4 Sensor box dashboard

Fig. 5 Sensor box log viewer

With the consent of the elderly person, the data in these applications can be shared with family members and acquaintances to create opportunities for mutual assistance. Thus, the applications implement the first step of the (S2) Mutual-Aid Support Service in our conceptual architecture (see Fig. 1).

2.4 Deploying Autonomous Sensor Box in Actual Elderly Home

Currently, autonomous sensor boxes are installed at 20 locations in the houses of research collaborators, and the sensor log has been collected for several years. Let us look at an example of how the sensor box can be used for daily monitoring of an elderly person.

Fig. 6 Visualized sensor data of an elderly person

Figure 6 shows time-series sensor data recorded on September 15th, 2021 at the home of an elderly woman, who was in her 80s and living alone. The sensor box was placed beside her television in the dining kitchen. In the figure, the eight line plots represent light, sound, temperature, humidity, gas pressure, motion, vibration, and human presence likelihood (derived by integrating the motion values), respectively. In each graph, the horizontal axis represents the time (from 0:00 to 23:59), while the vertical axis plots the sensor value.

From the graphs, we can infer the woman’s approximate daily life. First, the motion and presence values indicate that she went to bed at 2:20 and woke up at 11:30; during the sleeping period, she woke up once around 5:30. The sound volume indicates that the TV was turned on at 11:50. The volume dropped and no presence was detected around 14:15, indicating that she went out; she then returned home at 15:30. After that, she appears to have turned on the air conditioner, since the temperature and humidity changed between 15:30 and 19:30. From 19:50 no motion was detected, indicating that she was taking a nap. She then woke up at 22:40 and stayed up until 2:00 the next day. The gas pressure was low because it was raining on this date.

Thus, the autonomous sensor box accumulates multiple kinds of sensor data 24 hours a day, 365 days a year, which can characterize the home environment of an elderly person from multiple perspectives. Remote family members, who know the elderly person well, can view the data over the Internet and imagine what he or she is doing. In other words, the family can keep a loose watch over the elderly person without intruding too deeply into his or her privacy.

3 Recognizing Daily Activities from Environmental Sensor Data

3.1 Can System Recognize Activities from Environmental Data?

As seen in the previous section, the time-series data collected by the autonomous sensor box (i.e., the sensor log) characterizes the living environment of an elderly person. The sensor log is useful for remote families to check whether the elderly person is getting along as usual. The environmental sensing by the autonomous sensor box is easy to introduce and not too intrusive to daily living, which is a great advantage. On the other hand, due to the nature of environmental sensing, it is not easy to recognize from the data what the person is actually doing. A family member who knows the elderly person well may be able to guess it manually; however, if the system can do this automatically, it helps a lot.

Our research question here is: “Using the sensor log collected by an autonomous sensor box, can the system automatically recognize the daily activities of an elderly person?” The daily activities refer to in-home activities regularly performed in the daily life, including sleeping, eating, cooking, cleaning, bathing, etc.

This problem is generally called sensor-based activity recognition [14], which has long been studied in the fields of ubiquitous and pervasive computing. Related work on recognizing in-home daily activities is summarized as follows. Kusano et al. [15] proposed a system that derives a life rhythm by tracking the movement of the elderly using RFID positioning technology. Munguia-Tapia et al. [20] installed state-change sensors on everyday items such as a door, a window, a refrigerator, a key, and a medicine container, to collect a resident’s interactions with objects. Philipose et al. [21] attached RFID tags to items to collect interactions. Pei et al. [22] combined a positioning system and the motion sensors of a smartphone to recognize human movements.

Although there are many existing works, we did not find any method that directly answers our research question.

3.2 Proposed Activity Recognition Method

To answer the research question in Sect. 3.1, we have developed a new activity recognition system in [23]. Since the autonomous sensor box cannot distinguish multiple residents at home, we focused our methodology on one-person households (OPH, for short). Although the target is limited, there still exists a strong demand to monitor elderly people in OPH.

In the proposed system, we apply supervised machine learning to the sensor log collected by the autonomous sensor box. Being based on supervised learning, the proposed system requires initial training, in which the resident manually records activities using a designated lifelog tool. The initial training is supposed to be performed over several days, to associate activity labels with the sensor data.

In the proposed system, we define seven daily activities (cooking, PC working, cleaning, bathing, sleeping, eating, and going out), which are the most typical activities for maintaining a life rhythm. To the labeled dataset, supervised learning algorithms are applied to construct an activity recognition model for the house. For this purpose, careful feature engineering is performed to determine the essential predictors that best explain the activities in an OPH. Furthermore, we try several classification algorithms and compare their performance.

Figure 7 shows the outline of the proposed system, which we explain from left to right. The system is initially set up within a target OPH. A single autonomous sensor box (or multiple boxes, if necessary) is deployed in a position where the daily activities are well observed as environmental measures. A software tool called LifeLogger is then installed on the user’s PC and used to attach correct activity labels to the environmental sensing data. The autonomous sensor box uploads the measured data to the Log Collection Service (see Sect. 2.2), whereas LifeLogger records time-stamped activities as a lifelog. The sensor log and the lifelog are joined by timestamp to form the training data. We then apply feature engineering and a machine learning algorithm to the training data, in order to construct a prediction model for activity recognition.

Once the trained model is constructed, the system moves to the operation phase. Taking environmental sensing data as input, the trained model outputs a recognition result, inferring the current activity.

Fig. 7 Outline of the proposed system

3.3 Collecting Data for Activity Recognition

3.3.1 Collecting Environmental Sensor Data

In the proposed system, we use the environment sensing platform with the autonomous sensor box, which was described in Sect. 2.2.

To recognize daily activities from environment attributes, the sensor box should be placed where the resident’s activities are frequently conducted. Note that the room layout and living circumstances differ among households. Hence, the sensor log collected in one household can be used only for activity recognition within that household.

3.3.2 Recording Lifelog for Correct Labels

During the initial several days, the resident needs to input the correct labels for activities, so that the system can learn these activities from the environmental sensing data. For this purpose, the residents were asked to use LifeLogger.

Fig. 8 Screenshot of life logger tool

Fig. 9 Raw data of life log

Figure 8 shows the user interface of LifeLogger. As shown in the figure, LifeLogger has eight buttons, each of which corresponds to an activity. When the resident initiates an activity, he/she simply presses the corresponding button to record the current activity.

Based on relevant studies [24, 25], 8 types of daily activities were chosen (sleeping, eating, bathing, cooking, PC working, cleaning, going out, and others), and registered in LifeLogger. When the button is pressed, the system records the starting time of the activity. When the button is pressed again or another button is pressed, the system records the ending time, and the starting time of the new activity if any. Figure 9 shows a part of the raw data recorded by LifeLogger. From the data, we can see that on February 19, 2017, the user did PCwork, Bath, Others, Sleep, and Others in this order.

3.3.3 Joining Sensor Data and Lifelog Data

Supervised learning requires training data in which the sensor log is associated with the corresponding activities in advance. To construct the training data, we join the two time-series datasets collected by SensorBox and LifeLogger by timestamp. Activity data labeled as “Others” was deleted, since it was beyond the scope of the activity recognition.

The sensor log is time-series data with a fixed interval (10 s by default), while the lifelog is event data recording the starting and ending time of every activity. Hence, we first convert the lifelog into time-series data with the same fixed interval, by filling in the activity ID between the starting and ending times. Then, we join the two datasets on the timestamp.
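The conversion and join can be sketched in Python with pandas as follows; the file and column names are illustrative, not those of the actual system.

  import pandas as pd

  sensor = pd.read_csv("sensor_log.csv", parse_dates=["time"])        # rows at 10-s intervals
  events = pd.read_csv("life_log.csv", parse_dates=["start", "end"])  # one row per activity

  # Fill the activity ID at every 10-s step between the starting and ending times.
  rows = []
  for _, e in events.iterrows():
      for t in pd.date_range(e["start"], e["end"], freq="10s"):
          rows.append({"time": t, "activity": e["activity"]})
  labels = pd.DataFrame(rows)

  training = sensor.merge(labels, on="time", how="inner")  # join on the timestamp
  training = training[training["activity"] != "Others"]    # 'Others' is out of scope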

Table 2 shows a part of the resulting data, which represents the sensor log from 3:33:02 to 3:33:32 on February 19, 2017. We can see that activity ID 5 (i.e., Sleep) is attached in the last column. Thus, this environmental data is used as training data that characterizes the Sleep activity of the user.

Table 2 Training data

3.4 Constructing Machine Learning Recognition Model

3.4.1 Choosing Relevant Environmental Attributes

For accurate activity recognition, it is essential to identify the environmental attributes that best predict activity. From the seven environmental attributes of the sensor log, only temperature, humidity, light, sound volume, and motion were chosen. By comparing about 20 recognition models built from different combinations of environmental attributes, we found that the remaining attributes (vibration and gas pressure) were almost unaffected by the resident’s activities, and therefore irrelevant to the target activities.

3.4.2 Feature Engineering

A feature value is a piece of data that is effective for identifying the activities. In this study, the feature values are obtained from the training data through the following process.

The size of the time window is determined first. To enhance the features of the time-series data, the raw data within the same time window is aggregated into one record. The window size affects the accuracy: if it is too large, the window is likely to contain different activities; if it is too small, the window will not contain sufficient data to infer an activity.

Next, an aggregation function is determined for each of the five chosen environmental attributes. An aggregation function aggregates all the data within the same time window. Typical aggregation functions include the maximum value (MAX), minimum value (MIN), average value (AVG), and standard deviation (STDEV). Based on the nature of each environmental attribute, an appropriate function was carefully chosen. Figure 10 shows the process of the feature engineering: the fine-grained time-series data is aggregated over the designated time windows, which characterizes the features of the activities.

Fig. 10 Feature engineering
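In pandas terms, the aggregation step can be sketched as follows; we use the 30-s window and the aggregation functions that later turned out best (see Sect. 3.5.2), and the file and column names are illustrative.

  import pandas as pd

  training = pd.read_csv("training.csv", parse_dates=["time"]).set_index("time")

  features = training.resample("30s").agg({
      "light": "min",        # MIN(light)
      "motion": "mean",      # AVG(motion)
      "temperature": "std",  # STDEV(temperature)
      "humidity": "std",     # STDEV(humidity)
      "sound": "mean",       # AVG(sound)
      "activity": "first",   # window label (windows are sized so as not to span activities)
  }).dropna(subset=["activity"])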

Note that it is non-trivial to know which aggregation function is best for each environmental attribute. Hence, different aggregation functions must be tested for each attribute, and by analyzing all the tests, the optimal combination of aggregation functions can be determined. However, testing all combinations exhaustively would require hundreds of rounds of tests, which is time-consuming.

To test the combinations of functions efficiently, a tool called PICT [26] was used. PICT generates a compact set of parameter-value choices representing the test cases required to achieve comprehensive combinatorial coverage of the parameters. Table 3 shows the nine combinations generated by PICT.

Table 3 Nine groups of aggregation functions

3.4.3 Establishing Recognition Model

Machine learning algorithms are applied to the engineered features of the training data to construct a prediction model for activity recognition. We use popular classification algorithms, including logistic regression, decision forest, and neural network. Using these algorithms, it is possible to construct prediction models that classify given environmental sensor data into one of the seven activities.

The performance of a prediction model is evaluated with a confusion matrix, which shows what percentage of the time windows is classified as the correct or wrong activity. The parameters for constructing a prediction model are (1) the size of the time window, (2) the selection of aggregation functions, and (3) the choice of the machine learning algorithm. We test as many variations of the parameters as possible and determine the combination that yields the most accurate prediction performance.
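As an illustration, such an experiment could be run with scikit-learn as follows; the actual experiments used Microsoft Azure Machine Learning Studio, and RandomForestClassifier is only a stand-in for its Multiclass Decision Forest. The features table is the one sketched in Sect. 3.4.2.

  from sklearn.ensemble import RandomForestClassifier
  from sklearn.metrics import confusion_matrix
  from sklearn.model_selection import train_test_split

  X = features[["light", "motion", "temperature", "humidity", "sound"]].fillna(0.0)
  y = features["activity"]
  X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

  model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
  print(confusion_matrix(y_te, model.predict(X_te)))  # rows: actual, columns: predicted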

Fig. 11 Apartment for the experiment

3.5 Experimental Evaluation

3.5.1 Setup Experiment

The proposed system was deployed in an actual apartment of a single resident. As shown in Fig. 11, the apartment is an ordinary condominium in Japan, consisting of a bed/living room, a bathroom and a kitchen. Two autonomous sensor boxes were positioned as indicated by the red triangles in Fig. 11, one in the kitchen and one in the living room.

A total of 645,705 rows of raw sensor data was collected from the kitchen SensorBox. The living room SensorBox collected 483,862 rows of raw data. We used Multiclass Decision Forest (DF), Multiclass Logistic Regression (LR) and Multiclass Neural Network (NN) algorithms of Microsoft Azure Machine Learning Studio [27], in order to build the activity recognition model.

Fig. 12 Confusion matrix of activity recognition with environmental sensing (7 × 7, actual vs. predicted; diagonal: Cook 78.2%, PC work 64.6%, Clean 7.7%, Bath 80.8%, Sleep 75.0%, Eat 24.0%, Absence 80.9%)

3.5.2 Result

We have tested many combinations of the parameters to build the activity recognition model. As a result, we found that the best parameters were 30 s for the time window size, the tuple of [Min(light), Ave(motion), Std(temperature), Std(humidity), Ave(sound)] for the aggregation functions, and the decision forest for the machine learning algorithm.

Figure 12 shows the confusion matrix, where each row represents the actual class of activity and each column represents the predicted class. From the matrix, we can see that the accuracy of the activity recognition depends on the class of activity. In this experiment, Cook, Bath, and Absence marked high accuracy (around 80%), PC work and Sleep marked middle accuracy (60–75%), and Clean and Eat were quite low.

Let us investigate the result in more detail. The activities PC work and Sleep were often misidentified as Absence. The reason is that the three activities were done under similar environmental conditions: the room was dark and there was little sound or motion. Eat was quite often misidentified as PC work, since the subject often ate meals at the PC desk. Hence, the proposed system is not good at recognizing activities that have a similar impact on the environment; in other words, environmental data alone cannot distinguish environmentally similar activities, which is a limitation of the proposed system. Clean was misidentified as Cook, PC work, or Bath. A reasonable interpretation is that the user had to move around the whole area while cleaning, and the duration of each cleaning was short; therefore, the system could not learn a unique characteristic of the cleaning.

3.6 Introducing BLE Beacons to Improve Accuracy

As seen in the previous section, the activity recognition with environmental sensing alone was not satisfactory for some activities. This means that the environmental data did not contain sufficient information to identify those activities. Although introducing cameras or wearable devices would provide much richer information, they would interfere with daily life.

As an idea to improve accuracy while preserving non-intrusiveness, we attempted an additional experiment by deploying BLE (Bluetooth Low Energy) beacons [28]. A BLE beacon is a small device that repeatedly transmits a constant signal that other smart devices (e.g., smartphones) can detect. On receiving the signal, a smart device can obtain the ID of the beacon as well as the RSSI (Received Signal Strength Indicator), from which the device can estimate the distance to the beacon. Using this principle, it is possible to estimate approximately which room the resident is in. Since some activities are strongly tied to a location, adding location information to the sensor log is a promising way to improve the recognition accuracy.
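Distance estimation from RSSI is commonly based on a log-distance path-loss model, sketched below. The parameters tx_power (the RSSI calibrated at 1 m) and the path-loss exponent n are device- and environment-dependent assumptions, not values from our experiment.

  def estimate_distance_m(rssi, tx_power=-59.0, n=2.0):
      # d = 10^((tx_power - RSSI) / (10 * n)), in meters
      return 10 ** ((tx_power - rssi) / (10 * n))

  print(estimate_distance_m(-75.0))  # about 6.3 m under these assumptions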

We have implemented a small smartphone application called Blue PIN. When a smartphone receives a signal from a BLE beacon, Blue PIN sends the beacon ID and RSSI to a designated server. The server stores the data in a database.

3.7 Additional Experiment with BLE Beacon Data

3.7.1 Overview of Additional Experiment

In parallel with the previous experiment in Sect. 3.5, we asked the subject to carry a smartphone with Blue PIN. Two BLE beacons were deployed in the kitchen and the living room, as indicated by the blue triangles in Fig. 11. During the experiment, 368,047 rows of data were collected from the living room, while 370,372 rows were collected from the kitchen.

Feature engineering for the beacon data is similar to that of the environmental sensing data. For each of the two beacons, we apply aggregation functions MIN and AVE to data within every time window.

The aggregated sensor data and beacon data are then integrated based on this consistent time window. Finally, the training data is created by joining the time-series activity log and the integrated data on the timestamp. Table 4 shows a part of the real training data. In the table, b2 and b3 represent the two beacons placed in the living room and the kitchen, respectively; bi.ave and bi.min represent the average and minimum RSSI values of beacon bi, respectively.

Table 4 Training data for additional experiment

From the data in Table 4, we can estimate approximately which room the resident was in. In the first three rows, the values of b2 are larger, indicating that the resident was in the living room; this is consistent with the fact that he was sleeping (activity No. 5). In the last two rows, the values of b3 are larger, indicating that the resident was in the kitchen; this is also consistent with the fact that he was taking a bath (activity No. 4). Thus, the training data was integrated with the location information.

Using the integrated training data, we constructed a prediction model. For the feature engineering of the environmental data, we used the same parameters as in the previous experiment.

3.7.2 Result

Figure 13 shows the confusion matrix. Compared with the previous result in Fig. 12, the accuracy is significantly improved. Thanks to the location information, Sleep and Absence were clearly distinguished, and Cook was no longer misidentified as Eat. Clean was improved but still unsatisfactory. PC work and Eat were still confused, since these activities were performed in the same place; for this pair, the location information did not help at all.

Fig. 13 Confusion matrix for proposed system using integrated data (7 × 7, actual vs. predicted; diagonal: Cook 92.7%, PC work 66.0%, Clean 30.8%, Bath 80.8%, Sleep 100.0%, Eat 21.2%, Absence 98.9%)

Table 5 compares the results of the two experiments. We can see that using the beacon data together with the environmental sensor data significantly improves the performance of the activity recognition.

Table 5 Overall comparison of experimental results

The proposed system enables automatic activity recognition from non-intrusive sensing, which is promising for future in-home care. A major drawback is that it requires activity labels to be attached to the sensor data, which is so tedious that most people may not accept it. How to improve the acceptance of the system is left for future work.

4 Monitoring Elderly Mind by Agents

4.1 Understanding Internal States

As seen in the previous sections, environmental sensing achieves automatic and non-intrusive monitoring of the physical living environment of elderly people. On the other hand, monitoring with sensors has the limitation that the system can detect only externally observable events. For example, suppose an elderly person is sitting in his living room, worried about his back pain. The sensors can detect that he is in the living room, but not his back pain.

When we discussed the activity recognition system with a research collaborator, a professional speech therapist specializing in dementia care, he said: “Understanding elderly people by sensors is technologically interesting. However, why don’t you ask the person directly how he/she is? Elderly people do not want machines to guess their activities. They are happy that you care about them!” His statement made us realize that we needed a new method, different from conventional sensing technology.

The point is how to understand the internal state of elderly people. Here, the internal state refers to the status of a person that cannot be observed externally, including moods, pains, conditions, desires, and intentions. Since the internal state is directly linked to human health, it is important to monitor it within home care [29]. The internal state is usually obtained through conversation, and is clinically assessed through inquiries and counseling by clinicians or counselors. However, it is not realistic to ask human professionals to monitor the internal state regularly at home.

The episode gave us the idea of utilizing agent technologies. An agent here refers to any software robot that can talk to a human user, including animated virtual agents that interact by voice (e.g., MMDAgent [30]) and chatbots that converse via text messages (e.g., LINE Bot [31]).

The key idea is to let an agent talk to the elderly person in daily life, externalize his/her internal state as words, and record the state with a timestamp. We named this idea Mind Sensing (“Kokoro” Sensing in Japanese), in the sense that the system tries to capture the internal mind of elderly people.

4.2 Agent Technologies Developed for Mind Sensing

Using existing agent technologies, we have developed two kinds of agent systems for Mind Sensing.

4.2.1 PC Mei-chan

PC Mei-chan is an animated virtual agent implemented with MMDAgent [30, 32]. MMDAgent is a toolkit for building voice-interactive software systems, originally developed at Nagoya Institute of Technology, Japan. MMDAgent contains a variety of modules, including text-to-speech (TTS), speech-to-text (STT), voice interaction control, and avatar representation. The virtual agent Mei-chan is included as the default avatar of MMDAgent.

Since we wanted to integrate Mei-chan with our service-oriented smart home (see Sect. 1.2), we de-coupled the voice interaction control and avatar representation modules from the system and wrapped them as Web services [33]. By doing this, Mei-chan became controllable via a Web API and could be orchestrated with the sensors and home appliances in our smart home.

Using the re-engineered MMDAgent, we developed a system called Virtual Caregiver (VCG) [34], in which Mei-chan talks to elderly people according to personalized care scenarios. Integrated with a Web browser, Mei-chan can also present Web contents such as texts, control buttons, pictures, and videos. Figure 14 shows the screen of VCG, where Mei-chan is asking “Do you regularly take medicine?” Figure 15 shows a scene from an experiment, where an elderly person was enjoying her favorite music played by Mei-chan.

Fig. 14 Virtual caregiver system: (C) 2009–2018 Nagoya Institute of Technology (MMDAgent Model “Mei”)

Fig. 15 An elderly person talking to VCG

Fig. 16 PC Mei-chan in active listening: (C) 2009–2018 Nagoya Institute of Technology (MMDAgent Model “Mei”)

Fig. 17 Elderly people operating PC Mei-chan

We then implemented dialog scenarios in which Mei-chan actively listens to the elderly. Triggered by the motion sensor, Mei-chan asked the elderly about their physical condition and mood, and then listened to them, thus externalizing their internal state into the conversation. Through these trials, Mei-chan proved to be a powerful means of Mind Sensing. We named the system PC Mei-chan for simplicity, in the sense that Mei-chan works on a PC for elderly people.

Figure 16 shows PC Mei-chan in active listening mode. To date, various extensions have been made to PC Mei-chan (e.g., [35,36,37]). It has also been deployed in actual elderly households to see whether elderly people can accept PC Mei-chan in their daily life. Figure 17 shows some scenes taken from the demonstration experiments.

4.2.2 LINE Mei-chan

In order to achieve portable Mind Sensing, we implemented another version of Mei-chan as a LINE chatbot, called LINE Mei-chan. Integrated with the LINE Messaging API [31], LINE Mei-chan sends questions for Mind Sensing via the well-known smartphone application LINE. Since the conversation is asynchronous and based on text messages, the elderly person can answer the questions at any convenient time. Also, the system can send questions even when the elderly person is out. Thus, LINE Mei-chan and PC Mei-chan complement each other, and are chosen appropriately for the purpose of Mind Sensing.

Fig. 18 LINE Mei-chan asking questions for mind sensing: (C) 2009–2018 Nagoya Institute of Technology (MMDAgent Model “Mei”)

Figure 18 shows screenshots of LINE Mei-chan on smartphones. The left screen shows the Memory-Aid Service [38], with which the elderly person actively chats with LINE Mei-chan to record the current internal state in the system. In the figure, LINE Mei-chan is asking the elderly person what he was doing in the room at 22:19 on June 24, based on an event detected by the activity recognition with environmental sensing (see Sect. 3). The elderly person answered what he was doing at that time, and the answer (i.e., the internal state) was recorded in the system. The service also provides a retrospective process, where the person can review, correct, classify, and search his or her own recorded information at any time. Thus, the service is designed for the memory-aid purposes of healthy elders as well as people with cognitive impairment. In [39], we extended the Memory-Aid Service so that LINE Mei-chan asks about and records daily health status (e.g., blood pressure, weight, body temperature, mood, etc.).

The right screen of Fig. 18 shows the Mind Monitoring Service [40], where LINE Mei-chan periodically sends questions to monitor the internal state of elderly people for long-term assessment. The Mind Monitoring Service will be described in detail in Sect. 5.

4.3 Mind Sensing Service: Rule-Based Service for Systematic Mind Sensing

As we developed various applications using PC Mei-chan and LINE Mei-chan, similar Mind Sensing features were implemented as separate software code within individual applications. Thus, the way of Mind Sensing was tightly coupled with each application, which increased the software complexity and decreased flexibility and scalability.

For instance, the Memory-Aid Service introduced in the previous section was tightly coupled with the activity recognition system and LINE Mei-chan. That is, the Mind Sensing could only be triggered by that specific activity recognition, and the inquiry was performed only by LINE Mei-chan. Also, all the questions were hard-coded within the program. Thus, the service lacked flexibility: it was quite difficult to add or change the Mind Sensing configuration to adapt to individual elderly people.

To cope with the limitation, we developed Mind Sensing Service. The proposed service exploits a rule-based system which allows individual users to define custom mind sensing methods. The key idea is to de-couple the definitions of the mind sensing from the surrounding systems.

Fig. 19 System architecture of mind sensing service

4.3.1 System Architecture

Figure 19 shows the system architecture of the Mind Sensing Service. In the proposed service, each mind sensing is defined by a rule, specifying which question is sent, to whom, when, triggered by what event, and with which message service. Once a rule is defined, the service automatically sends the questions to the target users and collects the answers. In the figure, we assume there are various smart home services that support the elderly person at home, including the activity recognition service, the position detection service, the change detection service, and so on. Each of these services generates and manages events. The Mind Sensing Service receives event notifications from these services and asks questions to designated users based on the pre-determined rules.

A rule specifies an enabling condition stating when the mind sensing should be executed. The condition is based on either time or an event: a time-based rule is triggered when the designated time arrives, while an event-based rule is executed when an event matching the condition is notified. Each rule is associated with a set of actions. An action corresponds to an inquiry to a user, consisting of the address of the user, a question to ask, and a message service to deliver the question. We adopt various messaging services, including SMS (short messaging service), Email, and Slack, to deliver the questions to the target user. By supporting interaction with various devices such as smartphones and PCs, we can perform the mind sensing according to the lifestyle of each individual user.

When the user responds to the question in natural-language text, the answer is recorded in the database with a timestamp. The stored conversations between a user and a chatbot are later used in services such as the Memory-Aid Service, which allows users to review, correct, classify, and search the information they recorded themselves. In addition, under appropriate access control, the conversations can be accessed by third parties such as doctors and caregivers for person-centered care.

4.3.2 Action

An action defines the configuration of a concrete mind sensing inquiry. The configuration includes three items: targets specifies the ID(s) of the target user(s) to inquire of, messageBody specifies the content of the question message, and serviceType specifies the service that delivers the message.

It is possible to specify multiple users in targets, so that a designated question can be sent to multiple users simultaneously. When an action is executed, the system looks up a name aggregation table, which maps a user ID within the proposed service to the user ID of the concrete message service specified in serviceType. After resolving the user ID, the service invokes the Web API of the message service, passing the text described in messageBody to the destination address of the user.

For example, suppose that we define the following action: act1 = {targets: [“Maeda”], messageBody: “How is your current condition?”, serviceType: “LINE”}. The act1 defines an action in which the LINE chatbot sends the message “How is your current condition?” to the LINE user ID corresponding to “Maeda”.
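The dispatch logic can be sketched as follows; the name aggregation table entries and the send_message callback are illustrative, not the actual service code.

  NAME_TABLE = {("Maeda", "LINE"): "U1234567890"}  # (internal user ID, service) -> service user ID

  def execute_action(action, send_message):
      for user in action["targets"]:
          dest = NAME_TABLE[(user, action["serviceType"])]  # resolve the service-specific user ID
          send_message(dest, action["messageBody"])         # invoke the message service Web API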

4.3.3 Time-Based Rule

A time-based rule (TBrule, for short) is a rule that repeatedly executes actions at a fixed interval within a designated period. It defines an inquiry that does not depend on any event from external services. A TBrule can be used for asking regularly scheduled questions or for sending messages at a fixed time of day. A TBrule is defined by four parameters: actions specifies a list of actions to execute, since specifies the start time, until specifies the end time, and interval specifies the repetition interval in minutes.

For example, suppose that we define the following TBrule: tbrule1 = {actions: [“act1”], since: “10:00”, until: “16:00”, interval: 60}. The tbrule1 defines a TBrule in which action act1 is executed every hour from 10:00 to 16:00 every day.

When the service is started, all TBrules in the database are loaded. Each TBrule creates a timer task that checks, every interval minutes, whether the current time is between since and until, and if so executes the designated actions.
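As a rough illustration of this loading-and-timer behavior, the sketch below periodically checks the time window. The names TBRule, inWindow, and scheduleTBRule are hypothetical, and the alignment of the first firing to the since time, which a real scheduler would need, is omitted for brevity.

// A minimal sketch of TBrule scheduling (hypothetical names).
interface TBRule {
  actions: string[];     // IDs of the actions to execute
  since: string;         // start time "HH:mm"
  until: string | null;  // end time "HH:mm", or null for no upper bound
  interval: number;      // repetition interval in minutes
}

// Checks whether the current time falls between since and until.
// Zero-padded "HH:mm" strings can be compared lexicographically.
function inWindow(now: Date, since: string, until: string | null): boolean {
  const hhmm = now.toTimeString().slice(0, 5);
  return since <= hhmm && (until === null || hhmm <= until);
}

// Each loaded TBrule spawns a timer task firing every `interval` minutes;
// actions run only while the current time lies in the designated window.
function scheduleTBRule(rule: TBRule, execute: (actionId: string) => void): void {
  setInterval(() => {
    if (inWindow(new Date(), rule.since, rule.until)) {
      rule.actions.forEach(execute);
    }
  }, rule.interval * 60 * 1000);
}

// tbrule1 from the example above: act1 every hour between 10:00 and 16:00.
scheduleTBRule(
  { actions: ["act1"], since: "10:00", until: "16:00", interval: 60 },
  (id) => console.log(`execute action ${id}`),
);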

4.3.4 Event-Based Rule

An event-based rule (EBrule, for short) is a rule that is triggered by an event notified from an external service, based on when, where, and what event is notified.

An EBrule is defined by three items: actions specifies a list of actions to execute, conditions specifies one or more conditions to be satisfied by the event, and breakTime specifies a cooldown period, in minutes, before the rule can be executed again.

When an event is notified, an EBrule is triggered only if all of its conditions are satisfied. Each condition is described from the 5W perspective (i.e., WHO, WHOM, WHAT, WHEN, WHERE), which can cover most events issued by external systems. A condition is defined by the following items:

  • cid: condition ID
  • from: the subject of the event (WHO)
  • to: the object of the event (WHOM)
  • event: the contents of the event (WHAT)
  • since: the event must have taken place at or after this time (WHEN)
  • until: the event must have taken place at or before this time (WHEN)
  • location: the location of the event (WHERE)
  • description: a textual description of the event

For example, suppose we define the following condition: con1 = {from: “Activity recognition”, to: “Maeda”, since: “06:00”, until: “10:00”, event: “Waking up”, location: “Bedroom”}. The con1 defines a condition that the activity recognition service detects user Maeda waking up in the bedroom between 06:00 and 10:00.

Next, let us define the following EBrule: ebrule1 = {actions: [“act1”], conditions: [“con1”], breakTime: 30}. The ebrule1 defines a rule in which action act1 is executed only when condition con1 is fulfilled. That is, when the activity recognition service detects that Maeda has woken up, the service sends him the question “How is your current condition?” via LINE. Once ebrule1 is executed, it will not fire again for the next 30 minutes.

To receive event notifications from external systems, the proposed service exposes a REST API with a method postEvent(from, to, event, time, location). When an external system calls the API, the service evaluates the conditions of every EBrule against the given parameter values. For example, when postEvent(“Activity recognition”, “Maeda”, “Waking up”, “10:42:24”, “Bedroom”) is executed against the above con1, the evaluation returns false because the WHEN perspective (between 06:00 and 10:00) is not met.
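The following sketch illustrates this condition evaluation. The type names and the matches function are hypothetical, and times are compared as zero-padded strings, which suffices for this illustration.

// A minimal sketch of EBrule condition matching against a notified event.
interface EventNotification {
  from: string;      // WHO: the subject of the event
  to: string;        // WHOM: the object of the event
  event: string;     // WHAT: the contents of the event
  time: string;      // WHEN: occurrence time, "HH:mm:ss"
  location: string;  // WHERE: the location of the event
}

interface Condition {
  from?: string;
  to?: string;
  event?: string;
  since?: string;    // event must have occurred at or after this time
  until?: string;    // event must have occurred at or before this time
  location?: string;
}

// A condition matches only if every specified item agrees with the event.
// Zero-padded time strings are compared lexicographically.
function matches(c: Condition, e: EventNotification): boolean {
  return (!c.from || c.from === e.from)
    && (!c.to || c.to === e.to)
    && (!c.event || c.event === e.event)
    && (!c.location || c.location === e.location)
    && (!c.since || c.since <= e.time)   // WHEN: lower bound
    && (!c.until || e.time <= c.until);  // WHEN: upper bound
}

// Example: con1 rejects a "Waking up" event at 10:42:24 because the
// WHEN perspective (06:00-10:00) is not met.
const con1: Condition = {
  from: "Activity recognition", to: "Maeda",
  since: "06:00", until: "10:00",
  event: "Waking up", location: "Bedroom",
};
console.log(matches(con1, {
  from: "Activity recognition", to: "Maeda",
  event: "Waking up", time: "10:42:24", location: "Bedroom",
})); // -> false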

4.4 Case Study

4.4.1 Collecting Mental State by LINE Chatbot

As a case study, we conducted an experiment to obtain users' mental states by sending questions with the Mind Sensing Service. The purpose of this case study is to confirm that the proposed service works as expected, and to see how effectively the system can collect users' mental states.

In the experiment, a questioner, who is a professional speech therapist, created 42 questions by referring to existing questionnaire sheets for mental illness. The questioner wanted to ask each subject three questions at a time, twice a day at 6:30 and 21:30. Since each question was somewhat technical, sending it by text message was easier to understand than sending it by voice message. Therefore, we chose the LINE chatbot as the message service.

Upon receiving a question, a subject answered it on a four-level scale: (0) not at all, (1) don't think so, (2) think so, and (3) absolutely think so. The answer was then stored in a database for later analysis. After each subject had answered all 42 questions over 7 days, we sent a review of the week, together with a questionnaire asking about the subject's actual mental state.

4.4.2 Creating Rules with Mind Sensing Service

In order to start the experiment, the questioner had to register actions and rules in the proposed system, so that three questions were sent to the designated subjects every morning and evening. Additionally, the questions had to be updated to cover all 42 questions. Therefore, we implemented a Web application that allows the questioner to easily create and update actions and rules within a Web browser. Figure 20 shows a snapshot of the Web application, displaying the list of actions and rules registered for each user. With the application, the questioner can easily register, update, and delete them.

First, we set the user information of the subjects. The subjects were 6 people in their 20s–60s, and we registered their user IDs and LINE account information. Because the content of the questions was somewhat technical and easier to convey by text message, we adopted the LINE application as the message service for sending the questions.

Second, we set actions to send questions to the subjects. In this experiment, we send four messages at a time, twice a day, at specific times in the morning and evening. We accordingly register 8 actions: MorningAction0–3 and EveningAction0–3. MorningAction0 and EveningAction0 are greeting messages that start an inquiry. MorningAction1–3 and EveningAction1–3 define the three concrete questions asked in the morning (or evening, respectively) inquiry. For example, MorningAction1 is described as follows:

MorningAction1 = {targets: [“maeda”, “yasuda”, ..., “nakamura”], messageBody: “[Question #1] Do you think you feel satisfied with your daily life?”, serviceType: “LINE”}.

Finally, we set two TBrules to execute the actions: MorningRule and EveningRule. In both rules, the interval is set to 1440 minutes, so that MorningRule and EveningRule are each executed exactly once a day. Thus, each target subject receives three questions following a greeting message every morning at 6:30, as well as every evening at 21:30. For example, MorningRule is described as follows: MorningRule = {actions: [“MorningAction0”, “MorningAction1”, “MorningAction2”, “MorningAction3”], since: “06:30”, until: null, interval: 1440}.

Fig. 20 Web application managing actions and rules

4.4.3 Result and Feedback

Figure 21 shows the LINE chatbot interacting with a subject. The chatbot sends messages asking about the mental state via MorningAction0–3 at 6:30, as defined in MorningRule. Subjects could respond to these questions at any time. In this way, the proposed service realizes rule-based questioning as a basic service for Mind Sensing.

Fig. 21 Interaction between a subject and the LINE chatbot

In this case study, the questioner, who created the questions from existing questionnaire sheets for mental illness, set the actions and rules through the GUI on a PC. She said this rule-based talking service was useful and more efficient than sending the messages manually. On the other hand, she pointed out that the GUI lacked usability.

5 Design and Evaluation of Mind Monitoring Service

5.1 Monitoring Internal State for Long Term

Our next challenge is how to monitor the physical and mental health of in-home elderly people over the long term through Mind Sensing. In general, it is not easy to obtain the physical and mental health condition at home by external observation with non-intrusive sensors. Thus, the proposed Mind Sensing with the agent is a promising approach. However, how to observe and assess the health condition by Mind Sensing is still an open question.

According to the World Health Organization (WHO), the concept of health is defined as follows [41]:

Health is a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity.

By this definition, health is a state characterized by three aspects: physical, mental, and social.

As their physical and cognitive functions decline, elderly people easily develop not only physical illness but also mental illness. Typical mental illnesses that elderly people tend to develop include depression [42] and anxiety disorder [43]. A major factor causing such mental illnesses lies in experiences of loss, including the deterioration of physical ability due to aging, the loss of social roles upon retirement, and the bereavement of familiar people.

In clinical settings, psychological assessment tools, including tests, scales, and questionnaires, are used to quickly assess a person's mental state. Representative tools include GDS-15 (Geriatric Depression Scale 15) [44], a depression scale for the elderly; PHQ-9 (Patient Health Questionnaire-9) [45], for assessing general depression; GAD-7 (Generalized Anxiety Disorder-7) [45], for measuring the degree of anxiety disorder; and GHQ (General Health Questionnaire) [46], for assessing neurosis. However, it is unrealistic for in-home elderly people to use these tests regularly at home.

5.2 Concept of Mind Monitoring Service

Exploiting the Mind Sensing Service with LINE Mei-chan, we have developed a new service named Mind Monitoring Service [47,48,49]. The service aims to visualize and monitor the mental states of elderly people at home through continuous interaction with LINE Mei-chan. The service also provides appropriate support based on the acquired mental state data to encourage the user's self-reflection and spontaneous self-care of mental health.

The concept of the Mind Monitoring Service is to grasp the mental states of elderly people at home, which have been difficult to obtain so far, and to provide appropriate support according to those states. For this, we utilize LINE Mei-chan to establish a continuous interaction platform with elderly people at home. Moreover, we develop specific questions to acquire the mental states of the elderly person, and introduce scoring methods for evaluating the answers and visualizing the mental states numerically.

5.3 System Architecture

Figure 22 shows the overall system architecture of the Mind Monitoring Service. As seen in the figure, the proposed service consists of three methods.

Fig. 22 System architecture of the Mind Monitoring Service


M1: Interaction with LINE Mei-chan using Mind Sensing Service:

We utilize the Mind Sensing Service (see Sect. 4.3) and let LINE Mei-chan ask questions to an elderly person every day. Instead of human caregivers, the chatbot listens to and records the internal mind of the elderly person continuously.

M2: Inquiry method specialized for acquisition of mental state:

We develop inquiries specific to acquiring the mental states of the elderly person. The inquiries are stored in a database and then encoded as actions and rules of the Mind Sensing Service.

M3: Self-care assistance and feedback by monitoring mental state:

Every time the elderly person answers a question, the answer is stored in a database with a timestamp. At appropriate intervals, the service then analyzes the answers and evaluates his/her mental state. Based on the result, the service produces feedback including further questions and advice.

5.4 M1: Interaction with LINE Mei-Chan Using Mind Sensing Service

Using the Mind Sensing Service, we let LINE Mei-chan ask questions to elderly people, triggered by time or external events. Since the internal state should be obtained within the daily routine, we send questions to elderly people at a fixed time every day, applying the time-based rule of the Mind Sensing Service. The time for sending the questions should be set in consideration of the person's daily rhythm and lifestyle. It is also necessary to vary the questions every day and to send encouraging messages, so that the person does not get tired and quit answering.

To make the interaction with LINE Mei-chan easier, we extensively use the LINE template message, in which we embed a question and a list of answer choices within a pre-defined layout. Figure 23a shows a screenshot of a template message. In the figure, the question is written in the middle part of the template message, and the answer choices are given as buttons at the bottom. To answer the question, the user only has to push one of the two buttons. Since answering this way requires no text entry, elderly users can respond to questions easily.
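For readers unfamiliar with LINE template messages, the sketch below shows how such a two-button question could be pushed through the public LINE Messaging API using its confirm template. The chapter does not show the actual payload, so this is only an assumed shape; the channel access token and user ID are placeholders, and error handling is omitted.

// A minimal sketch of pushing a two-button question via the LINE
// Messaging API "confirm" template (placeholders, no error handling).
const CHANNEL_ACCESS_TOKEN = "<channel-access-token>";

async function pushYesNoQuestion(lineUserId: string, question: string): Promise<void> {
  await fetch("https://api.line.me/v2/bot/message/push", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${CHANNEL_ACCESS_TOKEN}`,
    },
    body: JSON.stringify({
      to: lineUserId,
      messages: [{
        type: "template",
        altText: question, // fallback text for devices without template support
        template: {
          type: "confirm", // a layout with exactly two buttons
          text: question,
          actions: [
            { type: "message", label: "Yes", text: "Yes" },
            { type: "message", label: "No", text: "No" },
          ],
        },
      }],
    }),
  });
}

// pushYesNoQuestion("<LINE user ID>", "Have you slept well in the past week?");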

We also treat the user's answer as an event, and command LINE Mei-chan to send a reply message using an event-based rule. For instance, when the template message provides two buttons meaning "yes" or "no", the user's answer can be classified as either positive or negative. Therefore, by defining two kinds of replies in advance, we can make LINE Mei-chan send a different reply depending on the answer.
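For instance, the pair of event-based rules for the two buttons could look as follows. The rule contents and action names are hypothetical; the actual replies are the pre-defined messages described in Sect. 5.5.

// A minimal sketch: one EBrule per answer button (hypothetical contents).
// Each button press is notified as an event whose contents are "Yes" or "No".
const yesRule = { actions: ["sendPositiveReply"], conditions: [{ event: "Yes" }], breakTime: 0 };
const noRule = { actions: ["sendNegativeReply"], conditions: [{ event: "No" }], breakTime: 0 };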

Figure 23b shows the actual interaction between the chatbot and the user. In Fig. 23b, the chatbot replies to the user's positive answer to the question "Have you slept well in the past week?" After acknowledging the user's good sleep condition, the chatbot sends an additional question asking about any concerns regarding sleep. The user can input any text message and externalize his or her mind in words.

Design of these questions and reply messages will be described in the next section.

Fig. 23 Interaction with LINE Mei-chan in the Mind Monitoring Service (a a template message with answer buttons; b conversation between the user and LINE Mei-chan)

5.5 M2: Inquiry Method Specialized for Acquisition of Mental State

5.5.1 Monitoring Internal States from Three Aspects

Following the definition of health in Sect. 5.1, we monitor the internal state of elderly people from the following three perspectives: Physicality, Mentality, and Sociality.

  • Physicality corresponds to the physical aspect of health. It targets physical symptoms that can be explained by objective factors. We assess health according to the presence or absence of physical symptoms.

  • Mentality corresponds to the mental aspect of health. It covers subjective feelings such as emotions and moods. We characterize mental health by subjective assessment.

  • Sociality corresponds to the social aspect of health. It covers self-evaluations and social behaviors, such as happiness, self-esteem, and motivation. We try to understand health from the social aspect.

5.5.2 Preliminary Experiment

In our preliminary experiment [48], we developed questions by referring to the psychological assessment tools (see Sect. 5.1). Specifically, based on GDS-15, PHQ-9, GAD-7, and GHQ60, we created 42 questions in total and classified them into the above three categories. Table 6 shows the 21 questions assessing Mentality. Each question was to be answered on a 4-level scale: "Yes, I really think so", "Yes, I might think so", "No, I might not think so", and "No, I definitely do not think so". The mind sensing was performed twice a day, in the morning and in the evening, and each time three of the 42 questions were sent to each elderly person by LINE Mei-chan.

Table 6 Questions for assessing mentality, created in the preliminary experiment

In fact, however, the preliminary experiment did not work well. The questions were so technical that the elderly people could not fully understand their meaning and intention. Answering three technical questions twice a day was also too demanding and became a burden for the subjects.

5.5.3 Simplifying Questions and Interactions

With the help of experts, we re-drafted the questionnaire into the seven questions shown in Table 7, so that elderly people could answer them easily. The seven questions are intended to grasp the approximate state over the past week from seven fundamental aspects of daily living: Sleep, Health, Emotion, Memory, Psychology, Motivation, and Socialization. Each question asks about the state of the past week, and each elderly person answers it simply with "Yes" or "No", instead of on the 4-level scale. "Survey item" in Table 7 indicates what the question investigates. For example, the question "Have you slept well in the past week?" investigates the condition of sleeping. "Category" shows the class among the three perspectives.

Table 7 New seven questions

We also configured the time-based rule so that LINE Mei-chan sends only one question per day to the elderly person. The time of message delivery is determined according to the person's life rhythm. Since one question is sent each day, all seven questions are covered in a week; in the following week, LINE Mei-chan starts again from the first question.
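A minimal sketch of this weekly rotation follows. The surveyItemFor function and the use of the experiment's start date are our own illustrative assumptions; the question texts themselves are those of Table 7.

// A minimal sketch of the weekly question rotation (hypothetical names).
// The seven survey items of Table 7 are cycled one per day, so each item
// recurs on the same weekday every week.
const SURVEY_ITEMS = [
  "Sleep", "Health", "Emotion", "Memory",
  "Psychology", "Motivation", "Socialization",
] as const;

// Maps a date to the survey item asked on that day, counting whole days
// elapsed since a fixed start date.
function surveyItemFor(date: Date, start: Date): string {
  const days = Math.floor((date.getTime() - start.getTime()) / 86_400_000);
  return SURVEY_ITEMS[((days % 7) + 7) % 7];
}

// Example: counting from the start of the operation period.
console.log(surveyItemFor(new Date("2019-11-08"), new Date("2019-11-01"))); // "Sleep"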

To keep elderly people motivated to answer the questions, we let LINE Mei-chan send reply messages as well as LINE stamps. Table 8 shows an example of the pre-defined reply messages. The "Positive Reply" and "Negative Reply" are sent when the user answers the question positively and negatively, respectively. Each reply message is an open question asking why the user selected that choice, helping externalize any concerns related to the question.

Table 8 Reply messages according to the user’s choice

As shown in Fig. 23b, when the elderly person replies with details to the additional open question, LINE Mei-chan sends a LINE stamp back. We implemented these interactions with LINE reply messages and the event-based rules of the Mind Sensing Service, regarding each answer as an event.

5.6 M3: Self-Care Assistance and Feedback by Monitoring Mental State

Based on the answers collected from each elderly person, the Mind Monitoring Service evaluates his/her mental state. According to the result, the service produces feedback including further questions and advice. The service also provides a Web application with which the user can review past answers.

5.6.1 Quantifying Answers for Assessment of Mental State

Since each elderly person answers each of the seven questions with "Yes" or "No", the mental state of the week with respect to a category can be assessed as positive or negative. We also take into account how the state of the week changed from that of the previous week. If the state remained negative, the situation is bad; if it changed from negative to positive, it is a good sign but still needs to be observed. Finally, we should take the answer to the open question into account. Based on these considerations, we have proposed a method that quantifies the mental state by the following three kinds of scores:

  (i) Score_answer: the score directly obtained from the answer. We assign 1 point to a positive answer and −1 point to a negative answer.

  (ii) Score_observation: the score obtained by observing how the answer changed from the previous week. When the user answered positively in the previous week, 1 point is assigned if the answer remains positive in the target week, and −0.5 points if it turns negative. Similarly, when the user answered negatively in the previous week, −1 point is assigned if the answer remains negative, and 0.5 points if it turns positive.

  (iii) Score_sentiment: the score obtained by sentiment analysis of the user's answer to the additional open question. Using the Microsoft Azure Text Analytics API [50], the service calculates a sentiment value from the given text. The score is normalized to take a value from −1 to 1, where 1 means the most positive.

Finally, we calculate the total score of the answer by the weighted sum of the above three scores.

$$\begin{aligned} S_{{\text {total}}}= w_1 \cdot S_{{\text {answer}}} + w_2 \cdot S_{{\text {observation}}} + w_3 \cdot S_{{\text {sentiment}}} \end{aligned}$$

Currently, we calculate the total score as the average of the three scores, where \(w_1 = w_2 = w_3 = \frac{1}{3}\).
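The scoring rules above can be summarized in a short sketch. The function names are hypothetical, and the sentiment score, which the service actually obtains via the Azure Text Analytics API, is stubbed out here.

// A minimal sketch of the three scores and the weighted total.
type Answer = "positive" | "negative";

function scoreAnswer(a: Answer): number {
  return a === "positive" ? 1 : -1;
}

function scoreObservation(previous: Answer, current: Answer): number {
  if (previous === "positive") return current === "positive" ? 1 : -0.5;
  return current === "negative" ? -1 : 0.5;
}

// Placeholder for sentiment analysis of the open-question answer,
// normalized to [-1, 1] (1 = most positive).
function scoreSentiment(_openAnswer: string): number {
  return 0; // replace with a call to a sentiment analysis service
}

function totalScore(prev: Answer, cur: Answer, openAnswer: string,
                    w = [1 / 3, 1 / 3, 1 / 3]): number {
  return w[0] * scoreAnswer(cur)
       + w[1] * scoreObservation(prev, cur)
       + w[2] * scoreSentiment(openAnswer);
}

// Example: an answer that was positive last week but turned negative,
// with neutral open-question text:
// 1/3 * (-1) + 1/3 * (-0.5) + 1/3 * 0 = -0.5
console.log(totalScore("positive", "negative", "")); // -> -0.5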

5.6.2 Generating Weekly Feedback for Spontaneous Self-Care

Based on the score of the mental state, Mind Monitoring Service generates a weekly feedback to promote the user’s self-reflection and spontaneous mental health care. This feedback generation is intended to implement an instance of (S1) Self-aid support service in our conceptual architecture (see Fig. 1).

In the feedback, the service first selects the question whose score is the worst in the week. Second, the service creates a concrete feedback message to be sent by LINE Mei-chan. To generate natural sentences, we structure a feedback message in four paragraphs: Greeting, Reflection, Advice, and Conclusion.

More specifically, in the Greeting paragraph, the chatbot greets the user according to the season or climate. In the Reflection paragraph, the chatbot shows how the user answered the selected question, encouraging the user to reflect on him- or herself. The Advice paragraph gives the user useful information about the topic of the question; we refer to the information on "Kenko-Choju Net" [51], which provides extensive information about health and longevity for Japanese elderly people. Lastly, in the Conclusion paragraph, the chatbot gives a closing remark, such as "Let's do our best again this week."

Figure 24 shows an example of a feedback message. In this feedback, the question about psychology was picked up. Since the feedback was created in June, the chatbot first mentioned the June climate. The chatbot then indicated that the user had been feeling anxious, and suggested having her family or friends listen to her anxiety. The sentences of each paragraph are pre-defined, and the service combines these paragraphs to compose the complete feedback message.
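As an illustration of this template-based composition, the sketch below joins four pre-defined paragraphs into one message. All names and example sentences are hypothetical, loosely mirroring the feedback in Fig. 24.

// A minimal sketch of composing the four-paragraph weekly feedback
// from pre-defined sentences (hypothetical names and texts).
interface FeedbackParts {
  greeting: string;    // seasonal greeting
  reflection: string;  // how the user answered the worst-scored question
  advice: string;      // information related to the question's topic
  conclusion: string;  // closing remark
}

function composeFeedback(p: FeedbackParts): string {
  return [p.greeting, p.reflection, p.advice, p.conclusion].join("\n\n");
}

const message = composeFeedback({
  greeting: "It is getting rainy in June; please take care of yourself.",
  reflection: "This week, you answered that you had been feeling anxious.",
  advice: "Having your family or friends listen may ease your anxiety.",
  conclusion: "Let's do our best again this week.",
});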

Fig. 24 Example of a weekly feedback message

5.6.3 Developing Web Application for Visualization

To realize effective mind monitoring, we have developed a Web application that visualizes the scores of the mental states. Using the application, the elderly person can review his/her mental states. Furthermore, with the consent of the elderly person, remote supporters (family members, caregivers, doctors, etc.) can monitor the target person's mental states through the data.

Fig. 25 Web application visualizing the mental state (a a message with the URL of the application; b weekly scores per survey item; c time-series scores of Physicality, Mentality, and Sociality)

Figure 25 shows the developed application. As shown in the left figure, LINE Mei-chan sends the URL of the application after the weekly feedback. When the elderly person taps the URL, the application shows the weekly score for each survey item, as shown in the middle of Fig. 25. As shown in the right figure, the application can also display the time-series scores with respect to Physicality, Mentality, and Sociality. Thus, the elderly person and external supporters can conduct long-term monitoring of the internal state.

5.7 Operating Mind Monitoring Service in Actual Households

5.7.1 Long-term Monitoring Experiment

The Mind Monitoring Service has been deployed in actual households and operated for long-term monitoring of their internal states. We recruited 8 elderly subjects (4 men and 4 women in their 50s–80s) who were able to use the LINE application. The operation period was from November 1, 2019 to January 31, 2021, one year and two months (14 months) in total. The experiment was approved by the research ethics committee of the Graduate School of System Informatics, Kobe University (No. R01-02). Written informed consent was obtained from the subjects for publication and accompanying images.

Two elderly subjects dropped out of the experiment within a few months. One (a male in his 70s) found it difficult to use the service because he did not use his smartphone frequently in daily life. The other (a male in his 70s) had used the service for the first three months, but eventually stopped because his asthma worsened, making it difficult for him to continue answering the questions from LINE Mei-chan every day.

The remaining 6 subjects kept using the Mind Monitoring Service. Table 9 shows the response rate of each subject, i.e., the ratio of the number of responses (answers) the subject made to the total number of questions from LINE Mei-chan during the 14 months.

Table 9 Total response rates of the elderly subjects

From Table 9, we can see that four of the six elderly subjects responded to more than 90% of the questions from the chatbot. For subjects C and E, the overall response rate was low, not because they stopped using the service, but simply because they responded less frequently. In other words, although we could not obtain a high frequency of responses from subjects C and E, we were still able to get them to answer the questions periodically.

5.7.2 Analyzing Time-Series Data in Detail

Through the 14 months of operation, the Mind Monitoring Service collected a large amount of mental state data. Figures 26 and 27 show graphs of the mental state scores of two elderly subjects, subject A (male in his 70s) and subject D (female in her 70s), in 2020. In each graph, the vertical axis represents the average score and the horizontal axis represents the month. The blue, yellow, and green lines represent the scores of Physicality, Mentality, and Sociality, respectively.

Fig. 26 Transition of scores of subject A

Fig. 27 Transition of scores of subject D

In Fig. 26, we can see that subject A's scores from each perspective are generally positive, meaning that his physical, mental, and social health remained stable throughout the year. However, his Physicality score dropped sharply in the middle of May. When we asked subject A about the reason, he said that he had hurt his leg by walking too much at that time. Afterward, thanks to treatment and rehabilitation, his leg finally started to get better around July. We can also see that his Sociality score dropped in early March. This was because the spread of the coronavirus (COVID-19) reduced his opportunities to go out. For a while after that, he stopped exercising at the gym, and his Sociality score continued to stagnate. However, he started going to the gym again around October, and his Sociality score began to increase.

In Fig. 27, it can be seen that subject D's Mentality and Sociality scores were much lower than her Physicality score. When we asked subject D about her situation in 2020, she told us that her elder sister had passed away in January and that she had been experiencing a terrible sense of loss. This sense of loss continued until around October, and her Mentality went through a series of manic-depressive cycles. She also had a lifestyle in which her days and nights were reversed. In contrast, her Physicality score tended to be relatively positive, although it suddenly decreased around July. In fact, at that time, she was suffering from dizziness caused by otolith detachment. Later, as she learned how to live with her illness, her Physicality score gradually recovered.

5.7.3 Internal Mind Externalized as Words

During the experiment, elderly people sometimes answered the additional open question in their own words. We found that these words characterized well the internal mind of each elderly person, which could never be captured by conventional sensors. It seems that the elderly people externalized their minds through conversations with LINE Mei-chan.

Fig. 28 A part of the conversation log of subject D (in the original text)

Figure 28 shows a part of the conversation log of subject D, listed from newest to oldest. Reading the log from the bottom: she answered "fine" to question No. 4 on Physicality, but recognized that she got tired easily because of her age. Regarding question No. 5 about memory, she was anxious about forgetting what she ate, promised, or shopped for. As for question No. 6 about socialization, she answered negatively; she missed her friends as she grew older and they passed away. Finally, she answered negatively to question No. 7 about anxiety because of COVID-19.

Currently, we evaluate these words only by the simple sentiment analysis described in Sect. 5.6. More sophisticated analysis to detect severe situations should be considered, which is left for future work.

6 Conclusion

In this chapter, we have introduced our research achievements in sensing technologies for monitoring in-home elderly people. In the first half of the chapter, we presented technologies for monitoring the daily living of elderly people. Equipped with seven kinds of environmental sensors, the developed Autonomous Sensor Box enables non-intrusive environmental sensing 24 hours a day, 365 days a year, with minimal maintenance effort at the edge side. The collected time-series data allow the elderly person as well as remote supporters to infer how the person is living. We also presented a method of automatic activity recognition based on the environmental sensing data. Using the time-series sensor values labeled by the LifeLogger tool, we showed that supervised machine learning was able to recognize the seven kinds of daily activities with a certain degree of accuracy, and that the accuracy improved when location information collected by BLE beacons was used together with the environmental sensing data.

In the latter half of the chapter, we presented technologies for monitoring the internal minds of in-home elderly people. The proposed concept of Mind Sensing (Kokoro Sensing) aims to externalize internal minds as words through conversation with virtual agents. By wrapping the sophisticated MMDAgent components with Web services, we implemented an animated virtual agent, PC Mei-chan, who talks to the in-home elderly person to obtain the internal state via voice. We also implemented a LINE chatbot, LINE Mei-chan, which provides asynchronous text communication for obtaining the internal state. To achieve efficient and flexible mind sensing, we introduced the Mind Sensing Service, with which users can define custom mind sensing methods by time-based and event-based rules. Finally, we presented the Mind Monitoring Service for long-term monitoring of internal minds. The long-term experiment showed that the scores with respect to Physicality, Mentality, and Sociality characterized well the situation of the target elderly person, and that internal minds were externalized as words in the answers to the open questions.

Our research and development for assisting elderly people is still ongoing, and there are many other achievements that could not be introduced in this chapter (e.g., [52,53,54,55,56]). Although many challenges remain, we believe that the idea of using smart technologies and big data for person-centered elderly assistance and care is crucial in this super-aging society. The integration with the latest AI technologies, such as deep learning and large language models (LLMs), is also promising future work.