1 Introduction

In 2018, Korea became an aging society, as defined by the United Nations, and the number of senior citizens living alone is increasing rapidly. However, there are not enough workers to monitor senior citizens living alone and respond to possible risks. Meanwhile, research that analyzes human behavior and applies the results to fields such as marketing and medicine has recently been conducted [1,2,3]. Fourth industrial revolution technologies, such as artificial intelligence and the internet of things, contribute to analyzing human behavior [4,5,6]. Research is also emerging that fuses these technologies with monitoring systems for the care of the elderly and of patients living alone in residential spaces [7].

However, existing studies on human behavioral analysis are mostly fragmentary, using one or two sensors to recognize a situation. A typical monitoring system attaches a biological sensor or a wearable device to the monitoring target and looks for abnormal values in the generated data, or it observes the current state of the monitoring target using closed-circuit television (CCTV) [8, 9]. Because analyzing human behavior and providing services tailored to each individual is the most efficient approach, monitoring systems in residential spaces should evolve from the existing uniform services into customized services [10].

Human behavior analysis is required at the level of everyday healthcare rather than for therapeutic purposes, and the demand for personalized services is increasing [11, 12]. Personalized models are more efficient than generalized models for providing these services. For monitoring systems, daily life in an individual’s living space varies from person to person: the beginning and end of the day differ, and each person has his or her own life pattern. Therefore, individual behavioral patterns must be analyzed to establish a monitoring system in a residential space. To derive the behavioral pattern, the monitored subject must be observed for a long time, and the subject’s current behavior must be recognized in detail. In general, however, research on finding behavioral patterns has collected data by observing subjects for a long time but has recorded the current state within subjective, uniform categories. As a result, the current status of the monitoring target is difficult to determine objectively, and accurate behavioral patterns are difficult to derive.

Assuming that human life is a process, process mining can be a useful technique for deriving human behavioral patterns [13]. Process mining techniques can derive process models from a database of the monitored subjects’ daily lives and detect anomalies by comparing the derived process models with running processes [14,15,16]. Recent studies have derived process models and detected anomalies in real time rather than deriving a process model from a database recorded over a certain period [17,18,19]. However, applying process mining techniques to human behavior in their current form is challenging because anomaly detection with a process model assumes that every instance of the process begins and ends in the same way. In human behavior, both the first and last recorded actions differ and daily life varies greatly, so a very complex spaghetti-type process model is derived [20], and comparing such a model with the current process consumes many resources. Process mining also detects anomalies with a token replay method, which sends a single token through the derived process and locates the point at which an abnormality occurs in data collected over a certain period. This method therefore cannot detect abnormalities in real time.

Therefore, this study proposes a human behavioral pattern analysis-based anomaly detection system (HBPAADS) that classifies current human behavior for residential-space monitoring systems, derives living patterns from the classified behavior, and uses those patterns to detect abnormalities. To implement the HBPAADS, a deep learning image methodology is used to build the behavior classifier, the classified behavior is collected and analyzed, and multifrequency subsequence patterns are derived. Abnormality is judged by comparing the derived subsequence patterns with the monitored subject’s current behavior.

The strength of the proposed system is that the behavior classifier based on deep learning can analyze behavior in the human living space in more detail. In addition, the system categorizes the monitored subject’s behaviors observed over a certain period rather than relying on the existing uniform behavior classification, which satisfies the need for personalized monitoring services. For anomaly detection, the system compares the monitoring target’s status collected in real time with multifrequency subsequence patterns rather than with life patterns derived through process mining. Therefore, anomalies can be detected even when the monitored subject’s behavior is complex, unlike process mining techniques, which compare formalized processes with the same start and end. The proposed system also shows that methodologies established in other areas, such as manufacturing and biomaterials, can be applied to human monitoring to provide customized services and detect anomalies automatically, which means that the system can be expanded with further methodologies or fourth industrial revolution techniques from other areas.

The remainder of this paper is organized as follows. Section 2 reviews the existing research on residential space monitoring systems and on anomaly detection in monitoring systems. Section 3 describes the proposed HBPAADS architecture, each module, the implementation method, and the algorithms. Section 4 derives patterns to verify the HBPAADS and describes the experimental results concerning the accuracy of the anomaly detection. Finally, Sect. 5 summarizes this study, its contributions, and future work.

2 Related work

A flow of data that departs from the normal data pattern is called an anomaly [21]. The basic concept of anomaly detection is to predict the future state from the current state and to find cases that deviate when compared with a previously derived data pattern. Common anomaly detection methods use critical value (threshold) settings and the support vector machine, among others [22, 23]. Applying these methods, studies in various fields have observed ordinary human life, derived patterns, and compared them with the current state to find anomalies [24, 25]. In addition, research has been conducted to detect emergencies by finding normal data patterns in residential space monitoring systems and identifying data that deviate from these patterns [26].

2.1 Anomaly detection in the living space

Anomaly detection in a monitoring system analyzes human behavior and compares the current status of the monitoring target in real time with patterns or with a critical data value. In early monitoring systems, abnormality detection was performed by collecting data from a biosignal sensor. Jang et al. proposed a system that measures the pulse rate using a wristwatch and a ring-shaped sensor and monitors it remotely [27]. If an abnormality appears in the pulse during remote monitoring, the situation is judged to be an emergency, and an alarm is raised. Lical et al. proposed an anomaly detection method based on changes in frequency, in which a radio frequency identification (RFID) reader is installed in the living space and the monitoring target wears an RFID tag [28]. The method detects an anomaly when the frequency generated by the monitored subject’s movement deviates from the usual frequency.

Subsequently, studies have derived patterns by focusing on the monitoring target’s behavior rather than on biosignal data, using methods based on critical values and data classification. Carlos et al. derived patterns from the movement path of the monitoring subject in the monitored space [29]. The monitoring targets wore bracelet-type beacons, and their location data were collected. In the experiment, data were collected for 1 month to derive the patterns and test them. Virone derived patterns by focusing on specific activities at specific times of day, on a 24-h basis [30]. A motion sensor was installed in the residential area where the monitoring target was active, the residence time was collected, and a pattern was derived. An experiment was conducted to determine the degree of abnormality by randomly comparing the derived data with the daily data.

2.2 Human behavioral pattern analysis in the monitoring system

In studies analyzing human behavioral patterns in monitoring systems, the monitored person’s behavior was recorded manually or through sensors. Then, patterns were derived using process mining and data mining techniques.

Timo et al. collected the values of the acceleration, orientation, and GPS sensors using such equipment as smartphones and smartwatches to represent human behavior as a process model [31]. Using the collected data and the predefined device location, environment, and posture, the state of the current monitoring target was recognized. In addition, a predefined activity was recorded by the monitoring target and was used to derive a process model using a fuzzy model, heuristic miner, and inductive miner with sensor values. Finally, the process of deriving a process model through three use cases was presented.

Seki proposed a detection method using fuzzy variable representation based on Bayesian networks after constructing a monitoring system using omnidirectional cameras [32]. An experiment was conducted to verify the proposed system, deriving a pattern using the dataset of the older adults’ average behavioral time for a week, which was investigated by the Statistics Bureau in the Japanese Ministry of Internal Affairs and Communications in 2006.

Tasi et al. analyzed the behavior after collecting the data from the monitoring target using the acceleration sensor attached to the smartwatch [33]. The day of the week that the action occurred and the duration of the action were collected to determine a pattern, and after modeling the distribution of actions by the day of the week, the association between the contexts was analyzed. Afterward, to verify the proposed system, a pattern was derived using CASAS smart home data, and the accuracy of the prediction for the next action was calculated [34].

Song et al. defined the movement and action for an action and recognized it using a three-axis acceleration sensor [35]. Next, using the proposed framework, the probability of moving from the current action to the next action was calculated to derive the action pattern. In addition, because the proposed framework is based on probability, it is possible to predict the next action in real time by linking a mobile phone with a cloud system with a high computational speed.

3 System architecture

This section describes the proposed HBPAADS architecture and the main algorithms and implementation methods of the modules that comprise the architecture.

3.1 Architecture of HBPAADS

The HBPAADS consists of five major modules, as illustrated in Fig. 1. The modules are the residential space monitoring system, behavior classifier, pattern generator, anomaly calculator, and warning module.

Fig. 1 Architecture of the human behavioral pattern analysis-based anomaly detection system

First, the residential space monitoring system collects image information generated in the monitored residential activity spaces and delivers it to the behavior classifier. A webcam, CCTV, open-source hardware, or another device is used to collect live video of the monitored target. The behavior classifier analyzes the monitored subject’s current behavior from the collected images and stores it in the database together with the time. The pattern generator derives subsequence patterns from the behavioral sequences in the database and stores them. The anomaly calculator compares the derived subsequence patterns with the current behavior to detect abnormal conditions. Finally, when the anomaly calculator detects an abnormal condition, the warning module informs the guardian of the abnormal situation through an SMS alarm.

3.1.1 Residential space monitoring system

The residential space monitoring system captures the monitored subject’s activities in residential and living spaces using a webcam, CCTV, open-source hardware, or another device and delivers the image information to the behavior classifier. Figure 2 displays an example of the residential space monitoring system transmitting a video frame after capturing an image from a camera device.

Fig. 2 Example of the residential space monitoring system process

Using OpenCV and Python’s picamera library, each frame is sent to the socket server [36, 37]. The frame sent to the socket server is used to classify the current monitored subject’s behavior in the behavior classifier.

3.1.2 Behavior classifier

The behavior classifier classifies the monitored subject’s current behavior in the image transmitted from the residential space monitoring system. It uses a classifier trained with an object detection algorithm based on the deep learning image methodology. To train the behavior classifier, images of the defined behaviors are collected. The images are collected from the web using tools such as the Fatkun Program and the Extreme Picture Finder Program, which download images corresponding to given keywords [38, 39]. The collected images are preprocessed for learning. Image preprocessing includes removing unnecessary duplicate images and labeling the data, that is, designating the region of interest (ROI) to learn within each image and creating metadata containing the specified ROI and class information.

The behavior classifier is trained with one of the deep learning object detection models. The mask regional convolutional neural network (Mask R-CNN), you only look once (YOLO), single-shot detector (SSD), and faster regional CNN (faster R-CNN) are primarily used [40,41,42,43]. After training, the behavior classifier determines the monitoring target’s current behavior for each input image frame. Then, it stores the current monitored behavior in the database with the time. Figure 3 presents the input and output data of the behavior classifier.

Fig. 3 Example of input and output data for the behavior classifier

The pretrained behavior classifier takes the video frame transmitted from the residential space monitoring system as input data and classifies the monitoring target’s current behavior. The classified behavior is then output together with the time as a (time, event) tuple.
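As a minimal sketch of how such (time, event) tuples could be persisted, the example below uses Python’s built-in sqlite3 module as a stand-in for the database. The table name `behavior_log`, the column names, and the event labels are hypothetical and chosen only for illustration; they are not taken from the implementation described in this paper.

```python
import sqlite3
from datetime import datetime


def open_event_store(path=":memory:"):
    """Open (or create) a store for (time, event) tuples. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS behavior_log ("
        "observed_at TEXT NOT NULL, event TEXT NOT NULL)"
    )
    return conn


def store_event(conn, observed_at, event):
    """Save one classified behavior together with its observation time."""
    conn.execute(
        "INSERT INTO behavior_log (observed_at, event) VALUES (?, ?)",
        (observed_at.isoformat(timespec="seconds"), event),
    )
    conn.commit()


def load_events(conn):
    """Return all stored (time, event) tuples in chronological order."""
    rows = conn.execute(
        "SELECT observed_at, event FROM behavior_log ORDER BY observed_at"
    )
    return [(datetime.fromisoformat(t), e) for t, e in rows]
```

Storing the time as an ISO 8601 string with a fixed precision keeps the textual sort order identical to the chronological order, which is what the later pattern derivation step relies on.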

3.1.3 Pattern generator

Collecting monitored behaviors over a period allows the subsequences of monitored behaviors to be derived from the database. After sorting the monitoring target’s behavior in the database in chronological order and extracting only the information about the behavior, subsequence patterns are derived. Generally, subsequence patterns can be derived using sequential pattern algorithms and time-series deep learning algorithms.

Recent studies have derived patterns using the long short-term memory (LSTM) algorithm, a time-series deep learning algorithm of the recurrent neural network family [44, 45]. The LSTM algorithm is generally used to predict the next event from the derived pattern. In this study, the LSTM is not used because the goal is to determine the degree of abnormality of the entire collected situation, not to predict the next event. In addition, deriving a pattern by training the LSTM on a large volume of data takes a very long time. Therefore, in this study, patterns were derived using a sequential pattern algorithm. A sequential pattern is a pattern that appears frequently across the data, that is, a multifrequency pattern that includes the concept of time [46]. Sequential pattern algorithms are divided into the Apriori-based family, represented by the generalized sequential pattern (GSP) algorithm, and the pattern-growth family, represented by the PrefixSpan algorithm [47, 48]. Sequential pattern algorithms typically generate candidate patterns and then derive patterns from them, but the PrefixSpan algorithm skips candidate generation to improve speed. Therefore, this study derives subsequence patterns using the PrefixSpan algorithm. Figure 4 depicts an example of the pattern output from the collected behavior sequence.

Fig. 4 Example of input and output data for the pattern generator

When the (time, event) dataset from the behavior classifier is delivered, the data are preprocessed to generate a sequence of daily events after sorting them in the order of the time when the events occurred. The preprocessed data are used as input data for the pattern generator. The pattern generator derives patterns using preprocessed data and sequential pattern algorithms, and the derived pattern is output as a pattern ID and subsequence set.
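The preprocessing step described above can be sketched as follows. The function name `build_daily_sequences` is hypothetical, and grouping by calendar day is one reading of "a sequence of daily events"; collapsing consecutive repetitions of the same event into one activity follows the rule stated in Sect. 4.2.

```python
from collections import defaultdict
from datetime import datetime
from itertools import groupby


def build_daily_sequences(records):
    """Turn unordered (time, event) tuples into one event sequence per day.

    Records are sorted by time, grouped by calendar day, and runs of the
    same event are collapsed so that a continuous activity appears once.
    """
    by_day = defaultdict(list)
    for observed_at, event in sorted(records, key=lambda record: record[0]):
        by_day[observed_at.date()].append(event)
    return {
        day: [event for event, _run in groupby(events)]
        for day, events in by_day.items()
    }
```

The values of the returned dictionary can be passed to a sequential pattern algorithm as its input sequences. If hourly sequences are needed instead, the grouping key could be changed from the date to the date and hour.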

3.1.4 Anomaly calculator

The anomaly calculator uses a sequence alignment algorithm to compare the monitored subject’s behavior in the database with the subsequence patterns derived by the pattern generator and thereby determine the degree of abnormality of the current state [49]. Sequence alignment has been widely used in bioinformatics to analyze the homology of sequences, such as deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). Sequence alignment is largely divided into global alignment, which compares two sequences over their entire length, and local alignment, which finds the parts of two sequences that have high homology. In this study, the local alignment algorithm is used because the sequences being compared differ in length: the behavioral sequence of the monitored subject stored in the database for a certain period is compared with the subsequence patterns derived during the day. Figure 5 presents an example of determining the degree of abnormality by comparing the derived pattern with the collected data on the current state of the monitoring target.

Fig. 5 Example of input and output data for the anomaly calculator

The anomaly calculator calculates the degree of abnormality by comparing the subsequence patterns generated by the pattern generator with the preprocessed (time, event) dataset delivered from the behavior classifier. Unlike the pattern generator, which uses data collected over a long period, the anomaly calculator collects data for about 10 s, preprocesses them, and uses them to calculate the degree of abnormality over a short period.

3.1.5 Anomaly warning module

The anomaly warning module determines the risk level of the current monitoring target based on the alignment score calculated as a result of the sequence alignment in the anomaly calculator. When a dangerous situation is detected, the anomaly warning module sends an SMS that warns the guardian of the dangerous situation through the internet hosting company’s message service.

The alignment score produced by the sequence alignment may vary depending on the length of the data collection. Therefore, alignment scores are collected for a certain period, and their average is used to set the threshold score that indicates a risk situation.
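A minimal sketch of this calibration step is given below. The function names are hypothetical, and the direction of the comparison (a score above the threshold is treated as dangerous) is an assumption that follows the wording of Sect. 3.2.5; in a deployment, the direction and any margin around the average would need to be confirmed experimentally.

```python
from statistics import mean


def calibrate_threshold(alignment_scores):
    """Use the average of previously collected alignment scores as the threshold."""
    if not alignment_scores:
        raise ValueError("at least one alignment score is required")
    return mean(alignment_scores)


def is_dangerous(score, threshold):
    """Assumed rule: the situation is flagged when the score exceeds the threshold."""
    return score > threshold
```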

3.2 Implementation of HBPAADS

In this study, the implemented HBPAADS comprises a layer that photographs the monitored subject and a layer that classifies the monitored subject’s current behavior and calculates the degree of risk. We used the open-source hardware Raspberry Pi and its dedicated camera to implement the residential space monitoring system. In addition, we used the object detection algorithm faster R-CNN, Spark MLlib (the machine learning library of Apache Spark), and the MySQL database to implement the behavior classifier, pattern generator, anomaly calculator, and anomaly warning module and to analyze the current state of the monitored target. Each module is implemented in Python.

3.2.1 Implementation of residential space monitoring system

In this study, the open-source hardware Raspberry Pi 3 was used to implement the residential space monitoring system. The camera can be a regular webcam or a dedicated Raspberry Pi camera. In this study, a Raspberry Pi camera was used to prevent the power shortages that can occur with a typical webcam. We used Python’s picamera library to capture the monitoring target with the Raspberry Pi camera. In general, Raspberry Pi cameras have a very high resolution, which slows the rate at which each frame is sent to the server. Therefore, the camera resolution was adjusted for transmission speed. In addition, delays can be prevented using such methods as defining a protocol for wireless transmission [50, 51]. The captured image is transmitted to the socket server frame by frame in real time. We use Python’s socket library to send data to the socket server, and the internet protocol (IP) address of the socket server can be set arbitrarily by the administrator. The behavior classifier classifies the monitoring target’s current behavior by analyzing each frame transmitted from the residential space monitoring system to the socket server. Then, the system saves the current monitored behavior in the database with the time.
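The frame-by-frame transmission can be sketched with Python’s socket library as follows. The 4-byte length prefix is an assumed framing scheme, not one stated in this paper; it is a common way to let the receiver separate consecutive frames on a stream socket. Capturing and encoding the image (for example, as JPEG bytes) is left out, so the functions operate on an already encoded payload.

```python
import socket
import struct

# Assumed framing: each frame is preceded by its length as a 4-byte big-endian integer.
HEADER = struct.Struct("!I")


def send_frame(sock, payload):
    """Send one encoded frame, prefixed with its length."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _recv_exact(sock, size):
    """Read exactly `size` bytes, since recv() may return fewer than requested."""
    data = bytearray()
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            raise ConnectionError("socket closed before the frame was complete")
        data.extend(chunk)
    return bytes(data)


def recv_frame(sock):
    """Receive one frame that was sent with send_frame()."""
    (length,) = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return _recv_exact(sock, length)
```

On the camera side, `send_frame` would be called once per captured frame on a socket connected to the server’s IP address; on the server side, `recv_frame` would be called in a loop and each payload handed to the behavior classifier.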

3.2.2 Implementation of the behavior classifier

In this study, the behavior classifier was trained using the faster R-CNN algorithm, an object detection algorithm from the deep learning image methodology. To train the behavior classifier, nine behaviors were first defined. The defined behaviors are listed in Table 1. The nine behaviors were selected as the highest-frequency behaviors through consultation with, and surveys of, researchers from social welfare departments at various universities.

Table 1 Patterns for the behavior classifier

To learn the nine defined behaviors, images were collected from the web using the Extreme Picture Finder System. The Extreme Picture Finder System searches the web for images related to an entered keyword and downloads them with a simple click. Image datasets downloaded through the Extreme Picture Finder System may contain duplicate images, which cause unnecessary learning and increase the learning time. Therefore, the VisiPics System was used to eliminate duplicate images [52].

In this study, about 2000 training images for each behavior and about 100 test images were prepared. When the image dataset for the defined behaviors was ready, it was labeled. Each defined behavior was specified as a class, and the coordinates corresponding to the ROI were extracted from each image in the training dataset and saved as a file in CSV format, which becomes the metadata. The ROI was specified as a rectangle, and the x and y coordinates of each vertex were extracted and stored.
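The metadata file can be illustrated with a short sketch using Python’s csv module. The column names, the file name, and the class label below are hypothetical; the paper states only that the class and the x and y coordinates of each vertex of the rectangular ROI are stored in CSV format.

```python
import csv
import io

# Hypothetical column layout: image file, class, then the four vertices of the ROI.
FIELDS = ["filename", "class", "x1", "y1", "x2", "y2", "x3", "y3", "x4", "y4"]


def roi_row(filename, label, xmin, ymin, xmax, ymax):
    """Build one metadata row from an axis-aligned rectangular ROI."""
    vertices = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    row = {"filename": filename, "class": label}
    for index, (x, y) in enumerate(vertices, start=1):
        row[f"x{index}"] = x
        row[f"y{index}"] = y
    return row


def write_metadata(rows, stream):
    """Write the metadata rows, with a header line, to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Usage: write one labeled image to an in-memory buffer.
buffer = io.StringIO()
write_metadata([roi_row("img_0001.jpg", "watching_tv", 10, 20, 110, 220)], buffer)
```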

A labeling program was used for labeling, and the ROI had to be specified manually [53]. Labeling proceeded by directly marking the person in each image as a rectangular ROI. This causes no problem when the dataset is small, but the labeling process takes substantial time when the dataset is large. In that case, the recommended method is to locate the shape of a person using an image classifier that has already been trained. Typically, such a classifier marks the detected object in the image with a rectangle. Therefore, the trained classifier was used to recognize the person in the image, and the coordinates of each vertex of the resulting rectangle were extracted as the ROI. With this method, labeling can proceed quickly even as the dataset grows.

After the labeling phase, the faster R-CNN algorithm was trained using the training dataset and metadata. In this study, we set the number of training iterations to 3000 and stored the model generated at every 500th iteration. As depicted in Fig. 6, the model at the 2500th iteration, which had the smallest loss, was selected as the behavior classifier. The learning accuracy of the selected model was 91%. The specifications of the computer used for learning are presented in Table 2.

Fig. 6 Results of behavior classifier learning

Table 2 Specifications of the computer used for learning

Once the behavior classifier was selected, the images captured in real time were analyzed frame by frame. When a frame was sent to the socket server, the behavior classifier analyzed the monitored subject’s current behavior and stored it in the database with the time. Figure 7 displays a captured monitoring image and the result of classifying the monitored subject’s current behavior.

Fig. 7 Results of the behavior classifier

3.2.3 Implementation of the pattern generator

The pattern generator derives subsequences from the stored behavior over a period. In this study, behavioral sequences stored for one week were used. First, the sequence of actions was listed in chronological order and was cut by the hour. The generated sequences were input into the PrefixSpan algorithm, which is a sequential pattern algorithm, to derive subsequences.

The PrefixSpan algorithm first finds the length-1 subsequence patterns whose support is at least the minimum support (minsup) and creates a projected database (DB) for each of them as a prefix. Then, the length-2 subsequence patterns are derived from each projected DB, and new projected DBs are created with these patterns as prefixes. This process was repeated up to length N to derive the subsequence patterns, which were stored in the database. In this study, we used Spark MLlib through Apache Spark’s Python API to implement the pattern generator.
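To make the prefix-projection idea concrete, the following is a simplified pure-Python sketch, not the Spark MLlib implementation used in this study. It assumes that each element of a sequence is a single event rather than an itemset, represents a projected DB as a list of (sequence index, start position) pointers, and expresses the minimum support as a fraction of the number of sequences. The function name and the event labels in the usage example are hypothetical.

```python
import math
from collections import defaultdict


def prefixspan(sequences, min_support, max_length=None):
    """Return {pattern: support count} for all frequent subsequences.

    `min_support` is a fraction of the number of sequences (e.g. 0.8).
    """
    min_count = max(1, math.ceil(round(min_support * len(sequences), 9)))
    patterns = {}

    def mine(prefix, projected):
        if max_length is not None and len(prefix) >= max_length:
            return
        # For every item, record its first position in each projected suffix.
        first_pos = defaultdict(dict)
        for sid, start in projected:
            seq = sequences[sid]
            for pos in range(start, len(seq)):
                first_pos[seq[pos]].setdefault(sid, pos)
        for item in sorted(first_pos):
            occurrences = first_pos[item]
            if len(occurrences) < min_count:
                continue  # the item is not frequent after this prefix
            pattern = prefix + (item,)
            patterns[pattern] = len(occurrences)
            # Project on the extended prefix: keep what follows the item.
            mine(pattern, [(sid, pos + 1) for sid, pos in occurrences.items()])

    mine((), [(sid, 0) for sid in range(len(sequences))])
    return patterns


# Usage with three illustrative behavior sequences and a support of 100%.
example = [
    ["tv", "phone", "computer"],
    ["tv", "computer"],
    ["phone", "tv", "computer"],
]
frequent = prefixspan(example, min_support=1.0)
```

Because patterns are only ever extended with items that are frequent in the projected DB, no candidate patterns are generated and then discarded, which is the property that distinguishes the pattern-growth approach from Apriori-based algorithms.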

3.2.4 Implementation of the anomaly calculator

The anomaly calculator compares the subsequence patterns stored in the database with the behavior sequence of the monitored subject accumulated over a certain period. In this study, we use the Smith-Waterman algorithm [54], the most widely used local alignment algorithm, to compare the two sequences. The Smith-Waterman algorithm is implemented using Python’s Biopython library. The behavior sequence stored over one minute was compared 1:1 with each derived subsequence pattern in turn. The comparison result is an alignment score that indicates the abnormality of the current monitoring target.
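The comparison can be sketched in pure Python as follows. This is an illustrative Smith-Waterman scoring routine with a linear gap penalty, not the Biopython-based implementation used in this study; the scoring values (match 2, mismatch -1, gap -1), the function names, and the event labels are assumptions made for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best Smith-Waterman local alignment score of sequences a and b."""
    previous = [0] * (len(b) + 1)
    best = 0
    for x in a:
        current = [0]
        for j, y in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if x == y else mismatch)
            score = max(
                0,                     # start a new local alignment
                diagonal,              # align x with y
                previous[j] + gap,     # gap in b
                current[j - 1] + gap,  # gap in a
            )
            current.append(score)
            best = max(best, score)
        previous = current
    return best


def score_against_patterns(behavior, patterns):
    """Compare one behavior sequence 1:1 with every pattern, keyed by pattern ID."""
    return {
        pattern_id: local_alignment_score(behavior, pattern)
        for pattern_id, pattern in patterns.items()
    }
```

Because the score is reset to zero whenever it would become negative, a short pattern can align with any part of a longer behavior sequence, which is why local rather than global alignment suits sequences of different lengths.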

3.2.5 Implementation of the anomaly warning module

When the anomaly calculator’s alignment score exceeds a certain value, the anomaly warning module determines that the monitoring target is in a dangerous situation and sends an SMS alarm to the guardian. The SMS transmission uses the message transfer application programming interface (API) of an internet hosting company. In this study, we used Python’s requests library to call this API. The hosting company ID and password, the sender number, the recipient number, and the risk notification text are sent to the message API using the POST function of the requests library.
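A minimal sketch of this request is shown below. The form field names and the `api_url` parameter are hypothetical because the hosting company’s API is not specified in this paper; only `requests.post` and `raise_for_status` are actual calls of the requests library.

```python
def build_sms_payload(user_id, password, sender, recipient, text):
    """Assemble the form fields for the message API (field names are hypothetical)."""
    return {
        "user_id": user_id,
        "password": password,
        "sender": sender,
        "recipient": recipient,
        "text": text,
    }


def send_alert(api_url, payload, timeout=10.0):
    """POST the payload to the message API and return the HTTP status code."""
    # requests is a third-party package; it is imported here so that the
    # payload helper above can be used without it being installed.
    import requests

    response = requests.post(api_url, data=payload, timeout=timeout)
    response.raise_for_status()  # raise an exception on an HTTP error status
    return response.status_code
```

In practice, the credentials would be read from a configuration file or environment variables rather than written into the source code.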

4 Experiment of HBPAADS performance

Data were collected from real residential spaces to verify the HBPAADS proposed in this study. A pattern derivation experiment was conducted using the data. In addition, the sequence alignment execution time was measured according to the real-time data collection period.

4.1 Application to an actual residential space

To verify the proposed system, an environment similar to a residential space was built, and data were collected by installing a web camera and a server that recognizes behavior from the monitoring images. Data were collected in a one-person residential space of about \(26\,\mathrm{m}^{2}\), and the camera was installed where the entire residential space was visible. Figure 8 depicts the residential space where some of the data were collected.

Fig. 8 Residential space for the experiment

The data were collected for one hour a day for two weeks. Figure 9 presents the analysis of the monitored subject’s behavior from the images collected in the experimental environment. The analyzed behavior was saved in the database in text format with the time.

Fig. 9 Results of behavioral detection from real data

4.2 Derivation of patterns using real data

Patterns were derived from the actual collected data to verify the system proposed in this study. Data noise and behavioral false positives were not considered in deriving the patterns. The minimum support of the PrefixSpan algorithm was set to 80% of the total data. The derived patterns are presented in Table 3.

Table 3 Derived pattern using the pattern generator

When the same motion was recognized continuously, the continuous sequence was regarded as one activity in deriving the patterns. The PrefixSpan algorithm included in Apache Spark’s MLlib library was used to ensure the accuracy of the derived patterns. The derived patterns indicate that the monitored subject mostly watches TV, uses a computer, or uses a smartphone in the residential space. Other detected actions were excluded from the patterns because they appeared with a lower frequency than the set support value. An excluded action may nevertheless be important, such as eating or drinking. Therefore, it is very important to set an appropriate support. Moreover, the derived patterns have various starting activities. This is because the PrefixSpan algorithm treats every event whose frequency meets the support as a possible starting event.

4.3 Performance test of the anomaly calculator

To verify the performance of the anomaly calculator, the time taken to compare a behavior sequence collected for 10 s with all derived patterns 1:1 was measured. The collection time of the behavior sequence was then increased from 10 s up to 100 s, and the comparison with the patterns was repeated. A total of 100 experiments were conducted, and the behavioral subsequences were changed for each collection time.

As presented in Fig. 10, after a certain collection time passes, the execution time of the anomaly calculator rapidly increases. Therefore, when applied to an actual residential environment, the action collection time must be set via experiments.

Fig. 10 Results of the performance test for the anomaly calculator

5 Conclusion

In this study, we proposed the architecture of the HBPAADS, which derives and considers the behavioral patterns of monitoring targets in residential spaces, and verified the system through implementation and experiment. The HBPAADS uses a behavior classifier based on an object detection algorithm, a deep learning methodology that can distinguish the monitoring subject’s behavior in more detail and analyze the current state. In addition, deriving subsequence patterns with the sequential pattern algorithm satisfies the requirements of a customized monitoring system because the patterns reflect individual life cycles rather than generalized behavioral patterns in residential spaces.

Most research on monitoring systems uses data collected from contact or noncontact sensors. A contact sensor must be worn by the monitoring target, and if it is not worn, tracking the monitoring target’s current state is challenging. Noncontact sensors must be installed in large numbers, for example on the walls or ceiling of the monitoring space, so that they do not interfere with the monitored subject’s daily life. The disadvantage is that the cost of the sensors and their installation is high.

In addition, most studies that analyze human behavior to derive patterns rely on the monitored subjects recording their own current behavior. Because this involves the subjective judgment of the monitoring target, the recorded behavior is not objective. In this study, the object detection methodology, a deep learning image method, analyzes the monitoring target’s behavior in the monitoring images collected by the camera. This method enables an objective behavioral analysis without attaching specific sensors to the monitoring target or installing them in the home.

The contribution of this study is that, unlike previous studies, it enables personalized monitoring services by classifying the monitored subject’s current behavior with a deep learning methodology and deriving the subject’s life pattern from the classified behavior. In addition, the deep learning image method, sequential pattern mining, and the sequence alignment algorithm were fused to provide a new service.

Because this study is centered on object detection, in which images are used to classify the monitored subject’s behavior, a behavior classification method that considers more diverse data is required. The experiment in Sect. 4, in which an environment similar to a residential environment was built and actual data were collected, revealed that accuracy differed with the shooting angle and distance when the behavior was recognized using only object detection. Therefore, future research should study behavior classification that also considers biodata and data on the surroundings. In addition, data noise was not considered in detecting abnormalities of the monitored subject. Finally, the support setting led to cases in which important actions were not included in the patterns, so research on the optimal support setting is needed.