1 Introduction

Families often wish to monitor the current situation of family members who are elderly, living alone, or hospitalized in remote locations. However, elderly people and inpatients monitored by video equipment often do not want to share personal details such as their attire or behavior. In this paper, we propose a novel video input method that considers both privacy and security. We also describe criteria for specifying the degree of blur used by this input method. In addition, as an example of estimating room circumstances from a blurred image, we propose a basic algorithm for determining the number of persons in a living room, and we evaluate and verify this algorithm.

2 Monitoring Systems and Privacy

In general, monitoring systems do not need to capture information that is as detailed as that required by video surveillance systems; typically, they only need to provide an approximation of a situation. Moreover, elderly people living alone or inpatients who are monitored with video cameras usually do not want to share personal details such as their faces, attire, or behavior. Therefore, a monitoring system can provide a sufficient level of detail by capturing the following information: the presence or absence of a person in a room, the number of people, their position and movement, and their approximate actions. This information can still be obtained from a blurred video image. In general, video blurring methods use mosaics generated by image processing techniques. However, these methods carry a security risk: because the image is processed after the raw data is stored on a PC, the information may be stolen before processing. To address this problem, we propose a novel video input method that considers security.

3 Proposed Monitoring System and Results

To reduce the risk of information being stolen prior to image processing, we propose blurring the image at input time. The method captures an image that is optically blurred by adjusting the focus ring of the camera. This eliminates the security problem, because even if the video images are stolen, they are already blurred. A usability problem remains: users have no way of knowing how strong the blur is. A simple numerical scale for the degree of blurring cannot be understood intuitively and therefore does not make a good user interface. Thus, it is necessary to establish meaningful blurring criteria. We propose using a checkerboard pattern as the input image and using the standard deviation of concentration of the captured pattern as the blurring criterion. Furthermore, so that the standard deviation can be interpreted subjectively, this study examines the correspondence between the standard deviation and eyesight.

3.1 Blurring Degree Criteria

Blur is set by rotating the focus ring of the web camera from the in-focus state toward the close-up state. In the experiment, one blurring level corresponded to a 1/4 rotation of the focus ring. Expressing the blur as a standard deviation value makes the criterion independent of the amount of rotation of the focus ring, so the same conditions can be reproduced even with a different web camera. Figure 1 shows the checkerboard pattern used in the experiment. The pattern was printed over the entire surface of a generic A4 sheet and photographed from a distance of 30 cm. The captured region covered 90 % of the sheet's height and 70 % of its width. Table 1 and Fig. 2 show the experimental results for the standard deviation of concentration. Because the standard deviation changes linearly with the blurring level, it can be used to determine the degree of blurring.
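The criterion described above can be illustrated with a minimal sketch. It assumes that "concentration" refers to the gray-level intensity of the pixels, and it substitutes a synthetic checkerboard and a software box blur for the printed pattern and the optical defocus used in the experiment. All function names and parameters are hypothetical.

```python
import numpy as np


def checkerboard(size=240, square=30):
    """Synthetic checkerboard with alternating squares of value 0 and 255."""
    rows, cols = np.indices((size, size))
    parity = (rows // square + cols // square) % 2
    return (parity * 255).astype(np.float64)


def box_blur(image, radius):
    """Separable box blur; a software stand-in for optical defocus (assumption)."""
    if radius == 0:
        return image.copy()
    width = 2 * radius + 1
    kernel = np.ones(width) / width
    # Pad with edge values so the output keeps the input size.
    padded = np.pad(image, radius, mode="edge")
    smooth = lambda v: np.convolve(v, kernel, mode="valid")
    blurred = np.apply_along_axis(smooth, 1, padded)   # blur along rows
    blurred = np.apply_along_axis(smooth, 0, blurred)  # blur along columns
    return blurred


def blur_criterion(image):
    """Standard deviation of the gray levels, used as the blur measure."""
    return float(np.std(image))


if __name__ == "__main__":
    pattern = checkerboard()
    for radius in (0, 4, 8, 12):
        value = blur_criterion(box_blur(pattern, radius))
        print(f"blur radius {radius:2d}: standard deviation = {value:.1f}")
```

In this sketch the sharp pattern has the largest standard deviation, and the value falls as the blur grows, which is the property that makes it usable as a camera-independent measure. The sketch does not reproduce the measured values of Table 1.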

Fig. 1. Checkerboard pattern used in the experiment

Table 1. Standard deviation of concentration for the checkered pattern

Fig. 2. Graph of the standard deviation of concentration for the checkered pattern

3.2 Correspondence Between the Blurring Level and Eyesight

We conducted experiments to study the correspondence between the blurring level and eyesight. The illuminance of the experimental room was 400 lx. Each participant had a visual acuity of 1.0 or better, either uncorrected or corrected. Figure 3 shows how the Landolt ring was captured. The Landolt ring of an eyesight test chart was captured with a web camera at a distance of 30 cm, the same distance as for the checkered pattern, so that the blurriness would match. We prepared six images of the Landolt ring, captured at every 1/4 rotation of the focus ring from the in-focus state toward the close-up state. Figure 4 shows the experimental method. Participants viewed an 80-inch monitor from a distance of 3 m, which matches the viewing distance of a 3 m eye chart, and the Landolt rings displayed on the monitor were scaled to the sizes appropriate for that distance. An 80-inch display was used because a large screen is needed to show the large Landolt rings that correspond to low eyesight values. The screen resolution was 1920 × 1080 px. Figure 5 shows Landolt rings of blurring level 4 on the monitor. In the experiment, participants viewed images of Landolt rings with different sizes and blurring levels. Subsequently, we measured each participant's eyesight. Figure 6 shows a defocused room image of blurring level 4.
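The display requirement can be checked with a short calculation. The sketch below assumes the standard Landolt ring geometry (the gap subtends 1 arcminute at a decimal acuity of 1.0, and the outer diameter is five times the gap) and square pixels on the 80-inch, 1920 × 1080 px monitor. The function names are hypothetical, and the paper does not state that the rings were sized in exactly this way.

```python
import math


def landolt_gap_mm(acuity, distance_mm):
    """Gap width that subtends 1/acuity arcminutes at the viewing distance."""
    gap_arcmin = 1.0 / acuity
    gap_rad = math.radians(gap_arcmin / 60.0)
    return 2.0 * distance_mm * math.tan(gap_rad / 2.0)


def pixel_pitch_mm(diagonal_inch, width_px, height_px):
    """Pixel pitch of a display with square pixels (assumption)."""
    return diagonal_inch * 25.4 / math.hypot(width_px, height_px)


def ring_diameter_px(acuity, distance_mm=3000.0, diagonal_inch=80.0,
                     width_px=1920, height_px=1080):
    """Outer diameter of the Landolt ring on the monitor, in pixels."""
    diameter_mm = 5.0 * landolt_gap_mm(acuity, distance_mm)
    return diameter_mm / pixel_pitch_mm(diagonal_inch, width_px, height_px)


if __name__ == "__main__":
    for acuity in (1.0, 0.1, 0.01):
        print(f"eyesight {acuity}: ring diameter = {ring_diameter_px(acuity):.1f} px")
```

Under these assumptions, a ring for eyesight 1.0 spans only a few pixels at 3 m, whereas a ring for eyesight 0.01 spans several hundred pixels, which is consistent with the need for a large display when low eyesight values are tested.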

Fig. 3. Landolt ring of acquisition method

Fig. 4. Experimental method

Fig. 5. Landolt rings of blurring level 4 on the monitor

Fig. 6. Defocused room image of blurring level 4

3.3 Experimental Results

Figure 7 shows the experimental results. The vertical axis represents eyesight, and the horizontal axis represents the blurring level. Blurring level 0 is the in-focus state. Blurring levels 1 to 4 were each found to correspond to an eyesight value. It was also found that the Landolt ring could not be used to measure eyesight levels of 0.01 or less.

Fig. 7. Experimental results

4 Estimation of Number of Persons Using Blurred Video

As an example of estimating room circumstances from a blurred image, we propose a basic algorithm for determining the number of persons in a living room, and we evaluate and verify it. We previously studied an algorithm that estimates the number of persons using RGB information. This paper describes an improved algorithm that uses the HSV color space and reports the results.

4.1 Basic Idea and Algorithms for Estimating Number of Persons

Our previous algorithm [1], which used RGB information, had poor accuracy for two reasons. The first problem is the use of RGB information itself: when shadows or lighting conditions change, it becomes difficult to set a threshold. The second problem is that the number of persons was determined from the amount of change over the entire image, so the method could not cope with noise or small changes. To solve the first problem, we use the HSV color space, which separates color from brightness. Because only color information (hue and saturation) is used, the method does not depend on changes in brightness. To solve the second problem, we divide the obtained image into blocks of 40 × 40 px and determine the presence or absence of change in each block. We implemented the algorithm based on these two ideas.
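The two ideas can be illustrated with a small sketch. It uses the `colorsys` module from the Python standard library to show that hue and saturation stay the same when all three RGB channels are scaled by a common factor, which is an idealized model of a uniform brightness change. It also lists the origins of 40 × 40 px blocks for a 640 × 480 px frame. The frame size and the function names are assumptions, not values from the paper.

```python
import colorsys


def hue_saturation(r, g, b):
    """Return (hue, saturation) of an RGB color with channels in [0, 1]."""
    h, s, _v = colorsys.rgb_to_hsv(r, g, b)
    return h, s


def block_origins(width, height, block=40):
    """Top-left corners of all full block x block tiles in the frame."""
    return [(x, y)
            for y in range(0, height - block + 1, block)
            for x in range(0, width - block + 1, block)]


if __name__ == "__main__":
    bright = (0.8, 0.4, 0.2)
    dimmed = tuple(0.5 * c for c in bright)  # same color at half the brightness
    print("bright:", hue_saturation(*bright))
    print("dimmed:", hue_saturation(*dimmed))
    print("blocks in a 640 x 480 frame:", len(block_origins(640, 480)))
```

Real shadows and lighting changes are not perfectly uniform across channels, so the invariance holds only approximately in practice.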

4.2 Initial Setting of Hue and Saturation

As the initial values, hue and saturation are acquired from an image of the room with zero persons, which serves as the zero-person reference. The acquired image is in a blurred state. Hue and saturation values are calculated for each block (40 × 40 px). The amount of change in each block is calculated as the Euclidean distance between the initial value and the measured value for that block, as given by Eq. (1). To determine the amount of noise, hue and saturation data are measured for 10 s. In this experiment, the threshold was set to 450 for the Euclidean distance of hue and 350 for the Euclidean distance of saturation.

$$ d(b,f) = \sqrt{\sum_{i=0}^{n} \left( b_{i} - f_{i} \right)^{2}} \tag{1} $$
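A minimal sketch of the per-block change test based on Eq. (1) follows. It assumes that b and f are histograms of integer hue values (0 to 359) and integer saturation values (0 to 255) within one 40 × 40 px block, and that a block counts as changed when either distance exceeds its threshold. The paper does not state how the two tests are combined, so the "or" rule and all function names are assumptions.

```python
import numpy as np

HUE_BINS = 360          # hue values 0..359
SAT_BINS = 256          # saturation values 0..255
HUE_THRESHOLD = 450.0   # threshold values taken from the experiment
SAT_THRESHOLD = 350.0


def histogram(values, bins):
    """Count how often each integer value occurs in the block."""
    return np.bincount(values.ravel(), minlength=bins).astype(np.float64)


def euclidean_distance(b, f):
    """Eq. (1): distance between background and current histograms."""
    return float(np.sqrt(np.sum((b - f) ** 2)))


def block_changed(bg_hue, bg_sat, cur_hue, cur_sat):
    """True if the hue or saturation histogram moved beyond its threshold."""
    d_hue = euclidean_distance(histogram(bg_hue, HUE_BINS),
                               histogram(cur_hue, HUE_BINS))
    d_sat = euclidean_distance(histogram(bg_sat, SAT_BINS),
                               histogram(cur_sat, SAT_BINS))
    # Combining the two tests with "or" is an assumption of this sketch.
    return d_hue > HUE_THRESHOLD or d_sat > SAT_THRESHOLD


if __name__ == "__main__":
    background_hue = np.full((40, 40), 10, dtype=np.int64)
    background_sat = np.full((40, 40), 80, dtype=np.int64)
    person_hue = np.full((40, 40), 200, dtype=np.int64)
    print(block_changed(background_hue, background_sat,
                        background_hue, background_sat))  # unchanged block
    print(block_changed(background_hue, background_sat,
                        person_hue, background_sat))       # hue shifted
```

Because a 40 × 40 px block holds 1600 pixels, a complete change of hue moves 1600 counts out of one bin and into another, which gives a distance far above the threshold of 450.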

4.3 Estimation of the Number of Persons

We detect persons from the blurred images. The proposed algorithm combines four-neighbor labeling with a change measurement based on the Euclidean distance between histograms. For each block, we calculate the Euclidean distance between the histogram in the initial state and the histogram in the current state, using the hue and saturation values of the background image and the current image. In Eq. (1), $b_i$ is the value of the hue or saturation in the background image, and $f_i$ is the value of the hue or saturation in the current image. The value of n is 359 for hue and 255 for saturation. The presence or absence of change in each block is determined by comparing the Euclidean distances with the thresholds. Next, four-neighbor labeling is performed on the blocks that exceed the threshold. From the labeled regions, the number of blocks corresponding to the size of a person in the image is determined. Because the area a person occupies in the image differs with where the person stands, the block count is weighted according to the coordinate position in the image. The mean hue of each labeled block is then determined and stored in an array. When the number of persons increases, the number of newly emerged hue blocks is calculated, and the number of persons is determined by comparing this number with a threshold.
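The labeling step can be sketched as follows. The code labels connected groups of changed blocks using four-neighbor connectivity and counts the groups that contain at least a minimum number of blocks. The minimum size `min_blocks` and the function names are hypothetical, and the sketch omits the position-dependent weighting and the hue comparison described above.

```python
from collections import deque


def label_four_neighbors(changed):
    """Label connected regions of truthy cells using 4-connectivity."""
    rows, cols = len(changed), len(changed[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not changed[r][c] or labels[r][c]:
                continue
            # Start a new region and flood it breadth-first.
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and changed[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label


def count_person_regions(changed, min_blocks=2):
    """Count labeled regions with at least min_blocks blocks (hypothetical rule)."""
    labels, count = label_four_neighbors(changed)
    sizes = [0] * (count + 1)
    for row in labels:
        for value in row:
            sizes[value] += 1
    return sum(1 for size in sizes[1:] if size >= min_blocks)


if __name__ == "__main__":
    grid = [
        [1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 1],
        [0, 1, 0, 0],
    ]
    print(count_person_regions(grid))
```

With four-neighbor connectivity, blocks that touch only diagonally belong to different regions, so a single isolated block caused by noise is not merged into a neighboring person-sized region.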

4.4 Evaluation and Discussion of the Algorithm

The test videos (25 s each) show a scene in which a person enters the living room, sits down, and leaves the living room. Videos with zero persons, one person, and two persons were prepared. The degree of blur is shown in Fig. 8. The room is 20 m², and the illuminance was set to 400 lx. The camera was installed at a height of 2.3 m in a corner of the room. Table 2 shows the experimental results. The rates for detecting zero persons, one person, and two persons with the proposed algorithm are 99 %, 78 %, and 100 %, respectively. In contrast, the corresponding rates with the previous RGB-based algorithm are 100 %, 86 %, and 50 %. Overall, the proposed algorithm achieved a higher accuracy than the RGB-based algorithm, mainly because the detection rate for two persons rose from 50 % to 100 %; its rates for zero persons and one person were slightly lower. In the future, the thresholds will need to be examined to improve the detection rate for three or more persons.

Fig. 8. Blur state image

Table 2. Experimental results

5 Conclusion

To enhance the effectiveness of monitoring systems, we proposed a novel video input method that considers both privacy and security. By employing a checkerboard pattern and its standard deviation as the blurring criterion, we verified the validity of the image input method in experiments. In particular, we demonstrated the correspondence between the blurring level and eyesight. In addition, we described the problems associated with the algorithm for detecting persons using RGB information and proposed a method for mitigating them. The experimental results showed that the number of persons can be estimated with high accuracy from heavily blurred videos. In the future, we will attempt to estimate human behavior patterns from heavily blurred videos.