1. Introduction

The detection of moving objects is critical in many defense and security applications, where motion detection is usually performed as a preprocessing step that is key to the success of the subsequent target tracking and recognition. Many videos used in these applications are outdoor videos whose quality may be degraded by various noise sources, such as atmospheric turbulence and sensor platform scintillation. Meanwhile, moving objects may be very small, occupying only a few pixels, which makes motion detection very challenging. Under these circumstances, existing approaches may generate a significant number of false alarms.

Motion detection has been extensively investigated [1-3]. Much of this research addresses indoor videos with large objects. Optical flow-based approaches are among the major techniques and have been widely used for motion detection. There are two classic methods of optical flow computation in computer vision: the Horn-Schunck (HS) method and the Lucas-Kanade (LK) method [4-7]. Both are two-frame differential algorithms. The LK method may not perform well in producing a dense flow field; the HS method, on the other hand, can detect minor motion of objects and provides a flow field of 100% density [7]. Thus, we focus on the HS method for optical flow computation in our research. For outdoor videos of low quality, special care needs to be taken to better extract features related to moving objects from optical flows while suppressing false alarms.

Principal component analysis (PCA) is a typical approach in multivariate analysis [8]. It is also named the discrete Karhunen-Loève transform (KLT) or the Hotelling transform [9]. PCA involves the eigen-decomposition of a data covariance matrix or the singular value decomposition of a data matrix, usually after mean centering. It projects the original data onto an orthogonal subspace, where the directions are mutually decorrelated and most of the data information is present in the first several principal components (PCs). For optical flows in a local window, moving objects have consistent flows while pixels with only turbulence have random flows. Thus, if PCA is applied to the two-dimensional (2D) data of optical flows, the difference between desired motion pixels and random motion pixels may be magnified because their contributions to the two eigenvalues are very different; the contribution from random motion pixels can be very small, even to the second eigenvalue. Experimental results show that this approach is an effective way of analyzing outdoor videos; it can reduce false alarms for videos with either static or dynamic background, and it is also useful for delineating the size of moving objects.

This paper is organized as follows. Section 2 explains the proposed method based on optical flow and PCA. Section 3 presents experiments using ground-based and airborne videos. Section 4 draws the conclusion.

2. Proposed Method

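The HS method imposes a global smoothness constraint on the flow field in order to explain the brightness variation in certain areas of the frames in a video sequence. Let $I(x, y, t)$ represent the brightness of a pixel at coordinates $(x, y)$ in the $t$th frame. According to [4], the image constraint at $I(x, y, t)$ with a Taylor series (higher-order terms neglected) can be expressed as

$$I(x + \delta x,\, y + \delta y,\, t + \delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\delta x + \frac{\partial I}{\partial y}\delta y + \frac{\partial I}{\partial t}\delta t \tag{1}$$

which, under the assumption that the brightness of a moving pixel stays constant, results in

$$I_x u + I_y v + I_t = 0 \tag{2}$$

where $u$ and $v$ are the $x$ and $y$ components of the velocity or optical flow of $I(x, y, t)$, respectively, and $I_x$, $I_y$, and $I_t$ are the derivatives of the image at $(x, y, t)$ in the corresponding directions. A constrained minimization problem can be formulated to calculate the optical flow vector $(u^{k+1}, v^{k+1})$ at the $(k+1)$th iteration:

$$u^{k+1} = \bar{u}^{k} - \frac{I_x\left(I_x \bar{u}^{k} + I_y \bar{v}^{k} + I_t\right)}{\alpha^2 + I_x^2 + I_y^2}, \qquad v^{k+1} = \bar{v}^{k} - \frac{I_y\left(I_x \bar{u}^{k} + I_y \bar{v}^{k} + I_t\right)}{\alpha^2 + I_x^2 + I_y^2} \tag{3}$$

where $\bar{u}^{k}$ and $\bar{v}^{k}$ are the estimated local average optical flow velocities, and $\alpha$ is a weighting factor. A larger value of $\alpha$ results in a smoother flow; in our experiments using 8-bit videos, it is empirically set to be 30000. Based on the norm of an optical flow vector, one can determine whether motion exists, while the direction of this vector provides the motion orientation.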
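The iterative HS computation described above can be sketched in a few lines of NumPy. This is a minimal sketch rather than the implementation used in the experiments: the function names `horn_schunck` and `_local_average`, the derivative estimates, the neighbour-averaging weights, and the default values of `alpha` and `n_iter` are all assumptions made for illustration. In particular, a suitable `alpha` depends on how intensities and derivatives are scaled.

```python
import numpy as np


def _local_average(f):
    """Weighted average of the 8 neighbours of each pixel (edges replicated)."""
    p = np.pad(f, 1, mode="edge")
    direct = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
    diagonal = p[:-2, :-2] + p[:-2, 2:] + p[2:, :-2] + p[2:, 2:]
    return direct / 6.0 + diagonal / 12.0


def horn_schunck(frame1, frame2, alpha=1.0, n_iter=100):
    """Estimate the flow (u, v) from frame1 to frame2 (2D arrays of equal shape)."""
    f1 = np.asarray(frame1, dtype=np.float64)
    f2 = np.asarray(frame2, dtype=np.float64)
    # Assumed derivative estimates: central differences on the mean frame
    # for Ix and Iy, and a plain frame difference for It.
    Iy, Ix = np.gradient(0.5 * (f1 + f2))  # axis 0 is y (rows), axis 1 is x
    It = f2 - f1
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    denom = alpha ** 2 + Ix ** 2 + Iy ** 2
    for _ in range(n_iter):
        u_bar = _local_average(u)
        v_bar = _local_average(v)
        # Residual of the brightness constraint at the averaged flow.
        common = (Ix * u_bar + Iy * v_bar + It) / denom
        u = u_bar - Ix * common
        v = v_bar - Iy * common
    return u, v
```

The norm `np.hypot(u, v)` then gives a per-pixel motion magnitude, and `np.arctan2(v, u)` gives the motion orientation.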

Two optical flow images $(u, v)$ can be constructed from the pixel optical flow vectors. A mask of size $n \times n$ slides through these $u$ and $v$ images. At location $(x, y)$, a two-dimensional (2D) data matrix $\mathbf{X}$ of size $2 \times n^2$ can be constructed, which includes all the 2D flow vectors covered by the mask. The covariance matrix can be calculated as

$$\mathbf{C} = \frac{1}{n^2}\tilde{\mathbf{X}}\tilde{\mathbf{X}}^{T} \tag{4}$$

where $\tilde{\mathbf{X}}$ is the optical flow matrix after mean removal. After eigen-decomposition, the two eigenvalues $(\lambda_1, \lambda_2)$, with $\lambda_1 \ge \lambda_2$, are assigned to the central pixel of the mask. Motion detection is accomplished by analyzing or thresholding the eigenvalue(s). Since $\lambda_1$ is the major flow component and $\lambda_2$ is the minor flow component, it may be more effective to consider $(\lambda_1, \lambda_2)$ than the values in the original $(u, v)$ space.

Intuitively, only $\lambda_1$ needs to be considered because it corresponds to the major flow component, whereas $\lambda_2$ corresponds to the minor flow component or even turbulence. An appropriate threshold can be determined by applying Otsu's method to the $\lambda_1$ histogram [10]. In practice, however, $\lambda_2$ should be considered as well, since pixels inside object boundaries usually have quite large $\lambda_2$ but not $\lambda_1$. Thus, thresholding may also need to be performed on the $\lambda_2$ histogram; a pixel is claimed to have motion if either $\lambda_1$ or $\lambda_2$ is above the corresponding threshold.
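The local PCA step can be sketched as follows. This is a sketch under stated assumptions, not the authors' code: the names `flow_eigenvalues` and `_box_sum`, the edge-replicating border handling, the division of the covariance by the number of pixels in the mask, and the default mask size `n=3` are all illustrative choices. Because the covariance matrix is only 2 x 2, its eigenvalues are computed in closed form for every pixel at once.

```python
import numpy as np


def _box_sum(a, n):
    """Sum of `a` over an n x n window centred on each pixel (edges replicated)."""
    r = n // 2
    p = np.pad(a, r, mode="edge")
    # Summed-area table with a leading row and column of zeros.
    s = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
    s[1:, 1:] = p.cumsum(axis=0).cumsum(axis=1)
    return s[n:, n:] - s[:-n, n:] - s[n:, :-n] + s[:-n, :-n]


def flow_eigenvalues(u, v, n=3):
    """Eigenvalue maps (lam1 >= lam2) of the local 2 x 2 flow covariance."""
    if n % 2 != 1:
        raise ValueError("mask size n must be odd")
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    count = float(n * n)
    mu_u = _box_sum(u, n) / count
    mu_v = _box_sum(v, n) / count
    # Covariance entries of the mean-removed flow vectors in each window.
    c_uu = _box_sum(u * u, n) / count - mu_u ** 2
    c_vv = _box_sum(v * v, n) / count - mu_v ** 2
    c_uv = _box_sum(u * v, n) / count - mu_u * mu_v
    # Closed-form eigenvalues of [[c_uu, c_uv], [c_uv, c_vv]].
    half_trace = 0.5 * (c_uu + c_vv)
    root = np.sqrt((0.5 * (c_uu - c_vv)) ** 2 + c_uv ** 2)
    lam1 = half_trace + root
    lam2 = np.maximum(half_trace - root, 0.0)  # clip tiny negative round-off
    return lam1, lam2
```

Given flow images `u` and `v`, the call `lam1, lam2 = flow_eigenvalues(u, v, n=3)` returns one pair of eigenvalues per pixel, which can then be thresholded.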

Thus, the motion detection algorithm can be described as follows.

  (1) Calculate optical flows between two adjacent frames (after registration as needed).

  (2) For each pixel in the 2D optical flow data, perform PCA over a local mask centered on that pixel (of a fixed size in the experiment), and assign the two eigenvalues to the central pixel.

  (3) Apply Otsu's thresholding to the eigenvalues of all the pixels (only one of the two eigenvalues in the experiment).
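The thresholding in the last step can be illustrated with a histogram-based version of Otsu's method. This is a minimal sketch: the function names `otsu_threshold` and `detect_motion`, the number of histogram bins, and the choice to return the upper edge of the selected bin are assumptions made for illustration.

```python
import numpy as np


def otsu_threshold(values, n_bins=256):
    """Threshold that maximises the between-class variance of the histogram."""
    x = np.asarray(values, dtype=np.float64).ravel()
    hist, edges = np.histogram(x, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    p = hist / hist.sum()
    w0 = np.cumsum(p)             # weight of the lower class
    w1 = 1.0 - w0                 # weight of the upper class
    m = np.cumsum(p * centers)    # cumulative first moment
    m_total = m[-1]
    valid = (w0 > 0) & (w1 > 0)
    valid[-1] = False             # the last bin leaves the upper class empty
    sigma_b = np.zeros(n_bins)
    sigma_b[valid] = (m_total * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    k = int(np.argmax(sigma_b))
    return edges[k + 1]           # upper edge of the last bin in the lower class


def detect_motion(eigenvalue_map, n_bins=256):
    """Binary mask of pixels whose eigenvalue exceeds the Otsu threshold."""
    lam = np.asarray(eigenvalue_map, dtype=np.float64)
    return lam > otsu_threshold(lam, n_bins)
```

If both eigenvalue maps are thresholded, the two masks can be combined with a logical OR, for example `detect_motion(lam1) | detect_motion(lam2)`, so that a pixel is flagged when either eigenvalue exceeds its own threshold.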

Figure 1 illustrates the framework of the proposed method with a mask and resulting data matrices.

Figure 1
The framework of the proposed method.

It is noteworthy that several variants of the proposed method are possible.

  (1) In Step 1, we may use the optical flow data from multiple frames. For instance, optical flow data from Frames 1 and 2 can be combined with optical flow data from Frames 2 and 3; this may help to emphasize the desired optical flows of moving objects and to emphasize the randomness of turbulence.

  (2) In Step 2, masks with different sizes can be used. Intuitively, for a large moving object, the mask size should be large.

  (3) In Step 3, thresholding can take place on either eigenvalue, depending upon the object size and the features of turbulence.

In the experiments, we use two adjacent frames, a single fixed-size mask, and only one eigenvalue for thresholding. This is to show that even the simplest implementation is sufficient to provide better performance than other widely used techniques.

3. Experiments

In the experiments, videos with both static and dynamic backgrounds were analyzed. They were taken by a commercial Sony Camcorder. We compared our proposed method with the original optical flow method, the motion detection methods based on Kalman filtering [11], background modeling using Gaussian mixture model [12], difference-based spatial temporal entropy image (DSTEI) [13], and forward-backward motion history images (MHI) [14]. They were chosen for comparison because they are either typical methods or designed specifically for more complicated videos (e.g., those with dynamic background).

3.1. Experiment 1: Ground-Based Video with Relatively Large Object

In this experiment, a video with static background at a small regional airport was studied; it was taken with the camcorder mounted on a tripod. As shown in Figure 2, a Hughes Cayuse helicopter was the moving object. Since the video was taken during a humid summer afternoon, there were significant atmospheric turbulence effects, which were visible around the vehicle, runway, and tree profiles.

Figure 2
The two input frames of a helicopter video. (a) frame 1 (b) frame 2

Figure 3 shows the detection result using optical flow only, where detected pixels are highlighted in red. It contained many false alarm pixels along the runway and tree profiles. Figures 4, 5, 6, and 7 are the detection results using Kalman filtering, background modeling, DSTEI, and MHI methods, respectively. They could all detect the helicopter, but with some regions missing and a few false alarm background pixels. The background modeling method could detect the largest areas of the helicopter; however, there were erroneously detected pixels scattered in the scene (even in the sky area). This method relies on an accurate background model, which generally requires complicated computations.

Figure 3
The result from optical flow method.

Figure 4
The result from Kalman filtering.

Figure 5
The result from background modeling.

Figure 6
The result from DSTEI method.

Figure 7
The result from MHI method.

Figure 8 is the result of the proposed method, where almost all the false alarm pixels were removed (only two pixels in the vehicles were left) and major regions of the helicopter were detected. Compared to Figure 3, introducing PCA significantly improves the performance of optical flow-based detection. Compared to the results in Figures 4 to 7, the proposed method reduces false alarms while detecting larger regions of the moving object.

Figure 8
The result from the joint optical flow and PCA method.

Figure 9
The two input frames of an airborne video. (a) frame 1 (b) frame 2

3.2. Experiment 2: Airborne Videos with Small Objects

The second experiment used a low-quality airborne video. It was taken by the camcorder mounted on the helicopter shown in the video of Experiment 1. In addition to atmospheric turbulence, scintillation from the airborne platform (i.e., the small helicopter) further degraded the video quality. As shown in Figure 9, there were three moving vehicles on the highway, highlighted with yellow circles. They consisted of only a few pixels. The two frames were pre-registered using the method in [15].

Figure 10 shows the detection result using optical flow only, where the three vehicles on the highway were completely detected and the shapes of the vehicles were outlined compactly. Figures 11, 12, 13, and 14 are the results for comparison, where the three vehicles were detected but not well delineated. For instance, the detected vehicle sizes were too small when using Kalman filtering and background modeling, and too big when using DSTEI and MHI. These results also contained more false alarm pixels. Figure 15 is the result using optical flow and PCA, which further reduced false alarms and yielded more reasonable vehicle sizes. Although the proposed method provided the best result, there were still several false alarm pixels, mainly located around the edges of buildings.

Figure 10
The result from optical flow method.

Figure 11
The result from Kalman filtering.

Figure 12
The result from background modeling.

Figure 13
The result from DSTEI method.

Figure 14
The result from MHI method.

Figure 15
The result from the joint optical flow and PCA method.

We found that such false alarms in airborne videos with small moving objects can be better removed by corner-based detection [16]. Harris corners were detected from two difference images, and many false alarm pixels around buildings could be removed; false alarms were further reduced through local tracking of detected corners in several consecutive frames. The drawback is that the detected result contains only object corners. When corner-based detection is used in conjunction with the proposed method, the complete regions of moving objects can be segmented for the corner-based detection, while the false alarms of the proposed method can be reduced. As shown in Figure 16(a), the corner-based method can accurately detect the three vehicles without false alarms; however, it detects only a corner corresponding to each object, as detailed in Figure 16(b). Figure 16(c) shows the vehicles extracted using the MHI method, where the object sizes were slightly magnified. Figure 16(d) shows the vehicles extracted using the proposed method, where the object sizes were reasonably reduced and pruned.

Figure 16
The result by combining the corner detection and the proposed method. (a) detected vehicles based on corner detection (b) the three vehicles in (a) (c) extracted entire region using MHI method (d) extracted entire region using our method

The result using another airborne video is shown in Figure 17, which further demonstrated that our method can better extract object sizes.

Figure 17
The result by combining the corner detection and the proposed method in another airborne video. (a) detected vehicles based on corner detection (b) the four vehicles in (a) (c) extracted entire region using MHI method (d) extracted entire region using our method

4. Conclusion

In this paper, we propose a joint optical flow and PCA approach for motion detection. Instead of considering the original optical flow, the two eigenvalues of the covariance matrix of local optical flows are analyzed. Since the first eigenvalue represents the major motion component and the second eigenvalue represents the minor motion component or turbulence, they are more useful for detecting true motion while suppressing false alarms more successfully. The proposed method is also effective in extracting the actual size of moving objects.

The computational complexity involved in PCA includes the calculation of the covariance matrix of local optical flow and its eigen-decomposition. For a mask of size $n \times n$, the number of multiplications in calculating the $2 \times 2$ covariance matrix grows in proportion to $n^2$, while the eigen-decomposition of a $2 \times 2$ matrix has a constant cost. For an image frame with $m$ pixels, the total computational complexity is therefore on the order of $O(mn^2)$. It can be reduced, to an expression governed by a small integer parameter, if using iterative PCA (IPCA) as discussed in [17]. In future work, we will investigate the performance when using IPCA to expedite motion detection.
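As a side note on the cost of the eigen-decomposition, the local covariance matrix is symmetric and only $2 \times 2$, so its eigenvalues can be written in closed form and no general-purpose eigen-solver is required. The entry names $c_{uu}$, $c_{uv}$, and $c_{vv}$ below are notation introduced here for illustration only:

$$\mathbf{C} = \begin{pmatrix} c_{uu} & c_{uv} \\ c_{uv} & c_{vv} \end{pmatrix}, \qquad \lambda_{1,2} = \frac{c_{uu} + c_{vv}}{2} \pm \sqrt{\left(\frac{c_{uu} - c_{vv}}{2}\right)^{2} + c_{uv}^{2}}$$

Evaluating this expression takes a fixed number of arithmetic operations per pixel, independent of the mask size.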