
1 Introduction

As city populations grow, traffic has become one of the problems with the most direct impact on quality of life. Economic growth has driven a rapid increase in the number of private vehicles, and finding a parking space has become a real burden for drivers. A typical parking lot management system offers only a few simple functions, such as counting entering vehicles and calculating their parking time. With such a system, drivers have to stop at the entrance to take a card and then waste time searching for a vacant space.

In recent years, many methods for vacant parking space detection have been proposed, but simpler and smarter solutions are still needed. Parking lot operators prefer a fully automatic, card-free system; however, this aspect has hardly been addressed in the literature. In this paper, we present such a fully automatic parking lot management system. It automatically captures the moments when vehicles enter and leave a parking space and determines whether each space is vacant. In cooperation with a vehicle license plate recognition module, the surveillance module can then calculate the parking time.

2 Related Work

For vacant parking space detection, vision-based methods have been widely used [1–3]. In these methods, parked vehicles are treated as foreground and the parking ground as background, and foreground extraction is used to estimate the probability that a parking space is occupied. Foreground extraction typically focuses on features such as edges and colors. In [4, 5], the color changes of parking spaces are modeled to determine whether a space is occupied, using a color histogram and the mean color of the parking ground. Liu et al. [6] proposed a new edge extraction approach to detect moving vehicles on the highway; it reduces the influence of vehicle shadows and illumination variation. In [7], edge and color features are combined to detect vehicles of interest. Liu et al. [8] integrate the edge density, closed contour density, and foreground/background pixel ratio of each parking space to decide whether a car is present. Almeida et al. [9] employ LBP and LPQ descriptors for parking space detection and, by combining diverse classifiers, further reduce the error rate to 0.16%. Our analysis of these papers shows that most proposed methods rely on multiple features, which usually yield a more robust detection system than a single feature.

In many papers, adaptive background subtraction methods are proposed to detect vehicles of interest. Kairoek Choeychuen [10] models the background by computing the mean and variance of each color component for each pixel and performs foreground detection by background subtraction against this adaptive model; the fusion of the foreground density of the masked area and the edge orientation histogram (EOH) feature is then used to detect parked cars. However, such pixel-level background subtraction approaches are time-consuming for vehicle detection. In [11], an adaptive multi-cue background subtraction method is proposed to segment the foreground pixels corresponding to vehicles. Song et al. [12] propose a block-based background model to count vehicles: they first segment moving targets through frame differences, extract their binary images, and build a digital sequence from them; a feature-based background model is then established from this sequence, and foreground extraction is used for vehicle counting.

Unlike Liu et al. [8], who use a static background image when extracting foreground information, we develop a feature-based background model using edge and color characteristics and extract foreground features against this adaptive model. Yuan et al. [13] propose multiple-feature background models based on morphology and color difference for capturing and counting highway vehicles, but their morphology method is not suitable for the scenes discussed in this paper, and their simple color-based background model leads to low accuracy in our system. In addition, we employ the adjacent frame difference image to find steady states, in which there are no moving vehicles in the parking space. While a car is entering a parking space, its license plate may not be visible because of the limited viewing angle, so the pictures captured at steady states are passed to the license plate recognition system to compute the parking time. Most papers focus on vacant parking space detection, but few address vehicle capture. Note also that we represent the parking spaces by several rectangles drawn by the user in the initial step of the process. Two thresholds are introduced to help judge the entrance and departure of a car; since rectangles of different sizes require different thresholds, an adaptive threshold updating method is proposed in this paper.

3 Proposed Method

We extract foreground features by modeling the background using edge and color information; a parking space with a high foreground feature value is regarded as occupied. At system start-up, we set the ROIs by drawing several rectangles representing the parking spaces, and our method is applied only within these ROI regions. The initial background model is computed from background images collected offline and is then renewed online whenever the parking space is considered vacant. The adjacent frame difference method is used to detect the static state of the scene, in which there are no moving vehicles, and the entering pictures are captured only in this state. Since every parking space is processed in the same way, we describe the processing of one ROI region as a representative; the same algorithm runs simultaneously on the rest. The flowchart of the proposed method is shown in Fig. 1.

Fig. 1.

Flowchart of proposed method

3.1 Edge-Based Background Model and Foreground Feature Extraction

Edge detection is commonly used for feature detection, segmentation, and motion analysis. In our research, we use edge images to establish an edge-based background model. Edges are extracted with the Canny edge detector [11], and the model is obtained by computing the average edge value of each pixel. The edge-based background model is updated only when the space is vacant, as expressed below:

$$ \text{BEM}_{n}(x,y) = \begin{cases} \text{BEM}_{n-1}(x,y), & \text{flag}_{in}^{n} = 1 \\ \dfrac{n}{n+1}\,\text{BEM}_{n-1}(x,y) + \dfrac{1}{n+1}\,\text{CE}_{n}(x,y), & \text{flag}_{in}^{n} = 0 \end{cases}, \quad n \ge 1 $$
(1)

\( \text{BEM}_{0}(x,y) \) denotes the initial background model, obtained by offline training. \( \text{CE}_{n}(x,y) \) denotes the edge image of the current frame, whose value is one when the pixel is an edge point. The flag \( \text{flag}_{in}^{n} \) equals one when the parking space is occupied.
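The running-average update of Eq. (1) can be sketched in a few lines of Python (the function name and the toy 2×2 edge images are ours, for illustration only):

```python
import numpy as np

def update_edge_background(bem_prev, ce_n, n, occupied):
    """Running-average update of the edge-based background model (Eq. 1).

    bem_prev : float array, previous model BEM_{n-1}(x, y)
    ce_n     : binary array, current edge image CE_n(x, y)
    n        : frame index (n >= 1)
    occupied : flag_in^n; the model is frozen while the space is occupied
    """
    if occupied:
        return bem_prev
    return (n / (n + 1)) * bem_prev + (1 / (n + 1)) * ce_n

# hypothetical example: each pixel ends up holding the running mean of
# BEM_0 and the edge values seen so far
bem = np.zeros((2, 2))                       # BEM_0 from offline training
edges = [np.array([[1, 0], [1, 1]]),
         np.array([[1, 0], [0, 1]])]
for n, ce in enumerate(edges, start=1):
    bem = update_edge_background(bem, ce, n, occupied=False)
```

Because the model is frozen while `occupied` is true, a parked car never bleeds into the background estimate.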

We extract the foreground edge points in two steps. First, background edge points are removed from the current edge image. Second, we remove the isolated points that remain after background subtraction. The two steps are expressed by the formulas below:

$$ \text{CE}_{n}(x,y) = \begin{cases} 0, & \text{if } \text{BEM}_{n-1}(x,y) > \alpha \\ \text{CE}_{n}(x,y), & \text{otherwise} \end{cases}, \quad n \ge 1 $$
(2)
$$ \text{CE}_{n}(x,y) = \begin{cases} 0, & \text{if } \sum_{i=x-1}^{x+1} \sum_{j=y-1}^{y+1} \text{CE}_{n}(i,j) \le 1 \\ \text{CE}_{n}(x,y), & \text{otherwise} \end{cases}, \quad n \ge 1 $$
(3)

Here, the parameter α describes the probability that a pixel is a background edge point; its value is set to 0.7 in this paper. Finally, we count the number of foreground edge pixels and denote this count by \( \text{E}_{n} \):

$$ {\text{E}}_{\text{n}} = \sum\nolimits_{\text{ROI}} {{\text{CE}}_{\text{n}} \left( {{\text{x}},{\text{y}}} \right)} $$
(4)

The edge-based foreground feature is \( \text{E}_{n} \), which conveys the edge change of the parking ground. The scene and the results of the proposed edge-based background subtraction are shown in Fig. 2.

Fig. 2.

The results of edge-based background subtraction. Many background edge pixels have been removed, but some noise remains (shown in the red ellipse) (Color figure online).
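The two extraction steps (Eqs. 2–4) can be sketched as follows; the function name and the toy input are ours, and the 3×3 neighbourhood count includes the pixel itself, so a sum of 1 means the point is isolated:

```python
import numpy as np

def foreground_edges(ce, bem, alpha=0.7):
    """Two-step foreground edge extraction, a minimal sketch.

    Step 1 (Eq. 2): drop edge pixels that the background model marks
    as likely background edges (BEM > alpha).
    Step 2 (Eq. 3): drop isolated pixels whose 3x3 neighbourhood holds
    no other edge point.  Returns the cleaned edge image and E_n (Eq. 4).
    """
    ce = ce.copy()
    ce[bem > alpha] = 0                      # Eq. (2): background subtraction
    h, w = ce.shape
    out = ce.copy()
    for y in range(h):                       # Eq. (3): remove isolated points
        for x in range(w):
            if ce[y, x]:
                nb = ce[max(0, y - 1):y + 2, max(0, x - 1):x + 2].sum()
                if nb <= 1:                  # only the pixel itself
                    out[y, x] = 0
    return out, int(out.sum())               # Eq. (4): E_n

# hypothetical toy image: two adjacent edge pixels survive, the
# isolated one at (1, 3) is removed
ce = np.array([[1, 1, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
bem = np.zeros_like(ce, dtype=float)
cleaned, e_n = foreground_edges(ce, bem)
```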

3.2 Color-Based Background Model and Foreground Feature Extraction

The remaining edge noise affects the detection results only slightly. Nevertheless, to ensure detection accuracy we also take the color characteristic into consideration. The well-known Gaussian Mixture Model (GMM) [12] is a pixel-level, color-based background modeling method, but pixel-level models are time-consuming for vacant parking space detection. Instead, we propose a region-level background modeling method using the Cr and Cb components of the YCrCb space. We compute the mean of the Cr and Cb components for each pixel, denoted CH(i, j), and divide the ROI region into image patches of size m × m; the segmentation of the ROI region is shown in Fig. 3. For each patch we compute the mean and variance of CH, denoted M(x, y) and \( \text{D}^{2}(x,y) \), and define F(x, y) as the color feature of the patch. Here, (x, y) is the coordinate of the top-left vertex of the patch, and n is again the frame number. The calculation formulas are as follows:

$$ {\text{CH}}_{\text{n}} \left( {{\text{i}},{\text{j}}} \right) = \frac{{{\text{Cr}}\left( {{\text{i}},{\text{j}}} \right) + {\text{Cb}}({\text{i}},{\text{j}})}}{2} $$
(5)
$$ \text{M}_{n}(x,y) = \frac{1}{m \times m} \sum_{i=x}^{x+m-1} \sum_{j=y}^{y+m-1} \text{CH}_{n}(i,j) $$
(6)
$$ \text{D}_{n}^{2}(x,y) = \frac{1}{m \times m} \sum_{i=x}^{x+m-1} \sum_{j=y}^{y+m-1} \left( \text{CH}_{n}(i,j) - \text{M}_{n}(x,y) \right)^{2} $$
(7)
$$ {\text{F}}_{\text{n}} \left( {{\text{x}},{\text{y}}} \right) = \left\{ {{\text{M}}_{\text{n}} \left( {{\text{x}},{\text{y}}} \right),{\text{D}}_{\text{n}}^{2} \left( {{\text{x}},{\text{y}}} \right)} \right\} $$
(8)
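A minimal Python sketch of the patch feature computation in Eqs. (5)–(8); the function name is ours, and any remainder narrower than m pixels is ignored as described for Fig. 3:

```python
import numpy as np

def patch_color_features(cr, cb, m=10):
    """Region-level colour features (Eqs. 5-8), a minimal sketch.

    cr, cb : Cr and Cb channel arrays of the ROI
    m      : patch size; remainders narrower than m are ignored
    Returns a dict mapping the top-left vertex (x, y) of each m*m patch
    to its feature F = (mean, variance) of CH = (Cr + Cb) / 2.
    """
    ch = (cr.astype(float) + cb.astype(float)) / 2.0     # Eq. (5)
    h, w = ch.shape
    feats = {}
    for y in range(0, h - m + 1, m):
        for x in range(0, w - m + 1, m):
            patch = ch[y:y + m, x:x + m]
            feats[(x, y)] = (patch.mean(), patch.var())  # Eqs. (6)-(8)
    return feats

# hypothetical uniform ROI: one full 10x10 patch fits, the 2-pixel and
# 5-pixel remainders are ignored
cr = np.full((12, 15), 100)
cb = np.full((12, 15), 120)
feats = patch_color_features(cr, cb, m=10)
```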
Fig. 3.

Segmentation of the ROI region. The remainder part whose width or height is less than m pixels is ignored. Parameter m is set to 10 in this paper.

The color-based background model of each patch can be described by a set of background templates:

$$ {\text{T}}_{\text{n}} \left( {{\text{x}},{\text{y}}} \right) = \left\{ {{\text{F}}_{{{\text{bg}}1}}^{\text{n}} \left( {{\text{x}},{\text{y}}} \right),{\text{F}}_{{{\text{bg}}2}}^{\text{n}} \left( {{\text{x}},{\text{y}}} \right), \ldots ,{\text{F}}_{\text{bgM}}^{\text{n}} \left( {{\text{x}},{\text{y}}} \right)} \right\} $$
(9)

\( \text{F}_{bgm}(x,y) \) are background templates representing the color features of background patches, ordered by the time at which each template was added to the collection. M denotes the number of background templates and is set to 5 in consideration of computational complexity. The background model \( \text{T}_{0} \) is initialized using Eqs. (5)–(9). In an indoor parking lot the illumination changes little, so there is no need to update the background templates frequently; updating once every half hour is a reasonable choice. We update the color-based background model only when the parking space is determined to be vacant. If the feature of the current patch matches none of the background templates, a new template is added to the set and the oldest one is removed.

For foreground detection, the feature \( \text{F}_{n}(x,y) \) of each patch in the ROI region is computed and compared with the corresponding background templates; if it matches none of them, the patch is regarded as foreground. The proportion of foreground patches among all patches then indicates the probability that the parking space is occupied. The matching process is expressed as follows:

$$ \text{p}_{n}(x,y) = \begin{cases} 0, & \text{if } \exists\, m \in [1, M]: \left| \text{M}_{n}(x,y) - \text{M}_{bgm}^{n-1}(x,y) \right| < \theta \ \text{and}\ \left| \text{D}_{n}^{2}(x,y) - \left( \text{D}_{bgm}^{n-1} \right)^{2}(x,y) \right| < \theta \\ 1, & \text{otherwise} \end{cases} $$
(10)
$$ {\text{K}}_{\text{n}} = \frac{{\mathop \sum \nolimits_{\text{ROI}} {\text{p}}_{\text{n}} ({\text{x}},{\text{y}})}}{\text{P}} $$
(11)

The value of \( \theta \) is 5, and P is the number of patches in the ROI. \( \text{p}_{n}(x,y) \) equals 0 when the patch matches any of the relevant background templates. Analogous to \( \text{E}_{n} \), \( \text{K}_{n} \) is the color-based foreground feature and conveys the color change of the parking space.
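The template matching of Eqs. (10)–(11) can be sketched as below; the function name and the toy feature/template dictionaries are ours:

```python
def occupancy_ratio(feats, templates, theta=5.0):
    """Foreground-patch ratio K_n (Eqs. 10-11), a minimal sketch.

    feats     : {(x, y): (mean, var)} for the current frame
    templates : {(x, y): [(mean, var), ...]} background template sets
    A patch is background (p = 0) if some template matches it within
    theta in both mean and variance; otherwise it is foreground (p = 1).
    """
    fg = 0
    for pos, (mean, var) in feats.items():
        matched = any(abs(mean - tm) < theta and abs(var - tv) < theta
                      for tm, tv in templates.get(pos, []))
        fg += 0 if matched else 1
    return fg / len(feats)   # K_n = foreground patches / total patches

# hypothetical example: the first patch matches its template, the
# second (a car-coloured patch) does not
feats = {(0, 0): (110.0, 0.0), (10, 0): (200.0, 3.0)}
templates = {(0, 0): [(109.0, 1.0)], (10, 0): [(100.0, 2.0)]}
k_n = occupancy_ratio(feats, templates)
```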

We tested our color-based foreground feature extraction algorithm on vehicles of different colors at intervals of 50 frames; the results are presented in Table 1. As Table 1 shows, the \( \text{K}_{n} \) value is below 0.1 when no car is in the parking space and rises above 0.5 when the space is occupied. We conclude that our color-based foreground feature extraction method performs well for vacant parking space detection.

Table 1. Results of color-based foreground feature extraction

3.3 Feature Fusion and Thresholds Update

Small vehicles usually introduce fewer edges, so when the background contains many edges the edge feature alone yields low accuracy, and fusing the two features leads to better performance. However, the edge feature \( \text{E}_{n} \) mostly takes values between 2000 and 5000, while the color feature \( \text{K}_{n} \) is a decimal below 1; this difference in scale makes fusion difficult. We therefore fuse the two features as follows:

$$ \text{V}_{n} = (1 - \lambda)\,\frac{\text{E}_{n}}{\text{ME}} + \lambda\,\text{K}_{n} $$
(12)

The parameter ME is an empirical value, set to 3000 in this paper; it represents the mean value of \( \text{E}_{n} \) when the parking space is occupied by a car. The parameter \( \lambda \) is an adjustable weighting factor between the edge and color features.
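A minimal sketch of this fusion, assuming the weighted form in which \( \text{E}_{n} \) is normalized by ME to match the 0–1 scale of \( \text{K}_{n} \) (the names `fuse_features` and `lam` are ours):

```python
def fuse_features(e_n, k_n, me=3000.0, lam=0.6):
    """Fuse edge and colour foreground features into one score V_n.

    Assumed form: V_n = (1 - lam) * E_n / ME + lam * K_n.
    Dividing E_n (typically 2000-5000) by ME brings the edge feature
    onto roughly the same scale as K_n; lam weights the colour feature
    (0.6 in the paper's experiments).
    """
    return (1.0 - lam) * (e_n / me) + lam * k_n

# hypothetical occupied frame: E_n near ME and K_n around 0.5
v_n = fuse_features(3000, 0.5)
```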

Two thresholds, \( \text{TH}_{in}^{n} \) and \( \text{TH}_{out}^{n} \), are used to judge the entrance and departure of a car: when \( \text{V}_{n} \) exceeds \( \text{TH}_{in}^{n} \), a car is considered to have entered the parking space, and when \( \text{V}_{n} \) falls below \( \text{TH}_{out}^{n} \), the parked car is considered to have left. Clearly, different ROI areas and different application scenes require different thresholds, so we design an adaptive updating method for them. Since \( \text{V}_{n} \) is low when the parking space is vacant and high when it is occupied, the decision problem can be converted into a clustering-like analysis: the feature values are divided into two categories, a high-level and a low-level category, whose averages are denoted \( \text{HM}^{n} \) and \( \text{LM}^{n} \), respectively. The thresholds are calculated as below:

$$ {\text{TH}}_{\text{in}}^{\text{n}} = {\text{HM}}^{\text{n}} - \frac{{{\text{HM}}^{\text{n}} - {\text{LM}}^{\text{n}} }}{3} $$
(13)
$$ {\text{TH}}_{\text{out}}^{\text{n}} = {\text{LM}}^{\text{n}} + \frac{{{\text{HM}}^{\text{n}} - {\text{LM}}^{\text{n}} }}{3} $$
(14)

\( \text{HM}^{0} \) and \( \text{LM}^{0} \) are initialized to zero and renewed at each frame. We assign \( \text{V}_{n} \) to the high-level category if it exceeds the mean of \( \text{HM}^{n-1} \) and \( \text{LM}^{n-1} \), and the average of the high-level category becomes \( \text{HM}^{n} \); otherwise \( \text{V}_{n} \) is assigned to the low-level category, whose average becomes \( \text{LM}^{n} \). \( \text{TH}_{in}^{n} \) and \( \text{TH}_{out}^{n} \) are then updated according to the changes of \( \text{HM}^{n} \) and \( \text{LM}^{n} \).
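One step of this adaptive update can be sketched as follows; representing each category as a `(mean, count)` pair kept as a running mean is our simplification for illustration:

```python
def update_thresholds(v_n, hm, lm):
    """One step of the adaptive threshold update (Sect. 3.3), a sketch.

    hm, lm : (mean, count) pairs for the high- and low-level categories
    v_n goes to the high-level category if it exceeds the midpoint of
    HM and LM, otherwise to the low-level one; the category average is
    a running mean.  Returns (hm, lm, th_in, th_out).
    """
    mid = (hm[0] + lm[0]) / 2.0
    if v_n > mid:
        total, count = hm[0] * hm[1] + v_n, hm[1] + 1
        hm = (total / count, count)
    else:
        total, count = lm[0] * lm[1] + v_n, lm[1] + 1
        lm = (total / count, count)
    gap = (hm[0] - lm[0]) / 3.0
    return hm, lm, hm[0] - gap, lm[0] + gap   # Eqs. (13)-(14)

# hypothetical run: one occupied-level value, then one vacant-level value
hm, lm = (0.0, 0), (0.0, 0)
hm, lm, th_in, th_out = update_thresholds(0.9, hm, lm)
hm, lm, th_in, th_out = update_thresholds(0.1, hm, lm)
```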

We set ME to 3000 and \( \lambda \) to 0.6. Figure 4 shows how the thresholds change with the variation of \( \text{V}_{n} \); the horizontal axis is the frame number. The two thresholds change rapidly with \( \text{V}_{n} \) at the beginning and then stabilize, which shows that our threshold updating method works well.

Fig. 4.

Blue line records the variation of \( {\text{V}}_{\text{n}} \). The green line and red line show the change of \( {\text{TH}}_{\text{in}}^{\text{n}} \) and \( {\text{TH}}_{\text{out}}^{\text{n}} \), respectively. The parking space is vacant when the blue line is below the red line, and occupied when the blue line is above the red line.

3.4 Decision and Capture

We use the flag \( \text{flag}_{static} \) to indicate the static state of the scene, in which there are no moving vehicles.

$$ \text{flag}_{static}^{n} = \begin{cases} 1, & \sum_{k=n-5}^{n} \text{MFG}_{k} < \text{N}_{MFG} \\ 0, & \text{otherwise} \end{cases} $$
(15)

\( \text{MFG}_{n} \) counts the non-zero pixels in the ROI region of the frame difference image. The equation means that if \( \text{MFG}_{n} \) stays small over 5 consecutive frames, the parking space is in a probable static state. \( \text{N}_{MFG} \) is set to the small value 50; it hardly influences the result unless it is set too large.
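The static-state test of Eq. (15) amounts to a sliding-window sum over recent frame-difference counts; a sketch (the factory function `make_static_detector` is our naming):

```python
from collections import deque

def make_static_detector(n_mfg=50, window=5):
    """Static-state flag from frame differences (Eq. 15), a sketch.

    Feed in MFG_n, the non-zero pixel count of the ROI difference
    image; flag_static^n is 1 only once `window` frames are available
    and their sum stays below N_MFG, i.e. nothing has moved recently.
    """
    history = deque(maxlen=window)

    def step(mfg_n):
        history.append(mfg_n)
        return 1 if (len(history) == window and sum(history) < n_mfg) else 0

    return step

# hypothetical sequence: five quiet frames, then sudden motion
detect = make_static_detector()
flags = [detect(m) for m in [0, 0, 0, 0, 0, 100]]
```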

The flag \( \text{flag}_{in}^{n} \) determines whether the parking space is vacant; the space is regarded as occupied when the flag is one. When \( \text{V}_{n} \) lies between \( \text{Th}_{out} \) and \( \text{Th}_{in} \), the car can be in one of two states: either it is entering the parking space while \( \text{flag}_{in}^{n} \) is false, or it is departing while \( \text{flag}_{in}^{n} \) is true. In both states the value of \( \text{flag}_{in}^{n} \) remains unchanged.

$$ \text{flag}_{in}^{n} = \begin{cases} 1, & \text{V}_{n} > \text{Th}_{in} \\ 0, & \text{V}_{n} < \text{Th}_{out} \\ \text{flag}_{in}^{n-1}, & \text{otherwise} \end{cases} $$
(16)
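Eq. (16) is a simple hysteresis rule and can be sketched directly (function name ours):

```python
def update_flag_in(v_n, flag_prev, th_in, th_out):
    """Hysteresis decision for space occupancy (Eq. 16), a sketch.

    Between the two thresholds the previous state is kept, so a car
    half-way into (or out of) the space does not toggle the flag.
    """
    if v_n > th_in:
        return 1
    if v_n < th_out:
        return 0
    return flag_prev
```

With hypothetical thresholds `th_in = 0.6` and `th_out = 0.3`, a fused value of 0.5 keeps whatever state the space was last in.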

We capture the moments when vehicles enter and leave a parking space. Since the captured pictures are passed to the license plate recognition system, we should save a picture only when the car is completely inside the parking space; the moving moments of the car are not needed. We therefore save an entrance picture when \( \text{flag}_{in}^{n} \) and \( \text{flag}_{static}^{n} \) both equal 1. To ensure that a complete license plate appears in the captured pictures, we save three entrance pictures for each car, taken at intervals of 10 frames; a mark is set after the third capture so that no further images are taken. When \( \text{flag}_{in}^{n} \) changes from 1 to 0, we capture the departure picture and clear the mark. Keeping the number of pictures small saves storage space and reduces the computation required by the subsequent processing system.

4 Experiment Results

For the initial experiments, one camera monitors two parking spaces. We recorded four videos in different parking lots; each video spans up to one week and thus includes different lighting conditions.

We set ME to 3000 and \( \lambda \) to 0.6. Table 2 lists the feature fusion results and decision results for four frames chosen from video 1. The letter O means occupied and V means vacant; R means the decision is right and F means it is false. Table 2 illustrates the feature fusion results and shows how the foreground features change in scale.

Table 2. Results of feature fusion and detection

To evaluate the performance of the proposed method, we use the decision accuracy, false alarm rate, miss alarm rate, and correctly captured rate. A car is correctly captured when both its entrance and departure moments are captured and a complete license plate is visible in at least one of the photos. An accurate decision means the real state of the parking space is obtained. A false alarm means the parking space is wrongly judged as occupied when it is in fact vacant, while a miss means the space is judged as vacant when it is actually occupied. Since our system must capture the entrance and departure moments of each car, the correctly captured rate is the main evaluation parameter.

In fact, ME is an empirical value that is highly correlated with the monitoring scene; it can be obtained by offline training, i.e., by computing in advance the mean value of \( \text{E}_{n} \) when the parking space is occupied. The weighting factor \( \lambda \) is adjustable and affects the performance more strongly, so we measured the correctly captured rates for different settings of \( \lambda \); the statistics are shown in Table 3. Setting \( \lambda \) to 0 means that only the edge feature is used for detection, which yields the lowest accuracy, since cars with few edges can be missed by the detection system; cooperation with the color feature improves the performance. On the other hand, the values of \( \text{K}_{n} \) vary widely for vehicles of different colors, as shown in Table 1, so an overly high value of \( \lambda \) also reduces the detection accuracy. The system performed best when \( \lambda \) was set to 0.6 in our experiments. With \( \lambda = 0.6 \), the performance of our method is listed in Table 4.

Table 3. Correctly captured rates with different values of \( \lambda \)
Table 4. Statistical data of the performance of our method

Many articles focus on the detection of vacant parking spaces, but few of them involve the capture of vehicles, so the decision accuracy can be used for comparison with other methods. However, in the absence of a common database, comparing different works in the literature is not straightforward. As an alternative, we summarize some recent works in Table 5; the value listed there for our method is the average of the four decision accuracy values. According to these results, our multi-feature based method is an effective alternative for the vacant parking space detection problem.

Table 5. Summary of related work reported in the literature

5 Conclusions

In this paper, we proposed a solution for parking lot management that monitors every parking space. The system finds vacant parking spaces and captures the moments when a car enters or departs from a space. Our main contributions are: (i) a multi-feature background model and foreground feature extraction method for the detection of vacant parking spaces; (ii) two thresholds that help determine whether a car occupies the parking space, together with an adaptive updating method for them that removes the trouble of parameter tuning; (iii) the use of the adjacent difference image to estimate the appropriate capture time. The proposed method proved highly accurate and robust across different monitoring scenes. However, in some special conditions the car's license plate could not be captured, and the proposed method mainly targets indoor parking lots where one camera monitors only a few parking spaces. These aspects can be improved in future work.