1 Introduction

Sensor networks are an effective means of collecting data from large areas, and many potential applications can be found in the literature [1–4]. Sensor networks can also be applied effectively to urban traffic. Traffic monitoring systems support the management of traffic flow, help to increase the throughput and safety of transportation, and provide valuable information for planning future road developments. Typically, detection of moving vehicles is done with inductive loops, passive infrared sensors, magnetometers, microphones, radars or microwave sensors [5–8]. High-resolution cameras connected to the monitoring center with high-bandwidth cables or fiber optic links are also often used, but collecting data from remote sensors in this way is usually expensive, both in installation and in operation.

In recent years, automatic video detection has been gaining popularity in the literature, and many algorithms and systems have been proposed for this purpose. Unfortunately, most of them work with high-resolution images and require significant computing power, which is not suitable for low-power sensor networks. The paper [9] contains a survey of various image detection algorithms, from simple background subtraction to optical flow and stereo vision methods. In [10], the authors present an algorithm dedicated to existing street cameras, which uses background subtraction with a Bayesian Network Classifier. Background subtraction is also used in the algorithm described in [11, 12], where pixels are modeled as a mixture of Gaussians and the pixel model is learned using the EM algorithm. A mixture of Gaussians for low-level tracking is also proposed in [13], while high-level position estimation is done using the Kalman filter. The PC-based system described in [14] utilizes probability-based foreground extraction, with color image processing and shadow removal. Two separate algorithms for day and night operation are proposed in [15]: spatio-temporal analysis is used during daytime, morphological analysis of headlights is used at night, and the collected information is processed with a forward chaining rule production system. Another approach is described in [16], where the outlines of the detected cars are represented as quadratic B-spline curves. A 3-D model consisting of edge segments is matched with 2-D image segments in [17]. In [18], an adaptive background estimation on an image divided into non-overlapping blocks is described, with a PC-based implementation capable of processing 15 fps. In [19, 20], FPGA-based implementations are proposed, where regions of interest are defined and analyzed in the observed image. A self-contained computer system running Linux with an attached camera is described in [21], where the authors demonstrate the application of a simple background detection algorithm. Traffic detection based on frame differencing, background subtraction and virtual induction coils is presented in [22]. An evaluation of camera detection systems and considerations on optimal camera placement are given in [23]. Other video detection systems use a statistical approach based on principal component analysis and independent component analysis [24], a multilayer feed-forward neural network [25], a support vector machine [26] or symmetry-based vehicle detection [27, 28].

The motivation of the presented work was to propose a low-cost method (in terms of hardware and maintenance costs) of measuring city traffic. A set of small, low-power, autonomous devices (i.e. sensor network nodes), taken out of the box and attached to street lamp-poles, is able to evaluate the traffic using built-in cameras and to transmit traffic data to a central computer over a self-organized radio network (even ad-hoc installation is possible in emergency situations). Data collected by the network can be used for many purposes, such as informing citizens via a local radio station or the internet, detecting extraordinary traffic events, guiding emergency vehicles to reach their destination more quickly, or interacting with the traffic lights to improve the traffic flow, resulting in fewer traffic jams and less environmental pollution. Compared to the standard vehicle detection method based on inductive loops, small sensor network nodes offer many benefits: simple installation, no wired connections, independent radio communication, the ability to estimate the speed, direction and size of the vehicles, and the capture and transmission of single images to the monitoring center.

In this paper, a sensor network for monitoring vehicle traffic on city streets is described. The design target is a low-power sensor network node capable of estimating the traffic flow, thanks to carefully developed video detection algorithms and proper hardware/software co-design. The node is designed to be installed on street lamp-poles so that it observes the scene from high above the road. Analyzing the video stream locally at each node significantly reduces the amount of data that needs to be transmitted over the radio, further decreasing energy usage. The nodes are capable of detecting car traffic; additionally, single images from the nodes' cameras can be sent, informing the operator, for example, about snow on the street.

The proposed sensor network node (Fig. 1) uses a typical low-cost camera, thus installation near a street lamp is mandatory to enable observation at night.

Figure 1. A conceptual diagram of the sensor network node.

The sensor network can be installed quickly and simply, without the need for expensive cabling or permissions, since the nodes have a built-in, low-power radio working in a license-free frequency band (see Fig. 2) and they can be powered from a solar panel.

Figure 2. An overview of the sensor network application for collecting traffic data.

The layout of the paper is as follows: in Section 2, the authors present the algorithm and realization of the image processing for moving object detection. Section 3 describes the protocol and organization of the radio communication network. The details of the realization of the prototype nodes are given in Section 4. The results of the sensor network operation and the conclusions are presented in Sections 5 and 6, respectively.

2 Moving Object Detection

Monitoring traffic with a camera is a challenging task. It requires an image-processing algorithm capable of detecting cars under variable light conditions. Implementation of such an algorithm in a sensor network node is further constrained by the limited resources of the node. The sensor network node is usually a low-power device built from hardware as simple as possible, to decrease the size and the power consumption. For this reason, careful design of an image-processing algorithm that fits into those limited hardware resources is very important. In the presented solution, a monochrome camera is used to reduce the complexity of the hardware by about 3 times, at the cost of decreased segmentation sensitivity; additionally, the image resolution has been decreased to 128 × 128 pixels.

2.1 Low-Level Image Processing

In the presented implementation, a non-model-based approach for moving object detection with background subtraction is used, where each current frame from the camera is subtracted from the background image stored in memory. Among the algorithms mentioned in Section 1, background subtraction is the easiest to implement in hardware, but in its basic form it does not provide detection quality acceptable for outdoor vehicle detection. The complete block diagram of the image-processing algorithm is shown in Fig. 3. A detailed description of the low-level image segmentation algorithm can be found in [29], where the earlier work of the authors is presented.

Figure 3. General diagram depicting the idea of the proposed image-processing algorithm.

The authors decided to use two background models concurrently [30–33]: a long-term model with non-selective update and a short-term model with selective update. In this way, stopped objects are initially not included in the selective background, but after a while they become part of the non-selective background. Once a stopped object is part of the non-selective background, it is no longer detected, so it can quickly be included in the selective background as well.

Both background models assume a single Gaussian distribution of pixel intensities at time t, with the average brightness μ_t and the standard deviation σ_t updated as running estimates, which can be realized very efficiently in hardware. A pixel with intensity I_t is classified as moving when:

$$ \left| I_t - \mu_t \right| > k \cdot \sigma_t $$
(1)

The detection results of inequality (1), in the form of the masks m_N and m_S from the non-selective and selective background models, respectively, are combined into a single binary mask m_B using a special combination of AND and OR operations, similarly to the approach described in [32]. To simplify the hardware, instead of the eight neighboring pixels, only the current pixel and the four previously analyzed pixels are used. To improve the segmentation quality, two edge detection blocks have been introduced: temporal edge detection and spatial edge detection, producing the masks m_ET and m_ES, respectively. To further increase the selectivity of the object detection, several additional blocks have been added, such as a shadow detection block (mask m_SH), a highlight detection block detecting highlights from car lights at night (mask m_HI) and a block detecting very dark pixels on a bright background (mask m_X). The basic detection of shadows is done by comparing the decrease in brightness [34]. The Final Processing block in Fig. 3 performs basic morphological operations (erosion followed by dilation, using a 2 × 2 rectangular structuring element) to condition the final mask m_BEHSX. The final detection result is processed by the Hough-transform-based block, which helps to obtain convex, filled blobs representing the detected objects. The Hough transform with the rectangular structuring element improves the shape of the blobs (Fig. 4). The mask m_V is the final vehicle detection mask, with elements equal to 0 or 1, where 1 denotes the pixels belonging to the detected moving objects.
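
As a concrete illustration, the following minimal sketch shows how the per-pixel test of inequality (1) and the running updates of the two background models can be expressed in software. The constants ALPHA and K, the floating-point arithmetic and the function names are assumptions of this sketch; the node itself uses fixed-point hardware.

// Minimal software sketch of the detection test of Eq. (1) and the
// running update of the dual background models (assumed update rule).
#include <cmath>

struct PixelModel {
    float mu    = 0.0f;  // running mean of brightness (mu_t)
    float sigma = 1.0f;  // running estimate of the standard deviation (sigma_t)
};

constexpr float ALPHA = 0.05f;  // update rate (illustrative assumption)
constexpr float K     = 2.5f;   // threshold multiplier k in Eq. (1) (assumed)

// Returns true when the pixel is foreground: |I_t - mu_t| > k * sigma_t.
bool isForeground(const PixelModel& m, float intensity) {
    return std::fabs(intensity - m.mu) > K * m.sigma;
}

// Core running update of the Gaussian parameters for one pixel.
void runningUpdate(PixelModel& m, float intensity) {
    m.mu    += ALPHA * (intensity - m.mu);
    m.sigma += ALPHA * (std::fabs(intensity - m.mu) - m.sigma);
}

// Long-term model: non-selective, updated for every pixel.
// Short-term model: selective, updated only where no motion was detected.
void updatePixel(PixelModel& longTerm, PixelModel& shortTerm,
                 float intensity, bool foreground) {
    runningUpdate(longTerm, intensity);
    if (!foreground) runningUpdate(shortTerm, intensity);
}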

Figure 4. An example of the operation of the Hough transform block: (a) the image m_BEHSX before the Hough block, (b) the image m_V after the Hough block.

All the image-processing operations shown in Fig. 3 have been selected and adapted for efficient hardware implementation. The details of the pipelined implementation of the algorithm from Fig. 3, as well as the segmentation results, can be found in [29]. In this paper, instead of a typical pipelined hardware design, a set of concurrently working, dedicated Pixel Processors (PP) has been developed, to enable modifications of the algorithm in the ASIC implementation. The PPs have been designed as scalable modules in VHDL: 16-bit, 8-bit and 1-bit versions are provided. The system contains nine PPs, as shown in Fig. 5. The processors are connected to a common data bus via a programmable switch matrix, where the bus carries the data of the currently processed pixel. Each processor can read any of the 128 data bits, but it can write only to its exclusively assigned selection of bits, except for the 1-bit processors, which can write to every bit of the bus. The 1-bit processors have a built-in memory for caching the previously analyzed pixels; data from this memory are used to perform simplified morphological operations that require information about the neighboring pixels, such as erosion or dilation. The processing tasks have been divided among the processors as listed in Table 1.

Figure 5. Block diagram of the image-processing hardware.

Table 1 The tasks assigned to the PPs.

The camera can view the street from various angles. To estimate the speed and the size of the moving objects, the image must be geometrically transformed. An example of the transformation is shown in Fig. 6, presenting the input image before and after the transformation. In the real system, the mask m_V is transformed into the mask m_TF. The image transformation is realized by a hardware block equipped with a memory containing the image-mapping coordinates.
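
The following sketch illustrates the remapping principle: since the mapping is fixed for a given camera placement, only one memory lookup per output pixel is needed at run time. The MapEntry layout and the off-line map generation are assumptions made for illustration.

// Sketch of the perspective-cancelling remap via a precomputed coordinate
// map, as done by the hardware block. The map itself is assumed to be
// generated off-line at installation time (e.g. from a homography).
#include <array>
#include <cstdint>

constexpr int W = 128, H = 128;  // image resolution used by the node

struct MapEntry { uint8_t srcX, srcY; };  // source pixel for each output pixel

// mV is the detected-vehicle mask, mTF the transformed mask (cf. Fig. 6).
void remap(const std::array<uint8_t, W * H>& mV,
           const std::array<MapEntry, W * H>& map,
           std::array<uint8_t, W * H>& mTF) {
    for (int i = 0; i < W * H; ++i)
        mTF[i] = mV[map[i].srcY * W + map[i].srcX];
}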

Figure 6. An example of the input image before (a) and after (b) the geometrical transformation canceling the perspective.

The detected blobs are labeled and their basic parameters are calculated: the object's boundaries, its center and its area in pixels. The labeling and the calculation of the object parameters are performed by the selected PPs during a pixel-by-pixel pass over the image, as sketched below.
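
For illustration, a plain software equivalent of such a labeling pass is sketched here, using a classic two-pass connected-component algorithm with union-find; the streaming hardware realization differs, but it extracts the same per-blob parameters.

// Two-pass connected-component labeling with union-find, accumulating each
// blob's bounding box, centroid sums and pixel area (software illustration).
#include <algorithm>
#include <cstdint>
#include <vector>

struct Blob { int minX, minY, maxX, maxY; long sumX, sumY; int area; };

int findRoot(std::vector<int>& parent, int x) {
    while (parent[x] != x) x = parent[x] = parent[parent[x]];  // path halving
    return x;
}

std::vector<Blob> label(const std::vector<uint8_t>& mask, int w, int h) {
    std::vector<int> lbl(w * h, 0), parent{0};
    // First pass: assign provisional labels, record merges.
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            if (!mask[y * w + x]) continue;
            int left = x > 0 ? lbl[y * w + x - 1] : 0;
            int up   = y > 0 ? lbl[(y - 1) * w + x] : 0;
            if (!left && !up) {                       // start a new label
                parent.push_back(static_cast<int>(parent.size()));
                lbl[y * w + x] = static_cast<int>(parent.size()) - 1;
            } else {                                  // reuse, merge if needed
                lbl[y * w + x] = std::max(left, up);
                if (left && up)
                    parent[findRoot(parent, left)] = findRoot(parent, up);
            }
        }
    // Second pass: resolve labels and accumulate per-blob statistics.
    std::vector<Blob> blobs;
    std::vector<int> slot(parent.size(), -1);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            int l = lbl[y * w + x];
            if (!l) continue;
            int r = findRoot(parent, l);
            if (slot[r] < 0) {
                slot[r] = static_cast<int>(blobs.size());
                blobs.push_back({x, y, x, y, 0, 0, 0});
            }
            Blob& b = blobs[slot[r]];
            b.minX = std::min(b.minX, x); b.maxX = std::max(b.maxX, x);
            b.minY = std::min(b.minY, y); b.maxY = std::max(b.maxY, y);
            b.sumX += x; b.sumY += y; ++b.area;
        }
    return blobs;  // centroid of blob b is (sumX/area, sumY/area)
}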

2.2 High-Level Tracking

The detection results from the PPs, in the form of a table containing the basic parameters of the blobs, are passed to the main 32-bit processor of the system, where the objects are tracked, classified and measured. The detected blobs are usually noisy and inaccurate, i.e. they can change shape, disappear, split or join with other blobs. The main processor also services the radio network, and it has to extract as much information as possible to infer the moving cars; the limited computing resources did not allow a Kalman filter approach in real time, therefore a simplified method has been used. For that purpose, a set of if-then heuristic rules has been applied, working on simple rectangular models of the blobs. In every picture frame, each detected blob is modeled as a rectangle B (denoted blob B in the remainder of the text) with the assigned attributes shown in Table 2.

Table 2 The attributes assigned to each detected blob B.

For simplicity, all the average values are calculated as running means and the movement path is approximated with a straight line. The overlap of blobs in the current and the previous frame is detected by the PP, and two such blobs are treated as representing the same moving object, ensuring the continuity of the blob's rectangle B. If the blob B disappears, a virtual model M is created, continuing the movement of the blob B, using the last known values of the size, direction and speed of the disappeared blob B. The attributes of the model M are listed in Table 3.

Table 3 The attributes assigned to each model M.

A simple example of blob processing is shown in Fig. 7, where at time t a new blob is detected and marked B_t (its speed \( \overline{V}_{B,t} \) and direction \( \overline{\alpha}_{B,t} \) are unknown at this time). At the next time step t+1, the blob changes its position and shape but, due to its overlap with the blob from time t, it is treated as the same moving blob, so the attributes of B_t are copied to B_{t+1}; the first approximations of the speed \( \overline{V}_{B,t+1} \) and the direction \( \overline{\alpha}_{B,t+1} \) can also be calculated. At time step t+2, the blob disappears, but its existence is represented by the newly created model M_{t+2} with the following attributes:

Figure 7. An example of transforming the blob B into the model M. At time t a new blob is detected and marked B_t. At t+1 the new blob overlaps with the blob from time t, so it is considered the continuation of B_t. At time t+2 the blob disappears, but the model M_{t+2} continues the movement of B.

$$ \begin{array}{c} t_{0M} = t_{0B} \\ (X_M, Y_M) = \left( \overline{X}_{B,t+1}, \overline{Y}_{B,t+1} \right) \\ V_M = \overline{V}_{B,t+1}, \quad \alpha_M = \overline{\alpha}_{B,t+1} \end{array} $$
(2)

When the size of the blob B changes (which could mean joining with another blob or splitting), new objects are created: a new model M as a continuation of B, and a new blob (or blobs, in the case of splitting) representing the current blob. As an example, two moving blobs are shown in Fig. 8, represented by B_a and B_b at time t. In the next frame, at time t+1, the two blobs have joined together and from this moment B_a represents the connected blobs. Due to the size change of the blob B_a and the disappearance of the blob B_b, the blobs B_a and B_b from time t have been converted into the models M_d and M_c, respectively. Later, at time t+2, the blob B_a splits again into two blobs B_a and B_b; the overlapping models M_c and M_d are then absorbed by the blobs B_a and B_b, respectively, providing the continuity of their movement.

Figure 8. An example of blob tracking after blob joining.

Having a set of blobs B and models M for each frame, the main processor executes the heuristic rules to extract data about the moving vehicles. For example, if a blob and a model are close together and have similar speed and direction of movement then, depending on the situation, the model can be deleted or its position can be centered on the blob. Small models that are close to a bigger one and have similar speed and direction are deleted. Models that overlap with each other and move in a similar direction with similar speed are consolidated. The most important rules concerning the blobs and models are summarized in Table 4 and listed below:

Table 4 Overview of the situations in which a model is created and deleted. The details are presented in the form of rules in the text.

High-level tracking rules:

  1. rule #1

    If BlobSizeChange(B_{a,t}, B_{a,t+1}) ≥ 50 % then
    M_{a,t+1} = NewModelFromBlob(B_{a,t})
    B_{b,t+1} = NewBlob(B_{a,t+1})
    DeleteBlob(B_{a,t+1})

  2. rule #2

    If BlobDisappeared(B_t) then M_{t+1} = NewModelFromBlob(B_t)

  3. rule #3

    If Distance(B, M) < D_min and \( \left| V_M - \overline{V}_B \right| < V_{min} \) and AngleDifference(\( \overline{\alpha}_B, \alpha_M \)) < α_min and t_{0B} < t_{0M} then BlobAbsorbsModel(B, M)

  4. rule #4

    If Overlap(B, M) > 50 % and \( \overline{V}_B = 0 \) and V_M = 0 then DeleteModel(M)

  5. rule #5

    If SpeedDifference(V_{Ma}, V_{Mb}) < 50 % and AngleDifference(α_{Ma}, α_{Mb}) < α_min and Distance(M_a, M_b) < D_min and (S_{Ma} = small or S_{Mb} = small) then ConsolidateModels(M_a, M_b)

  6. rule #6

    If SpeedDifference(V_{Ma}, V_{Mb}) < 50 % and AngleDifference(α_{Ma}, α_{Mb}) < α_min and Overlap(M_a, M_b) < min(t - t_{0Ma}, 100) then ConsolidateModels(M_a, M_b)

  7. rule #7

    If SpeedDifference(V_{Ma}, V_{Mb}) < 50 % and AngleDifference(α_{Ma}, α_{Mb}) < α_min and (Overlap(M_a, M_b) < min(t - t_{0Ma}, 100) or Overlap(M_a, M_b) < min(t - t_{0Mb}, 100)) then ConsolidateModels(M_b, M_a)

Rules used for conversion of blob B into model M:

  1. rule #8

    If NewModelFromBlob(B) and \( \overline{V}_B \) and \( \overline{\alpha}_B \) are not known, then create a new short-term model M using the parameters of B and T_M = T_MIN.

    A short-term model is supposed to live only briefly and disappears after T_MIN frames. It is used to track "blinking" moving blobs, for which \( \overline{V}_B \) and \( \overline{\alpha}_B \) are difficult to estimate due to their short existence on the screen. T_MIN = 6 has been set experimentally for a frame rate of 32 fps.

  2. rule #9

    If NewModelFromBlob(B) and \( \overline{V}_B \) and \( \overline{\alpha}_B \) are known and \( \overline{V}_B < V_{MAX} \) and not is_human(B) and q_B > Q_MIN, then create a new model M using the parameters of B.

where:

V_MAX: the maximum allowed speed of a vehicle

D_min: the distance threshold below which two objects are considered close (assumed 4 pixels)

α_min: the angle difference threshold, assumed α_min = 45°

The function is_human() is a simple, rough estimate of whether the detected blob could be a person walking along the street (assuming the camera is situated high above the road); it is calculated as:

$$ \text{is\_human}(B) = \left\{ \begin{array}{ll} \text{true} & \text{when } \overline{V}_B < V_{\text{MIN}} \,\wedge\, 3 < \overline{Y}_B / \overline{X}_B < 6 \,\wedge\, P_{\text{HMIN}} < P_B < P_{\text{HMAX}} \\ \text{false} & \text{otherwise} \end{array} \right. $$
(3)

where:

V_MIN: the maximum speed of a walking human

P_HMIN, P_HMAX: the constants defining the minimum and maximum number of pixels, respectively, for a blob to be recognized as a person; their values are calculated to represent areas of 0.75 m² and 3 m², respectively, in the image after the geometrical transformation

Q_MIN: the constant defining the minimum allowed quality q_B of the object's movement path

The quality q_B of the movement path of the blob B is calculated according to the equations:

$$ \begin{array}{l} q_{B,t_{0B}} = 0 \\ q_{B,t} = q_{B,t-1} + 2Q - \left| x_{B,t} - x_{B,t-1} - \overline{\Delta X}_{B,t-1} \right| - \left| y_{B,t} - y_{B,t-1} - \overline{\Delta Y}_{B,t-1} \right| \end{array} $$
(4)

where:

\( \overline{\Delta X}_{B,t}, \overline{\Delta Y}_{B,t} \): the average differences between the distances traveled by the blob B at two consecutive time moments t-1 and t, for the x and y directions, respectively

Q: the acceptable change of \( \overline{\Delta X}_{B,t}, \overline{\Delta Y}_{B,t} \) between consecutive frames, assumed Q = 2

$$ \text{BlobSizeChange}\left( B_t, B_{t+1} \right) = \max\left( \frac{\left| X_{B,t} - X_{B,t+1} \right|}{\max\left( X_{B,t}, X_{B,t+1} \right)}, \frac{\left| Y_{B,t} - Y_{B,t+1} \right|}{\max\left( Y_{B,t}, Y_{B,t+1} \right)} \right) $$
(5)
Distance(B, M): the shortest distance between the outlines of the two rectangles B and M; if the rectangles overlap, the distance is 0

$$ \text{SpeedDifference}\left( V_1, V_2 \right) = \frac{\left| V_1 - V_2 \right|}{\max\left( V_1, V_2 \right)} \cdot 100\,\% $$
(6)
AngleDifference(α_1, α_2): the difference between the two angles α_1 and α_2

Overlap(B, M): the percentage of the overlapping area of the rectangles B and M with respect to the area of B

BlobAbsorbsModel(B, M): the attribute values of the model M are copied to the blob B and the model M is deleted

ConsolidateModels(M_a, M_b): the models M_a and M_b are compared and the one that has existed for a shorter time is deleted; the remaining model's speed and moving direction are set to the averages of the respective values from M_a and M_b

The rules have been presented in a simplified form to aid readability; in practice more conditions are checked, e.g. to protect against processing invalid data, division by zero, etc. A condensed software sketch of the tracking loop is given below.
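
To make the control flow concrete, the following sketch shows how a per-frame tracking step built from rules of this kind could look in software. The Track structure, the helper functions and the collapsing of rules #5 to #7 into a single loop are simplifications and assumptions of this sketch, not the node's actual firmware.

// Condensed per-frame tracking step over blobs B and models M,
// loosely following rules #2, #3 and #5-#7 from the text.
#include <algorithm>
#include <cmath>
#include <iterator>
#include <vector>

struct Track {                        // shared fields of a blob B or a model M
    float x, y, w, h;                 // rectangle position and size
    float vx, vy;                     // velocity estimate
    int   t0;                         // frame of first appearance
};

constexpr float D_MIN     = 4.0f;     // distance threshold (pixels), per the text
constexpr float ALPHA_MIN = 45.0f;    // angle threshold (degrees), per the text

float speed(const Track& t)   { return std::hypot(t.vx, t.vy); }
float heading(const Track& t) { return std::atan2(t.vy, t.vx) * 57.2958f; }

float speedDiff(const Track& a, const Track& b) {      // Eq. (6), in percent
    float m = std::max(speed(a), speed(b));
    return m > 0.0f ? std::fabs(speed(a) - speed(b)) / m * 100.0f : 0.0f;
}
float angleDiff(const Track& a, const Track& b) {
    float d = std::fabs(heading(a) - heading(b));
    return d > 180.0f ? 360.0f - d : d;
}
float rectDistance(const Track& a, const Track& b) {   // 0 when rectangles overlap
    float dx = std::max({a.x - (b.x + b.w), b.x - (a.x + a.w), 0.0f});
    float dy = std::max({a.y - (b.y + b.h), b.y - (a.y + a.h), 0.0f});
    return std::hypot(dx, dy);
}

void trackStep(std::vector<Track>& blobs, std::vector<Track>& models) {
    // Models keep moving with their last known speed and direction (rule #2).
    for (Track& m : models) { m.x += m.vx; m.y += m.vy; }

    // rule #3: a blob absorbs a nearby model with matching motion.
    for (auto m = models.begin(); m != models.end();) {
        bool absorbed = false;
        for (Track& b : blobs)
            if (rectDistance(b, *m) < D_MIN && angleDiff(b, *m) < ALPHA_MIN &&
                b.t0 < m->t0) {
                b.t0 = m->t0;          // model attributes copied to the blob
                absorbed = true;
                break;
            }
        m = absorbed ? models.erase(m) : std::next(m);
    }

    // rules #5-#7 (collapsed): consolidate nearby models moving alike.
    for (std::size_t i = 0; i < models.size(); ++i)
        for (std::size_t j = i + 1; j < models.size();)
            if (speedDiff(models[i], models[j]) < 50.0f &&
                angleDiff(models[i], models[j]) < ALPHA_MIN &&
                rectDistance(models[i], models[j]) < D_MIN)
                // Simplified: the text keeps the longer-existing model and
                // averages speed and direction; omitted here for brevity.
                models.erase(models.begin() + j);
            else ++j;
}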

The moving models which leave the picture frame are filtered, and only those considered reliable are counted. A model is considered reliable if it comes from a blob that has been moving smoothly for some time (depending on the quality q_B of its movement path), or if it has often been overlapping with blobs of similar size, speed and direction. The traffic flow data are constantly updated and the moving vehicles are classified according to their speed and direction; this information is periodically transmitted by radio to a nearby node which is closer to the data sink.

3 Data Transfer in the Sensor Network

To avoid the overheads of standard wireless communication protocols such as 802.11 or 802.15.4, the authors decided to develop and implement a proprietary protocol optimized for this application. The proposed protocol has shorter headers (Table 5); moreover, the header already contains information about the state of the node (battery state, data buffer occupancy), which neighboring nodes need for routing the traffic. Also, the data length is tailored to carry only the essential car traffic information. A simple comparison of the proposed protocol with standard 802.15.4 is shown in Table 6. The sink nodes have a simple serial interface with text commands, so connecting the network to a PC is very easy.

Table 5 The bit lengths of the transmission headers in different protocols.
Table 6 The comparison of the most important parameters of 802.15.4 and the protocol presented in this paper.

Access to the transmission medium is based on decentralized Time Division Multiple Access (TDMA). Each node is equipped with a radio transceiver module using the ISM 869.5 MHz band with a data rate of 38.4 kbps and Manchester encoding. The transmitters have a power of 500 mW (transmitter duty cycle < 10 %), which helps to provide a longer communication range in a harsh environment. The transmitter power can be remotely changed in software to 40 mW, 125 mW, 250 mW or 500 mW, if needed. The dedicated operating system of the node, running on the main 32-bit on-chip processor, controls all the operations of the node and collects data from the image-processing part and the PPs.

3.1 Transmission Time t_tx

Each node has its own unique address and a table for storing data about its neighbors. All nodes use the same transmission schedule period T_t, based on their local clocks. Each node selects its own transmission start time t_tx within the TDMA transmission period T_t such that:

$$ \forall i : \left| t_{\text{tx},i} - t_{\text{tx}} \right| > T_{\text{MIN}} $$
(7)

where t_tx,i is the transmission start time of another previously discovered node i (this can also be an indirect neighbor, i.e. a neighbor of a neighbor), and T_MIN is a constant defined globally for the network, providing the required time separation between transmissions. The times t_tx and t_tx,i are relative to the node's local time source. The time distances between all the transmission start times t_tx,i of the neighboring nodes are continuously monitored. When a node detects that condition (7) is not satisfied among its neighbors, it asks the problematic node to change its t_tx.
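
A minimal sketch of a slot-selection routine satisfying inequality (7) is given below; the linear candidate scan, the millisecond time units and the function names are assumptions of this illustration.

// Decentralized slot choice: pick a transmission offset t_tx inside the
// period T_t that stays more than T_MIN away from every known neighbor's
// offset, per inequality (7). Times are in milliseconds (assumed).
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

std::optional<uint32_t> chooseSlot(const std::vector<uint32_t>& neighborTtx,
                                   uint32_t periodTt, uint32_t Tmin) {
    for (uint32_t cand = 0; cand < periodTt; cand += Tmin) {
        bool ok = true;
        for (uint32_t t : neighborTtx) {
            // Distance on the circular schedule of length periodTt.
            uint32_t d = cand > t ? cand - t : t - cand;
            d = std::min(d, periodTt - d);
            if (d <= Tmin) { ok = false; break; }
        }
        if (ok) return cand;
    }
    return std::nullopt;  // no free slot: ask a neighbor to move (see text)
}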

3.2 Beacon

The beacon contains basic information about the node, such as the node's address, the distance of the node from the sink (measured in transmission hops), the node's battery condition, the node's data buffer occupancy and data about the node's neighbors. The beacon is transmitted at t_tx every N_B data transmission periods; a typical value of N_B is 3 to 10. Beacons are also used for establishing data links between the nodes. Each node infers the existence of indirect neighboring nodes from its neighbors' beacons, which helps to prevent the hidden terminal problem [2].

3.3 Startup and Discovery of the Neighbors

At startup, each node enters the discovery mode, in which it continuously listens to the transmitted beacons, identifies neighbors and saves the neighbors' transmission times t_tx,i. To find new nodes and accommodate network topology changes, the discovery mode is repeated periodically. This is similar to the procedure used in the S-MAC protocol [35].

There are four types of frames that can be transmitted through the network: beacon, data, acknowledge and configuration frames. All frames have a 64-bit header of similar structure. The beacon frame may contain information needed by the neighboring nodes for organizing the network (Table 7). The data frame carries encrypted 128-bit packets (PKT) with car traffic information (Table 8), which can be aggregated from several nodes and transmitted together in a single frame. The acknowledge frame, transmitted just after the successful reception of a data frame, confirms the reception of the data, so the transmitting node can free its buffer.

Table 7 The general structure of the frame.
Table 8 The structure of the single 128-bit payload packet PKT.
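
As an illustration of the frame layout described above, a hypothetical C++ rendering of the 64-bit header and the 128-bit PKT is sketched below; the exact field widths and ordering are defined by Tables 7 and 8, which this sketch approximates rather than reproduces.

// Hypothetical layout of the 64-bit frame header and the 128-bit payload
// packet (PKT), guided by the fields named in the text. The real bit
// allocation is given in Tables 7 and 8.
#include <cstdint>

#pragma pack(push, 1)
struct FrameHeader {            // 64 bits total
    uint16_t dstAddr;           // addressee node
    uint16_t srcAddr;           // transmitting node
    uint8_t  typeAndFlags;      // beacon / data / acknowledge / configuration
    uint8_t  nodeState;         // battery level, buffer occupancy (packed)
    uint16_t crc16;             // header CRC, checked before the payload
};

struct TrafficPkt {             // 128-bit encrypted payload packet
    uint32_t timestamp;         // node's synchronized real-time clock
    uint16_t originAddr;        // node that produced the measurement
    uint16_t vehicleCounts[4];  // per-direction counts (assumed split)
    uint16_t flags;             // remaining bits: reserved / sensor readings
};
#pragma pack(pop)

static_assert(sizeof(FrameHeader) == 8,  "header must be 64 bits");
static_assert(sizeof(TrafficPkt)  == 16, "PKT must be 128 bits");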

Each node switches on its receiver at time t_tx,i to listen to the possible transmission of the i-th neighbor. If the receiving node is not the addressee of the data transmission, it switches off its receiver just after receiving the header, in order to avoid overhearing.

Car traffic data in the sensor network are regularly transmitted from the nodes to the data sinks using data frames. To provide the ability to send information in the reverse direction, i.e. from the sink to the distributed nodes, the configuration frame is used. The configuration frame is transmitted without acknowledgement. Each node, after receiving a new configuration packet, simply retransmits it several times. This simple mechanism puts a heavy load on the radio links, but it is used rarely and only for service purposes, such as configuring remote nodes' parameters, or updating the firmware (FPGA nodes only) or the software.

Each data packet (PKT) contains a time stamp coming from the node's real-time clock. The time stamp is used for checking the validity of data at the host computer. The sensor network utilizes a global real-time clock synchronization scheme: the sync packets are periodically generated by the main sink node and distributed through the network by each node using the beacons.

3.4 Data Protection

Each packet header is protected with a 16-bit CRC, so the receiver can quickly verify the integrity of the header and the addressee of the packet without receiving the whole packet. Additionally, the payload has its own 32-bit CRC.
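
The following sketch illustrates this early-abort logic. It assumes the 16-bit CRC covers the first six header bytes and is stored in the last two, and it uses the common CRC-16-CCITT polynomial; the protocol's actual polynomial and layout are not stated in the text.

// Early-abort reception: validate the header CRC and the addressee before
// the payload arrives, so non-addressees can power the receiver down early.
#include <cstddef>
#include <cstdint>

// Bitwise CRC-16-CCITT (polynomial 0x1021, initial value 0xFFFF).
uint16_t crc16ccitt(const uint8_t* data, std::size_t len) {
    uint16_t crc = 0xFFFF;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= static_cast<uint16_t>(data[i]) << 8;
        for (int b = 0; b < 8; ++b)
            crc = (crc & 0x8000) ? static_cast<uint16_t>((crc << 1) ^ 0x1021)
                                 : static_cast<uint16_t>(crc << 1);
    }
    return crc;
}

// Returns true if the remainder of the frame should be received.
bool acceptHeader(const uint8_t hdr[8], uint16_t myAddr) {
    uint16_t stored = static_cast<uint16_t>(hdr[6]) << 8 | hdr[7];
    if (crc16ccitt(hdr, 6) != stored)
        return false;                       // corrupted header: stop early
    uint16_t dst = static_cast<uint16_t>(hdr[0]) << 8 | hdr[1];
    return dst == myAddr;                   // overhearing avoidance
}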

The header is transmitted in plain text, while the payload (PKT_DATA) is encrypted using AES with a symmetric key. Hardware encryption has better energy efficiency (measured in energy per bit) than software encryption, thus each node has a hardware encryption block. Decryption is used only occasionally by the nodes (except the sink nodes), so it has been implemented in software.

Three prioritized AES keys are used: KEY1 and KEY2 are the same for all nodes, while KEY3 is different for each node. During regular transmission, KEY1 is used. Changing KEY1 requires KEY2. At regular time intervals, the sink node commands a change of KEY1 (protected by KEY2). KEY3 enables changes to KEY1 and KEY2 and should be used in case of a node's takeover.

4 Hardware Realization of the Sensor Network

4.1 Block Diagram of the Sensor Network Nodes

The block diagram of the node’s hardware, common for FPGA and ASIC prototype, is presented in Fig. 9. The main part of the hardware has been integrated into a Sensor Network Processing Module (SNPM), containing the custom microelectronic system with 32-bit processor BA12 from Beyond Semiconductor (the same class as Arm’s ARM9™) and the peripherals connected to the Wishbone bus. The hardware moving object detection system, described in section II, has also been integrated, together with the additional hardware blocks providing quick AES encryption and control of the low level operations of the transceiver (Fig. 10).

Figure 9. Block diagram of the prototype node.

Figure 10. Block diagram of the Sensor Network Processing Module.

4.2 Hardware Implementation

The first prototype sensor network node has been developed using a Xilinx XC4VLX60 FPGA. The main board contains all the elements from Fig. 10, with the MT9V111 camera from Micron and the ARF29 transceiver from Adeunis. The power supply has been implemented on additional boards using a 12 V, 6 Ah sealed lead-acid gel battery. The resources used in the FPGA implementation are listed in Table 9, and pictures of the FPGA prototypes are shown in Fig. 11.

Table 9 The resources used in the FPGA implementation using Xilinx Virtex-4 XC4VLX60.
Figure 11. Prototype sensor network node with FPGA: (a) picture of the node, (b) two test nodes installed on a street lamp-pole.

After the successful startup with the FPGA, the ASIC has been designed in a 130 nm UMC CMOS process and manufactured through Europractice (Fig. 12), using the RTL code from the FPGA prototype. The FPGA and the ASIC provide almost the same functionality. The ASIC has been designed using Cadence SoC software with the Faraday L130FSG (LP and HS) library of digital gates. The code, consisting of ~95 000 lines of VHDL and Verilog, has been synthesized using Cadence RTL Compiler with clock gating optimization. DFT flip-flops and a JTAG controller have been added to enable future tests. For the implementation, Cadence SoC Encounter GXL 6.2 has been used, with crosstalk and signal integrity analysis and power supply analysis (electromigration, IR drop). The parameters of the designed ASIC are listed in Table 10. Comparing Tables 9 and 10, a large difference in the number of flip-flops can be observed. The reason is that small memories and shift registers in the FPGA are implemented using LUTs, while the ASIC implementation uses flip-flops for that purpose. Moreover, in the FPGA all the memory is composed of 18 Kbit block RAMs, resulting in wasted bits, whereas each memory block in the ASIC is designed to exactly fit the required size.

Figure 12. A photo of the manufactured ASIC, realized in the 130 nm UMC process and packaged in a CQFP208 package.

Table 10 The resources used in ASIC implementation using Faraday library and UMC 130 nm CMOS process.

The manufactured ASIC has been used to build the low-power version of the node shown in Fig. 13, which works with a single-cell 3.7 V, 3.5 Ah Li-Ion battery and can be supplied by a solar panel with an area of 0.5 m² (50 W peak power).

Figure 13. Photo of the ASIC version of the node.

The set of nodes has been installed on the street lamp-poles of several streets, as shown in Fig. 14. The detailed deployment plan is presented in the next section of the paper. The nodes run on their batteries during the day; at night the batteries are charged from the lamps' power supply. The ASIC node consumes less power than its FPGA counterpart and can therefore work from a solar panel instead of the lamp's power supply.

Figure 14. Photographs of the test nodes installed on the street lamp-poles: (a) a set of nodes along the street, (b) two nodes with FPGA, (c) the node with the ASIC from Fig. 13 installed in a typical housing of an industrial security camera.

5 Simulation and Test Results

5.1 Object Detection

The vehicles detected by the system have been verified on-line by a human operator. The road selected for the traffic detection tests exhibited various traffic conditions, from traffic jams to speeding vehicles. The sensor network node installed above the road was detecting and counting passing vehicles, while the human operator was concurrently counting the cars manually. A comparison of the detection results for 100 cars under various conditions is presented in Table 11. More results, including the object segmentation simulations and videos, are available on-line at http://www.ue.eti.pg.gda.pl/sn.

Table 11 Evaluation of vehicle detection rate for the sensor network node for various scenes. As the reference, 100 cars were counted by the human operator in each test.

As can be seen from Table 11, the image detection system recognized 63–93 % of the moving vehicles registered by the human operator. There were typically 1–9 % additional false detections of non-existent vehicles. On a sunny day, the most important problem is shadows, which join blobs from different cars; the basic shadow detection used in this system is not able to detect all the shadows. On a cloudy day, most errors come from dark cars whose gray level is similar to the color of the road; in this situation, the edge detection blocks are very helpful in increasing the rate of detected pixels. The detection quality is significantly lower at night, when mostly the car lights are detected. The achieved accuracy is the result of the compromises made during the design, such as the processing of monochrome, low-resolution images and a 4-bit representation of pixel values, made to obtain low-power operation and simpler hardware. The detection rates are satisfactory for collecting statistical data on average traffic flow and obtaining a general picture of the traffic. However, the authors feel that improving the accuracy should be the next step in the further development of the presented idea.

5.2 Network Data Transmission

The sensor network has been simulated to estimate its transmission parameters. The proposed sensor network is based on a non-standard communication protocol, and the use of popular simulation packages such as ns-2 would require creating an exact communication model of the node. Instead, the authors decided to embed an exact copy of the node's software into a custom simulator written in C++. In this way there is no need to create a separate simulation model of the node, and the simulation results are very close to reality.

In the simulator, ideal radio links have been assumed, so all transmissions are successful. It is assumed that each node regularly generates on average a 9 bps data stream containing information about the car traffic and basic node status, such as battery condition or the node's neighbors, while the maximal throughput for data packets on a single radio link is 620 bps. Other parameters are listed in Table 12.
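
As a rough capacity check under these assumptions, a single 620 bps link can relay the 9 bps generated by at most about 68 nodes (620/9 ≈ 69), so link throughput is not the limiting factor for a deployment of the size described in Section 4.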

Table 12 Parameters of the proposed radio protocol.

The two basic network configurations of Fig. 15 have been simulated. For the network configuration shown in Fig. 15(a), the delay and the radio activity have been analyzed. Both source nodes were generating test messages consisting of a single 128-bit packet. The total average delay of packets was 2.7 s when the source nodes generated packets every t_gen = 0.5–2 s (Fig. 16). The activity of the radio receiver and transmitter is shown in Fig. 17(a) and (b), respectively. The radio activity is highest for the middle node, as it has the largest number of neighbors. The average delay from node 1 to node n in the network configuration from Fig. 15(b) is shown in Fig. 18. As can be seen, the average packet delay per single hop in this configuration is almost constant and equal to approximately 1 s.

Figure 15. Test configurations of the sensor network.

Figure 16. Simulation results: average packet delay for the network configuration from Fig. 15(a).

Figure 17. Simulation results: activity of the radio receiver (a) and the radio transmitter (b) for the network configuration from Fig. 15(a).

Figure 18. Simulation results: average packet delay for the network configuration from Fig. 15(b) for t_gen = 1 s.

The effective radio range of the transceivers installed in a real environment was about 900 m with direct visibility between the antennas. Direct visibility is easily achieved when the nodes are mounted on street lamp-poles. After the installation of the nodes, the network organized itself and transmitted data correctly to the sink, in the same way as during the simulation, except for sporadically occurring radio interference. The nodes have large data buffers for collecting unsent data (for at least 30 minutes); in the case of transmission problems, data are retransmitted at a later time. All the transmission times have been verified in practice and conform to the simulations. Apart from regular data about the car traffic, the operator can ask a node to send a single frame of the picture (Fig. 19); uploading new firmware or software to a selected node, or to all the nodes, is also possible.

Figure 19. A picture uploaded from a node.

5.3 Power Consumption

The comparison of the power consumption of the FPGA and ASIC prototypes is shown in Fig. 20. As could be expected, the main power savings are in the core of the ASIC version. The other blocks (SDRAM, camera, I/O) also have slightly lower power consumption in the ASIC prototype, but only because they work with a lower supply voltage (3 V instead of the 3.3 V used in the FPGA prototype).

Figure 20. Power consumed by the main blocks of the sensor network node prototypes during typical operation (detecting traffic and exchanging data with 5 neighbors). The FPGA prototype works with the following supply voltages: 1.2 V and 2.5 V for the core, 3.3 V for the SNPM I/O, camera and SDRAM, and 3 V for the transceiver. The ASIC prototype uses 1.2 V for the core and 3 V for all other blocks. Power losses in the power supply and battery-charging circuitry are not included.

5.4 The Sensor Network Installation

The sensor network, consisting of 26 nodes (22 FPGA-based and 4 ASIC-based), has been tested in real conditions. Data from the remote nodes are transferred to the sink node. The sink node is connected via RS-232 to a PC, which plays the role of the operator's console, where the collected data are visualized. The map of the installation is shown in Fig. 21; the installed nodes are configured to measure only the traffic along the street, as indicated by the arrows on the map. For the deployment of the nodes, the following criteria have been chosen: the vicinity of the university, the possibility to test multi-hop radio transmission, the presence of one-way and two-way streets, and varying densities of node placement.

Figure 21. The locations of the sensor network nodes (black dots) on the streets. The arrows indicate the directions of the measured traffic.

Each node accumulates information about the traffic and transmits it every 60 s. Finally, this information is presented at the console as simple histograms (Fig. 22), showing the number of vehicles detected for each direction. Each node maintains statistics about the average tracks of the moving vehicles, so standing vehicles, for which the moving direction cannot be determined, can be classified according to their location on the street. The project had a strictly scientific purpose, therefore the collected data were used only by the researchers; control of the traffic lights was not introduced at this stage.

Figure 22. Histograms representing the traffic detected by a node.

Apart from the current state of the traffic, the traffic history for each node and each direction can be displayed in the form of a time graph. The operator can also configure single parameters of each node, reset a node, update its software, or download logs or single camera pictures; however, these operations work slowly due to the low throughput of the radio network. Each node is also equipped with an on-board temperature sensor and A/D converters for measuring the solar panel and battery voltages and currents. The proposed network protocol has dedicated data slots for transmitting 8-bit values of the measured temperature, noise, sun intensity and the node's internal battery state.

6 Conclusions

This work represents an attempt to solve the problem of measuring street traffic using a sensor network. For the purpose of video detection, dedicated multi-processor hardware and algorithms have been developed. The system has been realized as a relatively low-power device, with low hardware and software resource usage.

The video detection of the vehicles is realized using low-resolution images and simplified algorithms, thus its accuracy is not high, but it is appropriate for a rough estimation of the traffic flow. The reduction of the video data from 8 to 4 bits per pixel resulted in decreased sensitivity in low-light scenes.

The sensor network nodes acquire data about car traffic and environmental parameters, such as the temperature or the voltage from the solar panel, and transmit them by radio, in compressed form, to a node closer to the sink. Other measured parameters, such as the acoustic noise level, can easily be added, giving the possibility to monitor noise across the city. For that purpose, a self-organizing multi-hop radio network protocol has been designed, responsible for delivering data from the remote nodes to the sinks. Data are aggregated at the nodes to decrease the number of radio transmissions. The radio protocol also enables the uploading of data to configure the remote nodes. Despite using 500 mW transceivers, the transmission range was only 900 m, which requires a dense placement of the nodes. In some situations, it might be more convenient to equip the remote nodes with GSM transceivers.

The sensor network node has been practically realized in two versions, FPGA and ASIC, which have been compared with each other. The network, consisting of 26 nodes, has been tested in real conditions for more than 12 months. The achieved detection rate (shown in Table 11) was sufficient to report the traffic flows in the monitored area. The radio transmission performance was very close to the simulation results. In the case of transmission problems, the nodes' large data buffers were able to hold data locally for up to 30 min, waiting for the radio link to improve. The installation successfully collected data, which were recorded and visualized on a PC, proving the usefulness of this idea.