Effective distributed service architecture for ubiquitous video surveillance
- Cite this article as:
- Chang, R., Wang, T., Wang, C. et al. Inf Syst Front (2012) 14: 499. doi:10.1007/s10796-010-9255-z
Video surveillance systems play an important role in protecting the lives and assets of individuals, enterprises and governments. Given the prevalence of wired and wireless Internet access, the trend is to integrate today's isolated video surveillance systems through a distributed computing environment and to foster diversified, ubiquitous multimedia intelligent surveillance (MIS) applications. In this paper, we propose a distributed and secure architecture for ubiquitous video surveillance (UVS) services over the Internet and error-prone wireless networks, with scalability, ubiquity and privacy. As in cloud computing, users consume UVS-related resources as a service and do not need to own the physical infrastructure, platform, or software. To protect service privacy, preserve service scalability and provide reliable UVS video streaming for end users, we apply the AES security mechanism, a multicast overlay network and forward error correction (FEC), respectively. Different value-added services can be created and added to this architecture without introducing much traffic load or degrading service quality. In addition, we construct an experimental test-bed for the UVS system with three kinds of services that detect fire and fall-incident features and record the captured video at the same time. Experimental results show that the proposed distributed service architecture is effective and that a number of services on different multicast islands were successfully connected without affecting playback quality. The average sending and receiving rates of these services are quite similar, and the surveillance video plays smoothly.
Keywords: Ubiquitous video surveillance; Multimedia intelligent surveillance; Multicast overlay network; Forward error correction; AES security mechanism; Cloud computing
Since multiple services may be composed from one or more cameras, captured images should be distributed over more than one processing center. For example, a typical airport security system may need to provide two MIS services, face recognition and dangerous object recognition (DOR) (Beynon et al. 2003; Regazzoni and Sacchi 2000; Smith et al. 2006; Stringa and Regazzoni 2000; Venetianer et al. 2007), from a single camera somewhere in the airport. These processes are very complicated, not only because varied objects need to be recognized in a complex scene, but also because the objects must be processed in real time. Thus, distributing the varied recognition processes among multiple networked computers, as in the MSSC service model, is an effective way to provide these services. The distribution of services inside the network is an important advantage of the UVS service architecture. This paper concerns the decomposition of logical surveillance functionalities into a set of logical components that can be allocated to different processes inside an intelligent physical network, which is built from the proposed service architecture for UVS. The main advantage of such decomposition is that it helps UVS become more efficient, flexible and intelligent in supporting the four surveillance service models mentioned above.
Besides, as the applications of video surveillance services grow, value-added and diversified ubiquitous video surveillance applications are produced by integrating with other full-fledged services over the Internet. The load on specific surveillance spots will then become heavier than before. For example, the CPU load and network bandwidth of a surveillance content provider, e.g., a smart camera, must also be considered when multiple video surveillance services with real-time requirements access it. These issues challenge the scalability of UVS in supporting MSSC services. In this paper, we multicast surveillance videos for all generations of surveillance service models (i.e. SSSC, SSMC, MSSC and MSMC) to furnish an effective distributed architecture for UVS scalability. IP multicast (Deering 1989; Quinn 2001) is a one-to-many data communication protocol, originally designed for multimedia conferencing and suitable for network applications with multiple accesses. Unlike IP unicast, IP multicast sends only a single stream no matter how many clients request it. IP multicast is therefore very useful for large-scale network applications with a single data source. However, in order to avoid service abuse and malicious flooding attacks, Internet Service Providers (ISPs) usually disable multicast forwarding on their routers. Thus, multicast packets cannot pass through the Internet. The multicast backbone (MBone) (Kumar 1995) arose as a virtual network for connecting multicast islands over the Internet. On each of these islands, there is a host running a multicast routing daemon, and the islands are connected with one another via unicast tunnels. In the proposed architecture, service components are connected over the Internet, and multicast traffic can reach all service components via the applied multicast overlay network (Chen et al. 2009).
Notably, some surveillance videos in UVS are private and sensitive, yet they are delivered over the public Internet. The public Internet also exhibits occasional packet loss, which jeopardizes video playback quality. Thus, we apply not only cost-effective AES encryption (NIST 2001) with Diffie-Hellman key negotiation (Rescorla 1999) to protect surveillance videos from Internet eavesdroppers, but also the open-loop error control of forward error correction (FEC) (Macker 1997; Luby et al. 2002) to preserve the playback quality of delivered surveillance videos in UVS for further processing and presentation in MIS services. In this way, effective and secure distributed service architectures are proposed to provide UVS with ubiquity, scalability and reliability. Besides, the proposed UVS architecture constructs a distributed, heterogeneous and intelligent video surveillance network. As in cloud computing, users do not need to own the physical infrastructure, platform, or software. They consume resources as a service, in the form of Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), or Software-as-a-Service (SaaS), and pay only for the resources that they use.
The paper is organized as follows. Section 2 presents the levels of the proposed UVS architecture, from the physical to the logical. An example of decomposing a surveillance functionality related to multiple object tracking and object recognition is given. Section 3 presents the design and implementation of the proposed UVS architecture. It describes how the proposed UVS architecture is applied in our OpenIVS (Open Internet Video Surveillance) system (Wang et al. 2007) to achieve scalability, ubiquity, security and reliability on the Internet. Section 4 shows the performance results for the quality of the UVS service. Two value-added MIS services for homecare surveillance, fire detection and fall-incident detection, are shown. Conclusions and future work are presented in the final section.
2 UVS architecture
In recent years, researchers have proposed various automatic and smart video surveillance systems. Goshorn et al. (2007) proposed a cluster-based automated surveillance network. They applied clustering techniques to cameras for detecting and tracking multiple persons' activities. The IBM Smart Surveillance System (S3) (Tian et al. 2008) provides not only the capability to automatically monitor a scene but also the capability to manage the surveillance data, perform event-based retrieval, receive real-time event alerts through standard web infrastructure and extract long-term statistical patterns of activity. Snidaro et al. (2008) proposed an automatic multi-sensor surveillance system that focuses on detecting and raising an alarm when an unauthorized person tries to cross a harbor area. Kandhalu et al. (2009) presented a distributed real-time surveillance system, named OmniEye, with large-scale deployment of video cameras over wireless mesh networks. Other researchers have proposed techniques such as foreground analysis for real-time video surveillance (Tian et al. 2005), detection and tracking of moving objects (Connell et al. 2004), and automatic detection and indexing of video-surveillance event shots (Foresti et al. 2002). In contrast, the proposed distributed service architecture for the UVS system considers not only system integrity, scalability and ubiquity, but also communication reliability and security over the Internet. Besides, cloud computing technologies can be further applied so that users can enjoy effective UVS services in a more pervasive way.
2.1 Physical level
Nodes which capture surveillance images/video, such as IP cameras.
Nodes which are image/video intelligent processing (IIP/VIP) units, such as object tracking and object recognition servers.
Links which represent heterogeneous communication channels, such as wireless and wired channels.
The physical level is designed for a general distributed video surveillance environment, without limitations or constraints tied to specific applications. Therefore, diversified surveillance applications can be effectively built on top of this level. The primary issue is the capacity of the communication links to deliver surveillance information, including high-bandwidth surveillance videos.
Concerning communication links for practical UVS systems, high-speed wireless links (Keller and Hanzo 2000; Hanzo et al. 2000) are becoming more and more important for connecting cameras to VIP units to achieve ubiquity in video surveillance, as they allow higher flexibility and lower installation costs. The most widely used wireless surveillance cameras generally adopt H.264 source coding and robust wideband transmission techniques (e.g., direct sequence spread spectrum) at high frequencies (e.g., the 2.4-GHz ISM band). Besides, the prevalent IP cameras on the market apply reliable protocols from the Internet protocol suite, such as HTTP over TCP, to deliver surveillance video packets at adaptive bit rates that depend on time-varying channel conditions. IP cameras can also use the well-known IEEE 802.11b/g/n WLAN link-layer protocols with multiple-access schemes to connect to IIP/VIP units to furnish UVS.
On the other hand, wired links (such as xDSL links, cable networks, and fiber optics) can be used between surveillance cameras and IIP/VIP units for long-distance UVS connections (e.g., between hubs, routers, and control centers). The Internet is heterogeneous, combining wireless and wired links with different bandwidth capacities. UVS should characterize the bottleneck links along the delivery path between the surveillance camera and the IIP/VIP units to maintain the quality of the UVS service for later presentation and processing. For example, applying layered video coding techniques (Zink et al. 2005; Kim and Ammar 2005) can help UVS serve more Internet users with different access bandwidths.
2.2 Logical level
By applying the components of the physical level, the logical level can be effectively built to foster more diversified video surveillance applications. As shown in Fig. 2, the major components of the logical level are summarized as follows.
2.2.1 Content provider (CP)
CPs are responsible not only for providing surveillance images/videos to other components through stream links, but also for performing operations on these images/videos and making them accessible to all applications. IP cameras, or hosts with attached cameras, can be CPs that provide raw or compressed data of captured images/video. Besides, CPs also include stored-video servers that let other components browse, review and process the backup surveillance videos to extract more valuable surveillance events.
2.2.2 Service requester (SR)
MIS services for requested surveillance images/videos are also provided in SRs. Under the SSMC and MSMC models, an SR may need different surveillance images/videos with different coding formats from CPs. Transcoding techniques (Dogan et al. 2001) may also be applied to translate the different codings of surveillance images/video into a common format for further processing. Moreover, an SR can play the role of a CP and provide the processed images to other SRs. This characteristic also makes the proposed UVS architecture very suitable for distributed processing to speed up the applied MIS services.
2.2.3 Service provider (SP)
SPs are the key components of the proposed UVS architecture and provide different kinds of surveillance services for SRs. Namely, when an SP receives a request from an SR for a certain video surveillance service, the SP has to find the locations of all CPs for the requested service. Thus, the proposed surveillance service models of SSSC, SSMC, MSSC and MSMC can be fulfilled. SPs are also the components through which users directly access the UVS services provided by CPs: when users request UVS services, the SP has to find the corresponding locations of the CPs.
A service in UVS is an MIS service: a well-defined set of operations involved in the intelligent processing of surveillance video, such as event recognition and extraction. For example, the DOR service is a conventional surveillance service, used especially in airport security systems, to determine whether a dangerous object is in a passenger's luggage. The DOR service decomposes a given functionality into several sub-services, which are mainly responsible for recognition of various objects, alarms and presentation.
MIS services are logical entities and need to be developed on top of the physical level. The distribution architecture in UVS is described as the process of mapping the logical level, defined by a set of service chains, onto the levels below. We also applied ideas from the DAVIC architecture (http://www.davic.org/). There are two kinds of service components for accessing UVS services: service providers and service requesters. Another component in the logical level is the CP, which is an interface between the logical and physical levels. SPs are entities used for requesting services and for presenting or processing the surveillance information, including events, alarms and surveillance images/videos received from SRs. SRs are entities used for mapping an SP's requests to the proposed service models, sending the corresponding content requests to CPs, and then providing the requested services to SPs. Contents needed by SRs or SPs are supplied by CPs, which deliver the surveillance data, including the images/videos from cameras (e.g., CCTV cameras, CCD cameras and IP cameras) at the physical level of the proposed UVS architecture.
The relations among SPs, SRs and CPs are also shown at the top of Fig. 2. There are two kinds of communication links in the proposed UVS architecture: command links and stream links, which the above-mentioned logical-level components use to exchange commands and surveillance images/videos, respectively. The advantages of adopting the formal DAVIC architecture in the proposed architecture are high compatibility and easy integration with present video streaming services, such as video on demand for stored surveillance videos and IPTV for live public surveillance videos.
An example of the logical-level architecture is a generic video surveillance system for city or highway traffic. Tens to hundreds of cameras are installed on city streets and highway spots. In this traffic surveillance system, automatic alarms/acknowledgements of traffic jams/incidents at surveillance spots are provided by an MIS service running on IIP/VIP units. The MIS service applies image/video processing techniques to find traffic events at surveillance spots and then informs operators in a remote control center so that they can pay attention right away. The operators can then use the camera control service to zoom in/out and pan/tilt the surveillance camera to further clarify the alarms/events.
The above-mentioned automatic alarm/acknowledgement service, MIS service and camera control service are classified as entities of the logical level of the proposed UVS architecture. The cameras, and the communication links connecting cameras and IIP/VIP units, are entities of the physical level of the UVS architecture. The functionality of a video surveillance system can be decomposed into basic processing components to further provide "services". In the following subsections, we give more details about how the three components cooperate in the logical level and which applications use the proposed service models.
2.3 Cooperation in UVS components and applications
The UVS architecture can accomplish the four service models of 3GSS (i.e. SSSC, SSMC, MSSC and MSMC) via the three proposed system components (CP, SR and SP). In the following, we describe and demonstrate not only how the system components cooperate, but also which applications conform to these four service models.
The SP maintains the information of all surveillance services, including the service description, which CPs and/or SRs provide the service, where those CPs and/or SRs are located, and other information needed by SRs. Thus, there are command links through which all CPs and SRs register their provided services with the SP. Besides, service streams between CPs and SRs can be transmitted directly without being relayed by the SP. This not only reduces the latency of the service streams but also decreases the load on the SP.
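The registration and lookup role of the SP described above can be sketched as a small in-memory directory. This is only an illustration under stated assumptions: the class names `Registration` and `ServiceProvider`, the field names, and the example hosts and ports are hypothetical and do not come from OpenIVS.

```python
from dataclasses import dataclass, field


@dataclass
class Registration:
    """One entry sent over a command link (all field names are hypothetical)."""
    component: str  # "CP" or "SR"
    service: str    # service description, e.g. "license-plate-recognition"
    host: str       # where the CP/SR is located
    port: int


@dataclass
class ServiceProvider:
    """Directory only: streams flow CP -> SR directly and never pass through here."""
    registry: dict = field(default_factory=dict)

    def register(self, reg: Registration) -> None:
        # Several CPs/SRs may offer the same service, so keep a list per service.
        self.registry.setdefault(reg.service, []).append(reg)

    def locate(self, service: str) -> list:
        # Return every registered component that provides the requested service.
        return list(self.registry.get(service, []))


sp = ServiceProvider()
sp.register(Registration("CP", "raw-video", "10.0.0.5", 5004))
sp.register(Registration("SR", "license-plate-recognition", "10.0.0.9", 6000))
sp.register(Registration("CP", "raw-video", "10.0.1.7", 5004))
print([r.host for r in sp.locate("raw-video")])
```

Because the SP only answers lookups, the stream itself travels directly between the located CP and the SR, which is the latency and load benefit the paragraph above describes.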
In Fig. 4, two MIS servers, one for license plate recognition and one for car speed measurement, are illustrated. Moreover, an MIS server can also act as a CP that provides the processed images to other MIS services. For example, the license plate images could be used by another MIS service, such as a stolen car recognition system, over the Internet.
On the other hand, because the proposed UVS architecture delivers the surveillance video to multiple MIS servers by multicast, independent MIS servers can start processing the surveillance images/video simultaneously to provide valuable surveillance information for each client. Moreover, the transmission load on the CP is relieved, so the MSSC model can be supported cost-effectively.
3 Design and implementation of UVS
Multicast agents (MAs) act like the MBone to provide UVS with scalability: they use application-layer multicast to connect the Internet multicast islands where the CPs and SRs are located, constructing a multicast overlay network for UVS (the so-called UVSMON) (Chen et al. 2009). Security and reliability of UVS can also be achieved via MAs.
The Super Agent (SA) is the central agent of UVS. It is located in the public Internet and is closely related to the SP for UVS users. With the SA's help, the CPs and SRs located in private networks become accessible over the Internet, which lets UVS achieve ubiquity. Before asking the SA for help, CPs and SRs must first register their connection information, including surveillance location, host address/port, video type, etc., with the SA so that other CPs and SRs can reach them.
The customary subsystem makes the previous generations of surveillance systems on the market (i.e. 1GSS and 2GSS) compatible with 3GSS in the proposed UVS. It provides agents (Wang et al. 2003; Wang et al. 2008) that connect the isolated surveillance services of previous-generation systems to the Internet for UVS interoperability.
Extended service subsystems are extended surveillance services built on the subsystems mentioned above. They can be applied further as CPs or SRs in the proposed UVS architecture and demonstrate its cost-effectiveness and scalability in diversified surveillance applications.
Having briefly introduced the four major subsystems for the implementation of UVS, we now present in further detail how these subsystems help UVS achieve ubiquity, privacy, reliability, interoperability and scalability.
3.1 Ubiquity implementation via MA and SA
dMA in public network
dMA in private network
3.2 Privacy protection in MA and SA
Because surveillance videos are usually private, we also take into account the security issues of the proposed UVS over the public Internet. As mentioned earlier, the SA is centralized and contains all the important information of UVS. Authentication procedures can prevent malicious users from accessing this valuable information. However, sensitive surveillance videos should also be protected from eavesdroppers, since they are transported over the public Internet. Thus, we focus below on the security of surveillance video delivered between the SA and MAs.
Considering the security strength required and the real-time constraints of surveillance videos, OpenIVS applies the well-known symmetric cipher AES as the encryption and decryption algorithm to protect sensitive surveillance video content while it is forwarded. Besides, the Diffie-Hellman key negotiation algorithm (DH) is applied not only to negotiate an encryption key, but also to update the encryption key periodically to boost the security strength of UVS.
For the same cases mentioned in Section 3.1, Figure 10 shows the key negotiation procedure between an MA (i.e. sMA or dMA) and the SA. By exchanging their public keys and applying DH, the MA and the SA negotiate a common secret key (session key) that protects the privacy of the surveillance videos forwarded between them. Besides, before surveillance videos are transmitted directly between an sMA and a dMA, the sMA and dMA negotiate a common secret key (session key) with the SA's help. Figure 11 presents the key negotiation procedure between the sMA and dMA via the SA. Once the key negotiation procedure is done, the sMA sends the encrypted surveillance video to the dMA.
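The two-party negotiation between an MA and the SA can be illustrated with a minimal Diffie-Hellman sketch. It is not the OpenIVS implementation: the group parameters `P` and `G`, the function names, and the use of SHA-256 to derive a 16-byte (AES-128-sized) session key are all assumptions made for illustration, and the toy prime is far too small for real use.

```python
import hashlib
import secrets

# Toy public parameters (ASSUMPTION, illustration only). Real deployments use a
# standardized group of 2048 bits or more.
P = 2**127 - 1  # a Mersenne prime, much too small for real security
G = 3


def make_keypair():
    """Return (private, public) where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2  # private exponent in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def session_key(own_private, peer_public):
    """Derive a 16-byte session key from the shared DH secret (hypothetical KDF)."""
    shared = pow(peer_public, own_private, P)  # identical on both sides
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()[:16]


# The MA and the SA each generate a key pair and exchange only the public halves.
ma_private, ma_public = make_keypair()
sa_private, sa_public = make_keypair()

ma_key = session_key(ma_private, sa_public)
sa_key = session_key(sa_private, ma_public)
print(ma_key == sa_key)
```

Periodic key updates, as described above, would amount to repeating this exchange with fresh key pairs and switching to the new session key.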
3.3 Scalability for extended services
4 Theoretic analysis and experimental results
4.1 Theoretic analysis
To study the average-case bandwidth cost of the proposed architecture, we enumerate all possible situations in which SRs can be located, from all SRs behind a single MA to all SRs behind different MAs. The resulting average traffic load, BUVS_average, is shown in Eq. 9. The analysis shows that the proposed architecture can reduce the traffic load from the CP by 50% on average. Thus, diversified video surveillance applications of the MSMC service model can benefit from the proposed UVS architecture.
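The two extreme situations named above can be made concrete with a small counting sketch. It does not reproduce Eq. 9. It assumes, as a simplification, that with plain unicast the source side sends one stream per SR, while with the overlay it sends one stream per distinct MA that has at least one SR behind it. The function names and the SR/MA labels are hypothetical.

```python
def streams_unicast(sr_to_ma):
    """Plain unicast: one outgoing stream per requesting SR."""
    return len(sr_to_ma)


def streams_overlay(sr_to_ma):
    """Overlay (assumed model): one outgoing stream per distinct MA serving SRs."""
    return len(set(sr_to_ma.values()))


# Best case: all four SRs sit behind the same MA.
best = {"sr1": "maA", "sr2": "maA", "sr3": "maA", "sr4": "maA"}
# Worst case: every SR sits behind a different MA.
worst = {"sr1": "maA", "sr2": "maB", "sr3": "maC", "sr4": "maD"}
# An intermediate placement.
mixed = {"sr1": "maA", "sr2": "maA", "sr3": "maB", "sr4": "maB"}

for name, placement in [("best", best), ("worst", worst), ("mixed", mixed)]:
    print(name, streams_unicast(placement), streams_overlay(placement))
```

Under this model the overlay never sends more streams than unicast, and the saving depends on how the SRs happen to be spread across MAs, which is why an average over placements is the natural quantity to analyze.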
4.2 Experiments and performance results
For UVS, image/video quality after transmission is very important for object recognition at the remote site. Thus, we ran several experiments on the OpenIVS test-bed for UVS to measure how privacy protection by AES encryption and reliability provision by FEC error control affect the playback quality of H.263-compressed surveillance video. Figure 14 shows the experimental environment. There are four multicast islands: the NTU R125A lab at National Taiwan University in Taipei City, the MCU s206 lab in Taoyuan County, the NCNU lab in Nantou County, and a private network that is also located in Taipei. Each of these multicast islands has an MA to forward its surveillance video. Besides, a client located in the private network displays the surveillance videos from the surveillance spots in these multicast islands.
Finally, we ran a simulation to measure the reliability that OpenIVS can achieve by applying FEC error control. The simulation sent 5000 multicast packets at a sending rate of 225 kbps; each packet had a payload of 512 bytes, and 25 FEC packets were added for every 50 video surveillance packets. In the FEC simulation over the private network in a Wi-Fi environment, a total of 103 packets were lost at the receiver after transmission, a packet loss rate of 2.05%. With FEC applied during transmission, all lost packets were recovered at the receiver. The packet loss rate can therefore be reduced significantly, to zero in this simulation, if the extra bandwidth cost is acceptable in the networks of OpenIVS. According to the simulation results, FEC is very helpful for recovering lost packets for UVS over the Internet to preserve playback quality, especially in the prevalent wireless environment.
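The text does not say which FEC code OpenIVS uses. As an assumption for illustration, the sketch below uses the simplest possible scheme, one XOR parity packet for every two data packets, which happens to match the stated ratio of 25 FEC packets per 50 video packets. It can repair one loss per group of three. The function names are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets):
    """Group packets in pairs and append one XOR parity packet per pair.

    Assumes an even number of equal-size packets.
    """
    groups = []
    for i in range(0, len(packets), 2):
        a, b = packets[i], packets[i + 1]
        groups.append([a, b, xor_bytes(a, b)])
    return groups


def decode(group):
    """Recover (a, b) from a group in which lost packets are marked None.

    Returns None when more than one packet of the group is missing.
    """
    a, b, parity = group
    if a is None and b is not None and parity is not None:
        a = xor_bytes(b, parity)
    elif b is None and a is not None and parity is not None:
        b = xor_bytes(a, parity)
    if a is None or b is None:
        return None
    return a, b


# 50 data packets of 512 bytes each yield 25 parity packets.
packets = [bytes([i]) * 512 for i in range(50)]
groups = encode(packets)

single_loss = [None, groups[3][1], groups[3][2]]  # first data packet lost
double_loss = [None, None, groups[4][2]]          # both data packets lost
print(decode(single_loss) == (packets[6], packets[7]), decode(double_loss))
```

The cost is the extra bandwidth mentioned above: one additional packet for every two data packets, so 50% overhead under this scheme.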
We also ran experiments to demonstrate the efficiency of FDS and FIDS when applied to different scenes. First, to demonstrate FDS, we collected several kinds of videos, with and without fire scenes, and played them continuously in full screen. A web camera captured the video screen to simulate surveillance of the played scene.
Table: Experimental results of FDS (reported measure: number of frames with fire objects). Test videos: Fired Christmas tree; Fire extinguisher instruction 1; Fire extinguisher advertisement; Fire extinguisher instruction 2; Waving national flag; A part of movie; Sunrise over the ocean; A film without fire scene.
Surveillance systems play an important role in protecting human lives and property. As technology advances, surveillance systems will develop toward diversified, ubiquitous, multiple intelligent services. In this paper, we first identify four abstract service models for UVS systems. Second, we perform a logical decomposition from the point of view of services, as a basic step in defining the logical components associated with a given surveillance functionality. A novel distributed UVS architecture can be constructed with these logical components to preserve ubiquity, privacy, reliability, interoperability and scalability. We then design and implement a distributed UVS prototype called OpenIVS. Three kinds of extended services are implemented and demonstrated to effectively detect fire and fall-incident features and record the captured video at the same time. Moreover, we present experiments that evaluate OpenIVS and show good playback quality. Finally, we analyze the bandwidth cost on the CP. The analysis shows that the proposed distributed service architecture can reduce the bandwidth cost on the CP in UVS systems by half on average.