Towards a resource efficient and privacy-preserving framework for campus-wide video analytics-based applications

Gupta, Ankur; Prabhat, Purnendu

doi:10.1007/s40747-022-00783-w

Towards a resource efficient and privacy-preserving framework for campus-wide video analytics-based applications

Original Article
Open access
Published: 24 June 2022

Volume 9, pages 161–176, (2023)
Cite this article

Download PDF

You have full access to this open access article

Complex & Intelligent Systems Aims and scope Submit manuscript

Towards a resource efficient and privacy-preserving framework for campus-wide video analytics-based applications

Download PDF

Ankur Gupta¹^na1 &
Purnendu Prabhat¹^na1

1693 Accesses
11 Citations
Explore all metrics

Abstract

Video surveillance and analytics solutions based on Artificial Intelligence (AI) are increasingly being deployed across industries, including academia. There are a number of use-cases for campus-wide video analytics applications. Detecting events of interest in real-time and generating alerts is a core requirement for such applications, making them both network and compute intensive. Thus, the underlying framework needs to be resource optimized in terms of latency, compute and storage requirements for a multitude of video applications. Increasingly privacy concerns have been voiced against the pervasive deployment of video analytics-based applications. Thus, protecting the privacy of students and staff in a campus setting shall be a major design consideration for such systems going forward. This paper presents a resource optimized and privacy preserving framework for campus-wide video analytics applications. Several use-cases are presented and early results from the deployment of the proposed framework establish its feasibility and effectiveness.

Real-Time Surveillance Video Analytics: A Survey on the Computing Infrastructures

Intelligent Video Surveillance as a Service

Video Big Data Analytics in the Cloud: Research Issues and Challenges

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

A video analytics system comprises multiple cameras installed at locations of interest, connected to edge nodes, which in turn are connected to the decision making core, usually hosted in an on-premises data-center or on the cloud [1,2,3]. The edge node is considered to have relatively low computational power, sufficient enough for basic pre-processing and forwarding the video streams to the core. The core is a Graphical Processing Unit (GPU)-based system and software capable of parallelizing computation over the GPUs, enabling the use of Deep Learning for computer vision [4]. This generic video analytics framework is illustrated in Fig. 1.

Video streams consist of a time-based sequence of frames. Computer vision techniques are used to process individual frames. The inference from these individual frames is then derived and collated over time to determine actionable inputs based on the application processing logic. Computer vision tasks such as image classification, object detection, and segmentation are compute-intensive and need to be performed for each video frame.

According to Cisco, by 2022, 82% of all internet traffic is expected to be video and live video usage will increase 15-fold by 2021 [5, 6]. Furthermore, video surveillance demands a steady upstream of video frames from deployment locations to the cloud, typically via the edge. A single IP-camera can produce up to 300 GB of data per month, while 770 million such video surveillance cameras were present globally in 2020 alone [7, 8], requiring significant network bandwidth and storage. A December 2019 report states that US alone has around 70 million CCTV cameras [9]. There is thus an increasing need for resource optimized video surveillance systems with reduced:

communication overheads and delay
storage requirements

After cost, data privacy concerns are the major barrier in implementing AI-based video surveillance systems [10]. According to a poll released by the American Civil Liberties Union of Massachusetts, 80% of the voters support complete ban on facial recognition-based surveillance [11]. Three countries (Belgium, Luxembourg and Morocco) have even banned the use of facial recognition technology [12]. The Draft of the “Civil society initiative for a ban on biometric mass surveillance practices”, an initiative funded by the European Digital Rights (EDRi), raises serious concerns over the fundamental rights of citizens like privacy, right to free speech and not to be discriminated against being compromised due to mass video surveillance [13]. Other than hackers getting live video stream or accessing recorded feeds from the surveillance systems, the privacy of individuals is at risk even with the operators being able to misuse recorded footage. There have been cases of major privacy breaches from video surveillance systems [14,15,16,17]. Recently, a group of hackers breached live security camera feeds of 150,000 surveillance cameras inside hospitals, organizations, police departments, prisons and schools [18]. Thus, it is imperative to preserve the privacy of the individuals whenever video surveillance systems are deployed at scale. This applies to the various stakeholders (students and staff) in a campus setting too. Privacy-preservation is thus emerging as a major requirement for future campus-wide video surveillance applications. Therefore, the motivation for this work is:

resource optimization in a campus-wide video analytics system in terms of communication overheads, compute and storage requirements; and
privacy preservation of the stakeholders for specific applications.

The main contributions of this paper are:

A multi-layered framework, Campus-wide Video Analytics Framework (CVAF), is proposed, which is resource optimized and privacy preserving while meeting the functional requirements for campus surveillance.
Low cost-edge devices and efficient algorithms are deployed for:
- frame filtering
- inference deduction
- face obfuscation for privacy preservation
Five custom video-analytics applications are built and deployed for experimental evaluation of the proposed framework.
Server-side storage and computing is significantly reduced and some applications are entirely off-loaded to the edge.
Experimental results establish the viability of a low resource, privacy preserving video surveillance framework.

The rest of the paper is organized as follows: Sect. 2 discusses the existing techniques that have been applied across different relevant implementations of video surveillance systems to solve at least one of the above-stated problems. Section 3 describes the Campus-wide Video Analytics Framework (CVAF) in detail. Section 4 discusses the implementation details with the experimental results, while Sect. 5 concludes the paper with an insight into the possible directions for future work in the domain.

Related work

The fundamental decision that affects both the network bandwidth consumption and computational load on the core is whether to allow a frame to be sent to the core or to drop it. The point where this decision is taken is crucial to its impact. Thus, resource optimized video-applications typically resort to some sort of filtering at the video camera or the edge to reduce the load on the core.

Authors in [19] propose RealEdgeStream (RES) framework which filters out low-value video streams at the edge based on configurable rules while performing inference on the local cloudlets. Their approach reduces the network bandwidth consumption by allowing minimal data to be sent to the cloud (core). The system is evaluated using a simulation and thus the exact hardware requirements are not available.

Authors in [20] propose an FPGA-based smart camera design to provide efficient stream processing of the captured video on the camera itself. The FPGA-based smart cameras perform initial filtering, while the edge servers perform basic deep learning-based inference with only advanced inference being done in the cloud. Filtering on the camera reduces the network bandwidth and storage consumption. However, specialized cameras need to replace all existing cameras.

Authors in [21] describe their pilot study in which they used NVIDIA Jetson TX2 embedded computing platform in the visual sensor and all computation is made onboard. Inference is done on the sensor itself and only meta-data travels through the network which offers a privacy compliant tracking solution. They implemented the project as a pilot study with 20 visual sensors in the city of Liverpool and were able to reduce the network bandwidth requirements.

Chameleon [22] is a video-surveillance system that dynamically adjusts the video stream configurations based on video segment profiles intending to optimize resource utilization and increase accuracy. It was able to achieve the same accuracy with only 30–50% of the resources used by traditional systems. They, however, worked only to reduce the compute load on the GPU in systems, where there is a choice between different neural networks, frame rate and image quality. The authors have mentioned the need for reducing the network bandwidth consumption and profiling at the edge.

Authors in [23] address the issue of network bandwidth consumption. They design FilterForward which reduces the network bandwidth consumption by transmitting only the relevant frames to the datacenter. Frames that are not relevant are discarded on the edge (early discard strategy). Thus, it allows for optimization of the feeds for multiple video analytics applications. The edge node, that the paper describes, is a heavy-duty computer with a quad-core i7-6700K CPU with 32 GB of RAM.

In [24], the authors describe a 3-layer architecture for video stream analytics in which the edge layer performs low-level feature extraction, the fog layer performs recognition, and the cloud computing layer performs behavior analysis tasks. The architecture intends to offload the computation to the edge and fog, addressing only one of stated objectives for a campus-wide video surveillance and analytics framework.

Table 1 Existing video-surveillance frameworks – optimization and privacy assurance evaluation

Full size table

Recently, researchers have started focusing on privacy-preservation in video-surveillance applications. An extensive survey of visual privacy preservation methods is presented in [25]. One of the extensively discussed methods by them is redaction in which the region of interest, for example the face or the identity card number, are detected and blurred, replaced, removed or de-identified. Redaction can be performed at the camera, video processor, database or the user interface. A system for privacy preserving video surveillance that takes the computation for face detection, obfuscation and encryption to the camera itself is presented in [26]. Authors in [27] use a privacy mediator which is a module installed in the camera which denatures the live video feeds. However, such a solution would require complete replacement of existing cameras which do not offer this feature, entailing significant costs.

REVAMP2T [28] is a pipeline for detection, re-identification, and tracking across multiple cameras without the need for storing the streaming data. Although, it does not cater to multiple video analytics applications, it is an important breakthrough in the privacy preserving tracking of individuals. It uses key features of the pedestrians at runtime without the use of personally identifiable information to track their movement.

Authors in [29] introduce OpenFace and RTFace. OpenFace, trained for a pool of users, detects and classifies their faces. It uses deep neural network for feature extraction followed by support vector machines for classification. RTFace, built on top of OpenFace is a complete pipeline to track and blur regions of interest, i.e., recognized faces by OpenFace, in real-time at full frame-rate. This enables privacy-preservation for live video analysis. The system uses cloudlets on virtual machines on top of rack servers with Intel Xeon E5 v4 processors. The system, however, has decreased accuracy and speed when the pool of users increases.

Authors in [30] design a privacy-preserving and efficient mobile video analytic system PrivacyEye, which is composed of two major modules including the privacy-preserving module and the efficient video process module. The privacy-preserving module employs adversarial training process to extract useful information for model prediction while preserving privacy of sensitive information of the user. The efficient video process module combines the dynamic frame scheduler and optical-flow-based feature propagation to greatly reduce video processing time.

Authors in [31] propose DeepCache to accelerate the execution of CNN models via leveraging video temporal locality for continuous vision tasks.

Table 1 summarizes the relevant frameworks and how they cater to the need of network bandwidth reduction, storage consumption reduction and privacy preservation.

Campus-wide video analytics framework

Campus-wide video surveillance systems and applications are gaining traction. From basic recording of video feeds and manual review, such applications have evolved to use AI-based interventions and automated actions being initiated. Thus, video surveillance-based systems are becoming increasingly complex, requiring much higher storage and computing power. While the cloud has been able to cater to the storage and compute requirements, response times for critical surveillance applications required moving many of the analysis tasks to the edge. Thus, an effective campus surveillance systems is resource heavy, entailing significant costs. Privacy has also emerged as a major requirement for such surveillance systems with many countries adopting strict laws to prevent unauthorized video recording, storage and analysis to safeguard an individual’s right to privacy. This necessitates efficient privacy preservation mechanisms to be built in or over existing surveillance systems. Thus, it is desirable that surveillance systems for specific campuses be customized, resource efficient, privacy preserving while delivering the required functionality with acceptable accuracy and performance. We propose the Campus-wide Video Analytics Framework (CVAF) which is a resource optimized and privacy preserving framework that moves frame filtering, inferencing and privacy preservation (face obfuscation) tasks to the edge. This reduces the network bandwidth and video storage requirements significantly while preserving privacy of the students and the faculty. Figure 2 provides an overview of the framework.

Major design considerations for CVAF emerged from:

Analysis of the recorded video feeds in the existing NVR surveillance system on campus indicated that more than 70% of the recorded video streams did not contain information of interest for a given application.
Due to high network bandwidth requirements of IP cameras, the campus network can become constrained requiring network infrastructure management solutions to be deployed besides storage provisioning and expansion.
Majority of professionals in the field of video surveillance say that edge-based analytics is the ideal for video surveillance and it provides more efficient use of bandwidth and storage, since video data is large, complex and used in real time [32].
The European Data Protection Supervisor (EDPS), the European Union’s (EU) independent data protection authority describes 3 main data protection issues in case of video surveillance [33]:
- Minimize gathering of irrelevant footage,
- Notify the stakeholders, and
- Timely and automatic deletion of footage.

The CVAF framework has a 3-layer architecture in which layer-1 edge is responsible for frame filtering, layer-2 edge is responsible for deriving inference and layer 3 is the server core for maintaining application-specific campus-wide analytics. The video cameras installed at different locations of interest are connected to the layer 1 edge nodes which are light-weight compute devices (Raspberry Pi or its variants) capable of ingesting video streams, executing the frame-dropping logic and forwarding the stream to the layer 2 edge nodes. The layer 2 edge nodes are GPU-based edge devices optimized for deep-learning inference applications (NVIDIA Jetson Nano/Xavier or its variants) [34]. They are capable of processing multiple filtered streams from Layer 1, perform inferencing using deep learning models and initiate any application-specific configured actions. The video stream is then forwarded to the in-premises server(s). NVIDIA Jetson Nano is a family of low-cost GPU devices that consume only 5W of power and are able to process up to 5 video streams and perform deep learning inference [35]. These are used to perform inference, detection and removal of Personally Identifiable Information (PII) on the edge ensuring privacy preservation. In particular, layer 2 edge devices perform feature extraction and object detection using inference optimized (using TensorRT or ONNX) models [36, 37]. Furthermore, the layer 2 edge nodes perform inference and drop the frame if the frame is not required to be stored. It also obfuscates PII in the frame, encrypts and forwards the frame if it is required to be stored. This significantly reduces the network, storage and computational load on the core. The in-premises server(s) comprising the core hosts all the relevant web-applications including application-specific databases and file systems for storage. In our setup two Dell Precision T5500 and Dell Precision 5820 workstations serve as the core.

Table 2 Applications implemented on CVAF and layer-wise processing performed

Full size table

Table 3 Layer-wise technical specification for CVAF implementation

Full size table

Table 4 Number of devices used for each video application

Full size table

Layer 1: Inexpensive edge filtering

Layer 1 edge nodes filter out frames that are not useful for further processing, for instance similar frames during time periods of low or no activity within the campus. Thus, the layer 1 edge nodes perform a singular task, i.e., whether to allow a frame to be forwarded to the core, based on the similarity of successive frames. Layer 1 drops the frames, generated from a static camera at 25 frames per second, if they are similar to the preceding frame beyond a certain threshold. We calculate the similarity between two successive frames as the Mean Squared Error (MSE) between the two frames. There are advanced methods available for finding the similarity between two images [38,39,40]. However, MSE is used as the similarity measure because of its efficiency in determining the similarity between successive frames 25 times per second [41]. Here speed more than accuracy was given precedence due to the non-critical nature of the video applications deployed. Furthermore, this allowed the inexpensive layer 1 edge nodes to process and filter 2–3 video streams simultaneously during peak loads. Figure 3 shows the schematic diagram of the Frame Dropping Logic implemented in Layer 1 Edge Nodes.

The frames from the cameras are received by the layer 1 edge node, temporarily stored in the buffer and are processed by the Frame Dropping Logic. The Frame Dropping Logic maintains the application specific MSE threshold. If the calculated MSE between the present and the previous frame is more than the MSE threshold, the frame is forwarded to layer 2 edge node otherwise dropped.

Layer 2: Inference

The layer 2 edge nodes implement the computer vision tasks including face obfuscation for privacy preservation and also implementing the video application processing logic for specific applications. They receive frames from Layer 1, perform real-time inference, trigger specific action as per the application specific action trigger and send the inference to the server for further processing or action and generates notifications for registered devices if required. Based on the application, if the frames have to be sent to the server then all the Personally Identifiable Information (PII) are obfuscated and the frame forwarded to the server. Figure 4 shows a schematic diagram of the processing at the Layer 2 Edge Node. CVAF performs real-time face detection using Haar Cascade and applies a Gaussian blur on detected faces for face obfuscation. Although more accurate methods of face detection are available and well documented, we choose Haar Cascade for its real-time performance and low memory consumption [42,43,44,45,46].

Layer 3: Server

The Server fulfills two requirements: secure storage of video streams for multiple video analytics applications and hosting of web applications to enable the use of the inference. Figure 5 shows a high-level architecture of the server-side video analytics applications. The Data Receiver Module receives data from the Layer 2 Edge Nodes and saves the video stream to the Application-Specific Secure Storage and the inference data to the Application-Specific Database, respectively. The Analytics Engine analyzes the data and generates patterns and insights. The Web Application provides a dashboard to the Administrator. Administrators with special privileges have application-specific private key which they can use to view the obfuscated video streams through the Web App when required.

Implementation and experimental results

We implemented the proposed framework at our institutional campus. Five major video-analytics applications were deployed:

Parking Management System
Traffic Rule Violation Detection System
Crowd Detection System
Teaching Effectiveness Analysis System
Video-based Attendance System

as a pilot program. These applications use real-time feeds from IP-cameras over the campus network. Cameras installed inside buildings are connected to layer 1 edge nodes through Wi-Fi, while cameras installed outside the buildings are connected to layer 1 edge nodes through Power over Ethernet (PoE)LAN. The action trigger is specific to the application and decides the action to be triggered in case a particular inference is made. The server receives inference data and the video stream from the layer 2 edge nodes, stores the video streams, pre-configured triggers initiate actions/notifications as required. The servers comprising the core also maintain the application-specific and campus-wide analytics for all the deployed video applications. Table 2 describes the applications implemented on CVAF and the layer-by-layer processing performed, while Fig. 6 shows the working screenshots of the said applications.

Table 5 Time taken per frame for different operations at Layer 1 and Layer 2 edge nodes

Full size table

Table 6 Improvement in the average response time of different applications during peak and non-peak hours

Full size table

Experimental setup

For the evaluation of the effectiveness of CVAF, 54 IP cameras were deployed in campus transmitting frames at 25 FPS. We measure the network bandwidth and storage consumption for the applications implemented with and without CVAF. For Layer 1 edge nodes, we use Raspberry Pi 3 Model B+ [52], for Layer 2 edge nodes we use NVIDIA Jetson Nano and for the servers at the core, we use the Dell Precision workstations. Table 3 shows the specifications of the Layer 1, 2 and the Server. Table 4 shows the number of cameras, Layer 1 and 2 nodes used.

Results

We evaluated CVAF on the following parameters:

Overall network bandwidth usage by the video analytics applications in a 24-h window with and without CVAF optimizations (Fig. 7).
Overall storage consumption for video analytics applications with and without CVAF optimizations (Fig. 8).
Percentage delay caused due to CVAF and the percentage improvement in the response time of the applications (Table 6).

Figure 7 shows a radar chart of the combined network bandwidth consumption for the deployed video analytics applications with and without CVAF optimizations (filtering, inferencing, obfuscation turned on or off). CVAF reduces the network bandwidth usage by 27% even during peak usage hours and by 70% during non-peak usage hours. A majority of the optimization is attained by Layer 1 frame dropping during periods of low usage. During peak hours, the percentage of dropped frames reduces at layer 1, while layer 2 filtering post-processing and inferencing becomes more effective.

Figure 8 shows a radar chart of the storage consumption for the deployed video analytics applications with and without CVAF optimizations. CVAF reduces the storage consumption by 40% during peak usage hours and by 75% during non-peak usage hours, primarily due to duplicate frame dropping at Layer 1 and post-processing frame dropping at Layer 2. If video streaming is enabled at the server side, then the layer 2 frame dropping is disabled and frames are forwarded to the server, reducing the effectiveness of the filtering.

Table 5 shows the time taken by different operations at the Layer 1 and Layer 2 edge nodes. Layer 1 duplicate frame determination and dropping introduces a 6% delay in overall latency and Layer 2 introduces 11% delay causing the total overhead due to CVAF to be around 17%. However, lesser load on the network and quicker inference at the edge improve the response time of the applications by nearly 34%.

The average response time of a given application is the average time taken between the receipt of the frame at layer-1 edge and the final response generated at either layer-2 or server core depending upon the video application. The average response time decreases considerably when the application is deployed with CVAF. Table 6 shows the improvement in the average response time of different applications during the peak and non-peak hours.

The Parking Management System performs best when deployed on CVAF as its peak time of operation is highly concentrated, i.e., mostly during the morning and evening times. The improvement in average response time is attributed to the frame dropping logic on layer 1 edge nodes which filters frames not containing substantial information and the early inference on the layer 2 edge nodes. Figure 9 shows the variation in the average response time for different applications on an hourly basis. The overall effectiveness of CVAF is defined as following:

$$\begin{aligned} E_w= & {} ((N_{traditional,w}- N_{CVAF,w})/N_{traditional,w} \\&+(S_{traditional,w}- S_{CVAF,w})/S_{traditional,w})/2 \end{aligned}$$

where

w is the time window taken for calculating the effectiveness
$E_w$ is the effectiveness of CVAF in a time widow w
$N_{traditional,w}$ and $N_{CVAF,w}$ are the network bandwidth consumption in the traditional system and the CVAF system, respectively, in a time window w
$S_{traditional,w}$ and $S_{CVAF,w}$ are the storage consumption in the traditional system and the CVAF system, respectively, in a time window w

Figure 10 shows the hourly variation of overall effectiveness of the CVAF. CVAF achieves 0.75$-$0.80 effectiveness range during late evening/night when there is no or little human movement in the campus. The effectiveness decreases sharply during peak hours when people and vehicles are coming in the campus, and in the evening when people are leaving the campus.

We also surveyed 200 students and faculty members in the institution regarding their acceptance of the Teaching Effectiveness Evaluations System. The pie-chart in Fig. 11a shows that a majority of the students and faculty members did not want cameras to be installed in the classrooms due to privacy concerns. However, when privacy preservation and a recording-less model is factored in, the respondents agreed to having the system installed in the classroom.

Conclusions

We present a low cost, resource optimized and privacy preserving Campus-wide Video Analytics Framework (CVAF) which is novel and caters to multiple video applications with varying requirements. Privacy preservation is expected to emerge as a major requirement for large scale video surveillance applications in the near future. Our framework attains acceptable performance using off-the-shelf hardware components. Two major optimizations include duplicate frame dropping to reduce network bandwidth and storage requirements and fast face detection and obfuscation for privacy preservation. Deploying light-weight pre-trained AI-models at the edge for faster processing using GPU-based devices greatly speeds up video processing and reduces latency significantly. Major implication of this research is that standard off-the-shelf hardware, open-source software and existing network infrastructure can be used to build a minimum viable and scalable framework that optimizes resource utilization and preserves privacy of the users. Future work shall focus on security aspects of the video surveillance framework to work seamlessly in open environments such as smart cities. Ensuring performance, security and privacy at scale shall be the next big challenge for video analytics frameworks.

References

Haering N, Venetianer PL, Lipton A (2008) The evolution of video surveillance: an overview. Mach Vis Appl 19(5):279–290
Article Google Scholar
Dey S, Chakraborty A, Naskar S, Misra P (2012) Smart city surveillance: Leveraging benefits of cloud data stores. In: 37th Annual IEEE Conference on Local Computer Networks-Workshops, pp 868–876. IEEE
Ajiboye SO, Birch P, Chatwin C, Young R. Hierarchical video surveillance architecture: A chassis for video big data analytics and exploration. In: Video Surveillance and Transportation Imaging Applications 2015, vol. 9407, p. 94070 (2015). International Society for Optics and Photonics
jetson_tx1_whitepaper.pdf. https://www.nvidia.com/content/tegra/embedded-systems/pdf/jetson_tx1_whitepaper.pdf. (Accessed on 01/21/2022)
Video will account for 82% of all internet traffic by 2022, Cisco says. [Online; accessed 21. Jan. 2022] (2018). https://www.fiercevideo.com/video/video-will-account-for-82-all-internet-traffic-by-2022-cisco-says
Service Provider. Cisco. [Online; accessed 21. Jan. 2022] (2021). https://www.cisco.com/c/en/us/solutions/service-provider/index.html
Surveillance Camera Statistics: Which City has the Most CCTV Cameras? [Online; accessed 21. Jan. 2022] (2021). https://www.comparitech.com/vpn-privacy/the-worlds-most-surveilled-cities
UITP. [Online; accessed 21. Jan. 2022] (2022). https://www.uitp.org/publications/international-trends-in-video-surveillance
Ivanova I (2019) Video surveillance in U.S. described as on par with China. [Online; accessed 21. Jan. 2022]. https://www.cbsnews.com/news/the-u-s-uses-surveillance-cameras-just-as-much-as-china
The Video Surveillance Report (2020). https://www.ifsecglobal.com/resources/the-video-surveillance-report-2020/. (Accessed on 01/21/2022)
O’Carroll E (2019) Face off? Americans fear privacy loss to recognition software. Christian Science Monitor
Facial Recognition Map - Surfshark. [Online; accessed 21. Jan. 2022] (2022). https://surfshark.com/facial-recognition-map
Civil society initiative for a ban on biometric mass surveillance practices. [Online; accessed 21. Jan. 2022] (2022). https://europa.eu/citizens-initiative/initiatives/details/2021/000001_en
BBC NEWS $\vert $ England $\vert $ Merseyside $\vert $ Peeping tom CCTV workers jailed. [Online; accessed 21. Jan. 2022] (2006). http://news.bbc.co.uk/1/hi/england/merseyside/4609746.stm
Leaked camera footage from S’pore homes sold online: Change passwords to deter hacking. [Online; accessed 21. Jan. 2022] (2020). https://mothership.sg/2020/10/hacked-camera-footage-singapore-leaked
iVideon Russian-based video surveillance solution leaked data, hundreds of thousands of records exposed. [Online; accessed 21. Jan. 2022] (2018). https://securityaffairs.co/wordpress/72424/data-breach/ivideon-data-leak.html
Exposed Video Streams: How Hackers Abuse Surveillance Cameras - Security News - Trend Micro SG. [Online; accessed 21. Jan. 2022] (2022). https://www.trendmicro.com/vinfo/sg/security/news/internet-of-things/exposed-video-streams-how-hackers-abuse-surveillance-cameras
Henriquez, M.: Verkada breach exposed live feeds of 150,000 surveillance cameras inside schools, hospitals and more. Security Magazine (2021)
Ali M, Anjum A, Rana O, Zamani AR, Balouek-Thomert D, Parashar M (2020) Res: Real-time video stream analytics using edge enhanced clouds. IEEE Transactions on Cloud Computing
Wang S, Zhang C, Shu Y, Liu Y (2019) Live video analytics with fpga-based smart cameras. In: Proceedings of the 2019 Workshop on Hot Topics in Video Analytics and Intelligent Edges, pp. 9–14
Barthélemy J, Verstaevel N, Forehead H, Perez P (2019) Edge-computing video analytics for real-time traffic monitoring in a smart city. Sensors 19(9):2048
Article Google Scholar
Jiang J, Ananthanarayanan G, Bodik P, Sen S, Stoica I (2018) Chameleon: scalable adaptation of video analytics. In: Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, pp. 253–266
Canel C, Kim T, Zhou G, Li C, Lim H, Andersen DG, Kaminsky M, Dulloor SR (2019) Scaling video analytics on constrained edge nodes. arXiv preprint arXiv:1905.13536
Xu R, Nikouei SY, Chen Y, Polunchenko A, Song S, Deng C, Faughnan TR (2018) Real-time human objects tracking for smart surveillance at the edge. In: 2018 IEEE International Conference on Communications (ICC), pp. 1–6. IEEE
Padilla-López JR, Chaaraoui AA, Flórez-Revuelta F (2015) Visual privacy protection methods: A survey. Expert Syst Appl 42(9):4177–4195
Article Google Scholar
Bentafat E, Rathore MM, Bakiras S (2020) A practical system for privacy-preserving video surveillance. In: International Conference on Applied Cryptography and Network Security, pp. 21–39 . Springer
Das A, Degeling M, Wang X, Wang J, Sadeh N, Satyanarayanan M (2017) Assisting users in a world full of cameras: A privacy-aware infrastructure for computer vision applications. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1387–1396. IEEE
Neff C, Mendieta M, Mohan S, Baharani M, Rogers S, Tabkhi H (2019) Revamp 2 t: Real-time edge video analytics for multicamera privacy-aware pedestrian tracking. IEEE Internet Things J 7(4):2591–2602
Article Google Scholar
Wang J, Amos B, Das A, Pillai P, Sadeh N, Satyanarayanan M (2018) Enabling live video analytics with a scalable and privacy-aware framework. ACM Trans Multimed Comput Commun Appl (TOMM) 14(3s):1–24
Article Google Scholar
Du W, Li A, Zhou P, Niu B, Wu D (2021) Privacyeye: A privacy-preserving and computationally efficient deep learning-based mobile video analytics system. IEEE Trans Mob Comput
Xu M, Zhu M, Liu Y, Lin FX, Liu X (2018) Deepcache: Principled cache for mobile deep vision. In: Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, pp. 129–144
Bannister A (2019) The video surveillance report 2019. https://www.ifsecglobal.com/video-surveillance/the-video-surveillance-report-2019/. (Accessed on 01/25/2022)
Video-surveillance | European Data Protection Supervisor. https://edps.europa.eu/data-protection/data-protection/reference-library/video-surveillance_en. (Accessed on 01/25/2022)
Buy the Latest Jetson Products | NVIDIA Developer. https://developer.nvidia.com/buy-jetson. (Accessed on 01/25/2022)
NVIDIA Jetson Nano Developer Kit | NVIDIA Developer. https://developer.nvidia.com/embedded/jetson-nano-developer-kit. (Accessed on 01/25/2022)
NVIDIA TensorRT | NVIDIA Developer. https://developer.nvidia.com/tensorrt. (Accessed on 01/25/2022)
ONNX | Home. https://onnx.ai/. (Accessed on 01/25/2022)
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612
Article Google Scholar
Zhu J-J, Xiong Y-K, Lu X, Liu H-F (2015) Image processing device and method for determining a similarity between two images. Google Patents. US Patent 9,076,066
Van Heel M (1987) Similarity measures between images. Ultramicroscopy 21(1):95–100
Article Google Scholar
Silva EA, Panetta K, Agaian SS (2007) Quantifying image similarity using measure of enhancement by entropy. In: Mobile Multimedia/Image Processing for Military and Security Applications 2007, vol. 6579, p. 65790. International Society for Optics and Photonics
Zhang S, Zhu X, Lei Z, Shi H, Wang X, Li SZ (2017) Faceboxes: A cpu real-time face detector with high accuracy. In: 2017 IEEE International Joint Conference on Biometrics (IJCB), pp 1–9. IEEE
Chen D, Ren S, Wei Y, Cao X, Sun J (2014) Joint cascade face detection and alignment. In: European Conference on Computer Vision, pp 109–122 . Springer
Zafeiriou S, Zhang C, Zhang Z (2015) A survey on face detection in the wild: past, present and future. Comput Vis Image Underst 138:1–24
Article Google Scholar
Kumar A, Kaur A, Kumar M (2019) Face detection techniques: a review. Artif Intell Rev 52(2):927–948
Article Google Scholar
Zhou Y, Liu D, Huang T (2018) Survey of face detection on low-quality images. In: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp 769–773. IEEE
GitHub - tesseract-ocr/tesseract: Tesseract Open Source OCR Engine (main repository). https://github.com/tesseract-ocr/tesseract. (Accessed on 01/26/2022)
YOLO: Real-Time Object Detection. https://pjreddie.com/darknet/yolo/. (Accessed on 01/26/2022)
Azhar MIH, Zaman FHK, Tahir NM, Hashim H (2020) People tracking system using deepsort. In: 2020 10th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), pp 137–141. IEEE
Zhang K, Zhang Z, Li Z, Qiao Y (2016) Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process Lett 23(10):1499–1503
Article Google Scholar
Schroff F, Kalenichenko D, Philbin J (2015) Facenet: A unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 815–823
Buy a Raspberry Pi 3 Model B+ - Raspberry Pi. https://www.raspberrypi.com/products/raspberry-pi-3-model-b-plus/. (Accessed on 01/26/2022)
Motion. https://motion-project.github.io/motion_config.html. (Accessed on 01/26/2022)
XAMPP Installers and Downloads for Apache Friends. https://www.apachefriends.org/index.html. (Accessed on 01/26/2022)
Raspberry Pi Documentation - Raspberry Pi OS. https://www.raspberrypi.com/documentation/computers/os.html. (Accessed on 01/26/2022)
Jetson Linux | NVIDIA Developer. https://developer.nvidia.com/embedded/linux-tegra. (Accessed on 01/26/2022)
The CentOS Project. https://www.centos.org/. (Accessed on 01/26/2022)

Download references

Author information

Ankur Gupta and Purnendu Prabhat have contributed equally to this work.

Authors and Affiliations

Department of Computer Science and Engineering, Model Institute of Engineering and Technology, Kot Bhalwal, Jammu, 181122, Jammu and Kashmir, India
Ankur Gupta & Purnendu Prabhat

Authors

Ankur Gupta
View author publications
You can also search for this author in PubMed Google Scholar
Purnendu Prabhat
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Purnendu Prabhat.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Gupta, A., Prabhat, P. Towards a resource efficient and privacy-preserving framework for campus-wide video analytics-based applications. Complex Intell. Syst. 9, 161–176 (2023). https://doi.org/10.1007/s40747-022-00783-w

Download citation

Received: 03 May 2021
Accepted: 18 May 2022
Published: 24 June 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s40747-022-00783-w

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Towards a resource efficient and privacy-preserving framework for campus-wide video analytics-based applications

Abstract

Similar content being viewed by others

Real-Time Surveillance Video Analytics: A Survey on the Computing Infrastructures

Intelligent Video Surveillance as a Service

Video Big Data Analytics in the Cloud: Research Issues and Challenges

Introduction

Related work