1 Introduction

Scenarios such as industrial monitoring [1], environmental monitoring [2], and smart cities [3] have substantially changed the requirements placed on wireless vision sensor networks (WVSNs): several cameras must cover large areas while performing image-processing and communication tasks in real time. The design and architecture of camera-based networks have changed considerably over the years. Initially, such systems consisted mainly of CCTV cameras connected either to monitors for constant visual inspection, or to a memory device for storage. As a result, the architecture came with high financial costs due to the constant need for staff to perform visual inspection, as well as extensive memory requirements. The need for more autonomous systems resulted in the design of smart camera nodes, which embed processing capabilities in the camera node itself. However, scenarios such as industrial or outdoor monitoring expose the complexity of wired vision system architectures, caused by camera deployment limitations and electromagnetic noise. From the communication perspective, this has been overcome through the use of wireless communication technologies, i.e. WVSNs. To address the problem as a whole, we need to investigate energy-efficient designs of smart cameras, enabling their implementation as battery-operated devices with a satisfactory lifetime.

Energy efficiency in the smart camera node is a twofold problem, highly influenced by the allocation of image-processing tasks. On the one hand, tasks such as background modelling, segmentation, morphology, detection and tracking are computationally demanding. On the other hand, the available communication technologies differ in energy efficiency and delay for different data rate requirements. An overview of state-of-the-art research on WVSNs indicates that smart camera optimisation (of either image-processing performance or energy consumption) and communication network optimisation (energy-efficient protocols) are analysed as separate entities. Work on the energy efficiency of the smart camera node through processing focuses on the energy-efficient implementation of image-processing algorithms [4], or on the distribution of tasks between the node and a server/cloud connected by a communication technology chosen a priori [5]. The presented architectures consist of three main configurations:

  • In-sensor processing: the processing is done in the sensor node, while the configuration between hardware and software implementation can vary. The aim in [4, 6] is to optimise such architectures for smart cameras.

  • Sensor and remote processing: the processing tasks can be partitioned between the sensor and the remote processor, which can be a server or the cloud. An example of outdoor monitoring with the allocation of image-processing tasks chosen a priori is provided in [7].

  • Remote processing: the sensor node has no processing capabilities, and all the data are transferred to the cloud for processing and analysis.

By contrast, articles that analyse communication energy efficiency implement a predefined image-processing design on the node and place the emphasis on optimising the energy consumption of the communication model. As a result, all previously mentioned approaches provide only a partial understanding of the variations in the smart camera node energy consumption, as they omit the inter-effects of processing and communication configurations.

We use the term intelligence partitioning to refer to a design exploration approach that analyses the inter-effects of the processing and communication components in the WVSN, optimising the node energy consumption and the real-time performance of the system. This approach relies on the analysis of several partitioning configurations of the image-processing tasks between the computational entities. The schematic representations in Figs. 1 and 2 show that, for each configuration, we select a partition point; the tasks on the left side are allocated to the smart camera, while the tasks on the right side are executed in the cloud. Furthermore, for the tasks allocated to the smart camera we also consider the configurations generated by hardware/software partitioning of the tasks. For each task partitioning configuration and each communication technology included in the analysis, the overall energy consumption of the smart camera node is the sum of the processing energy consumption and the communication energy consumption.

Fig. 1

Representation of the image-processing tasks partitioning

Fig. 2

Schematic representation of intelligence partitioning between the smart camera node and the cloud, including the communication technologies

In this paper, we introduce an analysis of the inter-effects between computation and communication energy consumption in a smart camera node, for different task partitioning configurations of image processing. We investigate how the allocation of image-processing tasks on embedded hardware, software or cloud affects the overall node energy consumption, for a people counting scenario. Furthermore, we implement this approach to the scenarios presented by  [8] and  [9] to compare the energy efficiency of this method with their results. The aim of this work is not only to provide an energy-efficient design for a people counting scenario, but mainly to explore the trade-off between computation and communication energy for data-intensive IoT scenarios.

The remainder of this paper is organised as follows. In Sect. 2 we review WVSN architectures focusing on approaches for energy-efficient nodes, and their decision-making process in the design space exploration. Section 3 introduces the models for intelligence partitioning and communication, while Sect. 4 provides a detailed analysis of the selected communication technologies. In Sect. 5 we present the design cases used for the intelligence partitioning analysis and measurements results for aspects such as communication delay and the node energy consumption, which are further discussed in Sect. 6. Section 6 summarises the results and key findings of this paper.

2 Related work

The architecture of people counting systems can be divided into two categories: image based and non-image based. Several non-image-based methods rely on passive infrared (PIR) sensors in the region of interest, whose major drawback is poor occlusion handling. Alternative methods, such as the analysis of WiFi channel use [10] or a combination of infrared lasers with an IR camera to detect the displacement of the IR rays [11], provide an approximate count, but do not distinguish between people and other moving elements that can be present in the region of interest. In contrast, the image-based people counting models introduced in [12,13,14,15] rely on depth imagery to overcome occlusion limitations, using a GPU as the processing element. The emphasis of these approaches is on detection accuracy; hence, they propose computationally intensive models that run on devices that cannot be battery operated. Furthermore, cloud processing for such scenarios would require high bandwidth and incur significant delays, which would affect the real-time performance.

In the last decade, many WVSN architectures have been developed and further optimised in terms of energy consumption. Several strategies have been introduced through the years, one of which has been an energy-efficient design through hardware/software optimisation in the smart camera node. Birem et al.  [4] introduced DreamCam, a smart camera based on low-power Altera-Cyclone III FPGA. They optimised the design by implementing their own hardware modules for feature extraction. Another smart camera architecture is the SENTIOF-CAM designed by  [5], where they implemented a low-complexity background modelling algorithm and duty cycling to optimise the energy consumption of the smart camera node. In addition, they partitioned the image-processing tasks between the node and a server. However, in their evaluation of a partitioning point they only considered the reduction of the data size to be transferred, not including the trade-off between processing and communication energy.

In several WVSNs, the smart camera node’s duty cycling and sleep mode have been used to reduce energy consumption. Hengstler et al. [16] implemented a sleep mode in the smart camera MeshEye based on a predefined time interval. This resulted in a reduced performance of the smart camera, as the active intervals are not event driven. Instead,  [7, 17] control the sleep/wake-up mode by using a passive infrared motion detector for scenarios of surveillance of restricted areas. However, the performance of the proposed system would be affected by two elements: wake-up delay and non-accurate pre-processing. The system wake-up delay can be significant in the chosen scenario, while the background modelling algorithm they implemented requires several history frames before providing a robust model to support an accurate detection.

An alternative approach towards the energy optimisation of WVSNs is introduced in the HuSIMS [18] project. The authors propose the use of semantic conversion to reduce the frame size before transferring the information to the server, but do not analyse how the energy consumption changes between the proposed method and in-node processing. In addition, the model uses mobile communication networks to transfer full frames to the end user when the alarm is activated, without considering the resulting high data rate. Furthermore, Berni et al. [19] designed the WVSN node Wi-FLIP, based on analogue pixel-level processing. Their system provides parallelism and energy efficiency, at the cost of detection performance and of adaptability of the design to alternative vision sensors. Cao et al. [20] have instead designed a self-optimising IoT WVSN, carrying out real-time processing and experimental communication configurations depending on the current energy levels. From the node architecture perspective, the configuration varies only in the selection of algorithms, while the processing allocation remains constant. Regarding communication, their prototype is promising for ideal environments, but the results might differ if implemented with current technology. The works in [21, 22, 23] consider optimisation in WSNs and focus on the distribution of the computational load to optimise resource utilisation. The framework introduced in [21] optimises the distribution of computational load and communication from the server to the processing nodes; the framework in [22] is implemented on the node itself to provide reliable processing, communication and data; and the framework in [23] focuses on task scheduling for cloud architectures, to reduce the execution latency of intensive tasks. These three frameworks provide a limited view of the problem for battery-operated nodes, as they present no data regarding the energy consumption and delay of the sensor to server/cloud communication, or regarding the effects of implementing the framework in the node itself.

The architectures reviewed present a variety of algorithmic and architectural methods to optimise the energy consumption of the edge node represented by the smart camera. The main limitation of the models is the exclusion, or restriction of the communication component to a single protocol chosen a priori. Instead, the growth in size and complexity of WVSNs requires a more inclusive analysis of the energy consumption in the node, in the IoT context. In this paper, we provide an analysis of the energy efficiency of the smart camera node evaluating the trade-off in energy consumption for different allocations of image-processing tasks between the smart camera node and the cloud. In addition, the model includes the energy consumption estimation for three categories of communication technologies (LAN, Cellular, and IoT) related to IoT scenarios. The aim is to provide an insight into the inter-effects of processing allocation and communication technologies in the overall energy consumption of the WVSN.

One important aspect of distributed smart camera systems is the interconnection of the different nodes by means of communication networks. Several networking technology options have been suggested for the various systems described above. However, the focus of designing smart camera systems is typically on the image-processing features and their implementation rather than communication technologies, leading to the selection of communication interfaces based on modules availability and data rate requirements. As a result, such systems tend to employ energy hungry communication technologies, such as cellular mobile communication or WiFi (e.g. [17, 18]) that provide high data rate. In terms of overall node energy consumption, this is likely to result in sub-optimal technology selections.

Selecting the communication technology requires knowledge of both the application-specific traffic pattern and the technology-specific protocol operation. Previous approaches that do consider communication costs apply simplistic models of the interfaces; in particular, the technology-specific protocol operation, which controls the timing of the communication and thus the energy consumed, is often omitted (e.g. [20, 24]). Hence, we developed a model [25] that reviews the available technology options, including the differences in energy consumption that result from different protocol behaviour. In contrast to other communication-specific articles with an IoT focus, we start from the opposite assumptions regarding the data rate and sending intervals. The focus of other publications is on handling many devices in parallel with low data rates and typically long duty cycles. Based on these assumptions, several low-rate technologies were developed for both short-range (e.g. IEEE 802.15.4, Bluetooth) and long-range applications (e.g. LoRa, NB-IoT, SigFox) [26]. Due to the low data rates provided by these technologies, they are often not considered for WVSNs, even though they are designed to provide more energy-efficient communication and require less power during operation. In [26], Morin et al. present node lifetime estimations for different ad hoc style IoT communication candidates based on the behaviour of the protocols. However, their analysis omits scenarios with frequent and large data transfers, such as WVSN scenarios. In [25], we provided models that allow this comparison, and we apply them in the current study.

3 Methods

3.1 Intelligence partitioning

The evolution of requirements and constraints in WVSNs in terms of energy consumption and delay requires a paradigm shift in design space exploration. Intelligence partitioning gives an insight into the inter-effects of processing and communication, and thereby supports the design of energy-efficient WVSNs. A WVSN consists of a set of tasks, where a specific task \(t_{i}\) is not bound to a specific geographical location; as such, it can be mapped to either the smart camera node or the cloud. The functionality F in Eq. (1), expressed as a set of tasks, is distributed between the camera node and the cloud, where the disjoint subsets \(f_{{\text {Node}}}\) and \(f_{{\text {Cloud}}}\) are the clusters of tasks mapped to the respective computational elements.

$$\begin{aligned} F=\big \{t_{1}, t_{2}, t_{3}, t_{4}\big \}, \quad F = f_{{\text {Node}}} \cup f_{{\text {Cloud}}} \quad \text {and} \quad f_{{\text {Node}}} \cap f_{{\text {Cloud}}} = \varnothing . \end{aligned}$$
(1)

The mapping function of the computational load to the computational elements is referred to as intelligence partitioning \(\mathfrak {I}(F)\).

$$\begin{aligned} \mathfrak {I}\big (F\big )=\bigg \{ \begin{array}{c} \big \{ f_{{\text {Node}}}, f_{{\text {Cloud}}} \big \} \\ D_{Node \rightarrow Cloud} \end{array}. \end{aligned}$$
(2)

Computational latency is defined in Eq. (3) as the time for processing each set of tasks \(\big \{ f_{{\text {Node}}}, f_{{\text {Cloud}}} \big \}\) on the respective computational platform at each computational layer \(\big \{ P_{{\text {Node}}}, P_{{\text {Cloud}}} \big \}\).

$$\begin{aligned} L= \sum _{p \in \{{\text {Node}}, {\text {Cloud}}\}} L_{{\text {p}}}\big (f_{{\text {p}}}, P_{{\text {p}}}\big ) + \sum _{c \in \{{\text {Node}} \rightarrow {\text {Cloud}}\}} L_{{\text {c}}}\big (f_{{\text {c}}}, P_{{\text {c}}}\big ). \end{aligned}$$
(3)

Communication latency arises from the need to transfer the data \(D_{{\text {Node}} \rightarrow {\text {Cloud}}}\) between the computational layers over the communication link \(C_{{\text {Node}} \rightarrow {\text {Cloud}}}\). From these definitions, the latency for the intelligence partitioning function \(\mathfrak {I}(F)\) is derived as in Eq. (3), where \(L_{{\text {p}}}\) and \(L_{{\text {c}}}\) are the measurement or estimation functions for the processing and communication latency, respectively.

Another limiting resource is the node energy consumption, which can also be expressed as battery life or energy harvesting resource. The node energy \(E_{{\text {Node}}}\) can be formulated as:

$$\begin{aligned} E_{{\text {Node}}}= E_{{\text {p}}}\big ( f_{{\text {Node}}}, P_{{\text {Node}}}\big ) + E_{{\text {c}}}\big ( D_{{\text {Node}} \rightarrow {\text {Cloud}}}, C_{{\text {Node}} \rightarrow {\text {Cloud}}}\big ). \end{aligned}$$
(4)

\(E_{{\text {p}}}\) and \(E_{{\text {c}}}\) refer to estimation functions of the energy consumption for processing and communication, respectively. The objective is to find the lowest energy per sample \(E_{{\text {Node}}}\) under the constraint of a maximum latency \(L_{\max }\).

$$\begin{aligned} \min _{f(F)} E_{{\text {Node}}} \quad \text {s.t.} \quad L<L_{\max }. \end{aligned}$$
(5)

The constraints can also be formulated as in Eq. (6) where the aim is to find the lowest latency L under the constraint that the node energy consumption \(E_{{\text {Node}}}\) is below the node energy per sample available \(E_{{\text {a}}}\).

$$\begin{aligned} \min _{f(F)} L \quad \text {s.t.} \quad E_{{\text {Node}}}<E_{{\text {a}}}. \end{aligned}$$
(6)

In Sects. 5 and 6, we provide the results and analysis of the effects of intelligence partitioning in three deployment scenarios, with consideration regarding latency in Eq. (5) and energy consumption constraints in Eq. (6). Table 1 presents the resulting configurations of image-processing tasks allocation based on Eq. (2).
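As an illustration of Eqs. (3) to (5), the exploration can be sketched as an exhaustive search over partition points and communication links. The sketch below is not the authors' implementation: the names `Task`, `Link`, `Result`, `evaluate` and `best_partition` are hypothetical, every energy and latency figure is a placeholder, the per-byte link costs stand in for the estimation functions \(E_{{\text {c}}}\) and \(L_{{\text {c}}}\), and the cloud processing latency is assumed to be negligible.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Task:
    name: str
    energy_mj: float    # assumed energy per frame when run on the node
    latency_ms: float   # assumed latency per frame when run on the node
    output_bytes: int   # assumed size of the data passed to the next task


@dataclass(frozen=True)
class Link:
    name: str
    energy_mj_per_byte: float   # assumed, stands in for E_c
    latency_ms_per_byte: float  # assumed, stands in for L_c


@dataclass(frozen=True)
class Result:
    tasks_on_node: int  # partition point: the first k tasks run on the node
    link: str
    energy_mj: float
    latency_ms: float


def evaluate(tasks: List[Task], raw_bytes: int, link: Link, k: int) -> Result:
    """Node energy (Eq. 4) and latency (Eq. 3) with the first k tasks on the node."""
    on_node = tasks[:k]
    # Data crossing the partition point: output of the last node task,
    # or the raw frame if nothing is processed on the node.
    data = on_node[-1].output_bytes if on_node else raw_bytes
    energy = sum(t.energy_mj for t in on_node) + data * link.energy_mj_per_byte
    latency = sum(t.latency_ms for t in on_node) + data * link.latency_ms_per_byte
    return Result(k, link.name, energy, latency)


def best_partition(tasks: List[Task], raw_bytes: int, links: List[Link],
                   l_max_ms: float) -> Optional[Result]:
    """Eq. (5): lowest node energy among configurations with L < L_max."""
    feasible = []
    for link in links:
        for k in range(len(tasks) + 1):
            result = evaluate(tasks, raw_bytes, link, k)
            if result.latency_ms < l_max_ms:
                feasible.append(result)
    return min(feasible, key=lambda r: r.energy_mj, default=None)


# Placeholder figures for illustration only; none are measured values.
RAW_BYTES = 8940
TASKS = [
    Task("background subtraction", 0.20, 1.0, 8940),
    Task("segmentation", 0.05, 0.5, 1118),
    Task("morphology", 0.10, 0.5, 1118),
    Task("detection and tracking", 0.60, 2.0, 1),
]
LINKS = [
    Link("fast", 0.0005, 0.008),
    Link("slow", 0.0001, 0.1),
]
```

With these placeholder figures, a latency bound of 111 ms selects the partition after segmentation on the "fast" link, whereas relaxing the bound to 200 ms makes the same partition point on the "slow" link the lowest-energy choice. This is the kind of inter-effect between processing allocation and communication technology that the analysis targets.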

3.2 Communication energy

The focus of this paper is on partitioning image-processing tasks between the smart camera node and the cloud, with regard to the inter-effects in processing and communication energy consumption. Hence, for a complete picture of the trade-off between different task partitioning configurations, the communication model only has to reflect the impact on the sending node. We simplified the communication model by analysing the two-node system in Fig. 3, with point-to-point communication between the peer nodes. Even though we consider only the sending node for the energy consumption analysis, two nodes are required to describe the correct timing constraints of the communication, as the node will also receive control information from the peer. Furthermore, we assume that the nodes are already connected to each other and that the sensor node can send its data as soon as the image processing requires it. The model is based on ideal communication conditions with no transmission errors. In addition, interference from other nodes/technologies has been omitted, assuming ideal resource assignment for the transmitting sensor node. To transfer larger amounts of data, subsequent transmissions take place until the required data amount is transferred, depending on the technology-specific aspects.

Fig. 3

Two-node representation of the node-cloud communication model

Fig. 4

Activity cycle of data transmission

To model the energy consumed by the communication module or interface of the node, we calculate the required transmission time for the given data amount for each technology based on its physical and medium access layer operation. Higher layers of the communication stack are not considered. This assumes that fragmentation at the network layer is available, if the data to be transferred is larger than a single packet. To ensure reliable data transmission, acknowledgements are used as feedback from the receiver, according to the protocol specification of the given technology. The overhead considered includes all training and synchronisation sequences on the physical layer as well as any overhead due to header information at the MAC layer. The model uses a typical communication cycle according to Fig. 4, which is repeated multiple times, if more than one packet is needed to transfer the given data amount. This model follows the approach in  [27] for subsequent packet transmissions. Based on this, we are able to describe how long each transmission stage (e.g. sending tx, receiving rx, or waiting idle) takes for a single transmission of the given data amount. We then derive the overall energy consumption of the communication module \(P_{C({\text {tech}})}\) of the node based on the time it spends in the different states as well as current consumption taken from the data sheets of corresponding transceiver chips for each technology and the data amount d according to this equation:

$$\begin{aligned} P_{C\big ({\text {tech}} \big )}\big (d\big )= \,& {} t_{tx}\big (d_{rx}\big )P_{tx} + t_{rx}\big (d_{tx}\big )P_{rx} \nonumber \\&+ t_{{\text {idle}}}P_{{\text {idle}}} + t_{{\text {sleep}}}P_{{\text {sleep}}}. \end{aligned}$$
(7)

\(P_{tx}\), \(P_{rx}\), \(P_{{\text {idle}}}\) and \(P_{{\text {sleep}}}\) are the different power consumption levels per state of the given transceiver. \(t_{tx}\), \(t_{rx}\), \(t_{{\text {idle}}}\) and \(t_{{\text {sleep}}}\) are the protocol-specific durations spent in each state to transfer the data amount d. It should be noted that not all generic states will be used, depending on the actual operational specification of the given technology. The proposed communication model provides an estimation of communication specific energy consumption, required for the task partitioning analysis. Energy consumption and delay estimation results are optimistic compared to real-world implementation, as the systems are considered isolated, and any re-transmission of packets is not included. However, this model provides the fundamentals upon which to analyse the inter-effects of communication and processing energy consumption regarding the allocation of image processing tasks between the camera node and the cloud.
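The state-based model of Eq. (7) can be sketched as follows. This is a minimal illustration under stated assumptions: the names `Transceiver`, `Cycle`, `communication_energy_uj` and `communication_delay_ms` are hypothetical, all power levels and durations are placeholders rather than data sheet values, and every packet cycle of Fig. 4 is assumed to take the same time, including the last, partially filled packet.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Transceiver:
    """Power draw per state in mW (placeholder values, not from a data sheet)."""
    p_tx_mw: float
    p_rx_mw: float
    p_idle_mw: float
    p_sleep_mw: float


@dataclass(frozen=True)
class Cycle:
    """One packet cycle as in Fig. 4: payload size and time spent per state."""
    payload_bytes: int
    t_tx_ms: float    # sending the packet, including PHY/MAC overhead
    t_rx_ms: float    # receiving the acknowledgement
    t_idle_ms: float  # waiting between the stages


def packets_needed(data_bytes: int, cycle: Cycle) -> int:
    """Number of cycles required to transfer the data amount d."""
    return math.ceil(data_bytes / cycle.payload_bytes)


def communication_energy_uj(data_bytes: int, cycle: Cycle, trx: Transceiver,
                            t_sleep_ms: float = 0.0) -> float:
    """Eq. (7): sum of time in each state times the power of that state.

    ms * mW gives microjoules.
    """
    per_cycle = (cycle.t_tx_ms * trx.p_tx_mw
                 + cycle.t_rx_ms * trx.p_rx_mw
                 + cycle.t_idle_ms * trx.p_idle_mw)
    return packets_needed(data_bytes, cycle) * per_cycle + t_sleep_ms * trx.p_sleep_mw


def communication_delay_ms(data_bytes: int, cycle: Cycle) -> float:
    """Active time needed to transfer the data amount d (sleep time excluded)."""
    active = cycle.t_tx_ms + cycle.t_rx_ms + cycle.t_idle_ms
    return packets_needed(data_bytes, cycle) * active


# Placeholder technology description for illustration only.
EXAMPLE_TRX = Transceiver(p_tx_mw=30.0, p_rx_mw=20.0, p_idle_mw=5.0, p_sleep_mw=0.01)
EXAMPLE_CYCLE = Cycle(payload_bytes=100, t_tx_ms=1.0, t_rx_ms=0.2, t_idle_ms=0.3)
```

For example, `communication_energy_uj(8940, EXAMPLE_CYCLE, EXAMPLE_TRX)` accounts for 90 packet cycles, whereas a one-byte counting result needs a single cycle. As in the model itself, re-transmissions and interference are not represented.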

3.2.1 Compression and data aggregation

In the current literature, image compression is assumed a priori to reduce the energy consumption of the smart camera node, because it reduces the communication workload. However, the data rate requirements vary significantly across the task partitioning configurations that result from intelligence partitioning. Considering this data rate variation, and the inclusion of several communication technologies in the analysis, we decided to investigate how the inter-effects of processing and communication energy consumption are affected by image compression techniques. Initially, we implemented lossless greyscale image compression in embedded software. The results motivated us to investigate further; hence, we implemented CCITT Group 4 binary image compression in both embedded hardware and software. The software implementations are based on the OpenCV library, while the hardware implementation is from the work of Imran et al. [28]. Another method to reduce the communication workload is data aggregation. For partitioning configurations where all the processing is done locally in the node and only the counting result is sent to the cloud, a data transfer for each frame would be redundant. Instead, we transfer the data at predefined time intervals, which optimises the communication workload without affecting the people counting statistics.
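Both ideas can be sketched in a few lines. The first function expresses the break-even condition for compression: it only pays off if the energy spent compressing is smaller than the communication energy it saves, which depends on the communication technology. The class then illustrates data aggregation of counting results. The names `compression_pays_off` and `CountAggregator`, the per-byte energy model and all figures used with them are assumptions made for illustration, not part of the implementation described above.

```python
from typing import Optional


def compression_pays_off(raw_bytes: int, compressed_bytes: int,
                         e_compress_uj: float, e_per_byte_uj: float) -> bool:
    """True if compressing and sending costs less node energy than sending raw.

    e_compress_uj: assumed processing energy of the compression step.
    e_per_byte_uj: assumed communication energy per byte of the chosen link.
    """
    with_compression = e_compress_uj + compressed_bytes * e_per_byte_uj
    without_compression = raw_bytes * e_per_byte_uj
    return with_compression < without_compression


class CountAggregator:
    """Accumulates per-frame counting results and reports once per interval."""

    def __init__(self, frames_per_report: int) -> None:
        if frames_per_report < 1:
            raise ValueError("frames_per_report must be at least 1")
        self.frames_per_report = frames_per_report
        self._frames = 0
        self._total = 0

    def add(self, count: int) -> Optional[int]:
        """Add one frame result; return the aggregated count when a report is due."""
        self._frames += 1
        self._total += count
        if self._frames < self.frames_per_report:
            return None
        total = self._total
        self._frames = 0
        self._total = 0
        return total
```

With a hypothetical compression cost of 50 µJ that shrinks a 1118-byte binary image to 300 bytes, compression pays off on a link costing 0.5 µJ per byte but not on one costing 0.05 µJ per byte, so the benefit of compression cannot be decided independently of the communication technology.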

4 Communication technologies

There is a wide range of communication technologies that could be suitable for IoT systems. However, the suitability of a technology mainly depends on the constraints of the specific use case [29,30,31,32]. In the case of battery-operated smart camera systems, these constraints are energy consumption, a relatively high data amount compared to traditional IoT systems with a low duty cycle, and real-time performance. The latter two result in high data rate requirements if nodes exchange raw image data. Despite the need for high data rates, previous approaches to smart camera systems also considered traditional communication technologies for wireless sensor networks, such as Bluetooth Low Energy (BLE) or 802.15.4, which have rather low data rates [29, 33]. Therefore, we introduce the following categories for communication technologies.

4.1 Local area network communication

This category contains Bluetooth and WiFi as two technologies that are able to provide higher data rates and are available in various devices other than traditional sensors. The achievable communication range is approximately 100 m, and thus rather short. Out of these, we selected two versions of Bluetooth Low Energy (BLE): version 4.2 with a data rate of 1 Mbps and version 5 with 2 Mbps as well as WiFi according to the IEEE 802.11n standard.

4.2 Cellular communication

This category contains traditional public land mobile networking technologies that were developed for mobile phone communication and are mainly used for server to user communication, rather than communication between the sensor nodes and the server (e.g. in [18]). In this category, different technologies support a wide range of data rates, which are asymmetric between uplink and downlink. Hence, the sensor node has different rates for receiving (downlink) and sending (uplink) data. The exact rate of a user will depend on the current state of the network and the resources that are assigned to the device in question. All technologies have long-range capabilities with ranges over 1 km and generally require more energy than technologies designed for low-power IoT applications. However, for long-range IoT applications that require a high data rate, employing cellular communication technologies can be the optimal choice. In this study, we consider GPRS, HSPA and LTE Cat. 4 in this category.

4.3 IoT-specific communication

This category covers both traditional communication for low data rate wireless sensor networks and long-range technologies designed for the IoT context. The IEEE 802.15.4 standard represents the traditional low-rate communication in its conventional specification. BLE can also be a candidate in this category, but we chose to add it to the LAN category as it is also used there and supports higher data rates. Among the recently developed IoT-specific technologies, we consider NB-IoT, an LTE extension for machine-to-machine communication [34], and LoRa, but we omit SigFox from our analysis because its data rate is too low for WVSN applications. Furthermore, we added LTE Cat. 1 devices to this group, which were designed as an intermediate technology between the LTE devices of the previous category and NB-IoT in terms of low power consumption requirements. Both NB-IoT and LTE Cat. 1 belong to the group of cellular technologies.

With this selection, we are able to cover the complete data rate range of currently available technologies for WVSN communication. Table 2 gives an overview of the considered technologies. It also indicates which transceiver hardware was used to evaluate the energy consumption of the technologies. We based this analysis on the capabilities of suitable embedded transceiver chips, which might result in lower data rates than expected for each technology, especially in the case of WiFi.

5 Results

5.1 Design examples

5.1.1 People counting

Applications such as surveillance systems or environmental monitoring involve outdoor deployment of the camera node, with public spaces captured within the field of view. The main complexity of such systems is to provide a robust application despite the abrupt illumination changes throughout the day/night cycle. In addition, in many countries, including Sweden, there are legal restrictions prohibiting the installation of cameras in public spaces; hence, privacy concerns become a major constraint in the design of the system.

Table 1 Configurations resulting from intelligence partitioning for the people counting scenario
Table 2 Configurations of communication technologies

To overcome these complexities, we used a low-resolution thermal sensor that gives a generic temperature profile of the region/object of interest while providing a low-weight system compared to RGB cameras  [45]. For the people counting scenario, we created our own dataset from a setup installed in Härnösand, Sweden. The camera sensor used is the FLIR Lepton 3, a long-wave infrared sensor with wavelength 8–14 \(\mu \)m. The analysis is based on the processing of a video of 1:45 hours recorded from a stationary camera, with a frame rate of 9 fps, and frame size \(60 \times 149\) pixels. The processing platform used for the smart camera is the TE0726-03M Raspberry Pi with a System on Chip module that includes the Xilinx Zynq-7010 FPGA. The image-processing tasks used in the people counting scenario are listed below:

  • Background modelling and subtraction use a low-pass IIR filter, chosen for its energy efficiency and the accuracy of the resulting background model [45].

  • Segmentation is based on a global threshold defined experimentally.

  • Morphology relies on erosion and dilation operations with a \(3 \times 3\) mask.

  • Detection uses the bounding box method for each foreground element, while tracking relies on a Kalman filter.

  • Image compression uses CCITT group 4 and PNG compression for the binary and greyscale images, respectively, with more implementation details in Sect. 3.2.1.
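The first stages of this task chain can be sketched with NumPy as follows. This is a simplified illustration and not the implementation used in the paper, which is based on OpenCV and programmable logic: the filter coefficient `alpha` and the `threshold` value are placeholders, the function names are hypothetical, the bounding box is computed for the whole foreground rather than per foreground element, and tracking and compression are omitted.

```python
import numpy as np


def update_background(background: np.ndarray, frame: np.ndarray,
                      alpha: float = 0.05) -> np.ndarray:
    """First-order low-pass IIR background model: B <- (1 - alpha) * B + alpha * I."""
    return (1.0 - alpha) * background + alpha * frame


def segment(frame: np.ndarray, background: np.ndarray,
            threshold: float = 10.0) -> np.ndarray:
    """Background subtraction followed by a global threshold; returns a binary mask."""
    return np.abs(frame.astype(np.float64) - background) > threshold


def _neighbourhood(mask: np.ndarray, pad_value: bool) -> np.ndarray:
    """Stack the nine shifted copies of the mask that form a 3 x 3 window."""
    padded = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    h, w = mask.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])


def erode(mask: np.ndarray) -> np.ndarray:
    """3 x 3 erosion: a pixel survives only if its whole window is foreground."""
    return _neighbourhood(mask, True).all(axis=0)


def dilate(mask: np.ndarray) -> np.ndarray:
    """3 x 3 dilation: a pixel is set if any pixel in its window is foreground."""
    return _neighbourhood(mask, False).any(axis=0)


def bounding_box(mask: np.ndarray):
    """Bounding box (top, left, bottom, right) of all foreground pixels, or None."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])


def process_frame(frame: np.ndarray, background: np.ndarray):
    """Run segmentation and morphology, then update the background model."""
    mask = dilate(erode(segment(frame, background)))
    return bounding_box(mask), update_background(background, frame)
```

Erosion followed by dilation removes isolated noise pixels while restoring the extent of larger warm regions, which is why a single hot pixel does not produce a detection in this sketch.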

Background modelling, segmentation, morphology and image compression were implemented in both the programmable logic and the ARM Cortex A9 processor; the remaining tasks were only implemented in embedded software [46]. The software implementation of the image-processing tasks is based on the OpenCV libraries. To estimate the energy consumption of the different task partitioning configurations included in our analysis, we used the Xilinx Power Estimation tool, which has an error margin of 20% [47], together with the post-synthesis hardware description.

5.1.2 Pedestrian and particle detection

Besides the people counting scenario, we also evaluate the effects of intelligence partitioning in the node energy consumption for the smart camera nodes introduced by Imran et al.  [8] and Maggiani et al.  [9].

The energy consumption results regarding the computation are, respectively, based on the Xilinx Power Estimation tool and PowerPlay Early Power Estimator tool  [48] with a 20\(\%\) error margin. The flowcharts in Fig. 5 show the computational tasks considered for intelligence partitioning in each of the implementation cases analysed.

Fig. 5

Flowchart representation of computational tasks for: a people counting; b particle detection  [8]; c pedestrian detection  [9]

5.2 Measurement results

The analysis of the overall energy consumption in the smart camera node for the three scenarios is based on the combined results of image-processing task partitioning and communication technologies. The computational tasks in Fig. 5 are allocated among the processing elements in the node and cloud, resulting in several configurations for the people counting scenario as shown in Table 1. In the following analysis, the communication technologies will be referred to according to the plotting order in Table 2, while all the calculations for delays and energy consumption in the smart camera node are frame based.

Table 3 Delay constraints in the communication technologies for data rates resulting from intelligence partitioning

5.2.1 Delay and channel usage

Scenarios such as environmental monitoring, industrial monitoring and surveillance systems depend heavily on meeting timing constraints to maintain system performance. Task partitioning configurations result in data rate requirements ranging from 8940 down to 0.5 bytes per frame for the people counting scenario, and from 964,608 down to 259 bytes per frame for the other two scenarios. This in turn affects the performance of the communication technologies considered, especially IoT communication technologies, as they have been designed for long duty cycles and low data rate operations. To define the delay constraint for each scenario, we referred to the frame rate of the smart cameras and compared it to the communication delay. We omit the processing delay in the smart camera node from our consideration, as it is in the range of ns to µs. Table 3 summarises the effects of the delay constraints on the ten communication technologies considered, based on the data rate requirements resulting from intelligence partitioning in each scenario.
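The delay check described above can be sketched as follows. This is a minimal illustration, not the procedure used to produce Table 3: the function names, the 1 Mbit/s link throughput and the 1 frame per second rate are hypothetical values chosen only to make the comparison concrete, while the two payload sizes are the extremes quoted in the text.

```python
def transfer_delay_s(payload_bytes: float, throughput_bps: float,
                     latency_s: float = 0.0) -> float:
    """Per-frame communication delay: fixed latency plus serialisation time."""
    if throughput_bps <= 0:
        raise ValueError("throughput must be positive")
    return latency_s + (payload_bytes * 8.0) / throughput_bps


def meets_delay_constraint(payload_bytes: float, throughput_bps: float,
                           frame_rate_fps: float, latency_s: float = 0.0) -> bool:
    """True if one frame's payload is delivered within one frame period."""
    frame_period_s = 1.0 / frame_rate_fps
    return transfer_delay_s(payload_bytes, throughput_bps, latency_s) <= frame_period_s


# Hypothetical link and camera settings (assumptions, not values from the paper).
LINK_BPS = 1_000_000   # effective throughput in bit/s
FRAME_RATE = 1.0       # frames per second

small_ok = meets_delay_constraint(8940, LINK_BPS, FRAME_RATE)      # people counting maximum
large_ok = meets_delay_constraint(964_608, LINK_BPS, FRAME_RATE)   # largest payload in the text
```

Under these assumed settings the 8940-byte payload fits within the frame period, whereas the 964,608-byte payload does not, which mirrors the way higher data rates exclude lower-throughput technologies.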

As expected, the IoT communication group is the most affected by the delay constraint. For the people counting scenario, LoRa supports only transfers of a few bytes per second resulting from full in-node implementation of the image-processing tasks, while NB-IoT and 802.15.4 support data rates up to binary image transfer (compressed and non-compressed, respectively). For the particle and pedestrian detection cases, by contrast, the support of IoT technologies is highly restricted due to higher data rate requirements and a three times shorter delay interval. The LTE Cat. 1 technology is an exception to the group performance, as it meets the delay requirements for data rates up to 11,264 bytes per frame.

In the cellular communication group, HSPA and LTE Cat. 4 technologies meet delay requirements for data rates up to 119,808 bytes per frame, while GPRS only supports data rates up to 91 bytes per frame for the people counting scenario. The LAN communication group provides support for all the data rates resulting from intelligence partitioning in the people counting scenario. However, for the particle and pedestrian detection scenarios, Bluetooth Low Energy devices are restricted to only three configurations, with data rates up to 680 bytes per frame, unlike 802.11n technologies that support up to 119,808 bytes per frame.

One of the three communication technology groups included in our analysis is cellular technologies. In this case, the communication channel is owned by a third party, and its utilisation results in additional monetary costs for the smart camera node. Hence, we need to take such costs into consideration in terms of subscription costs, which subsequently limits the amount of data to be transferred. As a result, cellular technologies would not be an optimal choice for the pedestrian detection scenario with monthly data rates in the range of terabytes. In the remaining two scenarios, the monthly data rate requirements are significantly reduced for task partitioning configurations with image compression, enabling the use of cellular technologies.
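To illustrate how per-frame payloads translate into monthly data volumes, the following sketch converts bytes per frame into bytes per month. The frame rate of 1 frame per second and the 30-day month are assumptions made for this example only; the per-frame payloads are the largest and smallest values quoted in Sect. 5.2.1.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def monthly_volume_bytes(bytes_per_frame: float, frame_rate_fps: float,
                         days: int = 30) -> float:
    """Data volume sent in one month of continuous operation."""
    return bytes_per_frame * frame_rate_fps * SECONDS_PER_DAY * days


ASSUMED_FPS = 1.0  # hypothetical frame rate, not taken from the paper

largest = monthly_volume_bytes(964_608, ASSUMED_FPS)   # uncompressed, largest payload
smallest = monthly_volume_bytes(0.5, ASSUMED_FPS)      # full in-node processing
```

With these assumptions the largest payload amounts to roughly 2.5 TB per month, while the smallest stays at about 1.3 MB, which is consistent with the observation that only configurations with strong data reduction are compatible with typical cellular subscriptions.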

Table 4 Energy consumption per frame for greyscale compression

5.2.2 Data reduction

The constraints in terms of delay and channel utilisation showed a significant reduction in the number of communication technologies capable of meeting the requirements as the data rate increases. For the people counting and particle detection scenarios, the task partitioning configurations with the most restrictions were the ones without any data reduction method implemented. However, the focus is on exploring the energy efficiency of the smart camera node; hence, we analyse how the implementation of image compression affects the overall node energy consumption for the people counting scenario.

Table 5 Energy consumption per frame for binary image compression

Table 4 shows the resulting energy consumption in the smart camera with and without greyscale image compression for configuration 9 of the task partitioning. The results for configurations 13 and 14 have been omitted, as all three configurations showed the same behaviour. Contrary to expectations, the overall energy consumption of the smart camera is higher with image compression. Comparing the points with the lowest energy consumption, which correspond to BLE 5 communication, the energy consumption is 24 times higher with greyscale compression. This shows that running the image compression algorithm in embedded software increases the processing energy consumption by much more than the data rate reduction saves in communication energy.

Considering the high impact of the processing energy consumption of the compression algorithm, we investigated hardware/software partitioning, this time for lossless binary image compression with CCITT G4. In addition, we omitted the frame header from the compressed packets, assuming an already acknowledged communication between the node and the cloud. The task partitioning configurations resulting in binary frames are 2, 4, 5, 7, 8, 11 and 12, all showing similar behaviour; hence, the results in Table 5 are only for configuration 2. Similarly to the previous case, the software implementation of the compression algorithm increased the overall energy consumption of the smart camera, by up to 200 times compared to the case with no compression. In contrast, owing to the fine-grained use of computational and memory resources in the hardware implementation, the overall energy consumption is five times lower than for the non-compressed case.
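The trade-off behind these results can be expressed as a simple break-even condition: compression lowers the node energy only if its processing cost is smaller than the communication energy it saves. The sketch below assumes a linear per-byte communication energy model, which is a simplification introduced here for illustration; all numeric values and function names are hypothetical and are not taken from Tables 4 and 5.

```python
def node_energy_j(e_proc_j: float, payload_bytes: float,
                  e_comm_j_per_byte: float) -> float:
    """Per-frame node energy: processing plus (assumed linear) communication cost."""
    return e_proc_j + payload_bytes * e_comm_j_per_byte


def compression_pays_off(e_compress_j: float, raw_bytes: float,
                         compressed_bytes: float,
                         e_comm_j_per_byte: float) -> bool:
    """True if the energy spent compressing is less than the transmit energy saved."""
    saved_j = (raw_bytes - compressed_bytes) * e_comm_j_per_byte
    return e_compress_j < saved_j


# Hypothetical values chosen only to show both outcomes.
RAW_BYTES, COMPRESSED_BYTES = 1000, 100
E_COMM = 1e-6          # J per byte (assumed)
E_SW_COMPRESS = 1e-2   # J per frame, software implementation (assumed)
E_HW_COMPRESS = 1e-5   # J per frame, hardware implementation (assumed)

software_helps = compression_pays_off(E_SW_COMPRESS, RAW_BYTES, COMPRESSED_BYTES, E_COMM)
hardware_helps = compression_pays_off(E_HW_COMPRESS, RAW_BYTES, COMPRESSED_BYTES, E_COMM)
```

In this illustrative setting the software implementation costs more than it saves, while the hardware implementation is below the break-even point, which matches the qualitative behaviour reported for the two implementations.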

Among the task partitioning configurations considered in this analysis, configurations 1, 3, 6 and 10 rely on full in-node implementation of the image-processing tasks. For the people counting scenario, this would result in transferring only the counted number to the cloud for further statistical analysis of the data. Hence, we considered the use of data aggregation to avoid redundant information regarding the people count, while optimising the communication energy consumption. Figure 6 shows the energy consumption per frame for all the task partitioning configurations, including those with full in-node implementation with and without data aggregation. The effects of applying data aggregation vary across the communication groups. The LAN technologies show the smallest effect, with a maximum reduction of 50%. The cellular group shows moderate results, with energy consumption reduced by a factor of 14 to 29, while the IoT group shows the largest reduction for NB-IoT and LoRa communication, by factors of 33 and 124, respectively, compared to the case without data aggregation. This is due to the design of such communication technologies, which are optimised for low data rate transfers and long duty cycles.
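One plausible form of data aggregation for the people count is to accumulate the per-frame counts over a fixed window and transmit a single value per window. The sketch below illustrates that idea only; the aggregation scheme actually used is not detailed in this section, and the class name and window length are hypothetical.

```python
from typing import List, Optional


class CountAggregator:
    """Buffers per-frame counts and emits one summed value per window (illustrative)."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.window = window
        self._buffer: List[int] = []

    def push(self, count: int) -> Optional[int]:
        """Add one frame's count; return the window total when the window is full."""
        self._buffer.append(count)
        if len(self._buffer) < self.window:
            return None          # nothing to transmit for this frame
        total = sum(self._buffer)
        self._buffer.clear()
        return total


agg = CountAggregator(window=3)
sent = [agg.push(c) for c in [1, 0, 2, 4, 1, 1]]
transmissions = sum(1 for s in sent if s is not None)
```

Here six frames result in two transmissions instead of six, so any fixed per-transmission energy cost is spread over the whole window. This is why technologies with a high per-packet overhead would be expected to benefit most from aggregation.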

Fig. 6 Energy consumption per frame in the smart camera node resulting from intelligence partitioning in the people counting scenario

5.2.3 Node energy consumption

The aim of this paper is to analyse how the overall energy consumption in the smart camera node is affected by processing allocation and communication technology choice. The results in Fig. 6 show the energy consumption per frame for the people counting scenario, based on the combination of task partitioning configurations, communication technologies and data reduction approaches introduced above. We analyse the results from left to right, beginning with the two data rate groups related to full in-node implementation with and without data aggregation. The variation in energy consumption among the communication technologies is about one order of magnitude, and all the communication technologies support the resulting data rate. These are followed by the data rate groups resulting from binary image compression after and before morphology, respectively. Here the variation in the overall node energy consumption is about two orders of magnitude, reflecting the hardware/software partitioning of the tasks within the smart camera and, most importantly, the differences in energy consumption between the communication technologies. The results show that LAN technologies provide a lower energy consumption than the remaining technologies, while LoRa and NB-IoT have been omitted due to the delay constraint. The following data rate groups consist of binary frame transfer, compressed greyscale and non-compressed greyscale. These show high energy consumption, while the number of communication technologies that support the data rates is reduced. The minimum energy consumption among all the configurations considered is achieved when background modelling, segmentation, morphology and image compression are implemented on embedded hardware in the smart camera, supported by BLE 5 communication.

Fig. 7 Energy consumption per frame in the smart camera node resulting from intelligence partitioning

Similarly to the people counting scenario, the results in Fig. 7a for the particle detection scenario show a reduction of the node energy consumption as the processing tasks are distributed between the node and the cloud. As the data size increases, the number of communication technologies supporting it decreases, and the cases of full cloud processing become infeasible due to communication delay. The optimal configuration consists of capturing, pre-processing, segmenting and compressing the images before transferring the data to the cloud with BLE 5 communication.

In contrast to the previous results, the data in Fig. 7b show that, in the pedestrian detection scenario, the lowest node energy consumption is achieved with full in-node processing supported by 802.11n communication. The high data rates resulting from the partitioning configurations of this scenario exclude most of the communication technologies due to the delay constraint.

6 Discussion on intelligence partitioning in WVSNs

In this paper, we have analysed the effects of intelligence partitioning on the energy efficiency of the smart camera node. The combination of three processing environments and ten communication technologies provided a broader perspective on the problem of design space exploration for smart camera architectures. Traditional architectures focus on either full in-node implementation of the processing tasks or remote processing of the captured frames. The results from intelligence partitioning challenge such views, showing that distributing the image-processing tasks between the node and the cloud can optimise the node energy consumption owing to the interplay between processing and communication energy consumption. The processing energy consumption for the node is obtained from power estimation tools with a 20% error margin. This uncertainty in the processing component leaves the results of intelligence partitioning introduced above unchanged, as the differences between intelligence partitioning groups are much larger than 20%.
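The robustness argument regarding the 20% error margin can be made explicit with a worst-case comparison: configuration A remains better than configuration B if A's total energy with processing energy overestimated by the margin is still below B's total with processing energy underestimated by the margin. The sketch below is an illustration under the assumption that the margin applies only to the processing term; the function name and the numeric inputs are hypothetical.

```python
def ranking_is_robust(proc_a: float, comm_a: float,
                      proc_b: float, comm_b: float,
                      margin: float = 0.20) -> bool:
    """True if A stays below B even in the worst case of the estimation error.

    The margin is applied to the processing energy only (assumption);
    communication energy is treated as exact.
    """
    worst_a = proc_a * (1.0 + margin) + comm_a
    best_b = proc_b * (1.0 - margin) + comm_b
    return worst_a < best_b


# Hypothetical per-frame energies in arbitrary units.
large_gap = ranking_is_robust(proc_a=1.0, comm_a=0.5, proc_b=10.0, comm_b=0.5)
small_gap = ranking_is_robust(proc_a=1.0, comm_a=0.0, proc_b=1.3, comm_b=0.0)
```

When two configurations differ by an order of magnitude the ranking survives the error margin, whereas a 30% gap does not, so the margin only matters for configurations that are already close in energy.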

Performing intelligence partitioning in the three design examples provided configurations with data rates varying from 0.5 to 964,608 bytes per frame, which has a major effect on the constraints regarding delay and channel utilisation. The results showed that partitioning configurations with no data reduction methods implemented have a limited choice of communication technologies. Furthermore, the cases with fully remote processing of the data in the particle and pedestrian detection scenarios cannot be supported by any of the communication technologies considered, due to the delay and channel utilisation costs. To summarise the performance of the communication technologies, BLE 4 and 5 are the two best-performing technologies for all the configurations in the people counting scenario. For the particle and pedestrian detection scenarios, BLE provides the best energy performance for data below 1 kB, while for the remaining configurations, 802.11n and LTE Cat. 4 provide better energy efficiency.

The use of data reduction techniques not only affects the choice of communication technologies, but also the overall node energy consumption. The results obtained from the analysis on greyscale and binary image compression for the people counting scenario disproved the general assumption that image compression a priori reduces the node energy consumption. Fine-grained use of computational and memory resources for the hardware implementation of the compression algorithm reduces the processing energy consumption compared to the software implementation. Therefore, hardware/software partitioning of the compression algorithm influences the outcome due to the trade-off between processing and communication energy consumption.

A comparative assessment of the energy consumption results obtained from intelligence partitioning in the three scenarios shows that intelligence partitioning can improve the overall node energy consumption, while satisfying the constraints of real-time performance. However, the results also show that the effects of intelligence partitioning are affected by the relationship between additional processing load and the resulting data rate reduction, which can be the product of image-processing tasks or data reduction techniques. For the people counting and particle detection scenarios, this enables the allocation of the partitioning point in between the image-processing tasks. However, for the pedestrian detection scenario, there is a negative trade-off between the processing and data reduction, resulting in an energy-efficient partitioning at the end of the image-processing pipeline.

7 Conclusion

The results presented show that to improve the energy efficiency of WVSNs, we should review preconceptions regarding design space exploration. Energy-efficient distribution of image-processing tasks between the smart camera node and the cloud, as well as the selection of the communication technology, can improve the longevity of battery-operated nodes, compared to full in-node or remote processing scenarios. These results can therefore motivate future work that investigates the introduction of intermediate processing layers between the camera node and the cloud for further node energy efficiency.