Embedding autonomy in large-scale IoT ecosystems using CAO and L4G-CAO

Recently, special attention has been paid to developing methodologies and systems for embedding autonomy within smart devices (Things). Moreover, as Things typically operate in an interconnected IoT ecosystem, autonomous operation must be performed in a cooperative fashion, so that the different Things coordinate their autonomous actions towards meeting high-level objectives and policies. Embedding Things with cooperative autonomy typically requires a tedious and costly effort, not only during the original ecosystem deployment but throughout its lifetime. The current study describes CAO (Cognitive Adaptive Optimization) and its distributed counterpart L4G-CAO (Local for Global Cognitive Adaptive Optimization), which can overcome this shortcoming. CAO and L4G-CAO, which have recently been introduced and tested in a variety of IoT applications, can embed Things with cooperative autonomy in a plug-n-play fashion, i.e., without requiring the aforementioned tedious and costly effort. Results of the application of these approaches in three different application domains (smart homes and districts, intelligent traffic systems and coordinated swarms of robots) are also presented. The presented results demonstrate the potential of both approaches to exploit the IoT automation functionalities in order to significantly improve the overall IoT performance without tedious effort.


• A unified mathematical formulation of the problem of embedding cooperative autonomy within IoT ecosystems and a demonstration of how CAO and L4G-CAO can be employed for addressing such a problem;
• An overview of the main functionalities and mathematical attributes of CAO and L4G-CAO when applied for embedding autonomy in IoT ecosystems;
• A brief overview of the results and main conclusions of implementing CAO and L4G-CAO in challenging real-life large-scale IoT ecosystems.

The Problem set-up
Let us consider an IoT ecosystem consisting of N Things (smart devices), with each Thing embedded with an Autonomy Decision-Making Mechanism (ADMM) as follows:

u_i(t) = g_i(θ_i, z_i(t), d_i(t))

where t denotes the time index; g_i(⋅) is a non-linear vector function; z_i(t), d_i(t) denote the vectors of local data (e.g., sensor measurements) and external data (e.g., information available through the web/cloud), respectively, available to the ith Thing at time t; and θ_i is a vector of tunable parameters configuring the ADMM of the ith Thing, i.e., for different choices of θ_i we obtain different autonomous behaviours for the ith Thing. Let z, d, u, θ denote the augmented vectors of local and external data, actions and tunable parameters of the overall IoT ecosystem. The performance of the overall IoT ecosystem is evaluated through an objective function (performance index) over a time-horizon T:

J(θ, D_T) = Σ_{t=1}^{T} φ_t(z(t), d(t), u(t))   (4)

where the φ_t are known non-negative functions.
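The formulation above can be rendered as a short sketch. This is an illustrative toy (the class and function names are our own, not from the paper): each Thing applies its ADMM u_i(t) = g_i(θ_i, z_i(t), d_i(t)), and the ecosystem accumulates J = Σ_t φ_t(z(t), d(t), u(t)) over the horizon T.

```python
from typing import Callable, List, Sequence

class Thing:
    """One smart device with its Autonomy Decision-Making Mechanism (ADMM)."""
    def __init__(self, admm: Callable, theta: Sequence[float]):
        self.admm = admm          # g_i: the ADMM function
        self.theta = list(theta)  # theta_i: tunable parameters configuring the ADMM

    def act(self, z, d):
        # u_i(t) = g_i(theta_i, z_i(t), d_i(t))
        return self.admm(self.theta, z, d)

def ecosystem_cost(things: List[Thing], episode, phi) -> float:
    """Accumulate J over a horizon T; `episode` yields (z_list, d_list) per time step."""
    J = 0.0
    for z_all, d_all in episode:
        u_all = [th.act(z, d) for th, z, d in zip(things, z_all, d_all)]
        J += phi(z_all, d_all, u_all)   # stage cost phi_t
    return J

# Toy usage: two Things with a linear ADMM and a quadratic stage cost.
things = [Thing(lambda th, z, d: th[0] * z + th[1] * d, [0.5, 0.1]) for _ in range(2)]
episode = [([1.0, 2.0], [0.2, 0.3])] * 3            # T = 3 identical time steps
phi = lambda z, d, u: sum(ui ** 2 for ui in u)      # non-negative stage cost
J = ecosystem_cost(things, episode, phi)
```

Different choices of `theta` change the autonomous behaviour, and hence the value of `J`; this is exactly the handle that CAO tunes.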

Example 1
To better understand the above definitions, consider the example of a smart home comprised of N rooms: a smart device in each room autonomously controls the room's A/C (Air-Conditioning) set-points so as to (a) minimize energy bills and (b) keep the room's climate conditions (e.g., temperature, humidity, etc.) within some pre-specified limits. Then, z_i contains the ith room's indoor sensor readings (e.g., temperature, humidity, etc.), u_i denotes the ith room's A/C set-point and d_i contains external information such as the current and forecasted external weather data, energy prices, etc. The functions φ_t are typically calculated as

φ_t(z(t), d(t), u(t)) = a ⋅ EnergyConsumed(t) + b ⋅ PenaltyForIndoorConditions(t)

where the function PenaltyForIndoorConditions penalizes the cost whenever some room's indoor conditions exceed the pre-specified limits, and a, b are appropriately defined weighting/normalizing factors. Finally, the time-horizon T is typically selected to be one day. See e.g., [4,9,27,36] for more details on the above definitions. ◻ Remark 1 Typically, the ADMM is designed using parametrized rule-based logics, or it is based on standard control-system theory tools. Clearly, the choice of the ADMM is crucial for the efficiency of the IoT ecosystem: it must be designed in such a way that different choices of its tunable parameters θ_i cover all possible and feasible autonomous behaviours. The reader is referred to the practical applications described in the next two sections, where examples of choices for the ADMM are provided. ◻
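Example 1 and Remark 1 can be sketched in a few lines of code. This is a purely hypothetical parametrized rule (the rule form, parameter names and comfort band below are our own illustration, not the paper's ADMM):

```python
def admm_setpoint(theta, z, d):
    """Parametrized rule-based ADMM for an A/C set-point.
    theta = (nominal_setpoint, outdoor_gain); z = indoor temp; d = outdoor temp.
    Different theta values yield different autonomous behaviours."""
    nominal, gain = theta
    sp = nominal + gain * (d - nominal)       # shift with outdoor temperature
    return min(max(sp, 16.0), 30.0)           # keep the set-point feasible

def stage_cost(energy_kwh, indoor_temp, a=1.0, b=10.0, lo=20.0, hi=26.0):
    """phi_t = a * EnergyConsumed(t) + b * PenaltyForIndoorConditions(t):
    the penalty grows when the indoor temperature leaves the [lo, hi] band."""
    penalty = max(0.0, lo - indoor_temp) + max(0.0, indoor_temp - hi)
    return a * energy_kwh + b * penalty

# Hot day: outdoor 35 C, indoor 27 C (1 degree above the comfort band).
sp = admm_setpoint((24.0, 0.3), z=27.0, d=35.0)
cost = stage_cost(energy_kwh=5.0, indoor_temp=27.0)
```

Tuning `theta` trades the two cost terms against each other, which is precisely what CAO automates.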

Remark 2
The above formulation is valid not only in the case of IoT ecosystems consisting of homogeneous Things (like the smart home example provided above), but also for cases where heterogeneous Things live and interact in the same ecosystem.

φ_t(z(t), d(t), u(t)) = a ⋅ EnergyConsumed(t) + b ⋅ PenaltyForIndoorConditions(t)
The cost criterion (4) depends on the external-data sequence D_T = [d(1), …, d(T)]. Therefore, the problem of optimizing the overall IoT ecosystem performance can be mathematically formulated as the problem of finding the values of the tunable parameters θ that optimize the cost criterion (4). Please note that the dynamics (3) are "hidden" in equation (4): in other words, the computation of (4) requires knowledge of both the cost-function elements φ_t(⋅) and the functions g_i(⋅). As a result, there are two main limitations when attempting to solve such an optimization problem:
• (Limitation 1) It is difficult, if feasible at all, to apply standard optimization approaches (such as, e.g., gradient descent). Standard optimization approaches require an analytic form of the cost function (4) and, since this function depends on the dynamics (3), knowledge of the analytic form of the overall IoT ecosystem dynamics is required. However, extracting the analytic form of the IoT ecosystem dynamics is an extremely difficult task, if not an impossible one, even for small-scale implementations. To make things even worse, as the IoT ecosystem is usually subject to minor or major changes (e.g., addition/removal of devices, changes in end-user behaviour, etc.), constant adaptation of the IoT ecosystem model is required.
• (Limitation 2) Intelligent, adaptive and/or learning approaches that do not require knowledge of the analytic form of the IoT ecosystem dynamics may exhibit [29,35] very poor performance during adaptation, which, in turn, may put safety of operations at stake. Moreover, such approaches are typically applicable only to small- or medium-scale applications.
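Limitation 1 is easy to illustrate numerically. In the sketch below (an illustrative toy, not taken from the paper), the cost can only be *measured*, one full T-horizon experiment at a time, and each measurement is corrupted by exogenous disturbances; naive finite-difference gradient estimates on such measurements vary wildly from one attempt to the next:

```python
import random

random.seed(0)

def measured_cost(theta, noise=0.5):
    """Black-box stand-in for J: true optimum at theta = 2, plus measurement
    noise representing unmodeled disturbances over the horizon T."""
    return (theta - 2.0) ** 2 + random.gauss(0.0, noise)

def fd_gradient(theta, h=0.1):
    # Two full experiments per estimate; noise of size ~noise/h leaks into it.
    return (measured_cost(theta + h) - measured_cost(theta - h)) / (2 * h)

# Twenty estimates of the SAME gradient (true value: -4) disagree wildly.
grads = [fd_gradient(0.0) for _ in range(20)]
spread = max(grads) - min(grads)
```

This is the regime CAO is designed for: it never differentiates the measured cost directly, but fits a simple surrogate estimator to past measurements instead.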

Centralized version: the cognitive-based adaptive optimization tool
CAO [29,35] can overcome both Limitations 1 and 2 described in the previous section. Below, we provide a brief description of CAO along with its main properties. To start with, let us briefly explain how CAO is implemented. CAO starts with an initial set of tunable parameters θ(0) and lets the ADMM mechanisms operate the Things over a time-horizon T, keeping the tunable parameters constant and equal to θ(0). After the system operates over T time-units, CAO evaluates its performance through the cost function J(0) and calculates θ(1) using the algorithm of Table 1. This procedure is repeated for the next T time-units, so that CAO calculates θ(2) using J(1), then for the next T time-units to calculate θ(3) using J(2), and so on. The details of how θ(1), θ(2), … are calculated are provided in Table 1. The next theorem summarizes the main properties of CAO; its proof can be found in [29].

Theorem 1 Let D̂_T(k) − D_T be zero-mean and bounded. Then, under some mild conditions on the continuity of J, the following hold:

(a) θ(k) converges to θ*, where θ* denotes a local optimum of J, i.e., ∇J(θ*, D_T(k)) = 0;

(b) J(k + 1) ≤ J(k) + O(‖D̂_T(k) − D_T‖) + ε(k), where ε(k) is a term that decays to zero exponentially fast.
In simple words, the above theorem states that:
• CAO guarantees that the tunable parameters of the ADMM mechanisms will converge to their locally optimal values, provided that the prediction D̂_T satisfies some typical assumptions (see the third item in this list for more details). The performance improvement depends on the nature of the particular local optimum θ* to which the algorithm converges. If the ADMM mechanism is suitably chosen, the improvements such a local optimum provides can be significant: for instance, in the practical applications described later in this paper, the improvements can reach 30% or even higher. Moreover, if the ADMM is chosen according to the procedure suggested in [2], θ(k) converges to a point whose distance from the globally optimal performance is proportional to the complexity of the ADMM mechanism: the more complex the ADMM mechanism, the closer to the globally optimal performance is obtained (at the expense, of course, of a convergence speed that is inversely proportional to the ADMM complexity).
• Part (b) of Theorem 1 establishes that CAO does not face the risk of poor performance (one of the main shortcomings of other adaptive/learning approaches): the cost J(k + 1) is less than its value at the previous iteration plus two terms: (a) an unavoidable term that depends on the accuracy of the prediction of the external data; and (b) a term that converges to zero exponentially fast. Exponentially fast convergence to zero is the best that any adaptive/learning algorithm can achieve [19]. As a matter of fact, in the vast majority of adaptive/learning schemes a term similar to ε(k) is always present, with the difference that it does not converge exponentially fast: as a result, it may take significantly large values during adaptation, leading to very poor or, even, unsafe performance.
• The properties of CAO are established under some typical assumptions on the prediction D̂_T. Naturally, any type of algorithm depends on the accuracy of the prediction D̂_T, which corresponds, e.g., to weather predictions in the case of smart homes/districts, traffic predictions in the case of traffic systems, etc.
• Last, but not least, it is emphasized that, due to the model-free nature of CAO, it possesses self-reconfiguration capabilities: if the IoT infrastructure changes (e.g., nodes are added/removed), CAO automatically re-learns and re-adjusts the tunable parameters towards optimizing the altered system. The robotic application mentioned in this paper exhibits such an attribute: whenever the IoT system changes (because a node joins/leaves the system) or the user requirements change (which corresponds to a change in the structure of the cost function J), CAO rapidly reconfigures itself towards efficiently optimizing the altered system.

Table 1 The CAO Algorithm
At every kth iteration (where each iteration involves the IoT ecosystem operating for T time-units with θ being constant and equal to θ(k)), measure the IoT ecosystem performance J(k) and update θ using the following steps:
1. Construct an estimator (5) for J(k + 1), where Ĵ(k + 1) denotes the estimate (prediction) of J(k + 1) produced from a regression vector and an estimator vector. Standard function-approximation schemes (e.g., polynomials) can be used to construct estimator (5); the reader is referred to [29,34] for more details on how to construct such an estimator (it must be emphasized that it suffices to use estimators of a very "simple" structure, not very elaborate ones). The estimator vector is constructed using standard Least-Squares (LS) estimation over a time-window W(k).
2. Choose a positive function α(k), either a constant positive function or a time-descending function satisfying appropriate decay conditions (see [29,34]).
3. Generate, randomly or pseudo-randomly, a set of L candidate perturbations δθ^(1)(k), δθ^(2)(k), …, δθ^(L)(k), where the δθ^(j)(k) are vectors of the same dimension as θ(k) and L is an integer satisfying L ≥ 2 dim(θ).
4. Estimate the effect of each candidate perturbation on the current vector θ(k) by employing the estimator (5), and pick the candidate perturbation with the "best" effect, i.e., choose the vector δθ^(j*)(k) whose predicted cost is the best, where D̂_T(k + 1) denotes an estimate (prediction) of D_T(k + 1).

The reader is referred to [29,34] for more details on the CAO algorithm as well as for guidelines on the selection of its design parameters (the regression vector, α(k), W(k), etc.).
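The steps of Table 1 can be sketched compactly. The sketch below is a deliberately simplified toy, not the published implementation: it assumes a linear-in-features least-squares estimator of J, uses the full history as the LS window, and folds the step size α(k) into the candidate generation.

```python
import random

random.seed(1)

def features(theta):
    # A "simple" regressor, as the text suggests: constant + linear + quadratic terms.
    return [1.0] + list(theta) + [t * t for t in theta]

def ls_fit(X, y):
    """Ordinary least squares via (ridge-regularized) normal equations; tiny problems only."""
    n, m = len(X[0]), len(X)
    A = [[sum(X[k][i] * X[k][j] for k in range(m)) + (1e-6 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(X[k][i] * y[k] for k in range(m)) for i in range(n)]
    for i in range(n):                      # Gaussian elimination with pivoting
        p = max(range(i, n), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]; b[i], b[p] = b[p], b[i]
        for r in range(i + 1, n):
            f = A[r][i] / A[i][i]
            A[r] = [a - f * c for a, c in zip(A[r], A[i])]; b[r] -= f * b[i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

def cao_step(theta, history, J_measured, alpha, L=None):
    """One CAO iteration: fit the estimator on past (theta, J) pairs, try L
    random candidate perturbations, keep the one the estimator predicts best."""
    history.append((features(theta), J_measured))
    w = ls_fit([h[0] for h in history], [h[1] for h in history])
    L = L or 2 * len(theta) + 2                       # L >= 2*dim(theta)
    cands = [[t + alpha * random.uniform(-1, 1) for t in theta] for _ in range(L)]
    J_hat = lambda th: sum(wi * fi for wi, fi in zip(w, features(th)))
    return min(cands, key=J_hat)

# Toy usage: drive theta toward the minimum of J(theta) = (theta_0 - 1)^2.
theta, hist = [0.0], []
for k in range(40):
    Jk = (theta[0] - 1.0) ** 2                        # "measured" cost after T time-units
    theta = cao_step(theta, hist, Jk, alpha=0.2 / (1 + 0.1 * k))
```

Note that only measured costs enter the update; no analytic form of J or of the ecosystem dynamics is ever used, which is the essence of the model-free property discussed above.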

Smart Traffic Control (STC): real-life application in the city of Chania, Greece
One of the benefits of the impressive recent advances in the field of IoT is that it becomes increasingly affordable and "easier" to deploy Smart Traffic Control (STC) systems to intelligently and more efficiently control and manage traffic operations [3,34,38]. Unfortunately, embedding STC systems with intelligence requires a tremendous amount of human effort and time for programming and tuning the IoT devices involved in these operations. The programming and tuning procedure involves the calibration, adjustment and programming of hundreds of parameters, rules, operational schedules, decision-making mechanisms, etc., and is typically performed by experienced personnel. Thus, because of the complexity of the problem, there is no guarantee that the overall programming and tuning procedure will end up successfully.
The CAO system has been implemented in a real-life STC system to demonstrate its potential for providing an automated and systematic approach that obviates the need for this tedious and costly human involvement. The particular STC system where CAO has been implemented is the one for the urban road network of the city of Chania, Greece (see Fig. 1), which is a highly challenging traffic network: it involves a very complex signalling structure; frequent illegal or double parking, which changes the network characteristics and junction capacities in an unpredictable way; and a traffic demand that changes significantly throughout the year (Chania is a tourist city whose population increases by almost 100% during summer time). It is also emphasized that the ADMM employed and its original parameters (i.e., the tunable parameters before their tuning by CAO) correspond to a very well-designed STC system, achieving the best the state of the art can offer [28]. Table 2 provides the details of the CAO implementation for the STC system of the city of Chania.
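A back-of-envelope check shows how fuel-cost figures of the kind quoted for this deployment arise. The fuel consumption (10 L/100 km) and price (1.2 €/L) come from the text; the vehicle-kilometres per capita and the fuel-saving fraction below are assumptions introduced purely for illustration, not measurements from Chania:

```python
# Figures from the text:
FUEL_L_PER_100KM = 10.0     # urban fuel consumption
PRICE_EUR_PER_L = 1.2       # fuel price
POPULATION = 100_000        # size of the urban area considered

# Hypothetical figures (assumptions for this illustration only):
assumed_urban_km_per_person_year = 3_000
assumed_fuel_saving_fraction = 0.05   # 5% less fuel burnt thanks to smoother traffic

fuel_litres_saved = (POPULATION * assumed_urban_km_per_person_year
                     / 100.0 * FUEL_L_PER_100KM * assumed_fuel_saving_fraction)
annual_saving_eur = fuel_litres_saved * PRICE_EUR_PER_L
# Under these assumptions the saving lands within the 1-2 MEUR/year
# range reported in Table 3.
```

The point of the sketch is only that savings of the reported order of magnitude follow from modest per-vehicle reductions once aggregated over a city.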
The real-life results after implementing CAO for about 60 days (see Fig. 2) indicate that CAO was able to provide ∼50% improvements over a well-designed STC system. The improvements have been calculated based on the productivity index (the mean speed achieved inside the network multiplied by the traffic demand). The calculations for estimating the cost savings in Table 3 assume a fuel consumption of 10 L per 100 km [53] in urban areas and a price of 1.2 €/L. Table 3 summarizes the results of the CAO application in the STC system of the city of Chania.

Table 2 Details of the CAO implementation for the STC system of the city of Chania
ADMM: the strategy TUC(θ, z(t)), a well-established traffic control strategy based on control-systems principles [28]
θ(0): the parameter values obtained after a quite lengthy and tedious manual tuning in the past
Time-horizon: one CAO iteration of T time-units
J(t): (average mean speed of the whole traffic network) × (total number of vehicles entering the traffic network) = system productivity

Fig. 2 Real-life application of the CAO system to the urban road STC system of Chania: traffic network performance improvements (blue = real data; black = linear fit of the real data). The x-axis corresponds to the number of days the CAO system has been operating; the y-axis reflects the daily system performance in terms of (speed × demand), known as the system productivity.

Table 3 Results of CAO application in the STC system of the city of Chania
Improvement of traffic network performance as compared to the "best state-of-the-art": ∼50%
Annual fuel savings (due to reduction of travel times) as compared to the "best state-of-the-art": 1-2 million €/year for an urban area of 100,000 people

Smart energy homes (SEH): real-life applications in two large-scale buildings

Calculating the optimal decisions that balance energy and user needs is by no means an easy task. Extensive research and real-life experiments performed over the last decades have shown that demand-optimized actions require modifying the operating set-points many times during the day, in an intelligent and delicate manner. Such decisions should also consider the complex interplay between diverse factors such as equipment and envelope dynamics, user comfort and needs, occupancy schedules, weather conditions, etc. [9,13,27,40]. Things become far more complicated when local energy generation (renewable sources, spinning reserves, etc.) and storage are involved: in this case, the problem of generating optimal decisions that guarantee the aforementioned attributes becomes considerably harder. Unfortunately, existing methods for calculating such optimal decisions usually rely on analytic knowledge of the building dynamics [26,42]. Such an assumption is not realistic, since developing an analytic model is an extremely "expensive" and cumbersome procedure. Moreover, such models would require continuous recalibration, since SEH ecosystems are not static in time, a fact which renders model "maintenance" extremely expensive when it comes to large-scale deployments. To make things worse, even in cases where an elaborate model is available, existing methods for calculating the optimal decisions are computationally quite expensive [51].
The CAO system has been implemented in two real-life, large-scale SEH systems towards employing an automated and systematic control approach that is able to overcome the aforementioned drawbacks involved in existing solutions:

Application to the office building of AFCON Ltd. (Tel Aviv, Israel)
The first SEH system concerns the main office building of AFCON Ltd., located in a suburban area of Tel Aviv, Israel. It was built in 2004 to host over 600 employees. The building comprises 5 floors: floors 1 and 2 are used as storage spaces without any air-conditioning units, while floors 3, 4 and 5 consist of offices (about 70 offices and rooms per floor) (see Fig. 3a). The net heated floor area (3rd, 4th and 5th floors in total) is around 2350 m². The daily energy demand is approximately 11,879 kWh during the spring period. Two pairs of chillers are installed for indoor climate control; each chiller can deliver up to 150 refrigeration tons (600 tons in total), which corresponds to 527 kW per chiller (2.108 MW in total). The indoor air-conditioning system includes AHUs (Air Handling Units) serving offices located on the same floor; on average, 10 offices share the same AHU.
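The chiller figures quoted above are mutually consistent, as a quick unit conversion confirms (the conversion factor of roughly 3.517 kW per refrigeration ton is a standard approximate value, not taken from the paper):

```python
KW_PER_REFRIGERATION_TON = 3.517   # standard approximate conversion

per_chiller_kw = 150 * KW_PER_REFRIGERATION_TON   # ~527.5 kW, matching "527 kW per chiller"
total_kw = 4 * per_chiller_kw                      # four chillers (two pairs), ~2.11 MW
```

So 150 RT per chiller and 4 chillers indeed give the quoted 527 kW and ~2.108 MW.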
It should be noted that the comparison benchmark (base-case scenario, BCS) is the common control practice adopted in the real-life building: a constant chiller set-point of 11 °C during working days, with the AHU thermostats constantly set to 21 °C. The experiments focused on the 3rd, 4th and 5th floors, which consist of offices. The test period ran from Monday 30/3/2015 to Friday 10/4/2015 when, due to moderate outdoor conditions, the energy efficiency of the BCS was poor. Table 4 provides the details of the CAO implementation for this application.
The evaluation results demonstrated that CAO led to substantial power savings of ∼35%, translated into an average daily consumption of 6711 kWh, without violating the acceptable comfort bounds. An estimate of the potential savings, summarized in Table 5, in terms of energy cost can be extracted by considering that the benchmark control requires 11,879 kWh/day on average, whereas CAO requires only 6711 kWh/day. Using the EU-28 average price of 0.125 €/kWh for industrial consumers [15], this difference translates into daily savings of 646 € during the spring period.

Application to an office building of Technical University of Crete, (Chania, Greece)
The second SEH application involves a 2-floor office building located inside the campus of the Technical University of Crete, Greece (see Fig. 3b). The building area of 450 m² is divided into 10 offices, each equipped with a 12,000 BTU conventional air-conditioning unit as well as indoor temperature and humidity sensors. The building is also equipped with a photovoltaic (PV) panel, which provides solar energy to the building. The building is a conventional one with poor insulation characteristics, which render the problem of optimization and efficient control design extremely challenging due to the strong dependence of the indoor conditions on the outdoor ones. The energy consumption is highest during the summer period, when large cooling loads are required to achieve acceptable indoor thermal conditions. Large glass surfaces, combined with the Greek summer and the poor insulation of the building, usually lead to overheating. Therefore, the respective tests focused on reducing the air-conditioners' energy consumption during the summer period. The simple rule-based control strategy used in the building's control practice was adopted as the base case for comparison purposes. The rule-based control employs a very simple strategy, which consists of keeping the air-conditioner set-points constantly equal to 25 °C during office hours (8:00-17:00) and turning them off outside office hours. Table 6 provides the details of the CAO implementation for this application.
CAO was able to reduce energy consumption by 19%, while indoor comfort conditions remained within acceptable bounds. An estimate of the potential savings, summarized in Table 7, in terms of energy cost can be extracted by considering that the benchmark control requires 126 kWh/day on average, whereas CAO requires only 100 kWh/day. Using the EU-28 average price of 0.125 €/kWh for industrial consumers [15], this energy consumption difference translates into daily savings of 3.25 € during the summer period.
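The savings figures quoted for both buildings follow from the same simple arithmetic (all numbers below are taken from the text; the function name is ours):

```python
PRICE_EUR_PER_KWH = 0.125   # EU-28 average price for industrial consumers [15]

def daily_saving_eur(baseline_kwh, cao_kwh, price=PRICE_EUR_PER_KWH):
    """Daily economic saving given baseline and CAO-controlled daily consumption."""
    return (baseline_kwh - cao_kwh) * price

afcon_saving = daily_saving_eur(11879, 6711)   # AFCON office building: 5168 kWh/day saved
tuc_saving = daily_saving_eur(126, 100)        # TUC office building: 26 kWh/day saved
```

This reproduces the 646 €/day (AFCON) and 3.25 €/day (TUC) figures reported above.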

Autonomous trajectory design system for AUVs: real-life application in the Port of Porto, Portugal
Another instance of an IoT application is the deployment of underwater robots (AUVs) to accomplish underwater mapping; see e.g., [6,22,43,46,48]. Despite these advances, however, almost all underwater map-building methods are characterized by low autonomy, since they typically rely on a set of pre-defined trajectories and often on human intervention. In other words, AUVs usually follow trajectories designed off-line, before the actual deployment, which is a limiting factor when a totally unknown area is to be mapped: pre-defined trajectories are quite likely to "miss" areas rich in information, or the AUVs may waste valuable time focusing on low-information regions. A common approach for tackling these problems in practice is to perform the following repetitive procedure. Initially, the AUVs map the sea-floor following blindly defined trajectories (usually in a lawnmower pattern). Once this first step is accomplished, new trajectories are generated, again off-line, but now using the existing seabed knowledge from the constructed maps, and this procedure is repeated many times. To alleviate the previously described shortcomings, we apply the centralized CAO algorithm. The aim of this research is to generate on-line trajectories for a team of AUVs in order to construct fast and accurate sea-floor maps [21], while also enabling the possibility of simultaneously tracking a dynamic event. Two different experiments were conducted in the Leixões Port, located in the city of Oporto, Portugal. Both experiments involved a fleet of 3 AUVs (called Noptilus-1, Noptilus-2 and Noptilus-3), shown in Fig. 4. Table 8 provides the details of the CAO implementation for this application. Next, we summarize the details of the 2 real-life experiments.

Table 4 Details of the CAO implementation in the SEH system of AFCON Ltd.
u_i(t): AHU thermostat set-points, regulating in real time the water temperature
z_i(t): indoor temperature for all 210 offices located on the 3rd, 4th and 5th floors
d_i(t): current and forecasted ambient temperature, total solar radiation and occupancy
ADMM: combination of a linear controller and a rule-based controller
J(t): weighted summation of the active chillers' energy consumption and indoor comfort

Table 5 Results of CAO application in the SEH system of AFCON Ltd.
Daily energy savings during the spring period as compared to the "usual practice": 5168 kWh/day
Daily economic savings during the spring period as compared to the "usual practice": 646 €/day

Table 6 Details of the CAO implementation in the SEH system of the Technical University of Crete
z_i(t): indoor temperature and humidity for all 10 offices
d_i(t): current and forecasted ambient temperature, outdoor humidity, total solar radiation and occupancy
ADMM: combination of a linear controller and a rule-based controller
J(t): weighted summation of the total energy consumption and indoor comfort

1st experiment: one AUV faces hardware malfunction during the mapping mission
In this experiment, we deployed the fleet of 3 AUVs with the objective of performing cooperative mapping of the sea-floor using their bathymetric measurements. Figure 5a illustrates the progress of the 3 AUVs (blue lines) until time-step 90. The AUVs' positions at this time-step are depicted with magenta spheres. The black tiles correspond to areas where the AUVs have not yet acquired any measurement, while the colored ones correspond to areas where the AUVs have started (and may have completed) their estimation process. The color of each tile is an error index that varies from dark blue, where the AUVs have acquired a perfect match to the ground truth, to dark red, where the measurements do not have any correspondence with the actual surface (ground-truth map) underlying the specific tile. It should be highlighted that the CAO algorithm does not use any information regarding the ground-truth map (or error index): during the exploration process, the AUVs adjust their movements taking as input only their bathymeters' measurements and their locations (as estimated by the localization module).

Table 8 Details of the CAO implementation for the AUV application
Safety distance from the ground
d_r = 0.5 m: safety distance between any two robots
J(t): summation of the mapping performance on each tile

Figure 5b depicts the time-step where a (simulated) hardware malfunction took place. The malfunctioning vehicle had to return immediately to the base station to avoid jeopardizing such an extremely expensive piece of infrastructure. Figure 5c shows the adaptation of the navigation schemes of the two remaining AUVs. The important feature here is that one AUV autonomously chose to cover the tiles that would, under normal conditions, have been assigned to the damaged AUV. The mapping process was terminated after 450 time-steps, when the AUVs had covered the majority of the operation area, having estimated 136 of the 144 tiles (Fig. 5d).
It is worth mentioning that, in the majority of the estimated tiles, the AUVs acquired a satisfactory number of bathymeter measurements, different in each case, since the number is highly dependent on the actual morphology underlying the tile. A comparison was also performed against the usual practice of mapping using pre-defined trajectories [21]. The results of the comparison are summarized in Table 9.
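The tile-based objective described above can be sketched as follows. This is an illustrative toy, not the paper's implementation: the greedy waypoint rule and the distance penalty below are our own stand-ins for the CAO-driven trajectory generation, and the `reserved` set loosely mimics how one AUV took over tiles left by the malfunctioning vehicle.

```python
def mapping_cost(tile_errors):
    """J(t): summation of the (per-tile) mapping error over the grid."""
    return sum(sum(row) for row in tile_errors)

def best_next_tile(tile_errors, auv_pos, reserved):
    """Greedy stand-in for a waypoint rule: head for the tile with the highest
    remaining error, trading information gain against travel distance, and
    skipping tiles already reserved by another AUV."""
    best, best_score = None, float("-inf")
    for i, row in enumerate(tile_errors):
        for j, err in enumerate(row):
            if (i, j) in reserved:
                continue
            dist = abs(i - auv_pos[0]) + abs(j - auv_pos[1])   # grid distance
            score = err - 0.1 * dist
            if score > best_score:
                best, best_score = (i, j), score
    return best

# 2x2 grid of per-tile error indices; tile (0, 0) is reserved by another AUV.
errors = [[0.9, 0.2], [0.1, 0.8]]
target = best_next_tile(errors, auv_pos=(1, 1), reserved={(0, 0)})
```

If one AUV drops out, its reservations are simply released and the remaining vehicles are drawn to the uncovered high-error tiles, which is the qualitative behaviour observed in the experiment.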

2nd experiment: performing target tracking simultaneously with the mapping task
In this scenario, the task was to construct a map of the sea-floor area while, concurrently, tracking the trajectory of a moving target. We utilized a fleet of only 2 vehicles, because the third available vehicle served as the moving target. Information about the moving target was available only through AUV-to-moving-target distance measurements. In other words, the two AUVs did not know the position of the moving target; they used their distance measurements to estimate its (dynamic) position. Even from the initial time-steps, the difference from the previous experiment is evident. Figure 6a depicts such an initial state, where one AUV approaches the position of the moving target almost directly in order to minimize their in-between distance.
In a subsequent time-step (Fig. 6c), another feature of the utilized navigation algorithm can be observed. At that moment, the distance between the target and either of the two AUVs was more or less the same; however, the bathymetric information below the AUV responsible for tracking the target was far more important than that below the other one. The CAO algorithm, without any built-in mechanism to detect and act on such cases, chose to "switch" the tasks between the two AUVs. By doing so, the AUVs (as a whole) were able to keep track of the movements of the moving target without undesired spikes in the estimated trajectory and, at the same time, to dedicate one vehicle to gathering sensor data from regions where the mapping accuracy was low (Fig. 6d). The aforementioned switching was performed several times during the experiment, in cases where the AUVs were more or less equidistant from the target and there was a clear advantage to the specific switch. It is worth highlighting that the algorithm chose to make the transitions only when the AUVs' distances from the target were the same, in order to avoid sudden increases in the estimation error of the target's motion.

Review
Discover Internet of Things (2021) 1:8 | https://doi.org/10.1007/s43926-021-00003-w
The experiment was terminated after 450 time-steps, by which point the AUVs had accurately estimated mainly (but not only) the area where the target was moving and, at the same time, had almost perfectly estimated the target's trajectory.
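The range-only estimation idea underlying this experiment can be sketched numerically. The toy below (our own illustration, not the paper's estimator) recovers a static target position by least squares over distance measurements taken from several known AUV positions, using plain gradient descent on the squared range residuals:

```python
def estimate_target(anchors, ranges, guess=(0.0, 0.0), iters=200, lr=0.1):
    """Gradient descent on sum_j (||x - p_j|| - r_j)^2: a toy range-only solver.
    anchors: known AUV positions p_j; ranges: measured distances r_j."""
    x, y = guess
    for _ in range(iters):
        gx = gy = 0.0
        for (px, py), r in zip(anchors, ranges):
            dx, dy = x - px, y - py
            d = (dx * dx + dy * dy) ** 0.5 or 1e-9   # avoid division by zero
            gx += 2 * (d - r) * dx / d
            gy += 2 * (d - r) * dy / d
        x, y = x - lr * gx, y - lr * gy
    return x, y

# AUV way-points (known) and noise-free ranges to a target at (2, 1).
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
true = (2.0, 1.0)
ranges = [((true[0] - px) ** 2 + (true[1] - py) ** 2) ** 0.5 for px, py in anchors]
est = estimate_target(anchors, ranges)
```

With a moving target and only two AUVs, the same residuals are accumulated over successive time-steps and vehicle positions, which is why sudden geometry changes (such as an ill-timed task switch) would inject spikes into the estimated trajectory.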

Distributed version: the Local4Global cognitive adaptive optimization tool
The CAO algorithm described in the previous sections assumes a centralized form. However, in large-scale IoT implementations such a centralized formulation is not practically implementable: instead, the local parameters θ_i of the ith Thing must be updated using only locally available information (plus information about the time-history of the global criterion). L4G-CAO [30] suitably revises CAO so as to meet this requirement. Table 10 describes the details of the L4G-CAO algorithm.
The following theorem provides the basic attributes of L4G-CAO which, despite its distributed nature, are similar to those of CAO.

Theorem 2 Let D̂_{i,T}(k) − D_{i,T} be zero-mean and bounded. Then, under some mild conditions on the continuity of J, the following hold:

(a) θ(k) converges to θ*, where θ* denotes a local optimum of J, i.e., ∇J(θ*, D_T(k)) = 0;

(b) J(k + 1) ≤ J(k) + O(‖D̂_{i,T}(k) − D_{i,T}‖) + ε(k), where ε(k) is a term that decays to zero exponentially fast.

Proof
The proof (see also [35]) can be established by using standard results on representing state-space systems with input/output models; using these results, it can be seen that Theorem 2 is a direct application of Theorem 1. More precisely, as a first step it is not difficult to see that the L4G-CAO algorithm assumes a mathematical form

θ_i(k + 1) = P_i(θ_i(k), J(k), D̂_{i,T}(k))

for some nonlinear vector function P_i(⋅). Therefore, the overall L4G-CAO dynamics can be written in state-space form with state θ̄ = [θ_1, θ_2, …, θ_{i−1}, θ_{i+1}, …, θ_N], vector field F = [P_1, P_2, …, P_{i−1}, P_{i+1}, …, P_N] and output y = J. Please note that θ_i is considered an exogenous input in these equations. Using standard results on transforming state-space systems into input/output form (see e.g., Theorem 2 in [32]), we can see that

J(k) = ℑ_i(θ_i(k), J(k − 1), …, J(k − d))

where ℑ_i(⋅) denotes an unknown nonlinear function. Therefore, the global performance index J(k) can be calculated, at the ith Thing level, through a nonlinear function ℑ_i(⋅) using the previously measured values of J. By defining D_{i,T} accordingly, the problem of optimizing J is transformed into the problem of optimizing, at the ith Thing level, the cost J_i, and thus the CAO algorithm and its attributes are directly applicable by replacing J and D_T in CAO by J_i and D_{i,T}, respectively. ◻

Distributed smart energy systems (DSES): real-Life application in a large-scale building
The first of the L4G-CAO experiments concerns the case where there is a number of independent SEH (Smart Energy Home) systems in a large building, with each SEH system operating over a distinct part of the building (e.g., each apartment or office of the building is equipped with a distinct SEH system that operates independently of the others). The different SEH systems are not allowed to communicate with each other due to, e.g., privacy-preserving reasons. The only information common to all the different SEH systems is the total daily energy performance of the whole building, along with a daily comfort index indicating the degree of satisfaction in all the different apartments/offices (for instance, this index may correspond to the worst comfort conditions among all the different apartments/offices). The particular building where the L4G-CAO experiments were performed is an office building belonging to the E.ON Energy Research Centre of RWTH University, located in Aachen, Germany. Figure 7 illustrates the building's south façade and its ground-floor plan. The available control and sensing infrastructure consisted of:
• sensors: room temperature (T), room CO2 level, occupants' presence contact (PS), window-opening sensor (WS), manual temperature dial (TD) and energy-measuring devices in each room; and
• actuators: (i) Air Chiller (ACH) systems, for cooling the supply air from the central air handling unit individually for each room; and (ii) Volume Flow Control (VFC) systems, for adjusting the air flow rate individually for each room, separately in the supply and exhaust air ducts.
It must be emphasized that the energy supplied was a mixture of renewable and non-renewable energy (the latter drawn from the power distribution grid) provided by the central supply system. For buildings located in northern climates, the largest share of the total energy demand is typically consumed during the winter and autumn periods, mainly for heating. For this reason, the L4G-CAO real-life experiments were conducted during 21–26 November 2016. The goal of L4G-CAO was to reduce the Non-Renewable Energy Consumption (NREC) while keeping user comfort at satisfactory levels. Table 11 provides the details of the L4G-CAO implementation for this application.
For comparison purposes, the L4G-CAO strategy is compared with the base case control strategy, which was designed and implemented in the respective Building Management System (BMS) by the planners and the commercial system provider in a conventional manner. This strategy employs a closed PID-based control loop, designed to react to room-temperature and CO2 deviations by acting on the ACHs and VFCs. Three rooms of about 30 m² each were utilized for the L4G-CAO application (see Fig. 7b, blue area). Moreover, two neighboring rooms with similar thermal characteristics, where the benchmark control was applied during the experimental period, served as the base case control test-bed (see Fig. 7b, red area).

Table 10 The L4G-CAO Algorithm. At every kth iteration (where each iteration involves the IoT ecosystem operating for T time-units with θ_i constant and equal to θ_i(k)), measure the IoT ecosystem performance J(k) and assume that the value of J(k) is available to each of the Things. Then, θ_i is updated using the following steps:
1. Construct an estimator for the global performance J(k+1) at the ith Thing level, where Ĵ_i(k+1) denotes the estimate (prediction) of J(k+1), the regressor is chosen as in the CAO case, d is a positive integer typically chosen in the range 5–10, and D_{i,T} plays the role, at the ith Thing level, of the CAO data vector D_T. Please note that each Thing has its own estimator. The estimation vector ϑ_i is constructed using standard Least-Squares (LS) estimation, where W(k) denotes the time-window over which the LS estimation takes place.
2. Choose the step size α(k) as in the case of CAO.
3. Generate, randomly or pseudo-randomly, a set of L_i candidate perturbations δθ_i^(1)(k), δθ_i^(2)(k), …, δθ_i^(L_i)(k), which are vectors of the same dimension as θ_i(k), with L_i an integer satisfying L_i ≥ 2 dim(θ_i).
4. Estimate the effect of each candidate perturbation on the current vector θ_i(k) by employing the estimator of step 1, and pick the candidate with the "best" (lowest predicted cost) effect, where D̂_{i,T}(k+1) denotes an estimate (prediction) of D_{i,T}(k+1).
5. Set θ_i(k+1) equal to θ_i(k) plus the chosen candidate perturbation scaled by the step size α(k).
6. Go to step 1 until performance convergence has been achieved.

The real-life application of the L4G-CAO optimization tool employed a distributed topology to ensure seamless scalability and confirmed all of the aforementioned properties under real-life operating conditions. It is worth mentioning that NREC improvements could be observed even from the very first experimental day. The total improvement of the defined NREC index was 34% during the considered test period. In particular, during the experiments the average daily NREC was about 0.067 kWh/m²/day in the benchmark control case (see red circled area in Fig. 7b), while in the L4G-CAO case it was reduced to 0.043 kWh/m²/day (see blue circled area in Fig. 7b). Note that internal solar heat gains were negligible during the experimental period and therefore did not affect the evaluation process. In addition, the indoor comfort levels achieved were similar under both L4G-CAO and the base case control strategy.
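The per-Thing iteration of Table 10 can be sketched in a few lines of code. In the sketch below the regressor `phi`, the step size `alpha`, the perturbation scale `sigma` and the toy cost `J` are all illustrative choices (the paper's actual estimator and tuning are richer); what the code preserves is the structure: each Thing sees only scalar samples of the global cost, fits a local least-squares predictor, and moves its own parameters toward the best-predicted random candidate.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(theta):
    """Regressor vector; a simple linear-in-parameters choice is used
    here for illustration (CAO/L4G-CAO allows richer approximators)."""
    return np.concatenate(([1.0], theta))

def l4gcao_step(theta_i, history, alpha, L_i, sigma=0.1):
    """One L4G-CAO iteration at the ith Thing (illustrative sketch).

    `history` holds (theta_i, J) pairs from past iterations, where J is
    the measured *global* performance, assumed broadcast to every Thing.
    """
    # Step 1: least-squares estimator of the global cost at the local level
    Phi = np.array([phi(th) for th, _ in history])
    Js = np.array([J for _, J in history])
    w, *_ = np.linalg.lstsq(Phi, Js, rcond=None)

    # Step 3: generate L_i >= 2*dim(theta_i) random candidate perturbations
    n = theta_i.size
    assert L_i >= 2 * n
    candidates = sigma * rng.standard_normal((L_i, n))

    # Step 4: pick the candidate with the lowest *predicted* cost
    best = min(candidates, key=lambda d: phi(theta_i + alpha * d) @ w)

    # Step 5: apply the chosen perturbation, scaled by the step size alpha
    return theta_i + alpha * best

# Toy closed loop: the controller never sees J's analytic form, only samples
J = lambda th: float(th @ th)
theta = np.array([1.0, -2.0])
history = []
for _ in range(8):                      # bootstrap the estimator
    t = theta + 0.1 * rng.standard_normal(2)
    history.append((t, J(t)))
for k in range(40):
    theta = l4gcao_step(theta, history, alpha=0.2, L_i=6)
    history.append((theta, J(theta)))
```

Note that steps 2–5 use only local data plus the broadcast scalar J(k), which is what lets the scheme scale to many Things without inter-Thing communication.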
An estimate of the potential savings in terms of non-renewable energy cost can be extracted considering that the benchmark control requires 0.067 kWh/m²/day on average versus 0.043 kWh/m²/day for L4G-CAO. Using the EU-28 average price of 0.125 €/kWh for industrial consumers [15], daily savings of 0.003 €/m²/day during the cold period of the year can be obtained (Table 12).
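The savings figure follows directly from the numbers quoted above (all values taken from the text):

```python
benchmark_kwh = 0.067   # kWh/m^2/day, base case control
l4gcao_kwh    = 0.043   # kWh/m^2/day, L4G-CAO
price_eur     = 0.125   # EUR/kWh, EU-28 industrial average [15]

# (0.067 - 0.043) * 0.125 = 0.003 EUR/m^2/day, as reported
daily_saving_eur = round((benchmark_kwh - l4gcao_kwh) * price_eur, 3)
```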

Distributed Smart Energy Systems (DSES): simulated application in a microgrid of 100 buildings
The second L4G-CAO experiment concerns a simulated, connected microgrid of 100 buildings, each equipped with its own independent SEH system (see Fig. 8). Moreover, the buildings of the microgrid share different energy sources: first, renewable energy sources (photovoltaic panels) are shared as a 'must-take' source, i.e., photovoltaic energy is always used when available; second, the microgrid is also connected to the main electricity grid, i.e., if the output of the renewable sources is not sufficient, the extra electricity is absorbed from the main grid. In the following, more details about the different components of the microgrid are given.
It is important to underline that each of the 100 buildings has a different size, orientation, and occupancy schedule (cf. Table 13); this implies that each building has different energy needs. For example, because of its orientation, each building receives a different amount of solar radiation, which may drastically influence the selection of the Heating, Ventilation, and Air Conditioning (HVAC) set point in each room (and thus the energy need). The size of the building and whether it is occupied are additional factors influencing the selection of the HVAC set point. In particular, Table 13 shows that buildings may have 10, 6 or 4 rooms; building sizes range from 300 to 900 m², and the rooms within a single building are all of the same size. Buildings may host office, commercial, or residential activities, each with its own occupancy schedule. It is assumed that all the rooms of a building exhibit the same occupancy pattern. Table 14 provides the details of the L4G-CAO implementation for this application.

The L4G-CAO results are compared with the two base case control strategies RBC 24 °C and RBC 24 °C, which are two Rule-Based Controllers setting the building set-points to 24 °C or 24 °C when occupants are present. The two histograms presented in Fig. 9 were obtained from a one-week simulation. The first histogram presents the energy absorbed from the grid, in €, for each type of building and for the whole microgrid. The second presents the mean percentage of dissatisfied occupants. As in the single-building test case, L4G-CAO achieves better scores in both histograms. In particular, with respect to RBC 24 °C, L4G-CAO saves more than 400 € for the whole system while maintaining comfort at better levels. L4G-CAO also achieves a slightly better energy cost than RBC 24 °C, despite the pre-cooling effect implemented by L4G-CAO, which demands additional energy consumption. Table 15 summarizes the results of the application of L4G-CAO to the microgrid case.
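The rule-based baselines are deliberately simple; a minimal sketch is given below. The text only specifies the occupied case (a fixed set-point when occupants are present), so the unoccupied behaviour modelled here (a setback value, or HVAC off represented as `None`) is an assumption, as is the office-hours schedule.

```python
from typing import Optional

def rbc_setpoint(occupied: bool,
                 comfort: float = 24.0,
                 setback: Optional[float] = None) -> Optional[float]:
    """Rule-Based Controller: apply the fixed comfort set-point whenever
    occupants are present; otherwise fall back to a setback value (or
    switch the HVAC off, returned here as None -- an assumption, since
    the text only specifies the occupied case)."""
    return comfort if occupied else setback

# Hypothetical daily schedule for one office room: occupied 08:00-18:00
occupancy = [8 <= hour < 18 for hour in range(24)]
setpoints = [rbc_setpoint(o) for o in occupancy]
```

Such a controller reacts only to occupancy, which is why it cannot exploit effects like pre-cooling that L4G-CAO discovers on its own.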

Continuous monitoring/inspection of critical infrastructures utilizing a team of robots (simulated experiment)
The final L4G-CAO application concerns a multi-robot mission whose objective is to continuously monitor an area of interest using a team of robots. Such tasks arise in several real-life applications, including surveillance in hostile environments (e.g., areas contaminated with biological, chemical or even nuclear waste), environmental monitoring (e.g., air-quality or forest monitoring), and law-enforcement missions (e.g., border patrol). The task of continuous monitoring can be reduced to that of designing the robots' trajectories, in real time, so that: (1) the part of the terrain that is monitored (i.e., visible) by the robots is maximized; and (2) for every point in the terrain, the closest robot is as close as possible to that point.
The second objective is significant for two practical reasons: (a) the closer a robot is to a point in the terrain, the better its ability to monitor that point; and (b) in many multi-robot monitoring applications, fast and accurate robot intervention (when needed) is essential. More information about this problem set-up, along with the specialized version of the distributed CAO algorithm for it, can be found in [23]. To validate the approach in a realistic environment, we used data collected from the Birmensdorf area near Zürich. The main constraints imposed on the robots are that they must remain within the terrain's limits, i.e., within [x_min, x_max] and [y_min, y_max] along the x- and y-axes, respectively. At the same time, they must satisfy a maximum-height requirement whilst not hitting the terrain, i.e., they must remain within [z + d, z_max] along the z-axis. Moreover, the robots' sensors had a maximum visibility threshold, i.e., ||x_i − q|| < thres, where x_i denotes the position of the ith robot and q an arbitrary point of the terrain; the implementation details are summarized in Table 16. Several initial configurations of the robot team were tested. Fig. 10 presents the cost-function evolution for an illustrative scenario, while the initial and final configurations of the team (for the same scenario) are displayed in Figs. 11 and 12, respectively. Please note that, in both figures, the color of each cell of the surface denotes the closest robot that actively monitors that cell; a black cell means that no robot is able to monitor it, either because of the maximum visibility range or because of the geometry of the environment. Table 17 presents the final achieved coverage percentage for different initial configurations and different clusterings of the Birmensdorf area.
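The two objectives combine into a single scalar cost matching the J(t) description in Table 16: the summation of the distance from every terrain point to its closest robot, plus the number of invisible points. The sketch below approximates visibility purely by the sensor range ||x_i − q|| < thres; line-of-sight occlusion by the terrain geometry, which the actual set-up also accounts for, is ignored here, and the `penalty` weight is an illustrative choice.

```python
import numpy as np

def monitoring_cost(robots, terrain_pts, thres=16.0, penalty=1.0):
    """Global cost J for a multi-robot monitoring configuration (sketch).

    robots:      (N, 3) array of robot positions
    terrain_pts: (M, 3) array of sampled terrain points
    Visibility is approximated by the sensor range alone; occlusion by
    the terrain geometry is not modelled in this sketch.
    """
    # distance from every terrain point to every robot, then the minimum
    diff = terrain_pts[:, None, :] - robots[None, :, :]    # (M, N, 3)
    dists = np.linalg.norm(diff, axis=2)                   # (M, N)
    closest = dists.min(axis=1)                            # (M,)
    invisible = closest >= thres                           # objective (1)
    return closest.sum() + penalty * invisible.sum()       # plus objective (2)
```

Lower cost means better coverage: configurations that leave terrain points unseen, or far from their nearest robot, are penalized, which is exactly the quantity the distributed CAO algorithm drives down in Fig. 10.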

Conclusions
Despite the complexity and heterogeneity involved in IoT, the CAO and L4G-CAO methodologies exhibited robust and interoperable behavior in all application domains considered herein. Thanks to their model-free operation, the absence of elaborate simulation models and of analytic knowledge of the specific use-case scenarios did not hinder the applicability of either methodology. The CAO and L4G-CAO applications demonstrated the high potential of model-free intelligent control in orchestrating a cooperative web of autonomously acting entities so as to improve the overall IoT performance in a real-time, cognitive manner. Both methodologies were evaluated in three different application domains under diverse conditions and scenarios, exhibiting quite promising behavior, and were able to significantly improve the overall IoT performance compared to well-established base case strategies.

Table 16 Details of the L4G-CAO implementation for the multi-robot monitoring application:
• Measurements: terrain measurements (may be corrupted by noise)
• T = 1000 timesteps: time-horizon
• d = 0.5 m: safety distance from the ground
• d_r = 0.5 m: safety distance between any two robots
• thres = 16 m: maximum visibility range of the robots
• J(t): summation of the distance between any point of the terrain and the closest robot, plus the number of invisible points
• Advantages over usual practice: no need for tele-operation or explicit coordination

Fig. 10 Cost function evolution in the scenario of monitoring an unknown terrain

Fig. 11 Initial robots' configuration. The black area corresponds to the area that has to be monitored