1 Introduction

One common requirement for the constituents of sensor-actuator networks and IoT infrastructures is that they should access and transform the environment in which they are situated. Consider, for example, the “smarter planet” vision, where cognitive AI is applied to sensors and actuators embedded in physical objects found in every environment of human activity (IBM Watson IoT 2017). Such a vision of applying AI to a global network of sensors is reinforced by analogous efforts [see Google DeepMind (2018), TensorFlow (2016)] showing an increasing interest in home automation technologies (see Nest Labs 2019), IoT platforms (see Google Cloud IoT 2018) and smart services (see Amazon Web Services IoT 2018). Similar ideas, for example Sundmaeker et al. (2010) and earlier works such as that of de Bruijn and Stathis (2003), require things to “interact and communicate among themselves and with the environment by exchanging information sensed about it while reacting autonomously to the physical world events and influencing it by running processes that trigger actions and create services with or without direct human intervention”. The adoption of these ideas in a variety of popular applications that provide smart electronic services for domestic, healthcare and work environments suggests that their supporting technologies are here to stay.

However, the numerous application areas requiring IoT and sensor-actuator networks, combined with the specialized devices used in each, have led to the creation of countless specialized middleware. Zachariah et al. (2015) show that this problem has led to a multiplicity of problem-specific middleware, creating interoperability issues between the architectures they enable due to the diversity of the technologies used and of the architectural approaches to IoT. For example, Hydra [see Eisenhauer et al. (2009), LinkSmart (2018)] abstracts devices as services, using semantic ontologies to implement discovery, while Google Fit (2018) uses a representational state transfer (RESTful) application programming interface (API), as discussed in Fielding (2000), and does not use high-level abstractions for incorporating new devices in its architecture. Google Cloud IoT (2018) and Amazon Web Services IoT (2018), on the other hand, work with edge-based services (localised services, closer to the sensors/things) and cloud-based web services. They both require devices and settings to run their proprietary software to access their infrastructures and their cloud web services, which include powerful machine learning and analytics functionalities.

But what if we wanted to apply a machine learning algorithm to an existing IoT setting to add further intelligence to the environment (e.g. Mehmood et al. 2019), or enable a software agent to enrich with AI capabilities a smart home (e.g. Poncela et al. 2018) consisting of multiple and not necessarily interoperable technologies? We try to answer these questions through a framework that demonstrates how to simplify the incorporation of AI capabilities into existing sensor-actuator networks or IoT infrastructures, making the services offered in such settings smarter. Yet another IoT or sensor network middleware directly connecting the low-level sensors and physical devices with AI capabilities would only contribute to the existing mosaic of available middleware. As the current state of the art indicates, such an attempt would inevitably be application/hardware specific and not very useful to most existing systems. Instead, we want to integrate, when possible, with existing settings and make them smarter, with the added benefit of interoperability between heterogeneous and diverse IoT architectures.

Working at this level would take away the specialized sensor and smart device integration complexities that have led to the multitude of IoT middleware approaches, and would allow the middleware to work with existing settings instead of requiring their replacement. In this context, we argue that what will simplify a developer’s task is a more customized middleware that takes into account the particularities of binding an AI to a sensor/actuator network to make their integration transparent and systematic. Integration transparency is important because it abstracts away the low-level details of how an AI discovers and interacts with a set of sensors and actuators, as in Görgü et al. (2018). In other words, a system developer using eVATAR+ in an application would describe AIs and devices using abstractions, and the middleware would bind them to each other and route messages between them without the developer having to deal with, or have knowledge of, how this is done inside the middleware, i.e. transparently. Systematization refers to providing developers with a standard way of implementing the specific type of integrations that involve AI platforms and sensor/actuator/smart device networks. We aspire to simplify the task of integrating agent AI with sensor, actuator and smart device networks, and one way of achieving this is through a familiar and easy-to-use API. Interoperability and heterogeneity are also important features of the discussed middleware, because a middleware would not be of much use if it contradicted its intrinsic goal of interconnecting heterogeneous software.

Our middleware is associated with an interaction paradigm for binding AI capabilities with sensor, actuator and smart devices; the capabilities will be part of intelligent agents using different agent models (Kakas et al. 2008) or architectures (Witkowski and Stathis 2004). According to Heim (2007), an interaction paradigm is a model or pattern of human–computer interaction encompassing all aspects of interaction, including physical, virtual, perceptual and cognitive. Our middleware’s paradigm is inspired by the familiar concept of the avatar, as popularised in virtual reality and computer games applications. However, instead of representing a user in a virtual environment, our avatar architecture explores the reverse arrangement, viz., one where an AI agent running in an electronic environment is bound with an avatar body that essentially comprises sensors, actuators and smart devices deployed in the physical world. In this new view, the AI provides an invisible mind that controls a physical body, thus adding an anthropomorphic dimension to the integrated system. According to Epley et al. (2007), human-like qualities enable robotic and AI systems to become more familiar and comprehensible to both end users and developers. Thus, we used the notion of the avatar to conceptualize and develop a familiar, comprehensible and therefore intuitive interaction paradigm for the systematization of interactions between AI programs and heterogeneous sensor, actuator and smart device technologies.

The notion of a software component acting like a mind to control another software component representing a body with sensors and actuators is not new; see, for example, the agent architecture described in Stathis et al. (2004). We essentially augment that architecture to control physical sensors and actuators. Also, an important focus of our work is to produce middleware that implements such interaction transparently, i.e. a developer can bind AI agents with specific sensors and actuators without extra effort and thus concentrate on other aspects, viz. the modelling of the application-level interaction between components (e.g. Stathis and Sergot 1996), its specification and architecture (e.g. Stathis 2000), and their implications on a specific domain (e.g. Cohen and Stathis 2001). The resulting middleware is called eVATAR+, a play on the words electronic and avatar, to denote that it enables an entity in an electronic environment to have an avatar through specific sensors and actuators situated in the physical environment (see Fig. 1 in the next section for an example use).

Fig. 1 Example of a smart home system that uses a middleware implemented following the avatar framework (e.g. eVATAR+). The locks indicate secure communications

eVATAR+ is an evolution of our previous work on the EVATAR system (see Dipsis and Stathis 2010; Dipsis and Stathis 2009). The older version featured a message-oriented middleware with a centralized broker and XML-based messaging, enabled by adaptors running on the various communicating entities (agents and devices) implemented on a service-oriented architecture (see Dipsis and Stathis 2012). The new version discussed here, however, is a substantial reengineering of the previous system as a web server with enhanced mediation capabilities and a RESTful API for easier interaction with third-party software (vs the cumbersome and complicated XML ontologies of the older version). In addition, we present an approach that utilizes widely used technologies integrated in such a way that it can be easily replicated by developers.

To present our approach, the paper is structured as follows. In Sect. 2, we review the current state of the art of middleware that could enable the connection of AI capability to existing smart home and IoT settings and identify potential shortcomings. In Sect. 3, we describe our approach to overcoming these shortcomings, presenting the architecture of eVATAR+ and its instantiation. Then, in Sect. 4, we present a case study that illustrates the type of applications we envisage when using eVATAR+, particularly in the area of AI-enabled smart homes. Finally, we summarise our contributions in Sect. 5, where we also present our plans for future work.

2 State of the art

A plethora of middleware for the IoT community is already available; see Ngu et al. (2016) for more details. From the current state of the art we have selected those middleware that could support AI to sensor, actuator and IoT device integration, and we evaluate them against the following characteristics that could simplify the proposed integrations:

  • prevalence of the API technology used for developing similar APIs, and a level of API complexity indicating ease of use;

  • a systematic and familiar way for implementing the integrations between AI and IoT/Sensor networks;

  • integration transparency.

Agent-based middleware intrinsically support agent AI to sensor network integration. They are used to implement software agents on sensor and actuator networks; specifically, an intelligent agent is commonly deployed on a single sensor/actuator/device node. Agents running on a node react to their environment in ways that support complex tasks requiring intelligence. However, key frameworks under this approach, such as SensorWare in Boulis et al. (2007), Impala in Liu and Martonosi (2003), Agilla in Fok et al. (2009), Smart Messages in Kang et al. (2004), Ubiware in Michal et al. (2009) and UbiRoad in Terziyan et al. (2009), are usually implemented with intrinsic support for a single hardware/software platform, overlooking heterogeneity issues, and they are usually bound to a single AI platform.

More generally, deploying intelligent applications using this type of middleware that implements software agents can be a challenging task due to: (a) hardware limitations; (b) the complexity of programming decentralized nodes to exchange increased volumes of context data about the environment and coordinate their activities to support cooperative tasks; (c) proprietary APIs. In environments with fast and reliable networking connectivity, such as smart homes (our area of interest), there are commonly facilities for centralized processing, and the sensors and actuators are usually either static (wired) or wireless. Decentralized (node-level) computation is not as critical in such settings. We therefore see that agent-based middleware provide a systematic way of integrating agents with sensor and actuator networks, but they tend to be complex to use and application/domain/platform specific.

A more common approach follows a service-oriented architecture (SOA) paradigm to implement middleware that provide ways to interconnect sensor (and actuator) network nodes, IoT and smart home devices. Such middleware components focus on connectivity, interoperability and low-level tasks such as gathering information from sensors or controlling actuators. The SOA paradigm also enables middleware to implement integrations with external applications, by representing the nodes of a sensor/actuator network as services and external systems as service consumers, for example, in our case, an AI agent. In this way such middleware can support data integrations in which a sensor network produces data used by external systems.

SOA-based middleware components such as Hydra (LinkSmart) (Eisenhauer et al. 2009), TinySOA (Rezgui and Eltoweissy 2007), USEME (Cañete et al. 2009), SIXTH (O’Hare et al. 2012), NOSA (Chu et al. 2006), OASiS (Kushwaha et al. 2007), SenseWrap (Evensen and Meling 2009), MUSIC (Rouvoy et al. 2009) and SOCRADES (Guinard et al. 2010) provide services for the integration with an AI program. Web-server-based middleware exposing their services using RESTful APIs, such as Kaa-IoT Technologies (2017), Konker Labs (2017) and DataArt Solutions, DeviceHive (2017), work in a similar way. However, they do not provide a systematic way of linking AI capabilities to a sensor and actuator network that abstracts away the low-level details of how this is achieved, nor do they achieve this linking in a transparent way. To make this process systematic and transparent we would require a middleware that looks at the integration from an application perspective. SOA-based middleware for sensor-actuator networks tend to focus on gathering information from sensors and tend to ignore how to use this information effectively at a higher level. They can achieve the integrations, and they tend to offer industry-standard API interactions familiar to a wide range of engineers and developers, but their intrinsic focus is the interconnectivity of sensors, actuators and IoT devices, because this is what they are designed to do. They therefore do not offer a systematic and transparent way for developers to integrate AI agents into existing IoT and sensor-network ecosystems, because they were not designed for this particular type of integration.

Pervasive and ambient intelligence middleware such as SALSA in Favela et al. (2004), RoboCare in Bahadori (2005), the middle layer in Kim et al. (2007) and ReMMoC in Grace et al. (2005) are more flexible in terms of creating application-specific solutions that use the data of low-level sensor/actuator network middleware. However, none of these middleware satisfies both required characteristics of: (a) systematic integration (providing developers with a standard way of implementing the specific type of integrations that involve AI platforms and sensor and actuator networks) and (b) transparency (abstracting the low-level details of how an AI discovers and interacts with a set of sensors and actuators). In general, state-of-the-art IoT middleware such as the ones mentioned above, and other notable examples such as PEIS (Saffiotti et al. 2008) and iCore (Giaffreda 2013), could also potentially enable architectures that integrate AI with IoT infrastructures, sensors and actuators. A downside of the above approaches is that there is usually a steep learning curve when attempting to integrate such middleware into a new system or use their APIs, which tend to be complex, proprietary and targeted at specialized audiences (due to the IoT middleware pluralism). Furthermore, being middleware, they intrinsically focus on connectivity and interoperability between different elements of sensor/actuator networks, smart home and IoT infrastructures, as opposed to simplicity and integration transparency. We also considered middleware approaches for robotics, such as MARIE (Cote et al. 2006) and Player (Gerkey et al. 2003), that can achieve integrations between AI programs and sensor and actuator networks, but they also do not satisfy the systematic integration and transparency requirements. Google Fit (2018) is another example of a body network middleware that is application specific and thus not suitable for the purposes of our research into a middleware capable of integrating various AI software with existing IoT and sensor network ecosystems.

Google Cloud IoT (2018) offers a solution for connecting, processing, storing and analysing data both at the edge and in the cloud. A similar approach is followed by Amazon Web Services IoT (2018). The downside of these approaches is that they require proprietary software to run on the devices, sensors and actuators in order to participate in their infrastructures, and they limit the AI to the services offered by their private clouds. Thus, they are not easily interoperable with different technologies.

Atmojo et al. (2015) propose an approach for designing AmI systems based on the use of a concurrent programming language called SystemJ. SystemJ programs control heterogeneous sensor/actuator nodes to implement distributed AmI systems. SystemJ runs on the Java Virtual Machine and provides high-level abstracted objects, signals and channels, for communications between different software entities and the nodes. It can be a complementary approach to eVATAR+ as it is generally designed to implement programmable distributed systems consisting of sensors and actuators while eVATAR+ is designed to make them more intelligent.

Table 1 summarizes the representative list of middleware considered.

Table 1 suggests that there is a paucity of frameworks that enable the linking of the computation and functionality of AI programs to networked sensor/actuation devices in a way that fulfils all the desired characteristics identified in the Introduction, i.e. offering a simple, familiar and easy-to-use API, a systematic and familiar way of implementing the integrations, and integration transparency. We therefore found an opportunity to build upon the experience gained from current research and proposed our own approach, eVATAR+. Our approach is tailor-made to the particularities of integrating agent AI with sensor-actuator networks, IoT settings and smart homes, offering a systematic and transparent way to achieve this (similarly to the agent-based middleware) while at the same time offering a commonly used and familiar approach to API interactions. Furthermore, another goal of our middleware is independence from AI agent or IoT/sensor network platforms.

Table 1 Current state of the art evaluated against our researched application

3 eVATAR+

To ground our discussion of eVATAR+ and exemplify its concepts, we consider as our motivating application one that needs to use IoT integration for a smart home. The way we envisage the use of the eVATAR+ middleware for this class of applications is depicted in Fig. 1.

Figure 1 illustrates an AI agent interacting with smart appliances, smart phones and a gateway that controls a smart home’s sensors and actuators. eVATAR+ is installed at the edge of the local network and interacts with an AI in the cloud. In other settings the AI software would typically run locally or on a smart phone.

The main idea behind the middleware is to enable AI programs/agents to register with it by sending abstract descriptions of the functionalities they require in search of a new avatar. The avatar would be a set of physical devices that also register with the middleware by sending abstract descriptions of their functionalities. eVATAR+ performs discovery by matching the registered agents with suitable registered devices. It then maps them together so that the agent is able to send action requests to the devices and receive sensory data from them, thus enabling it to enhance existing sensor networks or IoT settings with AI capabilities. In this way the devices constitute the avatar body of the agent in the physical world.

Throughout the paper, for simplicity of presentation, whenever we speak about physical devices we refer to physical devices as exposed to eVATAR+ via:

  • a sensor and actuator network middleware

  • an IoT infrastructure

  • a smart home application

  • directly in the case that they are “smart” and capable of supporting the calls required by the eVATAR+ API.

Only in the last case would eVATAR+ communicate with the devices directly. Still, whether the software that registers a device runs on the device itself or on a controller gateway that exposes its functionality to eVATAR+, the functionality of the latter remains the same. In other words, the way eVATAR+ performs registration, discovery, binding and mediation of messages does not depend on the type of communication supported by the endpoint devices, as long as they are capable of using its API. Therefore, for simplicity, we will use the words device, sensor and actuator in all the above cases as aliases for an exposed device, exposed sensor or exposed actuator.

Having an idea of what a smart home application using eVATAR+ looks like (Fig. 1), we can now proceed with describing the middleware, starting with its architecture.

3.1 The architecture of eVATAR+

Figure 2 presents the building blocks of the reference architecture for eVATAR+. In this section we will refer to this architecture to present the implementation choices and functionality of the middleware.

Fig. 2 The reference architecture of eVATAR+

The architecture assumes that any external resource, such as sensors, actuators, smart devices and the AI software, should call the middleware’s API, which uses REST (Fielding 2000) and JavaScript Object Notation (JSON) to exchange messages (JSON 2017). This allows a feedback loop in which sensor data trigger the AIs to select actions (using cognitive capabilities such as decision making, planning, learning and reactivity, or social capabilities such as cooperation or negotiation) and then instruct the sensor and actuator network to perform these actions.

In order to simplify a developer’s task we have selected implementation technologies that enable deployment to the cloud, on a dedicated board computer (e.g. Raspberry Pi 2018), on an existing local server/PC or on a smartphone. As a result, we have chosen the DropWizard framework (DropWizard 2017), which is essentially a collection of Java libraries glued together to enable RESTful server applications. The benefit of using DropWizard is that it enables the implementation of lightweight monolithic servers and microservices. These can be deployed in the cloud as well as on an already existing local server/PC, dedicated board computers and smartphones. All options except the dedicated board computer alleviate the need for another device in the smart home. This way we can view eVATAR+ as a low-footprint add-on to an existing architecture.
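As an indication of how little wiring this requires, the following is a minimal sketch of a DropWizard application class; the class name and the resource constructors are illustrative assumptions, not the actual eVATAR+ source:

import io.dropwizard.Application;
import io.dropwizard.Configuration;
import io.dropwizard.setup.Environment;

// Hypothetical top-level application class (the "application block" of Fig. 2).
public class EvatarApplication extends Application<Configuration> {

    public static void main(String[] args) throws Exception {
        new EvatarApplication().run(args);
    }

    @Override
    public void run(Configuration config, Environment environment) {
        // Register the Jersey resource handlers that expose the REST API
        // (the controller layer of Sect. 3.2); constructors are illustrative.
        environment.jersey().register(new DeviceResource());
        environment.jersey().register(new AgentResource());
    }
}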

As we can see in Fig. 2, eVATAR+ uses a Jetty container (Jetty 2018) handled by Jersey (Jersey 2018), which enables us to implement in Java a handler that supports the REST API.

The persistence layer of eVATAR+ supports the business logic with a relational database. We have chosen PostgreSQL because it is a powerful, open-source object-relational database system: it supports multiple datatypes and scalability, has a good online support community, is constantly improved and updated, is cross-platform and has good administrative tools (pgAdmin, DBeaver). In our typical web-server setting, the persistence layer uses the Hibernate object-relational mapping (ORM) framework, which is responsible for saving an entity, i.e. a Java object, as a relational database record (Hibernate 2018). Most of the libraries that we use are part of the DropWizard framework. Furthermore, we implement the data access object (DAO) design pattern to provide an abstract interface and access to the database through the ORM. It provides all the basic create, read, update, delete (CRUD) methods to interact with the database. The DAO is as light as possible and exists solely to provide a connection to the database.
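A thin DAO built on DropWizard’s Hibernate module might look like the following sketch; the entity and method names are our assumptions for illustration:

import io.dropwizard.hibernate.AbstractDAO;
import org.hibernate.SessionFactory;
import java.util.Optional;

// Hypothetical DAO for device records: as light as possible, existing
// solely to provide CRUD access to the database via the ORM.
public class DeviceDAO extends AbstractDAO<Device> {

    public DeviceDAO(SessionFactory sessionFactory) {
        super(sessionFactory);
    }

    public Device create(Device device) {
        return persist(device);              // INSERT via Hibernate
    }

    public Optional<Device> findById(long id) {
        return Optional.ofNullable(get(id)); // SELECT by primary key
    }
}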

Furthermore, in our web-server architecture of Fig. 2 there is a service layer, which can be described as a layer between the resources in the controller layer and the DAO class in the persistence layer. The service layer is called by the resource layer and makes use of the DAO to interact with the database. It provides the business logic that operates on the data sent to and from the DAO and the client; this is why we also label this layer as the business layer in the architecture of Fig. 2. Another reason for using an extra service layer to host business logic is security. A service layer that has no relation to the DB makes it more difficult to access the DB from the client unless the access goes through the service. If the DB cannot be accessed directly from the client, then an attacker taking over the client will need to hack the service layer as well before gaining access to the data. In the service layer we implement mediator functionality, see Gamma (1995). To support the mediations we also implement the discovery of sensors and actuators suitable to an agent’s requirements, and their binding, i.e. the creation of exclusive communication relationships between agent components and physical sensors, actuators and devices (see below for more details).

In this architecture the resource layer is independent of the data storage engine. To further ensure layer independence we use dependency injection, which simplifies testing and improves decoupling. Dependency injection is a practice whereby objects are designed to receive instances of other objects instead of constructing them internally, as in the minimal illustration below.
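A minimal illustration of the practice (the names are ours, not from the eVATAR+ source):

// The service receives its DAO instead of constructing it internally;
// a test can pass in a stub DAO, keeping the layers decoupled.
public class DeviceService {
    private final DeviceDAO deviceDAO;

    public DeviceService(DeviceDAO deviceDAO) { // dependency injected
        this.deviceDAO = deviceDAO;
    }
}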

The OAuth2 protocol (OAuth2 2018) is used for authentication and authorization of the agents and devices that connect to the middleware, and the communications are secured with the TLS protocol (TLS 2018), as per standard practice. In an example application a user would log in using a password and acquire an expiring session token. Agents and devices in the application would then use the expiring token to authenticate their communications with eVATAR+. In addition, in our proof-of-concept prototype we strive to keep the system up to date with the latest versions of the libraries and software used (such as DropWizard and Jetty). Further considerations regarding this issue are more relevant to a commercial deployment of the system and are therefore beyond the scope of this work.

The application block refers to the term used in the DropWizard framework for the centralized piece of code that puts everything together and runs the server (or microservice, in a different architectural context).

Having identified an appropriate technological framework for our middleware, we can now proceed with the description of the design of eVATAR+ within this framework. In the following, we describe only the relevant functionality and classes needed to implement the integrations, omitting details about configuration, authentication and the parts of functionality that come as standard with the DropWizard framework.

3.2 The controller layer

Describing the eVATAR+ API is a good starting point for providing an overview of what eVATAR+ does. The eVATAR+ API allows entities to interact with other entities via eVATAR+ by sending and receiving JSON objects. eVATAR+ provides a REST/JSON API for interoperability and easy integration. REST APIs dominate the Internet because they are easy to use and widely known. Similarly, JSON is lightweight and intrinsically designed for describing data in a way that is easy for humans to read. Thus, the proposed middleware becomes accessible to a wide audience of developers. Any existing AI software technology or IoT/sensor network middleware and gateway, sensor, actuator, sensor-actuator, smart appliance or IoT device that has the resources to make REST calls can interact with eVATAR+ and participate in the proposed architecture, enabling interoperability and heterogeneity.

The API allows the integrated devices and software to establish loosely coupled, asynchronous, coarse-grained communications between them. Most devices in modern smart homes, and especially IoT infrastructures, tend to feature connectivity and computing capabilities and show autonomy, either on their own or as part of a local setting that uses a gateway to control them. Consequently, there is no need for fine-grained communication between a software agent and a physical device. Instead, a sensor can use the API to PUT sensory data to eVATAR+ while the agent polls with GET calls for the sensory data; likewise, an agent can PUT a request to be temporarily stored in eVATAR+ while an actuator, sensor-actuator or IoT device polls with GET calls for the request (or set of requests) and carries out the appropriate tasks in the physical world. A client-side sketch of this pattern is shown below.
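The following sketch illustrates this coarse-grained PUT/GET pattern from the client side using Java’s standard HTTP client; the host name, endpoint paths and payload fields are illustrative assumptions, not the documented eVATAR+ routes:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PollingExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // A sensor PUTs its latest reading to eVATAR+ (path is illustrative).
        String reading = "{\"id\":334,\"metadata\":{\"status\":\"1\"}}";
        HttpRequest put = HttpRequest.newBuilder()
                .uri(URI.create("https://evatar.example.com/devices/334/messages"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(reading))
                .build();
        client.send(put, HttpResponse.BodyHandlers.ofString());

        // The agent polls with a GET, using the Id of its required device.
        HttpRequest get = HttpRequest.newBuilder()
                .uri(URI.create("https://evatar.example.com/devices/232/messages"))
                .GET()
                .build();
        HttpResponse<String> response = client.send(get, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}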

The API work in eVATAR+ is performed in the “controller layer” by resource classes, which model the resources exposed in our RESTful API (i.e. the Jersey handlers of the HTTP requests in Fig. 2). The resource layer is essentially a handler to the Jetty server managing API calls. The UML diagram of Fig. 3 shows the main resource classes of eVATAR+.

Agents and devices register by POSTing their descriptions in JSON format to eVATAR+ (see Table 2). The device description object in JSON always contains a type tag indicating whether the device is a sensor, actuator or smart appliance (including IoT devices). There is a “description” element where we can describe the device and what it does; this description can be used (in future work) for display purposes in a UI. The metadata section is important, as it uses an array describing the particulars of the device, such as the status of the sensor (e.g. 1 for sensed motion, 0 otherwise) and its location. There is no limitation on what we can put in the metadata section. The “name” element should contain a unique name for the device.

Table 2 API for registration by POSTing JSON descriptions of agents and devices

The agent description object in JSON similarly contains its type and description, as well as an array of device descriptions. A device description inside the agent object has the exact same structure as the standalone JSON descriptions of the physical devices. eVATAR+ then performs discovery by matching the device descriptions required by the agent to already registered descriptions of sensors and actuators, and maps the compatible ones. A device required by an agent and a physical device are compatible if their metadata arrays have the same values. For example, in Table 2 the required motion sensor and the physical motion sensor have the same values in their metadata arrays (“metadata”:[“status”]).
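Since Table 2 is reproduced as an image, the following is an illustrative reconstruction of an agent description; only the “type”, “description”, “name”, “locality” and “metadata” elements are named in the text, so the exact shape (e.g. the “devices” array key) is an assumption:

{
  "type": "agent",
  "description": "Smart home agent requiring a motion sensor",
  "devices": [
    {
      "name": "required_motion_sensor",
      "type": "sensor",
      "description": "Any motion sensor",
      "locality": "living room",
      "metadata": ["status"]
    }
  ]
}

A physical motion sensor would register a standalone device object of the same structure; the two are compatible because both carry "metadata":["status"].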

When we POST a description, eVATAR+ returns JSON objects containing unique identifying Long integer numbers (Ids) for the entities being described. As we can expect, the device receives a unique Id that it will use for every future communication with eVATAR+, while the agent receives a unique Id for itself and an array of identifiers, one for every required device that it describes.

Agent registration messages like the one in Table 2 are handled by the AgentResource class in Fig. 3, and in particular by the registerAgent handler function that deals with agent registration. Similarly, device registrations are handled by the DeviceResource class and the registerDevice handler function. The locality element allows us to determine a particular sensor/device and comes in handy in settings where we have several sensors/devices of the same type. It takes a string that can describe a location or an identifier. Other API functions handled by the DeviceResource and the AgentResource handlers are shown in Fig. 3.

Fig. 3 UML diagram of key classes in the resource layer (we show only the important aspects)

The registered devices PUT sensory data with messages like the ones in Table 4 below, and the agent polls the middleware with GET calls for this sensory data. In general, when a device or an agent makes a PUT REST call with an action request or to send sensory data, it includes JSON objects like the ones of Table 4. These calls include the Id (e.g. 334) that was returned by eVATAR+ at registration time, when the descriptions were POSTed.

We notice that the message JSON objects contain a metadata section whose elements have names matching the String values in the metadata array of the corresponding device registration description JSON object (see Table 2); an illustrative sketch of such a message is given below.
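As Table 4 is reproduced as an image, an illustrative sketch of such a message object (field names beyond those quoted in the text are assumptions) is:

{
  "id": 334,
  "metadata": {
    "status": "1"
  }
}

Here "status" matches the string in the registration metadata array "metadata":["status"], and 334 is the Id returned at registration.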

3.3 The business layer

The business layer in eVATAR+ implements the business logic that supports the resource (controller) layer. As we have seen in the architecture, the classes in the business layer are called services, and they enable the REST API handlers in the resource layer to interact with the persistence layer indirectly, while also providing the business logic that operates on the data sent to and from the DAO and the client. Figure 4 shows a UML class diagram of the most important service classes in the business layer. We can see that they all share the same superclass, the “Service” class, which calls the DAO (data access object) functionality, enabling them to interact with the persistence layer and the database. We use a separate service layer, rather than having the resource classes call the DAO directly, because it keeps the controller layer independent from the data storage and gives us a place to add extra business logic. A sketch of this class structure is given below.
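The following sketch illustrates the shared-superclass structure under stated assumptions (the DAO type and the constructor are ours):

// Common superclass of all services in the business layer.
public abstract class Service {
    // Shared gateway to the persistence layer: every concrete service
    // (e.g. DeviceService, DeviceMapService of Fig. 4) interacts with
    // the database only through a DAO.
    protected final DeviceDAO dao;

    protected Service(DeviceDAO dao) {
        this.dao = dao;
    }
}

Concrete services such as DeviceService would then extend Service and add business logic on top of the DAO access it provides.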

Fig. 4 UML class diagram of the most important service classes in the business layer

With regard to the business logic, eVATAR+ implements in the services layer the mediator design pattern, as described in Gamma (1995). The agents/AI software and the sensor networks/IoT settings do not exchange messages directly, but via eVATAR+’s mediator functionality. The mediator pattern reduces communication complexity between multiple endpoints and supports loose coupling (a change in the code of one participant does not require a change in the code of the other, so the code is easier to maintain). The mediator behavioural pattern in eVATAR+ is supported by the processes of discovery and binding. In eVATAR+, discovery is the process by which an agent finds (discovers) the set of sensors, actuators and devices that it requires in order to sense and act in the physical world. We saw that during the registration stage an agent POSTs a description JSON object, which also includes a set of descriptions of required sensors, actuators and devices. Discovery in eVATAR+ is the process of finding sensor/actuator/device records in the database that are compatible with the ones described in the agent description. Compatibility is determined by comparing their metadata elements. For example, in Table 2 the motion sensor required by the agent and the device description (of the motion sensor on the right) have identical strings in their metadata arrays (“metadata”:[“status”]); eVATAR+ would consider them compatible. Discovery is implemented in the DeviceService class of Fig. 4 (“discoverCompatibleDevice()”), sketched below.
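A minimal sketch of the matching rule, assuming a Device entity exposing its metadata list (the accessor names are ours):

import java.util.List;

public class DeviceService extends Service {

    public DeviceService(DeviceDAO dao) {
        super(dao);
    }

    // Returns the first registered physical device whose metadata values
    // equal those of the device required by the agent,
    // e.g. ["status"] matches ["status"].
    public Device discoverCompatibleDevice(Device required, List<Device> registered) {
        for (Device candidate : registered) {
            if (candidate.getMetadata().equals(required.getMetadata())) {
                return candidate; // compatible: bind it (see below)
            }
        }
        return null;              // none yet; the agent keeps polling (Table 3)
    }
}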

Binding is the result of discovery: eVATAR+ maps devices required by an agent to physical devices, and when a message from a required device is received, eVATAR+ makes it available to the mapped physical device, and vice versa. The mapping logic is implemented in the DeviceMapService class.

If, upon registration, an agent/AI software does not find a suitable physical device for all the required devices in its description, it polls eVATAR+ to check whether one is found later on (see Table 3). After this point agents can interact with the physical devices by sending messages such as the ones in Table 4. Agents and devices use their own device Ids when making GET REST calls, and eVATAR+ uses these Ids to identify their mapped counterparts in the persistence layer (see below). The mapped device accesses the message when it polls eVATAR+ for its messages (Table 4). This way the different communication endpoints are loosely coupled and do not communicate directly with each other; thus eVATAR+ implements the mediator functionality.

Table 3 API calls for updating and deleting agent and devices registrations in eVATAR+
Table 4 JSON objects for PUT calls to send action requests/store sensory data and GET calls for polling eVATAR+ for action requests and stored data

3.4 Persistence layer

The business logic of eVATAR+ is supported by the persistence mechanism, which uses a relational database (PostgreSQL). The services in the business layer use the DAO to interact with the database of Fig. 5, which shows the database tables most important to the described framework.

Fig. 5 Entity relationship diagram with the main tables required by eVATAR+

The POSTed JSON descriptions of agents and devices that are handled by the Jersey REST API handler (in the controller layer) of eVATAR+ are represented in the software as JPA entities (Java Persistence API, https://docs.oracle.com/javaee/6/tutorial/doc/bnbqa.html) and persisted in the database. JPA entities are used for mapping Java objects to relational database tables; in particular, they are Java objects whose non-transient fields should be persisted to a relational database (according to Oracle). An agent entity Java object, in other words, contains all the data fields of the JSON object for the agent. The Hibernate ORM framework persists such entities as relational database records (agent records in Fig. 5). Similarly, a device (sensor, actuator, IoT device) description is sent in JSON object format, converted into a device JPA entity and persisted in the database as device records (Fig. 5). We saw that the POSTed agent description contains an array of required device descriptions; these are also stored in the relational database, just like their real counterparts. Listing 1 shows the JPA entities that are stored as PostgreSQL records.
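Because Listing 1 appears only as a figure, the following is a sketch of what such JPA entities might look like; the annotations follow standard JPA usage and the field names are assumptions based on the descriptions in the text:

import javax.persistence.*;
import java.util.List;

@Entity
@Table(name = "device")
class Device {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;               // unique Id returned on registration

    private String name;
    private String type;           // sensor, actuator or smart appliance
    private String description;
    private String locality;

    @ElementCollection
    private List<String> metadata; // e.g. ["status"]
}

@Entity
@Table(name = "agent")
class Agent {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String type;
    private String description;

    // The one-to-many relationship of Fig. 5: required devices reuse the
    // same Device record type as the physical devices.
    @OneToMany(cascade = CascadeType.ALL)
    private List<Device> requiredDevices;
}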

Every agent and device record in the database has a unique identifier (Id). The database stores agent records as referencing multiple required device records (a “one-to-many” relationship, Fig. 5). This means that when we store an agent, besides the agent record the database stores new records for every sensor/actuator/device that it requires, and all these records also have their own unique Ids. These are the Ids returned to the agent (it receives its own agent Id and the Ids of all its required device database records) in response to the POST description call made at registration. The registerAgent function in the business layer returns the agent and device records, which include the Ids, making them available to the API. The Ids are used for further communication with eVATAR+. Similarly, registerDevice returns the deviceId.

The persistence layer also supports the business layer functionality that implements discovery, binding and the mediator pattern. When a device required by the agent is compatible with a physical device that has been registered and has a record stored in the database, eVATAR+ updates a database table that stores a mapping of their identifiers (in our example, device Id 232 with device Id 334). This is the “device_map” table in Fig. 5 (Listing 2).
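A sketch of the JPA entity behind the “device_map” table follows; the names are assumptions consistent with Fig. 5 and the caption of Listing 2:

import javax.persistence.*;

// Pairs the Id of a device required by an agent with the Id of the
// physical device it was bound to (e.g. 232 -> 334), enabling message
// routing in both directions.
@Entity
@Table(name = "device_map")
class DeviceMap {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private Long requiredDeviceId; // Id from the agent's registration
    private Long physicalDeviceId; // Id from the device's registration
}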

We now say that the device required by the agent is bound to the real device, and this enables an exclusive communication between the agent and that particular device. When the agent polls (GET) for sensory data for a deviceId (232 in our example) using the Id of the requested motion sensor, eVATAR+ consults the “device_map” table in the database, which has it mapped to 334. It then uses the mapped Id (334) to retrieve all stored messages from the device with Id 334, if any, and returns them to the agent that contains the required device with Id 232. The aforementioned process illustrates how the persistence layer supports our implementation of the mediator pattern. The agent and the devices do not talk directly to each other; instead they communicate via a shared memory space in eVATAR+, thus achieving loose coupling. For fine-grained communication, if needed, Jersey supports streaming, which is similar to uploading a file at one end and downloading it at the other. In our setting, though, costly streaming should rarely be needed.

After describing all key components of eVATAR+, the UML activity diagram of Diagram 1 provides the reader with an overview of what happens when eVATAR+ receives a message.

Having described the most important aspects of the eVATAR+ architecture and design, we can exemplify its use with a case study.

4 Case study: agent capabilities in Google NEST

In this section we discuss a case study exemplifying how eVATAR+ (and its associated architectural framework) binds AIs with sensors and actuators in a systematic, secure and transparent manner. The presented case study is intended to provide insight into the type of applications eVATAR+ is designed to support. Our goal is to show how we can use eVATAR+ to enable a Jade (Bellifemine et al. 2007) MAS agent to integrate with a Nest application in order to become an actor within a Nest smart home setting, reading sensory events within it and also creating events. We have selected Nest because it offers a well-documented API and a simulator, and is also widely used in modern households, allowing us to demonstrate how eVATAR+ would be applied in such a setting.

4.1 The scenario

Let us consider that we have a Nest application controlling a Nest smart home setting. We are going to show how a developer can use eVATAR+ to enrich the smart home setting with software agent capabilities: how the Nest application uses the eVATAR+ API to register the smart home devices, and how an agent registers its interest in smart home devices by using the same API. eVATAR+ will perform discovery and bind the agent to a set of smart home devices, enabling it to apply its functionalities to the smart home.

Initially we used the Nest Home Simulator (https://developers.nest.com/guides/home-simulator), which allows us to easily simulate events in the smart home that were also made available to the Jade agent via eVATAR+. We will show how to enable an agent to sense the simulation environment by receiving, e.g., motion detection events from a camera (Fig. 6), and to perform actions in it, e.g. controlling a thermostat. The simulation uses exactly the same API as the real sensors and actuators, and we could switch to a real environment without making any changes to our application. In the second part we exemplify this point by replacing the Nest simulator with a Nest thermostat (as the simulation cannot coexist with the physical Nest devices in the same setting). Figure 7 illustrates an overview of the architecture that enables a Jade agent (Bellifemine et al. 2007) to control Google Nest devices.

Fig. 6 The Google Nest simulation. Here we see a camera sensor. Notice that we can set motion and sound events that will be sensed by the camera, their duration, even a streaming video status

Fig. 7 Jade agents applied to a Google Nest smart home setting. The setting is inspired by the example architecture in https://developers.google.com/nest/guides/architecture

Google Nest and Google IoT infrastructures offer a complete solution for connecting, processing, storing and analysing data both at the edge and in the cloud. Their infrastructure software is not accessible to developers in any way other than via their public APIs. Nest offers a RESTful API; therefore, a Nest application making REST calls to Nest should also be capable of making REST calls to eVATAR+ and using its API. In our scenario we integrate a Nest smart home application with a Jade agent via eVATAR+.

4.2 The smart home setting

The Nest home simulator is a self-contained application for creating virtual versions of the Nest physical devices. Interaction with the Nest home simulator is identical to interaction with a similar setting consisting of physical devices, including authentication and identical REST API calls to communicate with the devices, whether virtual or physical. The added benefit of the simulator is that it simulates conditions that would be expensive and time-consuming to replicate in a real-world setting. Our Nest application connects to simulations of:

  • smart thermostats that can read and set the current temperature, set a target temperature and read humidity levels;

  • smart cameras that also perform motion detection, cloud storage and notifications;

  • smoke and carbon monoxide alarms that trigger, as expected, a smoke and CO alarm.

The snapshot of the Nest simulator in Fig. 6 shows a virtual camera and how we can set events that will be sensed by it.

The simulation enables us to generate different types of events such as sound, motion, smoke, carbon monoxide leak and temperature related events to name a few. These events are sensed by the virtual devices and we can access them via the Nest REST API. Also, we can use the same API to alter the state of the devices in the simulation, for example to set the target temperature of the thermostat.

We implemented our Nest application in Java. An example call to Nest for setting the target temperature of a Nest thermostat is shown in Table 5.

Table 5 API reference to POST and GET calls to Nest for reading and setting the temperature of a Nest thermostat

In order for an application to interact with the Nest infrastructure, it needs to authenticate itself using OAuth 2.0, after which it receives an access token (a long alphanumeric token used by the API calls to verify that the application is authorised to control the devices); a sketch of such an authenticated call is given below. Having a Nest application that interacts with the Nest devices (in the simulation) using the REST API of Nest, we can pursue the goal of this case study, which is to show how we can enrich an application in the Nest environment with agent capabilities. To achieve this, the Nest application should use the eVATAR+ API to register the simulation devices with eVATAR+. Recall that, in order to register the devices, the Nest application needs to POST their JSON descriptions to eVATAR+. To create the JSON description of, e.g., a smoke and CO sensor, we need to extract the useful features of the particular sensor. In our case, the place to look is the JSON objects that already define its interaction with Nest.
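As an illustration of such an authenticated call, the following sketch sets a thermostat’s target temperature with Java’s standard HTTP client; the URL, field name and token format follow the general shape of the (since retired) Nest REST API and should be treated as assumptions rather than verbatim documentation:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class NestCallExample {
    public static void main(String[] args) throws Exception {
        String accessToken = "c.abc123...";  // obtained via OAuth 2.0 (truncated)
        HttpClient client = HttpClient.newHttpClient();

        // PUT a new target temperature to a thermostat (illustrative URL/body).
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://developer-api.nest.com/devices/thermostats/DEVICE_ID"))
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString("{\"target_temperature_f\": 70}"))
                .build();
        client.send(request, HttpResponse.BodyHandlers.ofString());
    }
}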

As we can see in Listing 3, we can encapsulate the description of the Nest API JSON object into the metadata section of eVATAR+, and in this way all features of the sensor become potentially accessible to the agent. In practice, when we integrate AI functionality into an existing setting we normally do not intend to replace all native control and sensing functions with new ones stemming from the AI (e.g. agent) component. Instead, we select those useful to the agent’s goals and capabilities.

In Listing 4 we see a more compact description that only contains information that would be useful to an agent, illustratively reconstructed below. We also see what an API call by the Nest application to eVATAR+ would look like. In particular, this call sends the state of the sensor to be read by the agent (see Sect. 3 for more information).
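Since Listing 4 appears only as a figure, an illustrative reconstruction of such a compact description might be as follows; the “metadata” entries follow the Nest API’s smoke_alarm_state and co_alarm_state fields and, like the remaining values, are our assumption of what the agent would need:

{
  "name": "nest_smoke_co_alarm_1",
  "type": "sensor",
  "description": "Nest smoke and CO alarm",
  "locality": "hallway",
  "metadata": ["smoke_alarm_state", "co_alarm_state"]
}

Fields such as "software_version" or "battery_health" from the native Nest object are deliberately left out, as the agent does not need them (see the caption of Listing 4).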

4.3 The multi-agent system

Jade applications (Bellifemine et al. 2007) are implemented in Java. For the purposes of our case study we implemented a single Jade agent that interacts via eVATAR+ with the Nest smart home. We saw that the Nest application POSTs three devices to eVATAR+: a smart camera, a smart thermostat and a smoke and CO sensor. In Listing 4 we saw the JSON description for eVATAR+ of the smoke_co_alarm; the smart camera and smart thermostat were described similarly. On the other end, our Jade agent sent a JSON object describing three required devices, matching the physical devices of the Nest smart home (Listing 5).

Our agent features cyclic behaviours (atomic behaviours that are executed forever). There are cyclic behaviours polling the state of sensors, e.g. periodically requesting the last change in the state of the camera sensor and updating the internal state of the agent with the acquired information. Other cyclic behaviours consult the current internal state of the agent, which is essentially a collection of data structures reflecting the sensory data describing the physical environment, and send action requests via eVATAR+. The Java code sample in Listing 6 illustrates an example implementation of the sensing behaviour in the Jade agent (notice that it uses the Java API to eVATAR+); a simplified sketch of such a behaviour is given below.
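Listing 6 appears only as a figure, so the following is a simplified sketch of such a cyclic sensing behaviour in Jade; EvatarClient stands in for the assumed Java API to eVATAR+, and the Id and payload fields are illustrative:

import jade.core.Agent;
import jade.core.behaviours.CyclicBehaviour;

public class SmartHomeAgent extends Agent {

    private volatile boolean motionSensed = false; // part of the internal state

    @Override
    protected void setup() {
        addBehaviour(new CyclicBehaviour(this) {
            @Override
            public void action() {
                // Poll eVATAR+ (GET) for messages of the required camera;
                // EvatarClient and the Id 232 are illustrative assumptions.
                String message = EvatarClient.getMessages(232L);
                if (message != null && message.contains("\"status\":\"1\"")) {
                    // Update the agent's internal state; other cyclic
                    // behaviours react to it and PUT action requests.
                    motionSensed = true;
                }
                block(2000); // wait before the next poll
            }
        });
    }
}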

Similarly, cyclic agent behaviours check variables like “motion_sensed” and send action requests to Nest devices via eVATAR+ and the application.

4.4 Completing the picture

The Nest application registers the Nest devices, and the MAS registers the descriptions of the devices it requires. eVATAR+ performs discovery as described in Sect. 3.3 and binds the required devices to the physical ones. In this way Jade has access to the existing Nest application and can add intelligent and interoperable behaviours. In our simulation we can create events, e.g. set off the smoke alarm; the Jade agent polls the smoke and CO sensor, and as soon as it receives a smoke event it sends a notification (for the purposes of our test it sends an email). In general, for the purposes of our integration capability evaluation, we had Jade sending emails describing events that we set in motion in the Nest simulation, which were sensed by the sensors and sent to eVATAR+. Jade was also able to change the state of the thermostat; in this way we show how the agent becomes another actor in the Nest environment, capable of reading and creating events.

We also implemented an integration with a real thermostat simply by replacing the simulation with the real smart home. No code changes were required apart from the configuration targeting the different setting (Listings 1, 2, 3, 4, 5, 6, Diagram 1).

Listing 1 JPA entities representing an agent and a device. This is how they are stored in the relational database. Note that the required devices of the agent use the same type of device records as the physical devices

Listing 2 JPA entity representing the mapping between a requested agent device database record that has been registered via the agent registration process and a physical device record that was stored via the physical device registration. This mapping is stored in the relational database. Note: (i) the required devices of the agent use the same type of device records as the physical devices; (ii) this table enables the routing of messages from the agent with the requested device Id to the physical device with the mapped device Id, and the opposite

Listing 3 Encapsulating the native description into an eVATAR+ JSON description

Listing 4 Useful JSON description of the smoke_co_alarm. The agent does not need to know about “software_version”, “battery_health”, “is_manual_test_active”, “last_manual_test_time”, “ui_color_state” etc.

Listing 5 JSON description of an agent requiring a smoke_co_alarm, a smart thermostat and a smart camera

Listing 6 A sensing behaviour of the Jade agent

Diagram 1 UML activity diagram describing what happens when eVATAR+ receives a message

5 Conclusions and future work

We have presented eVATAR+, a framework with an associated middleware that systematically and transparently binds AI capabilities to existing sensor-actuator networks or IoT infrastructures, thus making the services offered in such settings smarter, while providing a simple and easy-to-use interface for developers. Our evaluation of the current state of the art in middleware with regard to the integration of sensor-actuator networks and IoT settings with AI agents resulted in a set of characteristics that were used in the design of eVATAR+. We exemplified eVATAR+ with a concrete case study that illustrated a possible use of eVATAR+ and demonstrated a systematic and transparent integration of AI platform functionality (implemented in the Jade agent platform) with a smart home setting that contains physical sensors and actuators (Nest Simulation, Nest smart home). More specifically, we have shown that eVATAR+ features a standard and systematic way of achieving the integrations by:

  • abstracting sensor, actuator and AI program functionality by describing it using simple and readable JSON documents;

  • using standard RESTful API calls to enable sensors to PUT sensory data in eVATAR+ that was read by polling agents (GET calls), as well as to enable agents to PUT action requests in eVATAR+ that were read by polling actuators (there was no need to write extra code to deal with how the integrations between agents and devices are realised inside the middleware).

In addition, we illustrated how, via eVATAR+, agents transparently performed dynamic discovery of, and binding to, the physical sensors and actuators, without the developers having to deal with the low-level implementation of how the discovery and binding are achieved within the middleware.

Systematic and transparent integrations already point to a simpler task for integrating AI with sensor networks/IoT environments. Furthermore, our choice of JSON/REST style integrations that are abundant in today’s Internet technologies due to their simplicity and ease of use further enhances our goal to simplify developers’ tasks when attempting the integrations described in this paper.

As part of our future work we will explore the possibility of providing a qualitative analysis of the approach followed, and of starting a discussion about the merits of providing systematic and transparent middleware solutions, not just in the area of IoT. We will also explore the possibility of implementing an application that integrates AI (e.g. Jade agents or a machine learning component) with existing sensor/actuator technologies such as Google Nest, the Zigbee wireless standard (ZigBee 2019), Arduino (2017), Google Assistant (2017), Amazon Alexa (2017) and IFTTT (2019), among others. The proprietary nature of these technologies and the competing standards tend to lead to interoperability issues between them and to the lack of a systematic way of implementing integrations. There are not many systems allowing the control of different competing technologies from a single user interface. We intend to overcome this problem by integrating their APIs with eVATAR+, enabling a centralized control unit that uses a learning AI and a UI (user interface) for user input. We intend to investigate the possibilities and the advantages/disadvantages of deploying the application as part of an Android app (DropWizard, which implements eVATAR+, can run on Android devices) and/or on the cloud or a dedicated low-cost device deployed at the edge.

We also plan to look into deploying eVATAR+ in a variety of settings and application domains. For example, we intend to investigate the possibility of deploying eVATAR+ as part of ecosystems integrating sensor networks with cloud-based architectures that provide semantic world knowledge in the form of linked open data. We will then evaluate our approach in conjunction with approaches like SPITFIRE (Pfisterer et al. 2011; Chatzigiannakis et al. 2012), which provides vocabularies to integrate descriptions of sensors and things with the “linked open data” cloud, describes their high-level states and provides search for sensors and things based on their states. Similarly to eVATAR+, they also claim ease of use due to the fact that they use commonly used and familiar technologies; in their case, any application expert who is able to publish web pages should also be able to use SPITFIRE. They also provide a qualitative evaluation of their approach.

In view of extending the functionality of eVATAR+, and potentially adopting more flexible deployment possibilities, we will look at deploying it as part of a microservices architecture. DropWizard is a common framework for developing microservices as well as web servers. Every DropWizard microservice would have the exact same layers and components within the DropWizard framework, i.e. a Jetty container, a Jersey REST API controller, a services business logic layer, an ORM framework and its own database. Spring Boot (2019) offers a similar architecture, with the difference that it provides a variety of choices for the particular technologies used, e.g. Tomcat as an alternative to Jetty. A transition from a DropWizard web server to a microservices architecture would involve keeping the exact same architecture while dividing the business logic and distributing the overall functionality into different microservices by: (a) dividing the database tables, (b) dividing the REST API and (c) dividing the business logic in the services layer.

The current version of eVATAR+ could be viewed as a monolithic server. At this stage there is no justification for implementing eVATAR+ using microservices, as the business logic revolves around a specific task, i.e. the integration of AI with sensor networks. Furthermore, we could add eVATAR+ as it is to existing microservices architectures as a separate microservice. In the future we would like to see eVATAR+ presenting more intelligent functionality, e.g. support for operations on historic events, which clearly constitute big data, needed to train an AI model.

When we add this extended functionality it would be logical to migrate to a microservices architecture in which one microservice would be responsible for the integration of AI with sensor/actuator networks, another for data analytics and possibly a third for authentication and user management. The fact that every DropWizard server and microservice has the same layered architecture would make the transition easier, as it would involve splitting the code and the database while keeping the same structure. Furthermore, we will also investigate ways to improve operations at the streaming level, for example for anomaly detection. This would fit in with a new microservices architecture and would architecturally be deployed before a load-balancing server that routes the data to the different microservices.

In terms of the currently proposed deployment of eVATAR+, the data is not anticipated to reach volumes high enough to significantly affect performance, especially with the addition of in-memory caching such as that of Redis (2019). However, storage-based message switching using a faster database technology, potentially with a smaller footprint, such as a NoSQL/key-value database, is a possible direction to explore if performance is affected by high data volumes. In this context, we will need to weigh the benefits of selecting such a technology against those of a relational database that would support more complex business logic as we add new features and possibly data analytics.