Towards a cyber-physical system for sustainable and smart building: a use case for optimising water consumption on a SmartCampus

In recent years, the joint advance of the Internet of Things and Artificial Intelligence has enabled challenging developments for Smart Cities and Communities (S&CC). In particular, the SmartCampus, as an essential part of S&CC, plays a cross-cutting role. On the one hand, SmartCampuses are a realistic representation of more complex systems (i.e., intelligent cities or territories) in which to deploy sensors and plan specific goals. On the other hand, SmartCampuses allow the coexistence of different technologies and networks of experts that facilitate the development, testing, and evolution of technologies. This paper describes the Cyber-Physical System SmartPoliTech, an Internet of Things framework, as part of a future smart campus. SmartPoliTech develops an innovative framework that facilitates communication between different systems, data visualization, consumption modeling, alert generation, and the awareness of sustainability and environmental issues. This framework is based on a Service-Oriented Architecture to control all processes, from hardware to decision-making systems. Building on this Cyber-Physical System, this paper provides a sustainable and intelligent water management system that predicts water consumption using Gaussian Mixture Models as day-, month- and even hour-dependent functions. The proposed solution can be used in any facility, with significant benefits foreseen in metrics such as the minimization of water wastage.


Introduction
The concept of Smart Cities and Communities (S&CC) and its benefits for modern society will become a reality in the coming years. Technologies included in the term smartX, such as Cloud Computing, Big Data, Artificial Intelligence, and the Internet of Things (IoT), are increasingly evolving and integrated into the industry, public buildings, and, in recent years, universities. In the latter case, the concept of the SmartCampus encompasses different objectives depending on its design: optimize efficiency, comfort, safety, or security (Wang et al. 2017;Alghamdi and Shetty 2016).
The deployment of intelligent campuses requires university management policies that invest in and support the above objectives. SmartCampuses present the same problems as Smart Cities: efficient use of the resources available or the development of high-quality IoT services for the community, but all at a reduced cost. In this sense, the safe, efficient, and functional use of public spaces is an urgent challenge with increasing priority in public administration agendas.
Most university facilities and public buildings misuse their resources, causing water and energy wastage, lack of comfort, and underutilization of spaces. For this reason, it is necessary to implement resource management systems so that the buildings that make up the smart campus become gradually more efficient and better adapted to the actual needs of their users. Typically, SmartCampuses work on building automation systems to integrate the facility's core systems, such as heating, ventilation, air conditioning, lighting, power meters, or water meters (Alghamdi and Shetty 2016).
There has been a growing interest in recent years in the development of smart campuses and universities. From the system architecture and the technologies involved to the services and applications offered to users, numerous studies in the literature demonstrate the importance of this topic (two recent surveys are Fernandez-Carames and Fraga-Lamas (2019) and Muhamad et al. (2017)). Most of these works focus on specific solutions that are not integrated into an architecture providing a global solution to the multiple problems faced by the universities of the future. Motivated by current smart campus initiatives, this paper describes a new Cyber-Physical System (CPS): SmartPoliTech. SmartPoliTech is an IoT framework that provides tools and solutions for sustainable and intelligent building management. The main contributions of the proposal are:
1. A CPS architecture for sustainable and smart campuses, including a detailed description of the subsystems, communications, and specific decision-making applications. SmartPoliTech provides an integrated solution to most of the issues faced by sustainable smart campuses: energy and water consumption, security, safety, and resource optimization (Wang et al. 2017).
2. As a novelty, SmartPoliTech uses a Service-Oriented Architecture (SOA) for the communication and control of all processes, from the physical world to decision-making systems. The main goal of the proposed CPS is to construct simplified models that allow the system to support sustainable decision-making.
3. A data system open and visible to all users of the Polytechnic School and to anyone interested. All the data produced by the CPS are permanently displayed on a system of screens distributed throughout the buildings, with the aim of raising awareness about the use of energy resources and thus reducing any bad habits that users may have.
4. Field experiments that demonstrate the potential of the proposed CPS in the real world, carried out in the Engineering School of the University of Extremadura, located in Spain. In this experiment, the CPS is deployed in seven buildings, monitoring variables such as energy and water consumption, temperature, and CO2 (see Fig. 1).
5. A case study that demonstrates the use of the SmartPoliTech CPS for optimizing water consumption in the campus buildings. Data collected by physical devices are used to predict future water consumption using Gaussian Mixture Models (GMM). From these predictive models, following a user-centered philosophy, an adaptive warning system is developed that detects anomalies and water leaks and generates quick responses. The results also show that the use of a mixture of Gaussians helps reduce water wastage.
The rest of the work is organized as follows. Section 2 presents a general overview of Cyber-Physical Systems and the related background of CPS on smart campus, as well as an analysis of IoT systems that are used to predict water consumption in intelligent buildings. In Section 3 the general overview of the Cyber-Physical System SmartPoliTech is presented, which revolves around the different IoT infrastructures, introducing the architecture and main services. Section 4 focuses on a specific use case for optimizing water consumption and, thus, making buildings more sustainable and efficient. From the previous points, Section 5 presents the experimental results and the main discussion on the lessons learned from this experience. Finally, Section 6 presents the main conclusions of this work as well as an outlook on future lines of research.

General overview of CPS in SmartCampus
Modern IoT technologies are rapidly moving forward, reaching more and more areas of life. The development of Cyber-Physical Systems has become a natural continuation of the transition to a qualitatively new level of engineering and technology in different areas of interest. The literature uses the concept of Revolution 4.0 (Dimitrios 2018) to describe this new development. Industry 4.0, for example, directly depends on key topics related to CPS and IoT technologies, defining the future of manufacturing (Jamaludin and Rohani 2018). Although industry is the sector that has adapted best to the evolution of IoT technologies, the development of CPS is being explored in other areas as well. IoT is also an integral part of Agriculture 4.0, Medicine 4.0, or Education 4.0 (Bhrugubanda 2015; Jamaludin and Rohani 2018). In all of them, the advances of CPS are a crucial goal in developed societies. This section provides an overview of smart campus initiatives and CPS and their main characteristics.
Fig. 1 Aerial view of the Engineering School and its facilities. The Cyber-Physical System SmartPoliTech has been deployed in its seven buildings

Smart campus initiatives
There is a growing literature on smart campus initiatives (see the reviews by Fernandez-Carames and Fraga-Lamas (2019) and Muhamad et al. (2017)). Smart campuses and universities need to provide connectivity to IoT devices and deploy architectures that offer communications coverage through the latest technologies. Most of the current state-of-the-art works focus on these applications and on the experience gained from real-world IoT implementations. Some authors, such as Fernandez-Carames and Fraga-Lamas (2019), present an architecture for intelligent campuses based on the new Low Power Wide Area Network (LPWAN) technologies. LPWAN has emerged as a promising solution to provide low-cost and low-power connectivity to nodes distributed over the deployment area. Specifically, these authors propose an architecture based on LoRaWAN, making it possible to monitor energy sources in distant places. Other works adopt smart grids or microgrids within their campuses (SMARTGRID), taking a step towards operating the university network as a smart grid in response to increased energy demand, environmental protection, and the need to rely on renewable energy (Alghamdi and Shetty 2016). Some universities are also opting for blockchain to develop applications for the SmartCampus (Fernandez-Carames and Fraga-Lamas 2019) so that they can, for example, guarantee the authenticity of educational certificates, manage digital copyright information, verify learning results, or improve interaction with e-learning. With the development of smart campuses, it is possible to propose different teaching methods and even to create a unified platform that integrates various systems such as library management, student identification, access cards to buildings or transportation, or even attendance control (Majeed and Ali 2018).
Regarding sustainable campuses, the use of IoT for managing water or energy consumption in their buildings is proving very useful. The most common idea in these works is an intelligent management system that uses real-time data from sensors and actuators to monitor and improve resource management (Robles et al. 2014). The above solution, for example, is a monolithic system containing all application functionality, mixing component roles such as data persistence, business logic, and user interface. Today's systems offer new, more general perspectives, defining architectures derived from the Smart City concept. The following paragraphs review the most recent works that present solutions for deploying sustainable SmartCampus platforms. These initiatives pursue various purposes, including criteria related to the Sustainable Development Goals. Tables 1, 2, and 3 summarize the main features of these smart campuses and compare them with our proposal.
In (Fortes et al. 2019), the authors describe a pioneering project that aims to apply the Smart City concept at a smaller scale, providing an urban lab for researchers and establishing the University of Málaga as a reference campus in environmental sustainability. The basis is a layered architecture, from sensors and actuators (top layer) to data analysis. The system measures several parameters (electricity and water consumption, among others) and uses the communication layer to have data stored and managed by the European open-source initiative FIWARE. Some researchers have also proposed alternative paradigms for deploying smart campuses. In (Simmhan et al. 2018), the authors describe an initiative whose aim is to manage energy resources, specifically focusing on water management. The architecture follows the layered model and adds different functionalities at the top of the architecture, including several data analytics and visualization modules that support both manual and fast automated decision-making in the water domain. The Polytechnic University is also developing a SmartCampus project that aims to improve the management of information coming from the university's functioning (Álvarez et al. 2019). In this work, the authors detail applications that allow more agile and efficient management of resources, based on Artificial Intelligence for the calculation of optimal locations of buildings, as well as the implementation of the DOMOGIS System for automation, monitoring, and sensor data management. This idea of offering open access to users through a system of dashboards and interactive maps is also addressed in our project. The authors of (Haghi et al. 2017) propose a smart campus architecture based on cloud computing, which deploys a service-oriented architecture using Commercial Off-the-Shelf hardware and Microsoft Azure cloud services. Sensor readings are processed by these cloud services, which are responsible for storing, managing, and analyzing the data and making it available to developers to build applications. The technologies used for communication follow standards such as BLE, ZigBee, or 6LoWPAN. With a stronger focus on ensuring a high level of security and data confidentiality, the authors of (Popescu et al. 2018) describe a smart campus that integrates cloud computing and IoT in a five-layer architecture. This solution can conveniently recognize the locations from which teachers and students access and share learning resources online in real time.

Cyber-physical systems and smart campus
A CPS connects the cyber world to the physical world, providing a means to add more intelligence to social life. It integrates physical devices, such as cameras, sensors, and actuators, with cyber agents to form an intelligent system that responds to dynamic changes in real-world scenarios. Formally, a CPS is an integration of computation with physical processes whose behavior is defined by both cyber and physical parts of the system (Lee and Seshia 2017). Several CPS architectures have been proposed in the literature (Lee et al. 2015; Nie et al. 2014; Zhang et al. 2017); an interesting review is provided by (Hu et al. 2012). In (Lee et al. 2015), the authors define a 5-level CPS structure for developing and deploying a CPS for manufacturing applications, from the initial data acquisition to the final value creation. Each level of this architecture defines main functions and attributes. Other works, such as (Nie et al. 2014), use a three-level architecture: the physical layer, the network layer, and the decision layer. A CPS architecture for health applications is proposed in (Zhang et al. 2017), where the authors define an architecture of three layers, namely a data collection layer, a data management layer, and an application service layer. Each of these architectures has been designed for a particular application; however, there is a consensus among most authors that these architectures should provide the capture of a variety of physical information, reliable data analysis, event detection, and security. Although many CPS architectures have been proposed in the literature, very few of them target SmartCampus applications. (Cecchinel et al. 2014) proposes an architecture for collecting sensor-based data in the context of the IoT, which is validated in a use case for SmartCampus, but it lacks a complete architectural framework. In (Sanchez and Oliveira 2018), the authors propose an IoT framework whose main goal is to monitor water consumption in university buildings. However, their architecture fails to address some important issues, such as security, privacy, and other high-level services. Fig. 2 depicts a CPS for SmartCampus conceived from this literature to facilitate further discussion in subsequent sections of this paper. Table 4 summarizes the main features of a set of representative CPSs proposed for various applications that have been analyzed in this work.
Regarding SmartCampus applications, the research on CPS is still in its early stages. Over the last few years, different universities have contributed to making their campuses more intelligent, most of them to improve the experience of their users in terms of comfort (Alghamdi and Shetty 2016; Fortes et al. 2019) and the optimization of their resources, such as the distribution of parking spaces (Sari et al. 2017), space reservation or security (Abdullah et al. 2019), or for remote teaching and learning. However, all these works propose specific solutions without a general framework with the characteristics of a CPS. In (Wang et al. 2017), a SmartCampus IoT framework is described to address issues related to energy consumption, classroom functionality, safety, and cyber-security. Although the authors describe many possible functionalities, the final implementation of the system consists of three main devices: a smart outlet, an intelligent switch, and a sensor hub. The possibilities of expanding all the SmartCampus functionalities using the advances of the IoT and the CPS are considerable, and that is the main objective of this article.

Consumption prediction for smart buildings
Predicting energy or water consumption in smart buildings has become a significant challenge in creating sustainable cities and communities. Cyber-Physical Systems facilitate these predictions thanks to the deployment of sensor networks and IoT infrastructure, which has led different authors to develop and implement solutions on the data stored in these systems. Traditionally, load analysis has been the main objective in most works. (Nizar et al. 2006) presents a proposal to detect the best load profiling techniques and data mining methods to classify and predict non-technical losses in the electric distribution sector. In (Chicco et al. 2006), the authors try to cluster similar customer consumption behaviors and compare various unsupervised methods, such as hierarchical clustering, K-Means, and fuzzy k-means. They also include principal component analysis (PCA) for dimensionality reduction. In (Prahastono et al. 2007), the authors compare several clustering techniques (e.g., hierarchical, K-means, fuzzy K-means, follow the leader, and fuzzy relation) and their main characteristics for the generation of electric load profiles based on a previous classification of customers. In general, most authors agree on the importance of a careful selection of the clustering algorithm since each one has its peculiarities that must match the data characteristics (Prahastono et al. 2007).
Regarding water consumption prediction, several works have addressed this issue in recent years. In (de Souza Groppo et al. 2019), the authors review several methods for predicting water demand employing artificial intelligence, which demonstrates how the use of big data techniques has grown considerably in recent years. Long-term water demand prediction has been studied in several approaches using neural networks and econometric models (Donkor et al. 2014; Ghalehkhondabi et al. 2017; Zhu and Chen 2013), most of which conclude that this demand depends on the expected vegetative growth, socioeconomic and climatic variables, and geographic expansion. In the case of short-term water demand forecasting (e.g., water demands from 1 to 24 h ahead), other approaches also based on artificial intelligence have been proposed (Gagliardi et al. 2017; Candelieri et al. 2015; Zubaidi et al. 2018), which usually try to understand the behaviors and dynamics of consumers using historical water consumption data. The common denominator of these studies is the use of data sets presented as time series. These time series (time-stamped data) are mostly made up of data collected in the months before the studies, and in some cases, they not only use water consumption data but also cross-match them with meteorological data.
Automatic detection of anomalies using big data techniques has also been studied in the scientific community. This approach helps build energy and water management systems that reduce operating costs and time by reducing human monitoring and providing timely diagnosis of false warnings. In (Khan et al. 2013), for instance, the authors apply three data mining techniques (classification and regression tree, K-means, and DBSCAN) to detect anomalous lighting energy consumption in buildings using hourly recorded energy consumption and peak demand (maximum power) data.
The work described in this paper uses a Gaussian Mixture Model to predict both short-term and long-term water consumption. In (Melzi et al. 2017), the authors use a large amount of data collected by physical devices to understand consumer behavior better and optimize electricity consumption in smart cities. They present an unsupervised classification approach to extract typical consumption patterns from data generated by smart city electric meters. Similar to the approach described in this paper, in their work, a constrained Gaussian Mixture Model, whose parameters vary according to the day type (weekday, Saturday or Sunday), is used and evaluated according to a real dataset collected by smart meters in households for a year. The use of a Gaussian mixture applied to this problem is not new; however, in this paper, we present a use case in which water consumption is estimated based on Gaussian mixture models.
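To make the idea of a day-type-dependent mixture concrete, the following is a minimal sketch, not the authors' implementation. It assumes univariate consumption values, fits an unconstrained Gaussian mixture with plain expectation-maximization (rather than the constrained model of Melzi et al.), and keeps one set of parameters per day type. The function names `fit_gmm_1d`, `fit_by_day_type`, and `mixture_density` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(samples, k, iterations=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    data = sorted(samples)
    n = len(data)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [data[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = max(sum((x - centre) ** 2 for x in data) / n, min_var)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)
    return weights, means, variances


def fit_by_day_type(readings, k):
    """Fit one mixture per day type from (day_type, value) pairs."""
    grouped = {}
    for day_type, value in readings:
        grouped.setdefault(day_type, []).append(value)
    return {day: fit_gmm_1d(values, k) for day, values in grouped.items()}


def mixture_density(x, model):
    """Evaluate the fitted mixture density at x."""
    weights, means, variances = model
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))
```

Under these assumptions, a reading whose density under the mixture for its day type is very low could be flagged as a candidate anomaly; the threshold would have to be chosen from the data.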

Key finding
The literature review on smart campuses shows that the proposed architectures differ widely, although most initiatives aim to meet sustainable development goals, improve energy resource efficiency, and provide campus users with high-quality services. This article presents an easily replicable and scalable SmartCampus architecture with characteristics that differentiate it from the other architectures analyzed. SmartPoliTech is based on cyber-physical system architectures, distinguishing a tangible, physical part of the architecture, such as the different sensors deployed, from a digital part. In our work, we describe the whole communication process, the data storage and visualization, and the services provided to the users. A cyber-physical system vision is a modern solution adapted to Industry 4.0 that links with digital twin models for predicting anomalous situations and efficiently managing resources.
Most of the solutions in the literature describe layered architectures or use proprietary service buses. Unlike them, in our proposal, both the communication of sensors to store the collected data in databases and the services offered to users are managed by an open-source service bus (Zato framework). The growth of services and devices within a campus makes it necessary to develop connectors that allow the different applications to communicate, and this SOA addresses that need. A service bus facilitates communication between systems over any protocol and device, i.e., it becomes a gateway that translates from one language to another. This ensures the scalability of the system.
Related works generally use time-series databases, collecting sensor readings for further analysis and visualization. This solution leaves room for improvements, such as geolocation within the campus. In our proposal, we study graph-based databases, which open up multiple possibilities by storing all attributes in a structure of nodes and links. These attributes include the type of communication the sensors use, the type of sensor, and the location and campus building in which they are placed. Thanks to this type of database, we have created the first specific application to visualize sensor readings. However, the possibilities for creating new applications are multiple. For example, the database establishes close relationships between sensor readings located in the same building, thus making it possible to detect changes in the use of energy resources or to analyze consumption patterns directly while taking into account other parameters such as temperature or humidity.
Finally, the idea of the intelligent campus infrastructure being a living laboratory for testing technologies is shared in many of the papers reviewed. One of the main objectives of SmartPoliTech is to make the information generated by the project available to all users, and for this purpose, as is done in other smart campuses analyzed, an open-data system is created. This also opens up new opportunities and initiatives related to smart citizenship, which are outlined in this article. The SmartPoliTech proposal deploys a set of visualization systems on campus accessible to students and teachers to raise awareness of sustainable resource use. In the case described in this paper, if there is an anomaly in water consumption, any user can be aware of it in real time, know the location of the fault, and act to solve it.

Cyber-physical system for SmartCampus: SmartPoliTech
A Cyber-Physical System (CPS) is a distributed, networked information system that fuses computational processes (i.e., cyber world) with the physical world. A SmartCampus is a typical example of a cyber-physical system, where a set of sensors acquires real-time information about the environment (physical world) to create and synchronize an information system (cyber world) used by the university community.
A CPS requires, among other subsystems, a communications infrastructure, a data storage system, the interconnection of all systems, processes, and services, and tools to access and manage the stored data. The architecture of SmartPoliTech is shown in Fig. 3. Most of the technologies in the diagram are closely connected to IoT. The CPS presented here comprises several independent systems. Some of them are simple devices that acquire data, and others are complex modules that work together to achieve a common goal. Following a nomenclature similar to the one used in recent works found in the literature (Alam and El Saddik 2017), the CPS SmartPoliTech consists of the following subsystems: the physical world, 𝕎, responsible for acquiring information from the environment and storing the data in local servers, 𝔻. The set of functionalities that the CPS provides, ℚ, is managed through a service-oriented architecture. Finally, the system includes data visualization, 𝕍. Therefore, the CPS can be written as the tuple (𝕎, 𝔻, ℚ, 𝕍). The following subsections describe in detail each of these elements.

Introduction to SmartPoliTech
SmartPoliTech (Sánchez et al. 2017) is a CPS under development at the School of Engineering of the University of Extremadura in Spain. Its aim is to transform its facilities into a large experimental ecosystem, a living lab for the design, implementation, integration, and validation of systems capable of creating and managing intelligent environments. SmartPoliTech relies on IoT technologies to encourage better energy and water consumption habits by users while also improving the efficiency of energy and water consumption in its facilities. The School of Engineering, which was built more than 40 years ago, presents a series of anomalies in energy and water consumption due to aging and the lack of adaptation of the buildings that comprise it. Some of the anomalies in the surroundings are as follows:
- Excessive consumption of sanitary water, of which many liters are wasted. Currently, around 4000 cubic meters of water are consumed per year.
- Inefficient consumption of electrical energy or gas oil. In one year, around 60 cubic meters of diesel is consumed.
- Bad quality of the interior air due to lack of ventilation, which results in high concentrations of CO2.
- Lack of thermal control in spaces, alternating freezing periods with others that are too hot.
This university complex has about 1500 users and consists of seven buildings of more than 20,000 m², distributed on three floors (including the ground floor). In all of them, a set of sensors has been deployed to measure energy, water, and gas consumption, among others. Similarly, some sensors measure temperature, humidity, and CO2 in the classrooms, as well as occupancy. Combining the existing historical data generated by the sensors (from 2013) with their analysis using artificial intelligence algorithms makes it possible to establish a roadmap towards a CPS for intelligent and sustainable buildings.

Designing the physical world
A SmartCampus requires data to be collected from the physical world through a network of specific sensors. Before deployment, factors such as the orientation of the buildings, the location of critical points concerning thermal conditions, and the selection of meters, stopcocks, and essential points of the energy system of the surrounding pavilions must be analyzed. These physical objects (sensors) will be able to acquire data from the environment and have built-in communication capabilities. In addition, the physical world must incorporate access to data storage systems to use computing capabilities to predict future scenarios.
In the proposed CPS, the physical world 𝕎 consists of a set of physical sensors, which are classified as follows: (i) ambient temperature sensors (w_t ∈ 𝕎), (ii) relative humidity sensors (w_h ∈ 𝕎), (iii) stopcocks (w_s ∈ 𝕎), (iv) presence and location sensors (w_p ∈ 𝕎), (v) temperature sensors in boilers (w_b ∈ 𝕎), (vi) gas consumption sensors (w_g ∈ 𝕎), (vii) window status (open or closed) sensors (w_w ∈ 𝕎), (viii) electricity consumption sensors (w_e ∈ 𝕎), and (ix) CO2 sensors (w_CO2 ∈ 𝕎).
Therefore, 𝕎 can be expressed according to Eq. 1. This subsystem is not closed and can be extended with new sensors if needed. Fig. 4 shows a diagram of the physical system implemented in the buildings of the EPCC campus.
Each sensor w_i ∈ 𝕎 is defined by a list of components w_i = (R_w, Y_w, X_w, T_w)_i. R_w is the component responsible for capturing the real-world events, Y_w the component responsible for converting those events into the physical variables in which they are measured, X_w the component responsible for connecting the sensor to the internet and providing it with data transmission capacity, and T_w the component responsible for sending that information through a query in ℚ to the database layer. Each device is named by a unique identifier that includes information about both its location and the type of sensor. For example, an identifier can denote a temperature, humidity, and CO2 device (SEN_001_THC) located in a research lab (LAB001) on the zero floor (P00) of the computer science building (INF) in the Campus facilities (UEXCC). This unique identifier is essential for the subsequent design and implementation of both the database storage system and the query services in the 𝔻 and ℚ layers, respectively.
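As an illustration of how such an identifier could be decomposed, the sketch below parses a device name into its location and sensor parts. The exact layout of the full identifier is not given here, so the underscore-separated order campus, building, floor, room, sensor (for instance `UEXCC_INF_P00_LAB001_SEN_001_THC`) is an assumption, and `DeviceId` and `parse_device_id` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceId:
    campus: str    # e.g. "UEXCC"
    building: str  # e.g. "INF"
    floor: str     # e.g. "P00"
    room: str      # e.g. "LAB001"
    sensor: str    # e.g. "SEN_001_THC"


def parse_device_id(identifier: str) -> DeviceId:
    """Split an identifier in the ASSUMED campus_building_floor_room_SEN_nnn_type layout."""
    parts = identifier.split("_")
    if len(parts) != 7 or parts[4] != "SEN":
        raise ValueError(f"unexpected identifier layout: {identifier!r}")
    campus, building, floor, room = parts[:4]
    # The last three tokens (SEN, number, type code) form the sensor name.
    return DeviceId(campus, building, floor, room, "_".join(parts[4:]))


device = parse_device_id("UEXCC_INF_P00_LAB001_SEN_001_THC")
```

Keeping the location inside the name means that a query service can filter devices by building or floor from the identifier alone, without an extra lookup.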
These sensors constantly acquire information, which is stored in a virtualization server that also supports the reception, processing, and display of data. The sensors use the q_{w_j → D_i} service available on the ℚ service bus to send the information to the databases. Sensors use this service by making a call to its URL and including a JSON document with the data to be sent. The following attributes generally define this JSON:
- Info: structure that collects the static data from the sensor to be sent, and which consists of:
  - Apikey: unique key associated with each sensor
  - Device: unique identifier for each sensor
- Data: structure that collects the dynamic data, i.e., all the physical variables measured by the sensor.
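The message layout described above can be sketched as follows. This is an illustration under stated assumptions, not the project's client code: the key spelling (`Info`, `Apikey`, `Device`, `Data`) follows the attribute names in the text, while the measurement names, the API key value, the service URL, and the use of HTTP POST are hypothetical placeholders. The request is only prepared, never sent.

```python
import json
import urllib.request


def build_payload(apikey, device, measurements):
    """Serialize one reading using the Info/Data layout described in the text."""
    document = {
        "Info": {"Apikey": apikey, "Device": device},  # static sensor data
        "Data": dict(measurements),                    # dynamic measured variables
    }
    return json.dumps(document)


def build_request(service_url, payload):
    """Prepare (but do not send) an HTTP request carrying the JSON payload.

    Using POST with a JSON content type is an assumption for this sketch.
    """
    return urllib.request.Request(
        service_url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: the key, the variable names, and the URL are hypothetical.
payload = build_payload(
    "example-key",
    "SEN_001_THC",
    {"temperature": 21.5, "humidity": 40.0, "co2": 600},
)
request = build_request("https://example.invalid/services/sensor-data", payload)
```

Separating the static `Info` block from the dynamic `Data` block lets the receiving service authenticate and route a message before looking at the measurements themselves.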
Although not every type of sensor has generated the same amount of data, since they were not all installed at the same time, an average of the total number of samples generated by each sensor is shown in Table 5. The system is hosted in a public cloud infrastructure that anyone can access via a web browser. The data can also be downloaded in JSON format using different APIs available in several languages (Python, MATLAB, ...).

Designing the cyber world
The long-term objectives in designing the cyber world for the SmartCampus are to create a strong link with the physical world to support users in performing various specific tasks and also to provide real entities (e.g., humans, machines, or software agents) with a wide range of applications and services. Therefore, it is necessary to provide it with capabilities to access the physical world at a given time and store data, process it and offer services to different users through different channels. The design of the cyber world for the SmartCampus requires the different subsystems that are described below.

Data storage subsystem
To improve control efficiency and minimize expenses when installing new devices or recovering from system failures, the proposed CPS strives to optimize the system for storing the data acquired from the physical world. The key factors for success are data availability, persistence, and relevance. In addition, a correct and efficient design of the data storage system is essential for a future CPS-controlled SmartCampus, where the number of devices is very high and there is a permanent need to extend it with new elements. In this sense, scalability becomes a crucial feature, so that the system maintains its effectiveness and throughput even when devices are added or expanded. With this premise, the data storage system is made up of two open-source databases with different and complementary features: the time series database InfluxDB (InfluxData 2021), $D_i$; and the graph database Neo4j (Neo4j 2021), $D_n$ (see Fig. 4).
- Time series databases: Time series databases such as InfluxDB are optimized for time-stamped data and are built specifically for handling metrics, events, or measurements (InfluxData 2021). This makes $D_i$ an ideal instrument to store the data series acquired in the physical layer by the sensor network. $D_i$ stores data as time series with a variable number of measurements. In our CPS, each physical device is associated with one of these series: $D_i^j$ is the series associated with the sensor $w_j$, and it is defined as $(timestamp, [label, value]_n)$. $D_i$ accepts queries through an API using mathematical operations and time groupings that facilitate data analysis and information gathering from the smart campus. InfluxDB is also easily integrated with open-source visualisation environments such as Grafana (Grafana 2021), which is also part of the visualization system.
- Graph database: A graph database stores structures on which semantic queries can be run. Graphs are composed of nodes, edges, and properties that represent and store data and its relationships. Our CPS uses the graph database Neo4j, $D_n$, to hold the rich spatial structure of the university complex. Neo4j is also capable of indexing geographical information (i.e., coordinates) associated with the nodes. This feature provides a direct way to locate all elements included in the physical layer subsystem. The SmartCampus facilities have been organized as a hierarchical tree with $N$ nodes and $E$ edges, $G_{EPCC}(N, E)$. A node $n_i$ represents a physical element at a given level of the hierarchy. The root node corresponds to the SmartCampus, EPCC, and the rest of the nodes hang from it at different levels and categories (buildings, classrooms, laboratories, sensors, ...). The first level, $B = \{B_1, B_2, ..., B_K\}$, is associated with the set of $K$ buildings and similar facilities.
Each element $B_k \in B$ is defined by a series of attributes, such as its identifier, its geo-location, and many optional elements like spatial structures (GeoJSON and Well-Known Text textual attributes).
Edges $E$ in $G_{EPCC}(N, E)$ are associated with the relationship 'HAS', i.e., the parent node $n_i$ has the child node $n_j$. Therefore, from each node $B_k \in B$ hangs the set of $L$ nodes $F = \{F_1, F_2, ..., F_L\}$ associated with the floors of the building $B_k$. Likewise, from each node $F_l \in F$ hangs the set of $M$ nodes $R = \{R_1, R_2, ..., R_M\}$, which are associated with the rooms (e.g., classrooms, offices, laboratories, ...) of the floor $F_l$. Nodes $F_l$ and $R_m$ are defined by the same list of attributes as the node $B_k$.
Finally, the last level represents the physical subsystem, i.e., the set of devices that have been installed in the CPS. Each sensor $w_i$ is related to the room or space in which it is located (i.e., HAS relationship), and therefore the device $w_i$ is in $R_m$. In order to identify which sensor $w_i$ belongs to each room $R_m$, the identifier explained in Section 3.2 is used. In the $G_{EPCC}(N, E)$ tree, other levels represent physical elements, such as furniture or people, hanging from the $R_m$ level following the same logic as in the other levels. The list of attributes is similar to that of the other nodes in the tree, adding a link to its time series database and to the open-source visualization environment Grafana. Figure 5 illustrates a simple example of the tree $G_{EPCC}(N, E)$ with only one building, two floors with different rooms, and only one device. Figure 6 shows a partial view of the whole tree $G_{EPCC}(N, E)$ centered on the Computer Science building.
The graph database $D_n$, represented as a tree $G_{EPCC}(N, E)$, describes the SmartCampus' CPS from a geometric point of view, associating each level with geolocalized physical elements. The hierarchical division into levels, starting from buildings and going down to the devices responsible for data acquisition, together with the list of attributes of each node, facilitates future queries and visualization of CPS data.

[Figure: $G_{EPCC}(N, E)$ tree. Each node is associated with a level in the hierarchy and is characterized by a list of attributes.]
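The hierarchy of HAS relationships described above can be pictured with a small in-memory tree. This is only a sketch in plain Python, not the Neo4j implementation used by the CPS; the `Node` class and the `path_to` helper are hypothetical names, and the node names reuse the identifier fragments given earlier in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    label: str  # "Campus", "Building", "Floor", "Room" or "Device"
    children: list = field(default_factory=list)

    def has(self, child):
        """Add a HAS edge from this node to `child` and return the child."""
        self.children.append(child)
        return child


def path_to(root, name, trail=()):
    """Return the list of node names from `root` down to `name`, or None."""
    trail = trail + (root.name,)
    if root.name == name:
        return list(trail)
    for child in root.children:
        found = path_to(child, name, trail)
        if found:
            return found
    return None


# One building, one floor, one room, one device.
campus = Node("EPCC", "Campus")
building = campus.has(Node("INF", "Building"))
floor = building.has(Node("P00", "Floor"))
room = floor.has(Node("LAB001", "Room"))
room.has(Node("SEN_001_THC", "Device"))

print(path_to(campus, "SEN_001_THC"))
# ['EPCC', 'INF', 'P00', 'LAB001', 'SEN_001_THC']
```

Because every edge is a parent-to-child HAS relation, locating a device reduces to a walk from the root, which mirrors how the location information is encoded in the device identifier.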

Information handling and processing using enterprise service bus
Information is a critical factor in delivering services across CPSs. In the system described in this article, dynamic data obtained from readings in physical devices, along with static data coming from blueprints, schedules, or inventories, constitutes the core of the information system. This data is made available to SmartCampus users and machine-to-machine connections. This high number of interactions between devices, humans, and the CPS must be organized in a scalable, efficient, and reliable way. The choice here has been to use a Service-Oriented Architecture (SOA), where multiple services are provided through the open-source Enterprise Service Bus (ESB) Zato (Zato 2021). An ESB facilitates communication between software agents (Chaudhari et al. 2017) while integrating and managing multiple information sources with different access methods. Let ℚ be the set of $N$ services $\mathbb{Q} = \{q_1, q_2, ..., q_N\}$ provided by the CPS. Each service $q_i \in \mathbb{Q}$ implements a function $f_i$ and has a maximum activation rate, which denotes the frequency at which the service $q_i$ is requested (e.g., physical devices use a specific service for storing data in each period). Each service $q_i$ uses the HTTP protocol by creating a specific plain HTTP channel $c_i$ that accepts synchronous HTTP service invocations. Specifically, with Zato, several REST channels are used, each of which requires an identifier and the path on which to mount the channel, urlpath, in the URL ip:port/urlpath. Plain HTTP channels in Zato do not expect the data to be in any particular format. Figure 7 illustrates an overview of how services are managed in this proposal.
Creating services in an SOA is not a complex task. One of the main advantages of this system is its high scalability and flexibility when new services are required. The following are examples of some of the key services deployed on the Zato ESB:
- physical device to time series database ($q_{w_j \rightarrow D_i}$): This service generalises the insertion of data from the physical subsystem into the data storage system, in particular into the time series database $D_i$. Using the service $q_{w_j \rightarrow D_i}$, physical devices make a request to the channel $c_{w_j \rightarrow D_i}$ at the corresponding activation rate, using only a URL and their last measurement.
- users to time series database ($q_{u_j \leftarrow D_i}$): This service allows users or other software agents in the CPS to access the data stored in the data storage system in a simple and generalised way. The service $q_{u_j \leftarrow D_i}$ uses the specific channel $c_{u_j \leftarrow D_i}$ and obtains data from $D_i$ by means of only a URL, a sensor identifier, and the date of interest. Due to the asynchronous nature of these queries, no activation rate is defined for this service.
- viewer service: This service allows users to access the data generated by the CPS in real-time through an interactive map or viewer. The service uses its own channel but, unlike the previous services, it does not directly access any of the databases; instead, it loads an interactive map of the CPS into the browser. This interactive map allows users to move around it, so that the browser displays a map area defined by coordinates at the activation rate defined for this service. Then, a request is made to $D_n$ looking for all those nodes whose coordinates are within the coordinates that define the displayed map zone. The nodes found are arranged based on their label or level (Building, Floor, Room, Device) and then depicted on the map along with their attributes.
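The node lookup performed by the viewer service can be sketched as a bounding-box filter followed by a grouping by label. In the CPS this lookup is a request to the graph database $D_n$; the plain-Python filter below only stands in for that query, and the function name, the dictionary keys, and the coordinates are hypothetical.

```python
LEVELS = ["Building", "Floor", "Room", "Device"]


def nodes_in_view(nodes, south, west, north, east):
    """Group by label the nodes whose coordinates fall inside the map area."""
    grouped = {label: [] for label in LEVELS}
    for node in nodes:
        inside = south <= node["lat"] <= north and west <= node["lon"] <= east
        if inside and node["label"] in grouped:
            grouped[node["label"]].append(node["name"])
    return grouped


# Made-up coordinates, for illustration only.
sample_nodes = [
    {"name": "INF", "label": "Building", "lat": 39.4790, "lon": -6.3420},
    {"name": "LAB001", "label": "Room", "lat": 39.4791, "lon": -6.3421},
    {"name": "SEN_001_THC", "label": "Device", "lat": 39.4791, "lon": -6.3421},
    {"name": "OTHER", "label": "Building", "lat": 39.5000, "lon": -6.3000},
]

visible = nodes_in_view(sample_nodes, 39.478, -6.343, 39.480, -6.341)
print(visible)
```

Grouping the result by level makes it straightforward for the map to draw buildings first and devices last, whatever order the database returns them in.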
One of the critical features of ℚ is that precisely the same service $q_i$ can be exposed over multiple channels without any changes to the service's implementation. Besides, the ESB improves the security of all communications by avoiding the transfer of sensitive data between physical devices and the data storage system and by limiting the number of queries to the databases. The ℚ system also allows changes in the CPS set-up without re-implementing services (e.g., changes in the IP addresses of the servers or in the structure of the data storage system).

Visualization system
One of the main goals of the CPS described in this work, apart from collecting data on the variables that affect energy efficiency in the SmartCampus, is to make this data available to the users of the buildings. The fact that people in the SmartCampus can know in real-time their use of critical resources (e.g., water consumption or electricity consumption) can be used to make them aware of their responsibility and contribution to environmental sustainability, helping to transform SmartCampus users into intelligent citizens (Sánchez et al. 2017).
Generally, a well-designed visualization system facilitates high-level application designs. Among these design issues, the system must monitor physical devices and infrastructure to ensure stable and proper operation (e.g., measurements, communications, among others). It also needs to support real-time decision making by combining multiple data sources into a specific viewer. For this reason, the CPS for SmartCampus proposed in this paper defines a visualization system that consists of two different viewer tools. The first one is based on the open-source visualization tool Grafana, designed for monitoring and analyzing time series (Grafana 2021). The second one is based on an interactive map viewer of the SmartCampus facilities. Access to both viewers is made through their corresponding services defined in ℚ. Figure 8 shows both viewers. Figure 8a shows the water consumption in the different buildings of the SmartCampus. In Fig. 8b, a fragment of the SmartCampus map is illustrated, showing the energy consumption of the building.

Fig. 7 Overview of the services management in Zato Enterprise Service Bus
- Grafana viewer: The different data visualizations that can be built in Grafana are organised in dashboards. In this proposal, the available physical devices have been organized in different sections: water consumption, environmental data, access points, energy consumption (electricity, gas), and cameras. In this way, any interested user can access the data freely, since the tool is publicly accessible. Additionally, this viewer is also periodically displayed on several smart TVs distributed throughout the different buildings of the SmartCampus.
- Interactive map viewer: This visualization tool uses specific ℚ services to display a map of the SmartCampus in the user's browser, with access to measurements of all the physical devices. This interactive viewer allows navigation by the user (e.g., zoom in, zoom out, or move around a specific region), as well as different types of interaction: real-time access to physical devices, such as visualization of camera streams in real-time, downloading of selected data, among others. All these functionalities are offered with different access levels to provide security and confidentiality according to the area of the map being visited.

Designing high-level applications for sustainable SmartCampus
Previous sections have highlighted the importance of providing users with a wide range of high-level services and applications. In this sense, the data acquired in the physical world is not only used to show the past and current state of the SmartCampus to the users in the visualization system, but also makes it possible to automatically detect energy problems, observe usage trends, and propose improvement strategies. Analyzing the data makes it possible to create algorithms to detect abnormal energy consumption, predict future demands, or interrelate information from different sources.
Given the current state of most buildings built more than 40 years ago, several severe anomalies have been detected that affect their efficiency and overall sustainability. Some of these problems are related to water leaks and excessive consumption of electricity and gas. These are clear examples where AI techniques embedded in a CPS can improve the working of its natural counterpart.
For instance, the SmartPoliTech project is currently carrying out a campaign to raise environmental awareness among users (students, teachers, teaching support staff) by displaying energy consumption data. The primary vehicle to achieve this now is the use of smart TVs connected to the CPS. These strategically placed screens display instantaneous, daily, and monthly consumption data easily and understandably. Graphics specially designed for the campaign show the cost in euros of keeping the buildings functioning hour by hour. In addition, they are accompanied by messages primarily addressed to the university community based on the analysis of a team of psychologists and sociologists. They have based these campaigns on techniques to motivate users and improve their commitment to the sustainability cause.

Use case: optimization of water consumption in smart buildings
As a practical example of the application of the CPS presented in this paper, we describe here how it can be used to reduce and optimize water consumption in the SmartCampus. The process begins with the modeling of water consumption in the different buildings. In this estimation problem, it is important to note that water consumption may vary depending on the day of the week (e.g., workday or weekend) or the month (e.g., work month or holidays), resulting in different averages and standard deviations (Melzi et al. 2017). The proposal described in this article is based on a Gaussian mixture model, which is a well-known method for estimating unknown distributions of data (McLachlan and Peel 2000). In this work, historical data is used to create time-dependent models that identify the hours, time slots, or days of higher and lower consumption. Finally, once the model is available, its prediction is directly compared with the actual consumption measured by the installed sensors.

Use case definition
This use case evaluates the CPS proposed in this article for the specific objective of optimizing water consumption in buildings on a smart and sustainable Campus. The Computer Science Building has been chosen from all the facilities that make up the SmartCampus, among other reasons because it is currently the most sensorized building and the most visited by users. This building is approximately 4000 m² distributed over two floors. It currently houses more than one hundred physical devices that acquire and store data in real-time on physical servers. Figure 9 shows 3D models of the two floors of the building, where classrooms, laboratories, and offices are also labeled. The layout of the sensors that measure the water consumption in the EPCC facilities can be seen in Fig. 10. There are twelve sensors distributed throughout the six buildings. In this same figure, the Computer Science building, where the tests of the study are carried out, is labeled. The name of each sensor in the CPS implementation, the place where it is located, and the start date for data collection are shown in Table 6.

Fig. 9 Computer Science building. It consists of two floors and different classrooms, offices and laboratories

Following the definition of the CPS as the set formed by the physical world, the data storage system, the services ℚ, and the visualization system, and considering the Computer Science building, $B_1$, as an independent system, the CPS restricted to $B_1$ is the CPS definition for the use case. In this scenario, only the set of 12 sensors responsible for measuring water consumption is considered, $\{w_{1_w}, w_{2_w}, ..., w_{12_w}\}$. Each physical device $w_j$ consists of the sensor, in this case a commercial IWM-PL3 sensor, plus the list of software components defined in Sect. 3.2. IWM-PL3 is an electronic pulse emitter module for multi-jet water meters, whose output is one pulse every 10 litres. The rest of the software components have been programmed in Python.
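Since the meter module emits one pulse every 10 litres, the consumption in a time window follows from counting pulses. The sketch below shows one possible way to aggregate pulse timestamps into hourly litres; the function name and the sample timestamps are hypothetical and are not taken from the deployed software components.

```python
from collections import Counter
from datetime import datetime

LITRES_PER_PULSE = 10  # one pulse every 10 litres, as stated for the IWM-PL3


def litres_per_hour(pulse_times):
    """Aggregate pulse timestamps into litres consumed per hourly window."""
    counts = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in pulse_times
    )
    return {hour: n * LITRES_PER_PULSE for hour, n in sorted(counts.items())}


# Made-up pulse timestamps, for illustration only.
pulses = [
    datetime(2019, 4, 2, 10, 5),
    datetime(2019, 4, 2, 10, 20),
    datetime(2019, 4, 2, 10, 55),
    datetime(2019, 4, 2, 11, 10),
]
hourly = litres_per_hour(pulses)
print(hourly)
```

With this pulse weight, the resolution of any hourly figure is 10 litres, which is worth keeping in mind when comparing small differences between predicted and measured consumption.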
Information acquired by these sensors is independently stored in the data storage subsystem of $B_1$. On the one hand, as described in Sect. 3.3, for each sensor $j_w$ the data is stored in a time-series database $D_i^{j_w}$. As indicated in Table 6, in some cases water consumption data has been stored since 2016. On the other hand, the graph database $D_n$ is defined in this particular case as $G_{EPCC}^{B_1}(N, E) \subset G_{EPCC}(N, E)$, which is composed of the set $F = \{F_1, F_2\}$ associated with the two floors of the building, as well as the set of nodes $R$ associated with all the classrooms, laboratories and offices.
For the use case described in this article, the following services $q_i \in \mathbb{Q}$ have been implemented:
- $q_{u_{j_w} \leftarrow D_i^{j_w}}$, which is responsible for retrieving data from the data storage subsystem of $B_1$ and generating alarm signals. This data is used in a software component $u_{j_w}$, implemented in Python, which is part of the AI of the proposed CPS to detect anomalies.
- $q_{u_{j_w} \rightarrow m}$, which is responsible for sending warning messages to the different stakeholders when there are anomalous values, $m$ being the communication channel of this alarm signal.

The overall structure of the $q_{u_{j_w} \leftarrow D_i^{j_w}}$ service is outlined in Algorithm 1. This service is the basis of the use case described in this work. First, the algorithm obtains the water consumption prediction for the sensor $j_w$ in a time window $t$, $C_p(t)$, by using a Gaussian Mixture Model, and then compares this prediction with the actual value measured in the same time window, $C_c(t)$. In this comparison, a security margin $\epsilon_m$ is added to $C_p(t)$ to minimize the number of false positives. In case of anomalies, that is, when $C_c(t) > C_p(t) + \epsilon_m$, a warning signal $s_m$ is generated and the anomaly is addressed using the service $q_{u_{j_w} \rightarrow m}$. In the proposed system, $\epsilon_m$ is a percentage of the water consumption predicted by the model, $C_p(t)$.

A mixture of Gaussians is seen in the literature as a combination of Gaussian functions that provides a good model for clusters of points: each cluster corresponds to a Gaussian density whose mean is located about the centroid of the cluster and whose covariance matrix estimates the spread of that cluster. Therefore, given a set of points in $\mathbb{R}^d$, it is possible to find the mixture of Gaussian functions $f(x \mid \mu, \Sigma)$ that best fits those points. In the use case discussed in this article, GMMs are used to model and predict water consumption in a specific time window based on consumption patterns in the buildings of the SmartCampus; that is, the inputs of our algorithm correspond to water consumption values at different points in time of the historical data, depending on the specific model.
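The comparison step of the alarm service described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' Algorithm 1: the function name, the message text, and the `notify` callback (standing in for the warning service and its communication channel) are hypothetical, and the margin is taken as a fraction of the predicted consumption.

```python
def check_window(predicted, measured, margin_fraction, notify):
    """Compare one time window and call `notify` if consumption is anomalous.

    predicted       -- model prediction for the window, in litres
    measured        -- consumption actually measured in the window, in litres
    margin_fraction -- security margin as a fraction of the prediction
    notify          -- callable that delivers the warning message
    """
    threshold = predicted + margin_fraction * predicted
    if measured > threshold:
        notify(
            f"Abnormal water consumption: {measured} L measured, "
            f"threshold {threshold:.0f} L"
        )
        return True
    return False


# Collect warnings in a list instead of sending them anywhere.
warnings = []
check_window(120, 170, 0.30, warnings.append)  # above threshold, warns
check_window(120, 150, 0.30, warnings.append)  # below threshold, silent
print(warnings)
```

Passing the delivery mechanism as a callback keeps the comparison independent of the channel used to reach the stakeholders.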
A classical method to derive the GMM from training data is the iterative two-step Expectation-Maximization (EM) algorithm, which seeks the maximum likelihood solution efficiently (Figueiredo and Jain 2002). The E-step computes the expectation of the log-likelihood evaluated using the current parameter estimates, and the M-step then estimates the parameters that maximize the expected log-likelihood found by the E-step. Applied to water consumption prediction in the CPS proposed in this paper, the mixture of Gaussians takes the standard form

$$f(c \mid \mu, \Sigma) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(c \mid \mu_k, \Sigma_k),$$

where $K$ is the number of components, $\pi_k$ are the mixing weights, and $c \in \mathbb{R}$ corresponds to real water consumption values in the building $B_i$ in a time window $t$, which were acquired using the physical device $j_w$ and later stored in $D_i$ and $D_n$ by using the service $q_{j_w \rightarrow D_i}$.
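To make the two EM steps concrete, the following sketch fits a one-dimensional Gaussian mixture with plain Python. It is an illustrative implementation under simple assumptions (quantile-based initial means, a fixed number of iterations, a variance floor), not the code used in the CPS, and the consumption values are synthetic.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k=2, iterations=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture to `data` with plain EM."""
    ordered = sorted(data)
    n = len(ordered)
    # Initial means: evenly spaced quantiles (a simple heuristic).
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    overall_mean = sum(ordered) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in ordered) / n, min_var)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)
    return weights, means, variances


# Synthetic hourly consumption (litres): a low and a high regime.
random.seed(0)
samples = ([random.gauss(60, 5) for _ in range(200)]
           + [random.gauss(120, 8) for _ in range(300)])
weights, means, variances = fit_gmm_1d(samples, k=2)
print(sorted(round(m, 1) for m in means))
```

EM only guarantees convergence to a local maximum of the likelihood, so the initialization matters; in practice a library implementation with several restarts would be a reasonable alternative to this hand-written loop.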

Experimental results and discussion
This section presents the main results of the CPS SmartPoliTech for the use case described in this article, aiming to progress in sustainable campuses. Firstly, the predictive models obtained with the mixture of Gaussians are outlined. Next, the results obtained by using these models to generate warning messages that reduce the liters wasted are broken down. Throughout the section, these results and the possible decision-making that improves the whole system's performance are also discussed. Figure 11 illustrates real water consumption values of the Computer Science building, which is the sum of the water consumption of the three physical devices $j_w$ in $B_1$ (see Fig. 10). These measurements are associated with different days in the same month. In this example, the time window, $t$, is 60 minutes. As shown in the figure, the mean water consumption for each hour is similar across days. Figure 12 illustrates daily consumption during February and shows that consumption from Monday to Thursday is also similar, while consumption decreases on Fridays and is usually minimal on weekends. This trend is also repeated in other months during the academic year.
The mixture of Gaussians obtained from the model for April at 12:00 pm is shown in Fig. 13; it predicts the water consumption at this time and on this date. In this case, the Gaussian mixture model provides a curve with two probability maxima, $c_1$ and $c_2$. If these two probability peaks are identified as the two most likely consumption values during that hour in April, it is possible to use these values as an adaptive threshold for generating an alarm signal in case of anomalies. Two possibilities are analyzed in this paper: firstly, if the real water consumption provided by the physical device is lower than the two most probable consumption values provided by the model, then it can be considered adequate; secondly, if the real water consumption acquired by the physical device is higher than the highest consumption value provided by the model, then there is something wrong, and a warning message is generated. Figure 14 illustrates the real water consumption on different days in a week (only working days are shown). Most of the days have a global maximum of consumption at 10:00 am and a local maximum at 04:00 pm. Figure 14 also shows the predictive model (using the time window $t = 60$ minutes), considering: (i) the water consumption prediction if the maximum probability, $c_1$, is used (black line); and (ii) the water consumption prediction if the second maximum probability, $c_2$, is used (fuchsia line). This predictive model has been used to generate warning messages only in case of higher real water consumption. Figure 14 also indicates the fixed threshold, which has been chosen at 200 liters. As shown in the figure, the use of an adaptive threshold allows the CPS to save water faster in case of anomalies.

Fig. 12 Daily consumption in February. Water consumption on weekends is usually next to zero. During the week, water consumption follows a similar trend

Fig. 13 Water consumption prediction at noon in April. This curve has been obtained using the Gaussian Mixture Model described in this paper
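One way to obtain the two probability maxima of a fitted mixture is to evaluate its density on a grid and keep the local maxima, ordered by density. The sketch below does this for a one-dimensional mixture; the mixture parameters are hypothetical values chosen only so that the peaks fall near 120 and 60 litres, and the function names are illustrative.

```python
import math


def mixture_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture evaluated at x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, stds)
    )


def density_peaks(weights, means, stds, lo, hi, step=1.0):
    """Return the local maxima of the density on a grid, most probable first."""
    n = int((hi - lo) / step) + 1
    xs = [lo + i * step for i in range(n)]
    ys = [mixture_pdf(x, weights, means, stds) for x in xs]
    peaks = [
        (ys[i], xs[i])
        for i in range(1, len(xs) - 1)
        if ys[i] > ys[i - 1] and ys[i] >= ys[i + 1]
    ]
    return [x for _, x in sorted(peaks, reverse=True)]


# Hypothetical mixture parameters for one hour of one month.
peaks = density_peaks(
    weights=[0.6, 0.4], means=[120.0, 60.0], stds=[10.0, 8.0], lo=0.0, hi=200.0
)
c1, c2 = peaks[0], peaks[1]
print(c1, c2)
```

A grid search is crude but sufficient here, since the meter resolution of 10 litres is much coarser than a 1 litre grid step.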
Results of the system in several real scenarios are summarized in Table 7, where the real water consumption and the output of both predictive models, $c_1$ and $c_2$, are presented, as well as the relative errors between real and estimated consumption. All values correspond to the same hour, in this example noon. From the predictive model, $c_1 = 120$ litres and $c_2 = 60$ litres, as shown in Table 7. Table 7 describes the actual consumption in April 2019, from which the non-school days for Easter, as well as Saturdays and Sundays, have been omitted. Several conclusions can be drawn from this table. The first is that, of the 15 days analyzed, 13 show an error below 20%, and only on two of the days is an error above 30% obtained; that is, the analyzed real consumption values are generally well approximated by the two estimated consumption values. The results also show that one model is better than the other depending on the day of the week. Higher water consumption (i.e., from Monday to Thursday) is better modeled using the prediction labeled as $c_1$. On the contrary, water consumption on Friday is better modeled by the prediction $c_2$. The decision between using $c_1$ or $c_2$ to model water consumption on a given day is currently straightforward, but more complex decisions are being analyzed in the context of the proposed CPS.
These relatively low error percentages are similar in the rest of the hours tested in April 2019, so it is possible to use the estimated consumption values $c_1$ to generate an adaptive threshold at each hour of the day, from which it is possible to send warning messages that indicate abnormal water consumption. This threshold is generated by adding a 30% margin to reduce the number of false alarms. For instance, if the estimated maximum consumption at 12:00 pm is 120 liters, the maximum consumption threshold is 156 liters. According to this 12:00 pm value, and using the data shown in Table 7, the number of warning messages generated by the service is 2 (11th and 29th of April), while the number of messages is zero with a threshold fixed at 200 liters. Table 8 summarizes a comparative study between two different warning systems during three consecutive days. The first one uses a fixed threshold of 200 liters. The other uses the adaptive threshold, which is obtained from the water consumption prediction $c_1$ plus a security margin. Table 8 shows the real water consumption from 07:00 to 19:00 h for each day of the comparative study, $Rc_i$, as well as the liters saved by using one warning system or the other. Only the hours of the day in which consumption is usual in the EPCC are shown, omitting the night-time schedule. In summary, using the adaptive threshold during these 3 days would have resulted in a saving of 274 liters, while with a fixed threshold the saving would have been 0 liters, without considering possible leaks that could have occurred during night-time hours.
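The threshold arithmetic above can be checked with a few lines of code. The readings in this sketch are invented for illustration and are not the values of Table 7 or Table 8; only the 120 litre prediction, the 30% margin, and the 200 litre fixed threshold come from the text.

```python
FIXED_THRESHOLD = 200.0  # litres, the fixed threshold used for comparison


def adaptive_threshold(predicted_litres, margin=0.30):
    """Prediction plus a percentage margin, e.g. 120 L + 30 % = 156 L."""
    return predicted_litres + margin * predicted_litres


def count_warnings(readings, threshold):
    """Number of readings that exceed the threshold."""
    return sum(1 for litres in readings if litres > threshold)


noon_threshold = adaptive_threshold(120)

# Invented noon readings on five days (litres), for illustration only.
hypothetical_readings = [118, 125, 160, 95, 171]
adaptive_count = count_warnings(hypothetical_readings, noon_threshold)
fixed_count = count_warnings(hypothetical_readings, FIXED_THRESHOLD)
print(adaptive_count, fixed_count)
```

Readings that fall between the adaptive and the fixed threshold are exactly the cases in which only the adaptive system raises a warning.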

Conclusion
The deployment of digital technologies on a SmartCampus to improve socially essential aspects such as comfort, energy efficiency, or sustainability is becoming a reality thanks to technological advances such as the Internet of Things, data science, and cloud computing. The future of universities is to equip their facilities with a good set of devices (the physical world), to provide users with monitoring tools to increase their security, optimize the use of spaces and time, and provide solutions that make the buildings more sustainable and efficient. In this context, Cyber-Physical Systems are conceived as a powerful tool that integrates most of the above technologies to create an ideal framework to achieve these objectives. These CPSs have made the leap from industry to other sectors, such as agriculture, medicine, transport, and in recent years, although at a very slow speed, to universities. This article describes a general CPS for SmartCampus, SmartPoliTech, which has been successfully deployed at the School of Engineering of the University of Extremadura, a complex of more than 40,000 m² consisting of seven buildings and more than 1500 users. This paper describes, following a nomenclature similar to that of other papers, the proposed CPS, detailing each of the components and agents that make up the complete system. As a novelty, the proposal uses a Service-Oriented Architecture, integrating two-way communications and IoT services on an enterprise service bus.
The description of the CPS is not complete if it is not validated against a use case that requires the interaction of the different components and services. For this reason, this work presents a use case where the IoT infrastructure is used to optimize water consumption in buildings. For this purpose, the data collected by the sensors is used to detect abnormal water consumption (due, for example, to losses in the supply network or occasional failures in toilets) and to generate warning messages that reduce the liters consumed. Furthermore, this alarm system implements predictive algorithms based on Gaussian mixture models and efficiently creates long- and short-term water consumption predictions that are later used to create consumption alarms.
This work could be expanded on in various ways. For instance, use cases can be redefined to improve the sustainability and efficiency of buildings concerning electricity or gas consumption. The CPS described in this paper has been in use since 2016, and the historical data is extensive. The CPS currently has data on environmental variables (CO2, temperature, humidity, among others), but also data on water, energy, and gas consumption. By combining all this data with, for example, building occupancy, it is possible to improve predictive models and thus create even more sustainable and intelligent buildings.

Consent to publish
All the authors consent to the publication of the manuscript.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.