Introduction

The construction industry is involved in many domains: residential, industrial, civil structures, infrastructures such as roads, rail, waterways, etc. Since the end of the twentieth century, the European construction industry has been facing a double challenge.

  • Maintaining a high level of employment, compared to other industries that automate and make redundancies, while increasing productivity.

  • Complying with a growing number of requirements related to environmental and societal issues, such as the protection of plant and animal species, water management, and the health and safety of workers on site.

Consequently, construction projects are complex environments where multiple stakeholders must cooperate and interact to create increasingly intelligent structures. A "smart" construction needs to manage larger amounts of data. This data must be monitored, recorded, and structured to develop services such as digital archives, data analysis, and interpretation for prediction. The complexity of the information required for today's decisions in the construction industry calls for digital techniques. Consequently, there are considerable investments in digitization at the national, regional, and global levels, along with a need for collaboration and interoperability between systems. Digital techniques alone do not ensure that this happens seamlessly. The observed environment is traditionally modeled using GIS (geographic information systems), while the built environment is modeled using BIM (building information modeling). The traditional difference in granularity between these two perspectives (GIS and BIM) is gradually vanishing. Use cases and perspectives converge and largely overlap. The models from the two domains are tightly bound to each other. Every built construction has a location in the existing environment, the Earth (if we do not consider outer space), and GIS incorporates all built environments. Each of these two domains invests heavily in capturing information from the other domain, which today mainly results in double spending. Removing the barriers to sharing information across disciplines will save significant investments and result in higher-quality information.

In line with this, the construction industry has already embedded in its processes the management of vast amounts of digital data for investigating and describing existing conditions and for monitoring traffic, toll systems, or operation. Many sensors are already widely used in Europe to report mobility-related information in real time. State-of-the-art technology offers multiple data sources, sensors, and analytic engines to automate and improve the process. However, instead of each sensor individually monitoring its own small area of responsibility, sensors should be able to communicate with each other and, in real time, draw the picture of the whole building as it exists at a specific moment in time. This is our understanding of the Digital Building Twin (DBT) concept.

Following this definition (DBT as lively representations of buildings' status and environment), this article presents the contributions made in this field and in the context of the McBIM project, started in 2017 and funded by the French National Agency of Research (ANR). The article comprises three main sections: “Research Issues for Digital Building Twins” summarises research issues regarding DBTs to identify the main related research domains. For each of these domains, “Standards and State-of-the-Art Approaches for Digital Building Twins” provides the state-of-the-art, namely existing standards for digitizing the construction process, latest advances in Semantic Web technologies, and wireless sensor networks. Section “Pushing Forward the State-of-the-Art: The ANR MCBIM Approach” presents the contributions envisioned by the ANR McBIM project, namely explainable decision-support based on real-time sensor data integrated into a digital building representation.

Research Issues for Digital Building Twins

Now, in the twenty-first century, the construction domain faces an increasing number of challenges. Among those, we may cite:

  • The growing competition to win control over construction data and information among the stakeholders present, including asset managers and operators.

  • The general tendency to develop and implement digital environments for monitoring the "connected humans" present in each territory, e.g., "smart cities".

  • The necessity to provide innovative structures and services to dramatically decrease the carbon impact in the construction and operation phases.

Following our definition of the DBT, and considering the above challenges, we divide the research issues into the following categories. Each category pertains to a different domain, and the related state-of-the-art is provided in the specific sub-sections of “Standards and State-of-the-Art Approaches for Digital Building Twins” mentioned below.

The first domain is building information modelling (BIM). The recent advances in this field, emphasizing effective information management, have dramatically improved delivery and performance efficiencies by catalysing increasingly innovative ways of working in construction. Implementing BIM open standards allows better strategic decisions and improved predictability through better risk management. Section “Digitizing the Construction Process: Standards for Digital Building Twins” provides an overview of existing BIM open standards.

While digital tools and technologies handle complex information exchanges in the construction domain, digitalization does not automatically ensure interoperability. Existing software implementations of open BIM standards highlight the growing need for collaboration and interoperability among those systems. Today, in the context of a construction project, only syntactic interoperability can be reached, which constrains every actor to use the same software. Still, issues arise whenever an actor uses different software or a different tool to query or produce information, whatever form such information may take (models, spreadsheets, drawings, certificates, programs, etc.). Indeed, construction projects are complex systems that require an intensive collaborative effort from all stakeholders involved, along with a complete understanding of the cause and effect of all inputs. Without these, it is impossible to identify deviations from the plan early on, let alone implement corrective actions and prevent adverse outcomes. Efficiency, safety, and accuracy in construction projects require a common understanding of the information exchanged. Thus, the second domain is semantic interoperability and the challenges related to its implementation among the actors, tools, and technologies involved in a construction project. Indeed, a design DBT differs from a construction DBT, which differs from a Facility Management DBT. Still, all the underlying digital models must be synchronized persistently, and interoperable information exchange must be ensured among stakeholders. Section “Achieving Semantic Interoperability: Definition and Problem Statement” further explains what is understood by semantic interoperability and how it can be implemented.

The third domain pertains to Semantic Web technologies. Indeed, another issue for DBTs is to capture all the knowledge related to the construction processes themselves and allow this knowledge to be handled by machines in (semi-)automatic ways. While today's ICT systems allow storing BIM data and associating it with context elements (thus obtaining BIM information), they fail to provide the flexibility and the reasoning needed for lean decision support. This is mainly because most of this knowledge is implicit and held by human experts. Ontologies (as formal and explicit specifications of shared domain conceptualizations) allow integrating data and information from any number of data sources and information models, and reasoning on such structures (deriving explicit knowledge from implicit knowledge). Their flexibility, expressiveness, and ability to explain all the deductions made have led ontology-based approaches (using Semantic Web technologies and Linked Data principles) to be identified as the only type of approach that comprehensively addresses the interoperability issues mentioned above [1]. Section “Semantic Web Technologies for BIM and IoT” provides a review of such existing approaches as implemented and sometimes pushed at the level of standardization organizations.

After capturing the underlying knowledge, the next step is to integrate building monitoring and analysis data, as provided by sensors deployed in the building. The fourth domain is wireless sensor networks (WSN). Our vision of a DBT goes beyond WSN, pushing the idea of materials that can communicate with their environment, sense it, and measure their internal physical states. It exploits the evolution of the Internet of Things, leading to an increasing "sensorization" of physical spaces. In our vision, a DBT must be able to reason about its status and surrounding environment. Integrating sensor data with BIM models allows digitally representing physical and functional characteristics of physical spaces and, thus, provides relevant information about the building. Reusing the concept of "communicating material" (coined by CRAN in 2010 [2]), our vision is entirely in line with the above idea of "smart cities", as DBTs must be aware of their environment and their status. Section “Data Dissemination and Energy Efficiency in WSN” summarizes state-of-the-art approaches existing in WSNs for ensuring data dissemination and energy efficiency.

Thus, to ensure semantic interoperability, sensor data are integrated into the DBT through ontologies. The analysis of such data in support of decision-making processes is implemented on top of these ontologies using logical rules and constraints. Explainable decision support is the fifth domain addressed by the McBIM project. Section “Specifying Expert Knowledge” presents how Semantic Web technologies allow specifying expert knowledge and how semantic rules can help implement explainable decision-support functionalities.

With the above issues in mind, this article presents how the French National Research Agency-funded ANR McBIM project helps push forward the state-of-the-art to provide an implementation for our vision of the DBT. Before presenting our contributions, the sections below further explain the existing approaches related to the issues listed above and provide a review of current standards applicable in the related domains.

Standards and State-of-the-Art Approaches for Digital Building Twins

Before describing the research orientations taken in the context of the ANR McBIM project, this section summarizes the main existing standards and approaches related to the issues tackled by the project, namely: (1) digitizing the construction process, (2) achieving semantic interoperability, (3) Semantic Web technologies and their applications for BIM and IoT, (4) WSN: algorithms for data dissemination and approaches for energy efficiency, and (5) specifying expert knowledge using semantic rules.

Digitizing the Construction Process: Standards for Digital Building Twins

Digitizing the construction process has been addressed through different standards at different levels, e.g., internationally with ISO, Europe-wide in the context of CEN, and nationally through national bodies such as the French AFNOR and the German DIN. Most of the existing standards are difficult to implement today. This is either because they do not have a computer-defined implementation process (e.g., the ISO 19650 standard family), or because they lack interoperability with other existing standards (as an example, at the level of the ISO TC59/SC13 a 14th Joint Working Group has been created for tackling interoperability issues among BIM and GIS systems). This section presents these standards.

First and foremost, ISO 19650-1 (published in December 2019) provides the base definition for BIM: "Use of a shared digital representation of a built asset to facilitate design, construction and operation processes to form a reliable basis for decision-making" [3]. Much more than a 3D model, BIM is considered a process for sharing information along the design, realization, and operation phases. ISO 19650-2 [4] provides a clear view of the information delivery process in a digital building twin context: the client specifies (a) the Asset Information Model (AIM) or the "as-built" or "as maintained" asset, and (b) the Project Information Model (PIM) or the system for delivering the information. These specifications are related to the whole building life cycle, covering its design, construction, and operation phases. Following a system engineering approach, one could define the AIM as "the system to be delivered" and the PIM as "the system for delivering". The idea of providing a "reliable" mechanism for "decision-making" is crucial. It is also one of the main reasons Semantic Web technologies have been widely recognized as enablers for such a mechanism.

In the context of the ANR McBIM project, we consider BIM as a system that allows the generation, storage, and exchange of information about building elements and sensors. Such a BIM-based system goes beyond a simple viewer of three-dimensional building models. Following the assumption that a BIM system must enable "reliable" decision-making, the ANR McBIM system includes semantic knowledge alongside building models. State-of-the-art approaches in semantics for BIM are presented in the next section (“Semantic Web Technologies for BIM and IoT”). The ANR McBIM approach is discussed in section “Implement Explainable Decision-Support”.

In this section, we present the standards forming the building bricks of BIM. They are illustrated in Fig. 1 [3]:

Fig. 1 The building bricks of BIM: IDM, MVD and IFC [3]

ISO 29481:2016 or "Information Delivery Manual" [5] is the standard for describing actors and processes involved in a contracted exchange. Following IDM, Exchange Requirements (ER) are specified, in the "building planning" phase, in natural language by domain experts (or BIM users) and software developers (or BIM providers). An ER defines what kind of information must be included in the exchange. IDM Process Maps (PM) are specified using Business Process Model and Notation (BPMN) and usually include stakeholders, project stages, and activities.

In the next building lifecycle phase, namely design, the above informal IDM is adapted as Model-View Definitions (MVD), which represent a subset of the complete IFC Schema and are serialized in XML (mvdXML) [5]. Each ER defined in an IDM is translated into specific concepts and relations from the IFC Schema (see the paragraph below). The goal of MVDs is to enable compliance and conformity-checking of IFC files according to constraints defined in IDMs.

ISO 16739:2013 "Industry Foundation Classes" [6] is the open BIM standard for representing buildings' conceptual models. Following ISO 10303-21:2012 or STEP (Standard for the Exchange of Product data) [7], IFC's main serialization is EXPRESS. The *.ifc format is open, public, and non-proprietary, and is promoted by the international standardization organization bSI (buildingSMART International). The current version of IFC is IFC4.2, but IFC5 should be made official by 2022.

The IFC Schema is divided into four main layers listed below [8].

  1. The Core Layer comprises the most general entity definitions, each entity having a "globally unique identifier" along with optional "owner and history information".

  2. The Interoperability Layer defines entities specific to different disciplines, in terms of general products, processes, or resources, and provides the concepts and relations for exchanging and sharing construction information among different domains (inter-domain).

  3. The Domain Layer further specializes the element definitions from the Interoperability Layer and enables the exchange of construction information in the context of the same domain (intra-domain).

  4. The Resource Layer includes the description of costs, actors, quantities, constraints, approvals, etc. Concepts from the Resource Layer do not have a "globally unique identifier"; thus, they must be related to a concept from the Core, the Domain, or the Interoperability Layer.

According to the IFC Schema, building elements generalize elements present in a building and are modeled using the IfcElement concept. Doors, beams, or walls represent typical examples of building elements. An instance of IfcElement is assigned to a building's spatial structure (e.g. a building storey) using IfcRelContainedInSpatialStructure. The property IfcRelDefinesByType allows setting an element type to an IfcElement. IfcBuildingElement is the sub-class of IfcElement representing tangible building elements, some of which can be made of concrete (which is the focus of the ANR McBIM project). Among those elements (sub-classes of the IfcBuildingElement class), in the context of the ANR McBIM project, we will focus on IfcBeam and IfcWall elements.
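As an illustration only, the containment relation described above can be mimicked with plain Python data classes. This is a simplified sketch and not the IFC schema itself: the class names mirror the IFC entities mentioned in the text, while the attributes (`name`, `global_id`, `relating_structure`, `related_elements`) and the helper `elements_in` are hypothetical simplifications introduced for this example.

```python
from dataclasses import dataclass, field
from typing import List
import uuid


@dataclass
class IfcElement:
    """Simplified stand-in for the IFC entity of the same name."""
    name: str
    # Assumption: a random UUID string plays the role of the GUID.
    global_id: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class IfcBuildingElement(IfcElement):
    """Tangible building elements (sub-class of IfcElement)."""


@dataclass
class IfcBeam(IfcBuildingElement):
    pass


@dataclass
class IfcWall(IfcBuildingElement):
    pass


@dataclass
class IfcBuildingStorey:
    name: str


@dataclass
class IfcRelContainedInSpatialStructure:
    """Links a spatial structure (here a storey) to the elements it contains."""
    relating_structure: IfcBuildingStorey
    related_elements: List[IfcElement] = field(default_factory=list)


def elements_in(storey, relations):
    """Hypothetical helper: collect the elements assigned to a storey."""
    return [
        element
        for relation in relations
        if relation.relating_structure is storey
        for element in relation.related_elements
    ]


storey = IfcBuildingStorey("Level 1")
beam = IfcBeam("B-01")
wall = IfcWall("W-01")
relations = [IfcRelContainedInSpatialStructure(storey, [beam, wall])]
contained_names = [element.name for element in elements_in(storey, relations)]
```

Note that the element does not store its storey directly: as in IFC, the link is carried by a separate relation object, which is why the lookup goes through the list of relations.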

In addition to classes, the IFC specification defines property sets (Psets), i.e. ensembles of property taxonomies. Users can add properties that are missing from the Psets specified in the standard [8], indicating that they are not part of the standard by omitting the "Pset_" prefix from their names. Each of the classes above (IfcBeam and IfcWall) implements the three Psets for concrete or precast concrete elements, namely Pset_ConcreteElementGeneral, Pset_PrecastConcreteElementFabrication and Pset_PrecastConcreteElementGeneral. As the ANR McBIM project will not consider precast elements, Table 1 lists only the properties in Pset_ConcreteElementGeneral, which are implemented in the context of the project.

Table 1 IFC Pset_ConcreteElementGeneral [8]

IfcSensor is a newly added concept in IFC4 and represents a device that measures a physical quantity and converts it into a signal that can be read by an observer or by an instrument. Using IfcSensorType with a value from IfcSensorTypeEnum (listed in Table 2), several types of building sensors can be specified in building data. Figure 2 illustrates the concepts and relations that need to be instantiated to assign a sensor type to an instance of IfcSensor.

Fig. 2 Associating sensor types to IfcSensor instances using IFC4

Table 2 Sensor types defined in IFC4.1 [8]

Achieving Semantic Interoperability: Definition and Problem Statement

Before discussing semantic interoperability, one must first define what interoperability is. ISO (International Organization for Standardization) provides several definitions of "interoperability", depending on the domain of knowledge or application considered. Following the definition provided by ISO/TC 46/SC 4 (Technical interoperability) in ISO 21127:2014 [9], "technical interoperability" implies that either "two systems can exchange information" or that "multiple systems can be accessed with a single method". From a computer science point of view, several levels of interoperability are considered [10].

  • Level 1 interoperability or physical interoperability is defined as the "computation, use, transfer and exchange of data" [11]

  • Level 2 or syntactic interoperability concerns the "ability of two or more systems or services to exchange structured information" [12]

  • Level 3 semantic interoperability comes on top of the two previous levels and, when implemented, enables "the meaning of the data model within the context of a subject area [to be] understood by the participating systems" [13]

These interoperability levels build on each other, i.e. implementing interoperability at Level L requires implementing interoperability at Level L-1. Thus, semantic interoperability cannot be reached without having implemented physical and syntactic interoperability. Physical interoperability is solved using hardware standards (e.g. Ethernet) and standard network protocols (e.g. TCP/IP or HTTP). Syntactic interoperability has also been resolved through the specification and implementation of syntax standards such as XML, HTML, WSDL, or SOAP.

When it comes to semantic interoperability, several flavors exist, each resulting in different actions that can be performed on the underlying knowledge. To best apprehend these concepts, we will start from existing ISO definitions for "semantic interoperability" [10].

  • ISO 13606-1:2018 [14] defines it as the "ability for data shared by systems to be understood at the level of fully defined domain concepts". This definition points to the first level of semantic interoperability, which is "understanding of data". Such understanding is usually characterized as minimum semantic interoperability. It is enabled by approaches based on Resource Description Framework (RDF). In this case, only the minimum knowledge is modeled: the concept of a "building" is related to the concept "sensor" through the relation "contains".

  • ISO 16678:2014 [12] considers such interoperability as "the ability of two or more systems or services to automatically interpret and use information that has been exchanged accurately". This definition introduces the need for information interpretation, information being defined as contextualized data. This is called extended semantic interoperability and requires minimum semantic interoperability (thus RDF). The RDF Schema language allows defining a common interpretation of the elements contained in the message exchanged. Following our previous example, with RDF Schema, the "building" concept is identified by a URI (Uniform Resource Identifier), allowing a computer agent to dereference the concept and access an RDFS-defined ontology specifying additional knowledge about a "building". Extended semantic interoperability allows obtaining additional knowledge about the concepts handled, but it does not allow constraining such knowledge. Following our example, with this level of interoperability, it is impossible to specify that if a "sensor" is in a "building", the same "sensor" cannot be contained in a different "building".

  • The definition from ISO/IEC 19941:2017 [13] points to full semantic interoperability and requires higher-order ontology description languages from the OWL family, such as OWL-DL, OWL 2 RL, etc. This level of interoperability allows bounding knowledge: allowed interpretations represent the lower bound, while constraints preventing specific inferences form the upper bound. In the context of our "building" containing "sensors", an incoherence would be reported if the same "sensor" instance is in two different "building" instances.
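The running "building contains sensor" example can be sketched with plain Python tuples standing in for RDF triples. This is a minimal illustration rather than an actual RDF/OWL toolchain: the identifiers (`ex:building1`, `ex:contains`, and so on) and the function `find_containment_conflicts` are hypothetical, and the check hard-codes a single constraint (a sensor is contained in at most one building). With OWL, a reasoner would derive the same incoherence from the ontology itself, for instance if the containment property were declared inverse-functional and the two buildings were stated to be different individuals.

```python
from collections import defaultdict

# Triples (subject, predicate, object); all identifiers are hypothetical.
triples = [
    ("ex:building1", "rdf:type", "ex:Building"),
    ("ex:building2", "rdf:type", "ex:Building"),
    ("ex:sensor1", "rdf:type", "ex:Sensor"),
    ("ex:sensor2", "rdf:type", "ex:Sensor"),
    ("ex:building1", "ex:contains", "ex:sensor1"),
    ("ex:building2", "ex:contains", "ex:sensor1"),  # conflicting statement
    ("ex:building2", "ex:contains", "ex:sensor2"),
]


def find_containment_conflicts(triples):
    """Return {sensor: [buildings]} for sensors contained in several buildings.

    Assumption: distinct building identifiers denote distinct buildings.
    """
    containers = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "ex:contains":
            containers[obj].add(subject)
    return {
        sensor: sorted(buildings)
        for sensor, buildings in containers.items()
        if len(buildings) > 1
    }


conflicts = find_containment_conflicts(triples)
```

Minimum and extended semantic interoperability would accept all seven statements as they are; only the constraint, which corresponds to the full level, flags `ex:sensor1`.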

Regarding the digitalization of architecture, engineering, and construction (AEC), the French expert commission PTNB identified Linked Data and Semantic Web technologies as the only approach that fully addresses interoperability issues in these domains [1]. Thus, the next section briefly presents knowledge modeling with Semantic Web standards and the main existing ontologies pertaining to digital building twins.

Semantic Web Technologies for BIM and IoT

In the last decade, Semantic Web and Linked Data technologies have received increasing attention as a means of facilitating knowledge modeling in the AEC/FM sector. Since then, the topic of using semantics for delivering actionable knowledge has continuously gained attention from both researchers and industry practitioners. The scope of this section is not to provide a state-of-the-art review of the Semantic Web, but to list and describe the main conceptual differences associated with knowledge modeling with Semantic Web languages such as RDFS or OWL.

Indeed, modeling with Semantic Web languages differs from object-oriented modeling. For example, in the EXPRESS language (used for serializing IFC), all the concepts are related to one single key or primary (meta-)concept. In EXPRESS, a property is always declared in the context of an entity. With Semantic Web languages, classes and properties are defined independently. Whether an instance belongs to a class is determined by the set of necessary and sufficient conditions that the individual must satisfy. In the context of the Semantic Web, a class is defined as an ensemble of properties that all its individuals must implement, with specific values. Another difference in semantic modeling is the Open World Assumption (OWA), which states that knowledge that is missing or was not specified must not be assumed to be false. In EXPRESS, what is not specified is assumed false by default; this is called the Closed World Assumption (CWA). Furthermore, the Semantic Web does not make the Unique Name Assumption (UNA). It handles resources identified by means of URIs (Uniform Resource Identifiers): two resources are known to be identical only if their URIs are identical, or if their equivalence is explicitly stated or inferred. In the context of BIM (namely in IFC files), resources are identified by means of so-called GUIDs (Globally Unique Identifiers) that usually contain UUID data. The main issue with this approach is that these GUIDs are not unique from one IFC file to the other. This is mainly explained by the freedom of implementation that the IFC standard gives to software companies. Identification using URIs, as implemented in the context of the Semantic Web, avoids the issues related to UUIDs.
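The difference between the two assumptions can be illustrated with a small Python sketch. It is only an analogy: the fact base and the functions `ask_closed_world` and `ask_open_world` are hypothetical names introduced here, and actual Semantic Web reasoners work on logical entailment rather than on set membership.

```python
from typing import Optional, Set, Tuple

Fact = Tuple[str, str, str]

# Hypothetical knowledge base: what is explicitly known to hold or not to hold.
known_true: Set[Fact] = {("wall1", "madeOf", "concrete")}
known_false: Set[Fact] = {("wall1", "madeOf", "timber")}


def ask_closed_world(fact: Fact) -> bool:
    """CWA: anything that is not stated as true is assumed to be false."""
    return fact in known_true


def ask_open_world(fact: Fact) -> Optional[bool]:
    """OWA: answer True or False only when known; None means 'unknown'."""
    if fact in known_true:
        return True
    if fact in known_false:
        return False
    return None


query = ("wall1", "hasSensor", "sensor1")  # never stated either way
cwa_answer = ask_closed_world(query)  # False: absence is read as falsity
owa_answer = ask_open_world(query)    # None: absence is read as 'unknown'
```

In practice, this means that a missing sensor assignment in an IFC file is read as "no sensor", whereas the same gap in an ontology only means that nothing is known yet.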

Several ontologies exist and allow modeling and annotating data in digital building twins, namely building, sensor, urban and geographic data. They are listed and briefly described in the table below (see Table 3). More details about these ontologies, the Linked Data principles [15] they respect, along with an evaluation of their modeling quality can be found by the reader in [16].

Table 3 Existing ontologies pertaining to digital building twins

Data Dissemination and Energy Efficiency in WSN

Several research initiatives have underlined the benefits of integrating Internet of Things technologies such as RFID and WSN into construction products, and research works and industrial initiatives using RFID technologies in the construction domain have been extensively reviewed. These approaches show that RFID technologies have been tested and can bring significant economic leverage in all phases of the precast concrete lifecycle, e.g. in precast quality management [17] or in the construction supply chain [18], by bringing product information to stakeholders. Additionally, WSNs are also used, albeit more rarely, when active monitoring of the structure is needed, as in manufacturing (for early-age concrete inspection, as in [19]) or for structural health monitoring [20]. Industrial initiatives are also numerous, but RFID tags are used most of the time [16]. For example, we may cite Lafarge, which integrated RFID tags directly into the concrete of the D2 tower for traceability applications.

Data dissemination in a wireless sensor network (WSN) relies on cooperation amongst nodes. Given that each sensor node has a limited transmission range, some intermediate nodes can act as relays to enable extensive network coverage. A WSN is considered a set of nodes equipped with wireless communication interfaces that work together. Concerning energy in WSNs, many research works have been proposed in the last two decades. Following the top-down survey for energy efficiency in WSNs [21], Fig. 3 introduces various ways to address this issue.

Fig. 3 Approaches addressing energy efficiency in WSNs (adapted from [21])

While all categories can be explored to improve energy efficiency, in ANR McBIM we focus on data dissemination across the network. Some protocols are based on RPL (Routing Protocol for Low-Power and Lossy Networks) [21]. In [22], the authors proposed an RPL-based routing protocol (named PriNergy) that meets the quality-of-service requirements, avoids network congestion, and seeks energy efficiency. Tita et al. [23] proposed a protocol that uses two-hop information to improve the performance of WSNs in terms of energy and quality-of-service. They introduced two metrics: (1) the potential relay information (PRI), which considers the residual energy, the distance, the delay, and the quality of the links to neighbour nodes, and (2) the neighbourhood state index (NSI) algorithm, which helps reduce delay and traffic load. Some protocols use optimization techniques to improve clustering in WSNs. The authors of [24] used the MOA (Mayfly Optimization Algorithm) [25] to efficiently select the cluster heads (CH). This approach considers energy, position, and distance when choosing a CH. To optimize the network's overall energy, a rotation of the CH role over time is considered. Other techniques, such as energy harvesting [17] or data reduction [26], have also been explored.

Routing is an essential task in WSN communications. Routing protocols allow finding paths to transport data from a source to a destination (e.g., the sink). A path is made up of the list of intermediate nodes. According to [27], WSN routing protocols are classified according to the network structure or the protocol operations (see Fig. 4). Various communication technologies, such as IEEE 802.15.4 (ZigBee), IEEE 802.11 (Wi-Fi), and IEEE 802.15.1 (Bluetooth Low Energy), can be used to transmit data through the network to an internet gateway. In the ANR McBIM project, low frequencies (from the 900 MHz band to 2.5 GHz) were preferred for propagation in concrete. Successful communication tests were, for example, carried out using LoRa, BLE, or ZigBee.
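The notion of a multi-hop path to the sink can be illustrated with a breadth-first search over nodes that lie within transmission range of each other. This is a generic sketch and not one of the routing protocols cited here: the node positions, the range value, and the function name `shortest_hop_path` are hypothetical, and links are assumed to be ideal (any two nodes closer than the range can communicate).

```python
from collections import deque
from math import dist


def shortest_hop_path(positions, source, sink, tx_range):
    """Return the node ids from source to sink with the fewest hops, or None."""
    previous = {source: None}  # also serves as the set of visited nodes
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == sink:
            # Walk back from the sink to the source to rebuild the path.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for other in positions:
            in_range = dist(positions[current], positions[other]) <= tx_range
            if other not in previous and in_range:
                previous[other] = current
                queue.append(other)
    return None


# Hypothetical layout: four nodes on a line, 8 m apart.
positions = {"A": (0, 0), "B": (8, 0), "C": (16, 0), "sink": (24, 0)}
path = shortest_hop_path(positions, "A", "sink", tx_range=10)
unreachable = shortest_hop_path(positions, "A", "sink", tx_range=5)
```

With a 10 m range, nodes B and C act as relays between A and the sink; with a 5 m range, no path exists and the function returns `None`.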

Fig. 4 Classification of WSN routing protocols [27]

In flat routing protocols, all nodes play the same role. These protocols are used, for example, in flooding or data-centric dissemination. In other protocols, each node's role may depend on several criteria (position, remaining energy, etc.); this class is known as hierarchical routing protocols.

LEACH (Low Energy Adaptive Clustering Hierarchy) [28] is one of the most popular WSN clustering protocols. Clusters are computed in a distributed way. A probability p is used for selecting cluster heads (CH), and each node then selects the CH with which it can communicate using the minimum energy. Any node can take on the CH role over time, so as to best distribute the traffic load and energy consumption. PEGASIS (Power-Efficient Gathering in Sensor Information Systems) [29] proposes an improvement of LEACH. It forms a chain amongst nodes and favors communications between closest neighbors. The Chain-Cluster-based Mixed routing protocol (CCM) [30] combines the benefits of LEACH and PEGASIS, especially regarding latency and energy consumption. CCM forms routing chains and selects a head (amongst chain heads) that sends aggregated data to the sink. Chain-Routing-Based on Coordinates-oriented-Cluster (CRBCC) [31] is another cluster-based routing protocol. Nodes are grouped depending on their geographical coordinates. The clusters are formed based on the Y coordinate, and the algorithm forms a chain within each cluster and elects a leader for each chain. Balanced Chain-Based Routing Protocol (BCBRP) [32] is another data dissemination protocol that aims to extend the lifespan of the sensor nodes. It relies on three steps: (1) split the network into several subnetworks of equal size, then (2) select a bridge node within each subnetwork, and finally (3) build a chain that interconnects the subnetworks via the bridges. Table 4 summarizes the advantages of the common WSN routing protocols.
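The rotation of the CH role in LEACH can be sketched as follows. The threshold is the one commonly given for LEACH, T(n) = p / (1 - p (r mod 1/p)) for a node that has not yet served as CH in the current epoch of 1/p rounds, where r is the round index. Everything else is a simplifying assumption made for this sketch: the function names, the fixed random seed, the bookkeeping of eligible nodes, and the omission of the step in which ordinary nodes join the CH reachable with the minimum energy.

```python
import random


def leach_threshold(p: float, round_index: int) -> float:
    """Election threshold T(n) for a node that is still eligible."""
    period = round(1 / p)  # number of rounds in one epoch
    return p / (1 - p * (round_index % period))


def elect_cluster_heads(node_ids, p, rounds, seed=0):
    """Return, for each round, the set of nodes that elected themselves CH."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    period = round(1 / p)
    eligible = set(node_ids)
    history = []
    for r in range(rounds):
        if r % period == 0:
            eligible = set(node_ids)  # new epoch: every node is eligible again
        threshold = leach_threshold(p, r)
        # Each eligible node draws a number in [0, 1) and compares it to T(n).
        heads = {n for n in sorted(eligible) if rng.random() < threshold}
        eligible -= heads  # a node serves as CH at most once per epoch
        history.append(heads)
    return history


history = elect_cluster_heads(range(8), p=0.25, rounds=4)
served = set().union(*history)
```

Because the threshold grows to 1 in the last round of an epoch, every node ends up serving as CH exactly once per epoch, which is how the energy cost of the CH role is spread over the network.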

Table 4 Comparison of WSN routing protocols [66]

Usually, sensor nodes are battery-powered. The batteries could be hard to recharge or replace (because WSN could be deployed in hostile environments). Therefore, energy efficiency is a critical issue in this kind of network.

Nodes consume energy to send or receive messages and for carrier listening. Reducing the transmission power lowers the energy consumed when sending messages. However, it also reduces the transmission range, increases the number of hops, and lengthens the paths. Energy can also be saved by arranging for nodes not to remain active during the whole monitoring time. The sensor nodes then alternate between active and inactive (sleep) modes. This alternation of active and inactive modes is referred to as the duty cycling approach.

Duty cycling is an effective energy conservation mechanism in WSN. The lower the duty cycle, the longer a node stays inactive and the more energy it saves. This lengthens the network lifetime. The duty cycle can be applied to all subsystems of a sensor node, including the radio communication subsystem. When they are active, nodes can send or receive messages or simply listen to the radio channel. Idle listening can represent a significant consumption of energy over time. Two common phenomena favor duty cycling. First, it exploits the redundancy in wireless sensor deployment [3]. The system can thus adaptively select only a minimal subset of nodes that will remain temporarily active to maintain connectivity. Second, it exploits the fact that in most applications, occurrences of events are rare. Most of the time, the nodes are merely listening to the channel. In other words, the selected subset of nodes need not be active all the time. Moreover, the protocols that self-organize the network are driven by nodes that consider only their own state or a local view of the network. These decentralized decision-making protocols are effective locally but can be sub-optimal for the whole network.
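
The effect of the duty cycle on node lifetime can be illustrated with a simple average-power model: a node that is active for a fraction d of the time draws on average d * P_active + (1 - d) * P_sleep. The sketch below applies this model; the battery capacity and power figures are hypothetical values chosen for illustration only.

```python
def average_power_mw(duty_cycle, active_mw, sleep_mw):
    """Average power draw of a node that is active for a fraction `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_mw + (1.0 - duty_cycle) * sleep_mw


def lifetime_hours(battery_mwh, duty_cycle, active_mw, sleep_mw):
    """Estimated lifetime: battery energy divided by the average power draw."""
    return battery_mwh / average_power_mw(duty_cycle, active_mw, sleep_mw)


# Hypothetical node: 3000 mWh battery, 60 mW when active, 0.03 mW when asleep.
always_on = lifetime_hours(3000.0, 1.0, 60.0, 0.03)
ten_percent = lifetime_hours(3000.0, 0.10, 60.0, 0.03)
one_percent = lifetime_hours(3000.0, 0.01, 60.0, 0.03)
```

Under this model the lifetime grows roughly in inverse proportion to the duty cycle until the sleep-mode consumption becomes the dominant term.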

Specifying Expert Knowledge

Integrating expert knowledge into data processing provides a better understanding of the data and thus eases its handling. To this end, Semantic Web standards allow defining metadata (descriptions of the data) and constraints, thus supporting reasoning processes on top of the gathered data. Such approaches go beyond today's state of the art and have proven efficient in implementation [3]. Two references support this claim: (1) the IfcWoD (Web of Data) adaptation of ifcOWL, which reduces query execution times by almost 90% [33], and (2) the federation approach exploiting the FOWLA architecture [34], which makes the COBieOWL [35] and ifcOWL ontologies (as provided by buildingSMART [1]) interoperable. Using the ifcOWL ontology corresponding to the IFC version of the exchanged building data, one can annotate such data and ease its interpretation by both humans and machines (as illustrated in Fig. 5).

Fig. 5 Illustration of semantic annotation in the interpretation of IFC data [54]

When applying Linked Data principles [15] to define semantic links among those ontologies and vocabularies, it is possible to create a so-called knowledge continuum (through explicit semantic links specified among their concepts and relations). Such combined knowledge becomes a conceptualization of the application domain considered. Semantic Web languages provide an adequate level of expressivity for such conceptualizations while maintaining the overall system's efficiency. On top of such conceptualizations, a set of semantic rules can be defined, corresponding to the expert "know-how". Contrary to machine learning approaches, semantic rules are based on description logics and make it possible to track and constrain how the expert system reasons. Such methods are called programmed or constrained reasoning. They are needed in AI-based systems whose deductions must be explainable to the human user (see Fig. 6).

Fig. 6 Defining rules and interpreting them [55]

Not only can one implement constrained reasoning on top of such modeled knowledge, but semantic rules also make it possible to define concepts missing from the considered ontologies. For example, the IFC specification does not include a building envelope concept, that is, the set of all building elements that have an isExternal property defined in the related IFC data. The rules illustrated below (Fig. 7) demonstrate how semantic rules can describe this missing concept [36].

Fig. 7 Using semantic rules to define the concept of a building envelope
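
The building envelope example can also be approximated in a few lines of code. The sketch below is not the rule set of Fig. 7 or of [36]; it is a minimal stand-in that assumes facts are stored as (subject, predicate, object) tuples, that an element belongs to the envelope when its isExternal value is true, and that the class name BuildingEnvelopeElement and the element identifiers are hypothetical. Each inferred fact carries a textual justification, in the spirit of explainable, constrained reasoning.

```python
def infer_envelope(triples):
    """Apply one rule: (?e, isExternal, True) implies (?e, type, BuildingEnvelopeElement)."""
    inferred = []
    for subject, predicate, obj in triples:
        if predicate == "isExternal" and obj is True:
            fact = (subject, "type", "BuildingEnvelopeElement")
            explanation = f"{subject} belongs to the envelope because its isExternal property is true"
            inferred.append((fact, explanation))
    return inferred


# Hypothetical facts extracted from IFC data.
facts = [
    ("wall_1", "isExternal", True),
    ("wall_2", "isExternal", False),
    ("slab_1", "type", "IfcSlab"),
]
envelope = infer_envelope(facts)
```

Keeping the explanation next to each deduction is what lets a user trace why an element was classified as part of the envelope.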

Furthermore, such rules allow specifying building abstractions corresponding to existing Levels of Detail (LOD) or to specific expert needs. The authors of [37] have defined and implemented an approach based on semantic rules that extracts the exact sub-portion of an IFC file corresponding to a set of provided elements, e.g., GUIDs, relation names, or concept names.
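
The idea of extracting a sub-portion of a model from a set of seed elements can be pictured as a graph traversal. The sketch below is not the rule-based method of [37]; it is a simplified illustration that assumes the model is available as (subject, relation, object) tuples and that the seeds are element identifiers. All identifiers and relation names are hypothetical.

```python
from collections import deque


def extract_subgraph(triples, seeds, relations=None):
    """Collect the triples reachable from the seeds, optionally following only named relations."""
    outgoing = {}
    for subject, relation, obj in triples:
        outgoing.setdefault(subject, []).append((relation, obj))

    seen, queue, kept = set(seeds), deque(seeds), []
    while queue:
        node = queue.popleft()
        for relation, obj in outgoing.get(node, []):
            if relations is not None and relation not in relations:
                continue
            kept.append((node, relation, obj))
            if obj not in seen:  # visit each node once, so cycles terminate
                seen.add(obj)
                queue.append(obj)
    return kept


# Hypothetical model fragment.
model = [
    ("beam_1", "hasSensor", "sensor_1"),
    ("sensor_1", "hasValue", "21.5"),
    ("beam_1", "hasMaterial", "concrete"),
    ("wall_1", "hasSensor", "sensor_2"),
]
beam_part = extract_subgraph(model, ["beam_1"])
beam_sensors = extract_subgraph(model, ["beam_1"], relations={"hasSensor"})
```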

Led by LIB (Laboratory of Computer Science of Burgundy), WP4 of the ANR McBIM project will compose and specify constraints and rules about expert knowledge in structural health monitoring for the construction and exploitation phases. Based on this, WP4 will provide methods to implement semantic compliance checks at these two building lifecycle phases, thus making it possible to specify human knowledge and experience in a computer-processable way. Sections “Implement a Standard-Compliant Solution” and “Implement Explainable Decision-Support” further detail the developments carried out in the context of this WP.

Pushing Forward the State-of-the-Art: The ANR McBIM Approach

With a consortium of academic and industrial partners, ANR McBIM focuses on the construction industry's need for new standards and methods for material tracking and recycling. In this project's context, we consider ontological approaches for constraint checking, following consistent information taxonomies, protocols, and processes for information exchange that conform to the concepts and principles defined in the ISO 19650 standard family [3, 4, 38, 39, 40]. More specifically, the ANR McBIM project aims to design a "communicating concrete", that is, to embed sensors in concrete building elements and integrate the sensed data into the BIM platform, thus delivering an interoperable digital building twin (see Fig. 8).

Fig. 8 The overall vision of the ANR McBIM project

The project exploits the benefits of "communicating concrete", namely:

  a. its data storing capacity, which allows concrete elements to convey information related to design, manufacturing, and logistics, valuable during these lifecycle phases or BOL (Beginning Of Life), but also during EOL (End Of Life), meaning building demolition and recycling;

  b. its sensing and processing capacities, useful during MOL (Middle Of Life), which covers the operation and maintenance lifecycle phases.

Two building lifecycle phases are considered for implementing and testing the considered approach, namely construction and exploitation phases (for structural health monitoring).

Still, before reaching this vision, several additional challenges must be tackled to integrate real-time sensor data with BIM models. Indeed, open BIM standards do not provide capabilities to process real-time data, such as those supplied by sensors in smart environments. Some of the specific challenges are (a) extracting knowledge out of BIM, using the Industry Foundation Classes (IFC) standard [6], and (b) ensuring real-time processing and reasoning over sensor data. Many approaches have been proposed to solve these challenges by integrating BIM with real-time data. However, most of them lack practical validation or are highly dependent on domain specifications. These attempts develop applications for specific domains, namely (1) energy management, (2) building automation, (3) fire control, (4) health and safety, (5) safety risk, and (6) augmented reality (AR), and therefore cannot be considered standard solutions adaptable to a large class of application domains. These facts reveal the lack of a standard solution for combining BIM with real-time data, one that would streamline the creation of sophisticated solutions leveraging sensor data together with the complex characteristics of the built environment.

For addressing these challenges, ANR McBIM provides means to relate the physical world (e.g. the building element made of concrete) to the digital world (e.g. the data associated with this particular building element). For collecting the data, in the project's context, we use a WSN embedded into the concrete and composed of two types of nodes: the sensing nodes (SN) and the communicating nodes (CN). These are illustrated in the figure below (see Fig. 9). Sensing nodes are in charge of capturing the building element's physical values (e.g. temperature, moisture) and relaying them to the communicating nodes. SNs are powered via RF harvesting techniques. The communicating nodes have more capabilities and are battery-powered. They can communicate with each other and with the digital world (e.g. as a gateway). This architecture allows obtaining the values used for implementing explainable decision-support, as described in section “Implement Explainable Decision-Support”.

Fig. 9 Communication among sensor nodes in the WSN considered for ANR McBIM [17]

In the context of the ANR McBIM project, we seek to conceive and implement a solution for integrating BIM with sensor data. The following objectives are therefore pursued: (1) consider an updated building model, (2) integrate data from multiple sensors, (3) resolve queries that combine sensor data with building data, (4) produce answers to those queries in real time as the sensor data is processed, and (5) offer the user a clean, simple, and easy-to-use interface. As mentioned in the introduction, a digital building twin is a different concept from BIM, notably in the goals pursued. Indeed, while BIM only renders the elements in an IFC file, a digital building twin must integrate real-time data, mainly obtained from sensors integrated into the building. Moreover, as the project is ongoing, not all the objectives above have been reached yet. This article focuses on the use of ontologies for seamlessly integrating data gathered from sensors into building data. The sections below examine points (2) to (4) of the objectives in more detail. Sections “Designing Robust Wireless Communications” to “Reorganizing the Sensors” discuss sensor communication, placement, and reorganization. Section “Implement a Standard-Compliant Solution” details the structure of the McBIM ontology and its relationship with existing standards that apply in the context of BIM. Finally, Section “Implement Explainable Decision-Support” illustrates how the standard ifcOWL ontology can be used to monitor specific building elements.

Designing Robust Wireless Communications

When sensor nodes are deployed in hostile or difficult-to-access environments, they should be designed to operate autonomously. In the framework of the McBIM project, human interventions to manage the sensor nodes after they have been poured into concrete will be very difficult. The autonomic computing paradigm allows self-management.

Self-managing systems that require limited human intervention make it possible to cope with complex management tasks and reduce overall maintenance costs. Systems become a collection of interconnected autonomous entities. The autonomic computing paradigm is inspired by the autonomic nervous system. The main objectives of autonomic systems are self-configuration, self-healing, self-optimization, and self-protection (also known as self-CHOP). These objectives have recently been extended [41]. To ensure the self-CHOP properties, autonomous systems must be able to interact with their environment through sensor and effector modules. In addition, they must have a knowledge base, which can consist of simple configuration rules or be enriched by artificial intelligence algorithms. The interaction with the environment must be continuous so that the system can adapt to changing contexts. The overall working process of such a system is described by a closed control loop [41] (illustrated in Fig. 10).

Fig. 10 Illustration of the interactions between an autonomous element and the knowledge base
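
The closed control loop mentioned above is commonly described in the autonomic computing literature as four phases (Monitor, Analyze, Plan, Execute) operating over a shared knowledge base. The sketch below illustrates one iteration of such a loop for a sensor node. The knowledge keys, the battery threshold, and the duty-cycle values are hypothetical and are not taken from the McBIM design.

```python
def monitor(reading, knowledge):
    """Monitor: store the latest sensed values in the knowledge base."""
    knowledge["battery_level"] = reading["battery_level"]


def analyze(knowledge):
    """Analyze: detect a situation that calls for adaptation."""
    return knowledge["battery_level"] < knowledge["low_battery_threshold"]


def plan(adaptation_needed, knowledge):
    """Plan: choose the configuration changes to apply (none if all is well)."""
    if adaptation_needed:
        return {"duty_cycle": knowledge["energy_saving_duty_cycle"]}
    return {}


def execute(changes, node_config):
    """Execute: apply the changes through the effector (here, a configuration dict)."""
    node_config.update(changes)


def control_loop_step(reading, knowledge, node_config):
    """Run one pass of the loop and return the resulting node configuration."""
    monitor(reading, knowledge)
    execute(plan(analyze(knowledge), knowledge), node_config)
    return node_config


# Hypothetical knowledge base made of simple configuration rules.
knowledge = {"low_battery_threshold": 0.2, "energy_saving_duty_cycle": 0.01}
config = {"duty_cycle": 0.10}
after_healthy = dict(control_loop_step({"battery_level": 0.5}, knowledge, config))
after_low = dict(control_loop_step({"battery_level": 0.1}, knowledge, config))
```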

Sensor Placing in the Communicating Material

The sensor nodes will be poured into the concrete to monitor specific areas of a wall. These areas are called targets. Sensor nodes should be placed primarily around the targets. Redundancy can help to extend the WSN lifetime (using duty cycling) and to build a fault-tolerant system. Figure 11 illustrates sensor deployment around the targets. Figure 12 shows how sensor placement can be modified so that some nodes provide interconnection and act as relay nodes, forming a connected network. Placing sensors around specific areas is similar to the target monitoring problem [42].

Fig. 11 Target coverage

Fig. 12 Target coverage while ensuring network connectivity

The efficient deployment of sensor nodes can be defined as a multi-objective optimization problem. Given a number of sensor nodes, the goal is to maximize the redundancy around the targets while guaranteeing the network's connectivity (all the nodes must be able to communicate with each other directly or through relay nodes). The problem can also be defined as determining the minimum number of sensor nodes needed to cover all the targets and then placing a few additional sensors to ensure connectivity. Our early works have provided promising results regarding target coverage, based on evolutionary algorithms [43] and on a stochastic physics-based optimization algorithm [44]. In the context of ANR McBIM, we seek to adapt these approaches, considering the specificities of the concrete environment.
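
The two criteria above, redundancy around the targets and network connectivity, can be evaluated for a candidate placement as sketched below. This is only an evaluation step that an optimizer could call, not the algorithms of [43, 44]. The disc-shaped sensing and communication ranges and all coordinates are assumptions made for illustration.

```python
import math
from collections import deque


def coverage_counts(targets, sensors, sensing_range):
    """For each target, count the sensors located within sensing range (its redundancy)."""
    return [sum(math.dist(t, s) <= sensing_range for s in sensors) for t in targets]


def is_connected(sensors, comm_range):
    """Breadth-first search over the graph linking sensors within communication range."""
    if not sensors:
        return True
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for j in range(len(sensors)):
            if j not in seen and math.dist(sensors[i], sensors[j]) <= comm_range:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(sensors)


def evaluate(targets, sensors, sensing_range, comm_range):
    """Return (lowest redundancy over all targets, connectivity flag) for a placement."""
    return min(coverage_counts(targets, sensors, sensing_range)), is_connected(sensors, comm_range)


# Hypothetical wall section: two targets, three sensors, then two added relay nodes.
targets = [(0, 0), (10, 0)]
sensors = [(0, 1), (1, 0), (10, 1)]
with_relays = sensors + [(4, 0), (7, 0)]
before = evaluate(targets, sensors, sensing_range=2.0, comm_range=3.5)
after = evaluate(targets, with_relays, sensing_range=2.0, comm_range=3.5)
```

In this example both targets are covered in the first placement, but the network is split in two; adding the relay nodes restores connectivity without changing the coverage.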

Reorganizing the Sensors

Again, as sensors are embedded into concrete, their lifetime varies depending on how sensor data is routed inside the material. Thus, the issue is to adapt the routing strategy of the communicating material to maximize the global network lifetime. To address this issue, the McBIM project vision is to use the DBT to represent the network inside the material (and not only the material itself). This WSN DBT will then be used to estimate/simulate the residual lifetime of each node in the network. This view corresponds to the "Analyse" function of an autonomic manager. Based on the node energy levels, centralized network organization methods will then be applied to this global data, leading to optimized routing structures. This approach is presented below in Fig. 13.

Fig. 13 ANR McBIM approach for optimized routing structures [44]

The left side of Fig. 13 shows the communicating concrete with sensor nodes inserted. These nodes send data that is stored and analyzed in the concrete digital twin, in which a virtual node represents each node. The middle of the figure depicts the concrete digital twin (here a beam), in which the virtual nodes are represented as colored spheres. Like the real nodes, the virtual nodes are gathered in a network (links between virtual nodes being represented in Fig. 13 as white lines). The data sent by the real communicating concrete feeds the different energy models contained in each node [26]. Using these models, the concrete digital twin reports an estimation of the nodes' remaining energy levels, which are used to evaluate the remaining network lifetime. Figure 13 depicts a "concrete agent" that monitors the real concrete through the digital representation built on its right side. When appropriate, the agent launches a reorganization process to explore new routing strategies. If a better routing is found (i.e., a design maximizing the concrete lifetime), the design is kept, and reorganization orders are then sent to the WSN.
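
The reorganization step described above can be sketched as a comparison of candidate routing structures using the estimated residual energy of each node. The sketch makes several assumptions that are not stated in the project description: the network lifetime is taken as the number of rounds until the first node runs out of energy, each node produces one packet per round, sending and receiving a packet have fixed costs, and a routing is a child-to-parent map forming a valid tree rooted at a "sink". Node names and energy values are hypothetical.

```python
def packets_per_round(parents):
    """Packets each node transmits per round: its own plus those relayed for descendants."""
    load = {node: 1 for node in parents}
    for node in parents:
        hop = parents[node]
        while hop != "sink":  # assumes a valid tree, i.e. no routing cycles
            load[hop] += 1
            hop = parents[hop]
    return load


def network_lifetime(parents, residual_j, tx_j=1.0, rx_j=0.5):
    """Rounds until the first node is depleted under the simple per-packet cost model."""
    load = packets_per_round(parents)
    rounds = []
    for node in parents:
        per_round = load[node] * tx_j + (load[node] - 1) * rx_j
        rounds.append(residual_j[node] / per_round)
    return min(rounds)


def reorganize(current, candidates, residual_j):
    """Keep the routing with the longest estimated lifetime (the current one on ties)."""
    return max([current, *candidates], key=lambda r: network_lifetime(r, residual_j))


# Hypothetical residual energy estimates (joules) reported by the digital twin.
residual = {"a": 10.0, "b": 100.0, "c": 100.0}
current = {"a": "sink", "b": "a", "c": "a"}    # the weakest node relays everything
candidate = {"b": "sink", "a": "b", "c": "b"}  # a stronger node takes over the relay role
chosen = reorganize(current, [candidate], residual)
```

Here the candidate routing is kept because it moves the relay load away from the node with the least remaining energy, which is the kind of decision the reorganization orders would then convey to the WSN.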

Implement a Standard-Compliant Solution

As mentioned in “Digitizing the Construction Process: Standards for Digital Building Twin” and “Achieving Semantic Interoperability: Definition and Problem Statement”, implementing the functionalities needed for explainable decision-support while maintaining semantic interoperability requires defining standard-compliant ontologies (knowledge bases). Thus, the logic behind ANR McBIM is enabled by the ANR McBIM ontology, which uses concepts as defined in existing standards. Following ISO 19650-1:2018 [3], a Common Data Environment enables "the development of a federated information model". The project will provide a federated knowledge model for handling all knowledge related to the built asset, the construction works, and the life cycle considered, along with stakeholders' roles and requirements. Thus, ANR McBIM uses an ontology that federates concepts defined in existing standards and specified in separate modules. These modules are federated through semantic rules (defined in SWRL) following the FOWLA approach presented in [34]. This will be addressed in the context of the project's WP4, which will determine the federation of existing schemas and models by defining outgoing semantic links to concepts and properties already present in other vocabularies (as listed in Table 3). Reusing the FOWLA approach [34] yields a flexible solution while preserving efficiency in terms of query execution times. Indeed, FOWLA considers interoperable sub-schemas among the assessed modules, thus maintaining the federation even if one module evolves (e.g. a new concept is added). Moreover, the approach described in [45] improves SPARQL query execution times by selecting only the relevant SWRL rules [46].

Several modules (or schemas) will be considered for the ANR McBIM ontology.

  • The "standards module" will use concepts from ISO 19650-1:2018 [3] and ISO 29481 Parts 1 [5] and 3 [47] along with the relations among them represented in Fig. 14.

  • The "concrete module" will explicitly define all characteristics (e.g. those from the buildingSMART Data Dictionary (Footnote 2), but also from national working groups such as mediaConstruct Masonry) and refer to existing classifications for concrete (e.g. OmniClass 2013). Principles defined in ISO 12006-2 [48] for information classification and those from ISO 12006-3 [49] about attribute metadata definition will be respected.

  • The "actor module" will list all actors involved in ANR McBIM processes regarding the actor terminology defined in ISO 19650-2 [4]. According to ISO 29481-1:2016 [5], an actor is a "person, organization, or organizational unit (such as a department, team, etc.) involved in a construction process". According to the different processes specified in ANR McBIM, the following actors are considered: the client, the delivery team, and the task team. Following ISO 19650-2 [4], each of these will either be an appointed party, or an appointing party, according to the process and lifecycle phase considered.

  • The "sensor module" will contain all necessary knowledge about the sensors deployed into concrete. Alignments are provided to the main standard ontologies listed in “Semantic Web Technologies for BIM and IoT”, along with equivalence links to the relevant classes in the ifcOWL ontology for IFC4.1.

Fig. 14 Standard concepts from ISO 19650-1, ISO 29481-1 and ISO 29481-3 considered by the ANR McBIM ontology

Following the study of Part 1 of the ISO 19650 standard [3], ANR McBIM aims at implementing the information delivery cycle defined by the standard. Namely, the following steps have been considered.

  1. Specify information requirements through organizational information requirements (OIR), asset information requirements (AIR), project information requirements (PIR), and exchange information requirements (EIR).

  2. Define planning for information delivery.

  3. Implement automatic compliance checking for information approval, through the definition of (1) collaborative information management processes as defined in ISO 19650-2 [4], (2) information review processes (e.g. spatial coordination, information compliance) as defined in ISO 19650-2 [4], and (3) security-related best practices according to ISO 19650-5 [40].

To do so, we based our definitions on BPMN specifications of the processes considered in the project context. For each building element considered (e.g. IfcBeam or IfcWall), we envisaged the following lifecycle phases: design, production, delivery (handover), operations, management, and demolition. For each phase, we identified the ISO 19650-1 concepts [3] and ISO 19650-2 actors [4] involved. The above steps can thus be implemented as semantic rules (or constraints) on top of the McBIM ontology, ensuring the (semi-)automatic checking of the considered processes. The aim is to verify compliance with the requirements of a given lifecycle phase, for example, to check compliance with the client requirements during the delivery phase.
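
A phase-specific compliance check of this kind can be pictured as follows. The sketch is not the McBIM rule set: it assumes that the requirements of each lifecycle phase are expressed as a list of properties that must be filled in for an element, and the property names, the element record, and the requirement table are all hypothetical. Each failed check produces a readable justification.

```python
def check_phase_compliance(element, requirements, phase):
    """Return (is_compliant, explanations) for one element and one lifecycle phase."""
    explanations = []
    for prop in requirements.get(phase, []):
        if element.get(prop) in (None, ""):
            explanations.append(
                f"{element['id']}: required property '{prop}' is missing for phase '{phase}'"
            )
    return (not explanations, explanations)


# Hypothetical client requirements per lifecycle phase.
requirements = {
    "design": ["concrete_class"],
    "delivery": ["concrete_class", "pour_date", "sensor_count"],
}
beam = {"id": "IfcBeam_1", "concrete_class": "C30/37", "pour_date": "", "sensor_count": 3}

design_ok, design_issues = check_phase_compliance(beam, requirements, "design")
delivery_ok, delivery_issues = check_phase_compliance(beam, requirements, "delivery")
```

The same element can thus be compliant at one phase and non-compliant at another, and the list of explanations tells the responsible actor what is still missing.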

Implement Explainable Decision-Support

Explainable decision support is one of the main innovations in the context of this project. The idea is to ease the querying of the overall knowledge specified in the different federated ontology modules.

Based on the modules listed in the previous section, the ANR McBIM ontology implements Linked Data principles to align its upper-level concepts with existing standard ontologies, e.g., ifcOWL4ADD1 [50], SSN/SOSA [51], SAREF [52], SAREF4BLDG [53], SEAS [62], etc.

Of course, similar decision-support could be implemented using only ifcOWL, but given the high expressivity of this ontology, queries expressed on top of it have proven time-consuming [33]. To further exemplify the complexity of writing queries for ifcOWL, the following example shows the SPARQL query for obtaining the data values from IfcSensor instances. Let us consider the following IfcBeam element, comprising three instances of IfcSensor (corresponding to the beam illustrated in Fig. 13).

To give a clearer idea of the structural complexity of ifcOWL, Table 5 lists all triples about one IfcSensor instance, namely the one identified as IfcSensorType_60210 in Fig. 15. Line 1 references the IFC GUID instance (but not the GUID's value) and line 6 defines this as a temperature sensor. Still, Table 5 only represents the basic knowledge about this sensor.

Table 5 Triples extracted from the OWL file representing the beam from Fig. 15 and listing the knowledge about the instance IfcSensorType_60210

Fig. 15 IfcBeam equipped with three instances of IfcSensor

The SPARQL listing below (Table 6) allows querying the minimum data values recorded by all three sensors from Fig. 15. Results are displayed in Table 7.

Table 6 SPARQL query for minimum data values recorded by the sensors from Fig. 15
Table 7 Results provided to the query in Table 6.

Given the length and complexity of the SPARQL query listed in Table 6, the goal of the ANR McBIM ontology is to allow simpler queries to be formulated. Because it uses a more straightforward ontology structure, the ANR McBIM knowledge base only needs to be populated with instance data from IFC files (e.g. through ETL-based processes). The queries and the reasoning rules are to be implemented on top of the ANR McBIM knowledge, using solely concepts and relations from that ontology. Using the semantic links defined to other ontologies (such as ifcOWL), knowledge can then be inferred in terms of IFC elements.
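
To illustrate why a flatter structure shortens queries, consider a knowledge base in which each reading is attached to its sensor by a single relation. The question "what is the minimum value recorded by each sensor" then reduces to one grouping step, as sketched below. The relation name hasValue, the sensor identifiers, and the readings are hypothetical; they are not taken from the McBIM ontology or from Table 7.

```python
def min_value_per_sensor(triples):
    """Group (sensor, 'hasValue', value) triples by sensor and keep the minimum value."""
    minima = {}
    for sensor, predicate, value in triples:
        if predicate != "hasValue":
            continue  # ignore triples that do not carry a reading
        minima[sensor] = min(value, minima.get(sensor, value))
    return minima


# Hypothetical readings after an ETL step has flattened the IFC instance data.
readings = [
    ("sensor_A", "hasValue", 21.5),
    ("sensor_A", "hasValue", 19.0),
    ("sensor_B", "hasValue", 22.1),
    ("sensor_B", "type", "TemperatureSensor"),
    ("sensor_C", "hasValue", 18.4),
    ("sensor_C", "hasValue", 20.2),
]
minima = min_value_per_sensor(readings)
```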

The ANR McBIM project will seek to implement advanced knowledge analysis based on semantic rules, thus enabling reactive and proactive explainable decision-making. Reactive decision-making algorithms will produce notifications and alarms to the platform's end-users considering specified thresholds for parameters of interest. Proactive decision-making (predictive modeling) will help better interpret real-time construction progress/structural health monitoring results and provide recommendations on preventing accidents, shortcomings, and deviations or how to foresee discrepancies at the considered construction stages. Both reactive and proactive decision-making algorithms will be integrated into the digital platform to enhance end-users' visibility on the construction site's progress.
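
The reactive part of this decision-making can be sketched as a threshold check over incoming readings, where every alarm carries the reason it was raised. The parameter names, allowed ranges, and readings below are hypothetical and only serve to show the mechanism.

```python
def reactive_alarms(readings, thresholds):
    """Return one alarm, with an explanation, for each reading outside its allowed range."""
    alarms = []
    for reading in readings:
        low, high = thresholds[reading["parameter"]]
        value = reading["value"]
        if value < low or value > high:
            alarms.append({
                "sensor": reading["sensor"],
                "explanation": (
                    f"{reading['parameter']}={value} is outside "
                    f"the allowed range [{low}, {high}]"
                ),
            })
    return alarms


# Hypothetical thresholds for parameters of interest and a batch of readings.
thresholds = {"temperature": (5.0, 35.0), "moisture": (0.0, 80.0)}
readings = [
    {"sensor": "sensor_A", "parameter": "temperature", "value": 21.0},
    {"sensor": "sensor_B", "parameter": "temperature", "value": 41.5},
    {"sensor": "sensor_C", "parameter": "moisture", "value": 85.0},
]
alarms = reactive_alarms(readings, thresholds)
```

A proactive variant would apply the same kind of check to predicted rather than measured values, which is outside the scope of this sketch.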

Conclusion

In this paper, we presented the main research issues in the digital building twin field, as raised by the growing need to digitize construction processes. Approaches for sensing a building are becoming state-of-the-art. Still, for real-time surveillance and decision-making in construction processes covering building lifecycle phases other than facility management, several standards exist at the ISO level, but no standard implementation has been defined. This article further presented the contributions envisaged in the context of the ANR McBIM project and how they can push forward existing state-of-the-art approaches. The concept of "communicating concrete" and its applications in optimizing the sensor network lifetime have tangible benefits for building structural health monitoring, building demolition, and concrete recycling. The contributions regarding reactive and proactive decision-making are also trail-blazing and provide the level of confidence users need for "a reliable basis for decision-making" (as defined by ISO 19650-1:2018 [3]).

Further actions to be considered in this project address the development of the McBIM ontology and its alignment with all the ontologies listed in this article's section “Semantic Web Technologies for BIM and IoT”. Once the McBIM ontology and its related modules are specified, they will have to be implemented in the overall system environment developed by the French SME 360SmartConnect (Footnote 3). Expert semantic rules will have to be established on top of this ontology to implement the processes and the compliance-checking approach envisioned. The overall strategy will be tested in a real environment, and the results will be presented to the relevant standardization organizations.