1 Introduction

The digital twin concept involves integrating data from development, functional and industrial design with data from the physical domain along the whole lifecycle. This basic idea applies both to products and to production systems. It is grounded in the creation of a digital as-built model of a physical product [1] and in the integration of computation with physical processes, known as cyber-physical systems (CPS) [2]. A digital twin can be seen as the digital representation of a specific physical artifact, identified by its serial number, at a specific point in time.

In the context of production, designers and engineers define what must be achieved in the physical domain and how to accomplish it. They use different software applications along the development and design process to define what the product must be, and how, where, for how long and in what quantity it must be manufactured. The result is an extremely large amount of data related to the product itself, the processes to execute and the resources to use. Using a concept coined in the 1990s, such data could be framed as a kind of industrial digital mock-up [3]. Data and working instructions, which are defined as mandatory, are then transferred into the physical domain to manufacture physical products. The physical product exists and is therefore the true one, but the digital one is the mandatory one; the physical product is assessed by comparing it against the mandatory, i.e. digital, one. The current concept of the digital twin is a clear extension of the digital mock-up concept.

In the widest sense, a digital twin requires the implementation of an unbroken closed-loop data flow [4], where data acquired from test specimens, physical scale models, physical products and production systems are incorporated into a cyber-space to assist in predictive and decision-making processes [5, 6]. Data obtained from test specimens and physical scale models are used to perform engineering calculations, e.g. material property data [7] and performance data [8]. Data obtained from the physical product are used to validate and certify the product against engineering requirements [9]. In cyber-physical production systems (CPPS), configuration and run-time data are used to implement adaptive production and maintenance strategies that rely on system simulations [5]. In parallel to the digital twin concept, the digital thread concept was defined. The digital thread comprises linking data and information generated along the product lifecycle through a data-driven architecture of shared resources; in this way, data from the physical domain can be fed back and linked to design data (cyber domain) [6]. The assumption behind the unbroken closed-loop data flow is that incorporating “true data” from the physical domain into the cyber domain allows reducing uncertainty, improving predictions and designing adaptive products and systems.

From an engineering perspective, the implementation of the digital twin concept is an ongoing process that faces several challenges: data acquisition, gathering and processing of large data sets, data fusion, data standardization, uncertainty quantification, trustworthiness of data, data security, model interoperability, high-fidelity computational models for simulation and virtual testing at multiple scales, modeling of physical part variations, synchronization between the physical and the digital world to establish closed loops, and the complexity and cost of state-of-the-art IT infrastructure [5, 9,10,11,12].

The acquisition of data in the physical domain involves the measurement of physical magnitudes or quantitative properties. On many occasions, the measurement result is expressed as a twofold structure: the nominal value of the magnitude and the measurement unit. The nominal value is considered numerically exact and is propagated throughout successive processes [13]. However, the result of a measurement should be a threefold structure, where the uncertainty of the measurement is the third component. The uncertainty indicates how good a nominal value is [14]. The benefits derived from the digital twin implementation depend on incorporating “true data” from the physical domain into the digital domain. Therefore, it is relevant to know to what extent the data are true. In the process of data transfer from the physical to the cyber domain, the trustworthiness of data depends on several factors, mainly: integrity, reliability, security and quality [15]. The most widely used data quality dimensions are accuracy, completeness, currency and consistency. Within the accuracy dimension, the uncertainty of a measured magnitude is a significant contributor to the truthfulness of the data.
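As a minimal illustration of this threefold structure, the following Python sketch (with hypothetical names, not taken from any of the cited standards) represents a measurement result by its nominal value, its unit and its standard uncertainty:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MeasuredQuantity:
    """Threefold measurement result: nominal value, unit and uncertainty."""
    value: float          # nominal value, often (wrongly) treated as exact
    unit: str             # measurement unit, e.g. "mm"
    uncertainty: float    # standard uncertainty associated with the value

    def __str__(self) -> str:
        return f"({self.value} ± {self.uncertainty}) {self.unit}"

# Example: a measured shaft diameter
diameter = MeasuredQuantity(value=24.987, unit="mm", uncertainty=0.004)
print(diameter)  # (24.987 ± 0.004) mm
```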

This work provides a review of how uncertainty is considered in the context of the digital twin. The literature points out the relevance, the challenges and the lack of uncertainty quantification in the context of the digital twin [12, 13, 16, 17]. Section 2 reviews the definition of uncertainty. Section 3 points out some of the main challenges of its quantification. Section 4 deals with the representation of uncertainty: how it is modeled in standards related to product data representation, how it is considered in a data fusion context, and in particular in an engineering data fusion context represented by the case of the Collaborative Research Center (SFB 805) at TU Darmstadt. The communication ends with conclusions and future work.

2 Uncertainty

In general, uncertainty involves imperfect, imprecise or unknown data, information or knowledge. Two facts should be kept in mind. The first is that uncertainty always refers to something: a property, a measurement, a model, an assumption, a specific piece of data. The second is that the existence of uncertainty implies that a truth exists [18]. In engineering, the truth exists in the physical domain: products and production systems.

Based on its nature, uncertainty is usually classified into two main types: epistemic and aleatoric [19]. Epistemic uncertainty relates to a lack of knowledge. Aleatoric uncertainty relates to the variability of physical processes. Epistemic uncertainty can be introduced by poor assumptions, poor models and missing data. Aleatoric uncertainty is inherent to the non-deterministic nature of manufacturing and measurement processes [16]. Different approaches are proposed to evaluate epistemic uncertainty, e.g. evidence theory, possibility theory and interval analysis [19]. Evidence theory uses basic probability assignments to indicate the degree to which a piece of evidence supports a hypothesis. Aleatoric uncertainty is quantified by means of statistical methods using a probability distribution [19, 20]. Considering the design and development of complex multidisciplinary engineering systems, Thunnissen [21] proposed two additional uncertainty types: ambiguity and interaction. Ambiguity uncertainty relates to the use of imprecise terms and expressions by individuals when communicating a specification. Interaction uncertainty relates to situations where several disciplines, individuals and factors are involved but their interaction was not properly foreseen, or a disagreement arises.
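A minimal sketch of the two main types introduced above, assuming a normal distribution for the aleatoric part and a simple interval (interval analysis) for the epistemic part, could look as follows; names and values are purely illustrative:

```python
import random

# Aleatoric uncertainty: variability of a physical process,
# quantified with a probability distribution (here a normal distribution).
def sample_measured_length(nominal_mm: float = 100.0, sigma_mm: float = 0.02) -> float:
    return random.gauss(nominal_mm, sigma_mm)

# Epistemic uncertainty: lack of knowledge, here represented by an interval
# for a poorly known material property (no distribution is assumed).
youngs_modulus_gpa = (205.0, 215.0)  # lower and upper bound

samples = [sample_measured_length() for _ in range(10_000)]
mean = sum(samples) / len(samples)
print(f"aleatoric: mean length ≈ {mean:.3f} mm over {len(samples)} samples")
print(f"epistemic: E in [{youngs_modulus_gpa[0]}, {youngs_modulus_gpa[1]}] GPa")
```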

3 Quantification of Uncertainty

The need to quantify uncertainty is widely acknowledged; uncertainty affects both the collected data and the models created for a process of interest. Uncertainty quantification can be seen as the process of determining the uncertainties associated with model-based predictions. It involves identifying and characterizing all the key sources of uncertainty and propagating the input uncertainties through the model [19, 20]. A sensitivity analysis is also required to quantify the impact that each input has on the results provided by the model.
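The following sketch illustrates both steps with a hypothetical model and Monte Carlo sampling: input uncertainties are propagated through the model, and a simple one-at-a-time variation approximates the sensitivity of the output to each input (more rigorous sensitivity methods exist; this is only an illustration):

```python
import random
import statistics

# Hypothetical model: deflection of a beam-like feature as a function of
# load F, length L and stiffness k (purely illustrative relationship).
def model(F: float, L: float, k: float) -> float:
    return F * L**3 / k

# Input uncertainties expressed as normal distributions (mean, standard deviation)
inputs = {"F": (1000.0, 50.0), "L": (0.50, 0.002), "k": (2.0e5, 1.0e4)}

# Propagation: sample the inputs and evaluate the model
outputs = []
for _ in range(20_000):
    F = random.gauss(*inputs["F"])
    L = random.gauss(*inputs["L"])
    k = random.gauss(*inputs["k"])
    outputs.append(model(F, L, k))
print("output mean:", statistics.mean(outputs))
print("output std. dev.:", statistics.stdev(outputs))

# One-at-a-time sensitivity: vary one input by its std. dev., keep the rest nominal
nominal = {name: mu for name, (mu, _) in inputs.items()}
y0 = model(**nominal)
for name, (mu, sigma) in inputs.items():
    perturbed = dict(nominal, **{name: mu + sigma})
    print(f"sensitivity to {name}: {model(**perturbed) - y0:+.4e}")
```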

One of the main challenges of uncertainty quantification in the digital twin context is that different types of uncertainty are present when transferring data from the physical domain into the digital or cyber domain. Material mechanical properties derived from test specimens, with values considered as exact, are used as input in models to estimate and simulate the results of manufacturing processes. However, the physical part may exhibit a microstructure that is not uniform, and neither are its mechanical properties [7, 22]. When data derived from functional tests of scale models are merged with simulation data, numerical uncertainties arise from approximations to the geometry and boundary conditions, and from the physical models and their parameters [8]. Different sensing and measuring devices are used to monitor and measure parameters, both in products and in production systems. This situation requires fusing data from different sources. Quantifying the quality, or uncertainty, of the data provided by the different sensors and devices is critical, above all when data from different sources are conflicting [23]. The difficulty of quantifying the uncertainty of the true data gathered in the physical domain is an impediment to an appropriate fusion of physical and virtual systems, and a limitation for the implementation of the digital twin concept [5, 11,12,13].

When analyzing current practices at the physical level, the literature also shows examples of how uncertainty quantification is an issue. The complexity of calculating the measurement uncertainty derives from the significant number of factors that affect it, e.g. human-caused, environmental, logical, mechanical, methodological and numerical factors [24]. As an example, Abollado et al. [25] concluded that, at the shop-floor level, the quantification of uncertainty in industry requires support in the identification of key uncertainty drivers and the definition of best practices.

4 Representation of Uncertainty

The formal representation of concepts, in a computer-processable way, is addressed by creating data models, information models or ontologies. This section shows how the concept of uncertainty is being represented in the context of CPS and the digital twin. A formal representation depends on how the concept of uncertainty is defined.

In a CPS context, heterogeneous physical elements communicate via networking equipment and interact with applications and humans. Zhang et al. [17] consider uncertainty as a lack of confidence, due to the interactions between hardware, software and humans, and the need for them to be context aware. Uncertainty represents a state in which an agent does not have full confidence in a belief that it holds, and it can be represented by a measurement. Uncertainty is specialized into content, environment, geographical location, occurrence and time uncertainty, and it can follow a pattern or be random. A measurement can be represented by vagueness, probability and ambiguity.

In the context of virtual product development, Anderl et al. [13] and Heimrich et al. [26] consider uncertainty in properties related to products and processes. Uncertainty is modeled in the form of single values, intervals, fuzziness, and stochastic measures. They proposed an approach based on three layers: representation, presentation, and visualization. The representation layer comprises an ontology-based information model that supports an Uncertainty Mode and Effect Analysis (UMEA) process. The model extends concepts provided by the standard ISO 10303 for the exchange of product model data (STEP), e.g.: UncertaintyType, UncertaintyDriverProduction, and UncertaintyDriverUsage.

4.1 Uncertainty in the Standard ISO 10303 (STEP)

The standard ISO 10303 is an enabler of the digital thread and defines the basic concepts related to uncertainty in ISO 10303-41 [27] and ISO 10303-45 [28], which are part of the STEP Module and Resource Library (SMRL). The concepts are based on the definitions provided by the Guide to the Expression of Uncertainty in Measurement [20], which considers that a measurement is only complete when accompanied by a quantitative statement of its uncertainty. Part 41 defines the product_property_definition_schema, which specifies concepts to define product properties, and the measure_schema, which specifies concepts to describe physical quantities, e.g. measure_value, unit, si_unit, named_unit and measure_with_unit [27]. In Part 45, the material_property_definition_schema specifies concepts to define material properties, and the qualified_measure_schema specifies concepts to qualify quantities by their uncertainty, e.g. value_qualifier, uncertainty_qualifier, standard_uncertainty, qualitative_uncertainty, measure_qualification and measure_representation_item [28]. Additionally, basic concepts dealing with geometric shape variation tolerances are defined in ISO 10303-47, and ISO 10303-50 defines basic concepts dealing with mathematical structures and data related to the properties of a product.
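As an illustration of how these resource constructs combine, the following sketch mirrors, in simplified and hypothetical Python classes rather than the normative EXPRESS schemas, a measure_with_unit whose value is qualified by a standard_uncertainty through a measure_qualification:

```python
from dataclasses import dataclass, field
from typing import List

# Simplified stand-in for an entity of the measure_schema (ISO 10303-41)
@dataclass
class MeasureWithUnit:
    value_component: float   # the measure_value
    unit_component: str      # the unit, e.g. an si_unit such as "millimetre"

# Simplified stand-ins for entities of the qualified_measure_schema (ISO 10303-45)
@dataclass
class StandardUncertainty:        # a kind of uncertainty_qualifier
    uncertainty_value: float
    description: str = "standard uncertainty"

@dataclass
class MeasureQualification:       # associates qualifiers with a qualified measure
    name: str
    qualified_measure: MeasureWithUnit
    qualifiers: List[StandardUncertainty] = field(default_factory=list)

hole_diameter = MeasureWithUnit(value_component=8.002, unit_component="millimetre")
qualification = MeasureQualification(
    name="measured hole diameter",
    qualified_measure=hole_diameter,
    qualifiers=[StandardUncertainty(uncertainty_value=0.003)],
)
```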

In addition to the basic definitions, application protocols are developed to specify the information requirements of specific engineering application contexts. ISO 10303-235 is the application protocol (AP) related to engineering properties for product design and verification [29]. It covers the processes for the testing, measurement and approval of engineering properties, both of product samples and of the manufactured product itself. The normative model, designed as an Application Interpreted Model (AIM), comprises all the uncertainty-related concepts defined in the STEP SMRL. This AP is of interest when aiming to feed back data related to material properties derived from test specimens and data derived from functional tests of scale models.

ISO 10303-242 addresses managed model-based 3D engineering; its scope is limited to product data related to the design and manufacturing planning of mechanical parts and assemblies [30]. It substitutes the former application protocols AP 203 and AP 214. In addition to the previously mentioned schemas from ISO 10303-41 and ISO 10303-45, it adopts schemas defined in ISO 10303-59, related to the quality of product shape data. Among many other concepts within mechanical design, it comprises the definition of properties of parts and tools, data defining surface conditions, dimensional and geometrical tolerance data, quality criteria and inspection results of 3D product shape data. Successful implementation tests of AP 242 to exchange 3D models with the specification of tolerances are reported in the literature [31]. AP 242 comprises schemas such as product_data_quality_criteria, product_data_quality_definition and product_data_quality_inspection_result. Some of the main uncertainty-related concepts are QualitativeUncertainty, StandardUncertainty, ValueWithTolerances, ValueWithUnit, MeasuredQualification, MeasuredCharacteristic and PropertyValue. PropertyValue represents the value of a property and is an abstract supertype of StringValue, ValueList, ValueSet and ValueWithUnit. It has an optional attribute named qualifications, where the uncertainty or the precision of the value can be specified. A ValueWithUnit is an abstract supertype of NumericalValue, ValueRange and ValueWithTolerances. These concepts allow expressing the values of part properties for their propagation throughout different processes.
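The supertype structure described above can be rendered schematically as follows; the Python classes are only an illustration of the hierarchy, the normative definitions being those of AP 242 itself:

```python
from abc import ABC
from dataclasses import dataclass
from typing import Optional

class PropertyValue(ABC):
    """Abstract supertype of StringValue, ValueList, ValueSet and ValueWithUnit."""
    qualifications: Optional[str] = None  # optional: uncertainty or precision of the value

class ValueWithUnit(PropertyValue, ABC):
    """Abstract supertype of NumericalValue, ValueRange and ValueWithTolerances."""

@dataclass
class NumericalValue(ValueWithUnit):
    value: float
    unit: str

@dataclass
class ValueWithTolerances(ValueWithUnit):
    nominal: float
    lower_tolerance: float
    upper_tolerance: float
    unit: str

# A part property, e.g. a toleranced length, propagated to downstream processes
length = ValueWithTolerances(nominal=120.0, lower_tolerance=-0.05,
                             upper_tolerance=+0.05, unit="mm")
```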

Another relevant AP is AP 219, for dimensional inspection information exchange [32]. This AP could be considered an antecedent of the aimed closed-loop data flow from manufacturing (physical domain) to design (digital/cyber domain), where data and results from part inspection could be fed back to design. It comprises all the basic concepts defined in the STEP integrated generic resources commented on previously; however, its industrial implementation has not been reported. In that sense, it can be stated that achieving the digital thread concept only with STEP-related technology is not yet feasible. As an alternative, the Quality Information Framework (QIF) was defined to support manufacturing quality information.

4.2 Uncertainty in the Standard QIF

QIF is an ANSI standard that defines a set of information models to enable the closed-loop exchange of metrology data from product design, through inspection planning and inspection execution, to data analysis and results reporting. It defines several areas of quality information: measurement plans, measurement results, measurement rules, measurement resources and results analysis [33]. It constitutes a feed-forward and feedback quality data flow that supports the digital thread concept [34].

QIF is structured into six application area information models: Model-Based Design (MBD), Plans, Resources, Rules, Results and Statistics; it is specified as a set of XML schemas. It allows the definition of rules by means of the so-called QIF Rules model. Rules can be defined by each organization to encode its working practices, e.g. how a product should be measured based on measurement requirements [33]. QIF adopts the uncertainty definitions provided by the Guide to the Expression of Uncertainty in Measurement [20]; however, it treats uncertainty as an optional attribute. In instance files, QIF allows quantities to appear without an explicit unit and uncertainty for each value. Units can be specified in a QIF instance file by using the FileUnits element defined in the Units schema. This schema also defines the concepts SpecifiedDecimalType and MeasuredDecimalType. The former allows specifying a decimal value with two optional attributes, decimalPlaces and significantFigures. The latter extends SpecifiedDecimalType with two additional optional attributes, combinedUncertainty and meanError. While this approach may provide flexibility, it also means that different implementations of QIF are possible, which may ultimately turn into interoperability problems among different organizations.
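The following sketch, using Python's standard XML library, illustrates the idea of a measured value carrying the optional combinedUncertainty and meanError attributes; it is an illustrative fragment, not a schema-valid QIF instance:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a measured value whose optional attributes carry the
# combined uncertainty and mean error, in the spirit of QIF's MeasuredDecimalType.
value = ET.Element("Value")
value.text = "24.987"
value.set("significantFigures", "5")
value.set("combinedUncertainty", "0.004")
value.set("meanError", "0.001")

print(ET.tostring(value, encoding="unicode"))
# <Value significantFigures="5" combinedUncertainty="0.004" meanError="0.001">24.987</Value>
```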

4.3 Uncertainty Representation and Engineering Data Fusion

Data fusion deals with integrating multiple data streams into a single consistent information stream. The International Society of Information Fusion has created an ontology named URREF (Uncertainty Representation and Reasoning Evaluation Framework) to facilitate communication and data processing in the context of complex, distributed and operational information fusion systems [35]. It distinguishes between low-level information fusion, e.g. physics-based parameters, and high-level information fusion, e.g. the World Wide Web. The ontology comprises the classes UncertaintyNature, UncertaintyType, UncertaintyDerivation and UncertaintyModel. The uncertainty nature distinguishes between epistemic and aleatory. The uncertainty type refers to what makes information uncertain: ambiguity, incompleteness, vagueness, randomness and inconsistency. The uncertainty derivation refers to how the uncertainty can be assessed, subjectively or objectively. The uncertainty model class refers to mathematical theories for representing and reasoning with the uncertainty types.

As previously mentioned, different devices are used to monitor and measure physical magnitudes. Both complex products and CPPS can be seen as large-scale measurement platforms where multiple data streams need to be integrated [23].

In this context, two possible data-induced conflict scenarios may be considered. In the first, different devices provide different values for the same magnitude. In the second, different data sources use different semantics for the same magnitude. The first scenario could be addressed by introducing two parameters: the reliability degree of the source, which expresses its level of trust for delivering true data over time, and the credibility degree of the data, which depends on its confirmation by other sources and its conflict with other data [36]. The second scenario refers to industrial interoperability and can be addressed mainly in two ways: by harmonizing and implementing standards to transfer data, both from the cyber domain to the physical domain and from the physical domain to the cyber domain, and by automating the mapping and integration of different semantic definitions, e.g. ontologies.
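A minimal sketch of the first scenario, assuming a simple weighting built from the reliability degree of the source and the credibility degree of the data (the combination rules proposed in [36] may differ), is shown below:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Reading:
    source: str
    value: float        # reported value of the magnitude
    reliability: float  # trust in the source for delivering true data over time, 0..1
    credibility: float  # degree of confirmation by other sources, 0..1

def fuse(readings: List[Reading]) -> float:
    """Weighted fusion of conflicting values reported for the same magnitude."""
    weights = [r.reliability * r.credibility for r in readings]
    total = sum(weights)
    return sum(w * r.value for w, r in zip(weights, readings)) / total

# Two devices report conflicting values for the same spindle temperature
readings = [
    Reading("sensor_A", 71.8, reliability=0.95, credibility=0.9),
    Reading("sensor_B", 74.2, reliability=0.60, credibility=0.4),
]
print(f"fused value: {fuse(readings):.2f} °C")
```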

An example of the need to harmonize standards can be illustrated by the previously commented ways to represent uncertainty. When considering the data transfer from the physical domain to the cyber domain, additional standards to the ones commented on in the two prior subsections have to be considered. The standard IEC 62541, known as OPC UA, is a service-oriented framework that supports a client-server architecture to model and transport the data to be exchanged between industrial applications [37]. It provides the basis to define companion information models that specify the data content to exchange. MTConnect, although developed separately, has been harmonized with OPC UA and can be considered an example of such companion information models [38]. In itself, MTConnect is a standard to transfer data from manufacturing equipment to the cyber domain, and it is used as part of the digital thread implementation [34]. By means of an MTConnect Agent, a piece of manufacturing equipment can report data, e.g. an axis position. The data to be reported are modeled by means of the DataItem element, which has the mandatory attributes id, type and category, and the optional attributes name, subType, statistic, units, nativeUnits, nativeScale, coordinateSystem, compositionId, sampleRate, representation and significantDigits. The optional statistic attribute allows describing any type of statistical calculation that has been performed to provide the reported data value [39]. However, how to represent the uncertainty of the reported data remains unclear.
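As an illustration of this description, the following sketch models a DataItem as a hypothetical Python structure, not as the normative MTConnect XML schema; it also makes explicit that no dedicated attribute carries the uncertainty of the reported value:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataItem:
    # Mandatory attributes of an MTConnect DataItem
    id: str
    type: str
    category: str                         # SAMPLE, EVENT or CONDITION
    # A selection of the optional attributes
    name: Optional[str] = None
    subType: Optional[str] = None
    units: Optional[str] = None
    statistic: Optional[str] = None       # describes a statistical calculation, e.g. AVERAGE
    significantDigits: Optional[int] = None
    # Note: there is no dedicated attribute for the uncertainty of the reported value

x_position = DataItem(id="x1", type="POSITION", category="SAMPLE",
                      name="Xposition", subType="ACTUAL",
                      units="MILLIMETER", significantDigits=4)
```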

The automated integration of different semantic definitions obtained from different sources may lead to issues of incomplete and contradictory concepts. The resolution of these issues could be addressed by combining deductive and inductive reasoning mechanisms. In deductive reasoning, the truth of the conclusion is based on the truth of the data. In inductive reasoning, the truth of the conclusion cannot be asserted because of the uncertainty of the data; in this case, a certainty or truthfulness level of the conclusion could be provided. This approach is currently under research in the SFB 805 for Control of Uncertainty in Load-bearing Systems of Mechanical Engineering [40].

4.4 Uncertainty Representation in the Context of the SFB 805

Within the SFB 805, one of the main objectives is to manage uncertainty along the product lifecycle, from development to usage. The quantification of uncertainty is considered from three perspectives: data, model, and structural [41].

Data uncertainty relates to the values of parameters and measured magnitudes. Stochastic or aleatoric uncertainty is quantified by means of statistical methods using a probability distribution. When the probability distribution is unknown and only a nominal value is provided, it is a case of unobserved uncertainty.
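A brief sketch of this distinction, with hypothetical parameter values, could be:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class ParameterValue:
    nominal: float
    unit: str
    # Stochastic (aleatoric) uncertainty: a probability distribution is known,
    # summarised here by its type and parameters.
    distribution: Optional[Tuple[str, float, float]] = None  # e.g. ("normal", mean, std)

    @property
    def unobserved(self) -> bool:
        """Unobserved uncertainty: only the nominal value is available."""
        return self.distribution is None

clamping_force = ParameterValue(nominal=5.0, unit="kN",
                                distribution=("normal", 5.0, 0.2))
preload_torque = ParameterValue(nominal=40.0, unit="Nm")  # unobserved uncertainty
print(clamping_force.unobserved, preload_torque.unobserved)  # False True
```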

Model uncertainty derives from the input data and from the way the model was created. Models can be deductive, empirical or hybrid. Empirical models can be derived from observed or measured data and/or be based on the experience of an expert. In any case, evidence should be available to support the creation of the model. In empirical models it is necessary to identify the relationships between the different parameters, which may lead to three different situations: (1) the relationship is suspected, verified and validated; (2) the relationship is only suspected; (3) the relationship is unknown or ignored. This is a typical situation where epistemic and aleatoric uncertainties need to be quantified and integrated.

Structural uncertainty originates during the product development phase. It derives from the typical multiplicity of functional decompositions and design solutions. The generation of design solutions from requirements involves executing tasks related to requirements formalization, functional decomposition, selection of physical principles and structural decomposition. In general, the process is iterative and results in a combinatorial explosion of the solution space, which cannot be fully identified. The part of the solution space that remains unknown constitutes ignorance. Structural uncertainty comprises epistemic, ambiguity and interaction uncertainties.

5 Conclusions

A bidirectional semantic harmonization of the uncertainty representation in the standards used to transfer data, both from the cyber domain to the physical domain and vice versa, could facilitate the attainment of the digital twin or the CPPS. It also demands considering the data needed to accomplish measurement traceability. This approach requires a deeper analysis of the modeling capabilities and the existing data transfer mechanisms provided by STEP, QIF, MTConnect and OPC UA.

In addition to the ongoing work on the automated integration of different semantic definitions by combining deductive and inductive reasoning, the SFB 805 data fusion context provides a platform where part of the standard harmonization approach could be tested.