1 Introduction

Motivation

Handling the ever-growing amount of heterogeneous data and models within the production domain (Brauner et al. 2022) requires a precise and standardized understanding of their foundations, structure, forms of aggregation, and especially their use. Managing quantities of data in structured form requires (1) pre-aggregation and cleansing of data for analysis, (2) data that can be used within and across entire industrial ecosystems, (3) data that is organized and contextualized according to metamodels to become self-contained and explainable, (4) metamodels that are sufficiently precise and detailed, and thus finally (5) data that is usable for the derivation of algorithms and other forms of code, e.g., through model-based code generation. Our approach to handling these requirements is the concept of digital shadows (DSs). In our understanding, “a digital shadow is a set of contextual data traces and their aggregation and abstraction collected concerning a system for a specific purpose with respect to the original system” (Becker et al. 2021). These digital shadows can be used for sharing data or within software systems such as digital twins (DTs). For us, a digital twin is “a set of models of the system, a set of digital shadows and their aggregation and abstraction collected from a system, and a set of services that allow using the data and models purposefully with respect to the original system” (Dalibor et al. 2020). We create DTs as active software systems for observable objects and systems in the physical world that can be monitored, sensed, actuated, and controlled. However, due to the vast amounts of data that a virtual representative of a product, machine, or production line would require, a complete digital twin is not feasible (Brauner et al. 2022). Digital shadows provide the information about a system’s state and history needed for a specific purpose, and this information can be used within DTs. In contrast to DTs, however, they are a passive set of data (Brauner et al. 2022) and do not directly influence the physical system or objects (Kritzinger et al. 2018). To use digital shadows, we need a good understanding of the relevant concepts, the methods to use them, and how they can be applied in different domains.

Current research lacks detailed descriptions of what constitutes a DS and how to create and maintain one. Existing research covers only parts of the DS concept, e.g., metadata (Quix et al. 2016), data management and concepts from artificial intelligence (AI) (Liebenberg and Jarke 2020), data and data analytics (Ladj et al. 2021), or production processes and resources (Schuh et al. 2019). To address this, we suggested a first version of a conceptual model for digital shadows in Becker et al. (2021), which has to be evolved further to meet different use cases.

Research question

Within this chapter, we tackle the question of how to model data elements, static and dynamic relationships, and their physical resources within the Cluster of Excellence “Internet of Production” (IoP) in a cross-disciplinary, life cycle-spanning cooperation as a basis for knowledge management, while meeting technical, scientific-ethical, and legal framework conditions.

Contribution

The core of our solution to this question is the use of an adequate set of modeling techniques, transformations, and their integration with DSs. This chapter provides insight into the relevant concepts that constitute a DS and links them to their semantics as defined by appropriate metamodels. This includes related assets and their properties described by engineering models, and relevant data organized in data traces, data points, and metadata as a source for calculation and simulation for a specific purpose. We propose and discuss the second version of the digital shadow reference model (DSRM), which is based on Becker et al. (2021) and includes heterogeneous system configurations as well as engineering, calculation, and simulation models. To support interoperability, we discuss digital shadows in relation to base and domain ontologies. As designing a digital shadow data structure is challenging in practice, we propose a stepwise method to derive a digital shadow from existing data. Moreover, we provide usage evidence in the form of examples from (1) production planning in injection molding, (2) process control, (3) laser-based manufacturing, and (4) automated factory planning, discussing relevant digital shadow data models and semantics. Finally, we discuss data and model life cycles in relation to digital shadows and provide an outlook on open challenges for digital shadows and their use, especially within digital twins of cyber-physical production systems (CPPS).

Structure

This chapter is structured as follows: Sect. 4.2 discusses related work on digital shadows. Section 4.3 presents the digital shadow reference model, and Sect. 4.4 discusses the role of ontologies for DSs. We present four use cases and their use of digital shadows in Sect. 4.5 and propose a method to derive a digital shadow from the domain expert perspective in Sect. 4.6. Section 4.7 illustrates the need for an extended life cycle of production data and models. Section 4.8 gives an outlook on the use of digital shadows in digital twins before the last section concludes.

2 State of the Art

Digital shadows are clearly important concepts for data use and sharing in smart manufacturing (Brauner et al. 2022). Thus, several publications about digital shadows exist, and some of their most relevant concepts are already defined in other contexts.

Data and Metadata Management

Quix et al. (2016) describe their conceptual view on a metadata model suited to the extraction of metadata and its management in data lakes. Liebenberg and Jarke (2020) use generalizations of database view conceptualizations to model digital shadows regarding AI and data management aspects in the IoP. In contrast, our DSRM uses additional information and contextualizes data, for example, by specifying the source it originates from, or the asset, by connecting it to its engineering models.

Loucopoulos et al. (2019) present a conceptual metamodel for cyber-physical production systems, focusing in particular on aspects of information exchange and analysis at a broad requirements engineering level. However, it does not reach the level of standardization and detail presented in this chapter.

Modeling the Production Domain

Bravo et al. (2008) present a metamodel that allows describing business objects. For that, they enriched the metamodel presented in the PRODML (www.prodml.org) Reference Architecture with the elements resource, execution, planning, product, and client. In comparison to this approach, we look at the asset of the given physical system, its unique purpose, and the enrichment with metadata.

Ladj et al. (2021) describe a self-learning, continuously improving, knowledge-based digital shadow incorporating both a physical and a virtual system. The digital shadow manages data and knowledge. For that, they present a framework that applies data analytics to the database. The digital shadow uses the generated knowledge base to support the decision process. Their approach defines the purpose of the physical machine but lacks an extensible description of the elements contained in a digital shadow.

Bauernhansl et al. (2018) propose a concept for DSs of production. The core function of the DS is to provide the required information; the DS is considered a macro-service consisting of different micro-services, which guarantee to provide the right information at the right time and place. Necessary services include, e.g., control of the information flow, a record of user needs, and identification or compression of information. The development of digital shadows is described in four complexity levels: linkage of information, information flow control, information quality control, and feedback and self-optimization of the data and information basis. Bauernhansl et al. (2018) describe core functions, but no conceptual model for digital shadows is given.

Schuh et al. (2019) develop a data structure model for digital shadows in the order fulfillment process, from order acquisition to the work preparation of single and small batch productions. The digital shadow is a prerequisite for managing the organization’s knowledge, which can be utilized to solve use-case-specific tasks. The proposed data structure model describes relationships between relevant objects in the order fulfillment process, e.g., product specifications, manufacturing and assembly processes, and production resources. The data structure model digitally represents the real processes of single and small batch production, thus outlining the role of digital shadows for the use of knowledge management systems. However, the research is limited to the design of a concrete model for a specific use case, and no conceptual model is given. In addition, concepts such as purpose or data sources are not considered.

Parri et al. (2021) developed an architecture around digital twins using a model-driven approach to derive structural configurations from SysML Block Definition Diagrams. A digital twin instance contains a concrete configuration that captures macroscopic events such as a failure event. They also modeled a metamodel as a UML class diagram of the knowledge base, concentrating on the digital system of a company, and do not model elements of the digital shadow such as the contextualized data or the models used.

Modeling Further Domains

Croatti et al. (2020) describe a metamodel for agent-based DTs in healthcare. The main elements of the metamodel are the digital twin and the physical asset connected by a cyber-physical connection. The model describes the physical asset for the DT, which interacts directly with information sources. Their metamodel is described at a high level and does not consider data traces.

Mertens et al. (2021) discuss extending the concept of digital shadows to humans. This approach could be used for human-robot collaboration in manual work, decision support, and work organization, as well as human resource management. However, a concrete description of the relevant concepts is left to future work.

We took the insights gained from this related work and addressed the identified shortcomings in our digital shadow reference model. One point that particularly sets our approach apart is the consideration of connections to existing models, e.g., system, simulation, or AI models. Furthermore, we provide an additional semantic layer by pointing out the usage of ontologies along with DSs, give examples of digital shadows utilized in industrial use cases, and provide a methodology for building digital shadows from scratch.

3 The Digital Shadow Reference Model

In Becker et al. (2021), we suggested the first version of a conceptual model for digital shadows. This model comprises the ideas in our understanding of a digital shadow (see Sect. 4.1) for purposefully collecting, aggregating, and abstracting data from production, enriched with meta-information to enable fast decision-making. As additional use cases employing DSs were realized in the IoP and further exchanges regarding modeling best practices continued among the researchers, the conceptual model for DSs was refined. The current version is defined in the digital shadow reference model, which is shown in Fig. 4.1 as a UML class diagram (Rumpe 2016). The DSRM is intentionally underspecified and only models key elements; a digital shadow designer is free to extend the DSRM to achieve a DS model tailored to their use case. The digital shadow collects aggregated and reduced data from an original system with respect to a specific purpose. Therefore, the digital shadow knows the asset it references and the purpose it fulfills. It is composed of data traces that contain single data points. These data traces originate from a source (e.g., the original asset) and are enriched with additional metadata. The digital shadow uses models to describe the system, data calculations, or system simulations.

Fig. 4.1 The refined digital shadow reference model
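
To make the reference model tangible, the following minimal Python sketch mirrors our reading of the key classes and associations of Fig. 4.1. All class and attribute names are illustrative rather than normative; a digital shadow designer would extend them for a concrete use case.

```python
# Minimal, illustrative sketch of the DSRM core (Fig. 4.1); not a normative API.
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, List, Optional

@dataclass
class SystemProperty:
    name: str
    value: Any
    valid_from: datetime              # properties hold at a given point in time
    valid_to: Optional[datetime] = None

@dataclass
class Source:                         # specialized into Asset, Human, Measurement,
    name: str                         # Processing, and DigitalShadow
    properties: List[SystemProperty] = field(default_factory=list)

@dataclass
class Model:                          # Engineering, DataCalculation, or Simulation
    kind: str
    artifact: Any                     # e.g., a class diagram, Python script, Simulink file

@dataclass
class Asset(Source):                  # physical or virtual item of value (DIN ISO 55000)
    engineering_models: List[Model] = field(default_factory=list)

@dataclass
class DataPoint:
    value: Any                        # directly accessible value or a reference
    timestamp: datetime = None

@dataclass
class MetaData:
    created_at: datetime
    info: dict = field(default_factory=dict)   # e.g., creation process, structure

@dataclass
class DataTrace:
    source: Source                    # each trace originates from one source
    points: List[DataPoint] = field(default_factory=list)
    metadata: Optional[MetaData] = None

@dataclass
class DigitalShadow:
    purpose: str                      # exactly one specific purpose
    asset: Asset                      # the referenced original system
    traces: List[DataTrace] = field(default_factory=list)
    models: List[Model] = field(default_factory=list)
```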

Models play a key role in the reference model. According to Stachowiak (1973), models (1) comprise a mapping to an original object that the model represents, (2) are reduced to the relevant aspects and abstract from details of the original, and (3) have a pragmatism that lets them replace the original in certain scenarios. They add an abstract representation of knowledge about the underlying system, describe calculations for, e.g., data aggregation or simulation, and help to evaluate the asset’s data by providing more context. We initially distinguish between Engineering, Data Calculation, and Simulation models. Engineering models arise during the design time of the physical asset to plan the system’s structure and behavior. Proper modeling of the target system during design time allows for consistent, quality-ensured development. In model-driven software engineering, models can also be used to generate code from an abstracted view on domain knowledge. These models can then be reused in the digital shadow to provide additional information on manufacturing data and to outline the system. UML class diagrams describing the structural elements of an asset, together with object diagrams describing the asset’s layout and the Object Constraint Language (OCL) restricting possible values and layouts, can represent large parts of the asset. Ontologies and SysML BDD models (Weilkiens 2011) have similar expressiveness regarding the asset’s structure. Architecture modeling, such as MontiArc (Dalibor et al. 2020) or Focus, targets the system’s components in the large as well as their interconnections and communication. Behavior models, like state machines or MATLAB’s Simulink (Mathworks: Simulation and Model-Based Design https://www.mathworks.com/products/simulink.html), provide information about the asset’s expected behavior. The digital shadow uses Data Calculation models to formulate data aggregation; these models can have any form, from workflow models (Freund and Rücker 2012) through programmed Excel tables to complex optimization models as Python scripts. Simulation models serve a similar function but with a strong focus on living alongside the physical asset and predicting behavioral aspects. In that sense, Data Calculation models are meant to be executable by some engine, mainly a processing component, to compute new data traces. The results of Data Calculation and Simulation models can be used by one another and finally provide an abstracted view on the manufacturing data.
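
Building on the sketch above, the following hedged example shows what “executable by some engine” can mean in practice: a Processing step (itself a Source) applies a data calculation model to an existing trace and yields a new, aggregated trace. The function names are ours, not part of the DSRM.

```python
# Illustrative processing engine run: a calculation model computes a new trace.
from datetime import datetime
from statistics import mean

def run_calculation(calc, trace: DataTrace) -> DataTrace:
    """Apply a data calculation model to a trace; the Processing step acts as Source."""
    processing = Source(name=f"processing:{calc.__name__}")
    result = calc([p.value for p in trace.points])
    meta = MetaData(created_at=datetime.now(),
                    info={"input_trace": trace.source.name, "model": calc.__name__})
    return DataTrace(source=processing,
                     points=[DataPoint(value=result, timestamp=datetime.now())],
                     metadata=meta)

def mean_temperature(values):         # a trivially simple calculation model
    return mean(values)
```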

A digital shadow is designed to fulfill exactly one specific Purpose. The Purpose is the basis for data acquisition and information generation and varies from a human-formulated text string or selected filter criteria to semantically defined ontology terms. The detail level of the Purpose determines the range of the decision support, and subsequently the consideration of assets, models, and sources. Usually, the more generally a purpose is formulated, the more results a requester gets. For example, finding an optimal shop floor configuration in general leads to multiple results, since different objectives compete, e.g., costs and adherence to the jobs’ due dates. Moreover, different models can fulfill the same purpose and hence lead to different results for the same purpose. An example in the production domain is finding the optimal lot size of a purchase order, where different suitable models like Andler, Groff, or Silver-Meal exist (Vahrenkamp 2008).
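
For illustration, the Andler model is the German-literature form of the classic economic order quantity and can be computed in closed form, whereas Groff and Silver-Meal are period-based heuristics. A quick sketch of the Andler case (input values are made up):

```python
# Andler (economic order quantity) lot size: balances setup and holding costs.
from math import sqrt

def andler_lot_size(annual_demand: float, setup_cost: float,
                    unit_cost: float, holding_rate: float) -> float:
    return sqrt(2.0 * annual_demand * setup_cost / (unit_cost * holding_rate))

print(round(andler_lot_size(10000, 150.0, 2.0, 0.1)))  # ~3873 units, illustrative
```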

To fulfill its purpose, a DS gathers data from a Source that supplies at least one data point. Sources can be differentiated into Assets, manual inputs from Humans, automatic Measurements from sensors, data Processing (i.e., cleaning, aggregation, simulation, or calculation), and other digital shadows. Sources are further specified by SystemProperties, which define the attributes of a source at a given point in time.

An Asset “is an item, thing or entity that has potential or actual value to an organization” (DIN ISO 55000:2017-05 2017). Thus, an asset can be either physical or virtual. Typical physical assets on the shop floor are machines, equipment, material, or finished goods, while typical virtual assets comprise jobs, routings, bills of materials, machine settings, or drawings. Assets can be described by engineering models that provide their properties. The SystemProperties specify the assets’ technical capabilities and conditions at a point in time, like status or performance. The composition of multiple assets can lead to a new asset, e.g., the combination of a machine and a handling robot into a work center, and subsequently to new properties, e.g., the overall equipment efficiency for this work center. The prerequisite for automatic decision support through DSs is a digital representation of the assets’ properties in the digital world, e.g., within a software system like an enterprise resource planning system, which comprises the assets’ master data. Moreover, especially for transaction or process data (such as confirmations of jobs or current temperatures), Humans via plant data acquisition and Measurements via machine data acquisition realize the data flow. Because the gathered raw data is often not suitable for direct processing by models, Processing is introduced as an essential source for building and/or calculating the traces. Processing can use previously built data traces or process data gathered from other sources, i.e., by filtering, aggregation, simulation, or calculation. Thereby, digital shadows can also act as a source.

The digital shadow captures data derived from Sources as DataPoints gathered in contextualized DataTraces. Each DataTrace describes one procedure of the Source it is connected to and is a subset of the available data. Single DataPoints are used by the DS to provide information regarding the target purpose and are either directly accessible or may contain a reference to the original data. MetaData enriches DataTraces with additional information about their creation process, e.g., the creation time, or further structural knowledge. Combined with the SystemProperty’s validity in time, a DataTrace can be mapped to a specific system configuration of the referenced Asset or other Sources. This way, much more context is given to a DataTrace: its originating source can be the asset itself, processing steps on other data traces, or even other digital shadows; and MetaData together with the system configuration provides a clear context of its creation.

4 Ontologies in the Internet of Production

One of the challenges of interdisciplinary collaboration is making knowledge available and interpretable so that insights can be transferred and used in other domains. The digital shadow reference model presented in Sect. 4.3 helps to overcome these challenges by providing a unified model to communicate different data structures. However, it is not sufficient to ensure smooth communication between different domains. Without the necessary semantics, these data often lack interpretable context and tend to be rigid, without any possibility to adjust the level of detail. Ontologies are a useful tool from the Semantic Web, introduced in Berners-Lee et al. (2001), enabling accurate modeling of real-world objects at any granularity and the building of explorable knowledge bases by semantically linking data. At the same time, data is offered in machine-readable form and can be interpreted and further processed with the help of the corresponding ontologies. In addition, ontologies enable the creation of universally valid metamodels that can be flexibly applied to different use cases, as presented in Sect. 4.3. Ontologies and Semantic Web technologies have gained great importance in the IoP. Due to their flexible application possibilities, ontologies are not only used as a modeling tool but find practical application in many different research areas. In the context of the IoP, our previous work (Lipp and Schilling 2020) identified and evaluated the application domains depicted in Fig. 4.2. Applied methods include, but are not limited to, ontologies for modeling, the SPARQL Protocol and RDF Query Language (Prud’hommeaux and Seaborne 2013) for querying, the Shapes Constraint Language (SHACL) (Knublauch and Kontokostas 2017) for validating, and tool support for visualization and search. In the following, we present these five areas with references to application examples. Please refer to Sect. 4.5 for a more detailed presentation of selected use cases that build on ontologies and DSs.

Fig. 4.2 Main application areas of the Semantic Web in the IoP (Lipp and Schilling 2020)

(A) Data/service catalogs are a widely used application and an excellent way to structure any given data sources such as datasets, services, participants, or projects. They help users to keep track of a large number of different sources and enable them to find information based on different search terms. Open data portals for Germany (Geschäfts- und Koordinierungsstelle GovData: https://govdata.de) or Europe (Publications Office of the European Union: https://data.europa.eu) are prominent examples of how data catalogs are used. A catalog is usually independent of the data itself and applicable to any data management system. In the context of the Internet of Production, data catalogs can be used to make data available between domains. Catalog ontologies such as the Data Catalog Vocabulary (Albertoni et al. 2019) are used as a basis for uniform communication and thus to improve interoperability. In addition, unique and persistent identifiers simplify the automatic processing of sources.
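
As a hedged sketch of such a catalog entry, the snippet below uses the rdflib library to register an illustrative production dataset with the Data Catalog Vocabulary; all IRIs and property values are made up.

```python
# Illustrative DCAT catalog entry for a production dataset (IRIs are made up).
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCAT, DCTERMS, RDF

g = Graph()
ds = URIRef("https://example.org/iop/datasets/cavity-pressure-m1")

g.add((ds, RDF.type, DCAT.Dataset))
g.add((ds, DCTERMS.title, Literal("Cavity pressure traces, machine M1")))
g.add((ds, DCTERMS.description, Literal("Per-cycle cavity pressure measurements")))
g.add((ds, DCAT.keyword, Literal("injection molding")))

print(g.serialize(format="turtle"))   # machine-readable entry for the catalog
```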

(B) Integrating domains enables humans to understand data and enhances interoperability on the machine level. It is common practice to reuse existing ontologies to create shared knowledge among different domains. However, it might still be necessary to extend ontologies or create new ones to optimally serve the respective use case. Suggested tools include the widespread, fully fledged ontology editor Protégé (Noy et al. 2001) or our quick prototyping tool Neologism (Lipp et al. 2021), which also allow combining multiple ontologies. One can, for instance, align concepts within one or multiple ontologies using constructs like sameAs/broader/narrower or apply more semantically sophisticated methods (Lipp et al. 2020a).

Ontology-Based Data Access (OBDA) enables (C) Database Access using semantic tools. By mapping ontology concepts to the terminologies and relations of database schemas, domain experts are enabled to access data relatively easily and without further database knowledge.

(D) While Reasoning infers new knowledge from existing information, Consistency checks and validation in general enable safe system interfaces and predictable data processing. Lightweight Semantic Web Services for Units (LISSU) (Lipp et al. 2021), for instance, used in Sect. 4.5.3, provides validation for service-oriented architectures, and Theissen-Lipp et al. (2022) extend this approach to an integrated SHACL-based solution.
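
A minimal validation sketch with the pySHACL library conveys the idea; the shapes and data below are illustrative and are not the LISSU rules:

```python
# Minimal SHACL validation sketch (illustrative shapes, not the LISSU implementation).
from rdflib import Graph
from pyshacl import validate

shapes = Graph().parse(format="turtle", data="""
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <https://example.org/iop#> .
ex:SensorShape a sh:NodeShape ;
    sh:targetClass ex:TemperatureSensor ;
    sh:property [ sh:path ex:unit ;
                  sh:hasValue "Celsius" ;    # the interface expects Celsius
                  sh:minCount 1 ] .
""")

data = Graph().parse(format="turtle", data="""
@prefix ex: <https://example.org/iop#> .
ex:sensor1 a ex:TemperatureSensor ; ex:unit "Fahrenheit" .
""")

conforms, _, report = validate(data, shacl_graph=shapes)
print(conforms)   # False: the declared unit violates the expected interface
```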

(E) Data aggregation combines arbitrary, heterogeneous environments into a joint source of information and enables advanced semantic analysis based on views with different abstraction, focus, or interpretation. This approach is similar to the concept of DSs and was applied in Lipp et al. (2020) to optimize data collection from manufacturing systems, in Lipp et al. (2020b) to aggregate collected data and metadata into a data lake, and in Sect. 4.5.1 to support decision-making processes.

The abovementioned technologies maximize their benefits through close communication between all relevant stakeholders, which fosters common understanding and interoperability. The IoP, for example, maintains an Ontology Expert Group, where experts from different domains and use cases collaborate on ontologies, tooling, and best practices. This completes the layers of ontologies’ benefits, from high-level conceptual human understanding to deep technical integration of low-level machine interfaces. The advantages include globally unique identifiers, improved (re)use and maintainability of both information and domain knowledge, and finally dramatically improved analysis results through the semantic integration of cross-domain solutions.

In summary, ontologies have a wide range of applications. The different main application areas provide new approaches to overcoming existing problems in the industry. By using semantically interpretable models like the digital shadow from Sect. 4.3, not only can cross-domain communication be improved, but a common knowledge base can also be created by integrating different domains, thus supporting decision processes across domain boundaries. In the following section, we show how ontologies and semantic tools can help to overcome existing problems in different use cases.

5 Data, Models, and Semantics in Selected Use Cases

To test the applicability of the DSRM, we analyzed data, models, and semantics of four use cases at different levels of detail, namely, production planning and process control in injection molding, adaptable laser-based manufacturing, and automated factory planning. We present relevant concepts, show how digital shadows can be used, and discuss their potentials and challenges.

5.1 Production Planning in Injection Molding

Digital shadows are able to support decision-makers within production planning and control (PPC) in their daily business. We demonstrate different challenges within the domain of PPC in injection molding and how semantics can address those challenges.

PPC facilitates all organizational steps for manufacturing a product, starting from procuring raw materials and ending with the shipment of the finished goods to the customer. Production planning tasks comprise a long-term horizon, i.e., weeks to months. A typical task within production planning is the scheduling of the manufacturer’s resources under consideration of due dates, costs, energy consumption, and much more. In contrast, production control tasks are short-term; the regarded time horizon ranges from seconds to days. One core task in production control is the reaction to disruptions or changes within the production (DIN EN 62264-1:2014-07 2014; Jacobs et al. 2018).

Injection molding is a widely used primary shaping production process with a large variety of possible finished parts to be manufactured. First, the raw plastic granulate is plasticized. Then, the injection molding machine injects the required melt into a mold that comprises one or more cavities representing the negatives of the manufactured part. After its solidification, the machine ejects the part, which is then, in most cases, ready for post-processing or dispatching (Rosato et al. 2000).

Figure 4.3 schematically illustrates the elements, and their connections, that constitute PPC decision support in the injection molding domain.

Fig. 4.3 The digital shadow provides decision support for production planning and control purposes in the injection molding domain under consideration of semantics

An operator needs a decision for a complex planning or controlling task that is compliant with a specific purpose. In the first step, the operator selects, for example, the scope (e.g., machines, time horizons, articles) and the optimization criteria. Based on these criteria, the digital shadow selects different but suitable models. The prerequisite for this is a classification of the models, e.g., based on the optimization objective (e.g., minimization of tardy jobs), in the form of a model catalog. Furthermore, the model catalog specifies the required raw data for each model, e.g., the jobs’ planned start dates or quantities. Enabling the transformation from information to data and vice versa under consideration of the right models is coupled with diverse challenges. On the one hand, digital shadows must know which models fit the given purpose and the source of the required data. On the other hand, PPC tasks are often complex, as multiple assets, objectives, and constraints interact in a way that optimizing one objective often causes a trade-off with another objective. A prominent example is the “dilemma of production control,” where increasing the (machine) utilization leads to higher work-in-progress inventory (Wiendahl and v Wedemeyer 1993). Hence, PPC is often subject to multiple concurrent objectives leading to a set of optimal solutions, known as the Pareto front, instead of one optimal solution. In addition, finding an optimal solution in a short time for a given objective might not be possible due to the vast number of permutations, e.g., for building a schedule. In this case, the operator desires the provision of a suitable solution (Hopp and Spearman 2008).
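
A minimal sketch of such a model catalog (entry names and attributes are ours) records, per model, the objective class and the raw data it requires, so a DS can select suitable candidates:

```python
# Illustrative model catalog: objective classification plus required raw data.
model_catalog = {
    "min_tardy_jobs": {
        "objective": "minimize number of tardy jobs",
        "required_data": ["planned_start_date", "due_date", "quantity"],
    },
    "min_setup_time": {
        "objective": "minimize total setup time",
        "required_data": ["article_family", "setup_matrix"],
    },
}

def suitable_models(catalog: dict, objective_keyword: str) -> list:
    """Return all catalog entries whose objective matches the operator's criteria."""
    return [name for name, entry in catalog.items()
            if objective_keyword in entry["objective"]]

print(suitable_models(model_catalog, "tardy"))  # ['min_tardy_jobs']
```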

Realizing autonomous processing of models requires a digital data representation. This digital representation is enabled through Asset Administration Shells (AASs), as they comprise all relevant properties for the integration of assets into the virtual world. The AAS can either store the data directly or provide the endpoint for properties located in external databases, e.g., enterprise or shop floor management systems (ERP and MES), manually maintained Excel sheets, or other specialized systems, e.g., warehouse management. Thus, the AAS acts as a single source of truth. Consequently, the operator relies on tools that provide data-based decisions in an adequate time from different data sources.

An ontology establishes the relations between the single AASs. Ontologies and AASs encourage a semantic enrichment of properties with meta-information, e.g., by adding the unit, synonyms, or a description of the properties’ meaning, corresponding to IEC 61360. Besides, introducing internationalized resource identifiers (IRIs) for each property ensures unique identification. If all elements (databases, AASs, ontologies, DSs) use IRIs, a modular composition of digital shadows can be realized, since the IRIs connect the required data from the model catalog with the AASs and the corresponding databases.
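
To illustrate this, a single AAS-style property enriched with an IRI and IEC 61360-style meta-information might look like the following sketch; all identifiers and values are made up:

```python
# Illustrative semantically enriched AAS-style property (identifiers are made up).
cycle_time_property = {
    "idShort": "cycleTime",
    "semanticId": "https://example.org/iop/vocab#CycleTime",  # unique IRI
    "value": 32.5,
    "valueType": "double",
    "unit": "s",                                   # IEC 61360-style meta-information
    "definition": "Duration of one injection molding cycle",
    "synonyms": ["cycle duration"],
}
# A DS resolves the semanticId against the ontology to locate the same property
# in the model catalog, in other AASs, or in the underlying databases.
```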

In summary, digital shadows, in combination with semantic tools like AASs and ontologies, help to master the high complexity of PPC and provide data-based decision support to operators. Prospectively, identifying data and models via IRIs enables a modular integration of DSs that is independent of the underlying databases or software systems.

5.2 Process Control in Injection Molding

Besides digital shadows for PPC at the shop floor level, DSs also offer additional value at the control level for many applications in production. In a plastics processing company, digital shadows can be used for the monitoring and control of injection molding machines.

Disturbing influences, such as fluctuations in environmental temperature and humidity or changes in material batch composition, influence the injection molding process, which leads to cyclic and long-term variations in part quality (Kazmer and Westerdale 2009). Therefore, it is necessary to continuously adjust the machine settings in order to ensure high reproducibility and avoid rejects.

Process data from the cavity, as the location of molded part creation, correlates highly with part quality. The cavity pressure is referred to as the fingerprint of the injection molding cycle and has great potential as a control variable for high process stability (Yang et al. 2016). For instance, a digital shadow based on model-based predictive cavity pressure control can be used to compensate for process disturbances. A predefined cavity pressure reference is realized by adjusting the screw velocity, whereas the reference is adapted when process disturbances are detected (Stemmler et al. 2019; Hornberg et al. 2021; Vukovic et al. 2022). The main DS concepts and relations of this process control method are shown in Fig. 4.4.

Fig. 4.4 Model of a digital shadow for process control in injection molding

As its purpose, the digital shadow should realize the given cavity pressure curve with high control accuracy. The needed data originates from the main asset, the injection molding process, which is divided into the sub-assets mold, machine, material, and human. All process data needed for this DS purpose is collected in the process data trace and divided into process data points and metadata. The process data points are updated for each control timestamp in real time and provide actual process data. The metadata, which contains information about mold, material, and machine settings, consists of constant values that describe the injection molding process and are needed to fulfill the purpose. The whole data trace is used for the process adaptation calculation. Cooling and flow calculations are performed until the injection process ends. A digital twin operates as an external machine control and directly adapts the screw velocity to realize the DS’s purpose.
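
A strongly simplified sketch conveys the control idea: a plain proportional correction stands in here for the actual model-based predictive controller, and all values are illustrative.

```python
# Simplified stand-in for the model-based predictive cavity pressure controller.
def adapt_screw_velocity(v: float, p_measured: float,
                         p_reference: float, gain: float = 0.01) -> float:
    """Adjust the screw velocity so the measured cavity pressure tracks the reference."""
    return v + gain * (p_reference - p_measured)

v = 50.0  # mm/s, illustrative initial screw velocity
for p_ref, p_meas in [(250.0, 240.0), (260.0, 255.0), (270.0, 268.0)]:  # bar
    v = adapt_screw_velocity(v, p_meas, p_ref)
    # each control timestamp also appends a DataPoint to the process data trace
print(round(v, 3))  # 50.17
```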

Implementation as a digital shadow has several advantages for further usage of the DS in other processes. All data needed for the digital shadow is described in the sub-assets, so the requirements for executing the DS are given. It follows that the digital shadow can be reused for other processes if all input data is given. This reusability covers changes of the cavity pressure reference curve, machine, material, and mold (produced part). Additionally, changes in the control algorithm can easily be implemented by exchanging the used model.

The data structure of the assets can be reused as well. Further process information can be added to the assets to increase the usability for a wide range of use cases, such as quality prediction based on actual process data. The data traces of each digital shadow consist only of the data needed for the DS’s purpose. For the implementation, a classifier can be used that records whether the data has to be considered for the DS data trace as a process data point or as metadata; otherwise, data is saved as system properties. This increases usability, as the operator only has to provide the data needed. Besides that, it is possible to trace which data and models were used to fulfill the DS’s purpose, thus ensuring traceability.

In summary, the application of digital shadows at the control level was illustrated by the example of process control of an injection molding machine. The main advantages of digital shadows are (re)usability and traceability, as data structures and models can be applied to other use cases with reduced effort and high transparency.

5.3 Adaptable Layerwise Laser-Based Manufacturing

One of the largest advantages of laser processes is that laser light is weightless and contact-free. These properties make laser light extremely attractive for production systems, since such systems are typically not subject to wear, can deposit the exact amount of energy needed at a precise time and place, and can be reprogrammed for new purposes on demand, making laser processing a perfect digital process (Poprawe et al. 2018). This combination of properties makes laser-based manufacturing systems like Ultra Short Pulse (USP) ablation or Laser Powder Bed Fusion (LPBF) very versatile and flexible manufacturing technologies that allow the reconfiguration of production on demand.

These manufacturing technologies typically work in layers. In LPBF 3D printing, e.g., a 3D object is formed by selectively melting one layer of powder on top of another. This production process is conceptually very similar to USP, where material is removed instead of added, forming a 3D negative, e.g., for surface finishing. This layerwise production benefits tremendously from the introduction of DSs, since these discontinuous processes have the inherent feature of having to stop between layers. In LPBF, e.g., this stop is needed to apply new powder. During this time, a digital shadow can be used that, for example, evaluates the used process parameters during runtime by analyzing the produced surface roughness through camera pictures (Knaak et al. 2021). We can therefore build a digital shadow of a 3D printed product by creating a manufacturing cycle consisting of the following repeating steps:

  • Melt powder to produce a layer and collect in-process DataPoints like thermal emission

  • Take a picture of the produced layer, forming another DataPoint

  • Analyze the acquired DataPoints with a model to generate new production parameters

  • Save the DataPoints to the existing DataTrace of the product

  • Apply new powder

Reiterating through this process forms a digital 3D representation of the 3D printed product, which in turn forms the basis of a DigitalShadow. Similar sensor and data acquisition concepts, which allow the evaluation of the process quality during USP ablation, have been developed by Zuric et al. and allow for a similar production cycle during USP (Zuric et al. 2019). Figure 4.5 shows an example process for USP ablation. The plasma that is ignited during the process and shown in the picture can be monitored in a spatially resolved manner in order to estimate the product quality. These digital shadows for process quality are typically designed for one specific manufacturing system from one individual vendor.
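
The cycle above can be sketched as a simple loop; every machine and sensor function below is a stub standing in for vendor-specific interfaces.

```python
# Illustrative LPBF cycle building a layerwise digital shadow (all functions are stubs).
def melt_powder_layer(layer, params):   # returns in-process data, e.g., thermal emission
    return {"layer": layer, "emission": 0.0}

def take_layer_picture(layer):          # returns a camera picture of the produced layer
    return {"layer": layer, "image": None}

def analyze_layer(picture, emission, params):  # model derives new production parameters
    return params

def apply_new_powder():                 # the machine stop between layers
    pass

def print_part(num_layers, params):
    trace = []                          # the product's growing data trace
    for layer in range(num_layers):
        emission = melt_powder_layer(layer, params)
        picture = take_layer_picture(layer)
        params = analyze_layer(picture, emission, params)
        trace.extend([emission, picture])
        apply_new_powder()
    return trace                        # layerwise 3D representation of the part
```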

Fig. 4.5 Process emission in USP ablation can be monitored in order to form a 3D digital shadow of the produced product

Especially in laser processing, these digital shadows could greatly benefit from domain-wide usage not limited to a vendor-specific ecosystem. However, in order to move a digital shadow from one machine to another, it is vital to validate the data these DSs receive. We designed a microservice-based USP manufacturing system that allows the plug-and-play movement of DSs. In this system, every sensor and actuator, as well as every analysis algorithm, can be changed during execution and on demand, making it possible to reorder DSs running on the manufacturing system.

However, changing from one vendor-specific sensor to another can have a large influence on a digital shadow that analyzes this specific data trace. A single changed temperature sensor that sends temperature as integer values but now reads Fahrenheit instead of Celsius could lead to manufacturing errors or damage to the machine, depending on the usage of the DS outcome. In order to minimize this effect of a changed hardware setup, we proposed LISSU (Lipp et al. 2021), which allows the description of sensor-produced data in order to validate the digital shadows consuming these data streams. This bottom-up approach validates the communication between two parties, e.g., a digital shadow and an actuator, before the communication takes place and verifies whether both parties interpret the incoming value as Celsius. In case of a mismatch, it either converts the data or disables the communication. By checking not only syntactical but also semantic correctness through the use of high-level semantics, configuration file errors can be reduced.
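
The following sketch is our simplification of this idea, not the LISSU implementation: the declared unit is validated before the value enters the DS, and the data is converted or the communication refused on a mismatch.

```python
# Simplified unit handshake in the spirit of LISSU (not the actual implementation).
def to_celsius(value: float, declared_unit: str) -> float:
    """Validate the sender's declared unit; convert or refuse the communication."""
    if declared_unit == "Celsius":
        return value
    if declared_unit == "Fahrenheit":
        return (value - 32.0) * 5.0 / 9.0        # convert on a known mismatch
    raise ValueError(f"unit '{declared_unit}' not understood; communication disabled")

print(to_celsius(212.0, "Fahrenheit"))  # 100.0, safe to feed into the DS
```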

5.4 Automated Factory Planning

The task of factory planning is to design production systems that utilize their technological and organizational capabilities to process goods and deliver products to the customer. In today’s dynamic market environment, changing requirements demand an increasing frequency of re-planning, even for adaptable factories. In addition to reduced planning times, further cost pressure in the markets also leads to more complex and iterative planning tasks. To meet these challenges, the application of digital factory methods supports the planning process with design and simulation tools. However, heterogeneous sources of factory and planning information hinder the digital interconnectivity necessary to leverage the advantages of data-based and automated approaches (Schuh et al. 2011; Burggräf et al. 2021b).

To achieve interoperability of these data sources, future factory planning needs semantic information modeling as a foundation (Kádár et al. 2013; Büscher et al. 2016). An integrative information system forms the digital representation of the factory by combining different factory asset data in its knowledge base, as shown in Fig. 4.6. The knowledge base contains general factory information in an engineering model, e.g., production quantities and machine dimensions, and metadata such as alternative configurations in planning scenarios. The semantic structure of this knowledge is set by a factory planning ontology as a conceptual model.

Fig. 4.6 DSs in factory planning allow for semantic information processing by linking to the ontology-based factory information system

While the factory information system constitutes the basis for a digital factory twin, DSs offer an interface for machine-interpretable data exchange. For updated information, for example, from manual planning efforts, real-time updates from production feedback systems, or newly available asset data, a digital shadow imports the relevant updated data as data traces into the information system. The semantically correct data integration is supported because the DSs use the factory planning ontology as their Model. In another use case, implicit planning data is automatically checked with validation rules (Burggräf et al. 2021a) defined in the specific DS’s DataCalculation model. The relevant factory information is queried from the information system. The third use case describes a planning agent whose Purpose is automated factory planning in specific planning tasks. An example is the calculation of machine utilization based on production quantities and resource capacities in the dimensioning of a production system. Here, the newly reasoned information, i.e., the machine utilization, is imported from its Processing Source to the knowledge base (Schäfer 2022). Conclusively, DSs in factory planning contain information that is specifically selected for the use-case-specific context and requirements.
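
As a hedged illustration of the third use case, machine utilization can be derived from production quantities and resource capacities roughly as follows; the formula is simplified and the numbers are made up:

```python
# Simplified utilization calculation as reasoned by the planning agent.
def machine_utilization(quantity: int, cycle_time_h: float, capacity_h: float) -> float:
    """Required production time relative to the available capacity in a period."""
    return (quantity * cycle_time_h) / capacity_h

# Example: 1200 parts at 0.05 h/part on a machine with 80 h weekly capacity.
print(f"{machine_utilization(1200, 0.05, 80.0):.0%}")  # 75%
```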

These use case examples demonstrate how digital shadows are essential for connecting data sources for automated factory planning. Complementing the DSs with calculation models, semantic information modeling of relevant factory information offers digital decision support to planning experts. The presented concept of the factory information system will be extended to further use cases in the future.

6 A Method to Design Digital Shadows

A digital shadow aims to support the user in a decision-making process; thus, it needs to provide all relevant information to support informed decisions. Up to now, research has lacked a method for realizing digital shadows with real data in practice. Based on our experiences from the four use cases in Sect. 4.5, we have developed a method that enables domain experts to describe digital shadows to the extent that software engineers or domain experts can realize them in software systems. The result of this requirements engineering process for a digital shadow is a set of descriptions from the domain expert perspective, which are still independent of the actual implementation.

To make an informed decision, we can use digital shadows. Such a decision is related to a problem that has to be solved, data related to this problem and its relationships captured in engineering models, one or more solutions with data calculation and/or simulation models leading to them, and the goal and purpose of the solutions. These parts constitute a digital shadow (see Sect. 4.3).

The following method can be applied by domain experts, e.g., product designers, factory planners, or production planners and controllers. Our method (see Fig. 4.7) includes the following steps: (1) describe the problem, (2) analyze the assets and their models, (3) use or build data calculation and simulation models, and (4) identify needed data traces, data points, and metadata. There are two ways to follow this method: from a domain expert and asset-centric perspective with steps 1–4, or in a data-driven way with step 4 before steps 2 and 3.

Fig. 4.7 The method to design digital shadows as a domain expert

(1) Describe the problem

In the first step, we identify a decision problem of the domain under consideration. The problem is described by the scope of consideration, the possible solution scope, and the goal of the decision, which is reflected in the purpose of the digital shadow.

The purpose specifies the goal of the digital shadow; there exist different types of purposes, e.g., the improvement or optimization of objectives, or information about critical failures. The identified purpose serves as a basis for deriving the necessary information requirements. If the purpose is the optimization of a specific process step, the user needs information about the objectives to be achieved and the necessary parameter adjustments. When it comes to identifying critical failures, the user needs information about the failures and their effects to prevent future failure occurrences. The user needs to define the relevant assessment dimensions used to evaluate the solutions offered, e.g., logistical target values and costs.

(2) Analyze the asset and its models

Each decision is related to one or more assets and the models and data available about them. As this data might be distributed among different systems and databases, models provide domain experts with an abstract view on this data and allow them an easy selection of relevant aspects of the asset. By selecting relevant aspects of assets within existing models, the later realization in software already provides a connection between the information requirements for a decision and the data sources.

(3) Use or build data calculation and simulation models

To characterize a data calculation or simulation model, the domain expert must describe the needed input data, the calculation or simulation specification, the output data, and further properties. The model’s input data is described by the required data sets (attributes), data structure, and data quality. The calculation or simulation specification describes how the data should be aggregated, which formulas should be used for computation, or what the simulation steps should look like. How much can be specified here depends on the domain expert’s knowledge. The information output is characterized and described in terms of accuracy. The domain expert can provide additional properties as meta-information about the calculation or simulation specification, e.g., whether the calculation should work online or offline, how accurate and precise the results need to be, and the requirements on interpretability and explainability, e.g., of machine learning models, as well as adaptability and robustness needs from the domain expert perspective.
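
Such a characterization can be captured in a lightweight, implementation-independent descriptor. The sketch below shows what a domain expert might record in this step; all field names and values are illustrative:

```python
# Illustrative requirements descriptor for a data calculation model (step 3).
calc_model_spec = {
    "input": {
        "attributes": ["cavity_pressure", "melt_temperature"],
        "structure": "time series, one trace per cycle",
        "quality": "no gaps longer than 10 ms",
    },
    "specification": "mean pressure per cycle, 3-sigma outliers removed",
    "output": {"attribute": "mean_cavity_pressure", "accuracy": "±1 bar"},
    "properties": {                      # meta-information from the domain expert
        "execution": "online",
        "interpretability": "formula must be explainable to operators",
    },
}
```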

(4) Identify needed data traces, data points, and metadata

In the next step, the domain users identify the relevant data traces of the system, including its data points and metadata, by providing some examples. These examples can be used by domain experts to validate the input needed for data calculations and simulations. Specific data points at different aggregation levels might be required for each decision. Their aggregation has to be defined as a data calculation in step (3).

When we apply this method to a specific use case, the result is a collection of requirements for the creation of the digital shadow. In the next step, we have to move from the requirements specification in the problem space to the solution space and the realization in a software system. Within that step, preparation might be needed, e.g., if data points were specified that do not yet exist in databases. Based on these specifications, e.g., the best-fitting model can be selected, or a new model has to be created in cooperation with the domain expert. Implementation details have to be specified, e.g., the concrete locations of data, or whether data type conversions are needed.

7 Data and Model Life Cycles in the IoP

Now that we know better how to design digital shadows, the next step is to consider their impact on the product life cycle. The data and models forming a digital shadow, e.g., from Sect. 4.5, can be used throughout the life cycle of a product, namely, Development, Production, and Usage.

Up to now, data and models have tended to stay within these phases (see Fig. 4.8, left) and are often not even interconnected within each phase. Data is stored in data silos and not shared over the lifetime of a product (Brauner et al. 2022). To enable worldwide production labs, we have to extend the life cycle of data and models in various dimensions (see Fig. 4.8, right):

  1. Sharing of data can be realized by using digital shadows, which encapsulate relevant data parts, link them to models, and give the stakeholders full control over their sharing when realizing privacy-ensuring mechanisms.

  2. Models and data need to be connected for more powerful analyses and real-time monitoring of processes. This can be realized in separate models or ontologies or incorporated within a digital shadow.

  3. Models should be reusable within the same phase, e.g., a simulation model for a product can be reused for similar products with specific parameters, or within the whole life cycle, e.g., the simulation model of one machine can be used in development and in the production environment to check the parameters to be used.

  4. Models should be evolvable over time, just like the assets they represent, e.g., allowing for additions or changes.

Fig. 4.8 Data and models within the product life cycle

However, the creation of DSs within software systems utilizing models and data has a life cycle as well: data acquisition, data calculation or simulation model formulation, integration, and adaptation. All of these phases place different requirements on the underlying infrastructure, which needs to fulfill all of them in order to allow a hassle-free adaptation of digital shadows.

During the Data Acquisition phase, the initial data trace is aggregated in order to build the foundation for a data calculation or simulation model. Depending on the purpose of the DS, the frequency of this acquisition can vary from a few data points per hour up to sampling rates of multiple GHz. The number of data traces used also varies: a manufacturing planning scenario would rather use multiple data traces with a relatively low data rate, while a production digital shadow, as used in laser processing or injection molding, typically requires fewer sensors but higher data rates. Handling the sheer amount of data can be a challenge in itself and put high stress on the underlying infrastructure, especially regarding persistent storage and bandwidth (Thombansen et al. 2021). The skills required in this phase typically involve domain-specific knowledge of the use case as well as knowledge of networking and domain-specific APIs.

Afterward, during data calculation or simulation model formulation, the acquired data is analyzed and used to produce a working DS. Some models, like simulations and machine learning algorithms, require large amounts of processing power to fulfill this step. The iterative design of such models can also require extensive domain-specific as well as software engineering knowledge.

Integrating a digital shadow into running operations can become one of the largest challenges. Digital shadows not only need access to the – sometimes live – data trace(s) in an efficient manner but also need the processing resources to fulfill their tasks. Depending on the workload the digital shadow meets, it might make sense to scale the DS up and down to adapt to changes in incoming requests. That is why we propose to run digital shadows that need this kind of scalability in a cloud data center or an on-premises edge data center. Another requirement comes from the vast number of different digital shadows in a cooperative environment. Having an underlying organization and orchestration infrastructure that allows not only the scaling but also the discovery of deployed DSs is vital.

Production requirements change over time. Product portfolios are updated or discarded entirely, which leads to the last phase: Adaptation. Here, the digital shadow is modified if a change in the purpose of the DS is detected. This can lead to the deletion of data traces to save costs, the updating of models, or the scaling of computing resources up and down depending on the needs of the DS. Version control becomes a vital part of this scenario, not only for the deployed models but for the whole digital shadow, which needs to track all elements of the reference model.

These technical and domain requirements and the connections across different product life cycle phases show that a multidisciplinary approach is necessary to create worldwide production labs.

8 Outlook: Using Digital Shadows in Digital Twins

Digital shadows need software systems to manage them (Brecher et al. 2021): their initial setup, population with data, and deletion, as well as their evolution, versioning, and sharing. Such functionalities can be integrated into one system or distributed among different services. One solution that could integrate these functionalities is digital twins (Kritzinger et al. 2018; Bibow et al. 2020). However, digital shadows can also exist without surrounding systems if considered from the data sharing perspective and reduced purely to the aggregated data, metadata, and connected models. The original systems in the context of the IoP are CPPSs or their subsystems (Feichtinger et al. 2022); however, further approaches discuss digital twins of organizations or humans. We distinguish three types of digital twins, whereby a DT can evolve across these types: (1) “as-designed” digital twins exist during design (including technical design and simulation), (2) “as-manufactured” digital twins exist during construction, and (3) “as-operated” digital twins exist during the runtime of a CPPS. In contrast to a digital shadow, the digital twin is able to influence the CPPS (van der Aalst 2021), e.g., via self-adaptive functionalities (Bolender et al. 2021; Dalibor et al. 2020).

Digital twins can include different services using DSs, e.g., cockpits for visualization (Dalibor et al. 2020; Michael et al. 2022), process mining methods such as process discovery and prediction (Brockhoff et al. 2021; Bano et al. 2022), machine learning and AI methods (Liebenberg and Jarke 2020; Dröder et al. 2018), assistive services for human support (Michael 2022), services supporting the assessment of sustainability targets (Fur et al. 2022), or services to compare DSs and their meta-information. Such services are implemented in-house or integrated from a service catalog (see Sect. 4.4). We can support the setup of digital shadows within low-code platforms (Dalibor et al. 2022), and the digital twin could provide functionalities for the versioning and evolution of digital shadows (see Sect. 4.7). Most of these concepts are not yet widely used in industry, but our research within the IoP is trying to pave the way.

Moreover, we have identified a set of open challenges within two areas that should be considered in the future: aspects to be realized for applicability in worldwide production labs and challenges for improving the user experience when using and creating digital shadows and digital twins.

Challenges for the applicability in worldwide production labs

These aspects need to be taken into account to ensure the usability of DSs within worldwide production labs, in which different companies with multiple factories exchange data based on defined conditions.

  • Privacy concerns of data: When handing over data, even if it is only a restricted amount, the data provider wants to ensure that, e.g., the data is not used for another purpose than specified, stored longer than agreed upon, or shared with third parties. Privacy policies allow data owners to control their privacy concerns and to monitor compliance in supporting software systems. Thus, we have to incorporate relevant privacy concepts (Michael et al. 2019b) within the DSRM and define which components software systems such as the digital twin need to handle such digital shadow requests and related decisions (Michael et al. 2019a), while considering important privacy design patterns (Hoepman 2014) and the research of the International Data Spaces Initiative (Jarke 2020).

  • Selling digital shadows: Given the shadow’s purpose and the specification of the asset it works on, the digital shadow provides an interface for reusability. A DS, once designed and implemented, is itself a valuable commodity. It gathers new information in a smart and fast manner to fulfill its purpose. A company specializing in the remanufacturing and sale of this trade good could make use of this property. What then remains to be done is to precisely adapt the digital shadow to a new application. After a customer provides their asset specification, the communication interface needs to be implemented, and the models can be adapted to fulfill a slightly modified purpose. If the asset specification and purpose were enriched with semantic terms (see Sect. 4.4), this process could even be automated using previous implementations. However, further research on business models and software services supporting this adaptation is needed.

Challenges to be met for a user-friendly handling of digital shadows

Designing DSs is a complex task and often requires close collaboration between domain experts and software engineers. We need to tackle these additional challenges to make digital shadow engineering as applicable as possible.

  • Reusable model repositories: One of the key elements of our digital shadow is the usage of models to describe the asset’s structure and behavior or to specify how the DS itself acts. Once specified, models describe a specific part of the DS and can be reused in other digital shadow designs as well. Having digital models in private or public repositories (see selling digital shadows) allows for an easy selection and creation of new composite models. To make this possible, all models need a semantic description of what they are supposed to stand for. In the case of models meant for execution, such as calculation specifications or simulations, interfaces for input and output must be provided. These repositories of reusable models contribute to a user-friendly digital shadow engineering that is understandable for domain experts.

  • Automatic derivation of DSs from engineering models: During design time, the system’s structure and behavior are specified in engineering models. They describe in detail how the system is supposed to act and which parts of the system are of interest. We could use those engineering models to automatically generate digital shadows, e.g., we could generate the extraction of information about important system components from structure models or generate views on them (Gerasimov et al. 2021). Given 3D models, automatic behavior simulation could be possible. Nonetheless, all engineering models have to be set in context with the actual system and need to be enriched with purpose information.

9 Conclusion

Within this chapter, we have presented the foundations of digital shadows: which concepts constitute them, their relations to ontologies, how to guide their creation from the domain-specific user perspective, and how digital shadows can be integrated over different environments considering the product life cycle. We have further investigated four use cases and presented how digital shadows can address the challenges in these domains. Moreover, we have given an outlook on which aspects have to be realized in software systems to create and manage digital shadows.

We envision worldwide production labs that foster cross-domain collaboration and are enhanced by sharing digital shadows that support decision-making, and we encourage DS reuse in other production scenarios. This requires the different stakeholders to be able and willing to share data and models, and it requires research to provide the needed concepts and technologies, such as digital shadows and digital twins.