Implementing digital twins in existing infrastructures

Digital twins can offer various added values for companies. As part of a three-year research project, we are investigating a methodological approach for building digital twins in existing infrastructures. In particular, the functional requirements of future users are addressed, as these receive less attention in existing approaches. In this publication, we discuss the applied methodology as well as the models and concepts created. Initial insights were gained from the simultaneous development of digital twins in parallel projects with use cases for electric motors, production process monitoring and maintenance of gas turbine components. In detail, it becomes clear that software development methods (e.g. use cases, user stories, scenario development) are a good way to describe the expected value-adding functions. It is essential to involve the future users in the development as early as possible. Transferring the functions identified in this way into a functional architecture shows that this architecture is mostly independent of the use case. Likewise, the IT systems used hardly vary at all. Overall, the results show that a methodical approach can be followed in the development and that implementations can have a high degree of similarity, even in very different use cases, while the exact design, depending on these use cases, is very diverse.


Digital twins and fields of application
The use of digital twins in industry is increasing. According to a study from 2020, the focus is on the mapping of products in digital twins, and less on production [1]. Altogether, digital twins serve as integrating access points to information; in addition, quality assurance and (failure) analysis are named as dedicated functions of digital twins.
The definition of digital twins is as diverse as their application. For the purpose of classification, the central terms are briefly explained here, following the concept of Damerau [2]. A digital twin consists of two sets of information: the digital shadow and the digital master. The Digital Master describes the expected behavior, the expected function, geometry or similar in an instance-related way. The Digital Shadow represents the actual behavior, function, geometry or similar. By comparing these sets of information, added values can be created. This is done in the functions of the Digital Twin. These can be implemented as learning algorithms, simulations, visualizations or simple calculations. According to Lindow [3], these functions can be distinguished by integration depth: Observation, Analysis, Visualization, Simulation, Interaction and Integration. Using these functionalities generally has some kind of feedback effect on the product. This effect can be direct, for example through control, or long-term, such as design changes in subsequent generations. Lindow emphasizes that a key challenge for Digital Twins is data integration across the lifecycle from product planning to end of life [3], which applies to both sets of information described.
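The relation between Digital Master, Digital Shadow and a simple twin function can be illustrated with a minimal sketch. The class names, field names and example values below are assumptions made for illustration only; they are not taken from Damerau's concept or from the project.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DigitalMaster:
    """Expected values for one product instance (hypothetical field names)."""
    instance_id: str
    expected: Dict[str, float]


@dataclass
class DigitalShadow:
    """Measured values for the same product instance."""
    instance_id: str
    actual: Dict[str, float]


def deviations(master: DigitalMaster, shadow: DigitalShadow) -> Dict[str, float]:
    """Return actual minus expected for every quantity present in both sets."""
    if master.instance_id != shadow.instance_id:
        raise ValueError("master and shadow describe different instances")
    shared = master.expected.keys() & shadow.actual.keys()
    return {name: shadow.actual[name] - master.expected[name] for name in sorted(shared)}


# Illustrative values only
master = DigitalMaster("motor-001", {"winding_temp_c": 90.0, "speed_rpm": 1500.0})
shadow = DigitalShadow("motor-001", {"winding_temp_c": 97.5, "speed_rpm": 1500.0})
result = deviations(master, shadow)
```

Here the comparison is a plain subtraction; in practice the twin function could equally be a simulation or a learning algorithm, as noted above.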
In the applications considered here, the focus is on both production and products. Those have in common that existing productions and products are to be supplemented by digital twins. Accordingly, approaches are being pursued that focus less on adapting the products or productions than on developing the twin functionalities. As part of the procedure described below, potentials for the use of digital twins are identified and functionalities derived from them.
In one major use case, an electric drive is to be supplemented by a digital twin. The functions intended to generate added value in this use case are monitoring functions, feedback to design, and the conversion of the business model. The monitoring functions compare the actual behavior of the machine with the expected behavior. The machine behavior is represented by speed, torque, electrical power, temperatures in windings and housing, as well as vibrations in housing and bearings. The comparison of target and actual behavior is carried out on the one hand using machine-specific characteristic values, and on the other hand using machine-specific simulations, which represent the machine behavior in the respective environment and operating scenarios. The functions for monitoring operation are also part of the feedback to design. In addition, when the machine behaves unexpectedly, recorded data can be analyzed visually or statistically at a later time. The change of the business model focuses on selling the machine performance instead of the machine itself. The monitoring function (to ensure operation) and the calculation of the actually delivered performance are used for this purpose. In addition to these core functions, a data provision function makes documents and models available, for example for the production and maintenance of the machine.
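How a delivered-performance calculation and a characteristic-value check might look is sketched below. The sketch assumes equally spaced samples of torque and speed and a relative tolerance band. The function names, the tolerance and the rectangle-rule integration are illustrative assumptions, not the project's implementation.

```python
import math


def mechanical_power_w(torque_nm: float, speed_rpm: float) -> float:
    """Mechanical shaft power P = torque * angular speed, in watts."""
    return torque_nm * speed_rpm * 2.0 * math.pi / 60.0


def out_of_tolerance(actual: float, expected: float, rel_tol: float) -> bool:
    """True if the actual value leaves the relative tolerance band around the expected value."""
    return abs(actual - expected) > rel_tol * abs(expected)


def delivered_energy_kwh(samples, interval_s: float) -> float:
    """Integrate power over equally spaced (torque_nm, speed_rpm) samples (rectangle rule)."""
    joules = sum(mechanical_power_w(t, n) for t, n in samples) * interval_s
    return joules / 3.6e6  # 1 kWh = 3.6e6 J
```

For example, `out_of_tolerance(97.5, 90.0, 0.05)` flags a winding temperature that is more than 5 % above its expected value, and `delivered_energy_kwh` could serve as the basis for billing delivered performance.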
A second use case considers gas turbine blades that are prepared for further operation in maintenance, repair, and overhaul (MRO) processes. The focus here is on the data integration of the various machining processes, analyses and documentation in order to enable associated services. The existing processes are currently accompanied by handwritten documentation. By providing the digital twin, the MRO processes are to be designed adaptively: on the one hand in basic decisions on whether turbine blades can be repaired, on the other hand in adaptive processes, which can reduce the energy input into the turbine blades by knowing previous machining paths. Finally, the correlated data can be used to analyze damage types and predict damage in further operation.
A third use case addresses additive manufacturing (AM) of gas turbine components. The focus here is on data acquisition for failure analysis and process-integrated simulation. The aim is to be able to predict the manufacturability and achievable quality of components in higher detail through the improved data situation. Shadow data is collected during the manufacturing process and correlated with geometry and process characteristics. The idea is to identify statistical significances for quality parameters.
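As a first step towards such a statistical evaluation, shadow data can be correlated with a quality parameter. The sketch below computes a Pearson correlation coefficient in plain Python. The choice of this coefficient is an assumption for illustration: the source does not state which statistics are used, and a significance test would still have to follow.

```python
import math


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0.0 or spread_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)
```

A call such as `pearson(process_values, quality_values)` would take one recorded process characteristic per component and the corresponding measured quality parameter (both names hypothetical).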
In the following, we will reflect on why a user-centered method was applied and whether the methods presented below contributed to the development of digital twins in these three application fields.

Development of digital twins
Numerous procedures and approaches to the development of digital twins have been formulated, which in turn are characterized by the respective ideas and purposes of use. A review of development approaches for Digital Twins revealed the approaches and focus areas presented in the following. Glatt et al. propose to follow a model-based systems engineering (MBSE) approach in development and present the implementation on a material transport system [4]. It is emphasized that the Digital Twin in this case "is not seen as an add-on, but as an integral part of a CPS [(Cyber-Physical System)]" [4]. Furthermore, the high number of necessary components for digital twins is mentioned as a central challenge, which motivates the approach followed. It is not explained why dedicated functions are enabled for the twin, although this could have been done in earlier processes according to MBSE. Kang et al., on the other hand, focus on the issue of designing integrated simulations and monitoring-relevant parameters [5] without comprehensively addressing the necessary partial models or IT infrastructure, which can be explained by the high individuality of the considered products (bridges). Here, too, the identification of necessary functionalities is missing or precedes the illustration. Kaul et al. mainly consider data-based modeling and the use of ML algorithms [6]. Infrastructure and dedicated value-added functions at the user's end are not addressed. Klein et al., like Glatt et al., name the challenge of a high number of models [7]. From this they derive the necessity of a continuous information flow, while the heterogeneity of the simulations used is met with the approach of co-simulation. Again, however, they do not explain why these features in particular are developed.
Fig. 1 Digital Twin Development Activities
Liu et al. propose to follow the natural phenomenon of imitation, highlighting three points [8]: similarity, i.e. being as similar as possible to a dedicated object; variation, i.e. the variability necessary for this; and the complexity of adaptation through the coupling of partial models. Accordingly, part of the development is moved to automated adaptation. The preparation of models for this system is essentially based on approaches that enable the adaptation of existing models to ongoing imitation. Again, the choice of functions remains unclear.
Other approaches by Luo, Miller, Wang and Zheng focus almost entirely on the design of the necessary models without elaborating on a holistic development method that also addresses IT infrastructures or the added value for users [9][10][11][12].
Thus, there is currently a lack of approaches that precede the actual twin development, describe the necessary functions in a solution-neutral manner, and develop a reasoned architecture. The approach followed and presented below is applied to existing systems. It is assumed that these are not to be completely redeveloped, but that the digital twins are to be integrated. The insights gained from digital twins can nevertheless serve more efficient and reliable operation, contribute insights to next product generations or enable new business models with existing products. Furthermore, the focus is on the added value based on digital twins and the functions and IT solutions required for this.

Methodical procedure
The approach followed here is based on a serial procedure with partially parallelizable activities (see Fig. 1). The starting point is the development of software systems as presented by Cockburn [13]. In contrast to the previously described approaches, this approach starts with the functional requirements of the future users. In order to take into account the existing infrastructure and processes, Cockburn's approach is supplemented by data flow analysis [14]. Cockburn does not consider the development of necessary models for information and knowledge representation. The starting point for the development is a set of user stories, which represent the needs of the future users. By analyzing these, scenarios are developed in which the necessary functions are hierarchically structured and, if necessary, modularized [13]. This in turn forms the basis for the subsequent structure of the use cases. These are supplemented by the parallel analysis of the infrastructure by means of a data flow analysis. The necessary architectures can be developed on this conceptual basis. The implementation of the previously defined user stories, scenarios and use cases is continuously checked.
The designed digital twins and the architecture can then be used to define the future data flow. This is updated as the sub-concepts of the digital twin become increasingly detailed. Once sufficient assurance is reached, implementation of the necessary core functions and application functions begins. Here, too, it generally turns out that previously defined concepts cannot be implemented or cannot be implemented efficiently, so that previous steps are repeated. The procedure is concluded by checking the implementation of the user stories described.

Definition of user stories
The needs of future users were recorded using the standardized format of user stories (I as <role> would like <function> to <value>) (Fig. 2a). While this standardized format clearly supported documentation and comparability, the initial challenge was to achieve a common understanding of Digital Twins. This was achieved through simple graphical communication on the one hand and a high degree of empathy for the tasks of the respondents on the other hand. In the first and third scenario (electric drives and additive manufacturing), the stories were recorded in interviews with individual representatives of specified roles (sales, development, design, procurement, production planning, manufacturing, service, method and process managers). In the second scenario, the focus was on recording in workshops. Here, various roles involved in the process were simultaneously asked to describe possible functions. A comparison of the approaches shows that the interview variant led to more specific and detailed requirements, while the workshop variant led to greater heterogeneity and, especially, to a more comprehensive understanding of potential. In this first survey, the function descriptions are still very general and intentionally do not describe the technical implementation. The results of this first step are functional descriptions such as "As a developer, I want to obtain information about the behavior of the machine in the field in order to develop durable machines".
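The role/function/value structure of a user story can be captured in a small record, which keeps the stories comparable and machine-readable. The class below is a minimal sketch; its name and the exact sentence template are assumptions, while the example content is the story quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserStory:
    """One user story in the role / function / value format."""
    role: str
    function: str
    value: str

    def render(self) -> str:
        """Render the story as a single sentence."""
        return f"As a {self.role}, I want to {self.function} in order to {self.value}."


story = UserStory(
    role="developer",
    function="obtain information about the behavior of the machine in the field",
    value="develop durable machines",
)
sentence = story.render()
```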

Definition of scenarios
The user stories described were then converted into scenarios (Fig. 2b). These describe, in individual steps, the functions required to achieve the expected added value. The description remains neutral with respect to the actual implementation. The scenarios were described according to the suggestion of Cockburn [13] in standardized dimensions (goal, scope, level, pre-conditions, success end condition, failed end condition, primary actor, trigger event, main success scenario and variation scenario; the figure shows only the titles, goal and main success scenario). Initially, the engineers involved had difficulties with aggregation and understood these dimensions differently. However, on the one hand, this comprehensive description enabled coordination between those involved. On the other hand, the step-by-step breakdown made it easy to identify functions that were described similarly. In addition, it was considered which main artifacts, i.e. data and information models, are used within the scenarios. Although this was methodically too early and therefore remained very categorical, and in part required implicit solutions, efficient communication quickly emerged on this basis. No differences in the approach and its feasibility could be identified between the applications considered (electric drive, MRO and AM). As a result of this second process step, scenarios emerged which described the functions in sub-steps in more detail.
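The standardized dimensions listed above map naturally onto a record type, which also makes it easy to see which dimensions of a scenario are still open. The following sketch assumes one field per dimension; the field names, the helper `missing_dimensions` and the example steps are illustrative and not part of Cockburn's template.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Scenario:
    """A scenario described along the dimensions named in the text."""
    title: str
    goal: str = ""
    scope: str = ""
    level: str = ""
    preconditions: str = ""
    success_end_condition: str = ""
    failed_end_condition: str = ""
    primary_actor: str = ""
    trigger_event: str = ""
    main_success_scenario: List[str] = field(default_factory=list)
    variation_scenario: List[str] = field(default_factory=list)


def missing_dimensions(scenario: Scenario) -> List[str]:
    """Names of all dimensions that are still empty."""
    return [f.name for f in fields(scenario) if not getattr(scenario, f.name)]


# Hypothetical example content
draft = Scenario(
    title="Monitor winding temperature",
    goal="Detect deviations from the expected temperature",
    main_success_scenario=["read sensor data", "compare with expected value", "report result"],
)
open_items = missing_dimensions(draft)
```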

Modularization of the scenarios
The goal of this process step is to identify similar and identical subfunctions of the previously described scenarios and to make dependencies clear (Fig. 2c). Cockburn describes the hierarchical differentiability of use cases at various levels [13]. In his example of the Material Store, he establishes a hierarchy in which use cases of product weighing, material inspection, or new creation form the basis for more complex use cases such as inventory checking or, even more generally, warehousing. The downward hierarchy details how a use case is fulfilled and the upward hierarchy details why a use case is necessary [13]. This approach was easy to apply due to the standardized description mentioned above. Quickly, a hierarchy of scenarios emerged from the described scenarios, which together fulfill higher functions. The necessary sub-functions (components logging, send and prepare sensor data, life cycle prognosis, vulnerability analysis) can be derived from the scenario for the optimization of field data. These functions are also used by the scenario for lifetime prediction and MRO planning. For this case, dedicated functions are added. Due to the integrated consideration of the artifacts, a necessary link for data consistency could already be estimated. No differences in the approach and its feasibility could be identified between the considered applications (electric drive, MRO and AM). The result of this step was a hierarchy of necessary functions to fulfill the user requirements.
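Identifying sub-functions that several scenarios share can be sketched as a simple inversion of the scenario-to-function mapping. The scenario and sub-function names below follow the example in the text, except for "maintenance scheduling", which is a hypothetical placeholder for the dedicated functions that are not named in the source.

```python
from collections import defaultdict
from typing import Dict, List


def shared_subfunctions(scenarios: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each sub-function used by more than one scenario to the scenarios using it."""
    users = defaultdict(set)
    for scenario, subfunctions in scenarios.items():
        for subfunction in subfunctions:
            users[subfunction].add(scenario)
    return {sub: sorted(used_by) for sub, used_by in users.items() if len(used_by) > 1}


scenarios = {
    "optimization of field data": [
        "components logging", "send and prepare sensor data",
        "life cycle prognosis", "vulnerability analysis",
    ],
    "lifetime prediction and MRO planning": [
        "components logging", "send and prepare sensor data",
        "life cycle prognosis", "vulnerability analysis",
        "maintenance scheduling",  # hypothetical dedicated function
    ],
}
shared = shared_subfunctions(scenarios)
```

Sub-functions that appear in `shared` are candidates for modules at the lower hierarchy levels, while the remaining ones stay specific to a single scenario.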

Analyze data flow
Accompanying the previous steps, the existing value creation system and product system into which the digital twin is to be integrated were analyzed. Data flow analysis was used for this [14]. The existing processes, organizational structures and roles, IT systems, data and information models, and the activities taking place were described in an integrating model, as presented by Seegrün [14]. The resulting model makes it possible to plan the data sources and sinks of the digital twin, to consider existing software systems, and to plan or reuse necessary interfaces. In addition to these infrastructural issues, the resulting artifacts, i.e., information and data models, as well as subproducts and products are described. In this step, a complete picture of the existing IT infrastructure, the processes that take place in it, and the (partial) results that emerge is created.
Fig. 4 Functional description for temperature monitoring
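Planning data sources and sinks from a data flow model can be illustrated by treating the model as a list of directed flows. This is a strong simplification of the integrating model by Seegrün [14], which also covers processes, roles and artifacts; the system names in the example are assumptions.

```python
from typing import List, Tuple


def sources_and_sinks(flows: List[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Return systems that only send data (sources) and systems that only receive data (sinks)."""
    producers = {sender for sender, _ in flows}
    consumers = {receiver for _, receiver in flows}
    return sorted(producers - consumers), sorted(consumers - producers)


# Hypothetical flows: (sending system, receiving system)
flows = [
    ("sensor", "edge gateway"),
    ("edge gateway", "digital twin"),
    ("PDM", "digital twin"),
    ("digital twin", "IoT platform"),
]
sources, sinks = sources_and_sinks(flows)
```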

Definition of use cases
Like the previous steps, the description of use cases has been described extensively by Cockburn [13]. The resulting breakdown of sub-functions, their interaction, and the assignment of roles provides a detailed view of the necessary implementations. In particular, and in contrast to Cockburn's proposal, added value could be achieved by integrating several systems in one diagram. Thus, function distributions as well as resulting interfaces and bottlenecks became visible and could be discussed. In the comparison of the applications (electric drive, MRO and AM) it could be determined that the necessary functions, in particular on the lower hierarchy levels, are identical or very similar (data acquisition, procurement, storage, visualization, provision, processing, connection of edge systems, connection of services). Even at the level of the necessary data communication, no comprehensive differences could be identified (communication of small and fast data packages and large information-rich data packages). Even in the core components of the digital twins, the same elements were necessary. As a result of this process step, a diagram was created that locates the functions in the various IT systems and describes the roles and services involved.
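How interfaces become visible once functions are located in IT systems can be sketched as follows: every interaction between two functions that are allocated to different systems implies an interface between those systems. The allocation and the interactions in the example are hypothetical and only reuse function names from the list above.

```python
from typing import Dict, List, Tuple


def derive_interfaces(
    allocation: Dict[str, str], interactions: List[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return the system pairs that need an interface, given function-to-system allocation."""
    interfaces = set()
    for sender, receiver in interactions:
        sender_system = allocation[sender]
        receiver_system = allocation[receiver]
        if sender_system != receiver_system:
            interfaces.add((sender_system, receiver_system))
    return sorted(interfaces)


# Hypothetical allocation of functions to systems
allocation = {
    "data acquisition": "edge system",
    "processing": "Node-Red",
    "storage": "database",
    "visualization": "IoT platform",
}
interactions = [
    ("data acquisition", "processing"),
    ("processing", "storage"),
    ("processing", "visualization"),
]
interfaces = derive_interfaces(allocation, interactions)
```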

Architecture design
Based on the integrated use cases and the described functions and interactions between the systems, a functional architecture (see Fig. 3) of the digital twin could be developed. As already described, the functions and their distribution showed such a high degree of similarity that the same functional architecture can be used for all three applications, even though the fields of application differ. The variance results on the one hand from the models used (e.g. for data preparation, simulation and computation) and on the other hand from the peripheral data-holding and data-taking systems and services.
In the next step, IT architectures were derived from the functional architecture. In the process, the diversity of the applications became clear. The respective existing systems, as described in the data flow, partly took over the necessary functions of the digital twin (for example as external services), and partly represented necessary data sources and sinks.
In the case of the electric drives, simulations were implemented for temperature distribution and for describing power losses that occur. Edge data is processed within the framework of Node-Red and through MQTT connection. The PDM (Product Data Management) system and a database for describing development parameters (e.g. calculated voltages and currents, geometrical parameters, masses, inertias, tolerated temperatures) were integrated as existing data stores. An application for recording quality parameters in production was provided as an external service. The communication to this was done via MQTT. The provision of data for the service is done by means of REST queries. The visualization of the operational behavior is implemented using an IoT platform.
In the case of the MRO process, simulations of the frequency behavior of a blade are integrated. The processing of process data is done in the context of Node-Red and by MQTT connection. ERP and PDM, as well as material and operational database are integrated as existing data stores. 3D scanning, damage prediction and damage assessment are integrated as external services. The communication to these is done via MQTT and REST. The provision of data for individual process stages (worker assistance, coating and decoating) is carried out using REST interfaces. Visualization of process parameters is implemented using an IoT platform.
In the case of the AM process, simulations of the temperature behavior of the melted-out powder are integrated. The processing of process data is done within Node-Red and by MQTT connection. ERP and PDM are integrated as existing data stores. External services or data provision are not planned. The visualization of process parameters is implemented by means of an IoT platform. Overall as a result of this step, functional and IT architectures for the digital twin were created to implement the described functions.
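For the small and fast data packages exchanged via MQTT in all three applications, a message essentially consists of a topic and a compact payload. The sketch below only builds such a pair with the standard library; the topic layout, the payload fields and the function name are assumptions, and the actual publishing through an MQTT client or a Node-Red flow is not shown.

```python
import json
from typing import Tuple


def build_message(
    twin_id: str, quantity: str, value: float, unit: str, timestamp: str
) -> Tuple[str, str]:
    """Build a (topic, JSON payload) pair for one shadow measurement."""
    topic = f"twins/{twin_id}/shadow/{quantity}"  # hypothetical topic layout
    payload = json.dumps(
        {"value": value, "unit": unit, "timestamp": timestamp}, sort_keys=True
    )
    return topic, payload


topic, payload = build_message(
    "motor-001", "winding_temp", 97.5, "degC", "2021-01-01T12:00:00Z"
)
```

Keeping the instance identifier in the topic rather than in the payload would let subscribers filter by twin instance, which is one possible design choice among several.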

Subsystem design and implementation
Based on the IT architecture, the necessary subsystems could be described. The starting points were the scenarios and use cases as well as the IT systems, data sources and data sinks identified in the data flow analysis. In a comparative diagram (Fig. 4), a complete picture of the data processing was generated for each scenario. On one side was the origin of the digital shadow, for example as a sensor, on the other the origin of the digital master, for example as a data source for development parameters. In the step-by-step description of necessary data manipulation (e.g. selection of data from table, calculation with parameters, formation of average values, configuration of simulation, execution of simulation etc.) the necessary functionality was mapped. This was then transferred from this pseudo-code into function blocks in Node-Red or other IT systems involved. In addition to the function description, the necessary ontology for data creation could also be designed from this.
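The step-by-step description of data manipulation can be written down as a chain of small functions before it is transferred into function blocks. The sketch below follows the steps named above (selection of data from a table, formation of an average value, comparison with a development parameter) for a temperature check. The column name, the parameter name and the limit are assumptions for illustration.

```python
from typing import Dict, List


def select_column(rows: List[Dict[str, float]], column: str) -> List[float]:
    """Select one column from table-like shadow data, skipping rows without it."""
    return [row[column] for row in rows if column in row]


def mean(values: List[float]) -> float:
    """Arithmetic mean of a non-empty list."""
    if not values:
        raise ValueError("no values to average")
    return sum(values) / len(values)


def temperature_check(
    shadow_rows: List[Dict[str, float]], master_params: Dict[str, float]
) -> Dict[str, object]:
    """Compare the averaged measured temperature with the tolerated temperature."""
    average = mean(select_column(shadow_rows, "winding_temp_c"))
    limit = master_params["tolerated_temp_c"]
    return {"average_c": average, "limit_c": limit, "ok": average <= limit}


# Illustrative data only
rows = [{"winding_temp_c": 80.0}, {"winding_temp_c": 90.0}, {"housing_temp_c": 50.0}]
check = temperature_check(rows, {"tolerated_temp_c": 120.0})
```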
Within the scope of this process step, the algorithms, interfaces, data and information systems were completely described. Particularly at this and the subsequent implementation level, necessary changes were repeatedly identified, so that it was necessary to go back to previous steps. The changes did not necessarily result from the design space of the digital twin, but in part also from the linked services or peripheral systems. In this step, the tasks in the applications were performed completely differently according to the variant infrastructures. The result of this process step was a further detailed breakdown of the necessary partial functions, data and information models, and simulations based on the selected architecture and given infrastructure. In addition, these were implemented in the respective IT systems.

Definition of the data flow
The new functionalities of the digital twins change the value creation system in which they are used. Accordingly, new data flows have been defined based on the example of Seegrün [14]. These describe both changed processes and the new IT systems, information and data models. In the context of the coordination between the developers but also with the responsible persons for the subsystems, the data flow model could be used to assist communications. The overview of functions and data processing provides a good basis for the discussion of various solution approaches. The adapted modeling of the initial model could also be used to communicate the necessary changes and resulting added values. More important, however, was the assurance that the designed IT systems of the digital twin would lead to a way of working that could be accepted by the personnel concerned. The result was a model for describing adapted value creation using digital twins.

Integration and test
The procedure was concluded with the integration of the partial solutions and their testing in the isolated and integrated state. As expected, this also resulted in necessary repetitions in the methodical procedure. The initially described user stories, scenarios and use cases were used as a reference point for function fulfillment in order to evaluate the quality of the implemented functions and added values. This final process step resulted in validated functions according to the formulated user requirements.
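Using the user stories as a reference point for function fulfillment amounts to a traceability check: a story counts as fulfilled only if all functions derived from it have passed their tests. The sketch below assumes such a mapping exists; the story identifiers, function names and test results are hypothetical.

```python
from typing import Dict, List


def story_coverage(
    story_functions: Dict[str, List[str]], test_results: Dict[str, bool]
) -> Dict[str, bool]:
    """Mark each user story as fulfilled if all of its functions passed their tests."""
    return {
        story: all(test_results.get(function, False) for function in functions)
        for story, functions in story_functions.items()
    }


# Hypothetical traceability data
story_functions = {
    "developer: field behavior": ["data acquisition", "visualization"],
    "service: monitoring": ["data acquisition", "temperature check"],
}
test_results = {"data acquisition": True, "visualization": True, "temperature check": False}
coverage = story_coverage(story_functions, test_results)
```

Functions without a recorded test result are treated as not passed, so that untested functionality cannot mark a story as fulfilled.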

Discussion and next steps
It was investigated whether the procedure presented here, based on Cockburn's procedure for software development and extended by data flow analysis, can be used to develop digital twins for existing infrastructures. As shown in the process, the procedure supports the development of the necessary functions and models. This is oriented less to the product system and more to the necessary function for the user, which is to be achieved by the digital twin.
The extension to include data flow analysis allows the existing value creation environment and product systems to be considered. However, the extent to which this approach is useful for integrated twin development in the context of product development remains unclear.
The sequence of user stories, scenarios and use cases is consistent, also in terms of information and methodologically easy to understand. In the course of the necessary iterations, it became apparent that models that are actually consistent with data would significantly accelerate the conceptual work. The modeling of the stories, scenarios and use cases, which was only carried out in tabular and graphical form, did not support the adjustments in the way that would have been possible with real (data-consistent) models. Accordingly, an IT technical environment for twin development would be beneficial.
Similarly, while the creation of the twin models and functions themselves was well supported by the representation method, was understandable, and communicated well, it also lacked a modeled relationship to scenarios and use cases, as well as IT technical support in customizations.
The increasing variance of activities across the methodological flow in the comparison of the three use cases is not surprising, given the varying fields of application. What is rather remarkable in the implementation is the high degree of similarity of the necessary core functions. Accordingly, it is possible that the categories found are generally usable. Also, the similarity of the functions supports the approach of frameworks in which diverse twins can be operated and developed, which should be considered in more detail in subsequent investigations. The adaptation of digital twins or their maintenance in the further product life also remains open at this point. It could be investigated how other approaches to software development could be transferred.
The foundations of the described work were developed within a project of the Werner von Siemens Centre (https://wvsc.berlin/), WvSC.EA "Electric motors 2.0", supported by the European Regional Development Fund (EFRE).
Funding
Open Access funding enabled and organized by Projekt DEAL.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.