A Holistic Digital Twin Based on Semantic Web Technologies to Accelerate Digitalization

This proposed research is concerned with the development of a Semantic Web representation of supply chains deploying and manufacturing semiconductors. The so-called digital twin is a prototype for virtualizing the electronic component supplier (ECS) value chain, aiming for a holistic digitalization across the entire product life cycle. With its unique advantages, Semantic Web will contribute to overcoming current Big Data issues and to getting one step closer to smart factories. This includes enhanced smart collaboration between both business and project partners. Furthermore, it is shown in first business use cases how the developed ontology is supporting selected applications to manage data more efficiently and to build a comprehensive knowledge representation.


Context and Motivation
The emerging fourth industrial revolution towards both Industry 4.0 and smart networks includes the Internet of Things (IoT) as one main paradigm. One basic principle behind IoT is the connection and remote interaction between (smart) objects, called things. This involves numerous steps of data processing and hence requires integrable sensor devices for the initial data generation (Andriopoulou et al. 2017). Improvements in recent years have made these objects smaller and enabled enhanced collaboration between such embedded devices. In this regard, things include both physical and virtual entities that need distinct identification for efficient interaction and to ensure real-time communication (Siozios et al. 2017). It is important to maintain and improve well-defined environments for uninterrupted communication and consequently high performance. Furthermore, a growing number of domains trying to be part of an interconnected world leads to the need for flexible, scalable and versatile approaches to overcome emerging complexity issues (Mujica et al. 2017). This also holds true for automation approaches that make use of embedded sensors and other smart devices (Rumpl et al. 2002). Moreover, the data involved will play a central role in the future. This is already the case in consumer sectors and is rapidly emerging in the industrial environment as well. Consequently, it leads to a data boom that has a critical impact on industry due to Big Data issues, namely volume, velocity, veracity and variety, among others (Hofmann 2017).
The digital transformation of business and society offers enormous growth potential for Europe. Digitalization is both an enabler and driver of fundamental disruptive business innovations. European industries can build on their strengths in advanced digital technologies and strong presence in traditional sectors to seize the range of opportunities offered by technologies such as the Internet of Things, Big Data approaches, advanced manufacturing, blockchain technologies and artificial intelligence. Integrated Development 4.0 (iDev40), a European Union funded project, focuses on the digitization of integrated product development processes as well as value chains. Nevertheless, despite large initiatives and associations like Industrial Data Space (Fraunhofer IDS) and Platform Industrie 4.0, digitalization is being adopted only slowly by European companies. One reason, among others, is that humans understand the digitalization papers generated in the European research community but computers do not. Furthermore, broad-scale adoption in the industrial space is slow, whereas digitalization is gaining more attention in other environments. This is due, among other things, to persistent uncertainty about specific business and application cases. Additionally, the Big Data issues named above have not yet been comprehensively overcome. The holistic digital twin based on Semantic Web technologies is an approach to bridge the gap between the physical and digital world in order to accelerate digitalization. It enables IT systems to process information from web sites and other data resources in order to recognize relationships and dependencies between pieces of data. Hence, one is able to make implicit knowledge explicit and link data from different data resources effectively. Although this has been very successful for search engines (e.g. Google) and in social networks (e.g. Facebook), it has not yet been applied on a large scale in the industrial space, also known as the B2B environment. In summary, progress in digitalization in industrial environments requires a semantic annotation of the web and of existing relationships (Baumgärtel et al. 2018).

The Semantic Web
In order to overcome the hurdles described above, we propose the application of Semantic Web technologies. Semantic Web provides a powerful toolset to define and maintain a controlled vocabulary of processes, roles, objects and interactions. The Semantic Web expands on the current World Wide Web (WWW) framework. Linked open data allows for data to be read and interpreted by both humans and machines, consequently better enabling cooperation between computers and humans. While the traditional WWW links information via human-readable documents encoded in HTML, Semantic Web links information on the data level using the Resource Description Framework (RDF). Therefore, it is machine-understandable and machine-interpretable and hence improves data analysis possibilities as well as knowledge extraction (Dustdar and Falchuk 2006; Mane et al. 2019). The Semantic Web was first introduced in 2001 by Tim Berners-Lee; its dual intelligibility for humans and machines provides a tremendous opportunity as an enabler for emerging technologies such as blockchain, Big Data analytics and automation. Semantic Web is also a big step towards advancing several of the design principles of digitalization and Industry 4.0.
The various building blocks of Semantic Web are depicted in Fig. 1, showing the Semantic Web Stack that was created by Berners-Lee and builds on standards of the traditional WWW. Semantic Web uses RDF to represent semantic knowledge, thus allowing resources to be modeled together with the properties and linkages that define them. Data is expressed in the form of triples, each containing a subject, a predicate and an object. The subject, also known as a resource, is the object or thing of interest. The predicate, also called property, relates the subject and the object with an attribute. This is done with either an object property or a data property. An object property is used to relate a resource with another resource. Similarly, a data property is used to relate a resource with a piece of literal data. There is a unique identifier, the Uniform Resource Identifier (URI), for each resource and property, and the URI refers to the address where the defining ontology is stored. In an ontology, a resource corresponds to a class or sub-class, which holds a set of elements. Furthermore, instances are individual objects belonging to a class. The Resource Description Framework Schema (RDFS) goes beyond RDF by allowing for property and class vocabularies and the generation of hierarchies. A further extension of RDF and RDFS is the Web Ontology Language (OWL). OWL expresses the RDF construct and allows for even more descriptive properties, relationships and class descriptions. OWL is developed and supported as the standard language by the World Wide Web Consortium (W3C), which additionally defines a broad set of standards in the Semantic Web community (Berners-Lee et al. 2001).
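The triple structure described above can be illustrated with a minimal sketch in plain Python. It is not part of the digital twin itself: the namespace `http://example.org/supplychain#` and the names `Wafer`, `Product`, `wafer_001`, `fab_A`, `producedBy` and `diameterMm` are hypothetical and chosen only for illustration. Only the `rdf:type` and `rdfs:subClassOf` URIs are taken from the W3C vocabularies.

```python
from typing import NamedTuple, Union

# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/supplychain#"
# Standard W3C vocabulary terms.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


class Literal(NamedTuple):
    """A piece of literal data, e.g. a number or a string."""
    value: Union[str, int, float]


class Triple(NamedTuple):
    subject: str                # URI of the resource being described
    predicate: str              # URI of the property
    obj: Union[str, Literal]    # URI of another resource, or a Literal


triples = [
    # Class hierarchy (RDFS level): a Wafer is a kind of Product.
    Triple(EX + "Wafer", RDFS_SUBCLASS_OF, EX + "Product"),
    # Instance: wafer_001 is an individual belonging to the class Wafer.
    Triple(EX + "wafer_001", RDF_TYPE, EX + "Wafer"),
    # Object property: relates a resource to another resource.
    Triple(EX + "wafer_001", EX + "producedBy", EX + "fab_A"),
    # Data property: relates a resource to a piece of literal data.
    Triple(EX + "wafer_001", EX + "diameterMm", Literal(300)),
]


def is_literal_valued(t: Triple) -> bool:
    """True if the triple's object is a literal rather than a resource."""
    return isinstance(t.obj, Literal)


for t in triples:
    kind = "literal-valued" if is_literal_valued(t) else "resource-valued"
    print(f"{t.predicate.split('#')[-1]:12s} {kind}")
```

In practice one would rather use an RDF library and a serialization such as Turtle; the tuples here only make the subject, predicate and object roles explicit.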
Semantic Web uses ontologies and languages such as OWL to define classes, properties and individuals as well as to determine their relationships. When data is represented in an ontology, this facilitates information sharing and collaboration between machines and humans. A common understanding through ontologies helps to reduce potential misunderstanding. By encoding information using an ontology language, the implicit specification of knowledge can be interpreted, extracted and made explicit. Thus, it enables advanced functions to be performed on the data. Protégé is one of the primary ontology tools in use. Compatible software such as Protégé allows information to be embedded without initially requiring specific expert knowledge (Lacy 2006).
While developers of an ontology can define certain relationships and properties related to classes, reasoners provide the ability to make additional inferences, thereby providing explicit deductions from implicit information. These deductions can be thoroughly explained and reviewed by tracing each step and the accompanying inference rules. Another feature of reasoners is the ability to check an ontology for, e.g., logical inconsistencies. Furthermore, data stored within RDF ontologies can be extracted and manipulated using SPARQL. This is accomplished by selecting information based on graph patterns, thus providing filter strategies based on logical comparisons. Reasoners and SPARQL queries are means that leverage the possibilities of a machine-readable form. Conversely, the WebVOWL tool caters to the human operator by providing an interactive visualization of ontologies. An ontology with all its entities can be graphically represented by implementing the Visual Notation for OWL Ontologies (VOWL). The application is interactive, allowing users to customize the spatial arrangement of classes (i.e. nodes) through the Pick & Pin mode. The customized ontology visualization can be saved and shared in the JavaScript Object Notation (JSON) file format, thus improving human readability of the respective representation (DuCharme 2011; Antoniou 2012).
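How a reasoner turns implicit into explicit knowledge, and how a query selects data by pattern, can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the reasoner or SPARQL engine used in the project: only two RDFS-style rules are applied (transitivity of `rdfs:subClassOf` and inheritance of `rdf:type` along it), the prefixed names under `ex:` are hypothetical, and the `match` helper merely mimics a single SPARQL triple pattern.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Hypothetical facts; prefixed names stand in for full URIs.
graph = {
    ("ex:Wafer", SUBCLASS, "ex:Product"),
    ("ex:Product", SUBCLASS, "ex:Thing"),
    ("ex:wafer_001", TYPE, "ex:Wafer"),
}


def infer(triples):
    """Forward-chain two RDFS-style rules until no new triple appears."""
    closed = set(triples)
    while True:
        new = set()
        for (a, p1, b) in closed:
            for (c, p2, d) in closed:
                if b != c or p2 != SUBCLASS:
                    continue
                if p1 == SUBCLASS:    # subclass transitivity
                    new.add((a, SUBCLASS, d))
                elif p1 == TYPE:      # type inheritance along the hierarchy
                    new.add((a, TYPE, d))
        if new <= closed:
            return closed
        closed |= new


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard variable."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


inferred = infer(graph)
# Roughly what `SELECT ?c WHERE { ex:wafer_001 rdf:type ?c }` would ask.
classes = [o for (_, _, o) in match(inferred, s="ex:wafer_001", p=TYPE)]
print(classes)  # ['ex:Product', 'ex:Thing', 'ex:Wafer']
```

Only the membership in `ex:Wafer` was asserted; the memberships in `ex:Product` and `ex:Thing` are deductions, and each can be traced back to the rule and the two triples that produced it.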

Related Work
The semiconductor industry is in many respects strongly linked to the tremendous advance of digitalization and digitization approaches. First, its products and the final goods of other downstream tiers serve as enablers of technology in general. Semiconductors are part of every device that drives digitalization and the technological world of the future. This is of course not limited to the domain of sensors but includes all microtechnology-based devices, tools and equipment. This leads to the second connection of the semiconductor industry, which holds true for many manufacturing industries and especially for data- and knowledge-intensive members. During almost every step of the product life cycle, data is generated or used and hence serves as a potential source for the respective nature of a product in a later stage (Hesse and Schnell 2009). To ensure that all relevant process data is generated, maintained and accessible, it is important to provide a sufficient framework. For fluid and uninterrupted data exchange, an automated and fast-responding connection between involved parties is required. With regard to automation, one aims at connecting either all systems that process the requested product or the product itself (Leite et al. 2019). In the industrial space, this includes systems involved along the entire product life cycle, which is to a large extent covered by the supply chain or, in a broader sense, the supply network. Emerging new technologies like cloud-based automation will play a central role in connecting smart entities throughout the product engineering process in an intelligent way (Mahmoud 2019).
Semantic Web technologies are already part of promising approaches in certain areas like e-commerce or health care and life sciences, to name but a few. Nevertheless, broad-scale application in the industrial space lags behind (Petersen et al. 2017). Although current research for industry applications covers many of the issues that appear crucial in relation to Semantic Web technologies, the state of science and technology is considered to be limited to proprietary implementations with regard to industrial environments (Leite et al. 2019). In the area of supply chains in general and Supply Chain Management and Planning more specifically, some ontology-based models have already been introduced in the research community (Ye et al. 2008; Zhai et al. 2008; Scheuermann and Leukel 2014; Pal 2017). More recently, one focus is, among others, on managing Retail Supply Chains with the help of service-oriented computing (SOA) based on Semantic Web (Pal 2018). Regarding manufacturing environments, research is emerging and working on potential solutions for overcoming hurdles like a lack of awareness, a lack of successful use cases and technological issues. For instance, M2M communication is addressed, however without proper automation features (Gyrard et al. 2014).
Furthermore, recent research on Semantic Web technologies in the industrial space with special focus on automation, namely for semantic interoperability, is facing challenges and is to some extent limited to partial automation (Svetashova 2018). Recent approaches for semantic integration of sensors for enhanced automation are either restricted to a certain application and complexity layer (Petersen et al. 2017) or they are present in domains where data structures are less complex than the manufacturing environment and therefore considered more suitable for comprehensive digitization (Gray et al. 2011). Present research publications mostly cover specific use cases, yet a generic approach that is applicable to multiple scenarios is absent. Moreover, both holistic representation and integration of important standards that facilitate a broad scale implementation of Semantic Web technologies appear to be missing in the B2B environment.

Approach
By closely interlinking development processes, supply chain and production with semantic technologies, the iDev40 digital twin aims to achieve a disruptive step towards a digitalized product life cycle. The intermediate Semantic Web representation, called the digital reference, serves as a basic concept to build the digital twin on. The digital reference is a result of the Horizon2020 ECSEL joint undertaking Productive4.0, a complementary program to iDev40. Integrated Development 4.0 leads the digital transformation of singular processes towards an integrated virtual value chain based on this model. Development, planning and manufacturing will benefit from the digital twin concept in terms of digitized virtual processes along the whole product life cycle, for instance via semantic-based supply chain analytics. Hence, it acts as an enabler to validate AI approaches in a variety of areas, including the prediction and simulation for development lots. It will also support structuring the build-up of learning skills (iDev40 2019).
In our understanding, the developed digital twin represents all relevant data that is created throughout the entire product life cycle. This includes planning and development steps, production and delivery phases as well as data during the actual use of a product and its recycling, i.e. post-production. Possible applications include virtual testing and simulation, predictive maintenance and production failure analytics, among many others. In order to realize an uninterrupted linkage, one requires a unique virtual description of all involved entities (i.e. products, machines, etc.) and their relations. The need for a machine- and human-interpretable language leads to the promising approach of Semantic Web. Beyond the ontology of the digital twin, the challenge is linking networks with solely implicit connections and further optimizing the result. Moreover, it is of high importance to ensure the highest quality with regard to validity and accuracy of the respective ontology. In order to fulfill these requirements, the intention is to use Semantic Web reasoners. Furthermore, the reuse of existing ontologies that have already been agreed on (e.g. by the W3C) is intended wherever possible. In this case, mapping and merging of classes and properties is necessary.
While the ontology provides a formal shell of relations and entities, the representation still lacks a proper integration of relevant data. One way to solve this is to involve domain experts who are very knowledgeable about specific topics that are not common knowledge. In this case, Protégé serves as a simplified user interface for data inclusion. Another approach is to access existing databases that are present in other formats by linking them to the ontology. This is done, for instance, by mapping database objects to classes and properties of the ontology or by having an additional external connection (via anchor layers, for instance). However, to ensure that all included data is both relevant and correct, it needs to be verified and accepted.
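The mapping of database objects to classes and properties can be sketched as follows. This is a minimal illustration assuming a relational source and using only the Python standard library. The table `lots`, its columns, and the ontology terms `ex:ProductionLot`, `ex:hasProduct` and `ex:hasQuantity` are hypothetical and do not reflect the actual digital reference or any existing database.

```python
import sqlite3

# Hypothetical mapping from a relational table to ontology terms.
CLASS_FOR_TABLE = {"lots": "ex:ProductionLot"}
PROPERTY_FOR_COLUMN = {"product": "ex:hasProduct", "quantity": "ex:hasQuantity"}


def rows_to_triples(conn, table, key_column):
    """Map each row to an individual of the table's class,
    plus one triple per mapped, non-empty column."""
    # `table` and `key_column` come from the fixed mapping, not from user input.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    columns = [d[0] for d in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"ex:{table}_{record[key_column]}"
        triples.append((subject, "rdf:type", CLASS_FOR_TABLE[table]))
        for column, prop in PROPERTY_FOR_COLUMN.items():
            if record.get(column) is not None:
                triples.append((subject, prop, record[column]))
    return triples


# In-memory stand-in for an existing database with invented content.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lots (lot_id TEXT, product TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO lots VALUES (?, ?, ?)",
    [("L1", "sensor_X", 25), ("L2", "sensor_Y", None)],
)

lot_triples = rows_to_triples(conn, "lots", "lot_id")
for t in lot_triples:
    print(t)
```

Keeping the mapping in two small dictionaries separates the agreed vocabulary from the extraction logic, so domain experts could review the mapping without reading the code that applies it.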

Preliminary Results
We provide a basic principle of a semiconductor supply chain with its entities, processes and relations, represented by an RDF-based ontology. The current version of this digital twin is depicted in Fig. 2 as visualized by WebVOWL. For better human readability, the various domains are visualized by separating them into different areas, called lobes. Furthermore, the single nodes are arranged based on a traditional understanding of organizational structures, using the Pick & Pin mode. Some lobes refer to already existing standard ontologies that are recommended by W3C (the Sensor ontology, for instance). This ensures a broad consensus on the content quality. Other parts of the digital twin, however, specifically represent the semiconductor manufacturing environment or, more specifically, structures, products and processes that are unique to Infineon.
Many different tiers, projects and project partners lead to a variety of small, independent, problem-specific concepts. By representing them as ontologies and matching their entities with the existing digital twin, we achieve enhanced collaboration due to fast data retrieval within a large amount of expert knowledge. This improves both collaboration between suppliers and customers along the supply chain and collaboration between project partners. The technology will help to enable complexity management within electronic component supplier (ECS) value chains that are most likely supply chains employing semiconductors. Additionally, Semantic Web technologies will support supply chains manufacturing semiconductors. Both benefits are particularly important for finding solutions for the complex interrelationship between the technology developed and commercial users. Furthermore, Semantic Web technology contributes to enhanced workplaces and smart collaboration in ECS value chains. A detailed view of the semiconductor development lobe in the current Pick & Pin digital reference version is depicted in Fig. 3. The extract shows the entities with their relationships. Additionally, by clicking on nodes and edges, one is able to gain insight into stored metadata, constraints and comments, if they exist. The visualization might serve as an explicit user interface for interested parties, giving direct access to an expert knowledge base. Furthermore, by relating existing knowledge and reasoning options, it is possible to gain additional insight into related bits of knowledge and roots of the requested information. Further ontologies have been prepared to support alternative business processes like revenue management (RM), possibly being an enabler for pricing mechanisms based on order lead time (OLT). Results include reducing the bullwhip effect and reaching both high capacity utilization and more deliveries in line with customer wishes.
Another concept that is realized with the help of Semantic Web is the open online sales and marketing platform (OOSMP), on which integrated cyber-physical systems (ICPS), product vendors and component suppliers can meet virtually. It is being developed in Productive4.0 and might be extended to the whole product life cycle throughout the iDev40 project. The intention is that product vendors will be enabled to describe their product ideas and concepts with requirements for components, and component suppliers will be able to offer their components with the respective features. The envisioned platform system is then intended to find matches between requirements and offers in an ecological, economical and societal balance along the entire life cycle. This is particularly important for the huge, unclear and quickly changing product portfolio of a semiconductor vendor. The Semantic Web, and reasoners in particular, could enable further processing of search requests and link them to prior experiences or other implicit correlations. Hence, recommendations for similar or new products as well as suggestions for substitutes of a requested product (if not in stock, for instance) would be possible. This may be a large step towards efficiently attracting new customers that are not familiar with the current product structures. Furthermore, development and production plans could be controlled and influenced considering knowledge about potential customers' behavior.
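The matching idea behind such a platform can be illustrated with a deliberately simplified sketch. It is not the OOSMP implementation: the component names, the feature names `max_temp_c` and `supply_v`, and the rule that a feature value must be at least the requested minimum are assumptions made only for this example. A semantic platform would derive such matches from the ontology and a reasoner instead of hard-coded dictionaries.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    name: str
    features: dict    # feature name -> numeric value
    in_stock: bool


# Hypothetical catalogue; all names and values are invented.
offers = [
    Offer("sensor_A", {"max_temp_c": 125, "supply_v": 3.3}, in_stock=False),
    Offer("sensor_B", {"max_temp_c": 150, "supply_v": 3.3}, in_stock=True),
    Offer("sensor_C", {"max_temp_c": 85, "supply_v": 3.3}, in_stock=True),
]

# Hypothetical requirement of a product vendor: minimum feature values.
requirement = {"max_temp_c": 125}


def satisfies(offer, minimums):
    """True if every requested feature is present and at least the minimum."""
    return all(
        offer.features.get(feature, float("-inf")) >= minimum
        for feature, minimum in minimums.items()
    )


def recommend(catalogue, minimums):
    """Split matching offers into available ones and ones not in stock."""
    matches = [o for o in catalogue if satisfies(o, minimums)]
    available = [o.name for o in matches if o.in_stock]
    unavailable = [o.name for o in matches if not o.in_stock]
    return available, unavailable


available, unavailable = recommend(offers, requirement)
print("available:", available)        # ['sensor_B']
print("not in stock:", unavailable)   # ['sensor_A']
```

Here `sensor_B` would be suggested as a substitute for the matching but unavailable `sensor_A`, while `sensor_C` is excluded because it does not meet the requested temperature range.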

Reflections
In this paper we described our approach of a digital twin in the semiconductor environment. The described ontology is part of first use case scenarios, where we make use of major advantages like flexible data handling for large data sets, knowledge extraction and deduction of implicit knowledge. Within our current research projects, we are providing the current status to all interested partners for further improvement and large-scale adoption. Moreover, the current digital twin version is maintained on a regular basis to ensure consistency among its users.
The approach described in this paper leaves room for further improvement. In order to approach a holistic representation of the semiconductor manufacturing domain, some areas have yet to be translated into an ontology. This includes, for instance, automation standards, security and planning. Moreover, more relevant data is to be included in addition to classes and properties. In a further step, experts have to agree on the quality and logical consistency of the imported data sets. After achieving a high-quality digital twin that consists of a sufficient set of relevant data, it is important to show its benefits in various use cases. This may include both business scenarios and collaboration experiments on a project level. By paying special attention to interconnecting different work packages on a use case level (as in iDev40, for instance), Semantic Web may contribute to an improved project exploitation. As an increasing number of experts get involved, the digital twin will serve as a comprehensive knowledge representation that allows for fast knowledge extraction and deduction. Consequently, the aim of a large-scale adoption of the presented concept is getting closer. This is intended for applications in both the research community and the industrial space.
Further research may focus on adding relevant ontologies to the existing digital twin as well as developing a strategy to validate the added and existing domains. A major issue in this regard is that the different parties involved have a diverging understanding of the same entity or the same lobe and their relations. This includes both scientific and semantic dimensions. Yet, by including the advice of domain experts, the comprehensiveness of the concept will improve over time, provided there is a broadly accepted standard validation method. Other research topics are concerned with the possibilities of representing processes as graphs and of connecting existing databases efficiently. In the long run, the digital twin shall facilitate human-machine interaction along the entire value chain and product life cycle. In particular, it is intended to improve the use of relevant data with regard to speed and efficiency.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.