1 Digitization of Stationary Retail

Participants in stationary retail, such as customers, store personnel, or store managers, demand added service quality in their respective fields of interest. While customers ask for shopping assistance or recommendations similar to online shopping experiences, store personnel want to know the optimal time to restock shelves with products, and store managers want to determine the optimal product placement strategy for a store, for example.

As stationary retail is in the early stages of digitization, novel services are being developed that aim to satisfy the demands mentioned above. To this end, models representing complex (retail) processes are introduced as a foundation for further analysis and reasoning. Based on a retail in-store logistics model (Kotzab and Teller 2005) and the Digital Twin model (Augustine 2020; Erkoyuncu et al. 2018; Grieves 2011), this chapter introduces a semantic Digital Twin (semDT) for retail as the semantic connection of a scene graph with a symbolic knowledge base, allowing for abstract reasoning about its contained facts in various retail applications. We refer to a scene graph (Costanzo et al. 2020; Mania and Beetz 2019) as a semantically annotated environment model that can be created automatically by robotic agents (Matsuo et al. 1999; Rusu et al. 2008; Sommer et al. 2019). The symbolic knowledge base consists of interlinked ontologies based on automated ontological models of everyday activities (Haidu and Beetz 2019) that contain further product or store logistic information using a uniform interface that supports integration of the scene graph.

The core entity in the semantic Digital Twin is the digital representation of a real store including its layout and offered products, augmented with various spatial, semantic as well as relational information. On the one hand, the scene graph holds information about a product location in relation to a shelf, a shelf layer, or other products on the same shelf layer and allows for optimization thereof. On the other hand, the symbolic knowledge base contains product information like 3D models of all objects, a product taxonomy, an ingredient classification as well as additional product information like product brand or awarded labels. Furthermore, the semDT knowledge base can be linked to store specific information like delivery or sales data. This knowledge can be reasoned about, visualized and modified in various environments from web interface to virtual environment, or applied on different platforms for Augmented Reality as well as robotic store assistance, for example. In Augmented Reality (AR), digital content is projected onto the real world (e.g., when using smart glasses like HoloLens) or a video stream of it (e.g., when using a Smartphone).

A robotic store assistant given the product replenishment task can access the semDT to combine location information, warehouse stock, as well as destination information. To further optimize the task, it can access semantic information and rules to determine the adequate time and sequence for restocking. Figure 1 illustrates the collaboration of robots using a semantic Digital Twin knowledge base. The robot to the left is creating the scene graph of a retail store by interlinking perceived object facts, thereby creating a semantic environment graph containing information about the product in relation to a shelf and shelf layer, for example: <Product A>is_in <Shelf 3>, is_on <shelf_layer 2>. The robot also captures empty facings and stores the information semantically (<Product A>has_stock <0>). We refer to a facing as the area between product separators on a shelf layer that can hold products placed one behind the other. As all entities like shelf, shelf layer, and product also have positions relative to a reference frame, the environment knowledge can be shared between robotic agents as depicted in Fig. 1 to the right. From the inventory data in the scene graph, these two robots can infer which products need to be replenished (<Product A>), and from the symbolic knowledge base, they can derive how to identify them: <Product A>has_GTIN <40023>. The Global Trade Item Number (GTIN) is a unique identifier that is encoded in the product barcode. Since shelf layers are equipped with price tags containing this GTIN, they can be recognized by perception systems. From enterprise management systems, the semDT can include the planned stock for a given product (<Product A>has_planned_stock <3>), thereby making valuable information available for successful collaboration in store assistance.

Fig. 1

Simulation of collaborating robot store assistants using a semantic Digital Twin

The connection of the scene graph to the symbolic knowledge base therefore makes it possible to reason about object properties in a shared environment, a benefit of implementing a semantic Digital Twin as knowledge base.
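The replenishment inference described above can be illustrated with a minimal sketch. It assumes a plain in-memory set of triples rather than the actual semDT knowledge base; the predicate names follow the examples in the text, while `Product B`, its GTIN, and the helper functions `value` and `replenishment_list` are hypothetical and introduced only for illustration.

```python
# Hypothetical in-memory triple store; predicate names follow the chapter's
# examples, Product B and its GTIN are made up for contrast.
facts = {
    ("Product A", "is_in", "Shelf 3"),
    ("Product A", "is_on", "shelf_layer 2"),
    ("Product A", "has_stock", 0),
    ("Product A", "has_planned_stock", 3),
    ("Product A", "has_GTIN", "40023"),
    ("Product B", "is_in", "Shelf 3"),
    ("Product B", "has_stock", 4),
    ("Product B", "has_planned_stock", 4),
    ("Product B", "has_GTIN", "40024"),
}


def value(subject, predicate):
    """Return the object stored for (subject, predicate), or None."""
    for s, p, o in facts:
        if s == subject and p == predicate:
            return o
    return None


def replenishment_list():
    """Products whose observed stock is below the planned stock.

    Each entry is (product, GTIN, missing quantity).
    """
    result = []
    products = sorted({s for s, p, _ in facts if p == "has_stock"})
    for product in products:
        stock = value(product, "has_stock")
        planned = value(product, "has_planned_stock")
        if planned is not None and stock < planned:
            result.append((product, value(product, "has_GTIN"), planned - stock))
    return result


print(replenishment_list())
```

The scene graph contributes the stock facts, the enterprise data contributes the planned stock, and the GTIN ties both to an identifier a perception system can recognize on the shelf.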

In this chapter, we define a semantic Digital Twin, discuss its composition and creation as well as applications of semDTs in retail stores to support users in their decision processes, and answer questions like “Which products need to be replenished in which quantity?”, “Which products are lactose-free?”, or “How does product placement influence sales data?” Furthermore, we show the considerable economic potential of a semDT knowledge base in three example use cases:

Replenishment Process

Based on inventory as well as delivery data and predefined rules, store personnel or robotic assistants can determine the adequate time, order, and destination for restocking of products.

Augmented Reality Shopping Assistant

A customer can set product preferences on a HoloLens in order to highlight product information like an ingredient classification such as “vegan” or “lactose-free” to single out products that hold the given preference. Additionally, they can use their Smartphone to visualize interesting product information like awarded labels or contained allergens.

Digital Store Visualization and Robot Simulation

A store manager can load a digital version of the store to visualize interesting information like deviations of the actual store layout from the planned one. In addition, software developers require a simulation environment to test and verify their applications before deployment to the real scenario.

The main contributions of the presented work are: (i) We introduce a semantic Digital Twin for retail as a connection of a scene graph and a symbolic knowledge base based on ontologies. (ii) We describe the semDT components and reasoning capabilities. (iii) We demonstrate the applicability of the semDT in three use cases for different users of the semDT.

2 Optimization of Retail Logistics

The advances in digitization have led to a tremendous growth of digital retailers and online shopping. Stationary retail needs to look for digital solutions in order to compete with the information systems of its digital counterparts, which enhance the customer shopping experience and improve purchasing processes or optimize the offered assortment depending on individual preferences and shopping behavior analysis. Contrary to digital retail, stationary retail lacks this transparency and abundance of information and thus has to focus on enhancing the customer shopping experience with more employees involved in customer interaction and care.

The introduction of digital technologies like Augmented Reality, Virtual Reality (VR), or mobile phone applications in retail stores can complement and enrich the shopping experience. This combination of digital technology with real environments can make shopping more appealing and entertaining for customers and speed up locating products or checking the availability of products that contain certain ingredients or allergens, for example. This would allow for improved support for customers with allergies or other health conditions, goals, and needs. Digitization and the use of state-of-the-art technology can also support store personnel in their everyday work. Store employees can use AR devices to find product destinations for restocking or to complete Click&Collect orders. In Click&Collect, a customer can order products online and collect them at their retail store in a preferred time slot. AR technology can also help train new employees faster by helping them find their way around the new store and search for requested products.

These examples corroborate one problem that needs to be tackled for successful digitization of retail stores: Environment information plays a significant role in answering many questions in stationary retail, yet it is not included or linked to companies’ knowledge bases. Therefore, in this chapter, we propose the semantic connection of environment information to product information in a semantic Digital Twin using an automatically created scene graph that is linked to a symbolic knowledge base consisting of interlinked ontologies.

Figure 2 visualizes example scenarios of this chapter. It illustrates the automatic scene graph creation by a robot, a Virtual Reality scene graph of a retail store, and an example application of the semantic Digital Twin: product placement investigation for assortment optimization.

Fig. 2

Example scenarios of the semantic Digital Twin

The semantic Digital Twin enables retail users, e.g., store managers, to visualize all products from a certain brand. They can ask for the sales quantity of these products and compare different product placements in regard to product sales data. Since stores are linked to retail chain and company-specific product information, one can even compare different store layouts to find similarities and differences.

Semantic Digital Twins can contribute to solving various problems, particularly in logistic processes:

Store managers are interested in questions such as “Which products need to be replenished in which quantity?” or “Which products are sold out?” at any time.

Retail store personnel or robotic store assistants can pose questions such as “Where should this product be sorted?” or “At what time should which product be restocked?”

Customers may want to know “Where can I find the products from my shopping list?” or “Which products are produced in an environmentally friendly way?”

Regional managers might be interested in questions such as “What experiences do stores have with the placement of certain products in different areas?” or “Which store sells which products?”

Logistics managers can get answers to questions like “How much space is available in the store for a certain product?” or “In which order should products be palletized to make unloading and refilling of the shelves as efficient as possible?”

Marketing specialists can use semDTs to compare and evaluate store layouts and assess how product positioning influences sales figures.

Supply chain managers get optimized stock information to shorten order and delivery times and minimize warehouse stock.

Software developers in the field of robotics can use semDTs to simulate robot behavior before applying it to the real robot in the store.

These are only a few of the applications to which semantic Digital Twins can contribute. As development of and experience with this system progress, further applications and use cases will arise.

3 Semantic Digital Twin: A Digital Representation of a Retail Store

The concept of a Digital Twin, the digital equivalent to a physical product, was introduced in 2003 (Grieves 2011) and later defined variably in related work with different foci in a manufacturing context like applications in NASA and US Air Force vehicles (Glaessgen and Stargel 2012) or as a living model of a physical asset (Liu et al. 2018). If we focus on the retail sector instead of manufacturing, the object of interest changes from physical item or assembly to an ecosystem centered around purchasable products such as described in in-store logistic models (Kotzab and Teller 2005; Zheng et al. 2019). We propose embedding of a Virtual Reality scene graph representation of the physical elements, allowing for integration of physics simulation in addition to visualization. We follow the approach of Erkoyuncu et al. (2018) to connect Digital Twin and physical asset through their interrelations, thereby highlighting the importance of relations and semantics. Further inclusion of non-physical entities and their properties enables complex reasoning capabilities. Hence, we define a semantic Digital Twin as:

Definition 1 (Semantic Digital Twin: semDT)

A semantic Digital Twin is a symbolic representation of robots, human beings, and their environment as physical elements connected to complementary non-physical entities as well as their properties and interrelations, represented by data structures of Virtual Reality scene graphs. Thereby abstract information associated with the entity of interest can be inferred, reasoned about, and visualized through a variety of media to predict current or future conditions. Particularly, actions can be simulated, and hypothetical scenes can be rendered to support and enhance decision-making.

We can use semantic Digital Twins to store, interpret, and query information of heterogeneous resources from enterprise data over web information to simulated processes or effects of actions through a uniform interface as depicted in Fig. 3. Its symbolic knowledge base is based on an information system for storage, management, and usage for embodied intelligent agents (Beßler et al. 2020) and, therefore, describes contained facts about products and logistics processes in a machine-understandable way, enabling many new applications in the first place. The components of the semDT will be further explained in Sect. 4.

Fig. 3

Semantic Digital Twin model and its main components

The semDT realistically represents the physical store, including 3D images for visualization and semantic information. Each element is uniquely identified so that it can be linked to additional knowledge bases.

Semantic Digital Twins can be automatically created by perception-based systems (Arpenti et al. 2020). Inclusion of merchandise management systems (ERP) can allocate sales information and automatically connect it to object location information, enabling system users to answer queries for product sales per shelf, brand, or store. The semDT combines disjointed business data into a coherent picture. The entire company thus has access to the most up-to-date and complete data stock. In this way, logistics can be individually coordinated for each store and the flow of goods can be sorted on site according to an optimized placement of products. Simulation, on the other hand, facilitates extensive marketing studies and supports the integration and control of various systems such as service robots through the included semantic information.

Subsequently, semDTs can function as information carriers with a unique level of detail, visual realism, and cognitive capabilities for the development of innovative information services.

4 Building Blocks of the Semantic Digital Twin

The semantic Digital Twin is based on environment information perceived by a robot. This environment information is transformed into a scene graph, a semantically annotated environment model. Thereby a product is assigned a position relative to other products on a shelf and shelf layer, allowing reasoning about its location. For creation of the semDT, the scene graph is connected to a symbolic knowledge base containing enterprise product information like product brand or price but also product information like an ingredients classification or label information from various sources as depicted in Fig. 3, enabling the semDT to quickly adapt to changing product information or consumer needs.

The semantic Digital Twin can be described as a Knowledge Representation and Reasoning (KRR) framework, which is a knowledge base that organizes information of different knowledge sources from the literature, perception system, historical data as well as forecasts and other open research data services. In the following, we continue to describe the knowledge representation of the semantic Digital Twin and the creation of its components, the scene graph, and the symbolic knowledge base. The section closes with an overview of the reasoning capabilities of the semantic Digital Twin, which enable answering of the questions posed in Sect. 2.

4.1 Knowledge Representation

The semantic Digital Twin represents its knowledge in the form of ontologies; therefore, all information is stored using triples of entities and their relations. Ontologies as information sources provide meaning to the contained facts and entities and allow reasoning about them (Staab and Studer 2010). Each ontology comprises information from a different source or with a different focus in order to make them modular and interchangeable. As depicted in Fig. 4, the product is the central object in each ontology and is therefore used for interlinking between the ontologies. The knowledge base can easily be expanded by any ontology that uses a GTIN as product identifier. In this example, five ontologies are shown, containing location and inventory information (<Product A>is_in <Shelf 3>, has_stock <2>,…) as the scene graph that is connected to the symbolic knowledge base with enterprise information (e.g., <Product A>brand <Somat>) as well as additional product information derived from different sources from robot to web information (<Product A>is_a <detergent>, has_label <vegan>, has_ingredient <perfume>). This information can be used as a basis for many consumer applications.

Fig. 4

Example of linked knowledge sources in the semantic Digital Twin
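As a rough illustration of how the product, identified by its GTIN, can serve as the join key between otherwise independent ontologies, consider the following sketch. It models each ontology as a plain list of triples, which is a simplification and not the semDT implementation; the facts mirror the examples given in the text, and the function name `link_by_gtin` is hypothetical.

```python
from collections import defaultdict

# Each "ontology" is modelled as a list of (GTIN, predicate, object) triples.
# The facts mirror the chapter's examples; the data layout is an assumption.
scene_graph = [("40023", "is_in", "Shelf 3"), ("40023", "has_stock", 2)]
enterprise = [("40023", "brand", "Somat")]
taxonomy = [("40023", "is_a", "detergent")]
labels = [("40023", "has_label", "vegan")]
ingredients = [("40023", "has_ingredient", "perfume")]


def link_by_gtin(*ontologies):
    """Merge triples from several ontologies into one view per GTIN."""
    merged = defaultdict(lambda: defaultdict(list))
    for ontology in ontologies:
        for gtin, predicate, obj in ontology:
            merged[gtin][predicate].append(obj)
    return merged


product_view = link_by_gtin(scene_graph, enterprise, taxonomy, labels, ingredients)
print(dict(product_view["40023"]))
```

Because every source only needs to agree on the GTIN, an ontology can be added or swapped by passing a different list, without touching the others.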

In order to achieve a rich foundation for various reasoning tasks, the semDT KRR framework maintains a virtual replica of the retail store, which is enriched by ontologies comprising expert knowledge, current state information captured from mapping and sensing technologies as well as interesting product information. Given such a foundation, the framework allows investigating and mining hypotheses to support queries of retail business actors and Big-Data-enhanced abstract reasoning tools; for instance, retail-relevant concepts, such as “misplaced objects,” can be inferred if differing products are detected in a facing. Furthermore, responses and current state information are intuitively rendered to aid and assist in decision-making for everyday tasks in a retail store.
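The "misplaced objects" inference mentioned above can be sketched as a simple comparison between the GTIN on a facing's price label and the GTINs of the products detected in that facing. The function name, its parameters, and the example GTIN `50001` are assumptions made for illustration; how the framework actually performs this inference is not specified here.

```python
from collections import Counter


def misplaced_products(facing_label_gtin, detected_gtins):
    """Count detected products whose GTIN differs from the facing's label GTIN.

    facing_label_gtin: GTIN encoded on the price label of the facing.
    detected_gtins:    GTINs of the products perceived inside the facing.
    """
    counts = Counter(g for g in detected_gtins if g != facing_label_gtin)
    return dict(counts)


# One foreign product (hypothetical GTIN 50001) sits in the facing of 40023.
print(misplaced_products("40023", ["40023", "40023", "50001"]))
```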

KnowRob: Knowledge Processing Framework

The ontology-based KnowRob system, first introduced in 2009 (Tenorth and Beetz 2009) and extended in 2018 (Beetz et al. 2018), is at the forefront of cognitive robot control in the household domain in terms of the extent of information its knowledge base represents (Thosar et al. 2018). It combines encyclopedic knowledge with implicit knowledge but also techniques for acquiring knowledge and for grounding it in a physical system. KnowRob can serve as a common semantic framework for integrating information from different sources. Thereby, it combines robot sensor input with static encyclopedic knowledge, common-sense semantic knowledge, task descriptions, virtual environment models, and object information. Furthermore, it supports different deterministic and probabilistic reasoning mechanisms, clustering, classification and segmentation methods and includes query interfaces as well as visualization tools. As a result, we propose KnowRob as a suitable knowledge processing framework for the semantic Digital Twin.

4.2 Scene Graph

The KnowRob ontologies used for scene graph creation are built on top of existing ontologies. They include concepts about objects from the Everyday Activity Science and Engineering (EASE) framework (Bateman et al. 2017). These concepts are connected to a general description of actions, tasks, and agents from the Socio-physical Model of Activities (SOMA) (Beßler et al. 2020). Lastly, concepts about participants in events or localization of objects are integrated from the DOLCE+DnS Ultralite ontology (Gangemi et al. 2002). Thus, the KnowRob ontologies link objects to tasks of actions with participants as a basis for interpretation and reasoning. The KnowRob ontologies are expanded for the semDT by retail-relevant ontologies containing store or layout information with descriptions of the existing shelf systems, their features (height, width, the number of shelf layers) as well as object positions and product information (barcode, stock). All objects in the store are represented as concepts with properties, defining how they are related to each other. For example, shelf layers support a certain weight, leading to a maximum number of products that can be placed on them.
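The weight example can be made concrete with a small calculation. The sketch below assumes that the only constraint is the supported load of the shelf layer and that all products in question weigh the same; the function name and the numbers are hypothetical, and weights are given in grams to keep the arithmetic in integers.

```python
def max_products_on_layer(max_load_g, product_weight_g):
    """Largest number of identical products a shelf layer can carry.

    max_load_g:       supported load of the shelf layer in grams (assumed known).
    product_weight_g: weight of a single product in grams.
    """
    if product_weight_g <= 0:
        raise ValueError("product weight must be positive")
    return max_load_g // product_weight_g


# A layer supporting 20 kg holds at most 26 products of 750 g each.
print(max_products_on_layer(20000, 750))
```

In practice, the available width and depth of a facing would impose a second limit; this sketch deliberately ignores geometry.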

Retail stores are structured environments containing systematically organized shelf systems. The structural design of these organized shelf systems can be formalized using rules in predicate logic (Rashmi and Rangarajan 2018), enabling reasoning over the contained entities but also automatic creation of semantic environment maps by robots. Such a semantic environment map contains a rich description of objects in the stationary environment and can be used to answer queries like “Where are deodorants located?” or “How does shelf X differ from its abstract representation in the planned store layout?” Because the semantic environment map links to concepts about actions, objects, and even action effects, we refer to it as a scene graph.

Internally, KnowRob generates a hierarchical structure of the asserted objects while a robot is scanning the store for scene graph creation. Together with the semantic knowledge about these objects and their 6D pose, KnowRob creates a virtual snapshot of the world’s state as depicted in Fig. 2.

Automatic Scene Graph Creation by Robots

We apply robots to automatically create the scene graph for the semantic Digital Twin, but the scene graph can be created by any perception-based system. The process of scene graph creation consists of two major steps: layout detection and store monitoring. During layout detection, rarely changing features of the store (like room size and shelf positions) are captured. This task needs to be performed once for each new store layout. The store monitoring process is the repeating process of stocktaking.

Layout Detection:

The layout detection starts with the creation of a 2D map of the store using grid mapping as simultaneous localization and mapping technique (Grisetti et al. 2007). Afterward, the robot drives through the store to create a basic scene graph of the environment without product information. Each shelf is detected using a Quick Response (QR) code. The position data is added to the scene graph in such a way that shelf positions in relation to other shelves or points of interest can be calculated and reasoned about.
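How a shelf position can be expressed relative to another shelf may be sketched as follows. This is a simplified 2D illustration that assumes each shelf pose is available as `(x, y, theta)` in the map frame; the function name and the pose format are assumptions and do not reflect how the scene graph stores poses internally.

```python
import math


def relative_position(reference, target):
    """Express the target shelf position in the reference shelf's frame.

    Both arguments are (x, y, theta) poses in the map frame, theta in radians.
    Returns the (x, y) offset of the target as seen from the reference shelf.
    """
    rx, ry, rtheta = reference
    tx, ty, _ = target
    dx, dy = tx - rx, ty - ry
    # Rotate the map-frame offset by the inverse of the reference orientation.
    cos_t, sin_t = math.cos(-rtheta), math.sin(-rtheta)
    return (cos_t * dx - sin_t * dy, sin_t * dx + cos_t * dy)


# Shelf A faces along the map's y-axis; shelf B is 2 m further along that axis.
print(relative_position((1.0, 1.0, math.pi / 2), (1.0, 3.0, 0.0)))
```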

Store Monitoring:

During store monitoring, frequently changing product positions are detected automatically by the robot. For each shelf, the robot scans the shelf vertically to detect shelf layers and horizontally for each shelf layer to detect price labels and product separators. The stock as the number of products between two product separators is estimated based on detected product features of an RGB-D camera similar to other approaches (Donahue et al. 2014). Each price label contains a barcode that encodes the GTIN of a product, thereby adding product information to the scene graph. This product information is continuously updated based on sales data. If the store monitoring process is performed regularly, the scene graph of two different time points can be compared to detect irregularities between calculated inventory and actual inventory that can be reasoned about.
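The comparison of two scene graph snapshots can be sketched as follows. The example assumes that the calculated inventory is the previous stock minus sales plus deliveries, which is one plausible reading of the text rather than a documented rule; the function name, the dictionary layout, and the GTIN `40024` are hypothetical.

```python
def inventory_irregularities(previous, current, sold, delivered=None):
    """Compare calculated and observed stock per GTIN.

    previous, current: stock per GTIN from two store monitoring runs.
    sold, delivered:   quantities per GTIN between the two runs.
    Returns the GTINs whose observed stock deviates from the calculated one.
    """
    delivered = delivered or {}
    irregular = {}
    for gtin, prev_stock in previous.items():
        expected = prev_stock - sold.get(gtin, 0) + delivered.get(gtin, 0)
        actual = current.get(gtin, 0)
        if actual != expected:
            irregular[gtin] = {"expected": expected, "actual": actual}
    return irregular


previous = {"40023": 5, "40024": 4}
current = {"40023": 2, "40024": 1}
sold = {"40023": 3, "40024": 2}
print(inventory_irregularities(previous, current, sold))
```

A deviation such as the one reported for the second product could then be passed on to further reasoning, for example to check for misplaced or damaged items.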

4.3 Symbolic Knowledge Base

If a robotic agent is given the task to pick objects in a retail store in order to fulfill a customer shopping order and an unexpected situation is experienced, such as a misplaced product (e.g., a “coke light” bottle in the facing of “coke” bottle), the robot needs to be able to reason about the situation using externally acquired knowledge (e.g., current store layout, product classification) in order to successfully finish the task.

As previously mentioned in Sect. 4.2, the scene graph is created using the ontology-based KnowRob system. The ontologies in the symbolic knowledge base can optionally be integrated in the KnowRob system. KnowRob can simplify the process of integrating information from different sources and translating them to ontologies. This is specifically useful for the semantic Digital Twin since most retail companies store their data in merchandise management systems (ERP). With KnowRob, product classification information from an ERP system can be translated into a product taxonomy in Web Ontology Language (OWL) format, for example. Furthermore, KnowRob is able to integrate Semantic Web information.

The Semantic Web (Berners-Lee et al. 2001) is an extension of the World Wide Web (Sirin et al. 2003), aiming at structuring content of web pages by using standardized, machine-readable formats to represent entities, their properties, and relations so that it can be interpreted by software agents or robots. This integration enables the semantic Digital Twin to compete with online stores and their recommender systems. The label and ingredient information depicted in Fig. 4 are based on Semantic Web information, for example. The ingredients ontology can further be connected to an allergen ontology, enabling the semDT to reason about contained allergens in the products available at the store as shown in an example query asking for all products that contain an ingredient that is classified as being hazardous to the environment in Fig. 5.

Fig. 5

Example Prolog query for all products and their classification that have an ingredient that is classified as hazardous to the environment

The depicted Prolog query is using the Semantic Web library package; Prolog is a logic programming language (Nilsson and Małuszyński 1990) used for reasoning in the symbolic knowledge base. In this figure, a SPARQL query at a specific SPARQL endpoint is called (lines 1 and 13). SPARQL is a graph-based query language and recommended query language for Resource Description Framework (RDF) graphs (Angles and Gutierrez 2008; Pérez et al. 2006). Lines 2 to 6 declare prefixes to be used in the SPARQL query given in lines 7 to 12. The query asks for all products that contain an ingredient that is classified as being hazardous to the environment, a label that has the standard depiction shown in Table 1 to the left, which is referenced in Semantic Web sources like Wikidata, for example. Line 7 names the variables that are to be returned by the query: ?allergen, the ingredient contained by a product and classified as hazardous to the environment, ?product, the GTIN identifier of the product, and ?class, the class of the product. Line 8 queries the allergen ontology for all ingredients that have the relation <?allergen>has_depiction <hazardous_to_the_environment>. Line 9 links the allergen to the ingredients ontology using the owl:sameAs statement. Then all products that contain any of the returned ingredients are searched for in line 10 (<product>has_ingredient <ingredient>). In line 11, the resulting product list is linked to the product taxonomy through the use of owl:sameAs statements. Lastly, the product class is retrieved in line 12 (<?prod>type <?class>). Line 13 filters the result set for redundant classes.
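The chain of joins that the query performs can be retraced in a short sketch. It is not the Prolog or SPARQL code from Fig. 5; it only mimics the described steps over small hand-written triple lists. All identifiers, prefixes, and facts below are made up for illustration, as is the function name `hazardous_products`.

```python
# Illustrative data only: prefixes and facts are assumptions, not semDT content.
allergen_onto = [
    ("allergen:Benzyl_benzoate", "has_depiction", "hazardous_to_the_environment"),
]
same_as = [
    ("allergen:Benzyl_benzoate", "ingredient:benzyl_benzoate"),
    ("gtin:40023", "taxonomy:40023"),
]
ingredients_onto = [
    ("gtin:40023", "has_ingredient", "ingredient:benzyl_benzoate"),
    ("gtin:40024", "has_ingredient", "ingredient:water"),
]
taxonomy = [("taxonomy:40023", "type", "detergent")]


def hazardous_products():
    """Return distinct (allergen, product, class) rows, following the join steps."""
    rows = set()
    for allergen, pred, depiction in allergen_onto:
        # Step 1: ingredients depicted as hazardous to the environment.
        if pred != "has_depiction" or depiction != "hazardous_to_the_environment":
            continue
        # Step 2: link the allergen to the ingredients ontology (sameAs).
        for ingredient in (b for a, b in same_as if a == allergen):
            # Step 3: products that contain the linked ingredient.
            for product, pred2, ing in ingredients_onto:
                if pred2 != "has_ingredient" or ing != ingredient:
                    continue
                # Step 4: link the product to the taxonomy (sameAs).
                for entry in (b for a, b in same_as if a == product):
                    # Step 5: retrieve the product class.
                    for subject, pred3, cls in taxonomy:
                        if subject == entry and pred3 == "type":
                            rows.add((allergen, product, cls))
    return sorted(rows)


print(hazardous_products())
```

Collecting the rows in a set plays the role of removing redundant results.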

Table 1 Results excerpt (out of 119 results) for the query in Fig. 5

An excerpt of the results for the query in Fig. 5 is shown in Table 1. It depicts the international pictogram for environmental hazard in column 1. The classification of ingredients as being hazardous to the environment was extracted from the European Chemicals Agency (ECHA) website and integrated into the semDT allergen classification. All ingredients that are listed as environmental hazard allergens in the table, column 2, therefore are classified according to the ECHA website. Column 3 contains the GTINs of the products that contain the hazardous ingredient, and column 4 contains a product classification. The prefixes in columns 2 to 4 show that all information is extracted from different sources, emphasizing the benefits of reasoning over connected ontologies in the semDT. Using Prolog and ontologies, information sources can easily be included or exchanged and reasoned about.

Consumer applications based on this symbolic knowledge base are further elaborated in the use case described in Sect. 5. If customers prefer products that are produced in an environmentally friendly way, for instance, they can use the symbolic knowledge base to filter out products with non-compliant or hazardous ingredients as shown in the example above.

Entity Linking in the Semantic Digital Twin

We state that ontologies can easily be included or exchanged in the semDT. This is true for all ontologies that hold product information and include a GTIN as unique identifier for the contained products, which is the case for most ontologies in the semDT. Entity linking in knowledge bases is a widely studied research problem (Shen et al. 2014), even more so for web-scale data (Lin et al. 2012). For all semDT ontologies, we use string matching techniques to identify entities that are to be linked. “Benzyl_benzoate” from the allergen ontology thereby is linked to “benzyl_benzoate” and “benzylbenzoat” from the ingredients ontology, for example.
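One possible form of such string matching is sketched below. The text does not state which technique is used, so the normalization step, the similarity measure from Python's standard `difflib` module, and the threshold of 0.9 are all assumptions chosen so that the example from the text is reproduced.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case a name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def link_entities(source_names, target_names, threshold=0.9):
    """Pair names whose normalized forms are sufficiently similar.

    threshold is an assumed cut-off on difflib's similarity ratio (0 to 1).
    """
    links = []
    for source in source_names:
        for target in target_names:
            score = SequenceMatcher(None, normalize(source), normalize(target)).ratio()
            if score >= threshold:
                links.append((source, target))
    return links


allergens = ["Benzyl_benzoate"]
ingredient_names = ["benzyl_benzoate", "benzylbenzoat", "perfume"]
print(link_entities(allergens, ingredient_names))
```

Normalization handles differences in case and separators, while the similarity ratio tolerates small spelling variants such as a missing final letter.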

4.4 Reasoning

Reasoning about semantic information in conjunction with environment information is the main capability of the semantic Digital Twin.

As previously mentioned, the logic programming language Prolog is used to reason about information contained in the symbolic knowledge base (see Sect. 4.3). With Prolog, we are able to answer questions as posed in Sect. 2 and develop applications that are based on it like in the use cases presented in Sect. 5. Since the semDT also stores 3D models of the perceived objects and connects scene graph with symbolic knowledge base, we can generate a web user interface that highlights information on demand as depicted in Fig. 6.

Fig. 6

Example reasoning capabilities of the semantic Digital Twin

Figure 6 displays the web interface openEASE, an open access web-based knowledge service based on KnowRob, where users can upload episodes of data and query it for information. OpenEASE facilitates access to the knowledge base using Prolog. The Prolog query asking for empty facings is depicted in Fig. 6 to the left, and the Prolog query asking for a specific product destination is visualized in Fig. 6 to the right and generalized with comments in the following Fig. 7.

Fig. 7

Example Prolog query for destination of a given product GTIN

The query in Fig. 7 requires an input parameter specified in line 1, which is the GTIN of a product. The GTIN is used to first retrieve the corresponding Barcode object from the shop ontology in lines 2 to 5. The Barcode variable is defined as an <object>type <shelf_label> in line 3 with the GTIN uniquely identifying the object as of line 4 (article_number_of_product <GTIN>). The Barcode is used as input to retrieve the destination facing of the object in lines 6 to 9. The variable Facing is defined as a product_facing in line 8, and the facing corresponding to the input product GTIN saved in the variable Barcode is searched for in line 7: <Facing>label_of_facing <Barcode>. Lastly, the facing is highlighted in line 10 using the show command.
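The two lookup steps of this query can be retraced with a short sketch. It is not the Prolog query from Fig. 7; it merely follows the described steps over a hand-written triple list. The predicate names are taken from the description, whereas the identifiers `label_17`, `facing_5`, and the function name are hypothetical.

```python
# Hypothetical excerpt of the shop ontology as (subject, predicate, object) triples.
shop = [
    ("label_17", "type", "shelf_label"),
    ("label_17", "article_number_of_product", "40023"),
    ("facing_5", "type", "product_facing"),
    ("facing_5", "label_of_facing", "label_17"),
    ("facing_6", "type", "product_facing"),
]


def destination_facings(gtin, triples=shop):
    """Return the facings whose shelf label carries the given GTIN."""
    # Step 1: shelf labels (barcodes) that encode the GTIN.
    barcodes = {
        s for s, p, o in triples
        if p == "article_number_of_product" and o == gtin
        and (s, "type", "shelf_label") in triples
    }
    # Step 2: product facings that are labelled with one of these barcodes.
    return sorted(
        s for s, p, o in triples
        if p == "label_of_facing" and o in barcodes
        and (s, "type", "product_facing") in triples
    )


print(destination_facings("40023"))
```

In openEASE, the resulting facing would then be highlighted in the 3D view instead of being printed.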

OpenEASE offers a variety of visualizations of the knowledge base content, including 3D visualizations of the store including its entities (e.g., shelf, products, facings, or robots as in Fig. 6), the trajectories they perform as well as interval logic layers for showing temporal relationships between different events. In order to exchange information between the semDT and users, openEASE serves as an intuitive interactive interface providing information about the current state of the store and facilitating Q&A queries. Responses can be rendered in the form of physical or abstract entities to assist humans as well as robots in everyday retail tasks as will be demonstrated in the use case section.

Reasoning About Action Information

A main advantage of the ontologies used in KnowRob is their integration of activity knowledge from the SOMA ontology (Beßler et al. 2020). In combination with the scene graph based on object information from the EASE framework (Bateman et al. 2017) and task descriptions from the DOLCE+DnS Ultralite ontology (Gangemi et al. 2002), KnowRob can reason about performed actions by different agents. This reasoning capability can also be accessed via the openEASE web interface.

Action information in KnowRob is stored as episodic memories (Bartels et al. 2019) based on general plan descriptions (Koralewski et al. 2019). These general plan descriptions follow an action and task hierarchy in which a stocktaking action performed by a robot can comprise different tasks like driving, positioning, and scanning. Figure 8 shows a general plan description for the action of looking at a product, a subaction of the stocktaking action. Figure 9 describes the plan for a navigation action to a destination.

Fig. 8. Plan description for LookingAt action

Fig. 9. Plan description for Navigation action

In addition to the action hierarchy, episodic memories can store agent information as participant or performer of an action, assign roles to objects for a certain action (e.g., shelf 2 in the general plan in Fig. 9 would be assigned the role destination), and log the times and durations of the performed actions.
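To make the structure of such a record concrete, the following Python sketch models an episode as a tree of actions with a performer, role assignments, and timestamps. It is an illustration under assumptions, not the KnowRob data model: the class Action, its fields, the role name target, and all example values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One node of a hypothetical episodic memory: an action with its subactions."""
    name: str
    performer: str
    start: float  # seconds since episode start (assumed unit)
    end: float
    roles: dict = field(default_factory=dict)       # role name -> object identifier
    subactions: list = field(default_factory=list)  # nested Action objects

    @property
    def duration(self):
        return self.end - self.start

    def walk(self, depth=0):
        """Yield (depth, action) pairs for this action and all nested subactions."""
        yield depth, self
        for sub in self.subactions:
            yield from sub.walk(depth + 1)


def objects_with_role(action, role):
    """Collect every object that was assigned the given role anywhere in the episode."""
    return [a.roles[role] for _, a in action.walk() if role in a.roles]


episode = Action("Stocktaking", "robot_1", 0.0, 42.0, subactions=[
    Action("Navigation", "robot_1", 0.0, 12.5, roles={"destination": "shelf_2"}),
    Action("LookingAt", "robot_1", 12.5, 20.0, roles={"target": "facing_7"}),
])

for depth, action in episode.walk():
    print("  " * depth + f"{action.name}: {action.duration} s, roles={action.roles}")
```

Keeping roles on the individual action rather than on the object reflects that the same shelf can be a destination in one action and play a different role in another.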

Figure 10 shows example action information about a recorded stocktaking episode by a robot. It shows that the stocktaking action is divided into various stocktaking tasks for the different shelves and shelf layers, respectively. All scanned facings are highlighted in green, and empty facings in yellow. Additionally, robot movement is logged as “AssumingArmPose” and “LimbMotion.” One can inspect each task and action for their duration as well as participants of the action or action effects. The event can even be replayed in an interactive simulation.

Fig. 10. Example action information in the semantic Digital Twin

Action information can be recorded not only for robots; the semDT can also store anonymous episodic memory information about shopping events using AR devices or sensors in the retail stores. Such information is a valuable resource for various use cases, such as optimizing robotic store assistants and the customer experience.

5 Semantic Digital Twin Use Cases in Retail Logistics

We demonstrate the applicability of a semantic Digital Twin in three use cases based on different users of the system. The semDT provides vast retail-relevant information through machine-understandable data formats and interfaces. This enables services including AR applications and robots to access information, such as poses of products and shelves or properties of objects like product dimensions, in order to perform tasks such as shelf replenishment.

As discussed in Sect. 2, store personnel and robotic store assistants aim at optimizing logistics such as product replenishment, which is demonstrated in use case 1. Use case 2 shows customer AR applications accessing the semDT for product-specific information like contained ingredients or awarded labels. Lastly, use case 3 demonstrates how a digital store can be visualized in a virtual environment, where store managers can simulate action effects.

5.1 Use Case 1: Replenishment Process

This use case demonstrates the benefits of a semantic Digital Twin from the perspective of store personnel and robotics using the example task of product replenishment.

5.1.1 AR Supported Replenishment

Product replenishment poses several challenges: It has to be performed accurately and regularly in each retail store. New employees need to learn product destinations before they are able to perform replenishment efficiently, and all store personnel need time to adapt to changing store layouts, seasonal product stands, and new offerings.

The semDT can be accessed by AR applications for store personnel to ease tasks such as replenishment and order fulfillment. Figure 11 depicts such an AR app, where store personnel can ask for the location of a product group as in “Where is dishwashing detergent?” The app locates the AR device in its environment using persistent digital environment anchors in the store. Then the camera view of the environment is augmented with digital waypoints that are aligned relative to these anchors, thereby leading the user to the product destination. The path is recalculated regularly to account for direction changes of the device. So that app users can distinguish waypoints from destination points, they are assigned different colors: green waypoints mark the path, while an orange waypoint marks the searched product destination, which in Fig. 11 is the shelf holding dishwashing detergent products.
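The waypoint logic can be sketched as follows. The chapter does not state which path planner the app uses, so this Python example swaps in a plain breadth-first search on a hypothetical occupancy grid of the store, where "#" marks blocked cells such as shelves. The grid, the cell coordinates, and the function name plan_waypoints are assumptions for illustration only.

```python
from collections import deque


def plan_waypoints(grid, start, goal):
    """Breadth-first search on a 4-connected grid.

    Returns a list of (cell, color) pairs from the cell after `start` up to
    `goal`, or None if the goal cannot be reached. Path cells are green,
    the destination cell is orange.
    """
    rows, cols = len(grid), len(grid[0])
    previous = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and nxt not in previous:
                previous[nxt] = cell
                queue.append(nxt)
    if goal not in previous:
        return None
    # Walk back from the goal to the start, then reverse.
    path = []
    cell = goal
    while cell is not None:
        path.append(cell)
        cell = previous[cell]
    path.reverse()
    return [(p, "orange" if p == goal else "green") for p in path[1:]]


STORE = [
    "....",
    ".##.",  # two blocked cells, e.g., a shelf
    "....",
]

waypoints = plan_waypoints(STORE, start=(0, 0), goal=(2, 3))
print(waypoints)
```

Recalculating the path after the device has moved amounts to calling plan_waypoints again with the newly localized cell as start.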

Fig. 11. AR routing for store personnel

5.1.2 Robotic Replenishment

Robotic store assistants can also contribute to product replenishment. They require reliable environment information for this task. The main benefit of the semantic Digital Twin is its connection between environment information in the form of a scene graph and semantic information in its symbolic knowledge base. As a proof of concept, this use case shows the replenishment process performed by a robot based on scene graph information as depicted in Figs. 2 and 12. The robot responsible for replenishment can access the scene graph information and query the semDT for all empty facings as shown on the left of Fig. 6. By accessing the symbolic knowledge base of the semDT, the robot can further reason about the products (e.g., their weight and size and how these properties affect the replenishment task) and the quantity of products that need to be fetched from the warehouse. Afterward, it can query the semDT for the respective product destinations as described in Fig. 7. Subsequently, the robot can be ordered to move to an item, localize the item using its camera, and collect the item with its gripper for replenishment purposes. Figure 12 shows a robot querying the semDT for empty facings in the simulation interface and accordingly performing the replenishment in a real store scenario.
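The step from empty facings to a fetch list can be sketched in a few lines of Python. This is a simplified illustration, not the robot's actual planning code: the facing records, the capacity field, the placeholder GTINs, and the rule that an empty facing is refilled to full capacity are all assumptions.

```python
# Hypothetical facing records as they might be returned by a semDT query.
FACINGS = [
    {"id": "facing_1", "gtin": "0000000000017", "capacity": 4, "count": 0},
    {"id": "facing_2", "gtin": "0000000000017", "capacity": 4, "count": 3},
    {"id": "facing_3", "gtin": "0000000000024", "capacity": 6, "count": 0},
    {"id": "facing_4", "gtin": "0000000000017", "capacity": 4, "count": 0},
]


def replenishment_plan(facings):
    """Group empty facings by product and sum up how many items to fetch.

    Assumption: an empty facing is refilled up to its full capacity.
    """
    plan = {}
    for facing in facings:
        if facing["count"] == 0:
            entry = plan.setdefault(facing["gtin"], {"quantity": 0, "facings": []})
            entry["quantity"] += facing["capacity"]
            entry["facings"].append(facing["id"])
    return plan


for gtin, entry in replenishment_plan(FACINGS).items():
    print(gtin, entry["quantity"], entry["facings"])
```

Product properties such as weight and size, which the text mentions as further inputs, could be used to split each quantity into several trips; that step is omitted here.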

Fig. 12. A robot identifying empty facings in simulation (left) and replenishing products in the store accordingly (right)

5.2 Use Case 2: Augmented Reality Shopping Assistant

This use case demonstrates the benefits of a semantic Digital Twin from a customer perspective in two AR applications on different devices: a HoloLens and a mobile phone.

5.2.1 HoloLens Application

Customers demand information and recommendations as presented in online shopping experiences. The semDT supports an AR app designed for the AR glasses HoloLens, highlighting individual preferences in a retail store. An example query that could be used in this AR application, asking for products that contain ingredients classified as hazardous to the environment, has been discussed in Sect. 4.3. In the same manner, AR applications can access the semDT and query for all products that contain preservatives, as shown on the left of Fig. 13. All products of the store are aligned relative to shelves using scene graph information and environment anchors as in use case 1. If a customer checks a preference in the menu, all products that contain the ingredient are overlaid with a red “X,” indicating that the product behind the overlay contains the ingredient and is not intended to be bought. This allows customers to inspect all remaining products.
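The preference filter behind the overlay can be sketched as a lookup through an ingredient classification. The Python example below is illustrative only: the product names, ingredient lists, and class labels are hypothetical stand-ins for what the semDT ontologies would provide.

```python
# Hypothetical ingredient classification (ingredient -> class).
INGREDIENT_CLASS = {
    "sodium benzoate": "preservative",
    "potassium sorbate": "preservative",
    "water": "solvent",
    "glycerin": "humectant",
}

# Hypothetical products with their ingredient lists.
PRODUCTS = {
    "shampoo_a": ["water", "sodium benzoate"],
    "soap_b": ["water", "glycerin"],
    "lotion_c": ["water", "glycerin", "potassium sorbate"],
}


def products_to_overlay(products, ingredient_class, unwanted_class):
    """Return the products that contain at least one ingredient of the unwanted class."""
    return sorted(
        name
        for name, ingredients in products.items()
        if any(ingredient_class.get(i) == unwanted_class for i in ingredients)
    )


# Products that would receive the red "X" when the customer deselects preservatives.
print(products_to_overlay(PRODUCTS, INGREDIENT_CLASS, "preservative"))
```

Filtering by ingredient class rather than by ingredient name means a customer selects one preference and the classification decides which concrete substances it covers.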

Fig. 13. AR apps to highlight consumer preferences like ingredients, label, or hazard information

5.2.2 Mobile Phone Application

The semDT can just as well be accessed by other devices. We present a second AR app running on a mobile phone, using object recognition to identify products in the store. If customers are interested in a product, they can point the camera toward it. Once the product is identified, relevant product information like awarded labels and hazard information is displayed, as depicted in the middle and on the right of Fig. 13. If customers are interested in further information, they can swipe the product name that appears at the bottom of the screen, and a second screen with additional product information is displayed.

5.3 Use Case 3: Digital Store Visualization and Robot Simulation

This use case demonstrates the benefits of a semantic Digital Twin from the perspective of a store manager to visualize store information and software developers to simulate robot behavior.

5.3.1 Semantic Digital Twin Visualization

SemDT users like store managers, who want to investigate available product information in order to, e.g., optimize product placement, face the challenge of imagining the appearance of the store. They require a mechanism to spatially represent and visualize different states and configurations of the store layout in order to efficiently optimize logistic store processes such as product placement. Therefore, in addition to the web-based interface openEASE, the semDT provides a photo-realistic 3D visualization as an extension of the Virtual Reality scene graph, enabling users to interact with the semDT in an intuitive way. A store manager can visually inspect, compare, and analyze different store layout configurations or move articles and change layouts in different states of the semDT. Using Prolog queries, travel paths and trajectories can be visualized. In Fig. 2, we replicated the example store located at the University of Bremen. Figure 14 demonstrates a visualization of empty facings in the virtual semDT.

Fig. 14. Visualizing empty facings in the virtual semDT

Using the Prolog query interface, semDT users can also highlight information of interest as depicted in Fig. 15, where all products containing hazardous ingredients that are reachable by children are highlighted. This query builds on the query for hazardous ingredients shown in Fig. 5 and keeps only those products that are placed below a given height and are therefore reachable by children.
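The combination of a semantic condition (hazardous ingredients) with a spatial condition (placement height) can be sketched as follows. This Python example is a hedged illustration rather than the Prolog query of Fig. 15: the product records, the hazardous flag, the height values, and the 1.2 m threshold are all hypothetical.

```python
# Hypothetical products: hazard flag from the knowledge base, height from the scene graph.
PLACED_PRODUCTS = [
    {"name": "drain_cleaner", "hazardous": True, "height_m": 0.4},
    {"name": "bleach", "hazardous": True, "height_m": 1.7},
    {"name": "baby_shampoo", "hazardous": False, "height_m": 0.6},
    {"name": "oven_spray", "hazardous": True, "height_m": 1.1},
]


def hazardous_and_reachable(products, max_height_m):
    """Keep products that are hazardous and placed at or below the given height."""
    return sorted(
        p["name"]
        for p in products
        if p["hazardous"] and p["height_m"] <= max_height_m
    )


# 1.2 m is an assumed reach height, chosen only for this example.
print(hazardous_and_reachable(PLACED_PRODUCTS, max_height_m=1.2))
```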

Fig. 15. Highlighting “hazardous products reachable by children” in the virtual semDT

5.3.2 Robot Simulation

Simulation is crucial for the development of novel retail applications. For instance, software developers of robot applications might have no or only limited access to retail stores. Furthermore, robotic applications tend to be complex systems, consisting of perception, localization, and control frameworks, among others. By offering interfaces equivalent to the real robot interfaces, it is possible to execute and test the application in the virtual semDT before deploying it in the retail store. Examples of different robot simulations are shown in Figs. 1 and 16. Figure 16 simulates a robot perceiving objects on a shelf layer, whereas Fig. 17 shows the real robot perceiving and picking the same objects in a retail store.

Fig. 16. Robot simulation with semDT

Fig. 17. Real robot perceiving and picking objects based on simulation results

6 Conclusion

In this chapter, we introduced the semantic Digital Twin for Retail Logistics as a semantically enhanced digital representation of a retail store. It connects environment information in a scene graph with semantic product information in an ontology-based symbolic knowledge base, which allows for visualization, simulation, and complex reasoning tasks. This chapter outlines the key components and potential benefits of a semDT. With the semDT, store personnel can derive product destinations and implement rules for inferring the optimal order of product replenishment. Customers can use the semDT AR applications on various devices like HoloLens or mobile phone for an enhanced shopping experience, perceiving extensive product information as well as visual and tactile sensation. Store managers can access the semDT for layout inspection or optimization, and software developers in the field of robotics can simulate robot behavior in the semDT.

Another contribution of this chapter is a demonstration of the capabilities of the semDT. The semDT ontologies derived from various sources allow for complex reasoning, which was highlighted in example queries from different platforms and applications. We demonstrated the applicability of the semantic Digital Twin in three use cases focusing on different users of the system (i.e., store personnel, manager, customer as well as software developer) accessing the semDT knowledge via various devices.

7 Future Directions

The proposed semantic Digital Twin aims at providing mechanisms that offer further transparency for the logistic retail processes of a store; further development and extensions of semDT capabilities are sought. Besides implementing semDTs for different retail chains, like drugstores, grocery stores, and bookstores, which are able to afford high investments in digital infrastructure, we also aim at developing semDTs for diverse and small retail businesses like rural corner shops. Since the semDT ontologies are created in a modular fashion, they can be used in different applications and can be integrated into online shops, which will be implemented in a future version of the semDT. Another future direction is the interconnectivity of semDTs, which would enable knowledge transfer among different retail domains, for example to optimize logistic processes.