1 The Future of Assembly

Assembly accounts for up to 44 % of production costs and 70 % of production time; it is therefore an essential step in the production process chain and strongly affects production efficiency (Lotter and Wiendahl, 2013). It has been postulated for several decades that conventional, fixed-coupled assembly systems are reaching the limits of their ability to adapt to dynamic changes, such as fluctuating market requirements and the production of customer-specific products down to batch size one (Hu, 2013). Fluctuations in supply chains and production capacity resulting from global crises increase the pressure for sustainable and crisis-resistant production (Hiscott et al., 2020). The introduction of sustainable product portfolios, like the transition from combustion-powered vehicles to electric vehicles, leads to expensive assembly reconfigurations (Hubik, 2021). The resulting paradigm shift to dynamically coupled assembly systems promotes and requires the design of flexible and adaptive systems capable of addressing individual assembly sequences and responding resiliently to changing conditions (Hüttemann, 2021). Assembling parts to create individualized products results in a high degree of complexity, primarily caused by the total number of product variants (Asadi et al., 2016). Adaptability can be seen as fundamental to the success of a company and is enabled through Industry 4.0 (I4.0) technologies (Lanza et al., 2018). Shifting from the Industrial Internet of Things (IIoT) and I4.0 to the Internet of Production (IoP) will enable a holistic, cross-domain network and linkage of currently stand-alone industrial technologies, revealing and harnessing the interdependencies of previously separate production steps and technologies.

The paradigm of Line-less Mobile Assembly Systems (LMAS) provides a possible realization of adaptable and flexible assembly and is based on three principles:

  1. clean floor approach: the assembly operation is executed on a fixture-less and free space, which allows for free placement of assembly resources,

  2. mobilization of all physical resources (robots, parts, tools), allowing free and autonomous formation of assembly stations, and

  3. dynamic planning and control creating suitable assembly stations and optimizing schedules, job-routes, and task allocation in dependence on demand and objective function.

LMAS are characterized by a dynamic sequence of operations, which is not fully predetermined for most products, requiring sophisticated planning of scheduling, task allocation, as well as formation and trajectory planning of the mobile robots, to enable efficient operation (Buckhorst et al., 2019).

In the following, we first conceptualize modular levels and layers to operate LMAS highlighting included concepts of the IoP (Sect. 2) by summarizing our former research results. This is followed by detailed research focal points (cf. Fig. 2) and future research directions of the modules.

2 Modular Levels and Layers for LMAS Operation

Creating an operating future assembly system following the paradigm of LMAS requires a connection of currently stand-alone industrial production steps and technologies. The necessary cross-domain network follows the main principles of the IoP, namely, the creation of a World Wide Lab and an interrelated network of digital shadows (Brauner et al., 2022). Accordingly, realizing the operation of LMAS requires answering the following research question: How can efficient decision-making, modular field, and process-level control in human and machine factory operation be realized and combined with locally and globally available information (services) and distributed computing capacities, so that a real-time-capable response behavior of an LMAS results? Answering this research question contributes a modular collaboration of cyber-physical and virtual devices to the IoP by combining domain-specific benefits and expertise of production engineering with data analytics and human factors.

This research group has identified necessary functional blocks (“Holons”) as well as communication, authentication, and safety layers to operate LMAS and structured these into a framework. The resulting framework maps the holons into a holarchy (Buckhorst et al., 2021). In particular, this holarchy defines a semantic framework in which the processes, resources, technologies, and planning steps can be integrated as dedicated holons and related to each other via interfaces and layers, as can be seen in Fig. 1.

Fig. 1 Holarchy of the Future Assembly including highlighted research focal points (based on Buckhorst et al. 2021)

Fig. 2 Linking research focal points for operating LMAS with research objectives from the IoP (in blue) exemplified on a scene from a truck assembly

In the contribution at hand, we detail research results and future research directions of particular holons, based on the application scenario of a truck assembly, as visualized in Fig. 2. Beginning with the macroscopic level of formation planning, Sect. 3 gives an overview of autonomous decision-making involving capability-based digital shadows to realize planning of the spatial arrangement (“formation”) of heterogeneous hardware resources in an assembly station (cf. Fig. 2, upper right). This is followed by the mesoscopic level of mobile robot control. Section 4 discusses different possibilities for system modeling of commonly utilized robotic manipulators in production lines and introduces appropriate motion planning algorithms for such systems (cf. Fig. 2, bottom left). Thereafter, the microscopic level of services, such as in-network computing, interpreting autonomous decisions, and input devices, is detailed. Section 5 addresses the identified need for fast control algorithms by proposing the deployment of In-Network Computing (INC) in industrial environments, outlining major challenges that need to be overcome (cf. Fig. 2, bottom right). Subsequently, Sect. 6 focuses on extracting relevant information from images using interpretable features learned by generative Deep Learning (DL) methods (cf. Fig. 2, bottom center) before Sect. 7 addresses the question of how human-machine interfaces (HMI) should be designed to meet the requirements of usability, production process flexibility, and accounting for robot autonomy in the context of human-robot collaboration (HRC) (cf. Fig. 2, top left). Lastly, Sect. 8 presents an approach to developing an assistance system for knowledge-based control process configuration (cf. Fig. 2, far right). The research focal points of this chapter are embedded into the following fields of research in the framework of the IoP:

  • Assembly planning and control (Planning Layer, Resource Layer)

  • Description models and digital twins (Technology Layer)

  • Intelligent computation methods (Technology Layer, Resource Layer)

3 Toward Modular Station-Level Control Through Formation Planning of Mobile Robots

Reacting to ever-changing demands regarding production volumes and product mix and production disruptions, the assembly Planning Holon (cf. Fig. 1) regularly recalculates assembly station compositions and placements to execute the allocated assembly tasks, resulting in constantly changing assembly stations (Buckhorst et al., 2022; Kluge-Wilkes and Schmitt, 2021a). Ideally, the Station Holon – controlling the assembly stations’ operation as a module of the assembly – reconfigures the station to allow for optimal assembly operation in dependence on the allocated assembly tasks. Station Holons control a defined set of multipurpose resources (like mobile robots or sensors) on a defined area in the assembly (the assembly station) by allocating tasks to resources and planning the station formation (formation: temporal-spatial layouts of mobile assembly resources in relation to each other and their surroundings).

Present research in the field of mobile robotics mostly focuses on optimizing control algorithms (e.g., in Sect. 4) for single robots or the multi-SLAM (simultaneous localization and mapping) problem. In both fields, the goal poses of the robots are assumed to be known. But how are these goal poses determined in the first place? There is a high number of possible goal poses for robots allowing the execution of allocated tasks and accordingly a high number of station formations, necessitating a means of evaluating formations with regard to executability of tasks to select optimized formations. Such an evaluation as a base for formation planning closes the gap between high-level factory planning and low-level robot control.

As a first step, a standardized form of describing resources (here: robots), capabilities, and tasks must be found to realize an allocation of tasks to resources. We developed the CAPability-based resource AllocatioN Ontology (CAPILANO) to describe and match the capabilities required to execute the assembly tasks (like screwing, transporting, or welding) with the capabilities the robots offer (Kluge-Wilkes, 2022). Because the allocation in CAPILANO is based on a theoretical model of the capabilities needed to perform a task, the spatial executability of allocated tasks needs to be evaluated subsequently. For an overview of evaluation criteria for the executability of a task, see Kluge-Wilkes and Schmitt (2021b). By evaluating the workspace according to the quantity of orientations reachable by the robot flange at discrete points, a so-called Reachability Map is generated (Dong and Trinkle, 2015). Since this method currently only includes the reachability with regard to the robot flange and excludes tools and equipment (Makhal and Goins, 2018), the inclusion of tool dependence on reachability is investigated in the following.
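The matching idea can be illustrated with a minimal sketch. It is not the CAPILANO ontology itself: the classes `Resource` and `Task`, the function names, and the capability label `gluing` are hypothetical, and capabilities are reduced to plain string sets. A task is considered allocatable to a resource if the required capabilities are a subset of the offered ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A mobile assembly resource and the capabilities it offers (hypothetical model)."""
    name: str
    capabilities: frozenset


@dataclass(frozen=True)
class Task:
    """An assembly task and the capabilities it requires (hypothetical model)."""
    name: str
    required: frozenset


def candidate_resources(task, resources):
    """Return the names of all resources whose capabilities cover the task's requirements."""
    return [r.name for r in resources if task.required <= r.capabilities]


def allocate(tasks, resources):
    """Map each task name to the list of capable resources (the list may be empty)."""
    return {t.name: candidate_resources(t, resources) for t in tasks}


robots = [
    Resource("robot_a", frozenset({"transporting", "screwing"})),
    Resource("robot_b", frozenset({"welding"})),
]
tasks = [
    Task("fasten_panel", frozenset({"screwing"})),
    Task("join_frame", frozenset({"welding"})),
    Task("glue_seal", frozenset({"gluing"})),  # no robot offers this capability
]
allocation = allocate(tasks, robots)
```

A task with an empty candidate list signals that no available resource offers the required capabilities. Whether a capable robot can also reach the task pose is a separate, spatial question.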

3.1 Tool-Dependent Reachability Measure

To incorporate the effects of tools and equipment on the feasibility of robots in performing a given task, we derive a reachability measure over the possible robot flange poses. First, the set of theoretically possible robot flange poses for performing the assembly task (visualized as yellow circle on the left of Fig. 3) has to be determined. Second, the practical reachability of those robot flange poses is evaluated as a function of the current base position of the robot (visualized as color scale on the right of Fig. 3) and expressed in a quantitative measure. The calculated measure depends on the Reachability Index of each of the identified theoretical robot flange poses: it is defined as the arithmetic mean of the Reachability Index over all theoretical flange poses of the respective task pose. In detail, the six process steps are as follows:

Fig. 3 Calculating the overlap of robot flange poses with a Reachability Map

  1. Determination of the dimensions and degrees of freedom (DoF) of the tool or equipment (“tool parameter”),

  2. calculation of the theoretical robot flange poses based on DoFs of the tool and tool center point goal pose,

  3. generation of the robot’s Reachability Map,

  4. overlap of the flange poses with the Reachability Map,

  5. calculation of the Reachability Index for each flange pose, and

  6. calculation of the average value of the Reachability Index.

To enhance the practicability of the process, a GUI was developed in which the respective parameters can be specified. The calculation runs in the background and outputs a visual representation of the evaluated workspaces as well as the quantitative reachability measure.

The process to determine the tool-dependent reachability measure is validated on a UR10 equipped with a screwdriver as a tool. Following Dong and Trinkle (2015), the Reachability Map of the UR10 is generated. The set of possible robot flange poses is calculated for a screwdriver with one degree of freedom, resulting in a circular arrangement of robot flange poses in which the screwdriver’s tool center point would reach the goal pose (visualized as yellow circle on the left of Fig. 3). Each flange pose is assigned the value of the closest point of the Reachability Map in the Euclidean sense.
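Steps 2 and 4 to 6 of the process can be sketched as follows. This is a simplified illustration and not the implementation behind the GUI: it assumes that the one-DoF tool rotates about the z-axis, considers flange positions only (no orientations), and uses a toy Reachability Map stored as a dictionary from grid points to index values. The names `flange_positions`, `nearest_index`, `reachability_measure`, and `tool_offset` are hypothetical.

```python
import math


def flange_positions(tcp, tool_offset, n=36):
    """Sample flange positions on a circle of radius `tool_offset` around the
    tool center point (TCP), modelling a tool with one rotational DoF about z."""
    x, y, z = tcp
    return [
        (x + tool_offset * math.cos(2 * math.pi * k / n),
         y + tool_offset * math.sin(2 * math.pi * k / n),
         z)
        for k in range(n)
    ]


def nearest_index(position, reachability_map):
    """Return the Reachability Index of the map point closest to `position`
    in the Euclidean sense."""
    nearest = min(reachability_map, key=lambda point: math.dist(point, position))
    return reachability_map[nearest]


def reachability_measure(tcp, tool_offset, reachability_map, n=36):
    """Arithmetic mean of the Reachability Index over all sampled flange positions."""
    values = [nearest_index(p, reachability_map)
              for p in flange_positions(tcp, tool_offset, n)]
    return sum(values) / len(values)


# Toy map on a 0.1 m grid (assumed values): index 1.0 for x >= 0, index 0.0 for x < 0.
toy_map = {
    (i / 10, j / 10, 0.0): (1.0 if i >= 0 else 0.0)
    for i in range(-5, 6)
    for j in range(-5, 6)
}
measure = reachability_measure(tcp=(0.0, 0.0, 0.0), tool_offset=0.3,
                               reachability_map=toy_map, n=36)
```

In this toy setting roughly half of the circle lies in the region with index 1.0, so the measure ends up near 0.5. Moving the robot base shifts the map relative to the circle and changes the measure accordingly.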

3.2 Outlook

Formation planning in assembly stations aims at the capability-based allocation of tasks to resources and the initiation of a spatial formation of all resources. As summarized above, we developed a description model, implemented a capability-based task allocation, and included an evaluation of task executability in the environment of the robot. Pending is the derivation of base placements of the robots in dependence on the executability of the allocated tasks to derive a formation. The goal is to transfer the finalized formation (consisting of allocated assembly poses, robots, and base placements) to the next module of motion planning, as presented in Sect. 4.

4 Consensus and Coordination in Sensor-Robot Network

Due to their large workspace and high manipulation capabilities, open-chain robotic manipulators with high DoF (e.g., six DoF for static manipulation tasks or nine DoF for mobile manipulation) are commonly used in assembly. Therefore, the Robot Holon and its Sensor Holons are central parts of the Resource Layer of our holarchy introduced in Fig. 1. The interaction of these holons with the Process Holon of the Process Layer controls the automated execution of operation in LMAS.

Taking the task allocation and base placement of robots from Sect. 3 as input, the next step is to plan the robot motions that execute the allocated tasks. Although deciding on the appropriate motion is natural for human workers involved in the assembly process, it can be challenging for automated robotic systems. Robotic systems should be able to adapt their motions to ongoing changes in the assembly, developing a natural behavior, i.e., creating a safe and predictable environment for their human “counterparts.” For reliable motion planning and motion control algorithms, we must consider the changes of the environment. Robotic manipulators utilized in collaborative environments should be able to employ online motion planning, i.e., the manipulators should be able to quickly react to changes of the environment. This leads to the development of the real-time-capable response behavior of the systems of an LMAS.

In this context, powerful algorithms have been developed for various aspects of continuous and discrete planning, for instance, by Biagiotti and Melchiorri (2008), LaValle (2006), and Lindemann and LaValle (2005). Most of these motion planners are developed for robotic systems with low-dimensional configuration spaces, e.g., planar systems, or mobile robotic platforms. Open-chain robotic manipulators, however, are normally of high (generally at least six) DoF, to handle the tasks defined in the six DoFs of the real environments. Thus, the applicability of these algorithms in dynamic environments, such as LMAS, is rather limited.

4.1 System Modeling

The main hurdle in developing algorithms that enable the integration of open-chain robotic manipulators in LMAS, i.e., the motion planning algorithms that enable quick responses to changes in the robot’s environment, is system modeling. The conventional modeling procedure of the 6D Task space (T-space) of robot manipulators exhibits representation singularities, i.e., the representation of the orientation in some combinations of the Euler angles leads to ambiguities, so that a unique derivation of the initial combination of the angles is not possible, which hinders the planning of a unique path. To enable the motion planning for robot manipulators with online properties, i.e., to react to changes in the environment during planning, it is advisable to use singularity-free modeling approaches for the systems, e.g., the modeling approaches based on Lie theoretic conventions as introduced by Müller (2018) and Lynch and Park (2017) or developed over the ring of dual quaternions by Shahidi et al. (2020). This type of system modeling not only enables a compact and singularity-free modeling of the systems in the T-space of the robotic manipulators but also results in a lower memory footprint for the calculation of the motion.
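The representation singularity can be made concrete with a small numerical sketch. It assumes the ZYX (yaw, pitch, roll) Euler convention, which is one of several conventions and not necessarily the one used in the cited works; the helper names are hypothetical. At a pitch of 90 degrees, two different angle triples produce the same rotation matrix, so the original angles cannot be recovered uniquely from the orientation.

```python
import math


def rotation_zyx(yaw, pitch, roll):
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll) as nested lists."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def max_abs_difference(a, b):
    """Largest element-wise absolute difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


# At pitch = 90 degrees only the difference (roll - yaw) matters, so these two
# distinct angle triples describe the same orientation.
first = rotation_zyx(yaw=0.3, pitch=math.pi / 2, roll=0.1)
second = rotation_zyx(yaw=0.5, pitch=math.pi / 2, roll=0.3)
difference_at_singularity = max_abs_difference(first, second)

# Away from the singularity the same (yaw, roll) pairs give different orientations.
third = rotation_zyx(yaw=0.3, pitch=0.2, roll=0.1)
fourth = rotation_zyx(yaw=0.5, pitch=0.2, roll=0.3)
difference_regular = max_abs_difference(third, fourth)
```

Here `difference_at_singularity` vanishes up to floating-point rounding, while `difference_regular` does not. Rotation-matrix, Lie-group, or (dual) quaternion representations avoid this ambiguity because they do not parameterize the orientation by three angles.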

4.2 Motion Planning Algorithms

Optimal system modeling alone only partially addresses the problem of online motion planning for open-chain robotic manipulators. The majority of motion planning algorithms are either developed in the Configuration space (C-space) of the systems or for systems that have similar C-space and T-space, such as non-holonomic mobile robotic systems, like those presented in LaValle (2006) and Koenig and Likhachev (2005). However, the C-space and the T-space of robotic manipulators are generally of different cardinalities. Moreover, the forward kinematics function for open-chain robotic manipulators is a non-injective surjective function. These facts are not taken into account when the sampling process for the sampling-based planning algorithms is performed in the C-space of the system, as is common in the state of the art. Hence, the direct adaptation of the successfully developed algorithms for dynamic environments, e.g., by Koenig and Likhachev (2002), to open-chain robotic manipulators is only possible to a very limited extent. In recent research by Shahidi et al. (2022), we have developed a novel algorithm that combines the information from the C-space and T-space of open-chain robotic manipulators and prepares an optimal graph structure, dubbed kinematic graph, to be utilized in sampling-based planning algorithms. The proposed algorithm can employ cost and heuristic functions from both spaces and thus facilitates motion planning that is optimal with respect to different criteria. Figure 4 illustrates qualitatively different configuration motions generated by the developed motion planner for a simple two DoF mechanism. It can be observed that the motion of the mechanism seems more natural when the manipulability of the mechanism is considered in the cost and heuristic functions in the planning process. Note that the computation of heuristics based on C-space information demands knowledge of the configuration of the mechanism at the goal posture; hence, the inverse kinematics function has to be evaluated. This can be problematic due to the non-injective surjective behavior of the forward kinematics function, i.e., the inverse kinematics function admits multiple solutions. The case where both the cost and heuristic function rely on the C-space information only is presented for demonstration purposes.
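The non-injectivity of the forward kinematics and the role of manipulability can be illustrated for a planar two DoF mechanism. The sketch below is not the kinematic-graph planner of Shahidi et al. (2022); the link lengths `L1` and `L2`, the target point, and all function names are assumptions chosen for illustration. The manipulability is computed as the absolute determinant of the Jacobian (Yoshikawa's measure for a square Jacobian).

```python
import math

L1, L2 = 1.0, 0.8  # assumed link lengths in meters


def forward_kinematics(q1, q2):
    """End-effector position of a planar two-link arm for joint angles q1, q2."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(x, y):
    """Return both joint-space solutions (the two elbow poses) for a reachable target."""
    cos_q2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    if abs(cos_q2) > 1.0:
        raise ValueError("target outside the workspace")
    solutions = []
    for q2 in (math.acos(cos_q2), -math.acos(cos_q2)):
        q1 = math.atan2(y, x) - math.atan2(L2 * math.sin(q2), L1 + L2 * math.cos(q2))
        solutions.append((q1, q2))
    return solutions


def manipulability(q2):
    """Absolute Jacobian determinant of the two-link arm; zero when the arm is stretched."""
    return abs(L1 * L2 * math.sin(q2))


target = (1.2, 0.6)
solutions = inverse_kinematics(*target)
reached = [forward_kinematics(q1, q2) for q1, q2 in solutions]
```

Both entries of `solutions` are distinct points in the C-space that map to the same T-space point, which is why a C-space heuristic first has to commit to one goal configuration. A planner could, for instance, add a penalty that grows as `manipulability` approaches zero to keep the mechanism away from stretched configurations; how the two terms are weighted is a design choice not prescribed here.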

Fig. 4 Motion of a two DoF mechanism based on different cost and heuristic functions that are considered in the motion planning problem. The trace of the end-effector of the mechanism and the configuration of the mechanism evolve from light blue to gray. (a) cost function: the Euclidean distance in the T-space; heuristic function: the Euclidean distance in the T-space. (b) cost function: the Euclidean distance in the C-space; heuristic function: the Euclidean distance in the C-space. (c) cost function: the combination of the Euclidean distance in the T-space and the manipulability of the mechanism; heuristic function: the combination of the Euclidean distance in the T-space and the manipulability of the mechanism. (d) cost function: the combination of the Euclidean distance in the C-space and the manipulability of the mechanism; heuristic function: the combination of the Euclidean distance in the C-space and the manipulability of the mechanism

Finally, a fast and time-efficient control scheme can be used to close the loop of system modeling, motion planning, and control and complete the online motion planning for the robotic systems (Shahidi et al., 2020). With the approaches of online motion planning and control in combination with a real-time vision system, the planning of the motion for the robotic systems in assembly can be carried out taking into account high safety and efficiency requirements. However, latencies arising during the compute process can still negatively impact the performance of the control algorithms, e.g., if control signals arrive too late at the robotic manipulators. Reducing the response times is thus a crucial aspect which can be achieved by choosing suitable compute locations close to the system, thus decreasing latency. In the following, we focus on how compute locations in the network might help.

5 Leveraging Distributed Computing Resources in the Network

Modern shop floors can leverage a multitude of distributed computing resources, ranging from on-premise (edge) deployments to remote cloud services. Choosing the best option from this spectrum critically depends on the concrete process requirements. For the aforementioned robot motion control (cf. Sect. 4), e.g., low response times are of highest importance as control signals arriving too late might present a danger to the safe working environment of the human workers. Consequently, motion control algorithms are best executed as close to the controlled robots as possible or even directly on them. However, computational capabilities in edge deployments are typically limited. Additionally, large volumes of information could be leveraged to influence the control decisions, ranging from data of sensors directly mounted on the robots to stationary sensors monitoring the work environments, such as optical sensors (cf. Sect. 6). In most cases, processing all available information on the robots themselves is not feasible due to their limited compute capacities. Similarly, sending all sensor information to central computing resources can be prohibitive, either in terms of too high communication latencies or in terms of data volumes that could overload the network. As a middle ground, a growing branch of research explores deploying sensible control functionality onto networking devices which can process high data volumes of several Tbps at sub-millisecond latencies. This in-network control can potentially provide the desired real-time-capable response behavior for robot control and LMAS in general.

5.1 Laying the Groundwork for In-Network Control

In-Network Computing (INC) has been enabled by latest innovations in networking technologies (Sapio et al., 2017). In particular, networking hardware can now be programmed using domain-specific languages, such as P4 (Bosshart et al., 2014), allowing for highly customizable data processing and filtering directly on the networking hardware. In the context of these advances, the possibility of deploying control functionality into the network has already been studied. For example, we have shown that offloading simple linear-quadratic regulator (LQR) controllers to networking devices can have benefits in settings with higher latencies (Rüth et al., 2018). Similarly, Cesen et al. (2020) show that deploying latency-critical tasks on networking hardware can improve reaction times of robot control scenarios compared to pure remote control. In addition to these direct applications to control tasks, we have focused on providing crucial building blocks for INC applications and control algorithms in general. In particular, we have demonstrated that simple image processing methods (Glebke et al., 2019), data transformation techniques (Kunze et al., 2021a), as well as signal phase detection and dynamic data pre-processing (Kunze et al., 2021b) are possible using INC. These results showcase the potential of INC, further diversifying the distributed computing resource landscape existing today. Thus, INC constitutes one part of our Technology Layer (cf. Fig. 1).
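One reason why an LQR controller is a plausible candidate for offloading is that, once the gain has been computed offline, the per-sample work reduces to a multiplication of the measured state by a constant. The following sketch illustrates this for a scalar discrete-time system. It is not the implementation of Rüth et al. (2018); the system parameters `a` and `b`, the weights `q` and `r`, and the function names are assumptions.

```python
def scalar_lqr_gain(a, b, q, r, iterations=200):
    """Iterate the scalar discrete-time Riccati recursion and return the LQR gain K
    for x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2."""
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def simulate(a, b, gain, x0, steps):
    """Closed-loop simulation; the online part is only the feedback u = -gain * x."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = -gain * x  # the only computation needed per received state sample
        x = a * x + b * u
        trajectory.append(x)
    return trajectory


a, b = 1.2, 1.0  # assumed open-loop unstable plant (|a| > 1)
gain = scalar_lqr_gain(a, b, q=1.0, r=1.0)
trajectory = simulate(a, b, gain, x0=1.0, steps=20)
```

The gain computation would run on a conventional host, and only the feedback line would need to be executed per packet. Real networking hardware typically imposes further restrictions, for example on arithmetic operations, which this sketch does not model.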

However, the direct applicability of these approaches to existing architectures, e.g., realistic robot control scenarios using the Robot Operating System (ROS), is questionable: In our work, we mostly rely on the User Datagram Protocol (UDP) and provide custom-tailored solutions, while Cesen et al. implicitly intercept ongoing Transmission Control Protocol (TCP) connections and perform opaque operations in the network that the central controller is not aware of. Whether this behavior is an acceptable practice and how INC should interact with transport protocols in general is still part of ongoing discussions (Kunze et al., 2021c). Similar questions also arise for many of the other related approaches that initially focused on identifying sensible application areas for INC (cf. Hauser et al. 2021). With growing maturity, research on in-network control and INC in general is shifting toward the development of frameworks that allow for the seamless integration into existing architectures, addressing some of the concerns raised above as well as additional criteria that we have collected (Kunze et al., 2022).

5.2 Toward Deployable In-Network Control

The fundamental challenge of using INC for control tasks in existing frameworks is the integration into today’s transport protocols. These protocols establish end-to-end connectivity between the devices, but typically expect the network to deliver packets without modifications (Kunze et al., 2021c). INC violates this assumption and is thus not directly compatible with many of the connection-oriented transport protocols, such as TCP (Stephens et al., 2021). While connectionless protocols, such as UDP, often allow for the desired changes to the packets, these approaches currently require manually defining the semantics of the INC operations and a corresponding manual adaptation of the application logic. Hence, deployment on larger scales is far from trivial. Consequently, there is a need for general frameworks that define standard interactions with INC functionality and, especially, how this functionality can be included in the transport protocol semantics.

Moving toward this goal, we envision implementing ROS-based control functionality using INC while respecting the semantics of existing transport protocols. In this context, it is important to note that ROS communication by default uses TCP, while a module providing UDP connectivity is not well maintained. Thus, INC-tolerant transport protocols are currently not available. Possible solutions are either using a novel, message-oriented protocol that is specifically designed for use with INC (Stephens et al., 2021) or adapting existing message-oriented protocols, such as UDP or the Stream Control Transmission Protocol (SCTP), for use in ROS with INC. Enabling this critical component for ROS-based communication will be key for deployment-ready robot control scenarios that leverage INC, e.g., for faster image processing (Glebke et al., 2019) to localize the robot on the shop floor.

While this concrete example will likely benefit from the significantly reduced processing times, other components of our overall system are far less latency-sensitive. For example, the aforementioned optical sensors can be used not only for robot control but also for monitoring and assuring the quality of assembled or produced components. In these settings, deployments that capitalize on INC are thus not required, and the respective approaches can also be deployed on other compute resources, consequently leveraging higher program complexity, as we will discuss next.

6 Trustworthy Vision Solutions Through Interpretable AI

As outlined in Sect. 4, resources such as mobile robots need to react appropriately to expected and unexpected events and must therefore understand and interpret their environment in a context-aware manner and in real time. While Sect. 5 concentrates on the infrastructural requirements to take data-driven decisions in time, this section addresses the Technology Layer of the underlying holarchy of LMAS (cf. Fig. 1) by investigating how to intelligently extract the relevant pieces of information from the image or video data acquired from optical sensor systems.

Vision sensors, such as cameras or triangulation sensors, are often used to acquire a precise digital representation of a resource’s surroundings or to perform quality control of assembled parts. Analyzing large amounts of images or point clouds and extracting the relevant pieces of information to make decisions is challenging due to the complexity of LMAS scenarios. Deep Learning (DL) promises to overcome these obstacles with a rich set of data-driven tools and techniques, many of which were successfully employed in the field of autonomous driving and operation of resources (Grigorescu et al., 2020). These methods, however, usually operate as black box models. For this reason, the underlying criteria of decisions made by these models remain unknown; thus, the inverse direction of assessing the actual properties of the input that caused a certain decision is not comprehensible. This leads to a general level of mistrust in decisions made by DL models and makes them inapplicable for an autonomous operation as required in LMAS.

6.1 Interpretable Machine-Learned Features Using Generative Deep Learning

The interpretation and explanation of DL models is an active field of fundamental research in machine learning (Fan et al., 2021; Selvaraju et al., 2017). Current industrial settings mainly use discriminative DL models that assign a decision boundary to a given dataset \(\mathcal{X}\) to divide new samples into a set of classes \(\mathcal{Y}\). Generative DL models approximate the distribution of the data \(p_{\mathcal{X}}(x)\) by means of a function \(\mathcal{G}_\theta : \mathcal{Z} \rightarrow \mathcal{X}\) that maps from a latent space \(\mathcal{Z}\) to the observation space \(\mathcal{X}\). The function is modeled by a neural network with parameters \(\theta\), which are inferred while training the model. After training, the generative model can be used to synthesize samples that possess the characteristics of real data samples, which presents an attempt to improve the interpretation of DL models. The properties of a latent vector \(z \in \mathcal{Z}\) and the effect of translations \(z' = \alpha z + \beta\), with \(z, \beta \in \mathcal{Z}\) and \(\alpha \in \mathbb{R}\), can be visualized by generating the corresponding image with the generative model. In this way, the latent space can be interpreted by means of characteristics of the data.

Style-based Generative Adversarial Networks (GANs) (Goodfellow et al., 2014; Karras et al., 2021) learn disentangled factors of variation \(z \in \mathcal {Z}\) allowing for the control of distinct characteristics of synthesized data samples, which leads to a higher level of interpretability of the machine-learned features. By adding a GAN inversion mechanism to the generative model, such as a dedicated encoder network mirroring the generation process as in the Adversarial Latent Autoencoder (ALAE) Framework (Pidhorskyi et al., 2020), real samples can be projected into the interpreted feature space, which allows for the assessment of the characteristics of these embedded samples. Through this procedure, \(\mathcal {Z}\) can be used, e.g., to interpret the properties that cause a certain decision by visualizing a corresponding counterfactual example (Lang et al., 2021).

6.2 Initial Implementation on a Synthetic Dataset

To investigate whether machine-learned features from generative models can be identified and associated with human-understandable image properties, we created an artificial image dataset containing 10,000 white, centered ellipses on a black background. The ellipses are fully characterized by three quantities: major axis length MA, minor axis length ma, and rotational angle ϕ. An ALAE with a style-based GAN was implemented using Python 3.8.5 and the PyTorch v1.8 framework following the code provided by Pidhorskyi et al. (2020). We have implemented and applied this framework, including a more detailed justification for metrology applications, in Schmitt et al. (2022). In this study, the model was trained up to a resolution of 32 × 32 pixels. Images of ellipses generated by sampling random vectors in the disentangled latent space \(\mathcal {W}\) confirm that the model is able to learn the data manifold (cf. Fig. 5 top left). To investigate whether the encoder network maintains the properties of the ellipses, we passed images through the encoder and reconstructed them using the generator (cf. Fig. 5 top right). The reconstructed ellipses preserve the properties of the input ellipses, although their edges are slightly blurred. One possible reason for this behavior might be that style-based GANs (used as the generator network for ALAE) apply moving average filtering during training, which attenuates high-frequency components in the image. To identify latent variables \(\xi_i\) corresponding to properties of the ellipses, we randomly sampled 1000 vectors \(z\in \mathcal {Z}\) and applied a principal component analysis (PCA) according to the procedure proposed in Härkönen et al. (2020). Figure 5 (bottom) depicts the effect of embedding an ellipse into \(\mathcal {Z}\) and varying the two components \(\xi_1\) and \(\xi_3\). The first principal component corresponds to a change of the area of the ellipse, while the third principal component represents a contraction and rotation of the ellipse.
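The synthetic dataset described above can be illustrated with a minimal sketch that rasterizes a centered, filled ellipse from the three quantities MA, ma, and ϕ. The function names, the parameter ranges, and the binary pixel encoding are assumptions made for illustration; they are not taken from the original implementation.

```python
import math
import random


def render_ellipse(major_axis, minor_axis, phi, size=32):
    """Return a size x size binary image (list of rows) with a centered, filled ellipse.

    major_axis and minor_axis are full axis lengths in pixels; phi is the
    rotation angle in radians.
    """
    center = (size - 1) / 2.0
    a = major_axis / 2.0  # semi-major axis
    b = minor_axis / 2.0  # semi-minor axis
    cos_p, sin_p = math.cos(phi), math.sin(phi)
    image = []
    for row in range(size):
        line = []
        for col in range(size):
            x, y = col - center, row - center
            # Rotate the pixel into the coordinate frame of the ellipse.
            u = x * cos_p + y * sin_p
            v = -x * sin_p + y * cos_p
            inside = (u / a) ** 2 + (v / b) ** 2 <= 1.0
            line.append(1 if inside else 0)
        image.append(line)
    return image


def sample_dataset(n, size=32, seed=0):
    """Draw n random ellipses; the parameter ranges are illustrative assumptions."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        major = rng.uniform(8, size - 4)
        minor = rng.uniform(4, major)
        phi = rng.uniform(0.0, math.pi)
        samples.append(((major, minor, phi), render_ellipse(major, minor, phi, size)))
    return samples
```

Each sample keeps its generating parameters next to the image, which makes it possible to compare latent directions with the known ground-truth properties later on.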

Fig. 5

Results of a preliminary toy implementation on a dataset of synthetic ellipses

The initial implementation evaluated on the synthetic ellipse dataset indicates that unsupervised methods such as PCA can be used to identify relevant characteristics of the data in the disentangled latent spaces of style-based GANs. As presented in Schmitt et al. (2022), the method is also capable of extracting interesting characteristics from industrial image datasets. In combination with the generative capabilities, these identified characteristics can be used in the Internet of Production to support humans, e.g., by providing them with explanatory images for the decisions of autonomous agents, such as those employed in LMAS.
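The idea of finding principal directions in a latent space and moving a sample along them can be sketched as follows. This is a dependency-free illustration under stated assumptions: the latent vectors are synthetic stand-ins rather than outputs of a trained model, only the first principal direction is computed, and power iteration replaces the library routine that would normally be used. All function names are hypothetical.

```python
import random


def mean_center(vectors):
    """Subtract the per-dimension mean from every vector."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return mean, [[v[i] - mean[i] for i in range(dim)] for v in vectors]


def covariance(centered):
    """Sample covariance matrix of mean-centered vectors."""
    n, dim = len(centered), len(centered[0])
    return [[sum(v[i] * v[j] for v in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]


def first_principal_direction(cov, iterations=200):
    """Power iteration: unit eigenvector belonging to the largest eigenvalue."""
    dim = len(cov)
    vec = [1.0 / dim ** 0.5] * dim
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    return vec


def move_along(latent, direction, step):
    """Shift a latent vector along a principal direction."""
    return [z + step * d for z, d in zip(latent, direction)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in latents: axis 0 has the largest variance by construction.
    latents = [[rng.gauss(0, 3.0), rng.gauss(0, 1.0), rng.gauss(0, 0.5)]
               for _ in range(1000)]
    _, centered = mean_center(latents)
    direction = first_principal_direction(covariance(centered))
    edited = move_along(latents[0], direction, step=2.0)
```

In a setting like the one described, the edited latent vector would be passed to the generator to visualize which image property the direction controls.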

7 Multipurpose Input Device for Human-Robot Collaboration

Due to the combination of the growing number of robots in the industry worldwide (International Federation of Robotics (IFR), 2018) and the increasing collaboration between humans and machines (Matheson et al., 2019), the work environment changes: The assembly of the future will be shaped by

  1. robots with high levels of autonomy – enabled by technologies such as those presented in Sects. 3, 4, 5, and 6 – and

  2. human-robot collaboration (HRC) in areas where human work is irreplaceable.

Understanding the behavior of machines – e.g., through explainable AI (cf. Sect. 6) – is an important factor for the acceptance of machines in HRC by humans.

Krupitzer et al. (2020) provide an overview of the state of the art of human-machine interfaces (HMI) in the Industry 4.0 domain. Where the workspaces of human and machine are merging, new, more flexible, and more ergonomic operating concepts for machines are necessary. With the increasing number of (different) machines in combination with their growing range of functions utilized in production, a new generation of input devices is needed that enables operators to control different machines, if necessary simultaneously. The resulting HMI allows the Human Holon (cf. Fig. 1) to communicate with other holons of the assembly station, namely, Robot Holon, Sensor Holon, and Product Holon. Applied to the LMAS, such an HMI enables the integration of a human worker: When a station is formed at which a worker is to perform tasks, all robots and tools can be accessed ergonomically and seamlessly without media discontinuity. The overarching research question to be answered is how an HMI can be designed to allow a human to operate several different types of machines focusing on

  1. 1.


  2. 2.

    workload, and

  3. 3.


7.1 Application, Implementation, and Result

The assembly of a truck is chosen as an application scenario: An overhead crane carries the drive train, and an automated guided vehicle (AGV) provides the vehicle frame onto which the drive train is to be mounted. A robot arm assists with the positioning. Currently, each machine is operated by one worker, and an additional worker supervises, secures, and instructs. For Future Assembly, a multipurpose input device shall be developed that enables a single worker to handle all listed tasks.

There are two perspectives to this problem:

  1. From the technical point of view, the components to be assembled are large and heavy and move dynamically. Therefore, the handling is nontrivial.

  2. From an ergonomic point of view, time pressure as well as the rapid and frequent changes between different parallel tasks (multitasking) are exhausting for humans. This leads to fatigue, which can cause errors. The challenge is therefore to maintain situational awareness and to keep the workload, especially the cognitive workload, in an optimal range.

The goal is to develop an input device that enables a single operator to control these different machines (overhead crane, AGV, and robotic arm) with little or no training. Another aspect is to use available data to assist the operator in performing the task and to provide information to the operator. This requires considering not only the technical but also the human factor.

VDI 2221 was chosen as the design methodology. The detailed design procedure is described in Baier et al. (2022).

Figure 6 shows schematically how the input device for simultaneously controlling different machines is designed. A prototype of the input device is under development: The design features a lightweight, wireless wearable. This allows the operator to move freely during the process and relative to the machines, which is a distinct advantage during HRC. The input device functions as an HMI, i.e., as a dedicated intermediate layer between the human and machine domains. Mechanical inputs – in the form of movements or button presses – are received from the operator, processed on the device, and sent to the control unit via Wi-Fi. Movements are captured by a 3-axis acceleration sensor and a 3-axis gyroscope, which enables the device to control 6 DoF. In addition, there is a touch-sensitive element on each finger.
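The on-device processing step can be pictured with a small sketch that combines three translational and three rotational sensor readings into one 6-DoF input and serializes it for transmission. The deadband thresholds, the message layout, and all function names are assumptions for illustration only; the actual signal processing of the prototype is not specified here.

```python
import json


def apply_deadband(value, threshold):
    """Suppress small sensor readings so that hand tremor does not move a machine."""
    return 0.0 if abs(value) < threshold else value


def to_six_dof(accel, gyro, accel_deadband=0.5, gyro_deadband=0.1):
    """Combine 3 translational and 3 rotational readings into one 6-DoF input."""
    linear = [apply_deadband(a, accel_deadband) for a in accel]
    angular = [apply_deadband(g, gyro_deadband) for g in gyro]
    return {"linear": linear, "angular": angular}


def encode_message(machine, six_dof):
    """Serialize the input for transmission to the control unit (hypothetical format)."""
    return json.dumps({"machine": machine, **six_dof}, sort_keys=True)
```

A deadband is only one of several possible filters; it is used here because it keeps the example short.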

Fig. 6

Schematic design of the multipurpose input device

Interaction concept

Few commands are sufficient for operation, which is why the interaction concept provides for comparatively simple control with hand movements and gestures. A decisive factor for the safety of the control system is error robustness, so that unintentional inputs are not executed or at least do not cause any damage. Thus, inputs can only be made while a designated touch element is touched, which also doubles as a dead man’s switch. As mentioned before, inputs are evaluated depending on the context and are either interpreted as commands or discarded. For this purpose, the input device first detects the movements and then processes these signals into inputs. Depending on the machine currently being controlled, valid commands are recognized from these inputs by matching the recognized pattern with the command set stored for the machine in advance. In addition, assistance systems based on the digital shadow are designed to increase safety.
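The context-dependent interpretation of inputs described above can be sketched as a lookup that is gated by the dead man's switch. The gesture names, command names, and the `InputEvent` structure are hypothetical and serve only to illustrate the matching of a recognized pattern against the command set of the active machine.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical command sets; gesture and command names are illustrative only.
COMMAND_SETS = {
    "crane": {"swipe_up": "hoist", "swipe_down": "lower", "hold": "stop"},
    "agv": {"swipe_left": "turn_left", "swipe_right": "turn_right", "hold": "stop"},
    "robot_arm": {"rotate_cw": "roll_positive", "rotate_ccw": "roll_negative"},
}


@dataclass
class InputEvent:
    gesture: str          # pattern recognized from the motion signals
    enable_touched: bool  # state of the touch element acting as dead man's switch


def interpret(event: InputEvent, active_machine: str) -> Optional[str]:
    """Map an input to a command for the active machine, or discard it (None)."""
    if not event.enable_touched:
        return None  # dead man's switch released: ignore all inputs
    commands = COMMAND_SETS.get(active_machine, {})
    return commands.get(event.gesture)  # gestures outside the command set are discarded
```

For example, `interpret(InputEvent("swipe_up", True), "crane")` yields `"hoist"`, whereas the same gesture is discarded when the AGV is the active machine or when the touch element is released.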

7.2 Outlook

Simulation studies

First, pre-studies will be conducted in the simulator to determine optimal settings for the device. This will be followed by further simulation studies of usability and performance for abstract tasks, and the simulated use case will be investigated. The outcomes will be validated by a physical laboratory study on an AGV with a manipulator.

Use of the digital shadow

To keep the (cognitive) workload for the operator low, an assistance system will be created to support the operator by automating tasks that do not require human intervention. Digital shadows of previous assembly operations will be used as a source of information. For example, the system could automatically follow previous trajectories in noncritical areas to free the operator from this task or, as a safety measure, compare the current trajectory with previous trajectories and warn the operator if the deviation is too large. Behavior trees are to be used as the data structure for this modeling, since they can represent process steps both discretely and simultaneously.
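The safety check mentioned above, comparing the current trajectory with trajectories from previous assembly operations, can be sketched as follows. The nearest-point distance used here is a deliberate simplification, and the function names, tolerance value, and units are assumptions rather than part of the planned assistance system.

```python
import math


def max_deviation(current, reference):
    """Largest distance from any current point to its nearest reference point."""
    return max(min(math.dist(p, q) for q in reference) for p in current)


def check_trajectory(current, previous_runs, tolerance):
    """Compare against each earlier run; warn if even the closest run is too far away."""
    best = min(max_deviation(current, run) for run in previous_runs)
    return {"deviation": best, "warn": best > tolerance}
```

A call such as `check_trajectory(current, previous_runs, tolerance=0.2)` returns the smallest deviation found and a flag that could trigger the operator warning.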

Section 8 is focused on the support of the user by the system as well. All relevant information is presented to the user through an interface. This information is collected and processed from various sources using an ontology.

8 Ontology-Based Knowledge Management in Process Configuration

Another challenge of human-machine collaboration is creating a unified understanding of the existing relationships between process parameters. In complex assembly systems like LMAS, where the number of process parameters is high, there is no trivial solution for understanding process relationships and dependencies. Accounting for this, the Model Composition domain in the Technology Layer of the Future Assembly Holarchy (cf. Fig. 1) can be realized by ontology-based knowledge management. To demonstrate the benefit of knowledge-based assistance systems in LMAS, we modeled the influence parameters of the side window assembly in automotive body assembly, where industrial robots are used for adhesive application. The quality of the application is crucial for the stability of the subsequent bonding and depends on a number of adhesive and process parameters, such as flowability, bead cross-section, travel, and adhesive exit speed. Correct adjustment of the process parameters requires precise knowledge of the complex relationships between the process and adhesive parameters and their effect on upstream and downstream process steps. Thus, it is difficult for the operator to find suitable control parameters in the event of a process change.

Semantic technologies offer great potential for solving two main challenges of process parameter configuration (Lipp and Schilling, 2020; Sahlab et al., 2021): on the one hand, to create a complete system knowledge base including expert knowledge and, on the other hand, to assist in the search for configuration solutions. Due to a graph structure consisting of a large number of connected nodes or data points, ontologies enable a more flexible modeling of existing data relationships than tabular databases or hierarchical class diagrams. The set of nodes can be expanded as desired, which allows the networking of individual data domains to be modeled and the integration of existing expert knowledge. Furthermore, concepts of graph theory are particularly well-suited for solving optimization problems in which pairs of objects are related (Dengel, 2012). Moreover, ontologies can be based on already existing data management systems, so that no complete remodeling of existing data is necessary (ontology-based data access, OBDA).

8.1 Concept and Implementation

An ontology-based configuration tool can consolidate already existing product and process information, expand it with expert knowledge, and uncover new knowledge connections by creating new relationships between the data points of distributed process data resources such as data models and control units. The generated knowledge base can be used as the basis for assistance solutions to optimize process configuration and, thus, shorten the planning and configuration time. This contribution presents an approach to develop an assistance system for knowledge-based control process configuration.

Figure 7 shows the intended approach for designing an assistance system for knowledge-based control process configuration. An essential aspect of the approach is the conception of the knowledge management system.

Fig. 7

Concept of an assistance system for knowledge-based control process configuration

Based on the specific process requirement description, we carry out the model specification of the knowledge management system (A). For this purpose, we analyze existing approaches for semantic modeling of robot-based processes, such as CORA (Core Ontology for Robotics and Automation), MARCO (Manufacturing Resource Capability Ontology), and SUMO (Suggested Upper Merged Ontology), and use them as a system basis (Brecher et al., 2021; Prestes, 2013). Subsequently, we examine how the system can be connected to process-internal data sources (B and C). Of great importance is the evaluation of how the developed process interface can be integrated with the OPC UA information models of the assembly process (B). OPC UA is considered because it is one of the most widespread communication protocols in production technology. Ontologies based on the Web Ontology Language (OWL) standard can provide formal semantics and better search functionality compared to OPC UA models (Schiekofer and Weyrich, 2019); thus, we consider a transformation of OPC UA information models to the ontology-based knowledge management system. We then examine the connection of the knowledge management system to already existing control logic or process chains (C). Furthermore, we focus on integrating existing expert knowledge in the form of machine-readable, interpretable metadata into the knowledge management system (D). Finally, we conceive a possibility for integrating the knowledge management system into a future overarching semantic network of the IoP (F). This is achieved by designing a means of communication between ontologies of different abstraction levels through a bridge concept between domain and application ontologies.

The aggregation of data from process control, information models, and experience-based knowledge across assembly systems opens up new potential for optimizing assembly process control, which could not be exploited until now due to the lack of networking of relevant data sources and the limited semantic expressiveness of existing information. In particular, the use of methods of graph theory and operations research (OR) opens up new possibilities. Therefore, different methods can be applied to the use case of the robot-based adhesive process described above, e.g., mixed-integer linear programming (MILP) and machine learning methods (E).

Based on the described architecture, we created a system that simplifies the search for suitable industrial parameters for a programmer. The system initially includes the following functionalities: suggestions for time optimization, help in changing the adhesive to be used, and troubleshooting. The functionality of the assistance system is briefly presented using the example of time optimization. The ontology, which is based on CORA concepts, is used as a basis for finding the data and reveals their interrelationships, e.g., those between individual process variables such as pump output and adhesive density. Real process data such as trajectory values are filtered and stored in an SQL database. The ontology can access the database of actual values using the OBDA approach (R2RML), turning it into a knowledge graph. The resulting knowledge graph is then used as a guide for the optimization algorithm (parallel machine scheduling), which detects optimization potential in terms of process time. If optimization potential exists, the assistance system suggests the process variables that need to be adjusted to achieve this potential, e.g., the speed of the robot.
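How a knowledge graph can guide the search for adjustable process variables may be illustrated with a small sketch. The triples, the `influences` predicate, and the set of adjustable variables are hypothetical examples and do not reproduce the CORA-based ontology or the R2RML mapping of the described system; the traversal only shows the principle of following dependencies from a target quantity back to the variables that affect it.

```python
from collections import deque

# Hypothetical triples (subject, predicate, object); names are illustrative only.
TRIPLES = [
    ("robot_speed", "influences", "process_time"),
    ("robot_speed", "influences", "bead_cross_section"),
    ("pump_output", "influences", "bead_cross_section"),
    ("adhesive_density", "influences", "pump_output"),
    ("bead_cross_section", "influences", "bond_quality"),
]
ADJUSTABLE = {"robot_speed", "pump_output"}


def upstream_variables(target, triples):
    """Breadth-first search against edge direction: everything that influences target."""
    parents = {}
    for subj, pred, obj in triples:
        if pred == "influences":
            parents.setdefault(obj, set()).add(subj)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def suggest_adjustments(target, triples, adjustable):
    """Adjustable process variables that directly or indirectly affect target."""
    return sorted(upstream_variables(target, triples) & adjustable)
```

With these example triples, `suggest_adjustments("process_time", TRIPLES, ADJUSTABLE)` returns `["robot_speed"]`, which mirrors the kind of suggestion mentioned for time optimization.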

8.2 Summary and Outlook

In summary, semantic technologies, such as ontologies, represent a promising approach to knowledge-based assistance for process configuration of robot-based assembly processes, which can integrate existing expert knowledge and transfer already existing knowledge sources such as information models and flow logic structures. In the further course of the research, we will identify and evaluate further optimization algorithms describing the assistance performance of the system. Subsequently, we will use these algorithms to expand the assistance performance of the developed system with additional functionalities. Moreover, we will create methodologies for an efficient transformation of common data exchange formats to a knowledge graph format.

9 Conclusion

We present services and concepts for modular control enhancing the future operation of lineless, mobile assembly, based on a previously developed system architecture. We contribute the following research focal points through applying and extending principles of the Internet of Production (IoP) in lineless mobile assembly systems (LMAS):

Closing the gap between high-level scheduling and low-level robotics control, we introduce measures for the modular formation planning for mobile robots in assembly stations. Based on the resulting formation, consisting of robot base placement and task allocation, the robot motion planning to execute those tasks is carried out. Firstly, we have developed a compact modeling of robotic manipulators that is free of representational singularities and enables the use of fast motion control strategies. Secondly, we have developed the structure of a novel graph specifically designed for open-chain robotic manipulators to enable the effective implementation of sampling-based scheduling algorithms using heuristic functions. Previously unused networking resources offer considerable potential that can now be leveraged using INC, e.g., to compute the motion planning. Their application to existing communication scenarios, however, requires new transport protocol solutions that do not break when subject to INC. Similarly, novel methods based on generative deep learning might help to visualize and explain the decisions of neural networks and, thus, increase the level of autonomy of resources in LMAS. In work systems of future industrial assembly, human work will take place in the context of human-robot collaboration. We have presented an approach to enhance safety, ergonomics, and workload reduction in HRC based on an HMI that allows different machines to be controlled flexibly and as needed. We introduced semantic technologies (ontologies) as a promising approach for knowledge-based assistance solutions to automatically configure robot-based assembly processes.

In future work, we plan to deepen the knowledge of the research focal points as well as further integrate research labs, engineering, and production sites into a combined demonstrator, following the principle of the World Wide Lab.