One of the tenets of Industrie 4.0 is the digitization of all processes, assets and artefacts so that production can be virtualized and simulated, with the aim of achieving maximum flexibility and productivity. While much of the innovation effort is aimed at outright automation, it is also obvious that for the foreseeable future, human-to-machine interaction will remain an important element of any production. With an aging workforce and products that are becoming less standardized (lot-size 1), the notion of assistance has been getting significant attention. In an Austrian “lighthouse-project”, the idea of “Assistance-Units” is put forward. This paper presents a taxonomy and internal structure of these units and a proposed formal language to specify their desired interaction with humans when an assistance need arises. It should be noted that the proposed language acts as a mediation layer between higher-level business process descriptions (e.g., an order to manufacture a product) and the machine-specific programming constructs that are still needed to operate, e.g., a welding robot. The purpose of our task description language is to orchestrate human workers, robotic assistance and informational assistance, and thereby to keep the factory IT in sync with progress on the shop-floor.
1.1 Motivation and State of the Art
While there is plenty of research work on collaborative robotics, exoskeletons, and informational assistance through, e.g., virtual reality headsets, less work has been done on bringing these diverse forms of assistance under a common roof. A recent study also took a broad view that included assistance systems for both services and industry. There, the authors distinguish three kinds of (cognitive) assistance: (1) Helper systems provide digital representations of, e.g., manuals, teaching videos, repair guides, or other elements of knowledge management. (2) Adaptive Assistance systems are able to take some situational context into account, e.g. through sensors, and can then provide information that is adapted to the specific situation. (3) Tutoring Assistance systems are also adaptive, but explicitly address the need for learning in the work context.
The study forecasts a very high market potential for the use of AI in assistive technology. This is relevant to its category of “machines and networked systems”, which gives the closest fit with manufacturing. The study also lists a number of German research projects from the years 2013–2017 that broadly address digital assistance.
Looking at the current state of the art in industrial use of assistive technology, there is a prevalence of personal digital assistants based on speech recognition engines such as Apple Siri, Microsoft Cortana, Amazon Alexa, or Google’s Assistant with its cloud speech tool. These are often combined with VR or AR headsets, or with tablets or smartphones, depending on the use case. It should be noted that any user-activated assistance (e.g. by speech recognition) is reactive, i.e. it has no option of pro-actively helping the user. A pro-active approach requires the system to know where the worker and the assistive machinery are in the overall task. The following example comes from one of our industrial use cases for which a demonstrator is being built.
Motivating Example – Assembly.
The use case comes from a manufacturer of electrical equipment for heavy duty power supply units as used for welding at construction sites. There are detailed documents available describing the assembly process and the parts involved. The actual assembly is done by referring to a paper copy of the assembly manual. There are also CAD drawings available that can be used to derive the physical dimensions of the main parts. There are a number of important steps in the assembly process where the worker has to verify, e.g., the correct connection of cables, or where he or she has to hold and insert a part in a certain way. Furthermore, the assembly steps are associated with typical timings in order to ensure a certain throughput per hour. The purpose of the associated Assistance Unit (we will use the abbreviation A/U) is to help inexperienced workers learn the process quickly and to remind experienced workers to double-check the critical steps. It should also be possible for the worker to ask for assistance at any point through speech interaction, and the A/U should be able to detect delays in a process step, as well as errors in the order of steps when that order matters for the assembly process.
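The monitoring requirements of this example (typical timings, critical steps, order of steps) can be illustrated with a minimal sketch. It is not the demonstrator's implementation: the step names, the `tolerance` factor and the finding labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    typical_seconds: float   # typical timing taken from the assembly documentation
    critical: bool = False   # True if the worker should double-check this step


def check_progress(plan, completed, current_elapsed, tolerance=1.5):
    """Return findings as (kind, step_name) tuples.

    plan            -- ordered list of Step objects
    completed       -- step names in the order the worker finished them
    current_elapsed -- seconds spent so far on the step now in progress
    tolerance       -- assumed factor above the typical timing that counts as a delay
    """
    findings = []
    expected = [step.name for step in plan]

    # Order errors: compare the observed sequence with the plan, position by position.
    for position, name in enumerate(completed):
        if position >= len(expected) or expected[position] != name:
            findings.append(("order_error", name))
            return findings  # stop at the first deviation

    # The step in progress is the first one not yet completed.
    if len(completed) < len(plan):
        current = plan[len(completed)]
        if current.critical:
            findings.append(("double_check", current.name))
        if current_elapsed > tolerance * current.typical_seconds:
            findings.append(("delay", current.name))
    return findings


# Hypothetical three-step plan; only the cable check is marked as critical.
plan = [
    Step("mount_transformer", 60),
    Step("connect_cables", 90, critical=True),
    Step("close_housing", 30),
]
print(check_progress(plan, ["mount_transformer"], current_elapsed=200))
```

In this sketch the order check assumes a strictly sequential plan. A plan in which only some steps are order-sensitive would need a richer representation, for example a partial order over steps.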
Further use cases not discussed in this paper address maintenance, repair, re-tooling, working on several machines simultaneously, and physical (robotic) assistance with lifting of heavy parts.
1.2 Structure, Usage and Purpose of Assistance Units
The research into assistance units started with a hypothetical structure that was deemed useful to define a methodology for creating and customizing different kinds of assistance units (Table 1):
Looking at the concept from a technical perspective, however, it is difficult to refine this structure sufficiently to use it as a taxonomic discriminator: the Description points at a potentially significant set of user requirements that are not covered in the specification. Likewise, distinguishing between cognitive and physical assistance is a binary decision that can be rewritten as “content management” (for cognitive needs of the user/worker) or “robotics” (addressing physical needs of the end user/worker). Input format and device and, analogously, output format and device are somewhat dependent entities, because speech interfaces require microphones and video output requires a screen of some sort. The “knowledge resources” are worth a more detailed investigation, with two possible technical approaches. Firstly, one could think of the resources as media content that is searchable and, through appropriate tagging, can be triggered in certain circumstances. The second approach would be to endow the assistance unit with a detailed model of the task and thus make the unit capable of reasoning over the state of the task in order to assess progress or any deviations. This latter approach involves Artificial Intelligence techniques for knowledge representation and reasoning.
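The first approach, tagged media content that is triggered in certain circumstances, can be sketched in a few lines. The resource titles, the tags and the matching rule (all of a resource's tags must be present in the current context) are assumptions made for illustration, not part of the specification discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    title: str
    tags: frozenset  # tags that must all hold in the context for the resource to apply


def matching_resources(resources, context_tags):
    """Return the titles of all resources whose tags are a subset of the context."""
    context = set(context_tags)
    return [r.title for r in resources if r.tags <= context]


# Hypothetical knowledge resources and context.
resources = [
    Resource("Cable connection video", frozenset({"assembly", "cabling"})),
    Resource("Re-tooling guide", frozenset({"re-tooling"})),
]
print(matching_resources(resources, {"assembly", "cabling", "novice"}))
```

Such a lookup only retrieves content. It cannot assess progress or detect deviations, which is what the second, model-based approach is meant to provide.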
Usage as a Distinguishing Feature?
It is clear that, in principle, there may be as many kinds of assistance units as there are kinds of manufacturing tasks. So one approach to classification could be high-level distinctions of usage. Initially, we classified our use cases as being either assembly, maintenance or re-tooling, but in each case the distinguishing features only show themselves in the content of the knowledge resources or in the knowledge-based model of the actual activity, without having a structural manifestation in the assistance unit itself.
Purpose (of Interaction) to the Rescue?
The main reason for wanting a distinction between different kinds of assistance units is to have clearly separable methodologies for their development and deployment. Ideally, one should be able to offer pre-structured assistance “skeletons” that can be parameterized and customized by experts in the application domain, rather than requiring ICT specialists for bespoke programming. Having been dissatisfied with the previous attempts at structuring, we then looked at the way in which human and machine (i.e. the assistance unit) are supposed to interact. This led to a proposed model that distinguishes eight forms of interaction, or purposes for using an assistance unit, with distinctive capabilities specified for each kind of unit (Table 2).
The above taxonomy has the advantage that it requires an increasing set of capabilities going from mediator to avatar. In other words, the distinguishing features allow one to define capabilities on an ordinal scale (Mediator < Tutor < Trouble-shooter < etc.). It is of course questionable whether the mediator role qualifies at all as an assistance unit in its own right, but we may accept that a combination of on-line manuals plus the ability to call an expert, all packaged in a smartphone app, is sufficient functionality for an entry-level assistance unit.
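An ordinal scale of this kind maps naturally onto an ordered enumeration. The sketch below encodes only the roles named in the text; the numeric ranks, including placing the avatar at rank 8, are assumptions, and the remaining roles of Table 2 are deliberately left out.

```python
from enum import IntEnum


class AssistanceRole(IntEnum):
    # Only roles named in the text; the ranks themselves are assumed.
    MEDIATOR = 1
    TUTOR = 2
    TROUBLE_SHOOTER = 3
    AVATAR = 8  # assumed to be the last of the eight forms


def can_serve_as(unit_role, required_role):
    """A unit with a higher role is assumed to include all lower capabilities."""
    return unit_role >= required_role


print(can_serve_as(AssistanceRole.TUTOR, AssistanceRole.MEDIATOR))
print(can_serve_as(AssistanceRole.MEDIATOR, AssistanceRole.TROUBLE_SHOOTER))
```

Using `IntEnum` rather than plain `Enum` makes the comparison operators available, which is exactly what an ordinal scale needs.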
1.3 Assistance Units as Actors on the Shop-Floor
One of the defining features of any assistance unit is the ability to recognize when it is needed. This presupposes some form of situational awareness, which can be obtained either by the worker explicitly calling the A/U or by some sensory input triggering it. Situational awareness requires explicit knowledge about the environment and knowledge about desirable sequences of actions (e.g. the steps of an assembly task). It also requires some planning capability, unless we can afford to let the system learn by trial and error, which is an unlikely option in a competitive production environment.
Formal and semi-formal production planning methods are already being used in many manufacturing firms. Hagemann gives the example of Event-Process-Chains for planning assembly lines in the automotive sector, but for assistance, one would need a more granular level of description. At the level of robotic task execution, there are examples such as the Canonical Robot Command Language, which provides vendor-independent language primitives to describe robotic tasks.
In the field of production optimization, many firms rely on some variety of Methods-Time Measurement (MTM). There are different variants of MTM offering different granularity [3, 5, 7, 9], but the main problem is that the method is not well suited to formally describing purposeful, planned processes with inputs and outputs. Instead, it focusses on the description of isolated actions and their time- and effort-related parameters. The method is also propagated rather exclusively by a community of consultants belonging to the MTM association, which keeps control over the intellectual property of practicing MTM analysis.
So, while there are clearly potential connecting points with process planning at large, and with action-based productivity measurement at the more detailed level, one still requires an orchestration formalism that can distinguish between actors, can name inputs and outputs of processes, and can describe the manufacturing process as well as any assistive measure that one may want to add to it. To achieve this goal, we used the agent model of Russell and Norvig as a starting point, as shown below (Fig. 1).
The important issue to note is that we distinguish Environment and Agent. The latter consists of sensors to perceive observations from the environment, effectors to manipulate entities in the environment, and some form of rationality expressed in a world model and in a rule system that predetermines what actions the agent will take under certain conditions. For a simple reflex agent as shown above, there is a direct relationship between single observations and the corresponding actions. This is definitely not a sufficient model for what we would call “intelligent” behavior, and so we should extend the basic model, as shown in the next figure.
As can be seen, the second model introduces a memory for the agent to store a state of the perceived world, and a planning mechanism to reason about the effects of possible actions. This means that the agent can now make decisions concerning its actions on the basis of some utility function included in the planning mechanism. This agent model now gives us a better handle on the original structural interpretation of the assistance unit: the input devices are a subset of all sensory input to the A/U, and the associated formats are the data structures of those observations (also known as “percepts”). The output devices are a subset of all effectors of the A/U and, quite clearly, for these to work in an intelligent way, a reasoning engine of some sort is required between the input and the output mechanisms of the A/U, as illustrated in Fig. 2.
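To make the three requirements concrete (distinguishing actors, naming inputs and outputs, and describing assistive measures alongside the manufacturing process), the following sketch models them as plain data structures. It is not the task description language proposed in this paper; all class, field and task names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ActorKind(Enum):
    HUMAN = "human"
    ROBOT = "robot"
    ASSISTANCE_UNIT = "assistance_unit"


@dataclass(frozen=True)
class Actor:
    name: str
    kind: ActorKind


@dataclass
class Task:
    name: str
    actor: Actor
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    assists: Optional[str] = None  # name of the manufacturing task this one supports


def unmet_inputs(tasks, initially_available):
    """Walk the tasks in order and report inputs that nothing has produced yet."""
    available = set(initially_available)
    problems = {}
    for task in tasks:
        missing = [item for item in task.inputs if item not in available]
        if missing:
            problems[task.name] = missing
        available.update(task.outputs)
    return problems


worker = Actor("worker", ActorKind.HUMAN)
unit = Actor("assembly_au", ActorKind.ASSISTANCE_UNIT)
tasks = [
    Task("insert_part", worker, ["housing", "part"], ["sub_assembly"]),
    Task("show_cabling_hint", unit, ["sub_assembly"], ["hint_shown"], assists="connect_cables"),
    Task("connect_cables", worker, ["sub_assembly", "cables"], ["wired_unit"]),
]
print(unmet_inputs(tasks, {"housing", "part"}))
```

Because assistive tasks are described with the same structure as manufacturing tasks, a simple consistency check such as `unmet_inputs` applies to both kinds without special cases.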
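The difference between the two agent models can be illustrated with a minimal sketch. It is a simplification for illustration only: the percept names, the two actions and the utility values are hypothetical, and the "planning" is reduced to a one-step look-ahead over the available actions.

```python
class SimpleReflexAgent:
    """Maps each single percept directly to an action via condition-action rules."""

    def __init__(self, rules):
        self.rules = rules  # dict: percept -> action

    def act(self, percept):
        return self.rules.get(percept, "do_nothing")


class ModelBasedAgent:
    """Keeps a state (memory) and picks the action whose predicted outcome has the highest utility."""

    def __init__(self, actions, transition, utility):
        self.state = {}               # memory of the perceived world
        self.actions = actions
        self.transition = transition  # (state, action) -> predicted state
        self.utility = utility        # state -> number

    def act(self, percept):
        self.state.update(percept)    # percept: dict of newly observed facts
        return max(
            self.actions,
            key=lambda action: self.utility(self.transition(self.state, action)),
        )


def transition(state, action):
    """Predict the state after an action (one-step look-ahead)."""
    predicted = dict(state)
    predicted["helped"] = action == "offer_help"
    return predicted


def utility(state):
    """Assumed utility: help is valuable when the worker is delayed, a nuisance otherwise."""
    delayed = state.get("seconds_on_step", 0) > state.get("typical_seconds", float("inf"))
    if delayed:
        return 1.0 if state["helped"] else -1.0
    return -0.5 if state["helped"] else 0.0


reflex = SimpleReflexAgent({"worker_asks_for_help": "show_manual"})
print(reflex.act("worker_asks_for_help"))

agent = ModelBasedAgent(["wait", "offer_help"], transition, utility)
print(agent.act({"typical_seconds": 90, "seconds_on_step": 30}))
print(agent.act({"seconds_on_step": 200}))  # typical timing is remembered from before
```

The second call to `agent.act` only reports the elapsed time. The agent can still recognize the delay because the typical timing was stored in its state, which is the kind of memory the reflex agent lacks.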