Experimental evaluation of tasking and teaming design patterns for human delegation of unmanned vehicles

This work discusses different approaches for the cooperation between a human supervisor and multiple unmanned vehicles (UVs). We evaluated the most promising approach experimentally with expert pilots of the German Air Force. The co-agency of humans and highly automated unmanned systems (i.e., human autonomy teaming, HAT) is described using a design and description language for HAT design patterns. This design language is used to differentiate control modes for tasking, teaming, and swarming of UVs. The different control modes are then combined in a planner agent (PA) design pattern that further enables UV guidance on scalable delegation levels, from a single individual up to a team. The desired system behavior and interaction concept of the PA for these scalable delegation levels is then transferred to the domain of manned-unmanned teaming in fighter aircraft missions. To demonstrate the applicability of the system, we implemented the concept in the fast-jet simulator of the Institute of Flight Systems (IFS) and conducted an experimental campaign with expert pilots. The results of the experiment showed that (1) task delegation with the PA design pattern is faster and reduces the error potential; (2) scalable delegation levels enable a pilot- and situation-specific task delegation; (3) the delegation of teams is faster and reduces the error potential; however, in some situations, deeper access through the scalable delegation levels is needed; (4) the concept is intuitive, and the transparency of and trust in UVs and swarms were very high; and (5) the pilots could imagine operating such systems in the future. Overall, the presented PA design pattern is suited for the guidance of UVs, and the scalable delegation levels are beneficial.


Introduction and background
A major challenge when humans guide multiple unmanned vehicles (UVs) is keeping the resulting task load at a manageable level. This demands a higher level of autonomy for the UVs than manual control. Even with increased UV autonomy, the number of UVs that can be meaningfully led is limited. To increase the number of UVs further, a different organizational structure needs to be used for guidance. Kolling et al. (2016) show such a structure, which reduces the complexity of multi-UV guidance with a swarm concept. This highly automated human-swarm interface makes the guidance of the UVs independent of the swarm size. Another challenge in UV guidance is the need for a fast data link with low interference. In military environments, communication jamming is an integral part of warfare, and therefore permanent data links to a remote command station cannot be ensured. Ethical guidelines in these environments rule out fully autonomous UV operation and demand a human as the decision-making authority on site. The research field dealing with the integration of humans into such systems is called manned-unmanned teaming (MUM-T).

Manned-unmanned teaming
For military purposes, we created a definition for MUM-T. According to this definition, MUM-T describes the interoperability of manned and unmanned vehicles to pursue a common mission objective. Both manned and unmanned vehicles need to be employed in the same confined spatial, temporal, and mission-related context. In MUM-T, the unmanned platform(s), as well as its/their mission payloads, will be commanded by the manned asset(s). From this, the major challenge for MUM-T technologies arises, i.e., to master the high work demands placed on the human user(s) by multi-platform mission management and task execution. Therefore, technical solutions for MUM-T have to encompass the following:
• Dedicated human-machine interaction and interface concepts
• Collaborative mission planning and vehicle control algorithms
• Intelligent assistance and support functions for the human user
• Dedicated links for real-time data distribution between the platforms
• High levels of automation of mission management and execution functions
The benefits of MUM-T are high mission effectiveness and efficiency achieved with a minimum of personnel, while keeping the human user in the local decision loops.
It can be expected that in MUM-T, large portions of the mission and payload management, navigation, guidance, and control will be highly automated. On the other hand, it is commonly agreed amongst human factors researchers that high degrees of automation may also bear the risks of loss of situational awareness, complacency, and skill degradation, amongst other adverse effects for the human user (e.g., Parasuraman et al. 2000). Therefore, in this contribution, we focus on the following aspects:
Human-machine interaction concepts: these need to be considered to exert meaningful human control (MHC) over manned and, in particular, a number of unmanned systems, i.e., the platforms and their payloads (Santoni de Sio and van den Hoven 2018). In addition to high-level control automation, this also requires an adequate concept of workshare and function allocation, as well as appropriate interaction design, to master the complexity effects that come along with it. Controllability and transparency requirements shall be taken into consideration as well (Chen et al. 2014). A controllability approach for multiple UVs, ranging from manual control to very high automation in search and rescue missions with multiple UAVs and one user on site, is described in Bevacqua et al. (2015). A similar approach for multi-UV guidance, called FLEX-IT (Flexible Levels of Execution, Interface Technologies), is used in Calhoun et al. (2018). FLEX-IT provides the user with interaction modalities ranging from manual control of an individual UV to high-level guidance building upon the Playbook approach.
Mission planning algorithms: these represent the automation functions supporting task allocation amongst the participating vehicles, manned and unmanned, as well as the highly automated task performance. Those functions can either be centralized, usually in the manned command vehicle, or distributed over the participating platforms (Ponda et al. 2015). The planning of heterogeneous teams with different roles and responsibilities has been examined in detail in several studies. A common approach incorporates finding optimal or near-optimal solutions for the task assignment and scheduling for the UVs, e.g., by minimizing mission time or distance (Behymer et al. 2017). To solve these problems, integer programming, Markov decision process, and game theory approaches are widely used (Behymer et al. 2017; Ponda et al. 2015).
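To make the notion of a distance-minimizing task assignment concrete, the following is a minimal sketch only. It uses plain exhaustive enumeration rather than the integer programming, Markov decision process, or game theory approaches cited above, and it assumes that each vehicle takes at most one task and that cost is straight-line distance. The function name `assign_tasks`, the vehicle and task names, and the coordinates are hypothetical.

```python
from itertools import permutations
from math import hypot


def assign_tasks(vehicles, tasks):
    """Assign each task to a distinct vehicle so that the summed
    straight-line distance is minimal (exhaustive search, small teams only).

    vehicles: {vehicle_name: (x, y)}  -- hypothetical positions
    tasks:    {task_name: (x, y)}     -- hypothetical task locations
    Returns (total_distance, {task_name: vehicle_name}).
    """
    if len(tasks) > len(vehicles):
        raise ValueError("this sketch assumes at most one task per vehicle")
    task_names = list(tasks)
    best = None
    # Try every ordered selection of vehicles, one per task.
    for selection in permutations(vehicles, len(task_names)):
        total = sum(
            hypot(vehicles[v][0] - tasks[t][0], vehicles[v][1] - tasks[t][1])
            for v, t in zip(selection, task_names)
        )
        if best is None or total < best[0]:
            best = (total, dict(zip(task_names, selection)))
    return best


if __name__ == "__main__":
    uvs = {"uv1": (0.0, 0.0), "uv2": (10.0, 0.0)}
    jobs = {"recon": (9.0, 0.0), "strike": (1.0, 0.0)}
    print(assign_tasks(uvs, jobs))
```

Exhaustive enumeration grows factorially with team size, which is one reason the cited works rely on dedicated optimization methods for realistic problem sizes.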
To integrate these aspects, an adequate concept of workshare and function allocation, as well as an appropriate interaction design that masters the resulting complexity, must be developed. Therefore, we present related work in the field of MUM-T and describe approaches to cope with the MUM-T design challenge using a symbolic design pattern approach. Here, we also provide insight into operational concepts to differentiate between teaming and swarming. Based on these design patterns, we describe the task assignment and delegation process. Afterward, we describe how this process is applied to the air combat domain and provide insight into the implementation of the design patterns in our laboratory prototype. This prototype is then evaluated in a human-in-the-loop experiment with German Air Force pilots. The last chapter summarizes the experimental results and discusses future research topics.

Related work
The challenges of MUM-T are various and connected to several fields of research. Goodrich and Cummings (2007), Parasuraman et al. (2000), and Sheridan and Verplank (1978) introduce different levels of automation, which can be used to describe how partial functions of the UVs can be automated. Depending on the chosen degree of automation, various human factors issues, e.g., complacency and loss of situational awareness, can occur (Wiener and Curry 1980); these are addressed in detail for unmanned aerial vehicle guidance in Goodrich and Cummings (2007). Different swarm approaches and their integration into a human-machine system can be found in Lewis (2013). In contrast to these rather fundamental works, the following studies deal with the design and evaluation of concrete MUM-T systems. Bevacqua et al. (2015) present a planning and execution system for search and rescue missions with multiple UAVs and a co-located user in an alpine scenario, where the systems can be guided on adjustable levels from explicit teleoperation to complete autonomy. Schmaus et al. (2018) address the autonomy, communication, and human-robot interfaces required for setting up habitats in a hazardous planetary environment with robotic coworkers. The setup is evaluated with an astronaut onboard the International Space Station, who commanded real-world robotic coworkers in a simulated Martian solar farm on Earth at the DLR. IJtsma (2019) describes a system to organize and guide a team consisting of two astronauts, a humanoid robot, a remote manipulator system, and a free-flying robot to inspect and repair multiple exterior components of a spacecraft. Airbus (2020) demonstrated in a real-world campaign how five Airbus-built Do-DT25 target drones could be guided from a command and control (C2) aircraft. Behymer et al. (2017) give detailed insight into interfaces and autonomy for a Multi-UxV Planner in base defense missions. They also provide an initial evaluation of the system with seven human participants.

Tasking, teaming, and swarming design patterns
In Schulte and Donath (2018), a description method and a common language to structure and depict configurations for highly automated work systems involving humans, cognitive agents, and conventional automation were introduced. Those actors can be attributed to different roles in the work system (i.e., Worker or Tool). The Worker takes the initiative in pursuing the work objective, whereas the Tool executes given commands. The relationships between the actors can be either hierarchical (i.e., delegation, tasking) or heterarchical (i.e., assistance, teaming). The hierarchical mode can be applied whenever a superior actor (human or agent) delegates tasks to another actor. Typically, actors representing the Worker will use this mode to delegate to actors acting as a Tool. The heterarchical mode is only applicable amongst the Tools (two or more agents), or within the Worker (humans and agents). In the latter case, we speak of assistant systems. Hence, establishing a heterarchical relationship between a Worker and a Tool is not intended.
These elements describe human autonomy teaming (HAT) design patterns and enable the construction of rather complex automated work systems. From these patterns, systems engineering requirements for developing cognitive agents and related human-agent interaction modalities are derived. Figure 1 shows two elementary delegation design patterns for tasking, i.e., the delegation of tasks to a subordinate cognitive agent or a conventional tool. Sheridan's term of Human Supervisory Control and succeeding works on human-automation function allocation describe design options for the delegation of well-defined, simple tasks to conventional tools, as depicted in Fig. 1a. With the advent of intelligent automation (Miller and Parasuraman 2007), the delegation of higher level tasks to a cognitive agent also became feasible (see Fig. 1b).
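The role and relationship rules of the description language can be expressed as a small consistency check. The following is a sketch under our own naming assumptions: the classes `Kind`, `Role`, `Relation`, and `Actor` and the function `relation_allowed` are hypothetical and not part of the notation of Schulte and Donath (2018). The sketch encodes only the rules stated in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    HUMAN = "human"
    COGNITIVE_AGENT = "cognitive agent"
    CONVENTIONAL_AUTOMATION = "conventional automation"


class Role(Enum):
    WORKER = "worker"  # takes the initiative for the work objective
    TOOL = "tool"      # executes given commands


class Relation(Enum):
    HIERARCHICAL = "delegation / tasking"
    HETERARCHICAL = "assistance / teaming"


@dataclass(frozen=True)
class Actor:
    name: str
    kind: Kind
    role: Role


def relation_allowed(source: Actor, target: Actor, relation: Relation) -> bool:
    """Check a relationship against the rules described in the text."""
    if relation is Relation.HIERARCHICAL:
        # The superior actor has to be a human or a cognitive agent.
        return source.kind is not Kind.CONVENTIONAL_AUTOMATION
    # Heterarchical relations never cross the Worker/Tool boundary.
    if source.role is not target.role:
        return False
    if source.role is Role.TOOL:
        # Amongst the Tools, only cognitive agents team up.
        return (source.kind is Kind.COGNITIVE_AGENT
                and target.kind is Kind.COGNITIVE_AGENT)
    # Within the Worker, humans and agents may cooperate.
    return Kind.CONVENTIONAL_AUTOMATION not in (source.kind, target.kind)


if __name__ == "__main__":
    pilot = Actor("pilot", Kind.HUMAN, Role.WORKER)
    uv_agent = Actor("uv agent", Kind.COGNITIVE_AGENT, Role.TOOL)
    print(relation_allowed(pilot, uv_agent, Relation.HIERARCHICAL))
    print(relation_allowed(pilot, uv_agent, Relation.HETERARCHICAL))
```

Such a check could be used to validate a candidate work system configuration before deriving requirements from it.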

Tasking teams
As opposed to pure tasking (or delegation) to a single subordinate agent or tool, teaming entails the introduction of some heterarchical structures amongst a number of actors in the work system, i.e., the team members (see Fig. 2).
Teaming is an organizational structure for human collaboration. We transfer it to socio-technical systems. This might involve purely unmanned teams, but also the human user becoming a teammate of artificial agents. In teaming, each member has a role to play, knows which roles are assigned to the other team members, and knows how they contribute to the overall work objective. Teaming can be considered as coordination at the task level, where each team member contributes their capabilities to complete the task. Figure 2 shows four different design patterns for teaming. For the sake of clarity and space, the patterns show only two unmanned assets. In the concrete applications, manned/unmanned teams of 3 to 5 UVs plus one command vehicle were used.
(a) In this design option, the user delegates vehicle tasks (VT) to a small number of agents, each of which controls its given conventional tool(s) (e.g., UVs). Each VT has to be performed by exactly one UV and must contain all parameters required for the highly automated execution by the onboard/co-located cognitive agent of the UV (indicated by the box). The responsibilities of the onboard agent comprise the route planning and transit to the task as well as its intelligent processing considering the intention, timing, tactics, and constraints. Most of the coordination work in this pattern is done by the human user. However, there will also be local coordination of tasks amongst the dislocated UV agents. This pattern can be regarded as a weak form of teaming; tasking is the dominating mode here. The application of this concept has been demonstrated for multiple domains, such as ground-based control stations and helicopter missions (Clauß and Schulte 2014; Uhrmann and Schulte 2012).
(b) As opposed to pattern (a), here, a more complex team task (TT) will be issued to all agents (indicated by a bracket). This TT is an abstract description of a highly complex task that requires close cooperation between several (heterogeneous) UVs. Conceptual approaches for such a TT can be found in high-level plays of the playbook approach (Calhoun et al. 2018), non-primitive tasks of hierarchical task networks (HTN) (Erol et al. 1994), and goal nodes in the task specification language (Doherty et al. 2010). Teaming amongst the dislocated agents in this pattern will be facilitated by pursuing cooperative goals and coordination amongst each other. Schulte and Meitinger (2010) investigated this design option by implementing cognitive agents onboard up to five UVs, dynamically negotiating the distribution of sub-tasks for a highly complex team task describing the mission goal. Gangl et al. (2013) integrated this solution in a fast-jet cockpit simulator and conducted human-in-the-loop experiments. Generally, the chosen high level of autonomy and the complexity of teaming behaviors were found to compromise situation awareness and controllability and to cause complacency effects.
(c) In this option, the cognitive agents, still co-located with their given tools, adopt the role of a Worker (i.e., pursue the given work objective on their own initiative) (Onken and Schulte 2010). Together with the human, they will form a cooperating human-agent team. Meitinger and Schulte (2009) [...]
(d) [...] the so-called planner agent (PA), enables the delegation of both vehicle and team tasks. The PA is responsible for integrating environmental information, the delegation specification, and the capabilities and restrictions of the UVs into the planning problem, and it creates an appropriate mission plan accordingly. In the case of a TT delegation, the PA is further responsible for determining the vehicle tasks required for successful task execution, e.g., by using logical planning algorithms, and for coordinating the team members. The coordination work, in this case, can be shared and traded between the human user and the central coordination agent in a wide range. The PA would usually be co-located with the user (e.g., the pilot aboard a command vehicle in MUM-T). This approach using scalable delegation levels is investigated in Heilemann et al. (2019) and Heilemann and Schulte (2020a, b) on a conceptual and experimental level.
In order to combine the advantages of the central planning agent (i.e., controllability, transparency) and the human-agent team (i.e., flexibility, adaptiveness), we designed a pattern, as depicted in Fig. 3. Here, the pattern introduced in Fig. 2d is augmented by a heterarchical relationship between the human user and the central planning agent. In this pattern, the predominant mode will still be the scalable delegation of tasks to the central planning agent. However, the agent in this setup will also assist the planning process on its own initiative.

Tasking swarms
The notion of swarming stems from biology, describing moving in or forming a large or dense group of small, rather simple animals. The observed complexity of swarming behavior emerges from frequent and parallel, but usually simple and local interactions of the swarm members, based upon the exchange and adaptation of behavioral parameters (e.g., direction and speed of motion). Usually, all swarm members follow the same purpose. In our context, we borrow the term swarming to describe the coaction of a larger number (i.e., possibly more than ten) of technical vehicles (UVs) that all serve a purpose provided by a human user.
When the swarming metaphor is applied to a technical application, the swarm needs to be tasked by a human user. Starting from the elementary patterns of tasking (cf. Fig. 1), we derive a pattern involving a swarm avatar, as shown in Fig. 4a. However, the direct tasking of, or interaction with, the many swarm members by the human, according to the pattern depicted in Fig. 1a, is not an option. Coppin and Legras (2011) showed that swarm performance breaks down or significantly decreases when direct human interventions in swarming algorithms are allowed. The purpose of the swarm avatar is to provide a tasking interface to the human user, to exert meaningful control over the swarm. Therefore, the avatar has to translate the purpose of the swarm task (ST) delegated by the user into parameters of the swarming algorithms. The avatar will usually be a tool agent that is, however, most likely co-located with the human command station. This concept also allows integrating a swarm into any teaming context, as illustrated in Fig. 4b. This teaming mode corresponds to the one in Fig. 2a. However, it is conceivable to substitute a swarm within any other teaming structure shown in Fig. 2 by using the avatar principle.
In industry, the terms teaming and swarming are frequently confused. Teaming-capable beings such as humans create plans, which they turn into temporal conditions using experience and models. This is exactly contrary to the character of a swarm as a composite of reactive individuals. Clough (2002) provides detailed information on how to discern teaming and swarming entities. Due to this difference, no human being can be efficiently integrated into a swarm structure. If we want to use the characteristics of a swarm in a human-machine system, an automation interface is required that abstracts the reactive and emergent characteristics to a teaming-suitable (plannable, transparent) instance. For us, manned-unmanned cooperation always takes place at the teaming level, using the swarm avatar to integrate numerous but purely unmanned vehicles (see Fig. 4a). In more complex setups, one or a few swarms, as well as Team-UVs, might be members of a team (see Fig. 4b).

Basic requirements for a task
For the task-based guidance of the presented design patterns, we need to clarify what we understand by a task. In the literature, many different interpretations for the concept of a task required for the UV guidance are provided (Doherty et al. 2014). One common interpretation of tasks describes them as an action, or a combination of actions required to accomplish a job, a problem, or an assignment (Moon et al. 2015). Depending on the complexity of the task (e.g., team task), additional logical planning might be needed to obtain this set of tasks (i.e., vehicle tasks) and their actions. However, regardless of task complexity and required planning capabilities, the following attributes are commonly required:
- Spatial information: where should the task be performed, and what are the location characteristics? How accurate is the location information, and is the location stationary or moving? What is the size and shape of the target location, e.g., a single coordinate point, multiple coordinates defining an area or route, etc.?
- Action: what is/are the desired action(s) the system(s) should perform to fulfill this task?
- Time: when should the task start, and how long is its duration?
- Constraints: which skills and resources are required for the task execution, and which dependencies must be met?
- Tactics: what is the priority of the task, and how should it be performed, e.g., fastest, minimum resource usage, etc.?
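The attributes listed above map naturally onto a record type. The following is an illustrative sketch only: the class `Task`, the enum `Tactic`, all field names, and the skill labels are assumptions made for this example and do not reproduce the data model of the described system.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set, Tuple


class Tactic(Enum):
    FASTEST = "fastest"
    MIN_RESOURCES = "minimum resource usage"


@dataclass
class Task:
    # Action: what the system should do (hypothetical free-text label).
    action: str
    # Spatial information: one coordinate for a point,
    # several coordinates for an area or route.
    location: List[Tuple[float, float]]
    location_is_moving: bool = False
    location_accuracy_m: Optional[float] = None
    # Time: None means the planner may choose the start time.
    start_time_s: Optional[float] = None
    duration_s: float = 0.0
    # Constraints: required skills/resources and task dependencies.
    required_skills: Set[str] = field(default_factory=set)
    depends_on: List[str] = field(default_factory=list)
    # Tactics: priority and preferred way of execution.
    priority: int = 0
    tactic: Tactic = Tactic.FASTEST

    def can_be_performed_by(self, vehicle_skills) -> bool:
        """True if the vehicle offers every required skill."""
        return self.required_skills <= set(vehicle_skills)


if __name__ == "__main__":
    recon = Task(action="reconnaissance", location=[(48.08, 11.63)],
                 duration_s=120.0, required_skills={"eo_sensor"})
    print(recon.can_be_performed_by({"eo_sensor", "radar"}))
```

Keeping the start time optional leaves room for the planner to schedule the task, while a fixed value expresses a user-defined time parameter.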

Scalable delegation concept
The tasking concept presented here for the guidance of several unmanned vehicles is based on the planner agent design pattern depicted in Fig. 3. The concept aims at reducing the human workload by using the co-located PA and cognitive agents in a supervisory control relationship. In this chapter, we present the concept of scalable delegation for the guidance of single vehicles, teams, and swarms. To this end, we first define a mission plan and then present the integration of tasks into this plan on scalable delegation levels. In this context, we also elaborate on the desired behavior of the PA for the insertion of a new task and on how errors of the PA can be addressed.

Mission plan
When several UVs with multiple tasks need to be guided, we recommend a chronological task arrangement on a mission plan (cf. Fig. 5). In contrast to a map view of the tasks (cf. Fig. 5a), which makes the geographical information accessible, the mission plan simplifies the visualization of the temporal and logical dependencies of the tasks. Temporal constraints result mainly from the individual transit times between the tasks (cf. Fig. 5b) but can also be defined by a time parameter of the task. Logical constraints in the mission plan allow the PA to model that a certain task must be completed before the start of another task (cf. Fig. 5c) or that a set of tasks needs to be performed at the same time.
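The temporal and logical constraints of such a mission plan can be checked mechanically. The sketch below is an assumption-laden illustration, not the PA's actual implementation: `PlannedTask`, `plan_violations`, and the `transit_time` callback are hypothetical names, and times are plain numbers in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class PlannedTask:
    name: str
    vehicle: str
    start: float
    duration: float

    @property
    def end(self) -> float:
        return self.start + self.duration


def plan_violations(tasks, transit_time, precedence=(), simultaneous=()):
    """Return a list of violated constraints (empty list = valid plan).

    transit_time(a, b): time a vehicle needs to get from task a to task b.
    precedence:   pairs (before, after), "before" must finish first.
    simultaneous: groups of task names that must start at the same time.
    """
    by_name = {t.name: t for t in tasks}
    problems = []
    # Temporal constraints: transit between consecutive tasks of a vehicle.
    for vehicle in sorted({t.vehicle for t in tasks}):
        timeline = sorted((t for t in tasks if t.vehicle == vehicle),
                          key=lambda t: t.start)
        for prev, nxt in zip(timeline, timeline[1:]):
            if nxt.start < prev.end + transit_time(prev.name, nxt.name):
                problems.append(f"transit: {prev.name} -> {nxt.name}")
    # Logical constraints: completion before start.
    for before, after in precedence:
        if by_name[after].start < by_name[before].end:
            problems.append(f"precedence: {before} -> {after}")
    # Logical constraints: tasks to be performed at the same time.
    for group in simultaneous:
        if len({by_name[name].start for name in group}) > 1:
            problems.append("simultaneous: " + ", ".join(group))
    return problems


if __name__ == "__main__":
    plan = [PlannedTask("T1", "S1", 0, 10), PlannedTask("T2", "S1", 15, 10),
            PlannedTask("T3", "S2", 20, 5)]
    print(plan_violations(plan, lambda a, b: 5, precedence=[("T2", "T3")]))
```

In the usage example, T3 starts before T2 has finished, so the precedence constraint is reported while the transit constraint between T1 and T2 is satisfied.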

Scalable delegation
The concept of scalable delegation provides a wide variety of delegation possibilities for the different types of tasks (vehicle, team, and swarm). For this purpose, a combined plan and delegation interface, shown in Fig. 6, has been conceptualized, which provides the following scalable delegation levels:
Team (Fig. 6a): in this delegation level, the PA determines the best-suited team member(s) for the task(s) and inserts the task(s) at the best position in the plan. This delegation level is available for all types of tasks (vehicle, team, and swarm). In the case of a team task, the PA supplements the corresponding sub-tasks (e.g., by logical planning or task decomposition) and allocates them to the team members.
Individual: here, a task for a single vehicle or a swarm is assigned to a specific team member. Individual delegation can take place at the following levels:
1. Due vehicle (Fig. 6b): the user specifies the UV to perform the task. The PA determines the best position of the task in the task list of the selected UV.
2. Due position (Fig. 6c): the user specifies the relative position of the delegated task in a vehicle's current task list. The PA then adjusts the timing of the dependent tasks and generates a new plan, according to the specified order of tasks.
3. Due time (Fig. 6d): the user specifies the exact start time of the delegated task in the vehicle's current task list. This time for the task is then obligatory for the PA.
The delegation on the plan provides direct feedback on the impact of each planning step on the resulting mission plan. Lacking capabilities could be indicated by graying out the vehicle or position. A technical description of this planning agent with the scalable delegation levels is given in Heilemann et al. (2019). Related work on guiding multiple UVs on adjustable autonomy levels is presented in Bevacqua et al. (2015), Calhoun et al. (2018), and Doherty et al. (2010). However, these works focus more on the specification of the different levels and less on the delegation process of these tasks to the UVs.
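One way to read the scalable delegation levels is by what the user fixes and what remains for the PA to decide. The following sketch illustrates this reading under our own assumptions: `DelegationLevel`, `DelegationRequest`, and `planner_decides` are hypothetical names, and the rule that a given start time takes precedence over a given position is an assumption of this example, not a statement about the described system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DelegationLevel(Enum):
    TEAM = "team"
    DUE_VEHICLE = "due vehicle"
    DUE_POSITION = "due position"
    DUE_TIME = "due time"


@dataclass
class DelegationRequest:
    task: str
    vehicle: Optional[str] = None       # fixed by the user from "due vehicle" on
    position: Optional[int] = None      # index in the vehicle's task list
    start_time: Optional[float] = None  # obligatory start time

    def level(self) -> DelegationLevel:
        """Derive the delegation level from what the user specified."""
        if self.vehicle is None:
            if self.position is not None or self.start_time is not None:
                raise ValueError("position or time requires a vehicle")
            return DelegationLevel.TEAM
        if self.start_time is not None:  # assumption: time wins over position
            return DelegationLevel.DUE_TIME
        if self.position is not None:
            return DelegationLevel.DUE_POSITION
        return DelegationLevel.DUE_VEHICLE


def planner_decides(level: DelegationLevel) -> set:
    """Aspects of the delegated task that are left to the planner agent."""
    return {
        DelegationLevel.TEAM: {"vehicle", "position", "timing"},
        DelegationLevel.DUE_VEHICLE: {"position", "timing"},
        DelegationLevel.DUE_POSITION: {"timing"},
        DelegationLevel.DUE_TIME: set(),
    }[level]


if __name__ == "__main__":
    request = DelegationRequest(task="recon", vehicle="Golf")
    print(request.level(), planner_decides(request.level()))
```

The further the user scales down, the fewer degrees of freedom remain for the planner, which mirrors the trade-off between delegation speed and direct control.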

Planner agent behavior
When a new task is integrated into the mission plan, the PA must consider the different task dependencies and vehicle capabilities to maintain a valid mission plan. This desired system behavior is illustrated in Fig. 7. As an example, a new vehicle task (TN) is inserted into the mission plan of system 2 (S2) (cf. Fig. 7a, b). First, the time constraints resulting from the transit times of the UV between the tasks need to be adjusted, which leads to a delay of task 2 (T2). If the start of task 3 (T3), as described above, depends on the successful completion of T2, the start time of T3 must also be delayed by the PA (cf. Fig. 7c) to maintain a valid mission plan.
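The delay propagation described above can be sketched as a simple forward pass over the task lists. This is a minimal illustration under strong assumptions: a constant transit time between consecutive tasks, earliest-possible start times, and no transit before a vehicle's first task. The functions `insert_task` and `schedule`, as well as all durations, are hypothetical and do not reproduce the PA's planning algorithm.

```python
def insert_task(task_lists, vehicle, task, position):
    """Return a copy of the task lists with the task inserted."""
    updated = {v: list(tasks) for v, tasks in task_lists.items()}
    updated[vehicle].insert(position, task)
    return updated


def schedule(task_lists, durations, transit, precedence=()):
    """Compute earliest start times for all tasks.

    task_lists: {vehicle: [task, ...]} in execution order
    durations:  {task: duration}
    transit:    constant transit time between consecutive tasks (assumption)
    precedence: pairs (before, after), "after" starts once "before" is done
    """
    start = {t: 0.0 for tasks in task_lists.values() for t in tasks}
    # Repeat until nothing moves; a valid plan settles within len(start) passes.
    for _ in range(len(start) + 1):
        changed = False
        for tasks in task_lists.values():
            for i, task in enumerate(tasks):
                earliest = 0.0
                if i > 0:
                    prev = tasks[i - 1]
                    earliest = start[prev] + durations[prev] + transit
                for before, after in precedence:
                    if after == task:
                        earliest = max(earliest,
                                       start[before] + durations[before])
                if earliest > start[task]:
                    start[task] = earliest
                    changed = True
        if not changed:
            return start
    raise ValueError("constraints are cyclic; no valid plan exists")


if __name__ == "__main__":
    lists = {"S1": ["T1", "T3"], "S2": ["T2"]}
    durations = {"T1": 10.0, "T2": 10.0, "T3": 10.0, "TN": 5.0}
    before = schedule(lists, durations, transit=5.0, precedence=[("T2", "T3")])
    after = schedule(insert_task(lists, "S2", "TN", 0), durations,
                     transit=5.0, precedence=[("T2", "T3")])
    print(before, after)
```

With these example values, inserting TN before T2 moves the start of T2 from 0 to 10, and because T3 depends on the completion of T2, the start of T3 moves from 15 to 20.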

Errors of the planner agent
From a technical point of view, it can be assured that the planner agent finds an optimal or near-optimal solution for the task assignment and scheduling of the UVs by minimizing mission time and resource usage. While this technical solution is correct in the majority of cases, there may be situations in real environments where the solution does not represent the user's intention, or where the user prefers a different, better solution based on experience. In such a case, the user can actively correct the errors of the PA by reselecting and re-delegating the corresponding task(s) via the scalable delegation concept. We especially expect such errors to occur in the delegation of team tasks, since in this case the system distributes multiple tasks to the team members at a very high level of autonomy. Other types of errors that can occur when using the PA are incorrectly modeled or unmodeled restrictions between the tasks in the system. In such a case, the user must manually modify or add these restrictions. Besides the technical errors of the system, there may also be errors made by the user in the delegation process. A typical error may be a violation of a modeled restriction. In such cases, the PA can point out the error to the user and suggest possible solutions via the heterarchical relation.

Manned-unmanned teaming in fighter UCAV missions
The Future Combat Air System (FCAS) is intended to meet the challenges of future operating environments (FOE) for European Air Forces. One part of this system network is the Next Generation Weapon System (NGWS), which needs the ability to penetrate denied airspace. Due to the high risk associated with this task, it is envisioned to reduce the number of manned platforms by using unmanned aerial vehicles. To investigate how this joint operation of manned and unmanned forces can be realized, we developed a laboratory prototype of cockpit and mission dynamics. Our approach is that the manned assets command the unmanned aerial vehicles, as well as their mission payloads.

Laboratory prototype
The generic laboratory prototype enables fighter missions together with unmanned combat aerial vehicles (UCAVs) in a team and swarm structure. The simulator is equipped with an outside view (cf. Fig. 8a) for flight simulation, and the manual control of the manned fighter aircraft is performed via HOTAS (Hands On Throttle And Stick). Even though the simulator is built to be realistic, we reduced the complexity of some tasks, as we expect these to be less complex in the future. For example, radar control and evaluation are highly automated, and we automated system management functionalities. Additionally, we added highly automated flight guidance and navigation capabilities to the autopilot. We can also expect, and currently observe in the case of the modern F-35 aircraft, that the main task of a fighter aircraft pilot in the future will no longer be to (manually) pilot the aircraft. With this reduction of complexity and increase of onboard automation in our simulator, we assume that piloting the aircraft is minimally taxing. Therefore, the complexity of the UV guidance will be similar to that at a ground control station in most situations. The main task of the pilot in the simulator is the mission management and the guidance and supervision of the team members. These tasks are performed on a generic central multifunctional head-down display (MHDD) and two side displays (cf. Fig. 8b).

Unmanned vehicle tasking
The tasking of the UCAVs and swarms is primarily carried out via the MHDD, which is shown in detail in Fig. 9. In this interface, the pilot can select specific pages on both sides of the interface (a). In the center of the figure, the team consisting of the manned fighter (b) and the unmanned vehicles (c) is shown. The task delegation process in the simulator works as follows: first, the pilot selects a specific task for the different types of targets (d, e, f) through the radial context menu (g). The parameters of this task can then be adjusted with the parameterization page (h). Finally, the task is integrated into the mission plan through the delegation page (i). These steps are described in detail below.
Fig. 8 Simulator setup: a view with the projection system, b view of the pilot
Fig. 9 Multifunctional head-down display for the cockpit-based multi-UCAV guidance

Task
According to the theoretical description of a task given above, tasks should contain spatial information, action, time constraints, and tactics. In our fighter aircraft domain, we use the feature, action, and context approach presented in Lindner et al. (2019) to specify these attributes in the task creation, shown in Fig. 10.
In this approach, we first define the spatial information of the task and then specify the action and the parameters for the execution. The spatial information of a task is called a feature (cf. Fig. 10a) and can be divided into the following categories in our approach:
- Points: like a building, missile launcher, parked aircraft, or even a navigation point
- Lines: like streets, rivers, routes, the forward line of own troops
- Areas (polygons): like marshaling area, CAPs, airports, radar reconnaissance range
- Moving points: like fighter jets, cars, tankers, AWACS
To make the vast number of tasks in fighter aircraft missions manageable, the actions of the tasks are preselected according to their context (cf. Fig. 10b).
As described in Lindner et al. (2019), we distinguish the contexts navigation, offensive counter air (OCA), air interdiction (AI), suppression of enemy air defense (SEAD), and electronic warfare (EW) in our application. For each of the different actions (cf. Fig. 10c), corresponding parameters (cf. Fig. 10d) can be specified. Some of these parameters are pre-parameterized with expert knowledge from interviews with German fighter pilots. Nevertheless, if the pilot has a certain tactic in mind for the task performance, these parameters can be modified by hand.
The implementation of the feature, action, and context concept in our simulator is shown in Fig. 11. The yellow symbols in Fig. 11a and b represent point features, and the yellow area in Fig. 11c represents an area feature. The yellow color indicates that the features have the context of air interdiction. In the user interface, a further separation of the tasks takes place, depending on whether a vehicle (Fig. 11a), team (Fig. 11b), or swarm (Fig. 11c) task should be generated. The time constraints and tactics of the tasks can be adapted after the creation, as described in Heilemann and Schulte (2020b), via the task configuration interface shown in Fig. 9h.
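The preselection of actions by feature type and context, together with pre-parameterized defaults that the pilot can override, can be sketched as a lookup table. Everything concrete in this example is hypothetical: the catalogue entries, action names, parameter names, and default values are placeholders for illustration and are not taken from Lindner et al. (2019) or from the simulator.

```python
from enum import Enum


class FeatureType(Enum):
    POINT = "point"
    LINE = "line"
    AREA = "area"
    MOVING_POINT = "moving point"


class Context(Enum):
    NAVIGATION = "navigation"
    OCA = "offensive counter air"
    AI = "air interdiction"
    SEAD = "suppression of enemy air defense"
    EW = "electronic warfare"


# Hypothetical catalogue: (context, feature type) -> {action: default parameters}
ACTION_CATALOGUE = {
    (Context.NAVIGATION, FeatureType.POINT): {"fly_to": {"altitude_ft": 20000}},
    (Context.AI, FeatureType.POINT): {
        "recon": {"standoff_nm": 10},
        "engage": {"weapons": 1},
    },
    (Context.AI, FeatureType.AREA): {"search": {"pattern": "parallel tracks"}},
}


def available_actions(context, feature_type):
    """Actions offered for a feature in a given context (preselection)."""
    return sorted(ACTION_CATALOGUE.get((context, feature_type), {}))


def parameterize(context, feature_type, action, **overrides):
    """Start from the pre-parameterized defaults and apply manual changes."""
    defaults = ACTION_CATALOGUE[(context, feature_type)][action]
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**defaults, **overrides}


if __name__ == "__main__":
    print(available_actions(Context.AI, FeatureType.POINT))
    print(parameterize(Context.AI, FeatureType.POINT, "recon", standoff_nm=15))
```

Keying the catalogue by context and feature type keeps the action menu short for each selected target, which is the purpose of the preselection.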

Scalable delegation
Based on the scalable delegation concept presented in chapter 3, an interface for the guidance of several UCAVs and swarms was developed (cf. Fig. 12). After the creation and parameterization of the tasks, this interface allows the delegation of the tasks on the scalable delegation levels, by selecting the corresponding buttons for team (Fig. 12a) or due vehicle (Fig. 12b) delegation. Alternatively, the pilot can select a specific time slot for the due position (Fig. 12c) delegation. The due time and position delegation of tasks can be achieved by parameterizing the task with a fixed time or by drag and drop of the task on the timeline. Team members that do not display a delegation slot (such as UV "Golf" in Fig. 12d) lack the capability required for the task (e.g., missing sensor). In the swarm section (Fig. 12e), all currently present swarms are shown. A new line appears if a new swarm has been launched. If a task has been assigned to the swarm (Fig. 12f), in addition to the task duration (marked by the boxes), the travel time from the launch point is shown (small line before the start of the task).

Human-in-the-loop evaluation
The different design patterns for the multi-UCAV guidance from the manned cockpit are evaluated in human-in-the-loop experiments. In the first experiment, we compare two different design patterns for the UV guidance and evaluate the use of the different task types and delegation levels. In the second experiment, we evaluate the planner agent design pattern in realistic missions of varying complexity and different force structures on the tool side.

Test subjects
For a high external validity, a total of 8 experienced German Air Force pilots (4 Eurofighter, 4 Tornado) were invited for the experiments (cf. Table 1). To prevent incorrect results arising from a lack of immersion or decreased situational awareness, all missions were performed from takeoff to landing.

Scalable delegation experiment
The goal of this experiment is to measure the influence of the planner agent on the task delegation process. To quantify this effect, the design pattern of the manual task delegation (MP), shown in Fig. 13a, is compared with the design pattern of the planner agent (PA), shown in Fig. 13b, in comparable situations. We want to show that the delegation time and errors can be reduced with the PA and that user acceptance for such a system is very high. Another research focus in this experiment is set on how the different task types (individual, team, and swarm) are used and which of the scalable delegation levels is preferred by the pilots.

Hypothesis
The differences between these two approaches are examined with the following four hypotheses:
H1-Task delegation with the PA design pattern is faster than with the MP pattern.
H2-Task delegation with the PA design pattern is faster than with the MP pattern when other tasks depend on the inserted task.
H3-Constraint violations are detected and corrected faster with the MP pattern.
H4-The delegation of team tasks is faster than the delegation of individual tasks for a teaming situation.

Missions
The hypotheses are examined in four missions containing similar tasking situations. To measure the influence of the PA, two missions were carried out with the PA and two with the MP design pattern. Each mission contained multiple mission sections, which comprised the reconnaissance, engagement, and battle hit assessment of a high-value target and the reconnaissance of secondary targets. Throughout the mission area, pop-up threats, i.e., enemy surface-to-air missile (SAM) sites, had to be expected. The rules of engagement in these missions stated that those threats should be engaged if an aircraft or high-value target is endangered.
Fig. 13 Comparison of the different design patterns: a manual task delegation, b planner agent pattern for the task delegation
An exemplary mission with three sections is shown in Fig. 14. In Fig. 14.0, the initial force structure with a manned fighter (gray) and three UCAVs (orange, blue, green) is depicted. After the takeoff, the first situation, shown in Fig. 14.1, is displayed to the pilot. After the successful completion of this section, the second mission section (cf. Fig. 14.2) was entered via a communication interface and the pilots had to replan in the air. In this situation, a pop-up threat, Fig. 14.2 s, occurred when the target area was approached. The successful completion of the mission required additional planning for the pop-up threat. The last situation is shown in Fig. 14.3, in which an area must be searched and the enemy vehicles found in this area must be planned for engagement. In this situation, a further pop-up threat occurred when the manned fighter and the team members were in range.

Results
First, the two design patterns are compared in terms of delegation time and error-proneness. Then, the usage of the scalable delegation level as well as the usage and acceptance of team tasks will be discussed. Finally, a short evaluation of the delegation interfaces will be conducted.
Task delegation time
The comparison of the task delegation time of the MP and the PA design pattern, cf. Fig. 15, revealed that the delegation time with both systems is generally quite similar. However, considerably more outliers with a delegation duration greater than 20 s exist for the MP design pattern (red) than for the PA pattern (blue).
Before the data analysis, we removed these outliers, i.e., delegation times larger than 20 s, from the experimental data (cf. Fig. 15, gray area). The analysis showed a statistically significant difference between the delegation times with the PA and with the MP pattern as control group: the mean delegation time was 0.9 s (95% CI [0.28, 1.62]) lower for the PA, t(415) = 2.78, p = 0.006, d = 0.27. Therefore, H1 can be accepted. In this analysis, we compared all delegated tasks and did not distinguish whether another task in the plan depends on the newly integrated task and must therefore be adapted. Nevertheless, such task dependencies, i.e., temporal or logical constraints between the tasks, have a major influence on the planning process and the planning time. For this reason, the delegation times for such dependent tasks are also compared between the two systems.
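The reported quantities t(415) and d can be reproduced in principle with a two-sample comparison after the 20 s cutoff. The following sketch assumes a pooled-variance (Student) t-test, which the text does not state explicitly, and uses made-up delegation times purely for illustration; they are not the experimental data. The p-value is omitted because it requires the t-distribution, which is not part of the Python standard library.

```python
import math
import statistics


def drop_outliers(times_s, cutoff_s):
    """Keep only delegation times that do not exceed the cutoff."""
    return [t for t in times_s if t <= cutoff_s]


def pooled_t_and_cohens_d(a, b):
    """Return (t, degrees of freedom, Cohen's d) for two independent samples.

    Assumes equal variances in both groups (pooled standard deviation).
    """
    na, nb = len(a), len(b)
    mean_diff = statistics.mean(a) - statistics.mean(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    pooled_sd = math.sqrt(pooled_var)
    t = mean_diff / (pooled_sd * math.sqrt(1 / na + 1 / nb))
    d = mean_diff / pooled_sd
    return t, na + nb - 2, d


# Illustrative values only (seconds); 25.0 is removed by the 20 s cutoff.
mp_times = drop_outliers([6.0, 8.0, 7.0, 9.0, 25.0], cutoff_s=20.0)
pa_times = drop_outliers([5.0, 6.0, 7.0, 6.0], cutoff_s=20.0)

t_stat, dof, cohens_d = pooled_t_and_cohens_d(mp_times, pa_times)
print(f"t({dof}) = {t_stat:.2f}, d = {cohens_d:.2f}")
```

Under this pooled-variance assumption, the degrees of freedom equal the total number of retained delegations minus two.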
Delegation for dependent tasks
The comparison of the delegation time for such dependent tasks in the plan generation reveals that these data points mainly account for the outliers in the MP design pattern (cf. Fig. 16). In the PA design pattern, on the other hand, this effect is only very weak, which indicates a much more efficient plan creation for dependent tasks with this pattern. Before the data analysis, we removed all outliers, i.e., delegation times larger than 30 s, from the experimental data (cf. Fig. 16, grayed-out area).
Due to the small sample size and the fact that the delegation time for the PA was not normally distributed (Shapiro-Wilk test, p < 0.05), we used the Mann-Whitney U test for the significance analysis. The distributions differed between both groups (Kolmogorov-Smirnov test, p < 0.05). We could show a statistically significant difference in delegation time between both groups, U = 11.00, Z = − 4.42, p < 0.001, and therefore H2 can be accepted. These measurement results were additionally verified with answers from a questionnaire depicted in Fig. 17. Planning with the PA was considered faster and was preferred. Additionally, the integration of planning constraints from the dependent tasks was seen as helpful.
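For readers unfamiliar with the rank-based test, the U statistic and its normal approximation Z can be computed as sketched below. This is a generic textbook formulation, not the analysis script used in the study: it reports the smaller of the two U values, applies no tie correction to the variance, and runs on made-up delegation times.

```python
import math


def mann_whitney_u(a, b):
    """Return (U, z), where U is the smaller of the two U statistics.

    Ties receive average ranks; z uses the normal approximation
    without tie correction.
    """
    combined = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over a run of equal values so they share one average rank.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    na, nb = len(a), len(b)
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u_a = rank_sum_a - na * (na + 1) / 2
    u_b = na * nb - u_a
    u = min(u_a, u_b)
    mean_u = na * nb / 2
    sd_u = math.sqrt(na * nb * (na + nb + 1) / 12)
    return u, (u - mean_u) / sd_u


# Illustrative values only (seconds), not the experimental data.
mp_dependent = [12.0, 15.0, 18.0, 20.0]
pa_dependent = [5.0, 6.0, 7.0, 8.0, 9.0]
u_stat, z_stat = mann_whitney_u(mp_dependent, pa_dependent)
print(f"U = {u_stat:.2f}, Z = {z_stat:.2f}")
```

A small U together with a strongly negative Z, as in the reported result, means that the values of one group rank almost entirely below those of the other.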

Fig. 15 Task delegation times for the manual planning (red) and planning agent (blue); the gray area marks the deleted outliers
Planning errors of the pilots
Another important factor besides the planning time of the two patterns is the susceptibility to errors and the possibility to find and fix them. Therefore, we assessed how many errors occurred during the planning with the respective design pattern and how long it took to correct them. Across all missions, we observed three planning errors with the MP pattern and two planning errors with the PA pattern. In contrast to the MP, the pilot was given a direct indication of the conflict when planning with the PA. The average time until the errors were identified and corrected was ~ 180 s for the MP pattern and ~ 20 s for the PA pattern. A significance analysis is not performed due to the small sample size (5), but the evaluation of the questionnaires, cf. Fig. 18, shows that with the help of the PA, errors could be found and corrected quickly.

Usage of the scalable delegation level
The PA design pattern enabled the pilots to delegate the tasks on scalable delegation levels, as described in chapter 4. The usage of these scalable delegation levels for the insertion of a vehicle task into the mission plan is shown as blue boxes in Fig. 19a.
We observe that the due position delegation was heavily preferred for the vehicle task delegation by the pilots. A pilot-specific view, cf. Fig. 19b, of these delegation levels for each of the eight pilots reveals how often the respective delegation level was used by them. While some pilots almost exclusively used the due position delegation (Fig. 19b1), others preferred the due vehicle delegation (Fig. 19b2). A third pilot only used the due position and team delegation (Fig. 19b3) but no due vehicle delegation. This shows that the different delegation levels allow a user-specific operation of the system. The results also indicate that the pilots were not overtaxed; otherwise, they would probably have chosen the team delegation, in which they would not have to decide on the best team member and the position. In a questionnaire, the pilots further stated that they found the different delegation levels useful and that they enabled a situation-specific task delegation, cf. Fig. 20.
Usage of team tasks
The planning of complex teaming situations in the missions could be done either by delegating and parameterizing the corresponding vehicle tasks to the different UVs or by delegating a team task. In 81.25% of the cases, these complex teaming situations were planned by using team tasks and correspondingly in 18.75% of the cases by the delegation of the corresponding vehicle tasks. The delegated team tasks resulted in a total of 105 vehicle tasks distributed by team delegation to the UVs in the missions. Of these 105 team-delegated vehicle tasks, only 9 (8.57%) were subsequently moved by three of the pilots to another team member. All of the moved tasks had been assigned to the fighter aircraft by the PA, and these three pilots did not want to perform them themselves. This result also shows that the pilots can identify incorrect decisions made by the PA and correct them using the scalable delegation concept.
A comparison of the delegation time for teaming situations showed that delegating team tasks was on average 9.5 s faster than planning the same situation by delegating vehicle tasks, cf. Fig. 21.
A Mann-Whitney U test was calculated to determine if there was a difference in delegation time for teaming situations between team tasks and the delegation of vehicle tasks. The distributions differed between both groups, Kolmogorov-Smirnov p < 0.05. There was a statistically significant difference in delegation time, U = 2.50, Z = − 3.875, p < 0.001, using the exact sampling distribution of U, and therefore H4 can be accepted. However, this time saving of the team tasks is reduced if the pilot subsequently has to move delegated tasks to another team member. The measured values are also reflected in the evaluation of the questionnaires. The pilots stated that the team tasks enabled a faster and less error-prone delegation of tasks and extended the individual UCAV management meaningfully (cf. Fig. 22).
Delegation interface evaluation
After the experiment, an evaluation of the delegation interface was performed (cf. Fig. 23). The pilots stated that the individual delegation levels were well structured, easy to understand, and intuitive to use. This enabled the easy integration of new tasks into the plan.

Full system evaluation
In the last experimental section, we want to showcase the planner agent design pattern in a full mission simulation. Therefore, we developed realistic air combat scenarios to challenge the expert pilots. In total, six full missions in three levels of difficulty (A permissive, B semi-contested, C contested) were conducted with each pilot. Within these, they had to cope with different kinds of tasks and differently composed MUM-T systems (configurations I-VI) (see Fig. 24). The generic overall mission sequence is the penetration of enemy territory (Ingress), reaching the target area to achieve the desired effect and leaving enemy territory (Egress). In all scenarios, enemy air defense (ground-and air-based) had to be considered and dealt with.

Hypothesis
For this experiment, we want to test very generally described hypotheses to find out whether the chosen design pattern approach results in a suitable system configuration for air combat missions.
H1-a pilot can guide a MUM-T system efficiently in a military air mission.
H2-the force composition of the MUM-T system impacts the workload imposed on the pilot.
H3-keep the human in the decision-making process.
H4-the human places trust in the automation (cognitive agents).

Missions
In the missions of the same difficulty level (A, B, C), there was a comparable number of tasks, targets, threats, and tactics to be applied (4 T's). Before each mission, the pilots were briefed on the conflict situation in a mission briefing. Afterward, they went into the cockpit simulator. Before takeoff, they had to work through checklists and obtain the takeoff clearance from air traffic control. From then on, they had to try to achieve the required mission target with the MUM-T system on their own. To ensure the comparability of the mission process among the test persons, an initial plan was proposed to the pilots. In the further course of the mission, the pilots had to dynamically adapt this plan to the changes in the situation that occurred during the flight. For a more detailed description and evaluation of the experiment, refer to Lindner and Schulte.

Results
In our experiments, all pilots could efficiently operate the MUM-T systems within various environmental settings and deal with the corresponding primary targets (H1 is supported).
Measuring the required effort of delegation and monitoring mission execution with and without a swarm network, we got comparable results. This points in favor of our approach to integrate multiple platforms as a swarm avatar into the system network. Using swarming as an operational scheme, the guidance complexity becomes independent of size. This conflict resolution impacts the arising pilot workload (supports hypothesis H2).
The subjective rating of the pilots gives evidence for great acceptance of the overall system design. The degree of reality of the mission design was predominantly assessed as high to very high. The representativeness of the setting is one of the main aspects contributing to the validity of the human-in-the-loop evaluation and could be verified through the questionnaires. All missions could be performed within manageable work and task load. In no situation did any of the test persons feel overwhelmed (supports hypothesis H3). Thus, the trust in automation for the assistant system was high. All pilots relied on the unmanned systems to perform their tasks independently (indicates support for hypothesis H4). Due to the high trust, the monitoring process has been kept to a minimum by the pilots. Operator responses to Likert scale questionnaires as well as their verbal feedback during debrief sessions reinforced the finding that the task-based guidance concept is a sophisticated way of interacting with other teammates. Figure 25 depicts some selected scores of a post-mission questionnaire. The scores show a great level of overall acceptance of our scalable tasking concept. The interaction concept was rated to be intuitive. The planning agent contributed to reaching the mission goal faster. The pilots also considered the tasking concept as suitable for operational use in purely manned crews. Integrating UVs by means of teaming led to a high appreciation of transparency and trust. The same holds true for our swarming approaches. Using the proposed concepts, we created a MUM-T system with which pilots could well imagine operating in the future.
We further investigated in these experiments how much time the pilots spent with the different aspects of the simulator in the missions. We observed that tactical assessment, which involves maintaining situational awareness by monitoring the actions of the UCAVs and enemy forces, plays a huge role in such missions. We also could validate that manual flight by the pilots only plays a subordinate role, since over 70% of the time the pilots used the autopilot flight. A more detailed evaluation of this experiment can be found in Lindner and Schulte (2020).
A questionnaire, cf. Fig. 26a, comparing the general attitude of the pilots regarding automated planning systems, i.e., the planner agent, before (blue) and after (orange) the experiment further revealed an improvement in all points. The fast and easy use of the system supported the pilot and was rated useful. The lowest score was given to the area of errors in these systems, although the pilots were still able to complete all missions successfully. Overall, the planner agent proved to be fast, transparent, useful, and easy to use, cf. Fig. 26b. The situation-adaptive interventions of the planner agent in the realistic missions were rated as neither too intrusive nor too conservative by the pilots, as intended by the system design.

Conclusion
In this work, we first described different approaches for the guidance of teams and swarms of unmanned vehicles with the help of the human autonomy teaming design patterns. Based on the design pattern, a promising approach with a scalable delegation concept using a central planning agent was developed. This approach was transferred to military air operation and the implemented cockpit prototype was presented. We then evaluated this design pattern in a human-in-the-loop experiment with German Air Force pilots. The different scalable delegation levels as well as the possibility to guide single unmanned vehicles, teams, and swarms enabled faster, less error-prone, and situation-adapted planning. The applicability of the PA design pattern was further demonstrated in realistic mission scenarios with different force structures. Overall, we observed a high level of trust and transparency in the guided systems, and the suitability of the system for real-world applications was shown.
Nevertheless, it also became apparent that in time-critical military situations, the delegation process may take too long and that system cooperation for plan creation might be indispensable in critical situations. In Heilemann and Schulte (2020a), situation-specific assistance is successfully demonstrated for different threat situations. In order to avoid automation-induced errors, we further plan to integrate the mental state (Schwerd and Schulte 2019) and the mental workload (Mund et al. 2020) into the intervention decision of the PA (Müller and Schulte 2020). The experiments further revealed that some pilots would prefer not to receive tasks from the PA. Even though the problem could be solved by a subsequent delegation of the undesired task, it shows that different delegation preferences of the pilots should be provided to or learned by the PA in the future. In future studies, a certain degree of failure should additionally be integrated into the PA and the UCAV agents to make the system more realistic. Such an increase in errors of the highly automated systems inevitably leads to a strong focus of research on human-agent communication, transparency, and trust as described in Lyons (2013).
Funding Open Access funding enabled and organized by Projekt DEAL.

Declarations
Conflict of interest Mr. Heilemann reports grants from Bundesministerium der Verteidigung/Federal Ministry of Defence (Germany), during the conduct of the study. In addition, Mr. Heilemann has a patent multi-vehicle control method pending to Universität der Bundeswehr München, Heilemann, Schmitt, Brand, Schulte.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.