1 Introduction

In every area of the economy there are plans to move from manual (human-controlled) systems to autonomous (no-human-required) systems. As the technology needed to support this rapid movement has improved, almost on a daily basis, there is greater recognition that human oversight of these systems will be needed in the near future. For example, when automakers and robot designers use the term autonomy they generally mean autonomy within a limited range of functions, or autonomy for a broad range of functions with human oversight. Before proceeding with the discussion of Human Autonomy Teaming, we would like to offer a few definitions of autonomous from Dictionary.com [1]:

  • Government. a. self-governing; independent; subject to its own laws only. b. pertaining to an autonomy or a self-governing community.

  • Business. Having autonomy; not subject to control from outside; independent: a subsidiary that functions as an autonomous unit.

  • (of a vehicle) navigated and maneuvered by a computer without a need for human control or intervention under a range of driving situations and conditions: an autonomous vehicle.

These definitions clearly describe systems that have both the ability and freedom to make independent judgments. However, some of our most advanced systems – Waymo’s self-driving car, Tesla’s Autopilot – still require human oversight. For example, current “autonomous” cars have significant problems dealing with traffic when it is directed by people (e.g., flagmen or police officers) and with static objects in the roadway [2, 3]. This creates a need to team autonomous systems with humans to improve overall system safety and efficiency.

Having people work with automation, even when that automation has a certain level of autonomy, does not equate to human autonomy teaming (HAT). HAT requires that there be some level of cooperation and coordination in achieving goals. This paper tells the story of how our research at NASA in support of work on single pilot operations (SPO) and reduced crew operations (RCO) came to incorporate HAT. The goal of that research was to explore the possibility of reducing the crew complement on commercial flight decks from two pilots to one. Based on task analysis, a concept of operations was developed that called for automation and a ground operator (similar to a dispatcher) to support the single pilot. Our initial prototype ground stations provided an ability to coordinate with a human ground operator, and provided (increasing levels of) automation. As we included more automation, our research participants expressed distrust of the automation and uncertainty about the rationale for the suggestions recommended by the automation. This led us to begin work to make the automation act more like a teammate.

After a brief discussion of our pre-HAT work, this paper will present our initial vision for HAT, followed by a discussion of our HAT implementation to support an advanced airline dispatcher ground station and a final implementation of HAT tools on the flight deck. The majority of data reported in this paper will be flight dispatcher and commercial transport pilot ratings and their comments on the usability and acceptability of the HAT tools.

2 Pre-HAT SPO/RCO Work

2.1 Technical Interchange Meeting

NASA began its work on SPO by convening a technical interchange meeting (SPO TIM) to discuss the feasibility of SPO [4]. Two types of challenges resulting from the removal of the second pilot were often mentioned: workload and redundancy (see also [5]). The consensus of attendees was that to make SPO feasible, workload needed to be reduced to a level where a single pilot could handle it. Also, and perhaps more important, removing the second pilot raises the issue of how to replicate the redundancy that pilot currently provides, which is required for certification and flight safety. The group converged on two approaches to the workload and redundancy problem: onboard automation or external support from other people.

2.2 Experiment 1: Together Versus Apart

In our first experiment, we evaluated the effect of crews working together, versus being in separate locations, on crew communications and workload (see Fig. 1), as suggested by Thomas Sheridan at the SPO TIM [4, 6]. In this study flight deck automation replicated that found on current transport category aircraft. Ten two-pilot crews flew both together and apart – at separate redundant ground stations – while resolving off-nominal diversion scenarios.

Fig. 1. Pilots flew together (left) and with captain and first officer separated (right).

Lessons Learned.

In this experiment we found that while control manipulations can be acknowledged non-verbally in two-pilot operations, when pilots are separated acknowledgement may be forgotten or require extensive radio use. Additionally, there is a risk of shared situation awareness (SA) being reduced when pilots are physically separated. Pilots appeared to have increased uncertainty about roles and responsibilities (e.g., Do I have the aircraft or do you?), uncertainty about control manipulation (e.g., Are you entering the altitude?) and uncertainty about completed actions (e.g., Did you put that in the CDU?).

Based on these results and additional feedback from our pilots, we developed tools to facilitate remote collaboration – Crew Resource Management (CRM) Tools. These tools were then implemented in our ground station and evaluated in the next experiment.

2.3 Experiment 2: Higher Fidelity with CRM Tool Manipulation

In our second experiment 18 two-pilot crews flew high-workload off-nominal scenarios that required diversions [7]. This time, however, they flew with CRM indicators we developed to show roles and responsibilities, shared charts, shared flight deck displays and video that allowed the pilots to see each other (see Fig. 2). As in the first experiment, crews flew both side-by-side in a baseline configuration (this time in a high-fidelity full-motion simulator) and separated. In the separate condition the captain remained on the flight deck and the first officer flew a prototype ground operator station that incorporated aspects of both a flight deck and an airline dispatch station. To assist in planning diverts, the ground station was equipped with a rerouting tool incorporating a previously developed NASA technology, the Emergency Landing Planner (ELP; [8, 9]), which assessed the suitability of airports near the aircraft and returned recommendations for which airport would make the best divert. This tool also provided routing information to the selected airport. A simple dispatcher task to reroute aircraft around convective weather was also introduced.

Fig. 2. SPO II ground station. CRM indicators circled on the right and video of the cockpit circled on the left.

Lessons Learned.

Data from this second experiment was generally positive for our shared tools (CRM indicators, video, flight deck displays, and shared charts) although pilots had multiple suggestions for improvement. A communication analysis showed that crews spent more time communicating, shared more decision-relevant information and were more responsive to each other when CRM indicators were available, suggesting these tools directed crewmembers’ attention to their joint responsibility for safe decision-making [10]. We also found that when the captain requested assistance from the ground dispatcher, the dispatcher focused on that aircraft and stopped performing the rerouting task. We concluded that a ground operator working off-nominal aircraft should be relieved from servicing other aircraft. This procedure is similar to current practice in Airline Operations Centers: dispatchers often hand off their nominal aircraft to other dispatchers and give one-on-one support to aircraft that need to divert. We refer to this one-on-one mode of operation as dedicated assistance.

2.4 Experiment 3: Investigation of Situation Awareness Issues

In the third study we tested two concepts of operation. If SPO is to be considered for implementation, a ground operator must give dedicated assistance to aircraft in high-workload or off-nominal situations. However, for SPO to be cost effective, the ground operator must also handle more than one aircraft. We therefore evaluated two ground station concepts of operation:

Specialist, in which the ground operator only performs dispatch functions and hands the aircraft to a separate person (pilot) who provides dedicated assistance to the aircraft when needed; and

Hybrid, in which the ground operator performs dispatcher functions and, when needed, can hand off all other aircraft and provide dedicated assistance.

The CRM tools and the ELP [8] were similar to those in the previous experiment (see Fig. 3) [5]. Thirty-five commercial airline pilots participated in this experiment. In the hybrid condition a ground operator (the participant pilot) acted as a dispatcher until one troubled aircraft (a confederate pilot) had an off-nominal situation, at which time the dispatcher entered dedicated support, assuming the role of first officer for that flight and handing off the other aircraft. Varying the level of interaction the ground operator had with both the “to-be-troubled” aircraft and with the airspace in general, prior to dedicated support, allowed us to look at the effects of this initial exposure on performance. In the specialist condition, the participant pilot was simply handed the troubled aircraft with a brief message (e.g., “Sir, flight 123 needs dedicated assistance”) without prior exposure to either the flight or other environmental conditions such as the weather.

Fig. 3. SPO III dispatch ground station: (a) flight deck displays for the selected aircraft; (b) TSD, ACL with ELP recommendations; and (c) CRM tools and sharable charts.

Lessons Learned.

We found no performance difference between our two ground station support concepts, hybrid and specialist. This suggests that with the tools provided participants could gain sufficient SA to perform the task relatively quickly. From a concept of operations perspective, it suggests that the decision of whether to have ground pilots waiting to take over distressed aircraft or to increase training (cost and time) for flight-followers could be made on economic grounds.

Overall, participants found the ground station tools (information on the aircraft control list (ACL), shared charts, the traffic situation display (TSD) with ELP recommendations, and CRM indicators) to be useful. Of particular interest were their impressions of the ACL. Pilots reported that the ACL improved their SA. One pilot commented, “I would like to see a lot more info on the ACL. I really liked the concept.”

2.5 Experiment 4: Monitoring Multiple Aircraft

The previous three experiments focused on the ability of a ground-based flight-follower to perform piloting duties, sometimes helping to manage a single-piloted aircraft under high workload and off-nominal conditions. This study examined the ability of this flight-follower to work with a fleet of aircraft. These flight-followers could not actually control the aircraft as they could in the previous studies. However, with additional automation they did perform some of the functions normally associated with the pilot not flying/monitoring in a two-person crew.

To facilitate the increased monitoring task, a new Aircraft Monitoring and Management System (AMMS) was introduced. This system gathered data from various sources (e.g., weather data, ATC clearances, aircraft position, and EICAS alerts) and placed prioritized alerts on a redesigned ACL when threats were detected (see Fig. 4) [5]. The route replanning tool used in Experiment 3, presented to the left of the TSD, was augmented to display ATIS at the destination airport and to indicate which of a number of risk factors were present at any potential divert location [12, 13]. Operators could request ratings for airports that were not recommended by the tool and could adjust the weighting of various factors going into the recommendation. The modified tool was renamed the Autonomous Constrained Flight Planner (ACFP). Five certified dispatchers and five commercial airline pilots participated in the Build 1 evaluation. Participants ran two one-hour scenarios. Each scenario required participants to make approximately six diversions using the ground station tools.

Fig. 4. Build 1 ground station. Bottom center, ACL, augmented with timeline and alerting information; above the ACL is the TSD; to the left are flight controls and displays for the selected aircraft in read-only mode; on the right are the CONUS map and charts.

Lessons Learned.

The dispatchers and pilots were very positive about the ground station. Specifically, they agreed that the automation and displays did a good job of integrating information. They found that the alerts reduced the workload of the monitoring task. They also found the ACFP route replanning tool useful: “The ACFP is outstanding… We like to be able to verify stuff, so what is really cool is you guys have that ability, you don’t just blindly trust, you can verify by literally looking at the ATIS and say, ‘Ah! I think that is pretty accurate’.” However, they also had significant issues with risk ratings. One participant reported, “I was not always sure what the tool was prioritizing: weather, distance, or time. [Because of this] I skewed my decisions more toward a personal judgment”.

Voice recognition and voice synthesis technologies were used to support both the ability to perform some functions by voice and to receive briefings from the ground station. However, our system lacked robustness and thus was not fully utilized by the operator. It also did not show the proper etiquette, speaking over the operator and pilot. We also found that dispatchers and pilots differed in their attitude toward the concept of enhanced ground support. While dispatchers were eager for the additional information and tools at the ground station, pilots on the other hand were more cautious about interruptions from the ground.

3 Our Concept for HAT

Based on these initial studies it was clear that the automation tools designed and implemented in the ground station were helpful in performing the flight-following task. We therefore continued to work with dispatchers and pilots to develop more automation. However, there were issues noted with respect to transparency and trust in the automation. In the next series of studies we thus began to integrate new collaborative decision making technologies [14, 15, 16], which we will collectively refer to as human-autonomy teaming or HAT.

3.1 Why HAT?

HAT attempts to address a long-standing issue with automation: while engineers attempt to develop systems for as many foreseeable conditions as possible, these systems inevitably end up in conditions they cannot handle. Sometimes this is because the engineers could not find a way to handle the situation. Typically in these cases the manual will explicitly call for the human operator to take control (e.g., the autopilot shutting off on Air France 447). In other cases, the engineers simply did not foresee the conditions. In either case, the human operators suddenly find themselves in tricky off-nominal conditions, often with little understanding of how they got there [17].

To overcome these issues, we sought to develop a framework for HAT in which automation could be treated as a teammate. Over the last 40 years, aviation has developed a model for good teamwork referred to as Crew Resource Management, or CRM. Our initial HAT framework focused on three design tenets inspired by CRM: transparency, bi-directional communication (including a shared language), and operator directed execution [16].

3.2 HAT Tenets

Transparency:

Good CRM between humans requires team members to understand what the others are doing and why. When teaming with automation, intention is often less intuitively obvious, so transparency about reasoning is necessary. Transparency of the automation has to do with whether its functioning is easily understood by operators. Operators must have knowledge of the general logic of how it works so that they can develop accurate mental models of its functioning, and be able to discern what mode the automation is in [17]. In the case of early fly-by-wire aeronautics systems, for example, test pilots placed little trust in the automation because the functioning was obscure to them [18].

Bi-directional Communication:

Good CRM between humans requires people with different information to enter a dialog about how best to achieve their goals. This implies explicit discussion of goals (as opposed to intent inferencing), as well as confidence, and rationale. To facilitate this dialogue a shared language or “phraseology” is needed to improve communication efficiency. This dialogue can be initiated with plays called by the operator. The play is an adaptable system of assigning specific tasks prior to a mission based on delegated agreements that can be invoked by the human.

Operator Directed Execution:

Good CRM requires that someone be responsible for final decisions and that decisions be explicit. Through the use of “plays” this responsibility is ascribed to the human, and will continue to be for the near future. This does not mean that the automation can never act autonomously. Through the use of plays operators can still delegate tasks to automation, but only the human can execute the final action. We also argue that automation should be adaptable: goals, operating modes and levels of automation should change at operator direction or based on prior agreements.

3.3 HAT Agent

The HAT tenets described above give us general guidelines for implementing HAT. An important (and, to date, unanswered) question is the degree to which specific implementations can be used across multiple kinds of automation. That is, can we develop a “HAT Agent” that would add teaming capabilities to a variety of automation? This HAT agent could encapsulate a number of important teamwork functions such as maintaining a goal structure, coping with counterfactual “what if” questions, and understanding when to interrupt an ongoing task. It might also provide interfaces for HAT interactions such as cooperative decision-making and calling, modifying, and monitoring plays (a type of shared plan of action, see [15]). A sketch of such an agent is presented in Fig. 5 [16].

Fig. 5. Initial model of HAT interactions.

4 Implementing HAT for RCO

In our initial implementation of HAT based on CRM principles, we developed an agent that only mimics intelligence because the knowledge that it presents is instantiated by our programmers and not learned through an interaction with the real world. However, as discussed in the next section, we attempted to imbue our ground station with the HAT principles outlined above.

4.1 Experiment 5: HAT no HAT

Experiment 5 was based on the HAT tenets outlined above, and a human autonomy teaming approach was taken to the design of ground station automation.

The interface was implemented using the playbook approach to set goals and manage roles and responsibilities between the operator and the automation [15]. It provided 13 different plays the operator could call to address off-nominal airspace and system simulation events. When the operator selects a play, the ACFP is initiated with preset weights, and the corresponding play checklist appears on the display identifying shared operator tasks in white and automation tasks in blue (see Fig. 6) [19]. As per our tenets, the operator was always responsible for executing any recommendations.
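The play structure described above can be illustrated with a minimal sketch. This is not the ground station software; the class names, the example play, its tasks, and the weight values are all hypothetical and chosen only to show how a play might bundle preset ACFP weights with a checklist that separates operator tasks from automation tasks, while leaving execution of any recommendation to the operator.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Actor(Enum):
    OPERATOR = "operator"      # shown in white on the play checklist
    AUTOMATION = "automation"  # shown in blue on the play checklist


@dataclass
class Task:
    description: str
    actor: Actor


@dataclass
class Play:
    name: str
    preset_weights: Dict[str, float]  # hypothetical factor names and values
    checklist: List[Task] = field(default_factory=list)

    def tasks_for(self, actor: Actor) -> List[Task]:
        """Return the checklist items assigned to one team member."""
        return [task for task in self.checklist if task.actor is actor]


class Playbook:
    """Holds the plays an operator can call by name."""

    def __init__(self, plays: List[Play]) -> None:
        self._plays = {play.name: play for play in plays}

    def call(self, name: str) -> Play:
        return self._plays[name]


def execute_recommendation(recommendation: str, operator_approved: bool) -> str:
    """Operator directed execution: nothing is executed without approval."""
    if not operator_approved:
        return f"PENDING: {recommendation} awaits operator decision"
    return f"EXECUTED: {recommendation}"


# Hypothetical example play, for illustration only.
EXAMPLE_PLAYBOOK = Playbook([
    Play(
        name="example_divert",
        preset_weights={"eta": 0.5, "services": 0.5},
        checklist=[
            Task("Compute candidate divert airports", Actor.AUTOMATION),
            Task("Review and select divert airport", Actor.OPERATOR),
        ],
    )
])
```

In this sketch the automation can prepare and recommend, but the only path to an executed action passes through an explicit operator approval, which mirrors the operator directed execution tenet.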

Fig. 6. Operator directed interface for calling plays in the HAT condition and associated checklist of roles and responsibilities. (Color figure online)

For Bi-Directional Communication, weights were preset for each play and presented in slider bars (top of Fig. 7). The operator was able to negotiate with the system by altering these weights to what the operator considered appropriate for the situation. The operator could perform “what if exploration” by changing the weights to see how the divert recommendations were affected. Using the example shown in Fig. 7, if the operator decided that estimated time of arrival (ETA) to the airport was a higher priority than available services, the operator could adjust the ACFP weights and find new recommendations.
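To make the "what if" exploration concrete, the following minimal sketch ranks candidate divert airports by a weighted sum of factor scores. The paper does not specify how the ACFP combines its factors, so the normalized weighted sum, the factor names, the airport labels, and every number below are assumptions made purely for illustration.

```python
from typing import Dict, List


def rank_airports(
    candidates: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[str]:
    """Rank airports from best to worst by a normalized weighted sum.

    candidates maps an airport label to factor scores in [0, 1],
    where higher is better. weights maps factor names to slider values.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    def score(factors: Dict[str, float]) -> float:
        return sum(weights[name] * factors[name] for name in weights) / total

    return sorted(candidates, key=lambda label: score(candidates[label]), reverse=True)


# Hypothetical candidates: AAA is close but has few services,
# BBB is farther away but well equipped.
CANDIDATES = {
    "AAA": {"eta": 0.9, "services": 0.3},
    "BBB": {"eta": 0.4, "services": 0.9},
}

if __name__ == "__main__":
    # Preset weights that favor available services.
    print(rank_airports(CANDIDATES, {"eta": 0.2, "services": 0.8}))
    # The operator moves the sliders to favor ETA and sees the ranking change.
    print(rank_airports(CANDIDATES, {"eta": 0.8, "services": 0.2}))
```

Exposing the weights rather than only the final ranking is what lets the operator see why a recommendation changed, which ties this example to the transparency tenet as well.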

Fig. 7. Transparency and bi-directional communication in the HAT condition implemented by ACFP recommendations (on bottom) and weights (on top).

To address the significant task of monitoring 30 aircraft, an aircraft monitoring and messaging system (AMMS) was implemented in the ACL. The AMMS alerted the dispatcher to any non-normal events associated with the aircraft:

  • Weather along the current cleared path

  • Deviations from the current cleared path – both track and altitude

  • Adverse event at the destination airport that would render the airport unusable (weather minimums, airport closures, etc.)

  • Any system problems on the aircraft.

Previous research indicates that autonomous cooperation between robots can improve performance of human operators [20] and improve team performance. The idea of autonomous agents reporting problems to a central authority (call center) was proposed by Xu in 2012 [21]. Google maintains a call center to oversee its self-driving cars. The ACL coupled with the AMMS reduces the monitoring task of our ground station operator. The AMMS with its access to the information listed above allows the system to diagnose any problem and alert the operator, freeing up resources which can be used to service additional aircraft.
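The monitoring logic described above can be sketched as a set of per-aircraft checks whose results are sorted so that the most urgent alerts surface first. This is an illustrative sketch only: the severity levels, their assignment to alert types, the deviation thresholds, and all field names are assumptions and are not taken from the AMMS itself.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Tuple


class Severity(IntEnum):
    ADVISORY = 1
    CAUTION = 2
    WARNING = 3


@dataclass(frozen=True)
class AircraftState:
    callsign: str
    weather_on_route: bool
    track_deviation_nm: float
    altitude_deviation_ft: float
    destination_usable: bool
    system_faults: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Alert:
    callsign: str
    severity: Severity
    message: str


def check_aircraft(
    state: AircraftState,
    max_track_nm: float = 2.0,    # hypothetical threshold
    max_alt_ft: float = 300.0,    # hypothetical threshold
) -> List[Alert]:
    """Apply the four alert categories to a single aircraft."""
    alerts = []
    for fault in state.system_faults:
        alerts.append(Alert(state.callsign, Severity.WARNING, f"System problem: {fault}"))
    if not state.destination_usable:
        alerts.append(Alert(state.callsign, Severity.CAUTION, "Destination airport unusable"))
    if state.track_deviation_nm > max_track_nm or state.altitude_deviation_ft > max_alt_ft:
        alerts.append(Alert(state.callsign, Severity.CAUTION, "Deviation from cleared path"))
    if state.weather_on_route:
        alerts.append(Alert(state.callsign, Severity.ADVISORY, "Weather along cleared path"))
    return alerts


def prioritized_alerts(fleet: List[AircraftState]) -> List[Alert]:
    """Collect alerts for the whole fleet, most severe first."""
    alerts = [alert for state in fleet for alert in check_aircraft(state)]
    return sorted(alerts, key=lambda alert: alert.severity, reverse=True)
```

Because aircraft in a nominal state produce no alerts, the operator's attention is drawn only to the subset of the fleet that needs it, which is the workload benefit the text attributes to coupling the ACL with the AMMS.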

During this study four flight dispatchers and two pilots participated. After 3.5 h of training on the ground station, they managed the flight-following task during two 50 min scenarios, with and without HAT tools. During a scenario they managed approximately 30 aircraft, and worked with our pseudo-pilots to complete six diversions. During the study we collected subjective and objective data; only the subjective data is reported.

Lessons Learned.

In a study comparing ground stations with and without HAT, ground station operators (both dispatchers and pilots) preferred the ground station with HAT over the station without HAT features. They reported that the ground station with integrated HAT features (ACFP and AMMS) was preferred for keeping up with operationally important issues. Workload in the HAT condition was lower, as measured by both subjective rating and eye-gaze duration data. Participants agreed that the automation and displays did a good job of integrating information, and they liked the new HAT interface to the ACFP (for example, “The sliders, I thought, were pretty well done.”; “I loved the HAT…It doesn’t take long to learn.”).

4.2 Experiment 6: Integration of HAT on Flight Deck

Since the ground dispatch and the onboard captain share responsibility for the safety and efficiency of the flight and both must consent on any flight deviations, a clear next step was to install the HAT tools on the flight deck. So in the final study, we integrated the ACFP, AMMS, and playbook paradigm into an electronic flight bag (EFB) and installed it on the flight deck (see Fig. 8). Twelve airline transport pilots participated in our flight deck assessment of HAT tools which were presented on an EFB. Each pilot flew three off-nominal, 15 min scenarios in both the HAT and no HAT conditions. In each condition scenario difficulty varied – high, medium and low.

Fig. 8. Flight deck HAT setup: (A) EFB for interacting with HAT features.

Lessons Learned.

We found no differences for HAT ratings on situation awareness, workload or trust. However, participants showed a significant preference for HAT over No HAT conditions. Moreover, as with any emerging technology, the participants provided suggestions for improving the HAT agent. These suggestions included a better voice interface that uses natural language, better labeling of anchor points on our slider tools, and suggestions for providing pilots with additional information.

5 Conclusions

This paper describes a line of research whose goal was to explore the feasibility and acceptability of single pilot operations for commercial transport aircraft and the development of a human autonomy teaming approach to automation which supported the single pilot and flight dispatcher. One of the significant impediments to SPO was the loss of nonverbal cues when crews were not co-located. To mitigate this problem crews communicated more often and openly discussed roles and responsibilities. Normally the roles and responsibilities are decided by the captain prior to a flight, or during a flight the pilots hand off responsibility as needed with just a nod or a single utterance (“I got the stick”). To remediate this loss of nonverbal cues we developed the CRM tools, which allowed the team to quickly assess current roles and responsibilities. Data from the first two studies suggested that the suite of tools introduced and empirically tested to address CRM challenges stemming from non-co-located crews was generally useful, although pilots had multiple suggestions for improvement. Another impediment was the importance of SA prior to providing dedicated support to a single pilot aircraft. This issue had significant implications for how quickly the single pilot could expect the needed dedicated support and consequently the number of pilots needed to provide dedicated support. In the third SPO study we found no difference between our two operational concepts – hybrid and specialist. We concluded from this that if the ground station displays present the environmental and systems data which are important to gaining overall situation awareness of the vehicle needing dedicated support, either concept would be feasible. The data from this study showed that with appropriate displays, ground operators can jump in and provide assistance, even if they are coming from a place where they have minimal situation awareness.
Lastly, the final three studies suggest that moving to a human autonomy teaming concept reduced the need to continuously monitor individual aircraft. With the HAT tools, when a problem arose on any particular aircraft, the ground operator would be immediately alerted and could call a play which immediately provided resolution alternatives. Additionally, some tasks could be handed off to the automation, reducing task workload.

In the future we plan to continue the development of our HAT agent, giving it some adaptive capabilities and the ability to learn from its environment. However, we will be mindful of Miller and Parasuraman’s [15] caution about the technical and philosophical issues with adaptive systems – by their nature they usurp delegation authority from the human. Finally, we plan to evaluate the use of HAT concepts and tools in our future work on Urban Air Mobility, which seeks to safely and efficiently move cargo and passengers in urban areas.