Fig. 1

A flow diagram showing how perception, control and communications typically interact in a multi-robot system. On the left is a proposed future approach to co-optimization, discussed in Section Challenges and Open Problems, which now admits communications and its confounding factors as integral to any co-optimization strategy designed for a multi-robot system

The use of multiple, connected robots in the place of individually uncommunicative robots provides evident gains by facilitating the inter-robot coordination that allows for work distribution, spatial coverage, and specialization. An increasing variety of applications leverage such networks of robots, including logistics [1, 2], resource distribution [3, 4], transport systems [5,6,7], manufacturing [8], and agriculture [9, 10].

These applications depend on an orchestration of robots over time and space that allows them to jointly work towards common higher-order goals, to deconflict individual actions in shared environments, and to share information in distributed computing schemes. Communication and the mutual exchange of information (state and control) are key to facilitating such interactions.

Early work in the multi-robot domain drew from nature-inspired paradigms [11], and consequently focused on devising collective behaviors that depended purely on local interactions of robots in close proximity [12]. A variety of transmission media (e.g., infrared) are used for such near-field communication schemes [13, 14]. Other nature-inspired work built on implicit communication and self-organization through stigmergy, by which robots coordinate indirectly through traces left in the environment [15]. The benefits of such peer-to-peer decentralized communication paradigms are manifold, in particular due to their inherent robustness and scalability [16]. Centralized radio-based communication architectures have become increasingly popular in various instances, especially when the task requires performance guarantees; representative applications include product pickup and delivery [17], item retrieval in warehouses [3], and mobility-on-demand services [18]. Improvements in communication technologies, both hardware and software, have furthered more data-intensive applications, such as cloud robotics [19, 20].

Explicit communication methods generally assume that robots can broadcast information within a local neighborhood that comprises tens to hundreds of individuals, or that a fixed network infrastructure is available. Yet, in reality, densely populated workspaces adversely affect communication capabilities because of practical contention over channel bandwidth and airtime [21]. Such networks are additionally burdened by clutter that can induce signal fading, leading to a drastic decrease in the expected communication range. This problem is compounded by the real-time transmission requirements of highly dynamic robot networks. Indeed, topologies and capabilities demanded by robotic applications are practically hostile to radio performance (because these radio networks were not initially designed with robotics in mind). As a consequence, the vast majority of robot applications are designed to merely work around available network technologies, and optimize their performance within the given constraints.

Our review is motivated by a lack of studies that provide a high-level overview of the interplay between communication networks and their role in robotic applications. Figure 1 depicts a typical architecture of a multi-robot control scheme, in which each robotic system is designed strongly around its perception and control strategies. As noted above, robot control algorithms generally do not actively employ the output of the communications network as part of the control loop, and as a result often overlook factors such as network contention. The result is a wide array of optimizations that work in favor of the network, but often not for autonomy, or vice versa. Hence, we argue that a better co-optimization scheme (illustrated on the left in Fig. 1) would consider all aspects of the architecture simultaneously. The illustration shows this scheme linked via an “oracle”, a source of error estimates derived from incomplete information, that facilitates this co-optimization; i.e., given a hypothetical oracle, we posit that one can co-design the various algorithm layers on the right.

In this survey, we capture a variety of network architectures and technologies, and a variety of multi-robot applications that employ them. A careful choice of communications architecture, medium and algorithm is key to ensuring that a given robot task can be completed. Therefore, we will also explore some of the newer approaches that consider bypassing such hand-crafted selections, and attempt to model inter-robot communications in a data-driven fashion.

Factors Influencing Robot Network Design

Choosing an adequate communications architecture, medium, and algorithm is key to ensuring desired robot performance. In the following, we distill the factors that influence the robot network design choices. We elaborate upon them in the following four categories: (i) application, (ii) robot, (iii) algorithm, and (iv) environment, and give an illustrative example for each.

The Application: The application defines what the shared information is for, and how the robots need to interact to solve the problem at hand. Examples: Real-world applications such as environmental monitoring and agriculture require groups of robots to act over large distances (often operating with robots separated by roughly 1000 body lengths). Such sparsely distributed robot systems, hence, necessitate networking capabilities that can span larger spaces [22•]. Other applications, such as cooperative driving [5], formation control [23], and flocking [24] require uninterrupted, situated, close-range communication for tight inter-robot coordination and control.

The Robot: The robot (and the physical hardware) define local constraints on the frequency and format of information to be transmitted and received. Examples: A quadrotor that uses state information for local stability control requires an update frequency in the order of several hundred Hertz; while on-board IMUs can provide the necessary information for body stabilization, extrinsic pose estimates are still required for tasks, and must be received at relatively high rates (e.g., 100 Hz) [23, 25]. Lack of reliable updates naturally poses a significant risk to tasks that require tight coordination, such as outdoor flocking and formation control; while sparse outdoor flight has been demonstrated in a team of 30 drones [26], there is a dearth of results on dense and agile outdoor flight. Moreover, in GPS-denied environments, robots resort to on-board sensing, and consequently, require dependable inter-robot communication to achieve group behavior.

The Algorithm: The algorithm connects the application to the robot, and essentially sets conditions on the nature of information that needs to be received (e.g., global or local), and when (e.g., asynchronously or synchronously, and how often). Examples: In allocation problems, the optimization objective is often global, and to achieve optimality, we deploy centralized algorithms that collect all robot-to-task assignment costs (e.g., expected travel times) to determine the optimal assignment (e.g., by running the Hungarian algorithm) [27, 28]. Similarly, multi-robot path planning has an optimal solution (for both makespan and flowtime objectives), but only when the computational unit has access to full system information [29]. In the absence of full observability, robots need to resort to locally available knowledge. In decoupled prioritized path planning, robots communicate to mutually deconflict their path plans in time-space [30,31,32]. Each time a robot’s plan changes, its robot neighborhood changes, or a new conflict arises, the deconfliction process restarts.
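To make the centralized allocation case concrete, the following is a minimal sketch of what the central unit computes once it has collected every robot-to-task cost. It is not the Hungarian algorithm itself: for brevity it enumerates all assignments, which yields the same optimum but only scales to small teams. The function name and the travel-time values are hypothetical.

```python
from itertools import permutations

def optimal_assignment(cost):
    """Return (total_cost, assignment) minimizing sum of cost[r][assignment[r]].

    cost[r][t] is the cost (e.g., expected travel time) of robot r doing task t.
    Brute force over all permutations: O(n!), so only viable for small teams.
    """
    n = len(cost)
    best_total, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(cost[r][perm[r]] for r in range(n))
        if total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)

# Hypothetical travel times (seconds) for 3 robots and 3 tasks.
travel_times = [
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
]
total, assignment = optimal_assignment(travel_times)
```

The point of the sketch is the communication pattern rather than the solver: every row of `travel_times` has to reach one computational unit before any robot can act. In practice a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment` would replace the enumeration.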

The Environment: The environment defines under what conditions shared information is delivered. Examples: Are the robots operating indoors or outdoors, or both [33]? Does the workspace afford a fixed (and possibly centralized) communications infrastructure, or must we instead rely on ad hoc networking? Is the environment cluttered with obstacles that interfere with wireless signals? What medium can we use, e.g., are the robots operating in air, under water, or in space? What legal jurisdictions regulate the communication infrastructure? And finally, is the communication channel safe, or can it be spoofed [34], or robots attacked [35, 36]?

Fig. 2

An abridged timeline showing some key wireless communication mechanisms for robots. The shaded area represents the magnitude of theoretical capabilities (bandwidth or inverse latency), which have been increasing super-linearly since 2010. Boxes show communications standards or technologies, and their specific properties that are useful in the multi-robot control space

Communication Schemes

In this section we discuss multi-robot communication from the perspective of the underlying communications technologies, focusing upon the challenges, limitations and optimizations that are relevant in multi-robot system networks. Figure 2 shows a timeline with key wireless communication mechanisms, and some representative multi-robot applications that they enabled for the first time.


Synchronicity. Specifying robotic data flows is often the first consideration in discussing the challenges of a wireless data protocol. For example, it is often implicitly assumed that multi-robot control algorithms are executed synchronously by every participant [47]. This introduces a hard timing constraint on the maximum allowable delay in message delivery between those participants. This is, however, a feature that commonly deployed communications protocols are not designed to meet, with “best-effort” message delivery being the standard paradigm [48].

Dynamic Topologies. Hard timing constraints are often exacerbated by highly connected communications topologies that are dynamic, where a robot must communicate its status with many different (or sometimes every) participating robot(s). This can lead to a high degree of contention for radio resources since many messages may need to be sent at every control loop. While there are schemes that aim to minimize redundant data transmissions (see Section Communication‑Aware Algorithms), it remains true that, as multi-robot networks increase in scale, communications technologies must be selected and designed specifically to manage the dynamics of the application [49], something that is generally overlooked in robotic networks today.

Message Frequency. Bandwidth is often employed as a metric to specify the demands on a communications link [50]. However, this is often an insufficient characterization by itself, since the underlying technology may have significant overheads per message, and robot teams often depend on low-latency messaging as well. This is particularly true for ad hoc networks where there is no central entity enforcing message scheduling, to the extent that many communications protocols will not approach their rated bandwidths in highly connected ad hoc topologies, where overheads such as contention dominate radio resource consumption [21].
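The effect of per-message overhead can be illustrated with a simple airtime model. The sketch below assumes that every message pays a fixed overhead (preamble, inter-frame spacing, acknowledgement) on top of its payload time; the link rate, overhead, and payload sizes are assumed values chosen for illustration, not measurements of any particular protocol.

```python
def goodput_bps(payload_bytes, rated_bps, per_message_overhead_s):
    """Useful bits per second when every message pays a fixed airtime overhead."""
    payload_bits = 8 * payload_bytes
    airtime_s = per_message_overhead_s + payload_bits / rated_bps
    return payload_bits / airtime_s

RATED_BPS = 54e6     # assumed nominal link rate
OVERHEAD_S = 100e-6  # assumed fixed airtime cost per message

small = goodput_bps(64, RATED_BPS, OVERHEAD_S)    # short state update
large = goodput_bps(1500, RATED_BPS, OVERHEAD_S)  # bulk-transfer frame

print(f"64 B messages:   {small / RATED_BPS:.0%} of rated bandwidth")
print(f"1500 B messages: {large / RATED_BPS:.0%} of rated bandwidth")
```

Under these assumptions, short state updates reach less than a tenth of the rated bandwidth while large frames reach well over half, which is why rated bandwidth alone says little about a link that carries many small, frequent control messages. Contention in ad hoc topologies adds further losses that this model ignores.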

Connectivity. Since a greater connectivity range implies reachability and information exchange with more robots, it has an obvious impact on the overall messaging rate any specific robot must handle. The spatial density of robots must be considered while discussing range, and two key factors of interest emerge as a consequence. Firstly, as ranges increase, radio-based links are more prone to fading and interference [51] even when transmission power is commensurately increased. Secondly, robot control algorithms often assume a fixed range [52], which may result in greater inter-connectivity, and sometimes increased messaging rates in dense scenarios.

Dynamic Routing. The case where the required range of communication exceeds the underlying capabilities of the radio hardware onboard must also be considered, as this implies a mesh-type network where a message must traverse multiple robots (network nodes). This first requires planning the robots’ paths (discussed in Section Communication‑Aware Planning), before accounting for the computational and protocol overheads of robots processing messages other than their own. Then, the problem is that of message routing decisions and dynamic topologies. The routing decision problem is generally central to ad hoc mesh networks; this is only made more challenging by the potential for rapid shifts in communications topology, especially in highly mobile, or large-scale robotic scenarios [53]. Hard timing guarantees, in the range required by robotic control, are not currently available at non-trivial scales (especially with multi-hop routing over dynamic topologies), though some attempts have been made in this direction [54].
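As a minimal sketch of the routing decision under a changing topology, the code below builds a disk-graph connectivity model (two robots are linked when they are within radio range) and finds a minimum-hop route by breadth-first search. The disk model, the positions, and the range are simplifying assumptions; real links also depend on fading and interference.

```python
from collections import deque
from math import dist

def disk_graph(positions, radio_range):
    """Link two robots when their Euclidean distance is within radio_range."""
    n = len(positions)
    return {
        i: [j for j in range(n)
            if j != i and dist(positions[i], positions[j]) <= radio_range]
        for i in range(n)
    }

def fewest_hops_route(links, source, target):
    """Breadth-first search; returns a minimum-hop node list, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            route = []
            while node is not None:
                route.append(node)
                node = parent[node]
            return route[::-1]
        for nxt in links[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical positions (meters): a chain of four robots, 10 m radio range.
before = [(0, 0), (8, 0), (16, 0), (24, 0)]
route_before = fewest_hops_route(disk_graph(before, 10), 0, 3)

# Robot 2 moves away: the only relay path to robot 3 disappears.
after = [(0, 0), (8, 0), (16, 30), (24, 0)]
route_after = fewest_hops_route(disk_graph(after, 10), 0, 3)
```

Here the route from robot 0 to robot 3 needs every intermediate robot as a relay, and a single robot's motion removes it entirely. The search also assumes that the full link table is known at one place, which is exactly the converged topology information that is hard to maintain in a highly mobile mesh.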

Operational Environment. Robotic networks will invariably be required to operate in environments with external noise and interference, which cause unpredictable impacts on link quality. This informs the selection of communications protocols, since some protocols operate in a licensed spectrum with reduced external interference, or are otherwise less prone to external noise due to atmospheric attenuation (e.g., at 60 GHz). Doppler shift requires similar consideration, because many communications technologies fail at high relative velocities. Generally speaking, protocols that depend upon fine-grained frequency division multiplexing are more prone to Doppler-related errors [55], and such schemes are often used in high-bandwidth techniques.
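The sensitivity of fine-grained frequency division to Doppler shift can be illustrated with the standard non-relativistic approximation f_d = v f_c / c. The sketch below compares the shift to the subcarrier spacing; the closing speed, the carrier frequency, and the two spacings are illustrative assumptions, and the ratio is only a rough indicator of inter-carrier interference.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def doppler_shift_hz(relative_speed_mps, carrier_hz):
    """Maximum Doppler shift for a given closing speed (non-relativistic)."""
    return relative_speed_mps * carrier_hz / SPEED_OF_LIGHT

def normalized_doppler(relative_speed_mps, carrier_hz, subcarrier_spacing_hz):
    """Doppler shift as a fraction of the subcarrier spacing."""
    return doppler_shift_hz(relative_speed_mps, carrier_hz) / subcarrier_spacing_hz

CLOSING_SPEED = 40.0  # m/s, assumed: two fast aerial robots approaching
CARRIER = 5.9e9       # Hz, assumed carrier frequency

shift = doppler_shift_hz(CLOSING_SPEED, CARRIER)
wide = normalized_doppler(CLOSING_SPEED, CARRIER, 312.5e3)  # coarse subcarriers
narrow = normalized_doppler(CLOSING_SPEED, CARRIER, 15e3)   # fine subcarriers
```

With these numbers the shift is a little under 800 Hz. That is a negligible fraction of a coarse 312.5 kHz spacing but several percent of a fine 15 kHz spacing, so the same relative motion consumes far more of the frequency margin in the finer scheme.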

Communications Scheme Selection

Despite considerable research interest, there are no current wireless data standards explicitly designed for exchanging information between autonomous robots [56]. Currently deployed robot-to-robot networks (such as  [26]) depend upon more generic wireless data networking standards which are not typically optimized for the challenges discussed above. In the absence of a specific standard, we will discuss the strengths and weaknesses of existing technologies for the multi-robot control application.

Ad hoc networks map well to the communications patterns required of decentralized robotic control, and the most relevant for this survey are Mobile Ad Hoc Networks (MANETs), which deal with the problem of facilitating communication between mobile nodes without coordination from infrastructure [57]. More specific forms of interest are Vehicular Ad Hoc Networks (VANETs) [58] and Flying Ad Hoc Networks (FANETs) [59], where the former generally deal with automotive use cases and the latter with drones and UAVs; both are more exposed to the dynamic conditions expected of robotic ad hoc networks. Local area networking technologies are well suited for ad hoc networking.

In contrast with ad hoc operation, infrastructure networks map more closely to a centralized robotic control, where communications patterns are more similar to traditional bandwidth-focused networking applications. Despite this, hard latency requirements and rapid robot movement require specific mechanisms at the protocol level, which are not common for either cellular or local area standards.

Local Area Networks

The IEEE 802.11 protocol suite, commonly known as “Wi-Fi,” is frequently used due to abundant hardware availability, IP networking interoperability, high data rates and license-free operation. It is also capable of both infrastructure and ad hoc operations, which simplifies deployment from laboratory environments into the real world. The failures of 802.11 become apparent during such deployment processes [60, 61], because larger ranges, robot counts and velocity-induced Doppler shift cause lower message delivery rates than are seen in static 802.11 deployments. 802.11p, and its successor 802.11bd, have both introduced specific modifications to the physical layer [43] to more robustly handle both range and Doppler-induced problems for VANET use cases, which potentially transfer to robotic control as well.

IEEE 802.15.4 has been commonly used as the basis for a number of different higher-layer protocols, including ZigBee. It has been deployed in the context of wireless sensor networks and multi-robot systems [42] due to hardware availability, license-free operation, low power usage, and flexible communications models that permit both IP-based and more simplified serial-like messaging. The major drawbacks are relatively low range and data rates. LoRaWAN [44] is an attractive alternative that maintains the positive aspects of 802.15.4 for the robotic use cases, but with a focus upon long-range transmission (up to 16 km) and a physical layer that is resilient to Doppler errors; however, its communications model is infrastructure based.

Both 802.11’s and 802.15.4’s underlying dependence upon the CSMA/CA collision avoidance scheme allows for the minimization of contention-related losses without an authoritative central scheduler [62]; however, they have unbounded maximum latency on message delivery, and reduced message delivery rates with higher numbers of robots on the network. LoRaWAN uses a pure ALOHA protocol mechanism [63], and therefore scales even more poorly than the IEEE schemes. These characteristics make these protocols unsuitable for deployment on robots without modifications; fortunately, there are techniques proposed in the literature to help make these technologies more scalable [64, 65].
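The poor scaling of pure ALOHA follows from its classical throughput model, S = G e^{-2G}, where G is the offered load in frames per frame-time. The sketch below evaluates this textbook model under its usual assumptions (Poisson arrivals, equal-length frames, a single shared channel); the frame airtime and message rate are assumed values, and a real LoRaWAN deployment with multiple channels and spreading factors will behave differently.

```python
from math import exp

def pure_aloha_success_probability(load):
    """Probability that a frame overlaps no other frame (Poisson arrivals)."""
    return exp(-2.0 * load)

def pure_aloha_throughput(load):
    """Successful frames per frame-time: S = G * exp(-2G)."""
    return load * pure_aloha_success_probability(load)

def offered_load(num_robots, messages_per_s, frame_airtime_s):
    """G: average number of frames started per frame-time across the team."""
    return num_robots * messages_per_s * frame_airtime_s

AIRTIME_S = 0.05  # assumed airtime of one frame
RATE_HZ = 1.0     # assumed per-robot message rate

for n in (10, 40):
    g = offered_load(n, RATE_HZ, AIRTIME_S)
    print(n, "robots:", f"{pure_aloha_success_probability(g):.1%} of frames delivered")
```

In this model throughput peaks at G = 0.5, where only about 18% of channel time carries successful frames. Under the assumed airtime and rate, going from 10 to 40 robots drops the per-frame delivery probability from roughly 37% to roughly 2%.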

All of the protocols mentioned within this section share similar routing problems when it comes to highly dynamic topologies, in that they depend upon a network-layer routing mechanism to direct traffic without reliably converged information about the current disposition of other nodes. This issue is well covered by [66•], which categorizes and surveys many of the different approaches towards this routing problem. [67] includes the routing issue amongst a general survey of the issues in UAV networking.

Cellular Networks

Cellular networks avoid many of the problems encountered by local networking standards by making use of a nearly universal infrastructure-based communications model, licensed radio spectrum access, as well as economies of scale, all of which make extremely complex base station hardware and protocols commonplace. The centralized message scheduler and sophisticated radio resource management [68, 69] are significantly more scalable than typical ad hoc networks. These characteristics appear to be a good fit for robot control; however, the financial cost of network access, coupled with limited flexibility in logical network configuration, has limited robotic deployments outside controlled environments.

For decentralized robot control, peer-to-peer traffic is routed through the infrastructure, inducing a minimum latency overhead [70] that could exceed timing constraints. 4G in particular has an access latency on the order of 50 ms [60]. Additionally, cellular standards are naturally dependent upon the presence of infrastructure, which cannot always be assumed. Furthermore, the radiation pattern of cellular networks is typically set up assuming ground-based users, and so aerial robots could experience degraded performance due to leaving the vertical coverage of cell antennas [71].
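As a worked check of the timing argument, the sketch below counts how many control periods elapse while a message is in flight. It combines the 50 ms access latency quoted above with the 100 Hz pose-update rate given earlier as an example for quadrotors; the 4 ms comparison value and the function name are hypothetical.

```python
def control_periods_elapsed(latency_us, update_rate_hz):
    """Control periods spanned by the latency, rounded up (integer arithmetic)."""
    return -(-latency_us * update_rate_hz // 1_000_000)

# 50 ms access latency against a 100 Hz (10 ms) control loop.
stale_4g = control_periods_elapsed(50_000, 100)

# Hypothetical 4 ms direct link, for comparison.
stale_direct = control_periods_elapsed(4_000, 100)
```

At 100 Hz, a state update relayed with 50 ms latency is five control periods old on arrival, whereas a link with a few milliseconds of latency stays within a single period.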

4G LTE supports direct Device-to-Device (D2D) modes that permit devices to communicate with each other in a local region by reserving some subset of radio resources in their local area from the network operator. LTE-V2V is a variant of this specifically for automotive use cases [72]. This avoids the overhead of using the infrastructure as a relay, but also has a cost in minimum association time and is dependent upon the network operator ceding resources on demand. Some proposals extend D2D cellular radio techniques into the unlicensed spectrum, or specifically, licensed sub-bands [73]; however, this also has not been widely used in real-world systems due to the recency of the specification and a dearth of capable hardware.

5G introduced the Ultra-Reliable Low Latency Communications (URLLC) service to address the issues with low-latency medium access [45] in a centralized or decentralized manner, with a variety of proposed physical layer [74] and MAC layer [75] techniques. Despite the promise of these approaches, and although 3GPP Release 15 (including URLLC) was released in late 2017, rollout of these technologies into real-world networks has been limited, and considerable research is ongoing as to the best implementation methods for 5G’s technical goals [76].

Hybrid Schemes

Due to advancements in radio hardware, the most recent 802.11 revision, 802.11ax, has specified a physical radio resource allocation scheme that is far more similar to cellular standards, with multiple fine-grained frequency divisions being available within a single logical channel. Many proposals have been made to have future 802.11 and cellular standards inter-operate directly, to leverage the best capabilities of both [77]. For robot networks, this may prove to be highly valuable, permitting human control overrides over cellular infrastructure and low-latency robot-to-robot communications with a unified logical network addressing system for easier lab-to-world deployment, and efficient operation in control schemes that have evolving requirements throughout a single deployment. Though exciting, these proposals are still in their nascent phase and many issues remain to be addressed.

Though the state-of-the-art use of OFDMA in both 5G and 802.11ax significantly alleviates the contention problem due to the larger number of transmission slots made available, extremely dense ad hoc robot networks may still run into the limit of CSMA/CA. Non-orthogonal multiple access (NOMA) is a very promising technology that could further extend the effective simultaneous radio resources available [78], thereby reducing the contention problem. In cellular systems, the network operator still has authoritative control over their radio resources, and so grant-based schemes induce significant overhead despite the reduced contention, though grant-free access has attracted significant attention [79]. Even in grant-free schemes, NOMA still requires coordination to ensure that node configurations do not overlap, and hence, message loss due to contention remains an open problem in decentralized networks without a coordinating infrastructure.

Communication-Aware Algorithms

Regardless of which underlying communication scheme or protocol is employed, unlimited and unconstrained communication cannot be assumed for any interactive scenario. A significant amount of literature in multi-robot applications, however, has generally focused on designing control schemes that do not explicitly model this dependency. This is reflected in the vast majority of literature in robot flocking [80,81,82,83]. We argue that the problem becomes more pronounced in cases where the robots need to deconflict and replan their motions in tight and constrained spaces [81, 82]. While some consideration for communication asynchronicity is made in some of the more recent works [82], the challenge is generally far from being solved.

One straightforward approach to handle this is to simply reduce the amount of data (frequency, packet size, etc.) that needs to be communicated between agents. In an exploration problem, this is often done through various novelty metrics that determine whether a new datapoint needs to be communicated [84]. Trawny et al. [85] have proposed localization estimators that perform well by quantizing the transmitted information to very small packets, thereby tackling severe link constraints.
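Both data-reduction ideas can be sketched in a few lines. The first function is a simple distance-based novelty test that suppresses a transmission unless the new datapoint differs enough from the last one sent; the second pair performs uniform scalar quantization to a small number of bits. These are generic illustrations under assumed thresholds and ranges, not the specific novelty metrics of [84] or the estimators of [85].

```python
from math import dist

def should_transmit(new_point, last_sent, novelty_threshold):
    """Send only if nothing was sent yet or the point moved beyond the threshold."""
    return last_sent is None or dist(new_point, last_sent) > novelty_threshold

def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer code of the given bit width."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)

def dequantize(code, lo, hi, bits):
    """Recover the approximate value represented by a code."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)

# Hypothetical 2D track (meters); transmit only when moved more than 1 m.
track = [(0, 0), (0.1, 0), (0.2, 0), (1.5, 0), (1.6, 0), (3.0, 0)]
sent, last = [], None
for point in track:
    if should_transmit(point, last, novelty_threshold=1.0):
        sent.append(point)
        last = point

# A 4-bit code for a coordinate known to lie in [0, 10] m.
code = quantize(2.5, 0.0, 10.0, bits=4)
approx = dequantize(code, 0.0, 10.0, bits=4)
```

In this example half of the six samples are suppressed, and the 4-bit code reproduces the coordinate to within half a quantization step. The trade-off in both cases is that the receiver works with staler or coarser information, which the estimator or controller has to tolerate.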

On the other hand, there is also a sustained research interest in modeling the communication channels between the agents, and factoring that as a constraint into the motion planning problem. This is done primarily to ensure robustness of a control scheme against imperfect and noisy communications. Alternatively, planning schemes have also considered communication as a sub-task (almost as if “scheduling” communications at intervals). Finally, there are several approaches that consider a joint optimization scheme, where path planning and communication planning are carried out in tandem. We divide this body of work into these three broad styles.

Communication-Aware Planning

As mentioned earlier, planning robot motions or trajectories that consider some model of the underlying communication links is an active area of research. Muralidharan and Mostofi provide a comprehensive overview of such methods [86]. For instance, several authors have considered the task of coverage and formation control by a team of robots. Evidently, these domains require explicit factoring of communication constraints into the planning problem [84, 87]. One way this is approached is by analyzing the stability of a formation under various communication link latencies [88]. This can then be integrated into the control problem for a more reliable system [88, 89] that is “aware” of such latencies. Formation control laws are also explored that allow agents to maintain some degree of coordination while respecting limited communication ranges of their neighbors [90, 91]. This approach is also sometimes utilized in the context of cooperative target localization [92], or under constraints of 3G/4G mobile networks [93].

Path planning has also been developed such that connectivity with a subset of base-stations [94], or with some agents [95, 96] is maintained. Similar methods that plan for multiple robots (with connectivity constraints) involve using ACO-based (ant colony) planning [97] or genetic algorithms [98].

Plan-Aware Communications

In heavily constrained spaces, it is often desirable to design a network architecture that considers the planned path, and seeks opportunities to communicate therein. Underwater robots, for instance, have very limited communication capacities [99, 100], and this is an active field of interest; Zolich et al. [101] provide a comprehensive survey on the various challenges and solutions. Hollinger et al. have considered scheduling algorithms for underwater robotic sensor networks and show how path planning algorithms depend on these [102, 103]. Since bandwidth and interference constraints are much more severe in these environments, such scheduling algorithms often model the value of communicating at a particular timestep [103, 104]. This also plays a role in determining whether communicating has a positive impact on the state of the robot system [105, 106], and is also studied as an online decision problem [107], and an optimization problem that considers when/what to communicate [108]. Recent developments in subterranean robots operating under severe communication constraints have also explored the strategy of developing/maintaining communication “backbones” for explorer robots to continue on frontiers [109, 110], and of explicitly splitting communication pathways by plan and priority [111].

Joint Planning

Several of the works listed in the previous subsections may also be seen as jointly optimizing for communication quality as well as path quality. However, there are other approaches that attempt to explicitly model this optimization problem. For instance, Kantaros and Zavlanos [112] propose a scheme that alternates between the two optimization problems sequentially. The nature of this scheme often makes it difficult to prove hard guarantees regarding optimality; however, a more hybrid approach in which the two controllers interact can offer more guarantees on network integrity and available data rates [113]. A joint optimization scheme, on the other hand, can formulate this problem well; for instance, using an LQ (linear-quadratic) form can additionally offer robustness guarantees [114]. Yet another means of joint optimization is to consider the system as a cyber-physical system (CPS), where the “cyber” controller handles the communications domain, and the “physical” controller handles the kinematics of the robot [115]. Such models allow designers to factor in various other elements of a CPS, such as dynamically adapting one of the subsystems (communication capacity) while still maintaining the coupling with the other [116, 117].

Leveraging Machine Learning for Communication

Designing bespoke, hand-crafted communication protocols and behaviors is tedious and difficult. Firstly, numerous works point to the hardness of synthesizing decentralized policies (that have to operate in a partially observable regime), even when a centralized template is known [118, 119], and they leave the question of how (what, when, and to whom) to communicate unanswered. Secondly, the vast majority of existing robot communication strategies are based on idealistic operational assumptions, and besides a few specialized approaches to dealing with message loss, delay, or corruption, e.g., [34, 35, 120], it is not at all clear how to approach such problems in a manner that is transferable across applications. Leveraging machine learning methods is a promising new avenue to tackle some of these challenges.

Learning Communication Mechanisms

Message routing decisions in robotic mesh networks are complicated by highly dynamic topologies. While many routing mechanisms exist in ad hoc networks, these generally depend upon relatively slowly changing network conditions to function effectively. Many manually specified heuristic methods exist [66•]; however, these may lead to sub-optimal decisions as they may be constructed upon incorrect assumptions about the target network environment. Learning methods provide an attractive alternative, and have been explored in some depth in routing generally [121]. An interesting example in the context of FANETs can be found in Zheng et al. [122], who propose RLSRP, which applies an online reinforcement learning method to the routing decision problem and shows improved performance across several metrics, including delivery latency.
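To give a flavor of learned routing, the sketch below uses a generic Q-routing-style update, in which each node keeps an estimate of the delivery delay to a destination via each neighbor and refines it from the neighbor's own best estimate. This is not RLSRP from [122]; the topology, the link delays, the learning rate, and the deterministic training sweeps (standing in for real packet traffic) are all assumptions made for illustration.

```python
def q_routing_update(q, delay, node, neighbor, dest, alpha=0.5):
    """Node forwarded a packet bound for dest via neighbor; refine its estimate."""
    remaining = 0.0 if neighbor == dest else min(q[neighbor][dest].values())
    target = delay[(node, neighbor)] + remaining
    q[node][dest][neighbor] += alpha * (target - q[node][dest][neighbor])

def next_hop(q, node, dest):
    """Greedy routing decision: neighbor with the lowest estimated delay."""
    estimates = q[node][dest]
    return min(estimates, key=estimates.get)

def train(q, delay, neighbors, dest, sweeps=100):
    """Deterministic sweeps over every link stand in for real packet traffic."""
    for _ in range(sweeps):
        for node in q:
            for neighbor in neighbors[node]:
                q_routing_update(q, delay, node, neighbor, dest)

# Hypothetical 4-robot mesh; robot 3 is the destination.
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
one_way = {(0, 1): 1.0, (1, 3): 5.0, (0, 2): 2.0, (2, 3): 1.0}
delay = {**one_way, **{(b, a): d for (a, b), d in one_way.items()}}
DEST = 3
q = {n: {DEST: {m: 0.0 for m in neighbors[n]}} for n in neighbors if n != DEST}

train(q, delay, neighbors, DEST)
hop_before = next_hop(q, 0, DEST)

# The 2-3 link degrades (e.g., the robots move apart); the estimates adapt.
delay[(2, 3)] = delay[(3, 2)] = 10.0
train(q, delay, neighbors, DEST)
hop_after = next_hop(q, 0, DEST)
```

Robot 0 initially forwards via robot 2 and switches to robot 1 once the degraded link is reflected in the estimates. Each update uses only information available from a direct neighbor, which is what makes this family of methods attractive when no node has a converged view of the topology.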

Channel modeling and resource allocation are also key networking problems that are challenging to solve with first-principles methods, and that can be improved with learning [123]. Unsupervised learning has been applied to channel modeling, which allows for the optimization of transmission power by accurately estimating the quality of links to other network participants [124].

Learning Communication Behaviors

Learning-based methods have proven effective at designing robot control policies for an increasing number of tasks [125, 126]. Recent work utilizes a data-driven approach to solve multi-robot problems, for example for multi-robot motion planning in the continuous domain [127] or path finding in the discrete domain [128].

Yet, research on learning how to synthesize robot-to-robot communication policies is nascent. From the point of view of an individual robot, its local decision-making system is incomplete, since other agents’ unobservable states affect future values. While the manner in which information is shared is crucial to the system’s performance, the problem is not well addressed by hand-crafted (bespoke) approaches. Learning-based methods, instead, promise to find solutions that balance optimality and real-world efficiency, by bridging the gap between the qualities of full-information centralized approaches and partial-information decentralized approaches [129].

Key to the decentralization of centralized (optimal) policies is the property of permutation equivariance. Permutation equivariance ensures that at the robot network level, the set of actions automatically rearranges itself as the agents swap order. One of the earliest works that satisfy this property is [130]. This was concurrently developed by a line of work that builds on Graph Neural Networks (GNNs), which are permutation equivariant by design [131,132,133]. GNNs have since then shown promising results in learning explicit communication strategies that enable complex multi-agent coordination [134,135,136,137].
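The property can be checked numerically on a toy layer. In the sketch below, each robot's output depends on its own state and on the team mean with weights shared by all robots, so reordering the robots reorders the outputs in the same way. A second layer with index-dependent weights is included as a counterexample. Both layers and all weights are hypothetical and far simpler than the architectures cited above.

```python
def equivariant_layer(states, w_self=0.7, w_mean=0.3):
    """Shared weights on own state and team mean: permutation equivariant."""
    mean_state = sum(states) / len(states)
    return [w_self * s + w_mean * mean_state for s in states]

def index_dependent_layer(states):
    """Weight depends on position in the list: NOT permutation equivariant."""
    return [(i + 1) * s for i, s in enumerate(states)]

def permute(items, order):
    """Reorder items so that position k holds items[order[k]]."""
    return [items[i] for i in order]

states = [1.0, 4.0, -2.0]
order = [2, 0, 1]

# Equivariance: layer(permute(x)) equals permute(layer(x)).
lhs = equivariant_layer(permute(states, order))
rhs = permute(equivariant_layer(states), order)

bad_lhs = index_dependent_layer(permute(states, order))
bad_rhs = permute(index_dependent_layer(states), order)
```

For the shared-weight layer `lhs` and `rhs` agree, so the same parameters can run on every robot regardless of how the team is indexed. For the index-dependent layer they differ, which means the policy would silently change if robots swapped labels.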

When deploying GNNs in the context of multi-robot systems, individual robots are modeled as nodes, the communication links between them as edges, and the internal state of each robot as graph signals. By sending messages over the communication links, each robot in the graph indirectly receives access to the global state. A key attribute of GNNs is that they compress data as it flows through the communication graph. In effect, this compresses the global state, affording agents access to relevant encodings of global data. Since encodings are performed locally (with parameters that can be shared across the entire graph), the policies are intrinsically decentralized. In cases where the downstream task is tightly coupled with the communication requirements, it is beneficial to optimize the communication strategy jointly with perception and action policies. This was done in [138], for multi-robot flocking, and in [137], for multi-agent path planning. These frameworks implement a cascade of a convolutional neural network (CNN) and a GNN, which they jointly train so that image features and communication messages are learned in conjunction to better address the specific task. Recent work also shows how GNNs can be augmented by attention modules to produce message-aware communication strategies that allow robots to discern between important and less important message elements [139].
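The node, edge and graph-signal modeling described above can be sketched in a few lines of Python. This is a deliberately minimal illustration under stated assumptions: scalar graph signals, a fixed line topology, hand-picked weights `w_self` and `w_neigh`, and a ReLU nonlinearity, none of which come from the cited works. Real GNN policies use learned, vector-valued parameters.

```python
def message_passing_round(signals, neighbors, w_self, w_neigh):
    """One communication round: each robot combines its own signal with the
    sum of the signals received from its neighbors. The two weights are
    shared by all robots, so the update is local and decentralized."""
    updated = {}
    for robot, x in signals.items():
        received = sum(signals[j] for j in neighbors[robot])
        updated[robot] = max(0.0, w_self * x + w_neigh * received)  # ReLU
    return updated


# Hypothetical communication graph: four robots on a line, 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# Only robot 0 initially holds a nonzero observation.
signals = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}

history = [signals]
for _ in range(3):
    history.append(message_passing_round(history[-1], neighbors, 1.0, 0.5))

for k, snapshot in enumerate(history):
    print(k, snapshot)
```

In this toy setting, robot 3 is three hops from robot 0, so its signal stays at zero for the first two rounds and becomes nonzero only in the third. This is the sense in which repeated local exchanges give each robot indirect access to information from the rest of the graph.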

Approaches from within the multi-agent reinforcement learning (MARL) community tackle the learning of continuous communication protocols by formulating the problem as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) [140,141,142]. The work in [143] learns a targeted multi-agent communication strategy by exploiting a signature-based soft attention mechanism (whereby message relevance is learned). Similarly, the work in [144] has each robot learn to reason about other robots’ states and to more efficiently communicate trajectory information (i.e., when and to whom), and applies the solution to the problem of collision avoidance. While efficient cooperative communication strategies are desirable, the work in [145] shows how separate robot teams can learn to communicate with adversarial strategies that contribute to manipulative (non-cooperative) behaviors. Clearly, underlying training paradigms need to be carefully designed to avoid such outcomes.
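To illustrate what a soft attention mechanism over message signatures can look like, the sketch below implements generic scaled dot-product attention in plain Python. It is an assumption-laden illustration rather than the exact formulation of [143]: the names `attend`, `query`, `signatures` and `values` are hypothetical, and in a learned system all three vector sets would be produced by trained networks instead of being fixed by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, signatures, values):
    """The receiver scores each sender's signature against its own query,
    then forms the incoming message as the attention-weighted sum of the
    senders' value vectors."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in signatures])
    dim = len(values[0])
    message = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(dim)]
    return weights, message


# Hypothetical example: one receiver, three senders.
query = [1.0, 0.0]
signatures = [[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0]]
values = [[1.0], [2.0], [3.0]]

weights, message = attend(query, signatures, values)
print(weights, message)
```

The sender whose signature aligns best with the receiver's query receives the largest weight, so its content dominates the aggregated message. The weights remain differentiable, which is what lets message relevance be learned end to end.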

Challenges and Open Problems

Finally, we present avenues of research and engineering worth exploring to address the critiques raised so far. We group them into four broad open problems.

1. Co-design. An emergent theme throughout this survey is the lack of approaches that co-design the robot and its communication capabilities. A variant of this concept [146, 147] considers a basic parallel reconfiguration of the network and the robot’s controller, which can be beneficial when the robot moves across network stations. However, a true co-design scheme will jointly evolve all layers of the networking stack to favor the robotic task at hand. Designing a meta-system that can compute the limits of both robotic requirements and network capabilities, and dynamically throttle both, may be essential to the safe deployment of robots in the real world. Any robotic control algorithm that uses explicit communications is vulnerable to failure if the network unexpectedly under-delivers, and performs sub-optimally if the network over-delivers. Managing this resource allocation problem in a real-world multi-robot setting is a subject we will tackle in future work.

2. Data-driven optimization. Machine learning, and specifically, reinforcement learning, can drive the development of multi-robot communications into new and interesting paradigms. Existing approaches that already learn what/when to send (and whom to send to) [130, 139, 144] still often depend on hand-designed architectures and specific task groups. With sufficiently large datasets, novel machine learning architectures also have the potential to learn to optimize multiple aspects of multi-robot systems at once (e.g., perception, action and communication [138]).

3. Sim-to-real for robot networks. The problems in sim-to-real transfer of robot coordination strategies are generally exacerbated by the “reality gap” found in communications [129]. Practical communication links suffer from message dropouts, asynchronous and out-of-order reception, and decentralized mesh topologies that may not offer reliability guarantees. Since multi-robot policies are typically trained in a synchronous fashion, these factors are hard to capture and simulate [148]. Furthermore, very few studies have captured any of these network effects in a large-scale setting [21]. Consequently, we find that embedding the reality gap of robot networking into data-driven approaches to multi-robot planning is an open research domain.

4. New technologies/schemes. As discussed in Section Communications Scheme Selection, there is a need for wireless data standards that specifically target the communication requirements of connected robots. The IEEE 1920 working group, formed to propose a protocol intended for autonomous robotic networks, is a significant step in this direction [56]. Such a protocol is likely to be founded on 802.11bd, since it is already a significant leap forward [43] from the legacy 802.11p standard used for V2V communication today.

Additionally, future 5G updates and 6G cellular communications promise dramatic improvements that could bring cloud and edge computing to the forefront of many data-intensive multi-robot collaborations. Finally, we also note that geographic routing in FANETs may be an enabling technology for practically dealing with highly dynamic routing topologies. This will, however, require holistic developments in robot control algorithms that work in tandem with routing, to avoid creating an additional information distribution problem.
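The link impairments named under Open Problem 3 (message dropouts and asynchronous, out-of-order reception) can be injected into an otherwise synchronous training simulator with very little machinery. The Python sketch below is one possible way to do so and is not taken from any cited work: the class `LossyChannel`, its parameters `drop_prob` and `max_delay`, and the helper `simulate` are hypothetical names, and independent drops with uniformly random delays are simplifying assumptions that real channels do not obey.

```python
import heapq
import random


class LossyChannel:
    """Toy link model: each message is dropped independently with
    probability drop_prob, otherwise delayed by 0..max_delay steps."""

    def __init__(self, drop_prob, max_delay, seed=0):
        self.drop_prob = drop_prob
        self.max_delay = max_delay
        self.rng = random.Random(seed)
        self._in_flight = []   # min-heap of (arrival_step, seq, payload)
        self._seq = 0          # unique tie-breaker for heap ordering

    def send(self, step, payload):
        self._seq += 1
        if self.rng.random() < self.drop_prob:
            return  # message lost
        arrival = step + self.rng.randint(0, self.max_delay)
        heapq.heappush(self._in_flight, (arrival, self._seq, payload))

    def receive(self, step):
        """Return every message whose arrival time has passed."""
        delivered = []
        while self._in_flight and self._in_flight[0][0] <= step:
            _, _, payload = heapq.heappop(self._in_flight)
            delivered.append(payload)
        return delivered


def simulate(channel, n_messages):
    """Send payloads 0..n_messages-1, one per step, and log arrivals."""
    received = []
    for step in range(n_messages + channel.max_delay + 1):
        if step < n_messages:
            channel.send(step, payload=step)
        received.extend(channel.receive(step))
    return received


received = simulate(LossyChannel(drop_prob=0.2, max_delay=3, seed=0), 20)
print(received)
```

Because delays are drawn per message, a later message can overtake an earlier one, so the received sequence may have gaps and inversions. A policy trained against such a channel sees at least a crude version of the networking reality gap, rather than perfectly synchronized exchanges.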


In this manuscript, we have presented a survey of communication technologies and their role in enabling multi-robot applications. We have broadly covered the various technologies that have played key roles in networked robotics, and have also discussed how state-of-the-art robot applications typically deal with network constraints. Our approach has been mostly critical, and has thus identified several deficiencies in the way robotics and networks have evolved. Towards the end, we also covered machine learning approaches and their role in developing data-driven communication strategies. We concluded with a list of challenges and open problems that the community currently faces, and provided an outlook on how learning-based approaches can tackle several of them.