Cooperative Mission Planning for Multi-UAV Teams

  • Sameera S. Ponda
  • Luke B. Johnson
  • Alborz Geramifard
  • Jonathan P. How
Reference work entry

Abstract

The use of robotic agents, such as unmanned aerial vehicles (UAVs) or unmanned ground vehicles (UGVs), has motivated the development of numerous autonomous cooperative task allocation and planning methods for heterogeneous networked teams. Typically, agents within the team have different roles and responsibilities, and ensuring proper coordination between them is critical for efficient mission execution. However, as the number of agents, system components, and mission tasks increases, planning for such teams becomes increasingly complex, motivating the development of algorithms that can operate in real-time dynamic environments.

Given the complexity of the cooperative missions considered, numerous solution approaches have been developed in recent years. This chapter provides an overview of three of the most common planning frameworks: integer programming, Markov decision processes, and game theory. The chapter also considers the architectural decisions that must be addressed when implementing online planning systems for multi-agent teams, providing insights on when centralized, distributed, or decentralized architectures might be good choices for a given application, and on how to organize communication and computation to achieve the desired mission performance. Algorithms that can be used within each architecture are identified and discussed, and future directions for research are suggested.

Keywords

  • Markov Decision Process
  • Situational Awareness
  • Task Allocation
  • Reward Function
  • Negotiation Strategy

Notes

Acknowledgments

This work was supported in part by the AFOSR and USAF under grant FA9550-08-1-0086 and MURI FA9550-08-1-0356. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Office of Scientific Research or the U.S. government.

Copyright information

© Springer Science+Business Media Dordrecht 2015

Authors and Affiliations

  1. Department of Aeronautics and Astronautics, Aerospace Controls Laboratory, Massachusetts Institute of Technology, Cambridge, USA
