
When Players Affect Target Values: Modeling and Solving Dynamic Partially Observable Security Games

  • Xinrun Wang
  • Milind Tambe
  • Branislav Bošanský
  • Bo An
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11836)

Abstract

Most current security game models assume that the values of targets/areas are static or that any changes are scheduled and known to the defender. Unfortunately, such models are insufficient for many domains in which the players' actions modify the targets' values. Examples include wildlife scenarios, where the attacker can increase the value of targets by secretly building supporting facilities. To address such security game domains with player-affected values, we first propose DPOS3G, a novel partially observable stochastic Stackelberg game in which target values are determined by the players' actions; the defender can only partially observe these values, while the attacker can fully observe the targets' values and the defender's strategy. Second, we propose RITA (Reduced game Iterative Transfer Algorithm), which is based on the heuristic search value iteration algorithm for partially observable stochastic games (PG-HSVI) and introduces three key novelties: (a) building a reduced game with only key states (derived from partitioning the state space) to reduce the number of states and transitions considered when solving the game; (b) incrementally adding the defender's actions to further reduce the number of transitions; (c) providing novel heuristics for the algorithm's lower-bound initialization. Third, extensive experimental evaluations show that RITA significantly outperforms the baseline PG-HSVI algorithm in scalability while allowing a trade-off between scalability and solution quality.
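
As a rough illustration of the three algorithmic ideas listed above, the Python sketch below shows one possible shape of such an outer loop: keep only one key state per state-space partition, seed the lower bound with a heuristic, and alternate a PG-HSVI-style solve of the reduced game with the addition of one more defender action. All names (ReducedGame, rita_outer_loop, the patrol actions, the partitioning rule) are hypothetical placeholders based solely on this high-level description, not the authors' implementation.

"""Schematic outer loop inspired by the abstract's description of RITA.

Hypothetical sketch only: the partitioning rule, bound updates, and all
names below are illustrative placeholders, not the authors' algorithm.
"""
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class ReducedGame:
    key_states: Set[str]                 # one representative state per partition
    defender_actions: Set[str]           # defender actions included so far
    lower_bound: Dict[str, float] = field(default_factory=dict)
    upper_bound: Dict[str, float] = field(default_factory=dict)


def build_reduced_game(states: List[str],
                       partition: Callable[[str], str],
                       initial_actions: Set[str],
                       heuristic_lb: Callable[[str], float]) -> ReducedGame:
    """(a) Keep only key states; (c) initialize the lower bound heuristically."""
    key_states = {partition(s) for s in states}
    game = ReducedGame(key_states=key_states,
                       defender_actions=set(initial_actions))
    for s in key_states:
        game.lower_bound[s] = heuristic_lb(s)   # heuristic lower-bound seed
        game.upper_bound[s] = 0.0               # trivial upper bound for this toy
    return game


def rita_outer_loop(states, partition, all_defender_actions, initial_actions,
                    heuristic_lb, hsvi_solve, pick_new_action,
                    epsilon=1e-2, max_iters=50):
    """Alternate a PG-HSVI-style solve of the reduced game with (b) the
    incremental addition of defender actions until the bound gap closes."""
    game = build_reduced_game(states, partition, initial_actions, heuristic_lb)
    for _ in range(max_iters):
        hsvi_solve(game)                         # point-based solve of the reduced game
        gap = max(game.upper_bound[s] - game.lower_bound[s]
                  for s in game.key_states)
        if gap <= epsilon:
            break
        remaining = all_defender_actions - game.defender_actions
        if remaining:                            # (b) grow the defender's action set
            game.defender_actions.add(pick_new_action(game, remaining))
    return game


if __name__ == "__main__":
    # Toy instantiation with placeholder callables, just to show the interface.
    result = rita_outer_loop(
        states=[f"s{i}" for i in range(6)],
        partition=lambda s: "low" if int(s[1:]) < 3 else "high",
        all_defender_actions={"patrol_A", "patrol_B", "patrol_C"},
        initial_actions={"patrol_A"},
        heuristic_lb=lambda s: -10.0,
        hsvi_solve=lambda g: g.lower_bound.update(
            {s: min(g.lower_bound[s] + 1.0, g.upper_bound[s])
             for s in g.key_states}),            # fake bound tightening
        pick_new_action=lambda g, rem: sorted(rem)[0],
    )
    print(result.defender_actions, result.lower_bound)

In this sketch the stopping rule is the bound gap at the key states; how values are actually transferred between the reduced game and the original game is part of the full paper and is not reproduced here.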

Notes

Acknowledgements

This work was supported by Microsoft AI for Earth, NSF grant CCF-1522054, the Czech Science Foundation (no. 19-24384Y), National Research Foundation of Singapore (no. NCR2016NCR-NCR001-0002) and NAP.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Xinrun Wang (1)
  • Milind Tambe (2)
  • Branislav Bošanský (3)
  • Bo An (1)

  1. Nanyang Technological University, Singapore, Singapore
  2. Harvard University, Cambridge, USA
  3. Department of Computer Science, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
