KI - Künstliche Intelligenz, Volume 29, Issue 4, pp 363–368

Statistical Relational Artificial Intelligence: From Distributions through Actions to Optimization

  • Kristian Kersting
  • Sriraam Natarajan
Technical Contribution

Abstract

Statistical Relational AI, the science and engineering of building intelligent machines that act in noisy worlds composed of objects and relations among those objects, is currently motivating a lot of new AI research and has tremendous theoretical and practical implications. Theoretically, combining logic and probability in a unified representation and building general-purpose reasoning tools for it has been the dream of AI, dating back to the late 1980s. Practically, successful statistical relational AI tools enable new applications in several large, complex real-world domains, including those involving big data, natural text, social networks, the web, medicine and robotics, among others. Such domains are often characterized by rich relational structure and large amounts of uncertainty. Logic helps to faithfully model the former, while probability helps to effectively manage the latter. Our intention here is to give a brief (and necessarily incomplete) overview of, and invitation to, the emerging field of Statistical Relational AI from the perspective of acting optimally and learning to act.

Keywords

Markov Decision Process; Inductive Logic Programming; Logical Query; Probabilistic Graphical Model; Situation Calculus
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

The authors thank the anonymous reviewers for their feedback. They are also grateful to all the people who contributed to the development of Statistical Relational AI, in particular to Scott Sanner and David Poole for previous and current joint efforts in writing introductions and overviews on statistical relational AI and symbolic dynamic programming, parts of which grew into the present paper. KK also acknowledges the support of the German Science Foundation (DFG), KE 1686/2-1, within the SPP 1527 “Autonomous Learning”, and SN the support of Army Research Office (ARO) grant number W911NF-13-1-0432 under the Young Investigator Program.

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. TU Dortmund University, Dortmund, Germany
  2. Indiana University, Bloomington, USA
