
Testing a Learn-Verify-Repair Approach for Safe Human-Robot Interaction

  • Shashank Pathak
  • Luca Pulina
  • Armando Tacchella
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9336)

Abstract

Ensuring safe behaviors, i.e., minimizing the probability that a control strategy yields undesirable effects, becomes crucial when robots interact with humans in semi-structured environments through adaptive control strategies. In previous papers, we proposed an approach that (i) computes control policies through reinforcement learning, (ii) verifies them against safety requirements with probabilistic model checking, and (iii) repairs them with greedy local methods until the requirements are met. This learn-verify-repair workflow was shown to be effective in some relatively simple and confined test cases. In this paper, we frame human-robot interaction in light of these previous contributions, and we test the effectiveness of the learn-verify-repair approach in a more realistic factory-to-home deployment scenario. The purpose of our test is to assess whether we can verify that interaction patterns are carried out with negligible human-to-robot collision probability and whether, in the presence of user tuning, strategies that determine offending behaviors can be effectively repaired.
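The three stages of the workflow can be sketched on a toy corridor MDP: a tabular Q-learning agent learns a navigation policy, an exact probability computation (standing in for a probabilistic model checker such as PRISM or MRMC) verifies the collision bound, and a greedy local repair forces the cautious action at the risky state when the bound is violated. All states, rewards, probabilities, and thresholds below are illustrative assumptions, not the paper's actual model.

```python
import random

# Toy corridor MDP (illustrative): states 0..4, goal at state 4, a human
# stands at state 2. Actions: 0 = cautious step (advance 1), 1 = fast step
# (advance 2, higher collision risk when taken at the human's state).
N_STATES, GOAL, HUMAN = 5, 4, 2
P_COLLIDE = {0: 0.02, 1: 0.30}  # per-action collision probability at HUMAN

def step(s, a):
    """Simulate one transition; a collision leaves the robot in place."""
    if s == HUMAN and random.random() < P_COLLIDE[a]:
        return s, -10.0, True
    s2 = min(s + (2 if a == 1 else 1), GOAL)
    return s2, (1.0 if s2 == GOAL else -0.1), False

def learn(episodes=2000, alpha=0.1, gamma=0.95, eps=0.1):
    """Stage (i): tabular Q-learning; returns the greedy policy."""
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            a = random.randrange(2) if random.random() < eps \
                else max((0, 1), key=lambda x: Q[s][x])
            s2, r, _ = step(s, a)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return [max((0, 1), key=lambda a: Q[s][a]) for s in range(N_STATES)]

def collision_probability(policy):
    """Stage (ii): exact probability of colliding at least once along the
    policy's route (a stand-in for probabilistic model checking)."""
    prob_safe, s = 1.0, 0
    while s != GOAL:
        a = policy[s]
        if s == HUMAN:
            prob_safe *= 1.0 - P_COLLIDE[a]
        s = min(s + (2 if a == 1 else 1), GOAL)
    return 1.0 - prob_safe

def repair(policy, threshold=0.05):
    """Stage (iii): greedy local repair; force the cautious action at the
    risky state whenever the verified bound is violated."""
    policy = list(policy)
    if collision_probability(policy) > threshold:
        policy[HUMAN] = 0
    return policy
```

For instance, the always-fast policy `[1, 1, 1, 1, 1]` is verified to collide with probability 0.30; one repair step brings it below the 0.05 bound. In the paper's setting, the verification stage checks a PCTL-style probabilistic reachability property over the full MDP rather than a single route, and the repair acts on the learned policy's action probabilities.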

Keywords

Model Checking · Optimal Policy · Reinforcement Learning · Markov Decision Process · Humanoid Robot



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Shashank Pathak: iCub Facility, Istituto Italiano di Tecnologia (IIT), Genova, Italy
  • Luca Pulina: POLCOMING, Università degli Studi di Sassari, Sassari, Italy
  • Armando Tacchella: DIBRIS, Università degli Studi di Genova, Genova, Italy
