Risk-Aware Recommender Systems

  • Djallel Bouneffouf
  • Amel Bouzeghoub
  • Alda Lopes Gançarski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8226)

Abstract

Context-Aware Recommender Systems can naturally be modelled as an exploration/exploitation (exr/exp) trade-off problem, in which the system must choose between maximizing its expected reward given its current knowledge (exploitation) and learning more about the user’s unknown preferences to improve that knowledge (exploration). The reinforcement learning community has addressed this problem, but existing approaches do not consider the risk level of the user’s current situation: when the risk level is high, it may be dangerous to recommend items the user does not want in that situation. In this paper we introduce an algorithm named R-UCB that takes the risk level of the user’s situation into account to adaptively balance exr and exp. A detailed analysis of the experimental results reveals several important findings about the exr/exp behaviour.
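
The abstract does not spell out how R-UCB works, so the following is only a minimal sketch of the general idea of risk-dependent exploration, not the authors' algorithm. It assumes a risk level in [0, 1], a linear mapping from risk to a random-exploration probability, and a standard UCB1 score for the non-random choice. The names `exploration_rate`, `RiskAwareUCB`, `eps_min` and `eps_max` are hypothetical and chosen for illustration.

```python
import math
import random


def exploration_rate(risk, eps_min=0.0, eps_max=0.5):
    """Assumed mapping: exploration shrinks linearly as risk goes from 0 to 1."""
    risk = min(max(risk, 0.0), 1.0)  # clamp risk to [0, 1]
    return eps_max - risk * (eps_max - eps_min)


class RiskAwareUCB:
    """Illustrative bandit: random exploration gated by risk, UCB1 otherwise."""

    def __init__(self, n_items, rng=None):
        self.counts = [0] * n_items   # times each item was recommended
        self.means = [0.0] * n_items  # running mean reward per item
        self.rng = rng or random.Random()

    def ucb_scores(self):
        total = sum(self.counts)
        scores = []
        for n, mean in zip(self.counts, self.means):
            if n == 0:
                # Untried items get an infinite score so they are tried once.
                scores.append(float("inf"))
            else:
                scores.append(mean + math.sqrt(2 * math.log(total) / n))
        return scores

    def select(self, risk):
        # Low-risk situations allow more random exploration.
        if self.rng.random() < exploration_rate(risk):
            return self.rng.randrange(len(self.counts))
        scores = self.ucb_scores()
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, item, reward):
        self.counts[item] += 1
        # Incremental update of the mean reward for this item.
        self.means[item] += (reward - self.means[item]) / self.counts[item]


bandit = RiskAwareUCB(n_items=2, rng=random.Random(0))
bandit.update(0, 1.0)  # item 0 was liked
bandit.update(1, 0.0)  # item 1 was not
choice_high_risk = bandit.select(risk=1.0)
```

With `risk=1.0` and the default `eps_min=0.0`, the random branch is never taken, so the choice falls back to the UCB score alone. In a lower-risk situation the same call may return a randomly chosen item instead.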

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Djallel Bouneffouf
    • 1
  • Amel Bouzeghoub
    • 1
  • Alda Lopes Gançarski
    • 1
  1. Department of Computer Science, UMR CNRS Samovar, Télécom SudParis, Evry Cedex, France
