Risk-Sensitive Online Learning
We consider the problem of online learning in settings in which we want to compete not simply with the rewards of the best expert or stock, but with the best trade-off between rewards and risk. Motivated by finance applications, we consider two common measures balancing returns and risk: the Sharpe ratio and the mean-variance criterion of Markowitz. We first provide negative results establishing the impossibility of no-regret algorithms under these measures, thus providing a stark contrast with the returns-only setting. We then show that the recent algorithm of Cesa-Bianchi et al. achieves nontrivial performance under a modified bicriteria risk-return measure, and give a modified best expert algorithm that achieves no regret for a “localized” version of the mean-variance criterion. We perform experimental comparisons of traditional online algorithms and the new risk-sensitive algorithms on a recent six-year S&P 500 data set and find that the modified best expert algorithm outperforms the traditional algorithms with respect to Sharpe ratio, MV, and accumulated wealth. To our knowledge this paper initiates the investigation of explicit risk considerations in the standard models of worst-case online learning.
Keywords: Competitive Ratio · Online Algorithm · Sharpe Ratio · Average Reward · Good Expert
- 1. Agarwal, A., Hazan, E., Kale, S., Schapire, R.E.: Algorithms for portfolio management based on the Newton method. In: ICML (2006)
- 2. Bodie, Z., Kane, A., Marcus, A.J.: Portfolio performance evaluation. In: Investments, 4th edn. Irwin McGraw-Hill, New York (1999)