Sequence Labeling with Reinforcement Learning and Ranking Algorithms
Many problems in areas such as Natural Language Processing, Information Retrieval, or Bioinformatics involve the generic task of sequence labeling. In many cases, the aim is to assign a label to each element in a sequence. Until now, this problem has mainly been addressed with Markov models and Dynamic Programming.
We propose a new approach where the sequence labeling task is seen as a sequential decision process. This method is shown to be very fast with good generalization accuracy. Instead of searching for a globally optimal label sequence, we learn to construct this optimal sequence directly in a greedy fashion. First, we show that sequence labeling can be modelled using Markov Decision Processes, so that several Reinforcement Learning (RL) algorithms can be used for this task. Second, we introduce a new RL algorithm which is based on the ranking of local labeling decisions.
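The idea of building the label sequence greedily, one decision at a time, can be illustrated with a minimal sketch. It is not the algorithm of the paper: the feature map `features`, the start symbol `"<s>"`, and the perceptron-style ranking update in `train` are all assumptions chosen for illustration. The state is the current position together with the previously chosen label, an action is a candidate label, and the learned scores only need to rank the correct label above the others at each step.

```python
from collections import defaultdict

def features(tokens, t, prev_label, label):
    # Hypothetical feature map for a (state, action) pair:
    # state = position t and previous label, action = candidate label.
    return [f"tok={tokens[t]}|y={label}", f"prev={prev_label}|y={label}"]

def score(weights, feats):
    # Linear score of one labeling decision.
    return sum(weights[f] for f in feats)

def best_action(weights, tokens, t, prev_label, labels):
    # Rank all candidate labels in the current state and return the top one.
    return max(labels, key=lambda y: score(weights, features(tokens, t, prev_label, y)))

def greedy_label(weights, tokens, labels):
    # Build the label sequence left to right, with no global search.
    out, prev = [], "<s>"
    for t in range(len(tokens)):
        prev = best_action(weights, tokens, t, prev, labels)
        out.append(prev)
    return out

def train(data, labels, epochs=5):
    # Assumed update rule: a simple perceptron-style ranking correction,
    # applied along the learner's own greedy trajectory.
    weights = defaultdict(float)
    for _ in range(epochs):
        for tokens, gold in data:
            prev = "<s>"
            for t in range(len(tokens)):
                pred = best_action(weights, tokens, t, prev, labels)
                if pred != gold[t]:
                    # Push the correct label above the wrongly top-ranked one.
                    for f in features(tokens, t, prev, gold[t]):
                        weights[f] += 1.0
                    for f in features(tokens, t, prev, pred):
                        weights[f] -= 1.0
                prev = pred  # continue from the decision actually taken
    return weights
```

With this sketch, labeling costs one ranking pass over the label set per element, which is why a greedy policy of this kind can be much cheaper at inference time than dynamic programming over whole label sequences.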
Keywords: Input Sequence, Markov Decision Process, Conditional Random Field, Ranking Algorithm, Sequence Label