Neural Processing Letters, Volume 14, Issue 2, pp 127–140

Online Text Prediction with Recurrent Neural Networks

  • Juan Antonio Pérez-Ortiz
  • Jorge Calera-Rubio
  • Mikel L. Forcada


Arithmetic coding is one of the most outstanding techniques for lossless data compression. It attains its good performance with the help of a probability model which indicates at each step the probability of occurrence of each possible input symbol given the current context. The better this model, the greater the compression ratio achieved. This work analyses the use of discrete-time recurrent neural networks and their capability for predicting the next symbol in a sequence in order to implement that model. The focus of this study is on online prediction, a task much harder than the classical offline grammatical inference with neural networks. The results obtained show that recurrent neural networks have no problem when the sequences come from the output of a finite-state machine, easily giving high compression ratios. When compressing real texts, however, the dynamics of the sequences seem to be too complex to be learned online correctly by the net.
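The link between model quality and compression ratio can be illustrated with a minimal sketch. It does not reproduce the recurrent networks studied in the paper: as a stand-in, it uses a simple adaptive frequency model with add-one smoothing, and it measures the ideal arithmetic-coding cost as the sum of -log2 p over the symbols actually seen. The function names, the smoothing scheme, and the 8-bits-per-symbol baseline for the uncompressed size are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def ideal_code_length_bits(sequence, alphabet, order):
    """Bits an ideal arithmetic coder would spend on `sequence` (a string)
    when driven by an adaptive order-`order` frequency model.

    The model is a hypothetical stand-in for the paper's neural predictor.
    """
    # counts[context][symbol] = how often `symbol` has followed `context` so far
    counts = defaultdict(lambda: defaultdict(int))
    total_bits = 0.0
    for i, symbol in enumerate(sequence):
        context = sequence[max(0, i - order):i]
        seen = counts[context]
        total = sum(seen.values())
        # Online setting: predict first, using only past symbols.
        # Add-one smoothing keeps every probability strictly positive.
        p = (seen[symbol] + 1) / (total + len(alphabet))
        total_bits += -math.log2(p)
        # Only then update the model with the symbol just coded.
        seen[symbol] += 1
    return total_bits


def compression_ratio(sequence, alphabet, order, raw_bits_per_symbol=8):
    """Uncompressed size divided by ideal coded size (assumes 8-bit symbols)."""
    raw_bits = len(sequence) * raw_bits_per_symbol
    return raw_bits / ideal_code_length_bits(sequence, alphabet, order)


if __name__ == "__main__":
    # A periodic sequence, as a trivial finite-state source would emit.
    text = "abc" * 200
    for k in (0, 1):
        print("order", k, "ratio", round(compression_ratio(text, "abc", k), 2))
```

On this periodic input the order-1 model quickly becomes almost certain of the next symbol, so its coded length is far below that of the order-0 model, which never does better than roughly one third per symbol. The same accounting applies to any predictor that outputs a probability for each possible next symbol, including a recurrent network trained online.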

arithmetic coding, online nonlinear prediction, recurrent neural networks, text compression





Copyright information

© Kluwer Academic Publishers 2001

Authors and Affiliations

  • Juan Antonio Pérez-Ortiz (1)
  • Jorge Calera-Rubio (1)
  • Mikel L. Forcada (1)

  1. Departament de Llenguatges i Sistemes Informàtics, Universitat d'Alacant, Alacant, Spain
