Machine Learning, Volume 15, Issue 1, pp 43–68

Discrete sequence prediction and its applications

  • Philip Laird
  • Ronald Saul


Learning from experience to predict sequences of discrete symbols is a fundamental problem in machine learning with many applications. We present a simple and practical algorithm (TDAG) for discrete sequence prediction. Based on a text-compression method, the TDAG algorithm limits the growth of storage by retaining the most likely prediction contexts and discarding (forgetting) less likely ones. The storage/speed tradeoffs are parameterized so that the algorithm can be used in a variety of applications. Our experiments verify its performance on data compression tasks and show how it applies to two problems: dynamically optimizing Prolog programs for good average-case behavior and maintaining a cache for a database on mass storage.
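The idea of predicting from the longest stored context while declining to grow unlikely contexts can be illustrated with a small sketch. This is not the published TDAG algorithm, whose data structure and parameters are not described in this abstract; the class name `ContextPredictor` and the parameters `max_order` and `min_prob` are hypothetical, chosen only for illustration. The sketch also never deletes a context once stored; it only refuses to extend contexts whose estimated frequency falls below the threshold.

```python
from collections import Counter, defaultdict


class ContextPredictor:
    """Bounded-context next-symbol predictor (illustrative; not the published TDAG)."""

    def __init__(self, max_order=2, min_prob=0.05):
        self.max_order = max_order          # hypothetical: longest context length kept
        self.min_prob = min_prob            # hypothetical: minimum context frequency
        self.counts = defaultdict(Counter)  # context tuple -> counts of following symbols
        self.history = []                   # the last max_order symbols seen

    def _is_likely(self, ctx):
        """True if the estimated relative frequency of ctx is at least min_prob."""
        if not ctx:
            return True                     # the empty context is always kept
        parent = self.counts.get(ctx[:-1])
        total = sum(self.counts[()].values())
        if not parent or total == 0:
            return False
        return parent[ctx[-1]] / total >= self.min_prob

    def update(self, symbol):
        """Record symbol after each sufficiently likely context ending at the present."""
        for k in range(min(self.max_order, len(self.history)) + 1):
            ctx = tuple(self.history[-k:]) if k else ()
            if not self._is_likely(ctx):
                break                       # do not store this or any longer context
            self.counts[ctx][symbol] += 1
        self.history.append(symbol)
        if len(self.history) > self.max_order:
            self.history.pop(0)             # keep only the most recent symbols

    def predict(self):
        """Return the most frequent successor of the longest stored context, or None."""
        for k in range(min(self.max_order, len(self.history)), -1, -1):
            ctx = tuple(self.history[-k:]) if k else ()
            seen = self.counts.get(ctx)
            if seen:
                return seen.most_common(1)[0][0]
        return None


predictor = ContextPredictor(max_order=2, min_prob=0.05)
for ch in "ab" * 10 + "a":
    predictor.update(ch)
print(predictor.predict())
```

Raising `min_prob` or lowering `max_order` reduces the number of stored contexts at the cost of coarser predictions, which is one simple way to expose a storage/accuracy tradeoff of the kind the abstract mentions.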


sequence extrapolation, statistical learning, text compression, speedup learning, memory management



Copyright information

© Kluwer Academic Publishers 1994

Authors and Affiliations

  • Philip Laird (1)
  • Ronald Saul (2)
  1. AI Research Branch, NASA Ames Research Center, Moffett Field
  2. Recom Technologies, Inc., NASA Ames Research Center, Moffett Field
