Learning to Complete Sentences

  • Steffen Bickel
  • Peter Haider
  • Tobias Scheffer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3720)


We consider the problem of predicting how a user will continue a given initial text fragment. Intuitively, our goal is to develop a “tab-complete” function for natural language, based on a model that is learned from text data. We consider two learning mechanisms that generate predictive models from application-specific document collections: we develop an N-gram based completion method and discuss the application of instance-based learning. After developing evaluation metrics for this task, we empirically compare the model-based method to the instance-based method and assess the predictability of call-center emails, personal emails, and weather reports.
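
To make the idea of N-gram based completion concrete, the following is a minimal sketch and not the authors' method: it assumes a bigram model (N = 2), whitespace tokenization, and a greedy decoder that keeps appending the most likely next word while its relative frequency stays above a threshold. The names `train_bigrams`, `complete`, `max_words`, and `min_prob` are hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training sentences."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()  # assumption: plain whitespace tokenization
        for prev, nxt in zip(words, words[1:]):
            followers[prev][nxt] += 1
    return followers


def complete(followers, fragment, max_words=5, min_prob=0.5):
    """Greedily extend the fragment while the best next word is likely enough."""
    words = fragment.lower().split()
    completion = []
    if not words:
        return completion
    last = words[-1]
    for _ in range(max_words):
        counts = followers.get(last)
        if not counts:
            break  # the last word was never seen with a successor
        word, count = counts.most_common(1)[0]
        if count / sum(counts.values()) < min_prob:
            break  # stop rather than suggest an unlikely continuation
        completion.append(word)
        last = word
    return completion


# Toy corpus, invented for illustration only.
corpus = [
    "please find attached the report",
    "please find attached the invoice",
    "please find enclosed the report",
]
model = train_bigrams(corpus)
print(" ".join(complete(model, "Please")))
```

In this sketch, raising `min_prob` makes the function suggest shorter completions or none at all, while the suggestions it does make are more likely to be accepted; this is the kind of trade-off that a precision-recall curve is typically used to summarize.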


Keywords: Precision Recall Curve, Sentence Completion, Complete Sentence, Weather Report, Personal Email



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Steffen Bickel (1)
  • Peter Haider (1)
  • Tobias Scheffer (1)
  1. Department of Computer Science, Humboldt-Universität zu Berlin, Berlin, Germany
