A Synthetic Evaluation of Dialogue Systems

  • K. Hasida
  • Y. Den
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 511)


It is relatively easy to evaluate technologies such as morphological analysis and information retrieval in objective and empirical terms, because unique solutions can be defined for such tasks. There has been no established method for evaluating natural language dialogue systems, however, because dialogue is a very complex task involving massive interaction, and it is impossible to define unique solutions for dialogues. To advance research on dialogue systems, an empirical method for evaluating them is needed. For instance, whether the theory of plan inference (Cohen and Perrault, 1979; Allen and Perrault, 1980; Perrault and Allen, 1980; Allen, 1983) is really useful in the design of a dialogue system should be determined by an empirical measure.


Keywords: Dialogue System; Synthetic Evaluation; Successful Dialogue; Dialogue Participant; Summer Session




  1. Allen J. F. 1983. Recognizing Intentions from Natural Language Utterances. In Brady M. and Berwick R. C. (eds.), Computational Models of Discourse, pp. 107–166. MIT Press, Cambridge, MA.
  2. Allen J. F. and Perrault C. R. 1980. Analyzing Intention in Utterances. Artificial Intelligence, 15, pp. 143–178.
  3. Anderson A. H., Bader M., Bard E. G., Doherty G., Garrod S., Isard S., Kowtko J., McAlister J., Miller J., Sotillo C., Thompson H. and Weinert R. 1992. The HCRC Map Task Corpus. Language and Speech, 34 (4), pp. 351–366.
  4. Aono M., Ichikawa A., Koiso H., Sato S., Naka M., Tutiya S., Yagi K., Watanabe N., Ishizaki M., Okada M., Suzuki H., Nakano Y. and Nonaka K. 1994. Tizukadai Kopasu: Tyukanhokoku (Map Task Corpus: An Interim Report, in Japanese). In JSAI SIG-SLUD-9402, pp. 25–30.
  5. Boisen S. and Bates M. 1992. A Practical Methodology for the Evaluation of Spoken Language Systems. In Proceedings of the Third Conference on Applied Natural Language Processing, Trento, Italy, pp. 162–169.
  6. Cohen P. R. and Perrault C. R. 1979. Elements of a Plan-Based Theory of Speech Acts. Cognitive Science, 3 (3), pp. 177–212.
  7. Epstein R. 1992. Can Machines Think? The Quest for the Thinking Computer. AI Magazine, 13 (2), pp. 80–95.
  8. Grice H. P. 1969. Utterer’s Meaning and Intentions. Philosophical Review, 78 (2), pp. 147–177.
  9. Harman D. 1995. The First Text Retrieval Conference (TREC-1). TR 500–207, National Institute of Standards and Technology Special Publication, Gaithersburg, MD.
  10. Kumamoto T. and Ito A. 1998. Taiwa Sisutemu tono Taiwa niokeru Yuza no Hurumai ni Tuite (An Analysis of User Input Sentences in Dialogues with Our Dialogue System, in Japanese). JSAI SIG-SLUD-9703, pp. 21–26.
  11. [MUC-3] 1991. Proceedings of the Third Message Understanding Conference. Morgan Kaufmann, San Mateo, CA.
  12. Perrault C. R. and Allen J. F. 1980. A Plan-Based Analysis of Indirect Speech Acts. American Journal of Computational Linguistics, 6 (3–4), pp. 167–182.
  13. Sato S. 1995. Taiwariigusen '95 ni taisuru Kihonsenryaku (Basic Strategies for DiaLeague '95, in Japanese). In IPSJ SIGAI 96-AI-103, 1996, pp. 13–18.

Copyright information

© Springer Science+Business Media New York 1999
