Intelligent Virtual Agents

Volume 7502 of the series Lecture Notes in Computer Science pp 1-14

Fully Automated Generation of Question-Answer Pairs for Scripted Virtual Instruction

  • Pascal Kuyten (Graduate School of Information Science & Technology, The University of Tokyo)
  • Timothy Bickmore (College of Computer and Information Science, Northeastern University)
  • Svetlana Stoyanchev (Spoken Language Processing Group, Department of Computer Science, Columbia University)
  • Paul Piwek (NLG Group, Centre for Research in Computing, The Open University)
  • Helmut Prendinger (National Institute of Informatics)
  • Mitsuru Ishizuka (Graduate School of Information Science & Technology, The University of Tokyo)



We introduce a novel approach for automatically generating a virtual instructor from textual input only. Our fully implemented system first analyzes the rhetorical structure of the input text and then creates candidate question-answer pairs using patterns. These patterns were derived from correlations found between the rhetorical structure of monologue texts and the question-answer pairs in the corresponding dialogues. A selection of the candidate pairs is verbalized into a diverse collection of question-answer pairs. Finally, the system compiles this collection into scripts for a virtual instructor. Our end-to-end system presents the questions in a fixed, predetermined order, and the agent answers them. The system was evaluated with a group of twenty-four subjects, using three informed consent documents for clinical trials from the domain of colon cancer. The documents were presented under three conditions: 1) text only, 2) text with an agent monologue, and 3) text with an agent performing question-answering. Results show that, compared to the two control conditions, an agent explaining an informed consent document did not yield significantly better comprehension scores but did score higher on satisfaction.
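The pipeline described in the abstract (rhetorical relations, then pattern-based question-answer pairs, then a script) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Relation` type, the two templates in `PATTERNS`, the `USER:`/`AGENT:` script format, and the sample sentence are hypothetical and are not taken from the authors' system, whose patterns were derived from a corpus and whose rhetorical parsing step is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Relation:
    """One rhetorical relation between two text spans (RST-style, assumed)."""
    name: str       # relation label, e.g. "Cause" or "Condition"
    nucleus: str    # the central span
    satellite: str  # the supporting span


# Hypothetical patterns: relation label -> (question template, answer template).
# The real system derives its patterns from monologue/dialogue correlations.
PATTERNS = {
    "Cause": ("Why is it that {nucleus}?", "Because {satellite}."),
    "Condition": ("What happens if {satellite}?", "Then {nucleus}."),
}


def generate_qa_pairs(relations):
    """Apply the matching pattern to each relation; skip unknown labels."""
    pairs = []
    for rel in relations:
        pattern = PATTERNS.get(rel.name)
        if pattern is None:
            continue  # no pattern for this relation type
        q_tmpl, a_tmpl = pattern
        fields = {"nucleus": rel.nucleus, "satellite": rel.satellite}
        pairs.append((q_tmpl.format(**fields), a_tmpl.format(**fields)))
    return pairs


def compile_script(pairs):
    """Emit the pairs in fixed order as alternating question/answer lines."""
    lines = []
    for question, answer in pairs:
        lines.append(f"USER: {question}")
        lines.append(f"AGENT: {answer}")
    return "\n".join(lines)


# Invented input standing in for the output of a rhetorical parser.
relations = [
    Relation("Cause", "the visit takes longer", "extra tests are needed"),
    Relation("Elaboration", "span a", "span b"),  # no pattern, so skipped
]
script = compile_script(generate_qa_pairs(relations))
```

Keeping the patterns in a plain lookup table separates the analysis step from verbalization, so relations without a matching pattern are simply dropped rather than producing malformed questions.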


Keywords: Dialogue Generation, Rhetorical Structure Theory, Medical Documents