Controlling Switching Pause Using an AR Agent for Interactive CALL System

  • Naoto Suzuki
  • Takashi Nose
  • Yutaka Hiroi
  • Akinori Ito
Part of the Communications in Computer and Information Science book series (CCIS, volume 435)


We are developing a voice-interactive CALL (Computer-Assisted Language Learning) system to provide more opportunities for English conversation practice. Among the several types of CALL system, we focus on a spoken dialogue system for dialogue practice. When the user answers the system’s utterance, the timing of the answer can be unnatural: the system usually does not react while the user remains silent, so the learner tends to take more time to answer the system than to answer a human counterpart. However, there is no framework for suppressing this pause and practicing an appropriate pause duration.

In this research, we conducted an experiment to investigate the effect of the presence of an AR character, in order to analyze the effect of the character itself as a counterpart. In addition, we analyzed the pause between the two speakers’ utterances (the switching pause), which is related to the smoothness of the conversation. Moreover, we introduced a virtual character realized by AR (Augmented Reality) as the dialogue counterpart to control the switching pause. We gave the character a “time pressure” behavior to prevent the learner from taking a long time to consider an utterance.
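As an illustration of the quantity analyzed here, the following minimal sketch computes switching pauses from time-stamped utterances. It assumes each utterance has already been segmented with start and end times in seconds. The `Utterance` type, the speaker labels, and the sample timings are hypothetical and are not taken from the paper's system or data.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # hypothetical labels, e.g. "system" or "learner"
    start: float  # utterance onset in seconds
    end: float    # utterance offset in seconds


def switching_pauses(utterances):
    """Return the silent gap (in seconds) at each change of speaker."""
    ordered = sorted(utterances, key=lambda u: u.start)
    pauses = []
    for prev, nxt in zip(ordered, ordered[1:]):
        # Only a change of speaker counts as a switching pause;
        # gaps between two utterances by the same speaker are skipped.
        if prev.speaker != nxt.speaker:
            pauses.append(nxt.start - prev.end)
    return pauses


# Made-up timings for illustration only.
dialogue = [
    Utterance("system", 0.0, 2.0),
    Utterance("learner", 3.5, 5.0),
    Utterance("system", 5.5, 7.0),
]
print(switching_pauses(dialogue))
```

A negative value would indicate overlapping speech. Whether such cases are kept or discarded is a design choice that this sketch leaves open.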

To verify whether this expression is effective for controlling the switching pause, we designed an experiment that was conducted both with and without the expression. We found that the switching pause duration became significantly shorter when the agent made the time-pressure expression.


Computer-assisted language learning, English learning, Spoken dialogue system, Switching pause, Augmented reality





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Naoto Suzuki (1)
  • Takashi Nose (1)
  • Yutaka Hiroi (2)
  • Akinori Ito (1)
  1. Graduate School of Engineering, Tohoku University, Sendai, Japan
  2. Department of Robotics, Osaka Institute of Technology, Osaka, Japan
