Data Collection and Synchronisation: Towards a Multiperspective Multimodal Dialogue System with Metacognitive Abilities

Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 427)

Abstract

This article describes the data collection system and methods adopted in the METALOGUE (Multiperspective Multimodal Dialogue System with Metacognitive Abilities) project. The ultimate goal of the METALOGUE project is to develop a multimodal dialogue system capable of delivering instructional advice while interacting with humans in a natural way. The data being collected will support the development of a dialogue system that exploits metacognitive reasoning to deliver feedback on the user's performance in debates and negotiations. The initial data collection scenario consists of debates in which two students exchange views and arguments on a social issue, such as a proposed ban on smoking in public areas, and deliver their presentations in front of an audience. Approximately three hours of data have been recorded to date, and all recorded streams have been precisely synchronised and pre-processed for statistical learning. The data consist of audio, video and three-dimensional skeletal movement information for the participants. These data will be used in the development of cognitive dialogue and discourse models to underpin educational interventions in public speaking training.
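The cross-stream synchronisation mentioned above can be illustrated with a minimal sketch. This is not the project's actual pipeline: the function name, the tolerance parameter and the sample timestamps below are hypothetical, and the sketch assumes each stream already carries per-sample timestamps on a common clock. Each sample of a reference stream (here, 25 fps video frames) is matched to the nearest sample of another stream (here, a 30 Hz skeletal stream) within a tolerance:

```python
from bisect import bisect_left

def align_streams(reference_ts, stream_ts, tolerance=0.05):
    """Map each reference timestamp to the index of the nearest sample
    in stream_ts (sorted, in seconds), or None if no sample lies within
    `tolerance` seconds.  Hypothetical sketch, not the METALOGUE code."""
    aligned = []
    for t in reference_ts:
        i = bisect_left(stream_ts, t)  # first sample >= t
        candidates = [j for j in (i - 1, i) if 0 <= j < len(stream_ts)]
        if not candidates:
            aligned.append(None)
            continue
        best = min(candidates, key=lambda j: abs(stream_ts[j] - t))
        aligned.append(best if abs(stream_ts[best] - t) <= tolerance else None)
    return aligned

# 25 fps video frames matched against a 30 Hz skeletal stream:
video_ts = [0.0, 0.04, 0.08, 0.12]
skeleton_ts = [0.0, 0.033, 0.066, 0.1, 0.133]
print(align_streams(video_ts, skeleton_ts, tolerance=0.02))  # [0, 1, 2, 4]
```

Nearest-timestamp matching of this kind is one common way to put heterogeneous sensor streams (audio, video, skeletal data) onto a shared frame index before statistical learning; a returned None marks a reference frame with no usable sample in the other stream.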

Keywords

  • Dialogue systems
  • Metacognition
  • Multimodal data
  • Instructional advice
  • Presentation quality



Acknowledgements

This research was supported by the EU FP7 METALOGUE project under Grant No. 611073 at the School of Computer Science and Statistics, Trinity College Dublin.

Author information

Corresponding author

Correspondence to Fasih Haider.


Copyright information

© 2017 Springer Science+Business Media Singapore

About this chapter

Cite this chapter

Haider, F., Luz, S., Campbell, N. (2017). Data Collection and Synchronisation: Towards a Multiperspective Multimodal Dialogue System with Metacognitive Abilities. In: Jokinen, K., Wilcock, G. (eds) Dialogues with Social Robots. Lecture Notes in Electrical Engineering, vol 427. Springer, Singapore. https://doi.org/10.1007/978-981-10-2585-3_19

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-2584-6

  • Online ISBN: 978-981-10-2585-3

  • eBook Packages: Engineering, Engineering (R0)