
Investigating efficient speech-based information communication: a comparison between the high-rate and the concurrent playback designs

  • Regular Paper
  • Published in Multimedia Systems

Abstract

This research aims to help users seek information efficiently while interacting with speech-based content, particularly in multimedia delivery. It reports on an experiment that tested two designs for communicating multiple speech-based information streams efficiently: a high-rate playback design and a concurrent playback design. In the high-rate playback design, two speech-based information streams were communicated at double the normal playback rate; in the concurrent playback design, the two streams were played simultaneously. Comprehension of content in both designs was also compared against a benchmark set by a regular baseline condition. The results showed that users’ comprehension of the main information dropped significantly in both the high-rate and the concurrent playback designs compared to the baseline condition. However, for questions drawn from the detailed information, comprehension did not differ significantly across the three designs. Such efficient communication methods may increase productivity by delivering information more quickly while users interact with an interactive multimedia system.
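The two designs compared in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' apparatus: the function names, the 16 kHz sample rate, and the sine tones standing in for speech are all assumptions, and the plain sample-decimation used here for "high-rate playback" speeds up audio by raising its pitch, whereas real time-compression systems preserve pitch.

```python
import numpy as np

SR = 16_000  # sample rate in Hz (illustrative assumption)

def high_rate(stream: np.ndarray, factor: int = 2) -> np.ndarray:
    """Naive high-rate playback: keep every `factor`-th sample.

    Note: decimation raises pitch along with speed; pitch-preserving
    time compression would require an overlap-add style algorithm.
    """
    return stream[::factor]

def concurrent(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Concurrent playback: pad to equal length, sum, and rescale."""
    n = max(len(a), len(b))
    mix = np.zeros(n)
    mix[: len(a)] += a
    mix[: len(b)] += b
    return 0.5 * mix  # halve the sum to avoid clipping

# Two 1-second placeholder "speech" streams (sine tones stand in for audio).
t = np.arange(SR) / SR
stream_a = 0.5 * np.sin(2 * np.pi * 220 * t)
stream_b = 0.5 * np.sin(2 * np.pi * 330 * t)

# High-rate design: both streams in sequence, played at 2x speed.
fast = high_rate(np.concatenate([stream_a, stream_b]))
# Concurrent design: both streams at once, in the same amount of time.
mixed = concurrent(stream_a, stream_b)
```

Both sketches compress two seconds of material into one second of listening time, which is the efficiency trade-off the experiment evaluates against comprehension.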


Notes

  1. All percentages in the subsequent sections are given in rounded form; the graphs and statistical analyses, however, are based on precise values.


Acknowledgements

This research was supported by the School of Software, Faculty of Engineering and IT, University of Technology Sydney, Australia.

Author information


Corresponding author

Correspondence to Muhammad Abu ul Fazal.

Additional information

Communicated by M. Katsurai.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Abu ul Fazal, M., Ferguson, S. & Johnston, A. Investigating efficient speech-based information communication: a comparison between the high-rate and the concurrent playback designs. Multimedia Systems 26, 621–630 (2020). https://doi.org/10.1007/s00530-020-00669-2

