Abstract
This research aims to help users seek information efficiently while interacting with speech-based content, particularly in multimedia delivery, and reports on an experiment that tested two speech-based designs for communicating multiple speech-based information streams efficiently. The experiment investigated a high-rate playback design and a concurrent playback design. In the high-rate playback design, two speech-based information streams were communicated at double the normal playback rate; in the concurrent playback design, the two streams were played simultaneously. Comprehension of content in both designs was also compared against a benchmark established by a regular-rate baseline condition. The results showed that users' comprehension of the main information dropped significantly in the high-rate playback and concurrent playback designs compared to the baseline condition. However, for questions drawn from the detailed information, comprehension did not differ significantly across the three designs. Such efficient communication methods may increase productivity by delivering information more quickly while users interact with an interactive multimedia system.
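The two playback designs compared in the abstract can be illustrated with a minimal signal-processing sketch. The code below is not the authors' experimental apparatus (the paper used prepared audio material); it is a hypothetical NumPy illustration in which the high-rate design is approximated by naive 2x decimation (real high-rate playback would use pitch-preserving time-stretching, e.g. WSOLA) and the concurrent design by summing two streams with headroom scaling.

```python
import numpy as np


def double_rate(samples: np.ndarray) -> np.ndarray:
    """Crude 2x time compression by dropping every other sample.

    Illustration only: pitch-preserving time-stretching (e.g. WSOLA)
    is what actual high-rate speech playback systems use.
    """
    return samples[::2]


def mix_concurrent(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Play two streams at once: pad to equal length, sum, and
    halve the amplitude to leave headroom against clipping."""
    n = max(len(a), len(b))
    mixed = np.zeros(n)
    mixed[: len(a)] += a
    mixed[: len(b)] += b
    return mixed / 2.0


# Two 1-second dummy "speech" signals at 8 kHz (placeholders for
# real recorded streams).
sr = 8000
t = np.arange(sr) / sr
stream_a = np.sin(2 * np.pi * 220 * t)
stream_b = np.sin(2 * np.pi * 330 * t)

# High-rate design: both streams in sequence, played at 2x speed,
# so total duration equals one stream at normal speed.
fast = double_rate(np.concatenate([stream_a, stream_b]))

# Concurrent design: both streams at normal speed, overlapped.
both = mix_concurrent(stream_a, stream_b)
```

Either design delivers two streams in the time one stream would normally take, which is the efficiency gain the experiment set out to measure against comprehension cost.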
Notes
All percentages in the subsequent sections are reported in rounded form; the graphs and statistical analyses, however, are based on the precise values.
Acknowledgements
This research was supported by the School of Software, Faculty of Engineering and IT, University of Technology Sydney, Australia.
Additional information
Communicated by M. Katsurai.
Cite this article
Abu ul Fazal, M., Ferguson, S. & Johnston, A. Investigating efficient speech-based information communication: a comparison between the high-rate and the concurrent playback designs. Multimedia Systems 26, 621–630 (2020). https://doi.org/10.1007/s00530-020-00669-2