Abstract
Perceptual characteristics of virtual auditory environments from three listener groups were compared. To generate convincing and pleasing virtual auditory environments, acoustic impulse responses were measured in two venues using an innovative microphone array and convolved with two anechoic recordings. Subsequently, the convolved sound sources were assigned to loudspeakers (five horizontal channels and four height channels), and inter-channel level balances were optimized. The authors conducted a controlled listening test with two variables: height-channel configurations (eight conditions) and stimuli (four conditions: two musical selections times two target venues) to determine the influence of (1) two control variables on the perceived appropriateness of virtual auditory environments and (2) the cultural background of three listener groups composed of participants from Canada (group 1, 11 subjects), the USA (group 2, 12 subjects), and Japan (group 3, 14 subjects). The data analysis revealed that the configuration variable (the height position of the loudspeakers) has a greater influence on perceived appropriateness than the stimulus variable for all three groups. In addition, the results showed that although group 1 data had a similar listening response pattern to group 2, the response of group 3 was different. A subsequent analysis of reported descriptors found that groups 1 and 2 chose height configurations that generated a “frontal” and “narrow” impression as a more appropriate virtual auditory environment, while group 3 chose the same characteristics but as a less appropriate environment. Groups 1 and 2 also described a less appropriate auditory environment with “wide, spacious, and surrounding” images that again were described by group 3 as more appropriate.
While room acoustics and loudspeaker size also contributed to the overall modulation of listeners’ judgment, the findings support the idea that cultural background affects perceptual responses to spatial sound and is therefore important in rendering a homogeneous experience of a virtual auditory environment for listeners in remote spaces.
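The rendering step summarized in the abstract (convolving an anechoic recording with one measured impulse response per loudspeaker, then adjusting inter-channel levels) can be illustrated with a minimal sketch. It is not the authors' processing chain: the function name `render_channels`, the `gains_db` parameter, and the use of FFT-based convolution are assumptions made for illustration; only the nine-channel layout (five horizontal plus four height channels) comes from the text.

```python
import numpy as np

def render_channels(dry, irs, gains_db=None):
    """Convolve one dry (anechoic) signal with one impulse response per channel.

    dry      : 1-D array, the anechoic recording (hypothetical input)
    irs      : 2-D array of shape (channels, ir_length), one IR per loudspeaker
    gains_db : optional per-channel level offsets in dB (hypothetical balance step)
    """
    dry = np.asarray(dry, dtype=float)
    irs = np.asarray(irs, dtype=float)
    n_ch, ir_len = irs.shape
    if gains_db is None:
        gains_db = np.zeros(n_ch)
    gains = 10.0 ** (np.asarray(gains_db, dtype=float) / 20.0)

    # Full linear convolution length, zero-padded to the next power of two
    out_len = dry.size + ir_len - 1
    n_fft = 1 << (out_len - 1).bit_length()

    # Multiply spectra: equivalent to time-domain convolution per channel
    dry_spec = np.fft.rfft(dry, n_fft)
    ir_spec = np.fft.rfft(irs, n_fft, axis=1)
    wet = np.fft.irfft(ir_spec * dry_spec, n_fft, axis=1)

    # Trim the padding and apply the per-channel level balance
    return wet[:, :out_len] * gains[:, None]
```

With nine rows in `irs`, the result has one row per loudspeaker feed; the values passed as `gains_db` stand in for whatever level balance a listening-based optimization would yield.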
Notes
In their book, Blesser and Salter (2006) used the term “aural architecture” as equivalent to an auditory environment and “aural architect” for a listener who reports aural attributes of a space.
\(RT_{60}\) is the time required for the acoustic energy of an initial excitation to decay by 60 dB; it is a standard metric for the reverberation time of a space.
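As an illustration of the \(RT_{60}\) definition, the following minimal sketch estimates it from an impulse response by backward (Schroeder) integration of the squared response and a straight-line fit to the decay curve. The article does not state how its values were obtained, so the method, the fitting range of -5 to -35 dB, and all function names here are assumptions; the synthetic test signal is exponentially decaying noise, not measured data.

```python
import numpy as np

def schroeder_decay_db(ir):
    """Backward-integrated energy decay curve in dB, 0 dB at the first sample."""
    energy = np.asarray(ir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]   # energy remaining after each sample
    return 10.0 * np.log10(edc / edc[0])

def estimate_rt60(ir, fs, fit_range_db=(-5.0, -35.0)):
    """Fit a line to the decay curve and extrapolate to a 60 dB drop."""
    decay = schroeder_decay_db(ir)
    t = np.arange(decay.size) / fs
    upper, lower = fit_range_db
    mask = (decay <= upper) & (decay >= lower)
    slope, _ = np.polyfit(t[mask], decay[mask], 1)   # dB per second, negative
    return -60.0 / slope

def synthetic_ir(rt60, fs, duration, seed=0):
    """Noise whose energy envelope falls by 60 dB after rt60 seconds."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration * fs)) / fs
    # Amplitude factor 10^(-3 t / rt60) gives an energy factor 10^(-6 t / rt60)
    return rng.standard_normal(t.size) * 10.0 ** (-3.0 * t / rt60)
```

Calling `estimate_rt60(synthetic_ir(1.0, 16000, 2.0), 16000)` should return a value close to one second; on measured responses the noise floor limits the usable fitting range, which is why the fit stops well above -60 dB.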
References
Alwin DF, Krosnick JA (1985) The measurement of values in surveys: a comparison of ratings and rankings. Public Opin Q 49(4):535–552
Anderson LM, Mulligan BE, Goodman LS (1984) Effects of vegetation on human response to sound. J Arboric 10(2):45–49
AURO Technologies (2013) AURO 3D listening formats. http://www.auro-technologies.com/system/listening-formats. As of April 2014
Blesser B, Salter L (2006) Spaces speak, are you listening? Experiencing aural architecture. The MIT Press, Cambridge, Massachusetts
Feys J (2015) npIntFactRep: nonparametric interaction tests for factorial designs with repeated measures. http://cran.r-project.org/web/packages/npIntFactRep/index.html. As of April 2015
Giragama CNW, Martens WL, Herath S, Wanasinghe DR, Sabbir AM (2003) Relating multilingual semantic scales to a common timbre space - Part II. In: Proceedings of audio engineering society 115th international convention, New York, USA. AES. Preprint 5895
Hamasaki K, Hiyama K, Okumura R (2005) The 22.2 multichannel sound system and its application. In: Proceedings of audio engineering society 118th international convention, Barcelona, Spain. AES. Preprint 6406
Herrington JD (1996) Effects of music in service environments: a field study. J Serv Mark 10(2):26–41
Holman T (2007) 5.1 Surround sound, up and running. Music technology series, 2nd edn. Focal Press, Oxford
ITU-R (2012) Recommendation BS.775-3, multi-channel stereophonic sound system with or without accompanying picture. International Telecommunication Union Radiocommunication Assembly, Geneva, Switzerland
Iwamiya S, Zhan M (1997) A comparison between Japanese and Chinese adjectives which express auditory impressions. J Acoust Soc Jpn (E) 18(6):319–323
Karampourniotis A, Kim S, Ko D, King R, Leonard B (2014) Significance of height loudspeaker positioning for perceived immersive sound field reproduction. J Acoust Soc Am 135:2282
Kim S, DeFrancisco M, Walker K, Marui A, Martens WL (2006) An examination of the influence of musical selection on listener preferences for multichannel microphone technique. In: Proceedings of audio engineering society 28th international conference on the future of audio technology—surround sound and beyond, Piteå, Sweden. AES
Kim S, Ko D, Nagendra A, Woszczyk W (2013) Subjective evaluation of multichannel sound with surround-height channels. In: Proceedings of audio engineering society 135th international convention, New York, USA. AES
Kim S, Walker K, Martens WL (2007) Cross-cultural descriptive analysis of multichannel auditory imagery: a comparison of Japanese and English adjectives. In: Proceedings of the 13th regional convention of AES, Tokyo, Japan. AES
Koichiro H, Hiyama S, Hamasaki K (2002) The minimum number of loudspeakers and its arrangement for reproducing the spatial impression of diffuse sound field. In: Proceedings of audio engineering society 113th international convention. AES
Martens WL, Kim S (2007) Verbal elicitation and scale construction for evaluating perceptual differences between four multichannel microphone techniques. In: Proceedings of audio engineering society 122nd international convention, Vienna, Austria. AES
Martens WL, Kim S, Marui A (2008) Comparison of Japanese and English language descriptions of piano performances captured using popular multichannel microphone arrays. J Acoust Soc Am 123(5):3690
Namba S, Kuwano S, Hashimoto T, Berglund B, Rui ZD, Schick A, Hoege H, Florentine M (1991) Verbal expression of emotional impression of sound: a cross-cultural study. J Acoust Soc Jpn (E) 12(1):19–29
Olive SE (2004a) A multiple regression model for predicting loudspeaker preference using objective measurements: Part I—listening test results. In: Proceedings of audio engineering society 116th international convention, Berlin, Germany. AES. Preprint 6113
Olive SE (2004b) A multiple regression model for predicting loudspeaker preference using objective measurements: Part II—development of the model. In: Proceedings of audio engineering society 117th international convention, San Francisco, USA. AES. Preprint 6190
Olive SE, Toole FE (1989) The detection of reflections in typical rooms. J Audio Eng Soc 37(7):539–553
Olive SE, Welti T, McMullin E (2014) The influence of listeners’ experience, age, and culture on headphone sound quality preferences. In: Proceedings of audio engineering society 135th international convention, LA, USA. AES
Rumsey F (2001) Spatial audio, music technology series. Focal Press, Oxford
Varnum MEW, Grossmann I, Kitayama S, Nisbett RE (2010) The origin of cultural differences in cognition: evidence for the social orientation hypothesis. Curr Dir Psychol Sci 19(1):9–13
Woszczyk W, Ko D, Brett L, Benson D (2009) Selection and preparation of multichannel room impulse responses for interactive low-latency rendering of virtual rooms. In: Proceedings of the 16th international conference on sound and vibration, Kraków, Poland
Cite this article
Kim, S., King, R. & Kamekawa, T. A cross-cultural comparison of salient perceptual characteristics of height channels for a virtual auditory environment. Virtual Reality 19, 149–160 (2015). https://doi.org/10.1007/s10055-015-0269-1
DOI: https://doi.org/10.1007/s10055-015-0269-1
Keywords
- Height-channel perception
- Multichannel-reproduced virtual auditory environment
- Cross-cultural comparison