A cross-cultural comparison of salient perceptual characteristics of height channels for a virtual auditory environment


This study compared the perceptual characteristics of virtual auditory environments as judged by three listener groups. To generate convincing and pleasing virtual auditory environments, acoustic impulse responses were measured in two venues using an innovative microphone array and convolved with two anechoic recordings. The convolved sound sources were then assigned to loudspeakers (five horizontal channels and four height channels), and the inter-channel level balances were optimized. The authors conducted a controlled listening test with two variables, height-channel configuration (eight conditions) and stimulus (four conditions: two musical selections times two target venues), to determine the influence of (1) these two control variables on the perceived appropriateness of the virtual auditory environments and (2) the cultural background of three listener groups composed of participants from Canada (group 1, 11 subjects), the USA (group 2, 12 subjects), and Japan (group 3, 14 subjects). The data analysis revealed that the configuration variable (the height position of the loudspeakers) had a greater influence on perceived appropriateness than the stimulus variable for all three groups. In addition, although the group 1 data showed a response pattern similar to that of group 2, the responses of group 3 differed. A subsequent analysis of the reported descriptors found that groups 1 and 2 judged height configurations that generated a “frontal” and “narrow” impression to be more appropriate virtual auditory environments, whereas group 3 judged configurations with the same characteristics to be less appropriate. Conversely, groups 1 and 2 described less appropriate auditory environments as having “wide, spacious, and surrounding” images, which group 3 described as more appropriate.
While room acoustics and loudspeaker size also contributed to the overall modulation of listeners’ judgment, the findings support the idea that cultural background affects perceptual responses to spatial sound and is therefore important in rendering a homogeneous experience of a virtual auditory environment for listeners in remote spaces.
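The rendering chain summarized in the abstract (convolving an anechoic recording with a measured impulse response for each loudspeaker channel, then trimming inter-channel levels) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the channel names, the toy sample values, and the -6 dB trim are all hypothetical, and a practical system would use FFT-based convolution on real measurements.

```python
def convolve(dry, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


def render_channels(dry, irs_by_channel, gains_db):
    """Convolve one anechoic source with each channel's impulse response
    and apply a per-channel level trim given in dB (0 dB if not listed)."""
    rendered = {}
    for name, ir in irs_by_channel.items():
        gain = 10.0 ** (gains_db.get(name, 0.0) / 20.0)
        rendered[name] = [gain * s for s in convolve(dry, ir)]
    return rendered


# Toy data standing in for an anechoic recording and two measured responses
# (hypothetical values and channel names, for illustration only).
dry = [1.0, 0.5]
irs = {"front_left": [1.0, 0.0, 0.25], "height_left": [0.0, 0.5]}
feeds = render_channels(dry, irs, {"height_left": -6.0})
```

Each entry in `feeds` is the signal for one loudspeaker. Changing the values in `gains_db` corresponds to adjusting the inter-channel level balance, here for a single hypothetical height channel.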

  1. In their book, Blesser and Salter (2006) use the term “aural architecture” for an auditory environment and “aural architect” for a listener who reports the aural attributes of a space.

  2. \(RT_{60}\) is the time it takes for the sound energy in a space to decay by 60 dB after the initial excitation; it is the standard metric for the reverberation time of a space.
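One common way to estimate \(RT_{60}\) from a measured impulse response is Schroeder backward integration followed by a line fit to the decay curve. The sketch below illustrates that general technique; the source does not say how the authors computed their values. The function names, the -5 to -25 dB fitting range, and the synthetic impulse response are assumptions made for illustration.

```python
import math


def schroeder_decay_db(ir):
    """Backward-integrated energy decay curve in dB, 0 dB at the first sample."""
    tail = []
    total = 0.0
    for sample in reversed(ir):
        total += sample * sample
        tail.append(total)
    tail.reverse()
    return [10.0 * math.log10(e / tail[0]) for e in tail if e > 0.0]


def estimate_rt60(ir, fs, start_db=-5.0, end_db=-25.0):
    """Fit a least-squares line to the decay between start_db and end_db,
    then extrapolate its slope to a 60 dB drop."""
    edc = schroeder_decay_db(ir)
    points = [(n / fs, level) for n, level in enumerate(edc)
              if end_db <= level <= start_db]
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(l for _, l in points) / len(points)
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in points)
             / sum((t - mean_t) ** 2 for t, _ in points))  # dB per second
    return -60.0 / slope


# Synthetic impulse response: an ideal exponential decay with a known RT60
# (hypothetical values, not data from the study).
fs = 1000
true_rt60 = 0.3
k = 3.0 * math.log(10.0) / true_rt60  # amplitude decay rate for -60 dB at true_rt60
ir = [math.exp(-k * n / fs) for n in range(fs)]
rt60 = estimate_rt60(ir, fs)
```

For this idealized decay the estimate should match `true_rt60` closely. Real room responses contain noise and non-exponential decay, so the choice of fitting range matters much more in practice.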


  • Alwin DF, Krosnick JA (1985) The measurement of values in surveys: a comparison of ratings and rankings. Public Opin Q 49(4):535–552

  • Anderson LM, Mulligan BE, Goodman LS (1984) Effects of vegetation on human response to sound. J Arboric 10(2):45–49

  • AURO Technologies (2013) AURO 3D listening formats. As of April 2014

  • Blesser B, Salter L (2006) Spaces speak, are you listening? Experiencing aural architecture. The MIT Press, Massachusetts

  • Feys J (2015) npIntFactRep: nonparametric interaction tests for factorial designs with repeated measures. As of April 2015

  • Giragama CNW, Martens WL, Herath S, Wanasinghe DR, Sabbir AM (2003) Relating multilingual semantic scales to a common timbre space - Part II. In: Proceedings of audio engineering society 115th international convention, New York, USA. AES. Preprint 5895

  • Hamasaki K, Hiyama K, Okumura R (2005) The 22.2 Multichannel Sound System and Its Application. In: Proceedings of audio engineering society 118th international convention, Barcelona, Spain. AES. Preprint 6406

  • Herrington JD (1996) Effects of music in service environments: a field study. J Serv Mark 10(2):26–41

  • Holman T (2007) 5.1 Surround sound, up and running. Music technology series, 2nd edn. Focal Press, Oxford

  • ITU-R (2012) Recommendation BS.775-3, multi-channel stereophonic sound system with or without accompanying picture. International telecommunications union radiocommunication assembly, Geneva, Switzerland

  • Iwamiya S, Zhan M (1997) A comparison between Japanese and Chinese adjectives which express auditory impressions. J Acoust Soc Jpn (E) 18(6):319–323

  • Karampourniotis A, Kim S, Ko D, King R, Leonard B (2014) Significance of height loudspeaker positioning for perceived immersive sound field reproduction. J Acoust Soc Am 135:2282

  • Kim S, DeFrancisco M, Walker K, Marui A, Martens WL (2006) An examination of the influence of musical selection on listener preferences for multichannel microphone technique. In: Proceedings of audio engineering society 28th international conference on the future of audio technology—surround sound and beyond, Piteå, Sweden. AES

  • Kim S, Ko D, Nagendra A, Woszczyk W (2013) Subjective evaluation of multichannel sound with surround-height channels. In: Proceedings of audio engineering society 135th international convention, New York, USA. AES

  • Kim S, Walker K, Martens WL (2007) Cross-cultural descriptive analysis of multichannel auditory imagery: a comparison of Japanese and English adjectives. In: Proceedings of the 13th regional convention of AES, Tokyo, Japan. AES

  • Koichiro H, Hiyama S, Hamasaki K (2002) The minimum number of loudspeakers and its arrangement for reproducing the spatial impression of diffuse sound field. In: Proceedings of audio engineering society 113th international convention. AES

  • Martens WL, Kim S (2007) Verbal elicitation and scale construction for evaluating perceptual differences between four multichannel microphone techniques. In: Proceedings of audio engineering society 122nd international convention, Vienna, Austria. AES

  • Martens WL, Kim S, Marui A (2008) Comparison of Japanese and English language descriptions of piano performances captured using popular multichannel microphone arrays. J Acoust Soc Am 123(5):3690

  • Namba S, Kuwano S, Hashimoto T, Berglund B, Rui ZD, Schick A, Hoege H, Florentine M (1991) Verbal expression of emotional impression of sound: a cross-cultural study. J Acoust Soc Jpn (E) 12(1):19–29

  • Olive SE (2004a) A multiple regression model for predicting loudspeaker preference using objective measurements: Part I - listening test results. In: Proceedings of audio engineering society 116th international convention, Berlin, Germany. AES. Preprint 6113

  • Olive SE (2004b) A multiple regression model for predicting loudspeaker preference using objective measurements: Part II - development of the model. In: Proceedings of audio engineering society 117th international convention, San Francisco, USA. AES. Preprint 6190

  • Olive SE, Toole FE (1989) The detection of reflections in typical rooms. J Audio Eng Soc 37(7):539–553

  • Olive SE, Welti T, McMullin E (2014) The influence of listeners’ experience, age, and culture on headphone sound quality preferences. In: Proceedings of audio engineering society 135th international convention, LA, USA. AES

  • Rumsey F (2001) Spatial audio, music technology series. Focal Press, Oxford

  • Varnum MEW, Grossmann I, Kitayama S, Nisbett RE (2010) The origin of cultural differences in cognition: evidence for the social orientation hypothesis. Curr Dir Psychol Sci 19(1):9–13

  • Woszczyk W, Ko D, Brett L, Benson D (2009) Selection and preparation of multichannel room impulse responses for interactive low-latency rendering of virtual rooms. In: Proceedings of the 16th international conference on sound and vibration, Kraków, Poland

Corresponding author

Correspondence to Sungyoung Kim.

Cite this article

Kim, S., King, R. & Kamekawa, T. A cross-cultural comparison of salient perceptual characteristics of height channels for a virtual auditory environment. Virtual Reality 19, 149–160 (2015).

  • Height-channel perception
  • Multichannel-reproduced virtual auditory environment
  • Cross-cultural comparison