Abstract
To design, optimise and deliver multimedia and virtual-reality products and services, it is necessary to match performance to the capabilities of users. When a multimedia system is used, the presence of audio and video stimuli introduces significant cross-modal effects (the sensory streams interact). This paper introduces a number of cross-modal interactions that are relevant to communications systems and discusses the advanced experimental techniques required to provide data for modelling multi-modal perception. The aim of the work is to provide a multi-modal perceptual model that can be used for performance assessment and can be incorporated into coding algorithms. The current and future applications of multi-modal modelling are discussed.
About this article
Cite this article
Hollier, M.P., Rimell, A.N., Hands, D.S. et al. Multi-modal Perception. BT Technology Journal 17, 35–46 (1999). https://doi.org/10.1023/A:1009666623193