Multi-modal Perception

BT Technology Journal

Abstract

To design, optimise and deliver multimedia and virtual-reality products and services, it is necessary to match system performance to the capabilities of users. When a multimedia system is used, the presence of both audio and video stimuli introduces significant cross-modal effects (the sensory streams interact). This paper introduces a number of cross-modal interactions that are relevant to communications systems and discusses the advanced experimental techniques required to provide data for modelling multi-modal perception. The aim of the work is to provide a multi-modal perceptual model that can be used for performance assessment and incorporated into coding algorithms. Current and future applications of multi-modal modelling are also discussed.

Cite this article

Hollier, M.P., Rimell, A.N., Hands, D.S. et al. Multi-modal Perception. BT Technology Journal 17, 35–46 (1999). https://doi.org/10.1023/A:1009666623193
