
Combining Audio and Video in Perceptive Spaces

  • Conference paper
Managing Interactions in Smart Environments

Abstract

Virtual environments have great potential in applications such as entertainment, animation by example, design interfaces, information browsing, and even expressive performance. In this paper we describe an approach to unencumbered, natural interfaces called Perceptive Spaces, with a particular focus on efforts to build truly multi-modal interfaces: interfaces that attend to both the speech and gestures of the user. The spaces are unencumbered because they use passive sensors that require no special clothing, and large-format displays that do not isolate the user from their environment. The spaces are natural because the open environment encourages active participation. Several applications illustrate the expressive power of this approach, as well as the challenges of designing such interfaces.
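The multi-modal interfaces the abstract describes pair speech with gesture in the spirit of Bolt's classic "put-that-there" interface. As a rough illustration only (not the authors' implementation — the function, data types, and timing heuristic below are all hypothetical), the following Python sketch resolves deictic words in a timed speech stream against the pointing gesture nearest in time:

```python
import bisect
from dataclasses import dataclass

@dataclass
class Gesture:
    t: float       # timestamp in seconds
    target: str    # object or location the user is pointing at

def resolve_deictics(words, gestures):
    """Replace deictic words ('that', 'there') in a timed word stream
    with the target of the pointing gesture closest in time.

    words:    list of (timestamp, word) pairs from a speech recognizer
    gestures: list of Gesture, sorted by timestamp, from a body tracker
    """
    times = [g.t for g in gestures]
    resolved = []
    for t, w in words:
        if w in ("that", "there") and gestures:
            i = bisect.bisect_left(times, t)
            # compare the two neighbouring gestures, keep the closer one
            candidates = gestures[max(0, i - 1): i + 1]
            best = min(candidates, key=lambda g: abs(g.t - t))
            resolved.append(best.target)
        else:
            resolved.append(w)
    return resolved

words = [(0.2, "put"), (0.5, "that"), (1.4, "there")]
gestures = [Gesture(0.6, "blue block"), Gesture(1.5, "table corner")]
print(resolve_deictics(words, gestures))
# ['put', 'blue block', 'table corner']
```

The key design point is that neither channel alone carries the full command: speech supplies the verb, gesture supplies the referents, and fusion happens by temporal alignment.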





Copyright information

© 2000 Springer-Verlag London Limited

About this paper

Cite this paper

Wren, C.R., Basu, S., Sparacino, F., Pentland, A.P. (2000). Combining Audio and Video in Perceptive Spaces. In: Nixon, P., Lacey, G., Dobson, S. (eds) Managing Interactions in Smart Environments. Springer, London. https://doi.org/10.1007/978-1-4471-0743-9_5

  • DOI: https://doi.org/10.1007/978-1-4471-0743-9_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-85233-228-0

  • Online ISBN: 978-1-4471-0743-9
