
Multimodal Interface for Effective Man Machine Interaction

Part of the Media Business and Innovation book series (MEDIA)

Abstract

Bringing the naturalness of human–human communication to man–machine interaction remains a research challenge. It is widely believed that as computing, communication, and display technologies progress further, existing HCI techniques may become a constraint on the effective use of the available information flow. Multimodal interaction offers the user multiple modes of interfacing with a system beyond traditional keyboard and mouse input. This chapter examines the effectiveness of multimodal interaction for man–machine interaction and discusses implementation issues across various platforms and media. The convergence of input and output technologies can ease the difficulties humans face in communicating with machines and thereby make fuller use of converged media platforms. The chapter presents the implementation of a multimodal interface system through a case study, and closes with a discussion of challenging application areas that call for solutions of this kind.
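The kind of speech-plus-gesture interaction discussed above can be illustrated with a minimal late-fusion sketch. This is not the chapter's implementation: the event types, the fuse function, and the time-window heuristic are illustrative assumptions for combining time-stamped outputs of a speech recognizer and a gesture recognizer into one command.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical event types for two input modalities (names are illustrative).
@dataclass
class SpeechEvent:
    text: str          # recognized utterance, e.g. "move that there"
    timestamp: float   # seconds

@dataclass
class GestureEvent:
    kind: str                  # e.g. "point"
    target: Tuple[int, int]    # screen coordinates the user pointed at
    timestamp: float

def fuse(speech: SpeechEvent, gestures: List[GestureEvent],
         window: float = 1.5) -> Optional[dict]:
    """Late fusion: bind deictic words in the utterance to pointing gestures
    that occurred within a fixed time window, in temporal order."""
    words = speech.text.lower().split()
    if not words:
        return None
    # Crude assumption for the sketch: the first word is the action verb.
    command = {"action": words[0], "object": None, "location": None}
    deictics = [w for w in words if w in ("this", "that", "here", "there")]
    in_window = sorted(
        (g for g in gestures
         if g.kind == "point" and abs(g.timestamp - speech.timestamp) <= window),
        key=lambda g: g.timestamp)
    for word, gesture in zip(deictics, in_window):
        slot = "object" if word in ("this", "that") else "location"
        command[slot] = gesture.target
    return command

# Example: the utterance "move that there" accompanied by two pointing gestures.
speech = SpeechEvent("move that there", timestamp=10.0)
gestures = [GestureEvent("point", (120, 340), 10.2),
            GestureEvent("point", (640, 90), 10.9)]
print(fuse(speech, gestures))
# -> {'action': 'move', 'object': (120, 340), 'location': (640, 90)}
```

In this late-fusion style each recognizer runs independently and integration happens at the semantic level by aligning time-stamped events; the chapter's case study may integrate modalities differently.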

Keywords

  • Speech Recognition
  • Gesture Recognition
  • Automatic Speech Recognition
  • Hand Gesture
  • Parse Tree




Author information

Corresponding author

Correspondence to N. S. Sreekanth.


Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Sreekanth, N.S. et al. (2016). Multimodal Interface for Effective Man Machine Interaction. In: Lugmayr, A., Dal Zotto, C. (eds) Media Convergence Handbook - Vol. 2. Media Business and Innovation. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54487-3_14
