MOBSY: Integration of Vision and Dialogue in Service Robots

  • Conference paper
Computer Vision Systems (ICVS 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2095)

Abstract

MOBSY is a fully integrated autonomous mobile service robot system. It acts as an automatic, dialogue-based receptionist for visitors to our institute. MOBSY combines many techniques from different research areas into one working stand-alone system. From the point of view of pattern recognition, the computer vision and dialogue aspects are of main interest. In brief, the vision techniques involved range from object classification through visual self-localization and recalibration to object tracking with multiple cameras. The dialogue component has to handle speech recognition, understanding, and answer generation. Further techniques needed are navigation, obstacle avoidance, and mechanisms that provide fault-tolerant behavior. This contribution introduces our mobile system MOBSY. Besides the main aspects of vision and speech, we also focus on integration, at both the methodological and the technical level. We describe the task and the techniques involved. Finally, we discuss the experience we gained with MOBSY during a live performance at the 25th anniversary of our institute.

This work was supported by the “Deutsche Forschungsgemeinschaft” under grant SFB603/TP B2, C2 and by the “Bayerische Forschungsstiftung” under grant DIROKOL.


Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zobel, M. et al. (2001). MOBSY: Integration of Vision and Dialogue in Service Robots. In: Schiele, B., Sagerer, G. (eds) Computer Vision Systems. ICVS 2001. Lecture Notes in Computer Science, vol 2095. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48222-9_4

  • DOI: https://doi.org/10.1007/3-540-48222-9_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42285-3

  • Online ISBN: 978-3-540-48222-2

  • eBook Packages: Springer Book Archive
