MOBSY: Integration of Vision and Dialogue in Service Robots

Zobel, Matthias; Denzler, Joachim; Heigl, Benno; Nöth, Elmar; Paulus, Dietrich; Schmidt, Jochen; Stemmer, Georg

doi:10.1007/3-540-48222-9_4

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2095)

Included in the following conference series:

International Conference on Computer Vision Systems

Abstract

MOBSY is a fully integrated autonomous mobile service robot system. It acts as an automatic, dialogue-based receptionist for visitors to our institute. MOBSY incorporates many techniques from different research areas into one working stand-alone system. From a pattern recognition point of view, the computer vision and dialogue aspects are of particular interest. In brief, the vision techniques involved range from object classification through visual self-localization and recalibration to object tracking with multiple cameras. The dialogue component has to deal with speech recognition, understanding, and answer generation. Further techniques needed are navigation, obstacle avoidance, and mechanisms to provide fault-tolerant behavior. This contribution introduces our mobile system MOBSY. In addition to the main aspects of vision and speech, we also focus on integration, at both the methodological and the technical level. We describe the task and the techniques involved. Finally, we discuss the experience we gained with MOBSY during a live performance at the 25th anniversary of our institute.

This work was supported by the “Deutsche Forschungsgemeinschaft” under grant SFB603/TP B2, C2 and by the “Bayerische Forschungsstiftung” under grant DIROKOL.

Author information

Authors and Affiliations

Lehrstuhl für Mustererkennung, Universität Erlangen-Nürnberg, Martensstr. 3, 91058, Erlangen, Germany
Matthias Zobel, Joachim Denzler, Benno Heigl, Elmar Nöth, Dietrich Paulus, Jochen Schmidt & Georg Stemmer

Editor information

Editors and Affiliations

ETH Zentrum, IFW B26.1, 8092, Zürich, Switzerland
Bernt Schiele
Technische Fakultät, Angewandte Informatik, Universität Bielefeld, Postfach 100 131, 33501, Bielefeld, Germany
Gerhard Sagerer

About this paper

Cite this paper

Zobel, M. et al. (2001). MOBSY: Integration of Vision and Dialogue in Service Robots. In: Schiele, B., Sagerer, G. (eds) Computer Vision Systems. ICVS 2001. Lecture Notes in Computer Science, vol 2095. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48222-9_4

DOI: https://doi.org/10.1007/3-540-48222-9_4
Published: 28 June 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42285-3
Online ISBN: 978-3-540-48222-2
eBook Packages: Springer Book Archive
