Visual Person Localization with Dynamic Neural Fields: Towards a Gesture Recognition System
Any visually based interaction between people and acting systems in a real-world environment requires that the system can localize the user. This work addresses visual user localization for MILVA, the autonomous mobile robot system of our department. Because the system operates under real-world conditions, localization in particular needs techniques that are sufficiently robust. In our opinion, this requires combining several saliency components into a multi-cue approach consisting of structure- and color-based features.
This paper introduces one of these components: localization based on a typical contour shape. A simple contour shape prototype model consists of an arrangement of oriented filters that approximate, piece by piece, the upper outline (head and shoulders) of a frontally aligned person. Applying this filter arrangement at multiple resolutions yields a robust localization of frontally aligned persons, even in depth. The central problem of selecting the most promising (salient) image region is handled by a three-dimensional dynamic neural field that performs a dynamic winner-take-all (WTA) process [1,6].
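To make the selection step more concrete, the following is a minimal sketch of a winner-take-all process on a three-dimensional field, written as an Amari-type neural field with local excitation and global inhibition. It is an illustration under stated assumptions, not the implementation used on MILVA: the field equation, the axis layout (row, column, resolution level), the periodic boundary handling, and all names and parameter values (`wta_field`, `lateral_kernel_fft`, `w_exc`, `w_inh`, `h`, `sigma`) are hypothetical choices made for this example.

```python
import numpy as np

def lateral_kernel_fft(shape, sigma):
    """FFT of a periodic Gaussian excitation kernel with peak value 1."""
    # Signed circular offsets along each axis, e.g. 0, 1, 2, ..., -2, -1.
    offsets = [np.fft.fftfreq(n) * n for n in shape]
    grids = np.meshgrid(*offsets, indexing="ij")
    dist2 = sum(g ** 2 for g in grids)
    return np.fft.fftn(np.exp(-dist2 / (2.0 * sigma ** 2)))

def wta_field(saliency, steps=300, dt=0.1, tau=1.0, h=-0.1,
              w_exc=0.5, w_inh=2.5, sigma=1.0):
    """Euler integration of an assumed field equation:
    tau * du/dt = -u + h + input + local excitation - global inhibition.
    """
    u = np.zeros(saliency.shape, dtype=float)
    k_fft = lateral_kernel_fft(saliency.shape, sigma)
    for _ in range(steps):
        f = np.clip(u, 0.0, 1.0)  # piecewise-linear output function
        # Local excitation: circular convolution of the output with the kernel.
        local = np.real(np.fft.ifftn(np.fft.fftn(f) * k_fft))
        # Global inhibition: every active unit suppresses the whole field.
        du = -u + h + saliency + w_exc * local - w_inh * f.sum()
        u += (dt / tau) * du
    return u

# Toy saliency volume (hypothetical data): axes are row, column, resolution level.
saliency = np.zeros((8, 8, 3))
saliency[2, 2, 1] = 1.0  # strongest candidate region
saliency[6, 2, 1] = 0.8  # weaker competitor

u = wta_field(saliency)
winner = tuple(int(i) for i in np.unravel_index(np.argmax(u), u.shape))
print("winner (row, col, level):", winner)
print("active units:", int((u > 0).sum()))
```

With these illustrative parameters the inhibition outweighs the excitation, so the weaker candidate is pushed below zero and only the unit with the strongest input remains active. In a real system the input would be the filter responses at each position and resolution level rather than two isolated points, and the parameters would need tuning to the actual saliency range.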
Once a person has been localized successfully, a more detailed analysis of the gesture's meaning can begin: besides the recognition of static gestures, we also concentrate on the acquisition and, later, the recognition of dynamic gestures.
- Boehme, H.-J., Braumann, U.-D., Brakensiek, A., Corradini, A., Krabbes, M. and Gross, H.-M.: User Localization for Visually-based Human-Machine-Interaction. In Proc. of the Third IEEE Int. Conf. on Automatic Face and Gesture Recognition, pp. 486–491, 1998.
- Davis, J. W. and Bobick, A. F.: The Representation and Recognition of Action Using Temporal Templates. In Proc. of the IEEE Computer Society Conference on Comp. Vis. and Patt. Rec., pp. 928–934, 1997.
- Jähne, B.: Digital Image Processing. Springer-Verlag, 1995.
- Jones, J. P. and Palmer, L. A.: An Evaluation of the Two-Dimensional Gabor Filter Model of Simple Receptive Fields in Cat Striate Cortex. J. Neurophysiol., 58(6):1233–1258, 1987.
- Koffka, K.: Principles of the Gestalt Psychology. Brace & World, 1935.
- Oren, M., Papageorgiou, C., Sinha, P., Osuna, E., and Poggio, T.: Pedestrian Detection Using Wavelet Templates. In Proc. of the IEEE Computer Society Conference on Comp. Vis. and Patt. Rec., pp. 193–199, 1997.