Abstract
This paper presents an algorithm for the real-time computation of disparity from stereo video captured by a stereo webcam. The algorithm is designed to provide both real-time throughput and robust disparity estimation for real-world applications in which computation is limited to a pre-defined region of interest (ROI). Specifically, it is used as part of a hand-pair gesture recognition application, where disparity is computed for two ROIs around a hand pair identified by the segmentation component of the application. The algorithm provides the relative difference in disparity with respect to the background that the application requires, at high frame rates. Results obtained with an inexpensive commercial VGA stereo webcam show robust disparity computation at 20 ms/frame, enabling real-time hand-pair gesture recognition at 25 fps with a recognition rate above 90% for hand speeds of up to 40 cm/s and hand distances between 30 and 150 cm from the camera.
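The abstract does not spell out the matching procedure, so the following is only a generic illustration of why restricting disparity computation to an ROI reduces the work per frame. It is a minimal sketch using plain sum-of-absolute-differences (SAD) block matching on rectified grayscale images, not the authors' algorithm. The function names, the `(top, left, height, width)` ROI layout, and the `max_disp` and `half` parameters are assumptions made for this example.

```python
def sad(left, right, y, x, d, half):
    """Sum of absolute differences between the window centred at (y, x)
    in the left image and the window centred at (y, x - d) in the right image."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return total


def roi_disparity(left, right, roi, max_disp, half=1):
    """Return {(y, x): disparity} for pixels inside the ROI only.

    left, right: rectified grayscale images as lists of rows (assumed layout).
    roi: (top, left, height, width) in pixels (assumed layout).
    max_disp: largest horizontal shift to test, in pixels.
    half: half-width of the square matching window.
    """
    top, lft, height, width = roi
    rows, cols = len(left), len(left[0])
    result = {}
    # Clip the ROI so every matching window stays inside the image.
    for y in range(max(top, half), min(top + height, rows - half)):
        for x in range(max(lft, half), min(lft + width, cols - half)):
            best_d, best_cost = 0, None
            # Only test shifts that keep the right-image window in bounds.
            for d in range(0, min(max_disp, x - half) + 1):
                cost = sad(left, right, y, x, d, half)
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            result[(y, x)] = best_d
    return result
```

In this sketch the cost grows with the ROI area times the number of candidate shifts rather than with the full frame size, which is the general reason ROI-limited matching helps meet a per-frame time budget. Calling `roi_disparity` once per hand ROI would mirror the two-ROI setting described above.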
Acknowledgments
This project was sponsored by the Wireless Business Unit of Texas Instruments.
Cite this article
Mahotra, S., Patlolla, C. & Kehtarnavaz, N. Real-time computation of disparity for hand-pair gesture recognition using a stereo webcam. J Real-Time Image Proc 7, 257–266 (2012). https://doi.org/10.1007/s11554-011-0207-8
DOI: https://doi.org/10.1007/s11554-011-0207-8