A model-free voting approach for integrating multiple cues
Computer vision systems, such as “seeing” robots, that are meant to function robustly in a natural, information-rich environment benefit from relying on multiple cues. The problem of integrating these cues then becomes central. Existing approaches to cue integration have typically been based on physical and mathematical models for each cue, and have used estimation and optimization methods to fuse the parameterizations of these models.
In this paper we consider an approach to fusion that does not rely on the underlying models for each cue. It is based on a simple binary voting scheme. A particular feature of such a scheme is that even incommensurable cues, such as intensity and surface orientation, can be fused directly. Another is that uncertainties, and the need to normalize them, are avoided. Instead, consensus among several cues is considered non-accidental and is used as support for hypotheses about whatever structure is sought. It is shown that only a small set of cues needs to agree to obtain a reliable output.
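As an illustration only, the binary voting idea described above can be sketched as follows. This is not the authors' implementation: the function name `vote`, the grid representation, the example cue maps, and the threshold `min_agree` are all hypothetical. The sketch assumes that each cue has already been reduced to a yes/no vote per image cell, which is what allows incommensurable cues to be combined without any shared scale or uncertainty estimate.

```python
from typing import List

BinaryMap = List[List[bool]]


def vote(cue_maps: List[BinaryMap], min_agree: int) -> BinaryMap:
    """Mark each cell where at least `min_agree` cues cast a 'yes' vote.

    Each cue map is a grid of booleans (hypothetical representation):
    True means the cue supports the hypothesis at that cell.
    """
    if not cue_maps:
        raise ValueError("need at least one cue map")
    rows, cols = len(cue_maps[0]), len(cue_maps[0][0])
    for m in cue_maps:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all cue maps must share the same shape")
    # Count the 'yes' votes per cell and compare against the threshold.
    return [
        [sum(1 for m in cue_maps if m[r][c]) >= min_agree for c in range(cols)]
        for r in range(rows)
    ]


# Three made-up cue maps on a 2x3 grid (illustrative data, not from the paper).
cue_a = [[True, True, False], [False, True, False]]
cue_b = [[True, False, False], [False, True, True]]
cue_c = [[False, True, False], [False, True, False]]

consensus = vote([cue_a, cue_b, cue_c], min_agree=2)
```

With these inputs, a cell is accepted only where at least two of the three cues agree; a cell supported by a single cue is rejected, however strong that cue's own evidence might be. The threshold of two is an arbitrary choice for the example.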
We apply the proposed technique to finding instances of planar surfaces in binocular images, without resorting to scene reconstruction or segmentation. The results are, of course, not comparable to the best results that can be obtained by complete scene reconstruction. However, they provide the most obvious instances of planes even with rather crude assumptions and coarse algorithms. Even though the precise extent of the planar patches is not derived, good overall hypotheses are obtained.
Our work applies voting schemes beyond earlier attempts and approaches the cue integration problem in a novel manner. Although further research is needed to establish the full applicability of our technique, our results so far seem quite useful.
Keywords: cue integration, grouping and segmentation, consensus, voting, model-free