Binocular Stereo Vision

A chapter in the book 3-D Computer Vision

Abstract

The human visual system is a natural stereoscopic vision system that acquires 3-D information through binocular imaging.

In computer vision, stereo vision studies how to use (multi-image) imaging technology to obtain the distance (depth) information of objects in a scene from multiple images. This chapter introduces the workflow of stereo vision and analyzes, one by one, the six functional modules involved in the process. It then discusses region-based matching of binocular images: the principle of template (mask) matching is introduced first, followed by a detailed analysis of the various constraints used in stereo matching. Next, feature-based matching of binocular images is discussed: after the basic steps and methods are presented, the widely used Scale-Invariant Feature Transform (SIFT) is described in detail, and dynamic programming based on the ordering constraint is also covered. Finally, the chapter introduces a method for detecting and correcting errors in the parallax (disparity) map, which is notable for being both general and fast.
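
The six modules named in this workflow, and exercised in the self-test questions below, form a processing chain: camera calibration, image acquisition, feature extraction, stereo matching, 3-D information recovery, and post-processing. The following Python stub is a minimal sketch of that chain; every function name and placeholder value is hypothetical, chosen only to show the order of the modules and the data flow between them.

```python
# Hypothetical stubs: only the order and data flow follow the chapter's workflow.

def camera_calibration():
    """Module 1: relate image coordinates to space coordinates."""
    return {"intrinsics": None, "extrinsics": None}

def image_acquisition(params):
    """Module 2: capture the binocular image pair."""
    return None, None

def feature_extraction(image):
    """Module 3: extract the pixel sets/features used for matching."""
    return []

def stereo_matching(feats_left, feats_right):
    """Module 4: establish correspondences, yielding a parallax (disparity) map."""
    return {}

def recover_3d_information(disparity, params):
    """Module 5: turn disparity into distance (depth) via the imaging geometry."""
    return {}

def post_processing(depth):
    """Module 6: interpolate depth and detect/correct remaining errors."""
    return depth

def stereo_vision(scene=None):
    params = camera_calibration()
    left, right = image_acquisition(params)
    disparity = stereo_matching(feature_extraction(left), feature_extraction(right))
    return post_processing(recover_3d_information(disparity, params))
```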




Self-Test Questions

The following questions include both single-choice and multiple-choice questions, so every option must be judged individually.

6.1 Stereo Vision Process and Modules

6.1.1 In the stereo vision process shown in Fig. 6.1, (·).

(a) Image acquisition should be carried out on the basis of camera calibration.
(b) The function of the feature extraction module is to extract the features of the pixel sets used for matching.
(c) The depth interpolation in post-processing is intended to help stereo matching.
(d) Post-processing is needed because the 3-D information obtained is often incomplete or contains certain errors.

[Hint] Consider the sequence of the stereo vision process.

6.1.2 Consider the various modules in the stereo vision process given in Fig. 6.1, (·).

(a) The stereo matching module is used only when 3-D imaging can be performed directly.
(b) The feature extraction module can directly extract the gray values of pixel sets as features.
(c) The image acquisition module can directly acquire 3-D images to achieve 3-D information recovery.
(d) The function of the 3-D information recovery module is to establish the relationship between the image points of the same space point in different images.

[Hint] Consider the respective functions of the modules and the connections between them.

6.1.3 Which of the following description(s) is/are incorrect? (·).

(a) Although the positioning accuracy of large-scale features is poor, they contain a large amount of information and can be matched quickly.
(b) If only a single camera is used for image acquisition, there is no need for calibration.
(c) The gray values of pixels in a small region are closely correlated, so such regions are suitable for grayscale correlation matching.
(d) If the camera baseline is relatively short, the difference between the captured images will be relatively large.

[Hint] Analyze the meaning of each description carefully.

6.2 Region-Based Stereo Matching

6.2.1 In template matching, (·).

(a) The template used must be square.
(b) The size of the template used must be smaller than the size of the image to be matched.
(c) The matching positions determined by the correlation function and by the minimum mean square error function are consistent.
(d) The matching position calculated with the correlation coefficient does not change with the gray values of the template and the matched image.

[Hint] Matching consists in determining the most correlated position.
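
As a concrete illustration of the region-based matching this question targets, below is a minimal NumPy sketch of template matching with the zero-mean normalized cross-correlation coefficient. The score is invariant to linear changes of gray level, which is the property behind option (d); the function names are ours, not the book's.

```python
import numpy as np

def ncc(window, template):
    """Zero-mean normalized cross-correlation of two equal-size gray windows."""
    w = window - window.mean()
    t = template - template.mean()
    denom = np.sqrt((w * w).sum() * (t * t).sum())
    return (w * t).sum() / denom if denom > 0 else 0.0

def match_template(image, template):
    """Slide the template over the image; return the best offset and its score."""
    th, tw = template.shape
    ih, iw = image.shape
    best_score, best_pos = -1.0, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            score = ncc(image[y:y + th, x:x + tw], template)
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos, best_score
```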

6.2.2 Among the various constraints used for matching, (·).

(a) The epipolar line constraint restricts the position of the pixel.
(b) The uniqueness constraint restricts the attributes of pixels.
(c) The continuity constraint restricts the position of pixels.
(d) The compatibility constraint restricts the attributes of pixels.

[Hint] The attribute of the pixel corresponds to f, while the position corresponds to (x, y).

6.2.3 Among the following descriptions of the epipolar line constraint, (·).

(a) The epipolar line constraint can help reduce the amount of calculation in the matching search process by half.
(b) The epipolar line in one imaging plane and the epipole in the other imaging plane correspond to each other.
(c) The epipolar line pattern can provide information about the relative position and orientation of the two cameras.
(d) For any point on imaging plane 1, all points corresponding to it on imaging plane 2 lie on the same straight line.

[Hint] Refer to Example 6.2–Example 6.4.
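
Algebraically, the epipolar line constraint says that a correspondence x1 ↔ x2 satisfies x2ᵀ F x1 = 0, so all candidate matches of x1 lie on the line F x1 in the second image, reducing the matching search from 2-D to 1-D. A minimal NumPy sketch (helper names are ours):

```python
import numpy as np

def epipolar_line(F, x1):
    """Line coefficients (a, b, c) of a*u + b*v + c = 0 in image 2 on which
    every match of point x1 = (u1, v1) from image 1 must lie."""
    return F @ np.array([x1[0], x1[1], 1.0])

def epipolar_residual(F, x1, x2):
    """x2^T F x1; (near) zero iff the pair satisfies the epipolar constraint."""
    h1 = np.array([x1[0], x1[1], 1.0])
    h2 = np.array([x2[0], x2[1], 1.0])
    return float(h2 @ F @ h1)
```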

6.2.4 Comparing the essential matrix and the fundamental matrix, (·).

(a) The essential matrix has more degrees of freedom than the fundamental matrix.
(b) The role or function of the fundamental matrix and that of the essential matrix are similar.
(c) The essential matrix is derived for uncalibrated cameras.
(d) The fundamental matrix reflects the relationship between the projection coordinates of the same space point on the two images.

[Hint] Consider the different conditions under which the two matrices are derived.
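
A small sketch of the relationship the hint points to: the fundamental matrix F is defined for uncalibrated cameras (7 degrees of freedom), while the essential matrix E assumes known intrinsics (5 degrees of freedom); with intrinsic matrices K1 and K2, the two are related by E = K2ᵀ F K1. The helper names below are ours:

```python
import numpy as np

def essential_from_fundamental(F, K1, K2):
    """E = K2^T @ F @ K1: once the intrinsics are known, the fundamental
    matrix of the uncalibrated pair becomes the essential matrix."""
    return K2.T @ F @ K1

def project_to_essential(E):
    """A valid essential matrix has two equal singular values and one zero
    singular value; enforce this via SVD."""
    U, s, Vt = np.linalg.svd(E)
    sigma = (s[0] + s[1]) / 2.0
    return U @ np.diag([sigma, sigma, 0.0]) @ Vt
```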

6.3 Feature-Based Stereo Matching

6.3.1 For feature-based stereo matching techniques, (·).

(a) They are not very sensitive to the surface structure of the scene or to light reflection.
(b) The feature point pairs used are points determined according to local properties of the image.
(c) Every point in the stereo image pair can be used in turn as a feature point for matching.
(d) The matching result is not yet a dense parallax field.

[Hint] Consider the particularity of features.

6.3.2 The Scale-Invariant Feature Transform (SIFT) (·).

(a) Needs to use a multi-scale representation of the image.
(b) Needs to search for extreme values in a 3-D space.
(c) Here, the 3-D space includes position, scale, and direction.
(d) The difference-of-Gaussian operator it uses is a smoothing operator.

[Hint] Analyze the meaning of each computation step of the scale-invariant feature transform.
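
For reference, SIFT detection and matching with Lowe's ratio test can be run directly with OpenCV (opencv-python 4.4 or later, after the SIFT patent expired); this is a usage sketch, not the chapter's implementation:

```python
import cv2

def sift_match(img_left, img_right, ratio=0.75):
    """Detect SIFT keypoints in two grayscale images and keep the matches
    that pass Lowe's ratio test; returns matched point-coordinate pairs."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img_left, None)
    kp2, des2 = sift.detectAndCompute(img_right, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) < 2:
            continue
        m, n = pair
        if m.distance < ratio * n.distance:  # ratio test rejects ambiguous matches
            good.append((kp1[m.queryIdx].pt, kp2[m.trainIdx].pt))
    return good
```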

6.3.3 For the ordering constraint, (·).

(a) It indicates that feature points on the visible surface of an object are in the same order as their projection points on the two images.
(b) It can be used to design a stereo matching algorithm based on dynamic programming.
(c) It may not hold when there is occlusion between objects.
(d) When the graph representation used in the dynamic programming method is constructed, the interval between some feature points may degenerate into a single point, and the order determined by the constraint becomes invalid.

[Hint] Analyze the conditions under which the ordering constraint holds.
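
A minimal sketch of why the ordering constraint enables dynamic programming: if matched pixels appear in the same left-to-right order on both scanlines, then matching a scanline pair is an alignment problem, and each pixel is either matched in order or marked occluded. The cost model below (squared gray-level difference plus a fixed occlusion penalty) is an assumption for illustration, not the book's algorithm:

```python
import numpy as np

def dp_scanline_match(left_row, right_row, occlusion_cost=10.0):
    """Align two scanlines by dynamic programming under the ordering
    constraint; returns the matched column pairs (i, j) in order."""
    n, m = len(left_row), len(right_row)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, :] = occlusion_cost * np.arange(m + 1)   # leading occlusions
    cost[:, 0] = occlusion_cost * np.arange(n + 1)
    back = np.zeros((n + 1, m + 1), dtype=int)       # 0=match, 1=skip left, 2=skip right
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (float(left_row[i - 1]) - float(right_row[j - 1])) ** 2
            choices = (cost[i - 1, j - 1] + d,            # match pixel i-1 with j-1
                       cost[i - 1, j] + occlusion_cost,   # left pixel occluded
                       cost[i, j - 1] + occlusion_cost)   # right pixel occluded
            back[i, j] = int(np.argmin(choices))
            cost[i, j] = choices[back[i, j]]
    matches, i, j = [], n, m                         # backtrack the optimal path
    while i > 0 and j > 0:
        if back[i, j] == 0:
            matches.append((i - 1, j - 1)); i -= 1; j -= 1
        elif back[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return matches[::-1]
```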

6.4 Error Detection and Correction of the Parallax Map

6.4.1 In the method for parallax map error detection and correction, (·).

(a) Only regions where the crossing number is not zero need to be considered.
(b) The crossing number in a region is proportional to the size of the region.
(c) To calculate the total crossing number, two summations are performed.
(d) The crossing number in a region is proportional to the length of the region.

[Hint] Consider the definitions of, and the connection between, the crossing number and the total crossing number.
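
The following is a hedged reconstruction of the crossing-number idea, not the book's exact formulation: on one scanline, two matched points "cross" when their matches violate the ordering constraint; each point's crossing number counts how many points it crosses, and the total crossing number Ntc sums these over all points (hence the two summations the question asks about):

```python
def crossing_numbers(d):
    """d[x] is the parallax at column x of the right image on one scanline,
    so x matches column x + d[x] in the left image. Count, for each x, how
    many other points' matches cross its own."""
    n = len(d)
    c = [0] * n
    for x1 in range(n):
        for x2 in range(x1 + 1, n):
            # x1 < x2 but their matches violate the ordering constraint: a crossing.
            if x1 + d[x1] >= x2 + d[x2]:
                c[x1] += 1
                c[x2] += 1
    return c

def total_crossing_number(d):
    """Ntc: the per-point crossing numbers summed over the scanline."""
    return sum(crossing_numbers(d))
```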

6.4.2 Analyze the following statements; which is/are correct? (·).

(a) In a crossing region, the crossing values of adjacent points differ by 1.
(b) The zero-crossing correction algorithm must make Ntc = 0; hence its name.
(c) The sequential matching constraint refers to the ordering constraint, so it indicates that the order of the space points is the reverse of the order of their imaging points.
(d) The zero-crossing correction algorithm is iterative; after each iteration, the total crossing number always decreases.

[Hint] Analyze the meaning of each step of the zero-crossing correction algorithm.

6.4.3 In Example 6.8, Ntc = 28 before correction. Find a new matching point fL(187, j) that corresponds to fR(160, j) and can reduce Ntc; it corrects the parallax value d(160, j) corresponding to fR(160, j) to d(160, j) = X[fL(187, j)] − X[fR(160, j)] = 27. At this time, (·).

(a) Ntc = 16
(b) Ntc = 20
(c) Ntc = 24
(d) Ntc = 28

[Hint] The crossing number on the left side of the correction point fR(160, j) will decrease, but on the right side it may increase; specific calculations are needed.

6.4.4 On the basis of 6.4.3, find the point fR(161, j) with the largest crossing number, and determine the new matching point corresponding to fR(161, j) that can reduce Ntc. This correction can make the total crossing number Ntc drop to (·).

(a) 20
(b) 15
(c) 10
(d) 5

[Hint] The new matching point corresponding to fR(161, j) that can reduce Ntc is fL(188, j).


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Zhang, YJ. (2023). Binocular Stereo Vision. In: 3-D Computer Vision. Springer, Singapore. https://doi.org/10.1007/978-981-19-7580-6_6

  • DOI: https://doi.org/10.1007/978-981-19-7580-6_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-7579-0

  • Online ISBN: 978-981-19-7580-6
