Computer Vision

Introduction to Intelligent Robot System Design

Abstract

Human perception of the environment relies mainly on vision, which accounts for approximately 80% of the total acquired information. Computer vision technology has been developed with the aim of providing machines with the same visual ability as humans, as well as higher-precision ranging capabilities, so that machines can be applied in various practical systems. For example, in a vision inspection system, a camera captures an image of the inspected target. The signal is then transmitted to a dedicated image processing system, where it is converted into digital signals based on brightness, color, and other information. The image processing system performs various operations on these signals to extract visual features of the target, such as area, position, and quantity, thus achieving automatic recognition.
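The measurement step of such an inspection pipeline (binarize, then report each target's area, position, and the total count) can be illustrated with a minimal pure-Python sketch. The hand-made binary image, the 4-connectivity choice, and the helper name `find_blobs` are illustrative assumptions, not from the chapter; a real system would use a camera frame and a library such as OpenCV.

```python
# Sketch of the measurement step in a vision inspection system:
# given a binarized image (1 = target pixel), find 4-connected blobs
# and report each blob's area and centroid, plus the total count.

def find_blobs(img):
    """Label 4-connected regions of 1-pixels; return (area, centroid) per blob."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if img[y][x] == 1 and not seen[y][x]:
                # Flood-fill one blob with an explicit stack.
                stack, pixels = [(y, x)], []
                seen[y][x] = True
                while stack:
                    py, px = stack.pop()
                    pixels.append((py, px))
                    for ny, nx in ((py - 1, px), (py + 1, px),
                                   (py, px - 1), (py, px + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and img[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                area = len(pixels)
                row_c = sum(p[0] for p in pixels) / area
                col_c = sum(p[1] for p in pixels) / area
                blobs.append((area, (row_c, col_c)))
    return blobs

image = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
for area, (row_c, col_c) in find_blobs(image):
    print(f"area={area}, centroid=({row_c:.1f}, {col_c:.1f})")
print("count =", len(find_blobs(image)))
```

On this toy image the sketch finds two blobs (areas 4 and 3), i.e. the "area, position, quantity" features mentioned above.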


Notes

  1. https://opencv.org

  2. https://docs.opencv.org/2.4/modules/features2d/doc/feature_detection_and_description.html

     https://docs.opencv.org/4.x/dc/dc3/tutorial_py_matcher.html

References

  1. Zhang Z (2000) A flexible new technique for camera calibration. IEEE Trans Pattern Anal Mach Intell 22(11):1330–1334

  2. Lowe DG (1999) Object recognition from local scale-invariant features. In: Proceedings of the International Conference on Computer Vision (ICCV 1999), Corfu, Greece, pp 1150–1157

  3. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110

  4. Bay H, Tuytelaars T, Van Gool L (2006) SURF: speeded up robust features. In: Proceedings of the European Conference on Computer Vision (ECCV 2006), Graz, Austria, pp 404–417

  5. Calonder M, Lepetit V, Strecha C, Fua P (2010) BRIEF: binary robust independent elementary features. In: Proceedings of the European Conference on Computer Vision (ECCV 2010), Hersonissos, Greece, pp 778–792

  6. Rublee E, Rabaud V, Konolige K, Bradski GR (2011) ORB: an efficient alternative to SIFT or SURF. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV 2011), Barcelona, Spain

  7. Jian L, Yaozhen He X, Chen TL (2019) Improvement of ORB algorithm combining scale invariant features. Measurement and Control Technology 38(3):97–101, 107



Appendices

Further Reading

Familiarize yourself with the meaning of, and differences between, the three color models: RGB, YUV, and HSV (hue, saturation, value).

Color models such as RGB, HSV, and YUV are exploited in different fields of application in everyday life. Each color model has its own representation space and components, and a color can be transformed from one color space to another through standard formulas. Each model also has its own advantages and disadvantages.

The RGB color model, which composes a color additively from the three primary colors (red, green, and blue), is used in computer graphics, image processing and analysis, storage, and other fields. It has the following advantages:

  1. No transformation is required to display information on the screen; for this reason it is considered the base color space for many applications.

  2. It is used in video display because of its additive property.

  3. It is a computationally practical system.

The RGB color model has the following disadvantages:

  1. It is not well suited for object specification or color recognition.

  2. It is difficult to determine a specific color in the RGB model.

  3. RGB reflects the design of CRT displays, since it is a hardware-oriented system.

In the YUV model, the Y component refers to the luminance of the color, while the U and V components determine the color itself (chromaticity). It is used in TV broadcasting, video systems, etc. Its main advantage is the ability to decouple the luminance and color information, so the image can be processed without affecting the other color components. The YUV model has the following disadvantages:

  1. The color range is restricted in color TV images because of the information compression required for the displayed image.

  2. Due to the limitations of the YUV standard, an image displayed on a computer cannot be reproduced exactly on a TV screen.
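The decoupling of luminance and chrominance can be made concrete with the analog-form BT.601 conversion formulas, a standard RGB-to-YUV mapping. This is a minimal sketch; the function names are illustrative, and real video pipelines add offsets and clamping for digital ranges.

```python
# BT.601 (analog form) RGB <-> YUV conversion. Y carries luminance only;
# U and V carry chromaticity, so the two kinds of information are decoupled
# and can be processed independently.

def rgb_to_yuv(r, g, b):
    """Convert normalized RGB (0..1) to analog-form YUV."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue-difference chroma
    v = 0.877 * (r - y)                     # red-difference chroma
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Exact inverse of rgb_to_yuv."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

# For any gray pixel (r == g == b) the chroma components vanish,
# which shows that U and V encode color only, not brightness.
print(rgb_to_yuv(0.5, 0.5, 0.5))
# Round trip: YUV keeps all the information in the RGB pixel.
print(yuv_to_rgb(*rgb_to_yuv(0.8, 0.4, 0.2)))
```

Because the weights 0.299 + 0.587 + 0.114 sum to 1, a gray pixel maps to (Y, 0, 0), and the inverse mapping recovers the original RGB values up to floating-point error.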

In the HSV color model, H refers to the hue, which measures color purity; S indicates the saturation (the degree of white embedded in a specific color); and V denotes the value (brightness). HSV is used in computer vision and image analysis for segmentation, with the advantage that HSV colors are defined in a way that matches human perception, unlike RGB. However, achromatic points have undefined hue and are sensitive to value deviations of RGB, and the hue is unstable because of the angular nature of that component. Other aspects, such as the conversions between these models, can be found in the paper "Understanding color models: a review" by Ibraheem NA, Hasan MM, Khan RZ, et al. (2012).
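Why HSV suits segmentation can be sketched with Python's standard `colorsys` module: pixels of very different brightness still share one hue, so a single hue threshold selects them all. The predicate name `is_reddish` and the tolerance values are arbitrary assumptions for illustration, not from the chapter.

```python
import colorsys

def is_reddish(r, g, b, hue_tol=30 / 360, min_sat=0.4, min_val=0.2):
    """Segment by hue: accept pixels whose hue lies within hue_tol of red (h = 0).

    The saturation and value floors reject the near-achromatic pixels whose
    hue is unstable, as noted in the text.
    """
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue_dist = min(h, 1.0 - h)  # hue is circular, so red wraps around 1.0
    return hue_dist <= hue_tol and s >= min_sat and v >= min_val

# A bright red and a dark red differ greatly in RGB but share hue 0 in HSV,
# so one hue test classifies both as "red"; a green pixel is rejected.
print(is_reddish(0.9, 0.1, 0.1))    # bright red
print(is_reddish(0.4, 0.05, 0.05))  # dark red
print(is_reddish(0.1, 0.8, 0.1))    # green
```

Doing the same segmentation directly in RGB would require a threshold that somehow tracks brightness, which is exactly the "difficult to determine a specific color in RGB" disadvantage listed above.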

Exercises

  1.

    Based on the principles of camera calibration, without using the calibration toolkit, what are the key process functions for camera calibration using OpenCV library functions?

  2.

    Another common method of image transformation is affine transformation. What is the difference between affine transformation and perspective transformation?

  3.

    Given a set of corresponding points \( \mathbf{x}_i={\left({u}_i,{v}_i,1\right)}^{\mathrm{T}}\leftrightarrow \mathbf{x}_i^{\prime }={\left({u}_i^{\prime },{v}_i^{\prime },1\right)}^{\mathrm{T}}\ \left(i=1,2,\cdots, N\right) \) between two images and their camera projection matrices P and P′, try to derive the formula for calculating their corresponding three-dimensional space points.

  4.

    It is known that the four pairs of corresponding points \( \mathbf{x}_i={\left({u}_i,{v}_i,1\right)}^{\mathrm{T}} \) and \( \mathbf{X}_i={\left({X}_i,{Y}_i,1\right)}^{\mathrm{T}}\ \left(i=1,\cdots, 4\right) \) are the homogeneous coordinates of feature points on the image plane and the homogeneous coordinates of the corresponding points on the spatial plane, respectively, and s is an unknown nonzero scale factor. Try to calculate the homography matrix H from these four correspondences \( \mathbf{x}_i\leftrightarrow \mathbf{X}_i \).

  5.

    Compare and analyze several different feature extraction methods, feature description operators, and feature matching methods.

  6.

    What are the common feature detection methods other than those described in this book?

  7.

    Through experiments, quantitatively compare the differences in computational speed, rotational robustness, fuzzy robustness, and scale transformation robustness of SIFT, SURF, and ORB algorithms in extracting feature points.

  8.

    For two given sample images, explain what method can be used to stitch the two images together in their common area and how this can be achieved.

  9.

    What are the common image pre-processing methods? Given a noisy image \( f\left(x,y\right)=\left[\begin{array}{cccc}1& 1& 2& 2\\ 1& 1& \underline{9}& 2\\ 1& \underline{5}& 2& 2\\ 1& 1& 2& 2\end{array}\right] \), if the median filter template \( M=\left[\begin{array}{ccc}0& 1& 0\\ 1& 1& 1\\ 0& 1& 0\end{array}\right] \) is used to process the noise points (already marked), please write the de-noising result.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Peng, G., Lam, T.L., Hu, C., Yao, Y., Liu, J., Yang, F. (2023). Computer Vision. In: Introduction to Intelligent Robot System Design. Springer, Singapore. https://doi.org/10.1007/978-981-99-1814-0_8
