
Synthetic Feature Analysis

  • Scott Krig

Abstract

This appendix provides analysis of several common detectors against the synthetic feature alphabets described in Chapter 7. The complete source code, shell scripts, and the alphabet image sets are available from Springer Apress at: http://www.apress.com/source-code/ComputerVisionMetrics

Keywords

Interest Point; Rotational Invariance; Image Pyramid; Detector Behavior; Synthetic Feature
Figure A-1.

Example analysis results from Test 4, below. (Left) Annotated image showing detector locations. (Center) Count of each alphabet feature detected, shown as a 2D shaded histogram. (Right) Set of 2D shaded histograms for rotated image sets showing all 10 detectors

This appendix contains:
  • Background on the analysis, methodology, goals, and expectations.

  • Synthetic alphabet ground truth image summary.

  • List of detector parameters used for standard OpenCV methods: SIFT, SURF, BRISK, FAST, HARRIS, GFFT, MSER, ORB, STAR, SIMPLEBLOB. Note: No feature descriptors are computed or used; only the detector portions of BRISK, SURF, SIFT, ORB, and STAR are used in the analysis.

  • Test 1: Interest point alphabets.

  • Test 2: Corner point alphabets.

  • Test 3: Synthetic alphabet overlays onto real images.

  • Test 4: Rotational invariance of detectors against synthetic alphabets.

Background Goals and Expectations

The main goals for the analysis are:
  • To develop some simple intuition about human vs. machine detection of interest points and corners, to observe detector behavior on the synthetic alphabets, and to develop some understanding of the problems involved in designing and tuning feature detectors.

  • To measure detector anomalies among white, black, and gray versions of the alphabets. A human would recognize the same pattern easily whether or not the background and foreground are changed; however, detector design and parameter settings influence detector invariance to background and foreground polarity.

  • To measure detector sensitivity to slight pixel interpolation artifacts under rotation.

Note

Experienced practitioners with well-developed intuition regarding capabilities of interest point and corner detector methods may not find any surprises in this analysis.

The analysis uses several well-known detector methods as implemented in the OpenCV library; see Table A-1. The analysis provides detector information only, with no intention to compare detector goodness against any criteria. Details on which features from the synthetic alphabets are recognized by the various detectors are shown in summary tables, which count the number of times a feature is detected within each grid cell. For some applications, the synthetic interest point alphabet approach could be useful, assuming that an application-specific alphabet is designed and that detectors are designed and tuned for the application, such as a factory inspection application to identify manufactured objects or parts.
Table A-1.

Tuning Parameters for Detectors

| Detector | Tuning Parameters |
| --- | --- |
| BRISK | octaves = 3; threshold = 30 |
| FAST | threshold = 10; nonMaximalSuppression = TRUE |
| HARRIS | maxCorners = 60000 (to capture all detections); qualityLevel = 1.0; minDistance = 1; blockSize = 3; useHarrisDetector = TRUE; k = .04 |
| GFFT | maxCorners = 60000 (to capture all detections); qualityLevel = .01; minDistance = 1.0; blockSize = 3; useHarrisDetector = FALSE; k = .04 |
| MSER | Delta = 5; minArea = 60; maxArea = 14400; maxVariation = .25; minDiversity = .2; maxEvolution = 200; areaThreshold = 1.01; minMargin = .003; edgeBlurSize = 5 |
| ORB | WTA_K = 2; edgeThreshold = 31; firstLevel = 0; nFeatures = 60000 (to capture all detections); nLevels = 8; patchSize = 31; scaleFactor = 1.2; scoreType = 0 |
| SIFT | contrastThreshold = 4.0; edgeThreshold = 10.0; nFeatures = 0; nOctaveLayers = 3; sigma = 1.0 |
| STAR | maxSize = 45; responseThreshold = 30; lineThresholdProjected = 10; lineThresholdBinarized = 8 |
| SURF | Extended = 0; hessianThreshold = 100.0; nOctaveLayers = 3; nOctaves = 4; upright = 0 |
| SIMPLEBLOB | thresholdStep = 10; minThreshold = 50; maxThreshold = 220; minRepeatability = 2; minDistBetweenBlobs = 10; filterByColor = true; blobColor = 0; filterByArea = true; minArea = 25; maxArea = 5000; filterByCircularity = false; minCircularity = 0.8f; maxCircularity = std::numeric_limits<float>::max(); filterByInertia = true; minInertiaRatio = 0.1f; maxInertiaRatio = std::numeric_limits<float>::max(); filterByConvexity = true; minConvexity = 0.95f; maxConvexity = std::numeric_limits<float>::max() |

Test Methodology and Results

The images in the ground truth data set are used as input for a few modified OpenCV tests:
  • opencv_test_features2d (BRISK, FAST, HARRIS, GFFT, MSER, ORB, STAR, SIMPLEBLOB)

  • opencv_test_nonfree (SURF, SIFT)

The tuning parameters used for each detector are shown in Table A-1; see the OpenCV documentation for more information. Note: no attempt is made to tune the detector parameters for the synthetic alphabets. Parameter settings are reasonable defaults; however, the maximum keypoint feature count is bumped up in some cases to allow all the detected features to be recorded.

Each test produces a variety of results, including:
  1. Annotated images showing location and orientation (if provided) for detected features.

  2. Summary count of each detected synthetic feature across the grid in text files, including interest point coordinates, detector response strength, orientation if provided by the detector, and the number of total detected synthetic features found.

  3. 2D histograms showing bin count for each feature in the alphabet.

Detector Parameters Are Not Tuned for the Synthetic Alphabets

No feature detector tuning is attempted here. Why? In summary, feature detector tuning has very limited value in the absence of (1) a specific feature descriptor to use the keypoints, and (2) an intended application and use-cases. Some objections may be raised to this approach, since detectors are designed to be tuned and must be tuned to get best results for real applications. However, the test results herein are only a starting point, intended to allow for simple observations of detector behavior compared to human expectations.

In some cases, a keypoint is not suitable for producing a useful feature descriptor, even if the keypoint has a high score and high response. If the descriptor computed at the keypoint is too weak, the keypoint and corresponding descriptor should both be rejected. Each detector is designed to be useful for a different class of interest points, and tuned accordingly to filter the results down to a useful set of good candidates for a specific feature extractor.

Since we are not dealing with any specific feature descriptor methods here, tuning the keypoint detectors has limited value: detector parameter tuning in the absence of a specific feature descriptor is ambiguous. Furthermore, detector tuning will be different for each detector-descriptor pair, different for each application, and potentially different for each image.

Tuning detectors is not simple. Each detector has different parameters to tune for best results on a given image, and each image presents different challenges for lighting, contrast, and image pre-processing. For typical applications, detected keypoints are culled and discarded based on some filtering criteria. OpenCV provides several novel methods for tuning detectors; however, none are used here. The OpenCV tuning methods include:
  • The DynamicAdaptedFeatureDetector class tunes supported detectors using an adjusterAdapter() to keep only a limited number of features. It iterates the detector parameters several times and re-detects features to try to find the best parameters, keeping only the requested number of best features. Several OpenCV detectors have an adjusterAdapter() provided while some do not, and the API allows for adjusters to be created.

  • The AdjusterAdapter class implements the criteria for culling and keeping interest points. Criteria may include KNN nearest matching, detector response or strength, radius distance to the nearest other detected points, removal of keypoints for which a descriptor cannot be computed, or others.

  • The PyramidAdaptedFeatureDetector class can be used to adapt detectors that do not use a scale-space pyramid; this adapter creates a Gaussian pyramid and detects features over the pyramid.

  • The GridAdaptedFeatureDetector class divides an image into grids, and adapts the detector to find the best features within each grid cell.
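
The iterate-and-re-detect idea described for the dynamic adapter can be sketched in a few lines of Python. This is not OpenCV's code: the toy detector, the names `toy_detect` and `adapt_threshold`, and the 10 percent step rule are all hypothetical, chosen only to show one parameter being adjusted until the keypoint count lands in a requested range.

```python
def toy_detect(responses, threshold):
    """Hypothetical stand-in for a detector: keep responses above threshold."""
    return [r for r in responses if r > threshold]


def adapt_threshold(responses, min_count, max_count, threshold=30.0,
                    max_iters=20):
    """Re-run the detector, nudging the threshold until the number of
    keypoints falls in [min_count, max_count] or iterations run out."""
    keypoints = toy_detect(responses, threshold)
    for _ in range(max_iters):
        if len(keypoints) > max_count:
            threshold *= 1.1   # too many features: be stricter
        elif len(keypoints) < min_count:
            threshold *= 0.9   # too few features: be more permissive
        else:
            break
        keypoints = toy_detect(responses, threshold)
    # Keep only the strongest responses, best first.
    return threshold, sorted(keypoints, reverse=True)[:max_count]


# Made-up response strengths, for illustration only.
responses = [5, 12, 18, 25, 33, 41, 47, 52, 60, 75, 88, 93]
final_threshold, kept = adapt_threshold(responses, min_count=9, max_count=10)
print(final_threshold, kept)
```

The iteration cap matters: a fixed step can oscillate around the requested range, so a real adjuster would need a stopping rule of some kind.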

Expectations for Test Results

The reader should treat these tests as information only to develop intuition about feature detection. The test results do not prove the merits of any detector. Interpretation of the test results should be done with the following information in mind:
  1. One set of detector tuning parameters is used for all images, and detector results will vary widely based on tuning parameters. In fact, the parameters are deliberately set to over-sensitive values for ORB, SURF, and other detectors to generate the maximum number of possible keypoints that can be found.

  2. Sometimes an alphabet feature generates multiple detections; for example, a single corner alphabet feature may actually contain several corner features.

  3. The detection results may not be repeatable over the distribution of replicated features in the image feature grid. In other words, identical patterns, which look about the same to a human, are sometimes not recognized at different locations. Without looking in detail at each algorithm, it is hard to say what is happening.

  4. Detectors that use an image pyramid such as SIFT, SURF, ORB, STAR, and BRISK may identify keypoints in a scale space that are offset or in between the actual alphabet features. This is expected, since the detector is using features from multiple scales.

Summary of Synthetic Alphabet Ground Truth Images

The ground truth dataset is summarized here. Note that rotated versions of each image file in the set are provided from 0 to 90 degrees at 10-degree intervals. The 0-degree image in each set is 1024x1024 pixels, and the rotated images in each set are slightly larger to contain the entire rotated 1024x1024 pixel grid.
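
As a rough check on why the rotated images must be larger, the bounding box of a rotated rectangle can be computed directly. The sketch below is an assumption about the geometry only: the function name `rotated_canvas_size` is hypothetical, and the actual canvas sizes used in the dataset are not stated here and may differ.

```python
import math


def rotated_canvas_size(width, height, degrees):
    """Smallest axis-aligned canvas that contains a width x height image
    rotated by the given angle about its center."""
    theta = math.radians(degrees)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    # The small epsilon keeps floating-point noise from rounding 1024 up.
    return (math.ceil(width * c + height * s - 1e-9),
            math.ceil(width * s + height * c - 1e-9))


# Same angle sequence as the dataset: 0 to 90 degrees in 10-degree steps.
for degrees in range(0, 100, 10):
    w, h = rotated_canvas_size(1024, 1024, degrees)
    print(f"{degrees:2d} degrees -> {w} x {h}")
```

Under this formula the 0 and 90 degree cases stay at 1024 x 1024, and every intermediate angle needs a larger canvas.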

Synthetic Interest Point Alphabet

The synthetic interest point alphabet contains multiples of the 83 unique patterns, as shown in Figure A-2. A total of 7 x 7 sets of the 83 features fit within the 1024 x 1024 image. The total feature count for the image is 7 x 7 x 83 = 4067, with 7 x 7 = 49 instances of each feature. The features are laid out on a 14 x 14 pixel grid composed of 10 rows and 10 columns, including several empty grid locations. Gray image pixel values are 0x40 and 0xc0; black and white pixel values are 0x0 and 0xff.
Figure A-2.

Synthetic interest points

Synthetic Corner Point Alphabet

The synthetic corner point alphabet contains multiples of the 63 unique patterns, as shown in Figure A-3. A total of 8 x 12 sets of the 63 features fit within the 1024 x 1024 image. The total feature count is 8 x 12 x 63 = 6048, with 8 x 12 = 96 instances of each feature. Each feature is arranged on a grid of 14 x 14 pixel rectangles, including 9 rows and 6 columns of features. Gray image pixel values are 0x40 and 0xc0; black and white pixel values are 0x0 and 0xff.
Figure A-3.

Synthetic corner point
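
The set counts quoted for both alphabets follow from integer division of the image size by the size of one alphabet set. The sketch below reproduces that arithmetic. It assumes the sets are tiled from the image origin with no gaps or margins, which is not stated in the text, and `sets_that_fit` is a hypothetical helper name.

```python
IMAGE_SIZE = 1024   # pixels per side of the 0-degree image
CELL = 14           # pixels per side of one alphabet cell


def sets_that_fit(rows, cols, image_size=IMAGE_SIZE, cell=CELL):
    """Whole alphabet sets that fit (down, across) the image."""
    return image_size // (rows * cell), image_size // (cols * cell)


# Interest point alphabet: 10 rows x 10 columns of cells per set.
ip_down, ip_across = sets_that_fit(rows=10, cols=10)
ip_instances = ip_down * ip_across

# Corner point alphabet: 9 rows x 6 columns of cells per set.
cp_down, cp_across = sets_that_fit(rows=9, cols=6)
cp_instances = cp_down * cp_across
cp_total = cp_instances * 63   # 63 unique corner patterns

print(ip_down, ip_across, ip_instances)             # 7 7 49
print(cp_down, cp_across, cp_instances, cp_total)   # 8 12 96 6048
```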

Synthetic Alphabet Overlays

A set of images with the synthetic alphabets overlaid is provided, including rotated versions of each image, as shown in Figure A-4.
Figure A-4.

Synthetic alphabets overlaid on real images

Test 1: Synthetic Interest Point Alphabet Detection

Table A-2 provides the total detected synthetic interest points. Note: total detector counts include features computed at each scale of an image pyramid. For detectors that report feature detections at each level of an image pyramid, individual pyramid level detections are shown in Table A-3.
Table A-2.

Summary Count of Detected Features Found in the Synthetic Interest Point Alphabet, 0 degree Rotation

Table A-3.

Octave Count of Detected Features Found in the Synthetic Interest Point Alphabet, 0 degree Rotation

The total number of features detected in each alphabet cell is provided in summary tables from the annotated images. Note that several features may be detected within each 14x14 cell, and the detectors often provide non-repeatable results, which are discussed at the end of this appendix. The counts show the total number of alphabet features detected across the entire image, as shown in Figure A-5.
Figure A-5.

Annotated BRISK detector results. NOTE: there are several non-repeatability anomalies
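
The per-cell summary counts can be pictured as a simple binning step: each detected keypoint is assigned to the 14 x 14 cell it falls in, and cells at the same position within every alphabet set share one bin. The sketch below is a minimal illustration for the 0-degree interest point image. It assumes the grid starts at pixel (0, 0) with no margin, and `bin_detections` and the sample points are hypothetical, not taken from the published test code.

```python
from collections import Counter

CELL = 14               # alphabet cell size in pixels (from the text)
ROWS, COLS = 10, 10     # interest point alphabet layout (from the text)
SETS_X, SETS_Y = 7, 7   # alphabet sets across and down the image


def bin_detections(points):
    """Count detections per alphabet feature, keyed by (row, col) in a set."""
    counts = Counter()
    for x, y in points:
        gx, gy = int(x) // CELL, int(y) // CELL   # cell index in the image
        # Ignore points outside the area covered by whole alphabet sets.
        if 0 <= gx < COLS * SETS_X and 0 <= gy < ROWS * SETS_Y:
            counts[(gy % ROWS, gx % COLS)] += 1
    return counts


# Made-up keypoint coordinates (x, y), for illustration only.
points = [(3.0, 3.0), (143.5, 3.2), (20.0, 30.0), (1000.0, 5.0)]
counts = bin_detections(points)
print(dict(counts))
```

The first two points land on the same feature in two different alphabet sets and so share a bin; the last point lies outside the tiled area and is skipped.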

Annotated Synthetic Interest Point Detector Results

For ORB and SURF detectors, the annotated renderings using the drawKeypoints() function are too dense to be useful for visualization, but are included in the online test results.

The diameter of the circle drawn at each detected keypoint corresponds to the “diameter of the meaningful keypoint neighborhood,” according to the OpenCV KeyPoint class definition, which varies in size according to the image pyramid level where the feature was detected. Some detectors do not use a pyramid, so the diameter is always the same. The position of the detected features is normalized to the full resolution image, and all detected keypoints are drawn.
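
The relationship between pyramid level, keypoint position, and drawn diameter can be illustrated with a small sketch. OpenCV detectors already report positions in full-resolution coordinates, so this only shows the underlying arithmetic. The `LevelKeypoint` record, the function name, and the factor-of-two scale are assumptions for illustration; real detectors use their own per-level scale factors and may apply sub-pixel offsets that are ignored here.

```python
from dataclasses import dataclass


@dataclass
class LevelKeypoint:
    """Hypothetical record for a keypoint found at one pyramid level."""
    x: float       # column in the downsampled level image
    y: float       # row in the downsampled level image
    level: int     # pyramid level, 0 = full resolution
    patch: float   # neighborhood diameter in level pixels


def to_full_resolution(kp, scale_factor):
    """Scale position and neighborhood diameter back to level 0."""
    scale = scale_factor ** kp.level
    return kp.x * scale, kp.y * scale, kp.patch * scale


kp = LevelKeypoint(x=100.0, y=50.0, level=2, patch=31.0)
x, y, diameter = to_full_resolution(kp, scale_factor=2.0)
print(x, y, diameter)   # 400.0 200.0 124.0
```

A keypoint found two levels up therefore draws a circle four times wider than the same patch at full resolution, which is why circles in the annotated images vary in size.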

Entire Images Available Online

To better understand the detector results for each test, the entire image should be viewed to see the anomalies, such as where detectors fail to recognize identical patterns. Figure A-5 is an entire image showing BRISK detector results, while others are available online. Test results shown in Figures A-6 through A-15 only show a portion of the images.
Figure A-6.

SIMPLEBLOB detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white and gray images, color-coded tables

Figure A-7.

STAR detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-8.

GFFT detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-9.

MSER detector (black on white, white on black, and light gray on dark gray have no detected features)

Figure A-10.

ORB detector (annotations using default parameters not useful, images provided online), with results showing summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-11.

BRISK detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-12.

FAST detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-13.

HARRIS detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-14.

SIFT detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-15.

SURF detector (annotations using default parameters not useful, images provided online), with results showing summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Test 2: Synthetic Corner Point Alphabet Detection

Table A-4 provides the total detected synthetic corner points at all pyramid levels; some detectors do not use pyramids. Note: for detectors that report features separately over image pyramid levels, individual pyramid-level detections are shown in Table A-5.
Table A-4.

Summary Count of Detected Features Found in the Synthetic Corner Point Alphabet, 0 degree Rotation

Table A-5.

Octave Count of Detected Features Found in the Synthetic Corner Point Alphabet, 0 degree Rotation

Each feature exists within a 14x14 pixel region, and the total number of features detected in each cell is provided in summary tables with the annotated images. Note that several features may be detected within each 14 x 14 cell, and the detectors often provide non-repeatable results, which are discussed at the end of this appendix.

Annotated Synthetic Corner Point Detector Results

The annotated results for Test 2 follow the same approach as the interest point detector results in Test 1. As such, for ORB and SURF detectors, the annotated renderings using the drawKeypoints() function are too dense to be useful, but are included in the online test results.

The diameter of the circle drawn at each detected keypoint corresponds to the “diameter of the meaningful keypoint neighborhood,” according to the OpenCV KeyPoint class definition, which varies in size according to the image pyramid level where the feature was detected. Some detectors do not use a pyramid, so the diameter is always the same. The position of the detected features is normalized to the full resolution image, and all detected keypoints are drawn.

Entire Images Available Online

To better understand the detector results for each test, the entire image should be viewed to see the anomalies, such as where detectors fail to recognize identical patterns. Test results shown in Figures A-16 through A-25 only show a portion of the images.
Figure A-16.

SIMPLEBLOB detector (black on white is the only image with detected features), with results showing summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-17.

STAR detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-18.

GFFT detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-19.

BRISK detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-20.

FAST detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-21.

HARRIS detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-22.

SIFT detector, with results shown for a single alphabet grid set. (Top row) Gaussian and salt/pepper response. (Middle row) Black, white, and gray response. (Bottom row) Summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-23.

SURF detector (annotations using default parameters not useful, images provided online), with results showing summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-24.

ORB detector (annotations using default parameters not useful, images provided online), with results showing summary count of individual alphabet feature detections across all the alphabets in the grid, across each 1024x1024 image, black, white, and gray images, color-coded tables

Figure A-25.

MSER detector (black on white, white on black, and light gray on dark gray have no detected features)

Test 3: Synthetic Alphabets Overlaid on Real Images

Table A-6 provides the total detected synthetic features found in the test images of little girls, shown in Figure A-4. Note that only the 0-degree version is used (no rotations), and both the black versions and the white versions of each alphabet are overlaid. In general, the white feature overlays produce more interest point and corner point detections.
Table A-6.

Summary Count of Detected Features Found in the Synthetic Overlay Images of Little Girls

Annotated Detector Results on Overlay Images

Annotated images are available online.

Test 4: Rotational Invariance for Each Alphabet

This section provides results showing detector response under rotation, as a measure of rotational invariance, across the full 0 to 90 degree rotated image sets of black, white, and gray alphabets. Key observations:
  • Black on white, white on black: Rotational invariance is generally lower for the black and white images with the current set of detectors and parameters, mainly owing to (1) the maxima and minima values of 0x0 and 0xff used for pixel values, and (2) un-optimized detector tuning parameters. The detectors each seem to operate in a similar manner on images at orientations of 0 degrees and 90 degrees, which contain no rotational anti-aliasing artifacts on the alphabet patterns. For the other rotations of 10 to 80 degrees, however, pixel artifacts combine to reduce rotational invariance for these alphabet patterns, and each detector behaves differently.

  • Light gray on dark gray: Rotational invariance is generally better for the detectors using the reduced-range gray scale image alphabet sets using pixel values of 0x40 and 0xc0, rather than the full maxima and minima range used in the black and white image sets. The gray alphabet detector results generally show the most well-recognized alphabet characters under rotation. This may be due to the less pronounced local curvature of closer range gray values in the local region at the interest point or corner.

Methodology for Determining Rotational Invariance

The methodology for determining rotational invariance is illustrated in Figures A-26 through A-30, and summarized in pseudo-code as follows:

For (degree = 0; degree < 100; degree += 10)
        Rotate image (degree)
        For each detector (SURF, SIFT, BRISK, ...):
                Compute interest point locations
                Annotate rotated image showing interest point locations
                Compute bin count (# of times) each alphabet feature is detected
                Create bin count image: pixel value = bin count for each alphabet character
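
The binning steps of the pseudo-code can be sketched in Python without running any detector. The sketch below assumes the corner alphabet layout (9 rows, 6 columns, 14-pixel cells), a rotation about the image center onto an enlarged canvas, and a particular rotation direction; all function names and the sample detections are hypothetical, and the published test code may differ in each of these respects.

```python
import math

CELL, ROWS, COLS = 14, 9, 6   # corner alphabet layout (from the text)
SIZE = 1024                   # side of the unrotated image, in pixels
TILED_W = SIZE // (COLS * CELL) * COLS * CELL   # width covered by whole sets
TILED_H = SIZE // (ROWS * CELL) * ROWS * CELL   # height covered by whole sets


def canvas_side(degrees):
    """Side of the square canvas that holds the rotated image."""
    t = math.radians(degrees)
    return SIZE * (abs(math.cos(t)) + abs(math.sin(t)))


def rotate_point(x, y, degrees):
    """Map a point in the unrotated image onto the rotated canvas."""
    t = math.radians(degrees)
    c0, c1 = SIZE / 2.0, canvas_side(degrees) / 2.0
    dx, dy = x - c0, y - c0
    return (c1 + dx * math.cos(t) - dy * math.sin(t),
            c1 + dx * math.sin(t) + dy * math.cos(t))


def unrotate_point(x, y, degrees):
    """Inverse of rotate_point: rotated canvas back to the unrotated image."""
    t = math.radians(degrees)
    c0, c1 = SIZE / 2.0, canvas_side(degrees) / 2.0
    dx, dy = x - c1, y - c1
    return (c0 + dx * math.cos(t) + dy * math.sin(t),
            c0 - dx * math.sin(t) + dy * math.cos(t))


def bin_count_image(points, degrees):
    """ROWS x COLS image whose pixel values are bin counts scaled to 0-255."""
    counts = [[0] * COLS for _ in range(ROWS)]
    for x, y in points:
        ux, uy = unrotate_point(x, y, degrees)
        if 0 <= ux < TILED_W and 0 <= uy < TILED_H:
            counts[int(uy) // CELL % ROWS][int(ux) // CELL % COLS] += 1
    peak = max(max(row) for row in counts) or 1
    return [[round(255 * n / peak) for n in row] for row in counts]


# Made-up detections per rotation angle, in rotated-image coordinates.
detections = {0: [(7.0, 7.0), (91.0, 7.0)],
              30: [rotate_point(7.0, 7.0, 30)]}
images = {deg: bin_count_image(pts, deg) for deg, pts in detections.items()}
```

Scaling by the peak count is a display choice made here so that small counts remain visible; the figures described in this appendix apply their own lookup tables and contrast enhancement.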

Figure A-26.

Method of computing and binning detected alphabet features across rotated image sets, mocked-up SIFT data for illustration. (Left) Original image. (Center left) Rotated image annotated with detected points. (Center) Count of all detected points across entire image superimposed on alphabet cell regions. (Center right) Summary bin counts of detected alphabet features in grid cells. (Right) 2D histogram rendering of bin counts as an image; each pixel value is the bin count. Brighter pixels in the image have a higher bin count, meaning that the alphabet cell has a higher detection count

Figure A-27.

Group of 10 SIFT gray scale corner alphabet feature detection results displayed as a 2D histogram image, sepia LUT applied, with pixel values set to the histogram bin values. The histogram for each rotated image is shown here: left image = 0 degree rotation; left-to-right sequence: 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 degree rotations. Note that the histogram bin counts are computed across the entire image, summing all detections of each alphabet feature

Figure A-28.

(Left) Gray corner points 2D histogram bin images, left to right: 0 to 90 degree rotations, gray scale LUT applied. (Right) Light gray on dark gray interest point alphabet 2D histogram binning image, contrast enhanced, sepia LUT applied

Figures A-29 and A-30 show the summary bin counts of synthetic corner point and interest point detections, respectively, across 0 to 90 degree rotations. The ten columns in each image group show, left to right, the final bin counts for the 0 to 90 degree rotated images, displayed as images.
Figure A-29.

Summary bin counts of detected corner alphabet features displayed as a set of 6x9 pixel images, where each pixel value is the bin count. (Left 10 x 10 image group) Black on white corners. (Center 10 x 10 image group) Light gray on dark gray corners. (Right 10 x 10 image group) White on black corners. Note that the gray alphabets are detected with the best rotational invariance. The columns are left to right 0-90 degree rotations, and rows are top to bottom, SURF, SIFT, BRISK, FAST, HARRIS, GFFT, MSER, ORB, STAR, SIMPLEBLOB. Sepia LUT applied

Figure A-30.

Summary bin counts of detected interest point alphabet features displayed as a set of 10x10 pixel images, where each pixel value is the bin count. (Left 10 x 10 image group) Black on white interest points. (Center 10 x 10 image group) Light gray on dark gray interest points. (Right 10 x 10 image group) White on black interest points. Note that the gray alphabets are detected with the best rotational invariance. The columns are left to right 0-90 degree rotations, and rows are top to bottom, SURF, SIFT, BRISK, FAST, HARRIS, GFFT, MSER, ORB, STAR, SIMPLEBLOB. Sepia LUT applied

Analysis of Results and Non-Repeatability Anomalies

Complete analysis results are online, including annotated images showing detected keypoint locations and text files containing summary information on each detected keypoint.

Caveats

There are deliberate reasons why each interest point detector is designed differently; no detector may be considered superior in all cases by any absolute measure. A few arguments against loosely interpreting these test results are as follows:
  1. Unpredictability: Interest point detectors find features that are often unpredictable from the human visual system standpoint, and they are not restricted by design to the narrow boundaries of the synthetic interest points and corner points shown here. Often, the interest point detectors find features that a human would not choose.

  2. Pixel aliasing artifacts: The aliasing artifacts affect detection and are most pronounced for the rotated images using maxima and minima alphabets, such as black on white or white on black, and are less pronounced for light gray on dark gray alphabets.

  3. Scale space: Not all the detectors use scale space, and this is a critical point. For example, SIFT, SURF, and ORB use a scale-space pyramid in the detection process. The scale-space approach filters out synthetic alphabet features that are not visible in some levels of a scale-space pyramid.

  4. Binary vs. scalar values: FAST uses binary value comparisons in the detection process, while other methods use scalar values such as gradients. Binary value methods, such as FAST, will detect the same feature regardless of polarity or gray value range; however, scalar detectors based on gradients are more sensitive to pixel value polarity and pixel value ranges.

  5. Pixel region size: FAST uses a 7x7 patch to look for connected circle perimeter regions, while other features like SIFT, SURF, and ORB use larger pixel regions that bleed across alphabet grid cells, resulting in interest points being centered between alphabet features, rather than on them.

  6. Region shape: Features such as MSER and SIMPLEBLOB are designed to detect larger connected regions with no specific shape, rather than smaller local features such as the interest point alphabets. An affine-invariant detector, such as SIFT, may detect features in an oval or oblong region corresponding to affine scale and rotation transformations, while a non-affine detector, such as FAST, may only detect the same feature as a template in a circular or square region with some rotational invariance at scale.

  7. Offset regions from image boundary: Some detectors, such as ORB, SURF, and SIFT, begin detector computations at an offset from the image boundaries, so features are not computed across the entire image.

  8. Proven value: Each detector method used here has proved useful and valuable for real applications.

With these caveats in mind, the test results can be allowed to speak for themselves.
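
Caveat 4 can be illustrated with a simplified FAST-style segment test. The sketch below is not OpenCV's FAST: it omits non-maximal suppression and scoring, the 9-pixel arc length is an assumption (the common FAST-9 variant), and the patch builder and function names are hypothetical. The threshold of 10 matches Table A-1.

```python
# 16 offsets (dx, dy) around a radius-3 circle, in clockwise order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2),
          (-1, -3)]


def is_fast_corner(img, x, y, threshold=10, arc=9):
    """True if at least `arc` contiguous circle pixels are all brighter than
    the center by more than `threshold`, or all darker by more than it."""
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):                 # 1: brighter arc, -1: darker arc
        flags = [sign * (p - center) > threshold for p in ring]
        run = 0
        for flag in flags + flags:       # doubled list handles wrap-around
            run = run + 1 if flag else 0
            if run >= arc:
                return True
    return False


def corner_patch(fg, bg, size=7):
    """size x size patch whose lower-right quadrant, center included, is fg."""
    mid = size // 2
    return [[fg if (x >= mid and y >= mid) else bg for x in range(size)]
            for y in range(size)]


white_on_black = is_fast_corner(corner_patch(0xff, 0x00), 3, 3)
black_on_white = is_fast_corner(corner_patch(0x00, 0xff), 3, 3)
gray_on_gray = is_fast_corner(corner_patch(0xc0, 0x40), 3, 3)
flat = is_fast_corner(corner_patch(0x80, 0x80), 3, 3)
print(white_on_black, black_on_white, gray_on_gray, flat)
```

Because the test only asks whether each circle pixel differs from the center by more than the threshold, the same corner passes in all three polarity and range variants, while the flat patch does not. This holds only while the contrast stays above the threshold.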

Non-Repeatability in Tests 1 and 2

One interesting anomaly visible in Tests 1 and 2 appears in the annotated images, illustrating that detector results are not repeatable on the synthetic interest point and corner alphabets. In some cases, the non-repeatability is striking; see the annotated images for Tests 1 and 2. The expectation of a human is that identical interest points should be equally well recognized. Here are some observations:
  1. A human would recognize the same pattern easily whether or not the background and foreground are changed; however, some detectors do not have much invariance to extreme background and foreground polarity. The anomalies between detector behavior across white, black, and gray versions of the alphabets are less expected and harder to explain without looking deeper into each algorithm.

  2. Some detectors compute over larger region boundaries than the 14x14 alphabet grid, so detectors virtually ignore the alphabet feature grid and use adjacent pieces of alphabet features.

  3. Some detectors use scale space, so individual alphabet features are missed in some cases at higher scale levels, and detectors such as SIFT DoG use multiple scales together.

In summary, interest point detection and parameter tuning are analogous to image processing operators and their parameters: there are endless variations available to achieve the same goals. It is hoped that, by studying the test results here, intuition will be increased and new approaches can be devised.
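
The scale-space observation above can be made concrete by asking how large one 14 x 14 alphabet cell is at each pyramid level. The sketch below is arithmetic only: the factor-of-two, four-level pyramid is a hypothetical stand-in for an octave-based detector, while the 1.2 scale factor and 8 levels are the ORB settings from Table A-1.

```python
CELL = 14  # alphabet cell size in pixels at full resolution


def cell_size_per_level(levels, scale_factor, cell=CELL):
    """Side of one alphabet cell, in level pixels, at each pyramid level."""
    return [cell / scale_factor ** level for level in range(levels)]


octave_sizes = cell_size_per_level(levels=4, scale_factor=2.0)
orb_sizes = cell_size_per_level(levels=8, scale_factor=1.2)

print(octave_sizes)   # [14.0, 7.0, 3.5, 1.75]
print([round(s, 2) for s in orb_sizes])
```

Under these assumptions a cell shrinks to under two pixels by the fourth factor-of-two level, so a pattern drawn inside it can no longer be resolved as a distinct feature there.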

Other Non-Repeatability in Test 3

We note non-repeatability anomalies with Test 3 using little girl images with synthetic overlays, but there is less expectation of repeatability in this test. Some analysis of the differences between the positive (white) and negative (black) feature overlays can be observed in the annotated synthetic overlay images online.

Test Summary

Take-away analysis for all tests includes the following:
  1. Non-repeatability: There are some non-repeatability anomalies in detecting nearly identical features that differ only, under rotation, by local pixel interpolation artifacts. Some detectors also detect the black, white, and gray alphabets differently.

  2. Gray level alphabets (light gray on dark gray) are generally detected in the way most similar to human expectations. The results show that detectors, with the current tuning parameters, respond more uniformly across rotation with gray level patterns than with maxima black and white patterns.

  3. Tests of real images overlaid with synthetic alphabets provide interesting information for developing intuition about detector behavior, for illustration purposes only.

Future Work

Additional analysis should include devising and using alternative alphabets suited for a given type of application, including a larger range of pixel sizes and scales, especially alphabets with closer gray level value polarity, rather than extreme maxima and minima pixel values. Detector tuning should also be explored across the alphabets.

Copyright information

© Scott Krig 2014

Authors and Affiliations

  • Scott Krig (1)

  1. CA, US
