Linear Time Maximally Stable Extremal Regions
In this paper we present a new algorithm for computing Maximally Stable Extremal Regions (MSER), as invented by Matas et al. The standard algorithm makes use of a union-find data structure and takes quasi-linear time in the number of pixels. The new algorithm provides exactly identical results in true worst-case linear time. Moreover, the new algorithm uses significantly less memory and has better cache-locality, resulting in faster execution. Our CPU implementation performs twice as fast as a state-of-the-art FPGA implementation based on the standard algorithm.
The new algorithm is based on a different computational ordering of the pixels, suggested by a different immersion analogy from the one underlying the standard connected-component algorithm. With the new ordering, the pixels considered or visited at any point during computation form a single connected component of the image, resembling a flood-fill that adapts to the grey-level landscape. The computation needs only a priority queue of candidate pixels (the boundary of the single connected component), a one-bit-per-pixel image masking visited pixels, and information for at most as many components as there are grey-levels in the image. This is substantially more compact in practice than the standard algorithm, where a large number of connected components must be considered in parallel. The new algorithm can also generate the component tree of the image in true linear time. The result shows that MSER detection is not tied to the union-find data structure, which may open more possibilities for parallelization.
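The flood-fill ordering described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function name `extremal_region_areas`, the 4-connectivity, and the choice to record only a (grey-level, area) pair per component are assumptions made for brevity. It also uses a binary heap from `heapq` as the priority queue, which costs a logarithmic factor; a true linear-time version would presumably need a queue with constant-time operations, such as one bucket per grey-level.

```python
import heapq

def extremal_region_areas(img, start=(0, 0)):
    """Flood-fill an image (list of rows of integer grey-levels) from `start`
    and return a (grey_level, area) pair for each dark extremal region."""
    h, w = len(img), len(img[0])
    visited = [[False] * w for _ in range(h)]      # one-bit mask of visited pixels
    boundary = []                                  # heap of (level, y, x, next_edge)
    stack = [[float("inf"), 0]]                    # dummy sentinel; entries are [level, area]
    nodes = []
    offsets = ((0, 1), (1, 0), (0, -1), (-1, 0))   # 4-connectivity (assumption)

    y, x = start
    visited[y][x] = True
    level, edge = img[y][x], 0
    stack.append([level, 0])
    while True:
        # Explore the remaining edges of the current pixel.
        while edge < 4:
            ny, nx = y + offsets[edge][0], x + offsets[edge][1]
            edge += 1
            if 0 <= ny < h and 0 <= nx < w and not visited[ny][nx]:
                visited[ny][nx] = True
                nlevel = img[ny][nx]
                if nlevel >= level:
                    heapq.heappush(boundary, (nlevel, ny, nx, 0))
                else:
                    # Lower neighbour: postpone the current pixel and descend.
                    heapq.heappush(boundary, (level, y, x, edge))
                    y, x, level, edge = ny, nx, nlevel, 0
                    stack.append([level, 0])
        stack[-1][1] += 1                          # the current pixel is now flooded
        if not boundary:
            break
        new_level, y, x, edge = heapq.heappop(boundary)
        if new_level > level:
            # The water rises: finish components until the stack top reaches new_level.
            while True:
                top = stack[-1]
                nodes.append((top[0], top[1]))
                if new_level < stack[-2][0]:
                    top[0] = new_level             # same component, higher level
                    break
                stack.pop()
                stack[-1][1] += top[1]             # merge into the component below
                if new_level <= stack[-1][0]:
                    break
        level = new_level
    nodes.append(tuple(stack[-1]))                 # the root: the whole image
    return nodes
```

The stack holds grey-levels in strictly decreasing order from bottom to top, so it never contains more components than there are distinct grey-levels, which is the source of the memory bound mentioned above. A full detector would additionally keep an area history per component to evaluate stability; that part is omitted here.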
- 3. Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide baseline stereo from maximally stable extremal regions. In: British Machine Vision Conference, pp. 384–393 (2002)
- 6. Kadir, T., Zisserman, A., Brady, M.: An affine invariant salient region detector. In: European Conference on Computer Vision, pp. 404–416 (2004)
- 8. Lindeberg, T.: Feature detection with automatic scale selection. International Journal of Computer Vision 30(2), 77–116 (1998)
- 9. Triggs, B.: Detecting keypoints with stable position, orientation and scale under illumination changes. In: European Conference on Computer Vision, vol. 4, pp. 100–113 (2004)
- 11. Donoser, M., Bischof, H.: Efficient maximally stable extremal region (MSER) tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 553–560 (2006)
- 12. Brown, M., Lowe, D.: Invariant features from interest point groups. In: British Machine Vision Conference, pp. 656–665 (2002)
- 13. Harris, C., Stephens, M.: A combined corner and edge detector. In: Proceedings of the 4th Alvey Vision Conference, pp. 147–151 (1988)
- 14. Sivic, J., Zisserman, A.: Video Google: A text retrieval approach to object matching in videos. In: International Conference on Computer Vision, vol. 2, pp. 1470–1477 (2003)
- 15. Nistér, D., Stewénius, H.: Scalable recognition with a vocabulary tree. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 2161–2168 (2006)
- 16. Obdrzalek, S., Matas, J.: Object recognition using local affine frames on distinguished regions. In: British Machine Vision Conference, vol. 1, pp. 113–122 (2002)
- 17. Forssén, P.E.: Maximally stable colour regions for recognition and matching. In: IEEE Conference on Computer Vision and Pattern Recognition (2007)
- 18. Donoser, M., Bischof, H.: 3D segmentation by maximally stable volumes (MSVs). In: ICPR 2006: Proceedings of the 18th International Conference on Pattern Recognition, pp. 63–66 (2006)
- 19. Kristensen, F., MacLean, W.: FPGA real-time extraction of maximally-stable extremal regions. In: IEEE International Symposium on Circuits and Systems (2007)