Linear Time Maximally Stable Extremal Regions

  • David Nistér
  • Henrik Stewénius
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5303)


In this paper we present a new algorithm for computing Maximally Stable Extremal Regions (MSER), as invented by Matas et al. The standard algorithm makes use of a union-find data structure and takes quasi-linear time in the number of pixels. The new algorithm provides exactly identical results in true worst-case linear time. Moreover, the new algorithm uses significantly less memory and has better cache-locality, resulting in faster execution. Our CPU implementation performs twice as fast as a state-of-the-art FPGA implementation based on the standard algorithm.
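To make the union-find style of processing concrete, here is a minimal sketch, not the code of Matas et al. or of the authors. It sweeps the grey levels in increasing order, activates the pixels of each level, and merges them with already active 4-neighbours. The function name `threshold_component_counts`, the `levels` parameter, and the choice to report only the number of components per threshold are assumptions made for illustration. A real MSER detector would also track component areas and their stability.

```python
def threshold_component_counts(image, levels=256):
    """Count 4-connected components of {pixel <= t} for every grey level t.

    `image` is a list of rows of integers in range(levels). Illustrative only.
    """
    h, w = len(image), len(image[0])
    parent = list(range(h * w))   # union-find forest, one node per pixel
    active = [False] * (h * w)    # pixels at or below the current level

    def find(i):
        # Path halving: point each visited node at its grandparent.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Bucket the pixels by grey level (a counting sort over the levels).
    buckets = [[] for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            buckets[image[y][x]].append(y * w + x)

    counts = []
    n_components = 0
    for level in range(levels):
        for p in buckets[level]:
            active[p] = True
            n_components += 1     # the new pixel starts as its own component
            y, x = divmod(p, w)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and active[ny * w + nx]:
                    rp, rq = find(p), find(ny * w + nx)
                    if rp != rq:
                        parent[rp] = rq   # merge two components
                        n_components -= 1
        counts.append(n_components)
    return counts
```

Note that every pixel gets its own forest node up front, and many separate components can be alive at the same threshold, which is the memory and locality cost the abstract contrasts with the new algorithm.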

The new algorithm is based on a different computational ordering of the pixels, suggested by a different immersion analogy from the one underlying the standard connected-component algorithm. With the new ordering, the pixels considered or visited at any point during computation form a single connected component of the image, resembling a flood-fill that adapts to the grey-level landscape. The computation needs only a priority queue of candidate pixels (the boundary of the single connected component), a single-bit image that masks visited pixels, and information for as many components as there are grey-levels in the image. In practice this is substantially more compact than the standard algorithm, where a large number of connected components must be considered in parallel. The new algorithm can also generate the component tree of the image in true linear time. The result shows that MSER detection is not tied to the union-find data structure, which may open more possibilities for parallelization.
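The flood-fill-like ordering can be illustrated with a simplified sketch. It keeps only the two structures named in the abstract, a priority queue of boundary pixels and a visited mask, and returns the order in which pixels are processed. It is not the authors' algorithm: it omits the per-grey-level component records and the MSER stability test, and it always pops the lowest boundary pixel rather than following the paper's exact traversal. It also swaps in Python's `heapq` binary heap, which costs logarithmic time per operation, so it does not achieve the linear bound. The function name `flood_order` and the `start` parameter are assumptions.

```python
import heapq


def flood_order(image, start=(0, 0)):
    """Return pixels in the order a grey-level-adaptive flood fill visits them.

    `image` is a list of rows of integers. Illustrative only.
    """
    h, w = len(image), len(image[0])
    visited = [[False] * w for _ in range(h)]   # the visited-pixel mask
    sy, sx = start
    visited[sy][sx] = True
    boundary = [(image[sy][sx], sy, sx)]        # heap of (grey level, y, x)
    order = []
    while boundary:
        # Always continue from the lowest pixel on the boundary.
        _, y, x = heapq.heappop(boundary)
        order.append((y, x))
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not visited[ny][nx]:
                visited[ny][nx] = True          # mark when queued, not when popped
                heapq.heappush(boundary, (image[ny][nx], ny, nx))
    return order
```

Because a pixel enters the queue only as a neighbour of a pixel that has already been processed, the processed set stays one connected region throughout, in contrast to the many simultaneous components of the union-find sweep.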




  1. Tuytelaars, T., Gool, L.V.: Matching widely separated views based on affine invariant regions. International Journal of Computer Vision 59(1), 61–85 (2004)
  2. Schmid, C., Mohr, R., Bauckhage, C.: Evaluation of interest point detectors. International Journal of Computer Vision 37(2), 151–172 (2000)
  3. Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide baseline stereo from maximally stable extremal regions. In: British Machine Vision Conference, pp. 384–393 (2002)
  4. Mikolajczyk, K., Schmid, C.: Scale and affine invariant interest point detectors. International Journal of Computer Vision 60(1), 63–86 (2004)
  5. Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Gool, L.V.: A comparison of affine region detectors. International Journal of Computer Vision 65(1–2), 43–72 (2005)
  6. Kadir, T., Zisserman, A., Brady, M.: An affine invariant salient region detector. In: European Conference on Computer Vision, pp. 404–416 (2004)
  7. Lowe, D.: Distinctive image features from scale invariant keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)
  8. Lindeberg, T.: Feature detection with automatic scale selection. International Journal of Computer Vision 30(2), 77–116 (1998)
  9. Triggs, B.: Detecting keypoints with stable position, orientation and scale under illumination changes. In: European Conference on Computer Vision, vol. 4, pp. 100–113 (2004)
  10. Rothganger, F., Lazebnik, S., Schmid, C., Ponce, J.: 3D object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints. International Journal of Computer Vision 66(3), 231–259 (2006)
  11. Donoser, M., Bischof, H.: Efficient maximally stable extremal region (MSER) tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 553–560 (2006)
  12. Brown, M., Lowe, D.: Invariant features from interest point groups. In: British Machine Vision Conference, pp. 656–665 (2002)
  13. Harris, C., Stephens, M.: A combined corner and edge detector. In: Proceedings of the 4th Alvey Vision Conference, pp. 147–151 (1988)
  14. Sivic, J., Zisserman, A.: Video Google: A text retrieval approach to object matching in videos. In: International Conference on Computer Vision, vol. 2, pp. 1470–1477 (2003)
  15. Nistér, D., Stewénius, H.: Scalable recognition with a vocabulary tree. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 2161–2168 (2006)
  16. Obdrzalek, S., Matas, J.: Object recognition using local affine frames on distinguished regions. In: British Machine Vision Conference, vol. 1, pp. 113–122 (2002)
  17. Forssén, P.E.: Maximally stable colour regions for recognition and matching. In: IEEE Conference on Computer Vision and Pattern Recognition (2007)
  18. Donoser, M., Bischof, H.: 3D segmentation by maximally stable volumes (MSVs). In: ICPR 2006: Proceedings of the 18th International Conference on Pattern Recognition, pp. 63–66 (2006)
  19. Kristensen, F., MacLean, W.: FPGA real-time extraction of maximally-stable extremal regions. In: IEEE International Symposium on Circuits and Systems (2007)
  20. Vincent, L., Soille, P.: Watersheds in digital spaces: an efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis and Machine Intelligence 13, 583–598 (1991)
  21. Roerdink, J., Meijster, A.: The watershed transform: definitions, algorithms and parallelization strategies. Fundamenta Informaticae 41, 187–228 (2000)
  22. Couprie, M., Najman, L., Bertrand, G.: Quasi-linear algorithms for the topological watershed. Journal of Mathematical Imaging and Vision 22(2), 231–249 (2005)
  23. Tarjan, R.: Data Structures and Network Algorithms. SIAM, Philadelphia (1983)
  24. Gabow, H., Tarjan, R.: A linear-time algorithm for a special case of disjoint set union. Journal of Computer and System Sciences 30(2), 209–220 (1985)

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • David Nistér (1)
  • Henrik Stewénius (2)
  1. Microsoft Live Labs, USA
  2. Google, Switzerland
