
Segmentation-based multi-class semantic object detection

Multimedia Tools and Applications

Abstract

In this paper we study the detection of semantic objects from known categories in images. Unlike existing techniques, which operate at the pixel or patch level, we propose to rely on the categorization of image segments. Recent work has highlighted that image segments provide sound support for visual object class recognition. In this work, we use image segments as primitives from which to extract robust features and train detection models for a predefined set of categories. Several segmentation algorithms are benchmarked and their performance for segment recognition is compared. We then propose two methods for enhancing segment classification: one based on the fusion of the classification results obtained with the different segmentations, the other based on optimizing the global labelling by correcting local ambiguities between neighboring segments. We use the Microsoft MSRC-21 image database as a benchmark and show that our method is competitive with the current state of the art.
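
As an illustration of the first enhancement, the fusion of classification results from several segmentations can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the paper: the abstract does not specify the fusion rule, so the code simply sums, for each pixel, the class probabilities of the segment that covers it in each segmentation and keeps the best-scoring class. The function name `fuse_segmentations` and the data layout (one label map and one per-segment probability table per segmentation) are hypothetical.

```python
from typing import Dict, List, Sequence


def fuse_segmentations(
    label_maps: Sequence[List[List[int]]],
    segment_probs: Sequence[Dict[int, List[float]]],
) -> List[List[int]]:
    """Fuse per-segment class probabilities from several segmentations.

    label_maps[k][y][x] is the segment id of pixel (x, y) in segmentation k.
    segment_probs[k][s] is the class-probability vector of segment s in
    segmentation k (for example, calibrated classifier outputs).
    Assumption: fusion by summing probabilities per pixel, then arg max.
    """
    height, width = len(label_maps[0]), len(label_maps[0][0])
    n_classes = len(next(iter(segment_probs[0].values())))
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            scores = [0.0] * n_classes
            # Accumulate the vote of every segmentation for this pixel.
            for labels, probs in zip(label_maps, segment_probs):
                for c, p in enumerate(probs[labels[y][x]]):
                    scores[c] += p
            # Keep the class with the highest accumulated score.
            row.append(max(range(n_classes), key=scores.__getitem__))
        fused.append(row)
    return fused
```

With two segmentations that disagree on where a boundary lies, the fused label of a pixel near that boundary follows whichever segmentation classifies its segment more confidently.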


Notes

  1. http://www.caip.rutgers.edu/riul/research/code/EDISON/index.html


Author information

Corresponding author

Correspondence to Remi Vieux.

About this article

Cite this article

Vieux, R., Benois-Pineau, J., Domenger, JP. et al. Segmentation-based multi-class semantic object detection. Multimed Tools Appl 60, 305–326 (2012). https://doi.org/10.1007/s11042-010-0611-2

  • DOI: https://doi.org/10.1007/s11042-010-0611-2
