Abstract
In this paper we study the problem of detecting semantic objects from known categories in images. Unlike existing techniques, which operate at the pixel or patch level for recognition, we propose to rely on the categorization of image segments. Recent work has highlighted that image segments provide sound support for visual object class recognition. In this work, we use image segments as primitives to extract robust features and train detection models for a predefined set of categories. Several segmentation algorithms are benchmarked and their performance for segment recognition is compared. We then propose two methods for enhancing segment classification: one based on the fusion of the classification results obtained with the different segmentations, the other based on the optimization of the global labelling by correcting local ambiguities between neighboring segments. We use the Microsoft MSRC-21 image database as a benchmark and show that our method is competitive with the current state of the art.
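As a rough illustration of the first enhancement (fusing the classification results obtained with different segmentations), the sketch below averages per-segment class probabilities at each pixel and keeps the most probable class. The abstract does not specify the fusion rule, so the averaging scheme, the function name `fuse_segmentations`, and the data layout are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def fuse_segmentations(
    segment_maps: Sequence[List[List[int]]],
    segment_probs: Sequence[Dict[int, List[float]]],
) -> List[List[int]]:
    """Fuse several segment-level classifications into one pixel labelling.

    segment_maps[k][y][x] is the id of the segment covering pixel (x, y)
    in segmentation k; segment_probs[k][seg_id] is that segment's class
    probability vector. Averaging is an assumed fusion rule, not
    necessarily the one used in the paper.
    """
    height, width = len(segment_maps[0]), len(segment_maps[0][0])
    n_classes = len(next(iter(segment_probs[0].values())))
    n_segmentations = len(segment_maps)

    labels: List[List[int]] = []
    for y in range(height):
        row: List[int] = []
        for x in range(width):
            # Accumulate the mean class probabilities over all segmentations.
            fused = [0.0] * n_classes
            for seg_map, probs in zip(segment_maps, segment_probs):
                p = probs[seg_map[y][x]]
                for c in range(n_classes):
                    fused[c] += p[c] / n_segmentations
            # Keep the class with the highest fused probability.
            row.append(max(range(n_classes), key=fused.__getitem__))
        labels.append(row)
    return labels


# Toy example: a 1x3 image, two segmentations, two classes.
maps = [[[0, 0, 1]], [[0, 1, 1]]]
probs = [
    {0: [0.9, 0.1], 1: [0.4, 0.6]},
    {0: [0.8, 0.2], 1: [0.3, 0.7]},
]
fused_labels = fuse_segmentations(maps, probs)
```

In the toy example the middle pixel is covered by a confident class-0 segment in one segmentation and a class-1 segment in the other; averaging resolves the disagreement in favour of the more confident prediction.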
Cite this article
Vieux, R., Benois-Pineau, J., Domenger, JP. et al. Segmentation-based multi-class semantic object detection. Multimed Tools Appl 60, 305–326 (2012). https://doi.org/10.1007/s11042-010-0611-2
DOI: https://doi.org/10.1007/s11042-010-0611-2