Abstract
Image parsing is the process of understanding the contents of an image. It normally involves labeling the pixels or superpixels of a given image with the classes of objects that may appear in it. The labeling accuracy of existing methods still needs to be improved, and a parsing method must be able to identify multiple instances of objects of different classes and sizes. In our previous work, a novel feature representation for an object instance in an image was proposed for object recognition and image parsing. The representation consists of the histogram vector of 2-grams of the visual word ids of the two successive clockwise neighbors of any superpixel in the object instance, together with the shape vector of the instance. Using this representation, an instance can be classified with very high accuracy by per-class support vector machines (SVMs). A multi-objective genetic algorithm was also proposed to find the subset of image segments that best constitutes an instance of a class of objects, i.e., one that maximizes both the SVM classification score and the size of the instance. However, that genetic algorithm can identify only a single instance for each class of objects, even though many instances of the same class may exist. In this paper, a crowding genetic algorithm is used instead to search for multiple optimal solutions and thereby alleviate this deficiency. The experimental results show that the crowding genetic algorithm performs better than the previously proposed method and the existing methodologies in terms of class-wise and pixel-wise accuracy. The qualitative results also clearly show that the method can effectively identify multiple object instances in a given image.
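As an illustration of how crowding lets a genetic algorithm retain several optima at once, the following is a minimal sketch of a deterministic-crowding replacement step over binary masks that select image segments. It is not the implementation used in the paper: the names `crowding_step`, `toy_fitness` and `TARGETS` are hypothetical, the two objectives (SVM score and instance size) are collapsed into a single toy score, and the Hamming distance is an assumed similarity measure.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # 1 = segment belongs to the candidate instance (assumed encoding)

# Hypothetical stand-in for two object instances of the same class.
TARGETS: List[Mask] = [[1, 1, 1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1, 1, 1]]


def hamming(a: Mask, b: Mask) -> int:
    """Number of positions at which two masks differ."""
    return sum(x != y for x, y in zip(a, b))


def toy_fitness(mask: Mask) -> int:
    """Toy score with two optima; NOT the paper's SVM score or size objective."""
    return max(len(mask) - hamming(mask, t) for t in TARGETS)


def crossover(p1: Mask, p2: Mask, rng: random.Random) -> Tuple[Mask, Mask]:
    """Uniform crossover: each gene is swapped between parents with probability 0.5."""
    c1, c2 = [], []
    for x, y in zip(p1, p2):
        if rng.random() < 0.5:
            x, y = y, x
        c1.append(x)
        c2.append(y)
    return c1, c2


def mutate(mask: Mask, rate: float, rng: random.Random) -> Mask:
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in mask]


def crowding_step(pop: List[Mask], fitness: Callable[[Mask], float],
                  rng: random.Random, rate: float = 0.05) -> List[Mask]:
    """One generation: a child may only replace the parent it most resembles."""
    order = list(range(len(pop)))
    rng.shuffle(order)
    new_pop = list(pop)
    for i, j in zip(order[::2], order[1::2]):
        p1, p2 = pop[i], pop[j]
        c1, c2 = crossover(p1, p2, rng)
        c1, c2 = mutate(c1, rate, rng), mutate(c2, rate, rng)
        # Pair children with parents so that the total distance is smallest.
        if hamming(p1, c1) + hamming(p2, c2) > hamming(p1, c2) + hamming(p2, c1):
            c1, c2 = c2, c1
        # Local tournament: the child wins only if it is at least as fit.
        if fitness(c1) >= fitness(p1):
            new_pop[i] = c1
        if fitness(c2) >= fitness(p2):
            new_pop[j] = c2
    return new_pop
```

Because each child competes only against its most similar parent, individuals clustered around different optima do not displace one another, which is the property that allows several instances of one class to survive in the final population. Calling `crowding_step` repeatedly on a random population with `toy_fitness` is one way to try the sketch.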
Acknowledgments
This work was supported by a research grant from the National Institute of Development Administration, Bangkok, Thailand. The authors thank the reviewers for their valuable comments, which helped improve this manuscript across the revised submissions.
Cite this article
John Joseph, F.J., Auwatanamongkol, S. A crowding multi-objective genetic algorithm for image parsing. Neural Comput & Applic 27, 2217–2227 (2016). https://doi.org/10.1007/s00521-015-2000-2
DOI: https://doi.org/10.1007/s00521-015-2000-2