A crowding multi-objective genetic algorithm for image parsing

  • Predictive Analytics Using Machine Learning
Neural Computing and Applications

Abstract

Image parsing is the process of understanding the contents of an image. It normally involves labeling the pixels or superpixels of a given image with the classes of objects that may appear in it. The labeling accuracy of existing methods still needs improvement, and a parsing method must be able to identify multiple instances of objects of different classes and sizes. In our previous work, a novel feature representation for an object instance in an image was proposed for object recognition and image parsing. The representation consists of the histogram vector of 2-grams of the visual word ids of the two successive clockwise neighbors of any superpixel in the object instance, together with the shape vector of the instance. Using this representation, an instance can be classified with very high accuracy by per-class support vector machines (SVMs). A multi-objective genetic algorithm was also proposed to find the subset of image segments that best constitutes an instance of a class of objects, i.e., the subset that maximizes both the SVM classification score and the size of the instance. However, that genetic algorithm can identify only a single instance for each class of objects, even though many instances of the same class may exist. In this paper, a crowding genetic algorithm is used instead to search for multiple optimal solutions and thereby alleviate this deficiency. The experimental results show that the crowding genetic algorithm outperforms both the previously proposed method and existing methodologies in terms of class-wise and pixel-wise accuracy. The qualitative results also show clearly that the method can identify multiple object instances in a given image.
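
The replacement step that lets a crowding genetic algorithm keep several optima alive can be illustrated with a minimal sketch. The abstract does not say which crowding variant the paper uses, so the code below assumes plain deterministic crowding combined with Pareto dominance over the two objectives named above (classification score and instance size). The segment areas, the two toy "objects", and the function `toy_evaluate` are hypothetical stand-ins for the SVM score and the real segmentation; they are not taken from the paper.

```python
from typing import Callable, Tuple

Mask = Tuple[int, ...]             # 1 = segment belongs to the candidate instance
Objectives = Tuple[float, float]   # (classification score, instance size)

# Hypothetical toy data: four segments forming two separate objects.
AREAS = (4.0, 3.0, 5.0, 2.0)
OBJECTS = ({0, 1}, {2, 3})


def toy_evaluate(mask: Mask) -> Objectives:
    """Stand-in for (SVM score, size): best overlap with a toy object, total area."""
    chosen = {i for i, bit in enumerate(mask) if bit}
    score = max(len(chosen & obj) / len(chosen | obj) for obj in OBJECTS)
    size = sum(AREAS[i] for i in chosen)
    return score, size


def hamming(a: Mask, b: Mask) -> int:
    """Number of segments on which two candidates disagree."""
    return sum(x != y for x, y in zip(a, b))


def dominates(p: Objectives, q: Objectives) -> bool:
    """Pareto dominance for maximisation: no worse on all objectives, better on one."""
    return all(x >= y for x, y in zip(p, q)) and any(x > y for x, y in zip(p, q))


def crowding_replace(
    parents: Tuple[Mask, Mask],
    children: Tuple[Mask, Mask],
    evaluate: Callable[[Mask], Objectives],
) -> Tuple[Mask, Mask]:
    """Deterministic crowding: each child competes only with its closest parent."""
    p1, p2 = parents
    c1, c2 = children
    # Pair children with parents so that the total Hamming distance is minimal.
    if hamming(p1, c1) + hamming(p2, c2) <= hamming(p1, c2) + hamming(p2, c1):
        pairs = [(p1, c1), (p2, c2)]
    else:
        pairs = [(p1, c2), (p2, c1)]
    # A child replaces its paired parent only if it Pareto-dominates that parent.
    survivors = [
        child if dominates(evaluate(child), evaluate(parent)) else parent
        for parent, child in pairs
    ]
    return survivors[0], survivors[1]


if __name__ == "__main__":
    parents = ((1, 0, 0, 0), (0, 0, 1, 0))
    children = ((1, 1, 0, 0), (0, 0, 1, 1))
    print(crowding_replace(parents, children, toy_evaluate))
```

Because each child competes only against the parent it most resembles, a candidate covering one object never displaces a candidate covering a different object, which is how a population can retain one solution per instance rather than converging to a single one.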

Acknowledgments

This work was supported by a research grant from the National Institute of Development Administration, Bangkok, Thailand. The authors thank the reviewers for their valuable comments, which helped improve this manuscript over successive revisions.

Author information

Corresponding author

Correspondence to Ferdin Joe John Joseph.

About this article

Cite this article

John Joseph, F.J., Auwatanamongkol, S. A crowding multi-objective genetic algorithm for image parsing. Neural Comput & Applic 27, 2217–2227 (2016). https://doi.org/10.1007/s00521-015-2000-2

  • DOI: https://doi.org/10.1007/s00521-015-2000-2
