Fast Semantic Feature Extraction Using Superpixels for Soft Segmentation

Verma, Shashikant; Nagar, Rajendra; Raman, Shanmuganathan

doi:10.1007/978-981-15-4015-8_6

Shashikant Verma⁹,
Rajendra Nagar⁹ &
Shanmuganathan Raman⁹

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1147))

Included in the following conference series:

International Conference on Computer Vision and Image Processing

728 Accesses
1 Citations

Abstract

In this work, we address the problem of extracting high dimensional, soft semantic feature descriptors for every pixel in an image using a deep learning framework. Existing methods rely on a metric learning objective called multi-class N-pair loss, which requires pairwise comparison of positive examples (same class pixels) to all negative examples (different class pixels). Computing this loss for all possible pixel pairs in an image leads to a high computational bottleneck. We show that this huge computational overhead can be reduced by learning this metric based on superpixels. This also conserves the global semantic context of the image, which is lost in pixel-wise computation because of the sampling to reduce comparisons. We design an end-to-end trainable network with a loss function and give a detailed comparison of two feature extraction methods: pixel-based and superpixel-based. We also investigate hard semantic labeling of these soft semantic feature descriptors.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Brown, M., Lowe, D.G.: Invariant features from interest point groups. In: BMVC, vol. 4 (2002)
Google Scholar
Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 404–417. Springer, Heidelberg (2006). https://doi.org/10.1007/11744023_32
Chapter Google Scholar
Lindeberg, T.: Scale-Space Theory in Computer Vision, vol. 256. Springer, Boston (2013). https://doi.org/10.1007/978-1-4757-6465-9
Book MATH Google Scholar
Aksoy, Y., Oh, T.-H., Paris, S., Pollefeys, M., Matusik, W.: Semantic soft segmentation. ACM Trans. Graph. (TOG) 37(4), 72 (2018)
Article Google Scholar
Pan, J., Hu, Z., Su, Z., Lee, H.-Y., Yang, M.-H.: Soft-segmentation guided object motion deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 459–468 (2016)
Google Scholar
Chopra, S., Hadsell, R., LeCun, Y., et al.: Learning a similarity metric discriminatively, with application to face verification. In: CVPR, vol. 1, pp. 539–546 (2005)
Google Scholar
Sohn, K.: Improved deep metric learning with multi-class N-pair loss objective. In: Advances in Neural Information Processing Systems, pp. 1857–1865 (2016)
Google Scholar
Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815–823 (2015)
Google Scholar
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)
Google Scholar
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Google Scholar
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
Article Google Scholar
Liu, W., Wen, Y., Yu, Z., Yang, M.: Large-margin softmax loss for convolutional neural networks. In: ICML, vol. 2, no. 3, p. 7 (2016)
Google Scholar
Sun, Y., Chen, Y., Wang, X., Tang, X.: Deep learning face representation by joint identification-verification. In: Advances in Neural Information Processing Systems, pp. 1988–1996 (2014)
Google Scholar
Zhang, X., Zhou, F., Lin, Y., Zhang, S.: Embedding label structures for fine-grained feature representation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1114–1123 (2016)
Google Scholar
Aksoy, Y., Ozan Aydin, T., Pollefeys, M.: Designing effective inter-pixel information flow for natural image matting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 29–37 (2017)
Google Scholar
Singaraju, D., Vidal, R.: Estimation of alpha mattes for multiple image layers. IEEE Trans. Pattern Anal. Mach. Intell. 33(7), 1295–1309 (2010)
Article Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Google Scholar
Bertasius, G., Shi, J., Torresani, L.: High-for-low and low-for-high: efficient boundary detection from deep object features and its applications to high-level vision. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 504–512 (2015)
Google Scholar
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: Slic superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2274–2282 (2012)
Article Google Scholar
He, K., Sun, J., Tang, X.: Guided image filtering. IEEE Trans. Pattern Anal. Mach. Intell. 35(6), 1397–1409 (2012)
Article Google Scholar
Bickel, P., Diggle, P., Fienberg, S., Gather, U., Olkin, I., Zeger, S.: Springer Series in Statistics. Springer, New York (2009). https://doi.org/10.1007/978-0-387-77501-2
Book Google Scholar
Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., Torralba, A.: Scene parsing through ADE20k dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017)
Google Scholar
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Chapter Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
Google Scholar
Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2017)
Article Google Scholar
Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Google Scholar

Download references

Author information

Authors and Affiliations

Indian Institute of Technology Gandhinagar, Gandhinagar, Gujarat, India
Shashikant Verma, Rajendra Nagar & Shanmuganathan Raman

Authors

Shashikant Verma
View author publications
You can also search for this author in PubMed Google Scholar
Rajendra Nagar
View author publications
You can also search for this author in PubMed Google Scholar
Shanmuganathan Raman
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Shashikant Verma .

Editor information

Editors and Affiliations

Malaviya National Institute of Technology, Jaipur, Rajasthan, India
Neeta Nain
Malaviya National Institute of Technology, Jaipur, Rajasthan, India
Santosh Kumar Vipparthi
Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Balasubramanian Raman

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Verma, S., Nagar, R., Raman, S. (2020). Fast Semantic Feature Extraction Using Superpixels for Soft Segmentation. In: Nain, N., Vipparthi, S., Raman, B. (eds) Computer Vision and Image Processing. CVIP 2019. Communications in Computer and Information Science, vol 1147. Springer, Singapore. https://doi.org/10.1007/978-981-15-4015-8_6

Download citation

DOI: https://doi.org/10.1007/978-981-15-4015-8_6
Published: 29 March 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-4014-1
Online ISBN: 978-981-15-4015-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics