Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation

  • Xinge Zhu
  • Hui Zhou
  • Ceyuan Yang
  • Jianping Shi
  • Dahua Lin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11211)

Abstract

Because annotations for real-world images (e.g., segmentation masks) are expensive and time-consuming to obtain, recent works in computer vision resort to synthetic data. However, performance on real images often drops significantly because of the domain shift between the synthetic data and the real images. In this setting, domain adaptation offers an appealing option. Effective domain adaptation approaches shape representations that (1) are discriminative for the main task and (2) generalize well under domain shift. To this end, we propose a novel loss function, the Conservative Loss, which penalizes the extremely good and bad cases while encouraging moderate examples. More specifically, it enables the network to learn features that are discriminative via gradient descent and invariant to the change of domains via gradient ascent. Extensive experiments on synthetic-to-real segmentation adaptation show that our proposed method achieves state-of-the-art results. Ablation studies give more insight into the properties of the Conservative Loss. Exploratory experiments and discussion demonstrate that the Conservative Loss is flexible rather than restricted to one exact form.
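
The abstract specifies only the qualitative shape of the Conservative Loss, not its formula. As one concrete illustration, a per-pixel objective can be made large at both confidence extremes and minimal at a moderate confidence level. The following is a minimal PyTorch sketch of that idea, not the paper's published definition; the function name and the balancing weight `mu` are assumptions introduced here:

```python
import torch
import torch.nn.functional as F

def conservative_loss_sketch(logits, target, mu=0.1, eps=1e-6):
    """Sketch of a loss with the shape described above: -log(pt) grows
    for confidently wrong pixels (extremely bad cases), while
    -mu * log(1 - pt) grows for over-confident correct pixels
    (extremely good cases). The minimum sits at a moderate
    pt = 1 / (1 + mu), about 0.91 for mu = 0.1."""
    # logits: (N, C, H, W) segmentation scores; target: (N, H, W) class ids
    log_p = F.log_softmax(logits, dim=1)
    # per-pixel log-probability assigned to the ground-truth class
    log_pt = log_p.gather(1, target.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp().clamp(eps, 1.0 - eps)
    return (-torch.log(pt) - mu * torch.log(1.0 - pt)).mean()
```

Gradient descent on such a loss pushes low-confidence pixels up while pushing over-confident pixels back toward the moderate zone, mirroring the descent/ascent behaviour the abstract describes.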

Acknowledgments

This work is partially supported by the Big Data Collaboration Research grant from SenseTime Group (CUHK Agreement No. TS1610626).

Supplementary material

Supplementary material 1: 474212_1_En_35_MOESM1_ESM.pdf (PDF, 12.3 MB)

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Xinge Zhu (1, corresponding author)
  • Hui Zhou (2)
  • Ceyuan Yang (1)
  • Jianping Shi (2)
  • Dahua Lin (1)
  1. CUHK-SenseTime Joint Lab, CUHK, Hong Kong S.A.R.
  2. SenseTime Research, Beijing, China
