Patch-based detection of dynamic objects in CrowdCam images

Abstract

A scene can be divided into two parts: static regions, which admit no motion, and dynamic regions, which correspond to moving objects. In this work, we tackle the challenging task of identifying dynamic objects in CrowdCam images. Our approach exploits the coherency of natural images and the epipolar geometry between pairs of images to achieve this objective. It does not require a dynamic object to be present in all of the given images. We show that the proposed approach obtains state-of-the-art accuracy on standard datasets.
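The epipolar constraint underlying the approach can be illustrated with a minimal sketch (this is not the authors' full patch-based pipeline, only the basic geometric test): a correspondence belonging to the static scene must satisfy x2ᵀ F x1 ≈ 0, i.e., the point in the second image lies on the epipolar line of its match. A large point-to-line residual flags a candidate dynamic point. The fundamental matrix and point coordinates below are hypothetical, chosen for a rectified stereo pair where epipolar lines are horizontal.

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance (in pixels) of point x2 from the epipolar line F @ x1.

    x1, x2: homogeneous image points, shape (3,).
    """
    l = F @ x1                          # epipolar line [a, b, c] in the second image
    return abs(l @ x2) / np.hypot(l[0], l[1])

# Hypothetical fundamental matrix of a rectified pair: a point maps to the
# horizontal epipolar line through the same row in the other image.
F = np.array([[0., 0.,  0.],
              [0., 0., -1.],
              [0., 1.,  0.]])

# A static point stays on its epipolar line (same row); a moving object's
# projection generally drifts off it.
static_match  = (np.array([100., 50., 1.]), np.array([120., 50., 1.]))
dynamic_match = (np.array([100., 50., 1.]), np.array([120., 80., 1.]))

d_static  = epipolar_distance(F, *static_match)   # ~0: consistent with the static scene
d_dynamic = epipolar_distance(F, *dynamic_match)  # 30 px off the line: candidate dynamic point
```

Note that the converse does not hold: a point moving along its own epipolar line yields a small residual, which is one reason a purely geometric test must be combined with appearance cues such as patch coherency.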

Figs. 1–12

Acknowledgements

We thank Dr. Arka Chattopadhyay for his assistance in revising the manuscript. Gagan Kanojia was supported by a TCS Research Scholarship.

Author information

Corresponding author

Correspondence to Gagan Kanojia.

About this article

Cite this article

Kanojia, G., Raman, S. Patch-based detection of dynamic objects in CrowdCam images. Vis Comput 35, 521–534 (2019). https://doi.org/10.1007/s00371-018-1480-3

Keywords

  • Object detection
  • Dynamic objects
  • Epipolar geometry