The Visual Computer, Volume 35, Issue 4, pp 521–534

Patch-based detection of dynamic objects in CrowdCam images

  • Gagan Kanojia
  • Shanmuganathan Raman
Original Article


A scene can be divided into two parts: static and dynamic. The parts of the scene that do not admit any motion are static regions, while moving objects correspond to dynamic regions. In this work, we tackle the challenging task of identifying dynamic objects present in CrowdCam images, i.e., images of a dynamic event captured by multiple photographers from varying viewpoints and at varying times. Our approach exploits the coherency present in natural images and utilizes the epipolar geometry between pairs of images to achieve this objective. It does not require a dynamic object to be present in all of the given images. We show that the proposed approach obtains state-of-the-art accuracy on standard datasets.
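The epipolar cue the abstract relies on can be illustrated with a minimal sketch (this is an illustration of the general epipolar constraint, not the paper's patch-based method): given the fundamental matrix F between two views, a correspondence of a static scene point satisfies x2ᵀ F x1 = 0, so a match lying far from its epipolar line suggests the point moved between the captures. Names below (`epipolar_distance`, the toy F) are illustrative assumptions.

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance of each x2 (image 2) from the epipolar line F @ x1.

    F  : 3x3 fundamental matrix between the two views.
    x1 : (N, 2) points in image 1; x2 : (N, 2) matched points in image 2.
    A static scene point satisfies x2^T F x1 = 0, so a large distance
    suggests the match belongs to a dynamic object.
    """
    h1 = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous coordinates
    h2 = np.hstack([x2, np.ones((len(x2), 1))])
    lines = h1 @ F.T                             # epipolar lines in image 2
    num = np.abs(np.sum(lines * h2, axis=1))     # |l . x2|
    den = np.hypot(lines[:, 0], lines[:, 1])     # line normalization
    return num / den

# Toy example: a pure horizontal camera translation gives F = [e]_x with
# e = (1, 0, 0), so epipolar lines are horizontal — a static match keeps
# its y-coordinate, while a vertical shift violates the constraint.
F = np.array([[0., 0.,  0.],
              [0., 0., -1.],
              [0., 1.,  0.]])
x1 = np.array([[10., 20.], [30., 40.]])
x2_static  = np.array([[15., 20.], [35., 40.]])   # y unchanged -> on the line
x2_dynamic = np.array([[15., 25.], [35., 48.]])   # y shifted   -> off the line
print(epipolar_distance(F, x1, x2_static))   # -> [0. 0.]
print(epipolar_distance(F, x1, x2_dynamic))  # -> [5. 8.]
```

Note the limitation the paper addresses goes beyond this sketch: a dynamic point moving *along* its epipolar line also satisfies the constraint, which is why pointwise residuals alone are insufficient and patch-level coherency is brought in.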


Keywords: Object detection · Dynamic objects · Epipolar geometry



We thank Dr. Arka Chattopadhyay for his assistance in revising the manuscript. Gagan Kanojia was supported by a TCS Research Scholarship.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Indian Institute of Technology Gandhinagar, Gandhinagar, India
