Abstract
A scene can be divided into two parts: static and dynamic. Regions that exhibit no motion are static, while moving objects correspond to dynamic regions. In this work, we tackle the challenging task of identifying dynamic objects in CrowdCam images. Our approach exploits the coherency of natural images and the epipolar geometry between a pair of images to achieve this objective. It does not require a dynamic object to be present in all the given images. We show that the proposed approach obtains state-of-the-art accuracy on standard datasets.
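To illustrate the epipolar cue mentioned in the abstract, the following minimal sketch flags correspondences that violate the epipolar constraint between two views. It is not the authors' method: the function names, the pixel threshold, and the toy fundamental matrix (a purely horizontal camera shift, so matching static points share the same row) are assumptions made for illustration only.

```python
import numpy as np

def epipolar_distance(F, pts1, pts2):
    """Pixel distance from each point in pts2 to the epipolar line
    induced in image 2 by its matched point in pts1."""
    pts1_h = np.hstack([pts1, np.ones((len(pts1), 1))])  # homogeneous coords
    pts2_h = np.hstack([pts2, np.ones((len(pts2), 1))])
    lines = pts1_h @ F.T                  # row i is the line F @ x_i = (a, b, c)
    num = np.abs(np.sum(lines * pts2_h, axis=1))   # |a*x' + b*y' + c|
    den = np.hypot(lines[:, 0], lines[:, 1])       # sqrt(a^2 + b^2)
    return num / den

def flag_dynamic(F, pts1, pts2, threshold=2.0):
    """Mark matches whose epipolar distance exceeds a (hypothetical) threshold."""
    return epipolar_distance(F, pts1, pts2) > threshold

# Toy fundamental matrix (assumption): horizontal camera translation,
# so the epipolar line of (x, y) in image 2 is the row y' = y.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])

pts1 = np.array([[10.0, 20.0], [50.0, 60.0], [30.0, 40.0]])
pts2 = np.array([[14.0, 20.0], [55.0, 60.5], [33.0, 52.0]])  # last match leaves its row

dist = epipolar_distance(F, pts1, pts2)
mask = flag_dynamic(F, pts1, pts2)
print(dist, mask)
```

A point that moves along its own epipolar line satisfies the constraint and would go undetected by this test alone, which is one reason such a check is typically combined with other cues.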
Acknowledgements
We thank Dr. Arka Chattopadhyay for his assistance in revising the manuscript. Gagan Kanojia was supported by a TCS Research Scholarship.
Cite this article
Kanojia, G., Raman, S. Patch-based detection of dynamic objects in CrowdCam images. Vis Comput 35, 521–534 (2019). https://doi.org/10.1007/s00371-018-1480-3
DOI: https://doi.org/10.1007/s00371-018-1480-3