Abstract
Recent work on dense optical flow has made significant progress, primarily through supervised learning, which requires large amounts of labeled data. Because large-scale real-world data is expensive to obtain, computer graphics are typically used to construct datasets. However, it is commonly believed that synthetic-to-real domain gaps limit generalization to real scenes. In this paper, we show that the characteristics required of an optical flow dataset are rather simple, and we present a simpler synthetic data generation method that achieves a certain level of realism by composing elementary operations. Using 2D motion-based datasets, we systematically analyze the simplest yet critical factors in generating synthetic datasets. Furthermore, we propose a novel way of using occlusion masks in supervised training and observe that suppressing gradients in occluded regions provides a powerful initial state in the curriculum learning sense. A RAFT network initially trained on our dataset outperforms the original RAFT on the two most challenging online benchmarks, MPI Sintel and KITTI 2015.
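To make the idea of suppressing gradients on occluded regions concrete, the following is a minimal sketch in plain Python. It assumes an L1 flow loss averaged over non-occluded pixels only; the abstract does not state the exact loss form, so the function name `masked_l1_flow_loss`, the data layout, and the toy values are all hypothetical and are not the authors' implementation.

```python
# Illustrative sketch only: the function name, data layout, and loss form
# are assumptions, not the paper's implementation.

def masked_l1_flow_loss(pred, target, occluded):
    """Mean L1 flow loss over non-occluded pixels.

    pred, target: H x W grids of (u, v) flow vectors.
    occluded:     H x W grid of booleans, True where the pixel is occluded.
    Occluded pixels are skipped, so they contribute neither loss nor gradient.
    """
    total, count = 0.0, 0
    for pred_row, target_row, occ_row in zip(pred, target, occluded):
        for (pu, pv), (tu, tv), occ in zip(pred_row, target_row, occ_row):
            if occ:
                continue  # suppressed: no loss term for this pixel
            total += abs(pu - tu) + abs(pv - tv)
            count += 1
    # Guard against a fully occluded frame.
    return total / count if count else 0.0


# Toy 2 x 2 example: the badly predicted pixel at (0, 1) is occluded,
# so its large error is ignored.
pred = [[(1.0, 0.0), (5.0, 5.0)],
        [(0.0, 2.0), (0.0, 0.0)]]
target = [[(0.0, 0.0), (0.0, 0.0)],
          [(0.0, 0.0), (0.0, 0.0)]]
occluded = [[False, True],
            [False, False]]

loss = masked_l1_flow_loss(pred, target, occluded)
```

In an automatic differentiation framework the same effect would typically be obtained by multiplying the per-pixel loss by the visibility mask before reduction, so that occluded pixels receive zero gradient.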
Data Availability
The datasets generated and/or analyzed during the current study are available from the first and corresponding authors on reasonable request. The data generation code will be published through a separate project web page if the paper is accepted.
Acknowledgements
This work was supported by the Korea Research Institute for Defense Technology Planning & Advancement (KRIT) grant funded by the Defense Acquisition Program Administration (DAPA) (No. KRIT-CT-22-037, Hyper-connected space-time information fusion artificial intelligence technologies for signs detection and analysis).
Ethics declarations
Conflict of interest:
Not applicable.
Compliance with Ethical Standards
The authors ensure objectivity and transparency in research and ensure that accepted principles of ethical and professional conduct have been followed.
Research involving Human Participants and/or Animals:
Not applicable.
Informed consent:
Not applicable.
About this article
Cite this article
Kwon, BK., Kim, SB. & Oh, TH. The devil in the details: simple and effective optical flow synthetic data generation. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03263-z
DOI: https://doi.org/10.1007/s00371-024-03263-z