Abstract
Motion estimation is an important approach to acquiring the motion information of all targets in satellite video, enabling real-time monitoring of an Earth observation region. Compared with typical computer vision settings, motion estimation in satellite video faces two main difficulties: the large observation scale and the numerous weak targets with low signal-to-noise ratios. In this paper, a multi-frame sparse self-learning PWC-Net (MSSPWC-Net) is proposed to estimate the motion of weak targets in satellite video. To overcome the shortcoming that the existing PWC-Net (a convolutional neural network built on pyramid processing, warping, and cost volumes) fails to extract motion information from numerous weak targets, motion consistency and sparse self-learning are introduced to modify the network. The motion consistency between neighboring frames, enforced through a multi-frame framework, is mainly used to improve the accuracy of motion estimation for weak targets, while sparse self-learning addresses the case in which the labeled samples in satellite video are insufficient to train PWC-Net. Numerical experiments are conducted on four real satellite video datasets. The results demonstrate that the proposed MSSPWC-Net achieves excellent motion estimation performance for weak targets in satellite video and outperforms state-of-the-art methods.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China for Key International Cooperation (Grant No. 61720106002) and the National Natural Science Foundation of China for Outstanding Scholars (Grant No. 62025107).
Cite this article
Wang, T., Gu, Y. & Li, S. A multi-frame sparse self-learning PWC-Net for motion estimation in satellite video scenes. Sci. China Inf. Sci. 66, 192301 (2023). https://doi.org/10.1007/s11432-022-3634-x