
A multi-frame sparse self-learning PWC-Net for motion estimation in satellite video scenes

  • Research Paper
  • Published in Science China Information Sciences

Abstract

Motion estimation is an important approach to acquiring the motion information of all targets in a satellite video, enabling real-time monitoring of the Earth observation region. Compared with the typical computer vision setting, motion estimation in satellite video faces two main difficulties: the large scale of observation and numerous weak targets with low signal-to-noise ratios. In this paper, a multi-frame sparse self-learning PWC-Net (MSSPWC-Net) is proposed for motion estimation of weak targets in satellite video. To overcome the limitation that the existing PWC-Net fails to extract motion information from numerous weak targets, motion consistency and sparse self-learning are introduced to modify the pyramid, warping, and cost volume network (PWC-Net), a convolutional neural network (CNN) for optical flow. Motion consistency between neighboring frames, enforced through a multi-frame framework, is used to improve the accuracy of motion estimation for weak targets, and sparse self-learning is adopted to handle the case in which labeled samples in satellite video are insufficient to train PWC-Net. Numerical experiments are conducted on four real satellite video datasets. The results demonstrate that the proposed MSSPWC-Net achieves excellent performance on motion estimation of weak targets in satellite video and outperforms state-of-the-art methods.
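
To make the two mechanisms named above concrete, the following is a minimal PyTorch sketch of (i) the warping and cost-volume core that PWC-Net builds at each pyramid level and (ii) a three-frame motion-consistency penalty of the kind a multi-frame framework can enforce. The function names, the search radius, and the constant-motion assumption across three frames are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def warp(feat, flow):
    """Backward-warp a (B, C, H, W) feature map by a (B, 2, H, W) flow field."""
    _, _, h, w = feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=feat.device, dtype=feat.dtype),
        torch.arange(w, device=feat.device, dtype=feat.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow  # absolute sample coords
    gx = 2.0 * grid[:, 0] / max(w - 1, 1) - 1.0              # normalize to [-1, 1]
    gy = 2.0 * grid[:, 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(feat, torch.stack((gx, gy), dim=3), align_corners=True)

def cost_volume(feat1, feat2_warped, max_disp=4):
    """Correlation cost volume over a (2*max_disp+1)^2 local search window."""
    _, _, h, w = feat1.shape
    padded = F.pad(feat2_warped, [max_disp] * 4)
    costs = []
    for dy in range(2 * max_disp + 1):
        for dx in range(2 * max_disp + 1):
            shifted = padded[:, :, dy:dy + h, dx:dx + w]
            costs.append((feat1 * shifted).mean(dim=1, keepdim=True))
    return torch.cat(costs, dim=1)  # (B, (2*max_disp+1)^2, H, W)

def consistency_loss(flow_12, flow_23, flow_13):
    """Penalize deviation from flow_13 ≈ flow_12 + flow_23 composed with flow_12."""
    flow_23_at_1 = warp(flow_23, flow_12)  # resample flow_23 into frame-1 coordinates
    return (flow_13 - (flow_12 + flow_23_at_1)).abs().mean()
```

Chaining the pairwise flows in `consistency_loss` yields a supervisory signal that needs no labels, which is what makes such a term attractive when, as the abstract notes, labeled satellite-video samples are too scarce to train PWC-Net directly.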

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China for Key International Cooperation (Grant No. 61720106002) and the National Natural Science Foundation of China for Outstanding Scholars (Grant No. 62025107).

Author information

Corresponding author

Correspondence to Yanfeng Gu.

About this article

Cite this article

Wang, T., Gu, Y. & Li, S. A multi-frame sparse self-learning PWC-Net for motion estimation in satellite video scenes. Sci. China Inf. Sci. 66, 192301 (2023). https://doi.org/10.1007/s11432-022-3634-x
